Esther Pacitti – author
Big Social Data and Urban Computing
First Workshop, BiDU 2018, Rio de Janeiro, Brazil, August 31, 2018, Revised Selected Papers
565 kr
Ships within 10-15 business days
708 kr
Read immediately after purchase
677 kr
Ships within 10-15 business days
375 kr
Ships within 10-15 business days
865 kr
Read immediately after purchase
Workflows may be defined as abstractions used to model the coherent flow of activities in the context of an in silico scientific experiment. They are employed in many domains of science such as bioinformatics, astronomy, and engineering. Such workflows usually present a considerable number of activities and activations (i.e., tasks associated with activities) and may need a long time for execution. Due to the continuous need to store and process data efficiently (making them data-intensive workflows), high-performance computing environments allied to parallelization techniques are used to run these workflows. At the beginning of the 2010s, cloud technologies emerged as a promising environment to run scientific workflows. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines.
More recently, Data-Intensive Scalable Computing (DISC) frameworks (e.g., Apache Spark and Hadoop) and environments have emerged and are being used to execute data-intensive workflows. DISC environments are composed of processors and disks in large commodity computing clusters connected by high-speed communication switches and networks. The main advantage of DISC frameworks is that they provide efficient in-memory data management for large-scale applications, such as data-intensive workflows. However, the execution of workflows in cloud and DISC environments raises many challenges, such as scheduling workflow activities and activations, managing produced data, and collecting provenance data.
Several existing approaches deal with the challenges mentioned above. Thus, there is a real need to understand how to manage these workflows and the various big data platforms that have been developed and introduced. This book can help researchers understand how linking workflow management with Data-Intensive Scalable Computing supports the understanding and analysis of scientific big data.
In this book, we aim to identify and distill the body of work on workflow management in clouds and DISC environments. We start by discussing the basic principles of data-intensive scientific workflows. Next, we present two workflows that are executed in single-site and multi-site clouds, taking advantage of provenance. Afterward, we move on to workflow management in DISC environments and present, in detail, solutions that enable the optimized execution of workflows using frameworks such as Apache Spark and its extensions.
441 kr
Read immediately after purchase
509 kr
Ships within 10-15 business days
629 kr
Read immediately after purchase
This book is dedicated to exploring and explaining time series event detection in databases. The focus is on events, which are pervasive in time series applications where significant changes in behavior are observed at specific points or time intervals. Event detection is a basic function in surveillance and monitoring systems and has been extensively explored over the years, but this book provides a unified overview of the major types of time series events with which researchers should be familiar: anomalies, change points, and motifs. The book starts with basic concepts of time series and presents a general taxonomy for event detection. This taxonomy includes (i) granularity of events (punctual, contextual, and collective), (ii) general strategies (regression, classification, clustering, model-based), (iii) methods (theory-driven, data-driven), (iv) machine learning processing (supervised, semi-supervised, unsupervised), and (v) data management (ETL process). This taxonomy is woven throughout the chapters dedicated to the specific event types: anomaly detection, change-point detection, and motif discovery. The book discusses state-of-the-art evaluation metrics for event detection methods and also provides a dedicated chapter on online event detection, covering the challenges and general approaches (static versus dynamic), including incremental and adaptive learning. This book will be of interest to graduate or undergraduate students from different fields who have had a basic introduction to data science or data analytics.
509 kr
Ships within 10-15 business days
1 123 kr
Ships within 10-15 business days
1 408 kr
Read immediately after purchase
565 kr
Ships within 10-15 business days
718 kr
Read immediately after purchase
The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments.
This, the 33rd issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five revised selected regular papers. Topics covered include distributed massive data streams, storage systems, scientific workflow scheduling, cost optimization of data flows, and fusion strategies.
565 kr
Ships within 10-15 business days
708 kr
Read immediately after purchase
This, the 51st issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five fully revised selected regular papers. Topics covered include data anonymization, anomaly detection, schema generation, optimizing data coverage, and digital preservation with synthetic DNA.