Journal articles on the topic "Distributed Stream Processing Systems"

Author: Grafiati

Published: September 6, 2023

Consult the top 50 journal articles for your research on the topic "Distributed Stream Processing Systems".

1

K, Sornalakshmi. "Dynamic Operator Scaling for Distributed Stream Processing Systems for Fluctuating Streams". Journal of Advanced Research in Dynamical and Control Systems 12, SP7 (July 25, 2020): 2815–21. http://dx.doi.org/10.5373/jardcs/v12sp7/20202422.

2

Wei, Xiaohui, Yuan Zhuang, Hongliang Li and Zhiliang Liu. "Reliable stream data processing for elastic distributed stream processing systems". Cluster Computing 23, no. 2 (May 21, 2019): 555–74. http://dx.doi.org/10.1007/s10586-019-02939-9.

3

Yu, Shuiying, Yinting Zheng, Fan Zhang, Hanhua Chen and Hai Jin. "TriJoin: A Time-Efficient and Scalable Three-Way Distributed Stream Join System". 網際網路技術學刊 24, no. 2 (March 2023): 475–85. http://dx.doi.org/10.53106/160792642023032402024.

Abstract

Stream join is one of the most fundamental operations in data stream processing applications. Existing distributed stream join systems can support efficient two-way join, which is a join operation between two streams. Based on the two-way join, a three-way join has to be split into two two-way joins, where the second two-way join needs to wait for the join result transmitted from the first two-way join. We show through experiments that such a design incurs prohibitively high processing latency. To solve this problem, we propose TriJoin, a time-efficient three-way distributed stream join system. We design a symmetric wait-free structure by means of symmetric tuple partitioning and reused join. TriJoin utilizes reused join to join each new tuple with the intermediate result of the other two streams and stored tuples locally. For a new tuple, TriJoin only joins it with the intermediate result to generate the final result without waiting, greatly reducing the processing latency. In TriJoin, we design two partitioning and storage schemes according to two different forms of three-way stream join. We implement TriJoin and conduct comprehensive experiments to evaluate the performance using real-world traces. Results show that TriJoin reduces the processing latency by up to 68% compared to existing designs.
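The baseline design that the abstract above criticizes, a three-way join built from two chained two-way joins, can be sketched as follows. This is an illustrative sketch only, not TriJoin itself and not code from the paper: the class `SymmetricHashJoin`, the function `cascaded_three_way`, the stream names `R`, `S`, `T`, and the assumption that all three streams share one equality join key are hypothetical choices made for the example.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Two-way equi-join over two streams (hypothetical helper, not from the paper)."""

    def __init__(self):
        # One hash table per input side: join key -> list of stored tuples.
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, key, value):
        """Store the new tuple, probe the opposite table, return new join results."""
        other = "right" if side == "left" else "left"
        self.tables[side][key].append(value)
        matches = self.tables[other].get(key, [])
        if side == "left":
            return [(value, m) for m in matches]
        return [(m, value) for m in matches]


def cascaded_three_way(events):
    """Join streams R, S and T on a shared key using two chained two-way joins."""
    first = SymmetricHashJoin()   # R join S
    second = SymmetricHashJoin()  # (R join S) join T
    results = []
    for stream, key, value in events:
        if stream in ("R", "S"):
            side = "left" if stream == "R" else "right"
            # The second join can only run after the first join has emitted a result.
            for r, s in first.insert(side, key, value):
                for rs, t in second.insert("left", key, (r, s)):
                    results.append((*rs, t))
        else:  # stream == "T"
            for rs, t in second.insert("right", key, value):
                results.append((*rs, t))
    return results


events = [("R", 1, "r1"), ("T", 1, "t1"), ("S", 1, "s1"), ("S", 2, "s2")]
print(cascaded_three_way(events))  # prints [('r1', 's1', 't1')]
```

In this sketch the tuple from `T` is stored on arrival and only contributes to a result once the first join has produced `(r1, s1)`, which is the kind of dependency between the two stages that the abstract describes.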

4

Shukla, Anshu and Yogesh Simmhan. "Model-driven scheduling for distributed stream processing systems". Journal of Parallel and Distributed Computing 117 (July 2018): 98–114. http://dx.doi.org/10.1016/j.jpdc.2018.02.003.

5

Bernardelli de Moraes, Matheus and André Leon Sampaio Gradvohl. "Evaluating the impact of a coordinated checkpointing in distributed data streams processing systems using discrete event simulation". Revista Brasileira de Computação Aplicada 12, no. 2 (May 19, 2020): 16–27. http://dx.doi.org/10.5335/rbca.v12i2.10295.

Abstract

Data Stream Processing systems process continuous flows of data under Quality of Service requirements. Data streams often contain critical information that requires real-time processing. To guarantee a system's dependability and avoid information loss, one must use a fault-tolerance strategy. However, several strategies are available, and properly evaluating which mechanism is better for each system architecture is challenging, especially in large-scale distributed systems. In this paper, we propose a discrete simulation model for investigating the impact that the Coordinated Checkpoint fault-tolerance strategy imposes on Data Stream Processing Systems. Results show that this strategy critically affects stream processing in failure-prone situations due to an increase in latency of up to 120% and information loss reaching 95% of the processing window in the worst case.

6

Tran, Tri Minh and Byung Suk Lee. "Distributed stream join query processing with semijoins". Distributed and Parallel Databases 27, no. 3 (March 6, 2010): 211–54. http://dx.doi.org/10.1007/s10619-010-7062-7.

7

Hildrum, Kirsten, Fred Douglis, Joel L. Wolf, Philip S. Yu, Lisa Fleischer and Akshay Katta. "Storage optimization for large-scale distributed stream-processing systems". ACM Transactions on Storage 3, no. 4 (February 2008): 1–28. http://dx.doi.org/10.1145/1326542.1326547.

8

Eskandari, Leila, Jason Mair, Zhiyi Huang and David Eyers. "I-Scheduler: Iterative scheduling for distributed stream processing systems". Future Generation Computer Systems 117 (April 2021): 219–33. http://dx.doi.org/10.1016/j.future.2020.11.011.

9

Liu, Xunyun and Rajkumar Buyya. "Resource Management and Scheduling in Distributed Stream Processing Systems". ACM Computing Surveys 53, no. 3 (July 5, 2020): 1–41. http://dx.doi.org/10.1145/3355399.

10

Shukla, Anshu, Shilpa Chaturvedi and Yogesh Simmhan. "RIoTBench: An IoT benchmark for distributed stream processing systems". Concurrency and Computation: Practice and Experience 29, no. 21 (October 4, 2017): e4257. http://dx.doi.org/10.1002/cpe.4257.

11

Valeev, S. S., N. V. Kondratyeva, A. S. Kovtunenko, M. A. Timirov and R. R. Karimov. "Distributed stream data processing system in multi-agent safety system of infrastructure objects". Information Technology and Nanotechnology, no. 2416 (2019): 324–31. http://dx.doi.org/10.18287/1613-0073-2019-2416-324-331.

Abstract

This paper considers the problem of resource management in distributed computing systems that process stream data in safety systems of distributed objects. It examines the tasks of streaming data processing in a multi-level, multi-agent evacuation system in an infrastructure object and discusses the features of the mathematical model of a distributed stream data processing system.

12

Eiter, Thomas, Paul Ogris and Konstantin Schekotihin. "A Distributed Approach to LARS Stream Reasoning (System paper)". Theory and Practice of Logic Programming 19, no. 5-6 (September 2019): 974–89. http://dx.doi.org/10.1017/s1471068419000309.

Abstract

Stream reasoning systems are designed for complex decision-making from possibly infinite, dynamic streams of data. Modern approaches to stream reasoning usually perform their computations using stand-alone solvers, which incrementally update their internal state and return results as new portions of data streams are pushed. However, the performance of such approaches degrades quickly as the rates of the input data and the complexity of decision problems grow. This problem was already recognized in the area of stream processing, where systems became distributed in order to allocate vast computing resources provided by clouds. In this paper we propose a distributed approach to stream reasoning that can efficiently split computations among different solvers communicating their results over data streams. Moreover, in order to increase the throughput of the distributed system, we suggest an interval-based semantics for the LARS language, which enables significant reductions of network traffic. Our evaluations indicate that distributed stream reasoning significantly outperforms existing stand-alone LARS solvers when the complexity of decision problems and the rate of incoming data increase.

13

Xiao, Fuyuan, Cheng Zhan, Hong Lai, Li Tao and Zhiguo Qu. "New parallel processing strategies in complex event processing systems with data streams". International Journal of Distributed Sensor Networks 13, no. 8 (August 2017): 155014771772862. http://dx.doi.org/10.1177/1550147717728626.

Abstract

Sensor network–based applications have gained increasing attention where data streams gathered from distributed sensors need to be processed and analyzed with timely responses. Distributed complex event processing is an effective technology to handle these data streams by matching incoming events to persistent pattern queries. Therefore, a well-managed parallel processing scheme is required to improve both system performance and the quality-of-service guarantees of the system. However, the specific properties of pattern operators increase the difficulties of implementing parallel processing. To address this issue, a new parallelization model and three parallel processing strategies are proposed for distributed complex event processing systems. The effects of temporal constraints, for example, sliding windows, are included in the new parallelization model to enable the processing load for the overlap between windows of a batch induced by each input event to be shared by the downstream machines to avoid events that may result in wrong decisions. The proposed parallel strategies can keep the complex event processing system working stably and continuously during the elapsed time. Finally, the application of our work is demonstrated using experiments on the StreamBase system regardless of the increased input rate of the stream or the increased time window size of the operator.

14

Balazinska, Magdalena, Hari Balakrishnan, Samuel R. Madden and Michael Stonebraker. "Fault-tolerance in the Borealis distributed stream processing system". ACM Transactions on Database Systems 33, no. 1 (March 2008): 1–44. http://dx.doi.org/10.1145/1331904.1331907.

15

Cardellini, Valeria, Vincenzo Grassi, Francesco Lo Presti and Matteo Nardelli. "Optimal Operator Replication and Placement for Distributed Stream Processing Systems". ACM SIGMETRICS Performance Evaluation Review 44, no. 4 (May 10, 2017): 11–22. http://dx.doi.org/10.1145/3092819.3092823.

16

Repantis, T., Xiaohui Gu and V. Kalogeraki. "QoS-Aware Shared Component Composition for Distributed Stream Processing Systems". IEEE Transactions on Parallel and Distributed Systems 20, no. 7 (July 2009): 968–82. http://dx.doi.org/10.1109/tpds.2008.165.

17

Rank, Johannes, Jonas Herget, Andreas Hein and Helmut Krcmar. "Evaluating Task-Level CPU Efficiency for Distributed Stream Processing Systems". Big Data and Cognitive Computing 7, no. 1 (March 10, 2023): 49. http://dx.doi.org/10.3390/bdcc7010049.

Abstract

Big Data and primarily distributed stream processing systems (DSPSs) are growing in complexity and scale. As a result, effective performance management to ensure that these systems meet the required service level objectives (SLOs) is becoming increasingly difficult. A key factor to consider when evaluating the performance of a DSPS is CPU efficiency, which is the ratio of the workload processed by the system to the CPU resources invested. In this paper, we argue that developing new performance tools for creating DSPSs that can fulfill SLOs while using minimal resources is crucial. This is especially significant in edge computing situations where resources are limited and in large cloud deployments where conserving power and reducing computing expenses are essential. To address this challenge, we present a novel task-level approach for measuring CPU efficiency in DSPSs. Our approach supports various streaming frameworks, is adaptable, and comes with minimal overheads. This enables developers to understand the efficiency of different DSPSs at a granular level and provides insights that were not previously possible.
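The ratio named in the abstract above (workload processed divided by CPU resources invested) can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not the measurement approach of the paper: the `TaskSample` type, the choice of records and CPU-seconds as units, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One hypothetical measurement of a single streaming task."""
    task: str
    records_processed: int   # assumed unit of workload
    cpu_seconds: float       # assumed unit of CPU resources invested


def cpu_efficiency(sample: TaskSample) -> float:
    """Return workload processed per unit of CPU time (records per CPU-second)."""
    if sample.cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return sample.records_processed / sample.cpu_seconds


# Made-up numbers, used only to show the calculation.
samples = [
    TaskSample("parse", 120_000, 4.0),
    TaskSample("join", 120_000, 12.0),
]

for s in samples:
    print(f"{s.task}: {cpu_efficiency(s):.1f} records per CPU-second")
```

With these invented inputs, both tasks handle the same number of records, but the `join` task spends three times as much CPU time, so its efficiency is one third of that of `parse`.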

18

Akanbi, Adeyinka and Muthoni Masinde. "A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring". Sensors 20, no. 11 (June 3, 2020): 3166. http://dx.doi.org/10.3390/s20113166.

Abstract

In recent years, the application and wide adoption of Internet of Things (IoT)-based technologies have increased the proliferation of monitoring systems, which has consequently exponentially increased the amounts of heterogeneous data generated. Processing and analysing the massive amount of data produced is cumbersome and is gradually moving from the classical 'batch' processing technique (extract, transform, load, or ETL) to real-time processing. For instance, in the environmental monitoring and management domain, time-series data and historical datasets are crucial for prediction models. However, the environmental monitoring domain still utilises legacy systems, which complicate real-time analysis of the essential data and integration with big data platforms, and entail reliance on batch processing. Herein, as a solution, a distributed stream processing middleware framework for real-time analysis of heterogeneous environmental monitoring and management data is presented and tested on a cluster using open source technologies in a big data environment. The system ingests datasets from legacy systems and sensor data from heterogeneous automated weather systems, irrespective of the data types, to Apache Kafka topics using Kafka Connect APIs for processing by the Kafka streaming processing engine. The stream processing engine executes the predictive numerical models and algorithms represented in event processing (EP) languages for real-time analysis of the data streams. To prove the feasibility of the proposed framework, we implemented the system using a case study scenario of drought prediction and forecasting based on the Effective Drought Index (EDI) model. Firstly, we transform the predictive model into a form that could be executed by the streaming engine for real-time computing. Secondly, the model is applied to the ingested data streams and datasets to predict drought through persistent querying of the infinite streams to detect anomalies. As a conclusion of this study, a performance evaluation of the distributed stream processing middleware infrastructure is calculated to determine the real-time effectiveness of the framework.

19

Xiao, Fuyuan, Teruaki Kitasuka and Masayoshi Aritsugi. "Economical and Fault-Tolerant Load Balancing in Distributed Stream Processing Systems". IEICE Transactions on Information and Systems E95-D, no. 4 (2012): 1062–73. http://dx.doi.org/10.1587/transinf.e95.d.1062.

20

Li, Yue-jie. "Data Stream of Wireless Sensor Networks Based on Deep Learning". International Journal of Online Engineering (iJOE) 12, no. 11 (November 24, 2016): 22. http://dx.doi.org/10.3991/ijoe.v12i11.6232.

Abstract

The sensor data in wireless sensor networks arrive continuously in multiple, rapid, time-varying, possibly unpredictable, unbounded streams, and no record of historical information is kept. These limitations make conventional Database Management Systems and their evolution unsuitable for streams. There is therefore a need to build a complete Data Streaming Management System (DSMS), which could process streams and perform dynamic continuous query processing. In this paper, a framework for an Adaptive Distributed Data Streaming Management System (ADDSMS) is presented, which operates as a stream control interface between arrays of distributed data stream sources and end-user clients who access and analyze these streams. Simulation results show that the proposed method can improve overall system performance substantially.

21

Henning, Sören and Wilhelm Hasselbring. "Theodolite: Scalability Benchmarking of Distributed Stream Processing Engines in Microservice Architectures". Big Data Research 25 (July 2021): 100209. http://dx.doi.org/10.1016/j.bdr.2021.100209.

22

Ivanov, Yurii, Borys Sharov, Nazar Zalevskyi and Ostap Kernytskyi. "Software System for End-Products Accounting in Bakery Production Lines Based on Distributed Video Streams Analysis". Advances in Cyber-Physical Systems 7, no. 2 (December 16, 2022): 101–7. http://dx.doi.org/10.23939/acps2022.02.101.

Abstract

Among the main requirements of modern surveillance systems are stability in the face of negative influences and intellectualization. The purpose of intellectualization is that the surveillance system should not only perform the main functions, such as monitoring and stream recording, but also provide effective stream processing. The requirement for this processing is that the system operation has to be automated, and the operator's influence should be minimal. Modern intelligent surveillance systems require the development of grouping methods. The context of the grouping method here is associated with a decomposition of the target problem. Depending on the purpose of the system, the target problem can represent several subproblems, each of which is usually accomplished by artificial intelligence or data mining methods.

23

Bordin, Maycon Viana, Dalvan Griebler, Gabriele Mencagli, Claudio F. R. Geyer and Luiz Gustavo L. Fernandes. "DSPBench: A Suite of Benchmark Applications for Distributed Data Stream Processing Systems". IEEE Access 8 (2020): 222900–222917. http://dx.doi.org/10.1109/access.2020.3043948.

24

Llaves, Alejandro, Oscar Corcho, Peter Taylor and Kerry Taylor. "Enabling RDF Stream Processing for Sensor Data Management in the Environmental Domain". International Journal on Semantic Web and Information Systems 12, no. 4 (October 2016): 1–21. http://dx.doi.org/10.4018/ijswis.2016100101.

Abstract

This paper presents a generic approach to integrate environmental sensor data efficiently, allowing the detection of relevant situations and events in near real-time through continuous querying. Data variety is addressed with the use of the Semantic Sensor Network ontology for observation data modelling, and semantic annotations for environmental phenomena. Data velocity is handled by distributing sensor data messaging and serving observations as RDF graphs on query demand. The stream processing engine presented in the paper, morph-streams++, provides adapters for different data formats and distributed processing of streams in a cluster. An evaluation of different configurations for parallelization and semantic annotation parameters proves that the described approach reduces the average latency of message processing in some cases.

25

Yang, Dingyu, Jianmei Guo, Zhi-Jie Wang, Yuan Wang, Jingsong Zhang, Liang Hu, Jian Yin and Jian Cao. "FastPM: An approach to pattern matching via distributed stream processing". Information Sciences 453 (July 2018): 263–80. http://dx.doi.org/10.1016/j.ins.2018.04.031.

26

Pishgoo, Boshra, Ahmad Akbari Azirani and Bijan Raahemi. "A hybrid distributed batch-stream processing approach for anomaly detection". Information Sciences 543 (January 2021): 309–27. http://dx.doi.org/10.1016/j.ins.2020.07.026.

27

Kim, Yoon-Ki and Yongsung Kim. "DiPLIP: Distributed Parallel Processing Platform for Stream Image Processing Based on Deep Learning Model Inference". Electronics 9, no. 10 (October 13, 2020): 1664. http://dx.doi.org/10.3390/electronics9101664.

Abstract

Recently, as the amount of real-time video streaming data has increased, distributed parallel processing systems have rapidly evolved to process large-scale data. In addition, with an increase in the scale of computing resources constituting the distributed parallel processing system, the orchestration of technology has become crucial for proper management of computing resources, in terms of allocating computing resources, setting up a programming environment, and deploying user applications. In this paper, we present a new distributed parallel processing platform for real-time large-scale image processing based on deep learning model inference, called DiPLIP. It provides a scheme for large-scale real-time image inference using a buffer layer and a scalable parallel processing environment according to the size of the stream image. It allows users to easily process trained deep learning models for processing real-time images in a distributed parallel processing environment at high speeds, through the distribution of the virtual machine container.

28

Xiao, Fuyuan and Masayoshi Aritsugi. "An Adaptive Parallel Processing Strategy for Complex Event Processing Systems over Data Streams in Wireless Sensor Networks". Sensors 18, no. 11 (November 2, 2018): 3732. http://dx.doi.org/10.3390/s18113732.

Abstract

Efficient matching of incoming events of data streams to persistent queries is fundamental to event stream processing systems in wireless sensor networks. These applications require dealing with high volume and continuous data streams with fast processing time on distributed complex event processing (CEP) systems. Therefore, a well-managed parallel processing technique is needed for improving the performance of the system. However, the specific properties of pattern operators in the CEP systems increase the difficulties of the parallel processing problem. To address these issues, a parallelization model and an adaptive parallel processing strategy are proposed for the complex event processing by introducing a histogram and utilizing the probability and queue theory. The proposed strategy can estimate the optimal event splitting policy, which can suit the most recent workload conditions such that the selected policy has the least expected waiting time for further processing of the arriving events. The proposed strategy can keep the CEP system running fast under the variation of the time window sizes of operators and the input rates of streams. Finally, the utility of our work is demonstrated through the experiments on the StreamBase system.

29

Alshamrani, Sultan, Hesham Alhumyani, Quadri Waseem and Isbudeen Noor Mohamed. "High availability of data using Automatic Selection Algorithm (ASA) in distributed stream processing systems". Bulletin of Electrical Engineering and Informatics 8, no. 2 (June 1, 2019): 690–98. http://dx.doi.org/10.11591/eei.v8i2.1414.

Abstract

High availability of data is one of the most critical requirements of a distributed stream processing system (DSPS). We can achieve high availability using available recovery techniques, which include active backup, passive backup and upstream backup. Each recovery technique has its own advantages and disadvantages. They are used for different types of failures based on the type and the nature of the failures. This paper presents an Automatic Selection Algorithm (ASA) which will help in selecting the best recovery technique based on the type of failure. We intend to use together all the different recovery approaches available (i.e., active standby, passive standby, and upstream standby) at nodes in a distributed stream-processing system (DSPS), based upon the system requirements and the failure type. By doing this, we will achieve all the benefits of fastest recovery, precise recovery and a lower runtime overhead in a single solution. We evaluate our automatic selection algorithm (ASA) approach as an algorithm selector during the runtime of stream processing. Moreover, we also evaluated its efficiency in comparison with the time factor. The experimental results show that our approach is 95% efficient and faster than other conventional manual failure recovery approaches and is hence totally automatic in nature.

30

Jayasekara, Sachini, Aaron Harwood and Shanika Karunasekera. "A utilization model for optimization of checkpoint intervals in distributed stream processing systems". Future Generation Computer Systems 110 (September 2020): 68–79. http://dx.doi.org/10.1016/j.future.2020.04.019.

31

Ye, Qian and Minyan Lu. "s2p: Provenance Research for Stream Processing System". Applied Sciences 11, no. 12 (June 15, 2021): 5523. http://dx.doi.org/10.3390/app11125523.

Abstract

The main purpose of our provenance research for DSP (distributed stream processing) systems is to analyze abnormal results. Provenance for these systems is nontrivial because of the ephemerality of stream data and the instant data processing mode in modern DSP systems. Challenges include but are not limited to an optimization solution for avoiding excessive runtime overhead, reducing provenance-related data storage, and providing it in an easy-to-use fashion. Without any prior knowledge about which kinds of data may finally lead to abnormal results, we have to track all transformations in detail, which potentially causes a heavy system burden. This paper proposes s2p (Stream Process Provenance), which mainly consists of online provenance and offline provenance, to provide fine- and coarse-grained provenance at different precision. We base our design of s2p on the fact that, for a mature online DSP system, abnormal results are rare, and the results that require a detailed analysis are even rarer. We also consider state transition in our provenance explanation. We implement s2p on Apache Flink, named s2p-flink, and conduct three experiments to evaluate its scalability, efficiency, and overhead in terms of end-to-end cost, throughput, and space overhead. Our evaluation shows that s2p-flink incurs a 13% to 32% cost overhead, an 11% to 24% decline in throughput, and few additional space costs in the online provenance phase. Experiments also demonstrate that s2p-flink can scale well. A case study is presented to demonstrate the feasibility of the whole s2p solution.

32

Ni, Xiang, Jing Li, Mo Yu, Wang Zhou and Kun-Lung Wu. "Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 857–64. http://dx.doi.org/10.1609/aaai.v34i01.5431.

Abstract

This paper considers the problem of resource allocation in stream processing, where continuous data flows must be processed in real time in a large distributed system. To maximize system throughput, the resource allocation strategy that partitions the computation tasks of a stream processing graph onto computing devices must simultaneously balance workload distribution and minimize communication. Since this problem of graph partitioning is known to be NP-complete yet crucial to practical streaming systems, many heuristic-based algorithms have been developed to find reasonably good solutions. In this paper, we present a graph-aware encoder-decoder framework to learn a generalizable resource allocation strategy that can properly distribute computation tasks of stream processing graphs unobserved from training data. We, for the first time, propose to leverage graph embedding to learn the structural information of the stream processing graphs. Jointly trained with the graph-aware decoder using deep reinforcement learning, our approach can effectively find optimized solutions for unseen graphs. Our experiments show that the proposed model outperforms both METIS, a state-of-the-art graph partitioning algorithm, and an LSTM-based encoder-decoder model, in about 70% of the test cases.

33

Poźniak, Krzysztof. "Modeling of Synchronous Data Streams Processing in the RPC Muon Trigger System of the CMS Experiment". International Journal of Electronics and Telecommunications 56, no. 4 (November 1, 2010): 489–502. http://dx.doi.org/10.2478/v10177-010-0067-3.

Abstract

This paper presents signal synchronization aspects in a large, distributed, multichannel RPC Muon Trigger system in the CMS experiment. The paper is an introduction to normalized structure analysis methods of such systems. The method introduces a general model of the system, presented in the form of a network of distributed, synchronous, pipeline processes. The model is based on a definition of a synchronous data stream and its formal, fundamental properties. Theoretical considerations are supported by a practical application of synchronous streams and processes management. The following processes were modeled and implemented in hardware: window synchronization, derandomization, data concentration and generation of test pulses. Selected results of the model's application in the CMS experiment are presented.

34

Imran, Muhammad, Gábor E. Gévay, Jorge-Arnulfo Quiané-Ruiz and Volker Markl. "Fast datalog evaluation for batch and stream graph processing". World Wide Web 25, no. 2 (January 20, 2022): 971–1003. http://dx.doi.org/10.1007/s11280-021-00960-w.

Abstract

Implementing complex algorithms for big data, artificial intelligence, and graph processing requires enormous effort. Succinct, declarative programs to solve complex problems that can be efficiently executed for batching and streaming data are in demand. This paper presents Nexus, a distributed Datalog evaluation system. It evaluates Datalog programs using the semi-naive algorithm for batch and streaming data using incremental and asynchronous iteration. Furthermore, we evaluate Datalog programs with aggregates to determine the advantages of implementing the semi-naive algorithm using incremental iteration on its performance. Our experimental results show that Nexus significantly outperforms acyclic dataflow-based systems.

35

Ye, Qian y Minyan Lu. "SPOT: Testing Stream Processing Programs with Symbolic Execution and Stream Synthesizing". Applied Sciences 11, n.º 17 (30 de agosto de 2021): 8057. http://dx.doi.org/10.3390/app11178057.

Abstract

Adoption of distributed stream processing (DSP) systems such as Apache Flink in real-time big data processing is increasing. However, DSP programs are prone to bugs, especially when a programmer neglects some DSP features (e.g., source data reordering), which motivates the development of approaches for testing and verification. In this paper, we focus on the test data generation problem for DSP programs. Currently, there is no approach that generates test data for DSP programs with both high path coverage and coverage of different stream reordering situations. We present a novel solution, SPOT (i.e., Stream Processing Program Test), to achieve these two goals simultaneously. First, SPOT generates a set of individual test data representing each path of a DSP program through symbolic execution. Then, SPOT composes these independent data into various time series data (a.k.a. streams) in diverse reorderings. Finally, we can perform a test by feeding the DSP program with these streams continuously. To automatically support symbolic analysis, we also developed JPF-Flink, a JPF (i.e., Java Pathfinder) extension to coordinate the execution of Flink programs. We present four case studies to illustrate that: (1) SPOT can support symbolic analysis for the commonly used DSP operators; (2) test data generated by SPOT can more efficiently achieve high JDU (i.e., Joint Dataflow and UDF) path coverage than two recent DSP testing approaches; (3) test data generated by SPOT can more easily trigger software failures than those two DSP testing approaches; and (4) the data randomly generated by those two test techniques are highly skewed in terms of stream reordering, which is measured by the entropy metric. In comparison, the reordering is even for test data from SPOT.

36

Russo Russo, Gabriele, Matteo Nardelli, Valeria Cardellini, and Francesco Lo Presti. "Multi-Level Elasticity for Wide-Area Data Streaming Systems: A Reinforcement Learning Approach." Algorithms 11, no. 9 (September 7, 2018): 134. http://dx.doi.org/10.3390/a11090134.

Abstract

The capability of efficiently processing the data streams emitted by nowadays ubiquitous sensing devices enables the development of new intelligent services. Data Stream Processing (DSP) applications allow for processing huge volumes of data in near real-time. To keep up with the high volume and velocity of data, these applications can elastically scale their execution on multiple computing resources to process the incoming data flow in parallel. Being that data sources and consumers are usually located at the network edges, nowadays the presence of geo-distributed computing resources represents an attractive environment for DSP. However, controlling the applications and the processing infrastructure in such wide-area environments represents a significant challenge. In this paper, we present a hierarchical solution for the autonomous control of elastic DSP applications and infrastructures. It consists of a two-layered hierarchical solution, where centralized components coordinate subordinated distributed managers, which, in turn, locally control the elastic adaptation of the application components and deployment regions. Exploiting this framework, we design several self-adaptation policies, including reinforcement learning based solutions. We show the benefits of the presented self-adaptation policies with respect to static provisioning solutions, and discuss the strengths of reinforcement learning based approaches, which learn from experience how to optimize the application performance and resource allocation.

37

Park, Jun Pyo, Chang-Sup Park, and Yon Dohn Chung. "Energy and Latency Efficient Access of Wireless XML Stream." Journal of Database Management 21, no. 1 (January 2010): 58–79. http://dx.doi.org/10.4018/jdm.2010112303.

Abstract

In this article, we address the problem of delayed query processing caused by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enable energy- and latency-efficient broadcasting of XML data. We first define the DIX node structure to implement a fully distributed index structure which contains the tag name, attributes, and text content of an element, as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing in the mobile clients. Through extensive experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms the previous methods.

38

Rajaguru D., Puviyarasi T., and Vengattaraman T. "Malicious Data Stream Identification to Improve the Resource Elasticity of Handheld Edge Computing System." International Journal of Handheld Computing Research 8, no. 4 (October 2017): 30–39. http://dx.doi.org/10.4018/ijhcr.2017100103.

Abstract

This article highlights the need to identify resource elasticity in handheld edge computing systems and its related issues. In several emerging application scenarios, for example in urban areas, operational monitoring of large infrastructures, wearable assistance, and the Internet of Things, continuous data streams must be processed under short delays. Several solutions, including multiple software engines, have been developed for processing unbounded data streams in a scalable and efficient manner. Recently, architectures have been proposed that use edge computing for data stream processing. This article reviews state-of-the-art stream processing engines and mechanisms for exploiting resource elasticity, a feature of cloud computing, in stream processing. Resource elasticity allows an application or service to scale out or in according to fluctuating demands. Elasticity becomes even more challenging in distributed environments comprising edge and cloud computing resources. Device security is one of the major challenges for the successful implementation of Internet of Things and fog computing environments in the current IT space. Researchers and information technology (IT) organizations have explored many solutions to protect systems from unauthenticated device attacks. Fog computing uses network devices (e.g., router, switch, and hub) for latency-aware processing of data collected using IoT. This article concludes with the various processes for improving the resource elasticity of handheld devices in order to lead communication to the next stage of computing.

39

Cheng, Zhinan, Qun Huang, and Patrick P. C. Lee. "On the performance and convergence of distributed stream processing via approximate fault tolerance." VLDB Journal 28, no. 5 (September 3, 2019): 821–46. http://dx.doi.org/10.1007/s00778-019-00565-w.

40

Ageev, Aleksey Vladimirovich, Andrey Alexandrovich Boguslavskiy, and Sergey Mikhailovich Sokolov. "Task scheduling in the onboard computer system." Keldysh Institute Preprints, no. 43 (2023): 1–27. http://dx.doi.org/10.20948/prepr-2023-43.

Abstract

The problem of rational resource allocation in the on-board computing system of a robotic complex is considered. As a first step, the possibility of using a non-preemptive online scheduling algorithm for distributed systems, the cyclic Round Robin algorithm, is analyzed. To demonstrate the basic capabilities of the developed scheduler, a video stream segmentation task is used. The peculiarities of task processing for real-time vision systems are demonstrated. The problem of inter-node synchronization of sensor data is solved. A feature of on-board robotics resources, namely the need for linking software in the form of the Robot Operating System, is taken into account. To develop the task scheduler, the C++ programming language and the ROS2 framework, which provides asynchronous networking, are used. A scheduling model and software implementing this model are being built to perform tasks in a distributed environment in order to control the processing of video streams in a vision system.

41

Qu, Zhijian, Hanxin Liu, Hanlin Wang, Xinqiang Chen, Rui Chi, and Zixiao Wang. "Cluster equilibrium scheduling method based on backpressure flow control in railway power supply systems." PLOS ONE 15, no. 12 (December 9, 2020): e0243543. http://dx.doi.org/10.1371/journal.pone.0243543.

Abstract

The purpose of the study is to address two problems in the scheduling and monitoring center of a railway network: the increasingly significant processing delay of massive monitoring data and imbalanced tasks. To tackle these problems, a smooth weighted round-robin scheduling method based on backpressure flow control (BF-SWRR) is proposed. The method is developed based on a model for message queues and real-time streaming computing. Using telemetry data flows as input data sources, the fields of the data sources are segmented into different sets by a distributed model of parallel stream computing. Moreover, the round-robin (RR) scheduling method for the distributed server is improved. The parallelism, memory occupancy, and system delay are tested by taking a high-speed train section of a certain line as an example. The results show that the BF-SWRR method for clusters can keep the delay within 1 s. When the parallelism of distributed clusters is set to 8, occupancy rates of the CPU and memory can be decreased by about 15%. In this way, the overall load of the cluster during stream computing is more balanced.
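
For readers unfamiliar with smooth weighted round-robin, the selection rule can be sketched in a few lines of Python. This is the commonly used generic rule, not the authors' BF-SWRR method, and it leaves out backpressure flow control entirely. The function name, node names, and weights are hypothetical.

```python
def smooth_weighted_round_robin(weights, n):
    """Return n picks spread according to positive integer weights.

    Generic smooth weighted round-robin; illustrative sketch only.
    """
    current = {node: 0 for node in weights}  # running score per node
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every node earns its weight each round.
        for node, weight in weights.items():
            current[node] += weight
        # The node with the highest score is chosen (first one wins ties).
        chosen = max(current, key=current.get)
        # The chosen node pays back the total weight, which spreads its
        # turns out instead of clustering them together.
        current[chosen] -= total
        picks.append(chosen)
    return picks
```

With weights `{"a": 5, "b": 1, "c": 1}` and `n = 7`, the picks are `a, a, b, a, c, a, a`: node `a` still receives five of every seven tasks, but its turns are interleaved with the lighter nodes rather than issued back to back.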

42

Wang, Yongheng, Xiaozan Zhang, and Zengwang Wang. "A Proactive Decision Support System for Online Event Streams." International Journal of Information Technology & Decision Making 17, no. 06 (November 2018): 1891–913. http://dx.doi.org/10.1142/s0219622018500463.

Abstract

In-stream big data processing is an important part of big data processing. Proactive decision support systems can predict future system states and execute actions to avoid unwanted states. In this paper, we propose a proactive decision support system for online event streams. Based on Complex Event Processing (CEP) technology, this method uses a structure-varying dynamic Bayesian network to predict future events and system states. Different Bayesian network structures are learned and used according to different event contexts. A networked distributed Markov decision process model with predicting states is proposed as the sequential decision-making model. A Q-learning method is investigated for this model to find the optimal joint policy. The experimental evaluations show that this method works well for congestion control in transportation systems.

43

Афанасьев, В. В., О. А. Бебенина, and И. И. Ветров. "APPROACHES TO METHOD OF BINARY CLASSIFICATION SELECTING FOR TEXT MESSAGES IN STREAMING DISTRIBUTED SYSTEMS OF INFORMATION PROCESSING." СИСТЕМЫ УПРАВЛЕНИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ, no. 4(82) (December 1, 2020): 43–46. http://dx.doi.org/10.36622/vstu.2020.48.26.010.

Abstract

The article presents a solution to the problem of choosing a binary (two-class) classification method for a stream of text messages (using news sources, namely the RSS service, as an example) arriving at the input of a distributed information processing system. It considers both the choice of a binary classification method that is rational from the point of view of stream processing for a text message processing subsystem based on machine learning technology, and the formation of the training and test samples of text messages required at the stage of training a binary classifier. The results of experimental verification of the obtained values of these samples are presented for the considered binary classification methods. An approach to training the classifier for the text message processing subsystem in a distributed system of streaming information processing is presented.

44

Boyarshin, Igor, Anna Doroshenko, and Pavlo Rehida. "REQUEST BALANCING METHOD FOR INCREASING THEIR PROCESSING EFFICIENCY WITH INFORMATION REPLICATION IN A DISTRIBUTED DATA STORAGE SYSTEM." TECHNICAL SCIENCES AND TECHNOLOGIES, no. 2(24) (2021): 75–82. http://dx.doi.org/10.25140/2411-5363-2021-2(24)-75-82.

Abstract

The article describes a new method, based on replication, of improving the efficiency of systems that store shared data for many users and provide access to it. Existing methods of load balancing in data storage systems are described, namely RR and WRR. A new method of balancing requests among multiple data storage nodes is proposed that is able to adjust to the intensity of the input request stream in real time while utilizing disk space efficiently.

45

Chavarria, J. Andres, Todd Bown, Paul Clarkson, Simon Watson, and Chris Minto. "Digitalization of asset surveillance through distributed fiber-optic sensing: Geophysics and engineering diagnostics and streaming." Leading Edge 41, no. 9 (September 2022): 636–40. http://dx.doi.org/10.1190/tle41090636.1.

Abstract

Fiber-optic distributed acoustic sensing (DAS) can listen to a wide range of signals. This listening takes place at high sampling rates with fine spatial resolution, resulting in large data volumes. Data streaming solutions are available but result in large transmission and storage costs. In this paper, we describe strategies to convert large data streams from DAS interrogator units to diagnostics or processed products. Optimizing DAS systems results in higher signal-to-noise ratio for signals while extracting diagnostic features out of the noise that could be related to production or well engineering. DAS has sensitivity to diverse signals, and the first goal of edge processing is to separate them for consumption by various disciplines. Focusing the processing on specific aspects in the DAS recordings provides data products that are streamed in efficient ways. We show how DAS processing can deploy fast algorithms so that data diagnostics are sent to remote locations. This enables real-time-diagnostics and event-detection tools. By providing the bulk of computing in the field, data upload to remote servers is efficient and targeted. We show how this managed data stream enables digitalization of engineering and geoscience assets.

46

Theodorakis, Georgios, Fotios Kounelis, Peter Pietzuch, and Holger Pirk. "Scabbard." Proceedings of the VLDB Endowment 15, no. 2 (October 2021): 361–74. http://dx.doi.org/10.14778/3489496.3489515.

Abstract

Single-node multi-core stream processing engines (SPEs) can process hundreds of millions of tuples per second. Yet making them fault-tolerant with exactly-once semantics while retaining this performance is an open challenge: due to the limited I/O bandwidth of a single-node, it becomes infeasible to persist all stream data and operator state during execution. Instead, single-node SPEs rely on upstream distributed systems, such as Apache Kafka, to recover stream data after failure, necessitating complex cluster-based deployments. This lack of built-in fault-tolerance features has hindered the adoption of single-node SPEs. We describe Scabbard, the first single-node SPE that supports exactly-once fault-tolerance semantics despite limited local I/O bandwidth. Scabbard achieves this by integrating persistence operations with the query workload. Within the operator graph, Scabbard determines when to persist streams based on the selectivity of operators: by persisting streams after operators that discard data, it can substantially reduce the required I/O bandwidth. As part of the operator graph, Scabbard supports parallel persistence operations and uses markers to decide when to discard persisted data. The persisted data volume is further reduced using workload-specific compression: Scabbard monitors stream statistics and dynamically generates computationally efficient compression operators. Our experiments show that Scabbard can execute stream queries that process over 200 million tuples per second while recovering from failures with sub-second latencies.
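
The selectivity argument in the abstract can be made concrete with a small Python sketch. It is a deliberate simplification and not Scabbard's actual algorithm: it assumes a linear operator pipeline, treats selectivity as output tuples per input tuple, and ignores markers, compression, and recovery cost. The function name `cheapest_persist_point` and its parameters are hypothetical.

```python
def cheapest_persist_point(input_rate, selectivities):
    """Find where a linear pipeline carries the least data to persist.

    input_rate: tuples per second entering the pipeline.
    selectivities: output/input ratio of each operator, in pipeline order.
    Returns (index, rate); index -1 means persisting the raw input.
    Illustrative sketch only, not the Scabbard implementation.
    """
    rate = input_rate
    best_index, best_rate = -1, input_rate
    for index, selectivity in enumerate(selectivities):
        rate *= selectivity  # stream rate after this operator
        if rate < best_rate:
            best_index, best_rate = index, rate
    return best_index, best_rate
```

For example, with an input of 1000 tuples per second and selectivities `[1.0, 0.25, 2.0]`, the stream is smallest after the second operator (index 1, 250 tuples per second). Persisting at that point would write a quarter of the raw input volume.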

47

De Pauw, Wim, and Henrique Andrade. "Visualizing Large-Scale Streaming Applications." Information Visualization 8, no. 2 (January 22, 2009): 87–106. http://dx.doi.org/10.1057/ivs.2009.5.

Abstract

Stream processing is a new and important computing paradigm. Innovative streaming applications are being developed in areas ranging from scientific applications (for example, environment monitoring), to business intelligence (for example, fraud detection and trend analysis), to financial markets (for example, algorithmic trading systems). In this paper we describe Streamsight, a new visualization tool built to examine, monitor and help understand the dynamic behavior of streaming applications. Streamsight can handle the complex, distributed and large-scale nature of stream processing applications by using hierarchical graphs, multi-perspective visualizations, and de-cluttering strategies. To address the dynamic and adaptive nature of these applications, Streamsight also provides real-time visualization as well as the capability to record and replay. All these features are used for debugging, for performance optimization, and for management of resources, including capacity planning. More than 100 developers, both inside and outside IBM, have been using Streamsight to help design and implement large-scale stream processing applications.

48

Kalyaev, A. I. "APPLICATION OF DISTRIBUTED COMPUTING SYSTEMS FOR IMAGE PROCESSING IN ORDER TO SEARCH FOR UNMANNED AERIAL VEHICLES." Vestnik komp'iuternykh i informatsionnykh tekhnologii, no. 208 (October 2021): 46–53. http://dx.doi.org/10.14489/vkit.2021.10.pp.046-053.

Abstract

This article describes an approach to solving the problem of searching for, identifying, and tracking UAVs (Unmanned Aerial Vehicles) using a distributed computing system that processes images from multiple surveillance cameras. Today, the problem of finding UAVs is becoming especially relevant due to their widespread distribution and low cost, which leaves wide scope for illegal use: terrorist attacks in crowded places and on critical infrastructure, as well as unauthorized surveillance of specially protected areas. At the same time, modern radars are inefficient at searching for UAVs, so visual detection tools are used instead, and these require complex calculations to work effectively. In this article, it is proposed to use distributed computing systems to solve these complex problems of processing a video stream for the purpose of searching for, identifying, and tracking objects (UAVs). To this end, the author, proceeding from the potential areas of application of such systems, decided to apply a multiagent approach, which makes it possible to create fault-tolerant and scalable systems. In the course of the work, software for a distributed computing system for image processing in order to search for unmanned aerial vehicles was created, and a hardware stand was assembled to test it. From the tests, it was concluded that the proposed method can be applied to process video at high resolution and frame rate in a distributed computing system.

49

Alaasam, Ameer Basim Abdulameer, Gleb Igorevich Radchenko, Andrei Nikolaevitch Tchernykh, and José Luis González-Compeán González-Compeán. "Stateful Stream Processing Containerized as Microservice to Support Digital Twins in Fog Computing." Proceedings of the Institute for System Programming of the RAS 33, no. 1 (2021): 65–80. http://dx.doi.org/10.15514/ispras-2021-33(1)-5.

Abstract

Digital twins of processes and devices use information from sensors to synchronize their state with the entities of the physical world. The concept of stream computing enables effective processing of events generated by such sensors. However, the need to track the state of an instance of the object leads to the impossibility of organizing instances of digital twins as stateless services. Another feature of digital twins is that several tasks implemented on their basis require the ability to respond to incoming events at near-real-time speed. In this case, the use of cloud computing becomes unacceptable due to high latency. Fog computing manages this problem by moving some computational tasks closer to the data sources. One of the recent solutions providing the development of loosely coupled distributed systems is a Microservice approach, which implies the organization of the distributed system as a set of coherent and independent services interacting with each other using messages. The microservice is most often isolated by utilizing containers to overcome the high overheads of using virtual machines. The main problem is that microservices and containers together are stateless by nature. The container technology still does not fully support live container migration between physical hosts without data loss. It causes challenges in ensuring the uninterrupted operation of services in fog computing environments. Thus, an essential challenge is to create a containerized stateful stream processing based microservice to support digital twins in the fog computing environment. Within the scope of this article, we study live stateful stream processing migration and how to redistribute computational activity across cloud and fog nodes using Kafka middleware and its Stream DSL API.

50

RIESCO, A., and J. RODRÍGUEZ-HORTALÁ. "Property-Based Testing for Spark Streaming." Theory and Practice of Logic Programming 19, no. 04 (February 19, 2019): 574–602. http://dx.doi.org/10.1017/s1471068419000012.

Abstract

Stream processing has reached the mainstream in recent years, as a new generation of open-source distributed stream processing systems, designed for scaling horizontally on commodity hardware, has brought the capability for processing high-volume and high-velocity data streams to companies of all sizes. In this work, we propose a combination of temporal logic and property-based testing (PBT) for dealing with the challenges of testing programs that employ this programming model. We formalize our approach in a discrete-time temporal logic for finite words, with some additions to improve the expressiveness of properties, including timeouts for temporal operators and a binding operator for letters. In particular, we focus on testing Spark Streaming programs written with the Spark API for the functional language Scala, using the PBT library ScalaCheck. To that end, we add temporal logic operators to a set of new ScalaCheck generators and properties, as part of our testing library sscheck.
