Dissertations / Theses on the topic 'Stream Processing System'
Consult the top 50 dissertations / theses for your research on the topic 'Stream Processing System.'
Wladdimiro, Cottet Daniel. "Dynamic adaptation in Stream Processing Systems." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS028.
The amount of data produced by today's web-based systems and applications increases rapidly, due to the many interactions with users (e.g. real-time stock market transactions, multiplayer games, streaming data produced by Twitter, etc.). As a result, there is a growing demand, particularly in the fields of commerce, security and research, for systems capable of processing this data in real time and providing useful information in a short space of time. Stream processing systems (SPSs) meet these needs and have been widely used for this purpose. The aim of an SPS is to process large volumes of data in real time by hosting a set of operators in applications based on directed acyclic graphs (DAGs). Most existing SPSs, such as Flink or Storm, are configured prior to deployment, usually defining the DAG and the number of operator replicas in advance. Overestimating the number of replicas can lead to a waste of allocated resources. On the other hand, depending on interaction with the environment, the input data rate can fluctuate dynamically and, as a result, operators can become overloaded, leading to a degradation in system performance. These SPSs are not capable of dynamically adapting to variations in operator workload and input rate. One solution to this problem is to dynamically increase the number of resources, physical or logical, allocated to the SPS when the processing demand of one or more operators increases. This thesis presents two SPSs, RA-SPS and PA-SPS, which follow a reactive and a predictive approach respectively, for dynamically modifying the number of operator replicas. The reactive approach relies on the current state of operators computed from multiple metrics, while the predictive model is based on input rate variation, operator execution time, and queued events. The two SPSs extend Storm to dynamically reconfigure the number of replicas without application downtime. They also implement a load balancer that distributes incoming events fairly among operator replicas. Experiments on the Google Cloud Platform (GCP) were carried out with applications that process Twitter data, DNS traffic, or log traces. Performance was evaluated with different configurations, and the results were compared with those of running the same applications on the original Storm as well as with state-of-the-art work such as DABS-Storm, which also adapts the number of replicas. The comparison shows that both RA-SPS and PA-SPS can significantly increase the number of events processed while reducing costs.
Hongslo, Anders. "Stream Processing in the Robot Operating System framework." Thesis, Linköpings universitet, Artificiell intelligens och integrerad datorsystem, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-79846.
Kakkad, Vasvi. "Curracurrong: a stream processing system for distributed environments." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/12861.
Tokmouline, Timur. "A signal oriented stream processing system for pipeline monitoring." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/37106.
Full textIncludes bibliographical references (p. 115-117).
In this thesis, we develop SignalDB, a framework for composing signal processing applications from primitive stream and signal processing operators. SignalDB allows the user to focus on the signal processing task and avoid needlessly spending time on learning a particular application programming interface (API). We use SignalDB to express acoustic and pressure transient methods for water pipeline monitoring as query plans consisting of signal processing operators.
by Timur Tokmouline.
M.Eng.
Robakowski, Mikolaj. "Comparison of State Backends for Modern Stream Processing System." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-290597.
Distributed stream processing is a very popular data paradigm used in various modern computing systems. An important aspect of distributed stream processing systems is how they handle data that is larger than system memory. This is often solved by using a state backend: a database, usually an embedded one, that manages the storage. However, this makes the performance of the whole system dependent on the database's performance for the given workload. Log-structured merge-tree-based solutions are often used in stream processing systems as the backend for all kinds of workloads. We postulate that using different backends for different workloads yields much better performance. In this work we implement several backends for Arcon, a modern stream processing runtime written in Rust and developed at KTH. The thesis covers the implementation process and the backend interface, together with several concrete implementations. We experimentally evaluate the implementations against each other and show that some perform better than others depending on the workload. In particular, we show that under read-heavy workloads, sled, an embedded Bw-Tree database written in Rust, outperforms the commonly used LSM-based RocksDB.
Mousavi, Bamdad. "Scalable Stream Processing and Management for Time Series Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42295.
Balazinska, Magdalena. "Fault-tolerance and load management in a distributed stream processing system." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/35287.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 187-199).
Advances in monitoring technology (e.g., sensors) and an increased demand for online information processing have given rise to a new class of applications that require continuous, low-latency processing of large-volume data streams. These "stream processing applications" arise in many areas such as sensor-based environment monitoring, financial services, network monitoring, and military applications. Because traditional database management systems are ill-suited for high-volume, low-latency stream processing, new systems, called stream processing engines (SPEs), have been developed. Furthermore, because stream processing applications are inherently distributed, and because distribution can improve performance and scalability, researchers have also proposed and developed distributed SPEs. In this dissertation, we address two challenges faced by a distributed SPE: (1) fault-tolerant operation in the face of node failures, network failures, and network partitions, and (2) federated load management. For fault-tolerance, we present a replication-based scheme, called Delay, Process, and Correct (DPC), that masks most node and network failures.
When network partitions occur, DPC addresses the traditional availability-consistency trade-off by maintaining, when possible, a desired availability specified by the application or user, but eventually also delivering the correct results. While maintaining the desired availability bounds, DPC also strives to minimize the number of inaccurate results that must later be corrected. In contrast to previous proposals for fault tolerance in SPEs, DPC simultaneously supports a variety of applications that differ in their preferred trade-off between availability and consistency. For load management, we present a Bounded-Price Mechanism (BPM) that enables autonomous participants to collaboratively handle their load without individually owning the resources necessary for peak operation. BPM is based on contracts that participants negotiate offline. At runtime, participants move load only to partners with whom they have a contract and pay each other the contracted price. We show that BPM provides incentives that foster participation and leads to good system-wide load distribution. In contrast to earlier proposals based on computational economies, BPM is lightweight, enables participants to develop and exploit preferential relationships, and provides stability and predictability.
Although motivated by stream processing, BPM is general and can be applied to any federated system. We have implemented both schemes in the Borealis distributed stream processing engine. They will be available with the next release of the system.
by Magdalena Balazinska.
Ph.D.
Ahmed, Abdulbasit. "Online network intrusion detection system using temporal logic and stream data processing." Thesis, University of Liverpool, 2013. http://livrepository.liverpool.ac.uk/12153/.
Al-Sinayyid, Ali. "JOB SCHEDULING FOR STREAMING APPLICATIONS IN HETEROGENEOUS DISTRIBUTED PROCESSING SYSTEMS." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/dissertations/1868.
Addimando, Alessio. "Progettazione di un intrusion detection system su piattaforma big data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16755/.
Slater, Alicia Adell. "Recovery of community structure and leaf processing in a headwater stream following use of a wetland passive treatment system to abate copper pollution." Thesis, This resource online, 1996. http://scholar.lib.vt.edu/theses/available/etd-08222008-063653/.
Wigent, Mark A., Andrea M. Mazzario, and Scott M. Matsumura. "Use of Multi-Threading, Modern Programming Language, and Lossless Compression in a Dynamic Commutation/Decommutation System." International Foundation for Telemetering, 2011. http://hdl.handle.net/10150/595662.
The Spectrum Efficient Technology Science and Technology (SET S&T) Program is sponsoring the development of the Dynamic Commutation and Decommutation System (DCDS), which optimizes telemetry data transmission in real time. The goal of DCDS is to improve spectrum efficiency - not through improving RF techniques but rather through changing and optimizing contents of the telemetry stream during system test. By allowing the addition of new parameters to the telemetered stream at any point during system test, DCDS removes the need to transmit measured data unless it is actually needed on the ground. When compared to serial streaming telemetry, real time re-formatting of the telemetry stream does require additional processing onboard the test article. DCDS leverages advances in microprocessor technology to perform this processing while meeting size, weight, and power constraints of the test environment. Performance gains of the system have been achieved by significant multi-threading of the application, allowing it to run on modern multi-core processors. Two other enhancing technologies incorporated into DCDS are the Java programming language and lossless compression.
Carter, Bruce, and Troy Scoughton. "Low-cost, short-term development of a high-data-rate, multi-stream, multi-data type telemetry acquisition/processing system using an off-the-shelf integrated Telemetry Front End." International Foundation for Telemetering, 1989. http://hdl.handle.net/10150/614541.
This paper explores the effects the new breed of off-the-shelf integrated telemetry front end (TFE) packages have on the cost and schedule of the development cycle associated with real-time telemetry acquisition/processing systems. A case study of an actual project involving replacement of the Holloman AFB sled track telemetry processing system (TPS) with a system capable of simultaneously supporting up to twenty (20) asynchronous data streams is profiled. Notable among the capabilities of the system are: support for PCM, PAM, FM, IRIG and local time streams; incoming data rates up to 10 Megabits/sec/stream; data logging rates over 16 MegaBytes/sec; and the use of local area networks for distribution of data to real-time displays. To achieve these requirements within a manageable cost/schedule framework, the system was designed around an integrated TFE sub-system. Comparisons are drawn between several aspects of this project's development and that of an earlier developmental system which was completed by PSL within the last 16 months.
Aved, Alexander. "Scene Understanding for Real Time Processing of Queries over Big Data Streaming Video." Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5597.
Ph.D.
Doctorate
Computer Science
Engineering and Computer Science
Computer Science
Braik, William. "Détection d'évènements complexes dans les flux d'évènements massifs." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0596/document.
Pattern detection over streams of events is gaining more and more attention, especially in the field of eCommerce. Our industrial partner Cdiscount, one of the largest eCommerce companies in France, aims to use pattern detection for real-time customer behavior analysis. The main challenges to consider are efficiency and scalability, as the detection of customer behaviors must be achieved within a few seconds, while millions of unique customers visit the website every day, thus producing a large event stream. In this thesis, we present Auros, a system for large-scale and efficient pattern detection for eCommerce. It relies on a domain-specific language to define behavior patterns. Patterns are then compiled into deterministic finite automata, which are run on a Big Data streaming platform. Our evaluation shows that our approach is efficient and scalable, and fits the requirements of Cdiscount.
Kammoun, Abderrahmen. "Enhancing Stream Processing and Complex Event Processing Systems." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSES012.
As more and more connected objects and sensory devices become part of our daily lives, the sea of high-velocity information flow is growing. This massive amount of data, produced at high rates, requires rapid insight to be useful in various applications such as the Internet of Things, health care, energy management, etc. Traditional data storage and processing techniques have proven inefficient for this purpose, which gives rise to Data Stream Management and Complex Event Processing (CEP) systems. This thesis aims to provide optimal solutions for complex and proactive queries. Our proposed techniques, in addition to CPU and memory efficiency, enhance the capabilities of existing CEP systems by adding a predictive feature through real-time learning. The main contributions of this thesis are as follows. We propose various techniques to reduce the CPU and memory requirements of expensive queries, whose operators result in exponential complexity in terms of both CPU and memory; our recomputation-based and heuristic-based algorithms reduce the costs of these operators. These optimizations rely on efficient multidimensional indexing using space-filling curves and on clustering events into batches to reduce the cost of pair-wise joins. We also design a novel predictive CEP system that employs historical information to predict future complex events, for which we propose a compressed index structure, range query processing techniques and an approximate summarizing technique over the historical space. Applying our techniques to the real-world problems presented has produced further customizable solutions that demonstrate the viability of our proposed methods.
Maurer, Simon. "Analysis and coordination of mixed-criticality cyber-physical systems." Thesis, University of Hertfordshire, 2018. http://hdl.handle.net/2299/21094.
Alves, Francisco Marco Morais. "Framework for location based system sustained by mobile phone users." Master's thesis, Universidade de Aveiro, 2017. http://hdl.handle.net/10773/23817.
We live in the age of information and the Internet of Things; never before has information had so much value, and the volume of information exchanged grows day by day. With this amount of data and the computational power available today, real-time data processing tools emerge constantly. A new paradigm also arises because much of this exchanged data carries metadata from which additional knowledge can be extracted when enriched. From a telecommunications operator's point of view, many data flows are exchanged between clients' devices and the Base Transceiver Stations (BTS), such as Radius packets, Call Detail Records (CDR) and Event Detail Records (EDR). These flows mostly serve control and configuration purposes, but in many cases they also contain geographical and temporal information, from which additional knowledge, and therefore additional value for the operator, can be extracted. Using properly anonymized data flows that contain BTS information (e.g. position and distance to the client device), this dissertation presents a scalable and reliable solution that, in a streaming environment, determines the position of mobile network users through triangulation and computes multiple metrics related to geographical areas. Due to external constraints, these data flows had to be simulated. The areas are defined by application users in order to know the entries into and exits from a given area, as well as the time spent inside it. Since the processing is performed in a streaming environment, the solution must be able to recover from failures in a consistent and coherent manner.
Vijayakumar, Nithya Nirmal. "Data management in distributed stream processing systems." [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3278228.
Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6093. Adviser: Beth Plale. Title from dissertation home page (viewed May 9, 2008).
Mei, Haitao. "Real-time stream processing in embedded systems." Thesis, University of York, 2017. http://etheses.whiterose.ac.uk/19750/.
Drougas, Ioannis. "Rate allocation in distributed stream processing systems." Diss., [Riverside, Calif.] : University of California, Riverside, 2008. http://proquest.umi.com/pqdweb?index=0&did=1663077971&SrchMode=2&sid=1&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1268240766&clientId=48051.
Includes abstract. Title from first page of PDF file (viewed March 10, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 93-98). Also issued in print.
Idris, Muhammad. "Real-time Business Intelligence through Compact and Efficient Query Processing Under Updates." Doctoral thesis, Universite Libre de Bruxelles, 2019. https://dipot.ulb.ac.be/dspace/bitstream/2013/284705/5/contratMI.pdf.
Doctorate in Engineering Sciences and Technology
Ren, Xiangnan. "Traitement et raisonnement distribués des flux RDF." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1139/document.
Real-time processing of data streams emanating from sensors is becoming a common task in industrial scenarios. In an Internet of Things (IoT) context, data are emitted from heterogeneous stream sources, i.e., coming from different domains and data models. This requires that IoT applications efficiently handle data integration mechanisms. The processing of RDF data streams has hence become an important research field. This trend enables a wide range of innovative applications where the real-time and reasoning aspects are pervasive. The key implementation goal of such applications consists in efficiently handling massive incoming data streams and supporting advanced data analytics services like anomaly detection. However, a modern RSP engine has to address the volume and velocity characteristics encountered in the Big Data era. In an on-going industrial project, we found that a 24/7 available stream processing engine usually faces massive data volume, dynamically changing data structure and workload characteristics. These facts impact the engine's performance and reliability. To address these issues, we propose Strider, a hybrid adaptive distributed RDF Stream Processing engine that optimizes the logical query plan according to the state of the data streams. Strider has been designed to guarantee important industrial properties such as scalability, high availability, fault tolerance, high throughput and acceptable latency. These guarantees are obtained by designing the engine's architecture with state-of-the-art Apache components such as Spark and Kafka. Moreover, an increasing number of processing jobs executed over RSP engines require reasoning mechanisms, which usually comes at the cost of finding a trade-off between data throughput, latency and the computational cost of expressive inferences. Therefore, we extend Strider to support a real-time RDFS+ (i.e., RDFS + owl:sameAs) reasoning capability. We combine Strider with a query rewriting approach for SPARQL that benefits from an intelligent encoding of the knowledge base. The system is evaluated along different dimensions and over multiple datasets to emphasize its performance. Finally, we have stepped further into exploratory RDF stream reasoning with a fragment of Answer Set Programming. This part of our research work is mainly motivated by the fact that more and more streaming applications require more expressive and complex reasoning tasks. The main challenge is to cope with the large volume and high-velocity dimensions in a scalable and inference-enabled manner. Recent efforts in this area are still missing the aspect of system scalability for stream reasoning. Thus, we aim to explore the ability of modern distributed computing frameworks to process highly expressive knowledge inference queries over Big Data streams. To do so, we consider queries expressed as a positive fragment of LARS (a temporal logic framework based on Answer Set Programming) and propose solutions to process such queries, based on the two main execution models adopted by major parallel and distributed execution frameworks: Bulk Synchronous Parallel (BSP) and Record-at-A-Time (RAT). We implement our solution, named BigSR, and conduct a series of evaluations. Our experiments show that BigSR achieves high throughput, beyond a million triples per second, using a rather small cluster of machines.
Works, Karen E. "Targeted Prioritized Processing in Overloaded Data Stream Systems." Digital WPI, 2013. https://digitalcommons.wpi.edu/etd-dissertations/414.
Bordin, Maycon Viana. "A benchmark suite for distributed stream processing systems." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/163441.
Recently a new application domain characterized by the continuous and low-latency processing of large volumes of data has been gaining attention. The growing number of applications of this kind has led to the creation of Stream Processing Systems (SPSs), systems that abstract the details of real-time applications from the developer. More recently, the ever increasing volumes of data to be processed gave rise to distributed SPSs. There are currently several distributed SPSs on the market; however, the existing benchmarks designed for the evaluation of this kind of system cover only a few applications and workloads, while these systems have a much wider set of applications. In this work a benchmark for stream processing systems is proposed. Based on a survey of papers on real-time and stream applications, the most used applications and areas were outlined, as well as the metrics most used in the performance evaluation of such applications. With this information the metrics of the benchmark were selected, as well as a list of candidate applications to be part of the benchmark. These applications then went through a workload characterization in order to select a diverse set. To ease the evaluation of SPSs, a framework was created with an API to generalize application development and collect metrics, with the possibility of extending it to support other platforms in the future. To prove the usefulness of the benchmark, a subset of the applications was executed on Storm and Spark using the Azure Platform, and the results demonstrated the usefulness of the benchmark suite in comparing these systems.
Martin, André. "Minimizing Overhead for Fault Tolerance in Event Stream Processing Systems." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-210251.
Kautharapu, K. B. "Aqueous two phase systems for down stream processing of proteins." Thesis(Ph.D.), CSIR-National Chemical Laboratory, Pune, 2009. http://dspace.ncl.res.in:8080/xmlui/handle/20.500.12252/2750.
Full textChong, Fong Ho. "Frequency-stream-tying hidden Markov model /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20CHONG.
Includes bibliographical references (leaves 119-123). Also available in electronic version. Access restricted to campus users.
Nehme, Rimma V. "Continuous query processing on spatio-temporal data streams." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-082305-154035/.
Full textVASCONCELOS, RAFAEL OLIVEIRA. "A DYNAMIC LOAD BALANCING MECHANISM FOR DATA STREAM PROCESSING ON DDS SYSTEMS." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2013. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=23629@1.
Full textCOORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE EXCELENCIA ACADEMICA
This thesis presents the Data Processing Slice Load Balancing solution, which enables dynamic load balancing of data stream processing on DDS-based (Data Distribution Service) systems. A large number of applications require continuous and timely processing of high volumes of data originating from many distributed sources, such as network monitoring, traffic engineering systems, intelligent routing of cars in metropolitan areas, sensor networks, telecommunication systems, financial applications and meteorology. The key concept of the proposed solution is the Data Processing Slice (DPS), the basic unit of data processing load of server nodes in a DDS domain. The solution consists of a load balancer, which monitors the current load of a set of homogeneous processing nodes and, when a load imbalance is detected, coordinates actions to redistribute some data processing slices among the processing nodes in a safe way. Experiments with large data streams demonstrate the low overhead, good performance and reliability of the proposed solution.
Bustamante, Fabián Ernesto. "The active streams approach to adaptive distributed applications and services." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/15481.
Sansrimahachai, Watsawee. "Tracing fine-grained provenance in stream processing systems using a reverse mapping method." Thesis, University of Southampton, 2012. https://eprints.soton.ac.uk/337675/.
Weisenseel, Chuck, and David Lane. "SIMULTANEOUS DATA PROCESSING OF MULTIPLE PCM STREAMS ON A PC BASED SYSTEM." International Foundation for Telemetering, 1999. http://hdl.handle.net/10150/608317.
The trend of current data acquisition and recording systems is to capture multiple streams of Pulse Code Modulation (PCM) data on a single medium. The MARS II data recording system manufactured by Datatape, the Asynchronous Realtime Multiplexer and Output Reconstructor (ARMOR) systems manufactured by Calculex, Inc., and other systems on the market today are examples of this technology. The quantity of data recorded by these systems can be impressive, and can cause difficulties in post-test data processing in terms of data storage and turnaround time to the analyst. This paper describes the system currently in use at the Strategic Systems Combined Test Force B-1B division to simultaneously post-flight process up to twelve independent PCM streams at twice real-time speeds. This system is entirely personal computer (PC) based, running the Windows NT 4.0 operating system with an internal ISA bus PCM decommutation card. Each PC is capable of receiving and processing one stream at a time. Therefore, the core of the system is twelve PCs, each with decommutation capability. All PCs are connected via a fast ethernet network hub. The data processed by this system is IRIG 106 Chapter 8 converted MIL-STD-1553B bus data and Chapter 4 Class I and II PCM data. All system operator inputs are via the Distributed Component Object Model (DCOM) provided by Microsoft Developers Studio, Versions 5.0 and 6.0, which allows control and status of multiple data processing PCs from one workstation. All data processing software is written in-house using Visual C++ and Visual Basic.
Fernández, Moctezuma Rafael J. "A Data-Descriptive Feedback Framework for Data Stream Management Systems." PDXScholar, 2012. https://pdxscholar.library.pdx.edu/open_access_etds/116.
Full textChen, Liang. "A grid-based middleware for processing distributed data streams." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1157990530.
Full textPenczek, Frank. "Static guarantees for coordinated components : a statically typed composition model for stream-processing networks." Thesis, University of Hertfordshire, 2012. http://hdl.handle.net/2299/9046.
Full textMartin, André [Verfasser], Christof [Akademischer Betreuer] Fetzer, and Peter [Gutachter] Pietzuch. "Minimizing Overhead for Fault Tolerance in Event Stream Processing Systems / André Martin ; Gutachter: Peter Pietzuch ; Betreuer: Christof Fetzer." Dresden : Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://d-nb.info/1119362482/34.
Full textBüchner, Steffen [Verfasser], Jörg [Gutachter] Nolte, Rolf [Gutachter] Kraemer, and Wolfgang [Gutachter] Schröder-Preikschat. "Applying the stream-processing paradigm to ultra high-speed communication systems / Steffen Büchner ; Gutachter: Jörg Nolte, Rolf Kraemer, Wolfgang Schröder-Preikschat." Cottbus : BTU Cottbus - Senftenberg, 2020. http://d-nb.info/1218080191/34.
Full textLe, Quoc Do. "Approximate Data Analytics Systems." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2018. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-234219.
Full textSree, Kumar Sruthi. "External Streaming State Abstractions and Benchmarking." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291338.
Distributed stream processing is a popular research area and one of the promising paradigms for faster and more efficient data management. Application state is a first-class citizen in almost all stream processing systems; nowadays, stream processing is by definition stateful. In a stream processing application, state backs operations such as aggregations, joins and windows. Apache Flink is one of the most widely accepted and used stream processing systems in industry. One of the main reasons engineers choose Apache Flink to write and deploy continuous applications is its unique combination of flexibility and scalability for stateful programmability, together with the guarantees the system ensures. Apache Flink's guarantees keep its state correct and consistent even when nodes fail or when the number of tasks changes. Flink state can scale up to the disk limits of its compute node by using embedded databases to store and retrieve data. In the state backends officially supported by Flink, however, state is always kept local to the computing tasks. Although this makes deployment more convenient, it creates other challenges, such as non-trivial state configuration and failure recovery, and it requires computation and state to be tightly coupled. This strategy also leads to over-provisioning and is counter-intuitive for state-intensive-only or compute-intensive-only workloads. This thesis investigates an alternative state backend architecture, FlinkNDB, that can address these challenges. FlinkNDB decouples state and compute by using a distributed database to store the state. The thesis covers the challenges of existing state backends, the design choices, and the implementation of the new state backend. We have evaluated the implementation of FlinkNDB against the existing state backends offered by Apache Flink.
Henriksson, Jonas. "Implementation of a real-time Fast Fourier Transform on a Graphics Processing Unit with data streamed from a high-performance digitizer." Thesis, Linköpings universitet, Programvara och system, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-113389.
Full textKlein, Anja. "Datenqualität in Sensordatenströmen." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2010. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-27581.
Full textJohnson, Robert A. "A Comparison Between Two-Dimensional and Three-DimensionalAnalysis, A Review of Horizontal Wood Diaphragms and a Case Study of the Structure Located at 89 Shrewsbury Street, Worcester, MA." Digital WPI, 2008. https://digitalcommons.wpi.edu/etd-theses/524.
Full textSOUSA, Rodrigo Duarte. "Escalonamento adaptativo para sistemas de processamento contínuo de eventos." Universidade Federal de Campina Grande, 2014. http://dspace.sti.ufcg.edu.br:8080/jspui/handle/riufcg/381.
Full textMade available in DSpace on 2018-04-13T17:23:58Z (GMT). No. of bitstreams: 1 RODRIGO DUARTE SOUSA - DISSERTAÇÃO - PPGCC 2014..pdf: 3708263 bytes, checksum: d9e59ec276a62382b6317ec8ce6bf880 (MD5) Previous issue date: 2014-08-04
The usage of event stream processing systems has been growing lately, mainly in applications that require near real-time processing. That need, combined with the high volume of data processed by these applications, increases the dependency on the performance and fault tolerance of such systems. To handle these requirements, schedulers usually monitor resource utilization (such as CPU, RAM, disk and network bandwidth) in an attempt to react to potential overloads that may degrade the application's performance. However, due to different application profiles and components, the complexity of deciding, in a flexible and generic way, which resources should be monitored, and what makes one resource more important than another at a given time, can lead the scheduler to take inadequate actions. In this work, we propose a scheduling algorithm that, through a reactive approach, adapts to different application profiles and loads, taking decisions based on the latency variation of its operators. Periodically, the scheduler evaluates which operators show evidence of being overloaded and then tries to migrate those operators to less loaded nodes. The experiments showed an improvement in system performance: in scenarios with bursty workloads, the operators' average processing latency was reduced by more than 84%, while the number of processed events decreased by only 1.18%.
Seshadri, Sangeetha. "Enhancing availability in large scale." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29715.
Committee Chair: Ling Liu; Committee Member: Brian Cooper; Committee Member: Calton Pu; Committee Member: Douglas Blough; Committee Member: Karsten Schwan. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Kamaleswaran, Rishikesan. "CBPsp: complex business processes for stream processing." Thesis, 2011. http://hdl.handle.net/10155/151.
Full textUOIT
Liu, Rong-Tai (劉榮太). "Stream Processing Engine in the Network Intrusion Detection System." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/84357251864564162409.
National Tsing Hua University
Department of Computer Science
93
With growing Internet connectivity comes evolving opportunities for attackers to unlawfully access computers over the network. Network Intrusion Detection Systems (NIDSes) are designed to identify attacks against networks or a host that are invisible to firewalls, thus providing an additional layer of security. The NIDS aims to detect a wide range of security violations, ranging from attempted break-ins by outsiders to system penetrations and abuses by insiders. Generally two main methods are used for intrusion detection, namely pattern matching and statistical analysis. The former applies a static set of patterns and alerts on traffic sequences with known signatures, while the latter detects anomalous events statistically by gathering protocol header information and comparing this traffic to known attacks, as well as by sensing anomalies. Pattern matching tools are excellent at detecting known attacks, but perform poorly when facing a fresh assault or a modification of an old one. NIDSes that use statistical analysis perform worse at sensing known problems, but much better at reporting unknown assaults. An improved NIDS implementation should combine these two methods to improve network protection. Either way, NIDSes rely on exact string matching of network packet payloads against thousands of intrusion signatures. This dissertation first discusses an efficient and practical mechanism named the FSS (First-Seen SYN) filter, which can mitigate and block SYN flood attacks. It then presents a TCP processing engine which tracks the behavior of each TCP connection, including state transitions, sequence and acknowledgement numbers, and integrity checking. Most importantly, it eliminates the ambiguities that arise when attackers exploit ambiguities in network protocol specifications to deceive network security systems. We then introduce several fast pattern-matching algorithms, since pattern matching is the most computation-intensive task in an NIDS and dominates its performance. Two software-based algorithms and one hardware-based architecture are proposed and shown to be more efficient and higher-performing than existing methodologies.
Balazinska, Magdalena, Hari Balakrishnan, Samuel Madden, and Mike Stonebraker. "Availability-Consistency Trade-Offs in a Fault-Tolerant Stream Processing System." 2004. http://hdl.handle.net/1721.1/30506.
Full textChen, Mei-Hsuan, and 陳美璇. "Data Flow Graph Partitioning for Stream Processing in Multi-FPGA Reconfigurable System." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/05549707611165024588.
National Chiao Tung University
Department of Computer Science
91
Reconfigurable computing offers the computational performance of hardware while keeping the flexibility of a software solution. A multi-FPGA reconfigurable system provides a means of dealing with applications that are too large to fit within a single FPGA but may be partitioned over the multiple FPGAs available. Such systems have a limited number of I/O pins connecting the FPGAs together, and therefore the I/O pins must be used carefully. The objective of this thesis is to exploit the potential throughput of stream processing in a multi-FPGA reconfigurable system. We propose two approaches for scheduling a data flow graph onto the multi-FPGA system. The first method uses the data flow graph to find the ideal FPGA size and connectivity for a multi-FPGA reconfigurable system, and the second approach increases throughput by decreasing the communication overhead in a current multi-FPGA reconfigurable system. In our simulation, we use kernel DSP algorithms as benchmarks. The results are promising.
Carvalho, José Miguel Saramago de. "PhisioStream : a physiology monitoring system using off-the-shelf stream processing frameworks." Master's thesis, 2018. http://hdl.handle.net/10773/25888.
The VR2Market project emerged from a consortium of partners ranging from technology to psychology, including Carnegie Mellon University, United States, under the CMU-Portugal program funded by FCT. The main goal of the project is to provide a solution for monitoring teams of operatives in high-risk professions, First Responders, with respect to both environmental and physiological aspects. However, the current solution does not offer cloud support and is mostly composed of ad hoc components, which hinders its evolution towards more distributed solutions. The goal of the present work is to refactor VR2Market to offer cloud support, based on a more extensible architecture that enables data processing and visualization without compromising the existing functionality. The chosen approach relies on stream processing and off-the-shelf solutions typically used for log management and monitoring tasks. Stream processing built on Apache Kafka proved to be a good approach to handle and process pre-existing data as well as to create simple alarms over some parameters. This processing capability can be raised to more complex levels of analytics, namely through tools such as Apache Spark or Storm, without compromising the rest of the architecture. Treating the data as a stream also allowed the integration of off-the-shelf tools that enable continuous visualization of the data over time. By combining these two approaches, it was possible to visualize and process data in a dynamic and flexible way, both over pre-existing data and over data arriving at the system. A Docker-container-based approach was adopted, which not only simplifies the installation of the system but also leads to a fully cloud-enabled solution. Although directly related to the VR2Market context, by its nature our architecture can easily be adapted to other kinds of scenarios. Moreover, the integration of new types of sensors can now be done more easily.
Master's in Computer and Telematics Engineering