Doctoral dissertations on “Distributed Stream Processing Systems”
Browse the top 50 doctoral dissertations on “Distributed Stream Processing Systems”.
Vijayakumar, Nithya Nirmal. "Data management in distributed stream processing systems". [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3278228.
Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6093. Adviser: Beth Plale. Title from dissertation home page (viewed May 9, 2008).
Drougas, Ioannis. "Rate allocation in distributed stream processing systems". Diss., [Riverside, Calif.] : University of California, Riverside, 2008. http://proquest.umi.com/pqdweb?index=0&did=1663077971&SrchMode=2&sid=1&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1268240766&clientId=48051.
Includes abstract. Title from first page of PDF file (viewed March 10, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 93-98). Also issued in print.
Bordin, Maycon Viana. "A benchmark suite for distributed stream processing systems". Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/163441.
Recently, a new application domain characterized by the continuous, low-latency processing of large volumes of data has been gaining attention. The growing number of such applications has led to the creation of Stream Processing Systems (SPSs), systems that abstract the details of real-time applications from the developer. More recently, the ever-increasing volumes of data to be processed gave rise to distributed SPSs. Several distributed SPSs are currently on the market; however, the existing benchmarks designed to evaluate this kind of system cover only a few applications and workloads, while these systems have a much wider set of applications. In this work, a benchmark for stream processing systems is proposed. Based on a survey of several papers on real-time and stream applications, the most used applications and areas were outlined, as well as the most used metrics in the performance evaluation of such applications. With this information, the metrics of the benchmark were selected, as well as a list of possible applications to be part of the benchmark. These went through a workload characterization in order to select a diverse set of applications. To ease the evaluation of SPSs, a framework was created with an API to generalize application development and collect metrics, with the possibility of extending it to support other platforms in the future. To prove the usefulness of the benchmark, a subset of the applications was executed on Storm and Spark using the Azure Platform, and the results demonstrated the usefulness of the benchmark suite in comparing these systems.
Kakkad, Vasvi. "Curracurrong: a stream processing system for distributed environments". Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/12861.
Al-Sinayyid, Ali. "JOB SCHEDULING FOR STREAMING APPLICATIONS IN HETEROGENEOUS DISTRIBUTED PROCESSING SYSTEMS". OpenSIUC, 2020. https://opensiuc.lib.siu.edu/dissertations/1868.
Balazinska, Magdalena. "Fault-tolerance and load management in a distributed stream processing system". Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/35287.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 187-199).
Advances in monitoring technology (e.g., sensors) and an increased demand for online information processing have given rise to a new class of applications that require continuous, low-latency processing of large-volume data streams. These "stream processing applications" arise in many areas such as sensor-based environment monitoring, financial services, network monitoring, and military applications. Because traditional database management systems are ill-suited for high-volume, low-latency stream processing, new systems, called stream processing engines (SPEs), have been developed. Furthermore, because stream processing applications are inherently distributed, and because distribution can improve performance and scalability, researchers have also proposed and developed distributed SPEs. In this dissertation, we address two challenges faced by a distributed SPE: (1) fault-tolerant operation in the face of node failures, network failures, and network partitions, and (2) federated load management. For fault-tolerance, we present a replication-based scheme, called Delay, Process, and Correct (DPC), that masks most node and network failures.
When network partitions occur, DPC addresses the traditional availability-consistency trade-off by maintaining, when possible, a desired availability specified by the application or user, but eventually also delivering the correct results. While maintaining the desired availability bounds, DPC also strives to minimize the number of inaccurate results that must later be corrected. In contrast to previous proposals for fault tolerance in SPEs, DPC simultaneously supports a variety of applications that differ in their preferred trade-off between availability and consistency. For load management, we present a Bounded-Price Mechanism (BPM) that enables autonomous participants to collaboratively handle their load without individually owning the resources necessary for peak operation. BPM is based on contracts that participants negotiate offline. At runtime, participants move load only to partners with whom they have a contract and pay each other the contracted price. We show that BPM provides incentives that foster participation and leads to good system-wide load distribution. In contrast to earlier proposals based on computational economies, BPM is lightweight, enables participants to develop and exploit preferential relationships, and provides stability and predictability.
Although motivated by stream processing, BPM is general and can be applied to any federated system. We have implemented both schemes in the Borealis distributed stream processing engine. They will be available with the next release of the system.
by Magdalena Balazinska.
Ph.D.
Bustamante, Fabián Ernesto. "The active streams approach to adaptive distributed applications and services". Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/15481.
Penczek, Frank. "Static guarantees for coordinated components : a statically typed composition model for stream-processing networks". Thesis, University of Hertfordshire, 2012. http://hdl.handle.net/2299/9046.
Chen, Liang. "A grid-based middleware for processing distributed data streams". Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1157990530.
Sree, Kumar Sruthi. "External Streaming State Abstractions and Benchmarking". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291338.
Distributed data stream processing is a popular research area and one of the promising paradigms for faster and more efficient data management. Application state is a first-class citizen in nearly all stream processing systems. Nowadays, stream processing is by definition stateful. In a stream processing application, state backs operations such as aggregations, joins, and windows. Apache Flink is one of the most accepted and widely used stream processing systems in the industry. One of the main reasons engineers choose Apache Flink to write and deploy continuous applications is its unique combination of flexibility and scalability for stateful programmability, together with the firm guarantees the system provides. Apache Flink's guarantees keep its state correct and consistent even when nodes fail or when the number of tasks changes. Flink state can scale up to the hard disk limits of its compute node by using embedded databases to store and retrieve data. In all the state backends officially supported by Flink, however, state is always available locally to the compute tasks. Although this makes deployment more convenient, it creates other challenges such as non-trivial state reconfiguration and failure recovery. At the same time, compute and state must be tightly coupled. This strategy also leads to overuse of resources and is counterintuitive for workloads that are only state-intensive or only compute-intensive. This thesis investigates an alternative state backend architecture, FlinkNDB, which can address these challenges. FlinkNDB decouples state and compute by using a distributed database to store the state. The thesis covers the challenges of existing state backends, the design choices, and the new state backend implementation. We have evaluated the implementation of FlinkNDB against the existing state backends offered by Apache Flink.
Braik, William. "Détection d'évènements complexes dans les flux d'évènements massifs". Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0596/document.
Pattern detection over streams of events is gaining more and more attention, especially in the field of eCommerce. Our industrial partner Cdiscount, which is one of the largest eCommerce companies in France, aims to use pattern detection for real-time customer behavior analysis. The main challenges to consider are efficiency and scalability, as the detection of customer behaviors must be achieved within a few seconds, while millions of unique customers visit the website every day, thus producing a large event stream. In this thesis, we present Auros, a system for large-scale and efficient pattern detection for eCommerce. It relies on a domain-specific language to define behavior patterns. Patterns are then compiled into deterministic finite automata, which are run on a Big Data streaming platform. Our evaluation shows that our approach is efficient and scalable, and fits the requirements of Cdiscount.
Liu, Ying. "Query optimization for distributed stream processing". [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3274258.
Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4597. Adviser: Beth Plale. Title from dissertation home page (viewed Apr. 21, 2008).
Newton, Ryan Rhodes 1980. "Language design for distributed stream processing". Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/46795.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 149-152).
Applications that combine live data streams with embedded, parallel, and distributed processing are becoming more commonplace. WaveScript is a domain-specific language that brings high-level, type-safe, garbage-collected programming to these domains. This is made possible by three primary implementation techniques, each of which leverages characteristics of the streaming domain. First, WaveScript employs an evaluation strategy that uses a combination of interpretation and reification to partially evaluate programs into stream dataflow graphs. Second, we use profile-driven compilation to enable many optimizations that are normally only available in the synchronous (rather than asynchronous) dataflow domain. Finally, an empirical, profile-driven approach also allows us to compute practical partitions of dataflow graphs, spreading them across embedded nodes and more powerful servers. We have used our language to build and deploy applications, including a sensor-network for the acoustic localization of wild animals such as the Yellow-Bellied marmot. We evaluate WaveScript's performance on this application, showing that it yields good performance on both embedded and desktop-class machines. Our language allowed us to implement the application rapidly, while outperforming a previous C implementation by over 35%, using fewer than half the lines of code. We evaluate the contribution of our optimizations to this success. We also evaluate WaveScript's ability to extract parallelism from this and other applications.
by Ryan Rhodes Newton.
Ph.D.
Ren, Xiangnan. "Traitement et raisonnement distribués des flux RDF". Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1139/document.
Real-time processing of data streams emanating from sensors is becoming a common task in industrial scenarios. In an Internet of Things (IoT) context, data are emitted from heterogeneous stream sources, i.e., coming from different domains and data models. This requires that IoT applications efficiently handle data integration mechanisms. The processing of RDF data streams hence became an important research field. This trend enables a wide range of innovative applications where the real-time and reasoning aspects are pervasive. The key implementation goal of such applications consists in efficiently handling massive incoming data streams and supporting advanced data analytics services like anomaly detection. However, a modern RSP engine has to address the volume and velocity characteristics encountered in the Big Data era. In an ongoing industrial project, we found that a 24/7 available stream processing engine usually faces massive data volume and dynamically changing data structure and workload characteristics. These facts impact the engine's performance and reliability. To address these issues, we propose Strider, a hybrid adaptive distributed RDF Stream Processing engine that optimizes the logical query plan according to the state of data streams. Strider has been designed to guarantee important industrial properties such as scalability, high availability, fault tolerance, high throughput and acceptable latency. These guarantees are obtained by designing the engine's architecture with state-of-the-art Apache components such as Spark and Kafka. Moreover, an increasing number of processing jobs executed over RSP engines require reasoning mechanisms. This usually comes at the cost of finding a trade-off between data throughput, latency and the computational cost of expressive inferences. Therefore, we extend Strider to support real-time RDFS+ (i.e., RDFS + owl:sameAs) reasoning capability.
We combine Strider with a query rewriting approach for SPARQL that benefits from an intelligent encoding of the knowledge base. The system is evaluated along different dimensions and over multiple datasets to emphasize its performance. Finally, we have stepped further to exploratory RDF stream reasoning with a fragment of Answer Set Programming. This part of our research work is mainly motivated by the fact that more and more streaming applications require more expressive and complex reasoning tasks. The main challenge is to cope with the large volume and high-velocity dimensions in a scalable and inference-enabled manner. Recent efforts in this area are still missing the aspect of system scalability for stream reasoning. Thus, we aim to explore the ability of modern distributed computing frameworks to process highly expressive knowledge inference queries over Big Data streams. To do so, we consider queries expressed as a positive fragment of LARS (a temporal logic framework based on Answer Set Programming) and propose solutions to process such queries, based on the two main execution models adopted by major parallel and distributed execution frameworks: Bulk Synchronous Parallel (BSP) and Record-at-A-Time (RAT). We implement our solution, named BigSR, and conduct a series of evaluations. Our experiments show that BigSR achieves high throughput beyond a million triples per second using a rather small cluster of machines.
Kammoun, Abderrahmen. "Enhancing Stream Processing and Complex Event Processing Systems". Thesis, Lyon, 2019. http://www.theses.fr/2019LYSES012.
As more and more connected objects and sensory devices are becoming part of our daily lives, the sea of high-velocity information flow is growing. This massive amount of data produced at high rates requires rapid insight to be useful in various applications such as the Internet of Things, health care, energy management, etc. Traditional data storage and processing techniques have proven inefficient. This gives rise to Data Stream Management and Complex Event Processing (CEP) systems. This thesis aims to provide optimal solutions for complex and proactive queries. Our proposed techniques, in addition to CPU and memory efficiency, enhance the capabilities of existing CEP systems by adding a predictive feature through real-time learning. The main contributions of this thesis are as follows: We proposed various techniques to reduce the CPU and memory requirements of expensive queries. These operators result in exponential complexity both in terms of CPU and memory. Our proposed recomputation and heuristic-based algorithms reduce the costs of these operators. These optimizations are based on enabling efficient multidimensional indexing using space-filling curves and on clustering events into batches to reduce the cost of pair-wise joins. We designed a novel predictive CEP system that employs historical information to predict future complex events. We proposed a compressed index structure, range query processing techniques and an approximate summarizing technique over the historical space. The applicability of our techniques to the real-world problems presented has produced further customizable solutions that demonstrate the viability of our proposed methods.
Mei, Haitao. "Real-time stream processing in embedded systems". Thesis, University of York, 2017. http://etheses.whiterose.ac.uk/19750/.
Argile, Andrew Duncan Stuart. "Distributed processing in decision support systems". Thesis, Nottingham Trent University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.259647.
Unnava, Vasundhara. "Query processing in distributed database systems". Connect to resource, 1992. http://rave.ohiolink.edu/etdc/view.cgi?acc%5Fnum=osu1261314105.
Harel, Nissim. "Memory Optimizations for Distributed Stream-based Applications". Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/13988.
Zhou, Wanlei, and mikewood@deakin edu au. "Building reliable distributed systems". Deakin University. School of Computing and Mathematics, 2001. http://tux.lib.deakin.edu.au./adt-VDU/public/adt-VDU20051017.160921.
Gunaseelan, L. "Debugging of Distributed object systems". Diss., Georgia Institute of Technology, 1994. http://hdl.handle.net/1853/9219.
Works, Karen E. "Targeted Prioritized Processing in Overloaded Data Stream Systems". Digital WPI, 2013. https://digitalcommons.wpi.edu/etd-dissertations/414.
Navaratnam, Srivallipuranandan. "Reliable group communication in distributed systems". Thesis, University of British Columbia, 1987. http://hdl.handle.net/2429/26505.
Science, Faculty of
Computer Science, Department of
Graduate
孫昱東 and Yudong Sun. "A distributed object model for solving irregularly structured problems on distributed systems". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31243630.
Fukuzono, Hayato. "Spatial Signal Processing on Distributed MIMO Systems". 京都大学 (Kyoto University), 2016. http://hdl.handle.net/2433/217206.
Al-Bassiouni, Abdel-Aziz Mahmoud. "Optimum signal processing in distributed sensor systems". Thesis, Monterey, California: U.S. Naval Postgraduate School, 1987. http://hdl.handle.net/10945/22401.
We consider the problem of detection of known signals in noise using quantized, discrete sensor observations. Optimal design of the quantizers at the sensor sites as well as the global fusion of the quantized observations is presented. Also the equivalence between a team of two sensors and their fusion centre and another team of a primary decision maker and a second opinion is shown. Since the fusion of information is a main pillar of the thesis, an early chapter is devoted to the optimum fusion policy. Extension of the results to the case of vector sensor observations is also considered.
Wong, Kar Leong. "A message controller for distributed processing systems". Thesis, Nottingham Trent University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.312309.
Millar, Dean Lee. "Parallel distributed processing in rock engineering systems". Thesis, Imperial College London, 2008. http://hdl.handle.net/10044/1/37116.
Andersson, Sara. "Data Processing and Collection in Distributed Systems". Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-85313.
Distributed systems can be seen in a wide variety of applications in use today. Tritech works with several products that consist, in part, of distributed systems of nodes. What these systems have in common is that the nodes collect data, and this data will need to be processed in one way or another. A question that often needs to be answered when setting up the architecture for such projects is how the data should be processed, i.e., which architecture configuration is most suitable for the system. Making these decisions has proven not always to be straightforward, and the answer changes relatively quickly with developments in these areas. This thesis aims to study which factors affect the choice of architecture for a distributed system and how these factors relate to each other. To analyze which factors affect the choice of architecture, and to what extent, a simulator was implemented. The simulator took the factors as input and returned one or more architecture configurations as output. The factors for the simulator were chosen through qualitative interviews. The factors analyzed in this thesis were: security, storage, working memory, data size, number of nodes, data processing per amount of data, robust communication, battery consumption, and cost. Five architecture configurations were also chosen from the qualitative interviews and the pre-study. The chosen architectures were: thin-client server, thick-client server, three-tier client-server, peer-to-peer, and cloud computing. The simulator was validated within the three given use cases: agriculture, the train industry, and industrial IoT. The validation consisted of five existing projects from Tritech. In the validation, the simulator produced correct results for three of the five projects.
From the simulator's results, it could be seen which factors had a greater effect on the choice of architecture and are difficult to combine in a single architecture configuration. These factors were security together with working memory and robust communication. Working memory together with battery consumption also proved difficult to combine in the same architecture configuration. Therefore, according to the simulator, the factors that affect the choice of architecture were working memory, battery consumption, security, and robust communication. Using the simulator's results, a decision matrix was designed to facilitate the choice of architecture. The evaluation of the decision matrix consisted of four projects from Tritech covering the three given use cases: agriculture, the train industry, and industrial IoT. The evaluation showed that, of the two architectures that received the highest scores, one was the architecture used in the validated project.
Xia, Yu S. M. Massachusetts Institute of Technology. "Logical timestamps in distributed transaction processing systems". Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/122877.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 73-79).
Distributed transactions are transactions with remote data access. They usually suffer from high network latency (compared to the internal overhead) during data operations on remote data servers, which lengthens the entire transaction execution time. This increases the probability of conflicting with other transactions, causing high abort rates. This, in turn, causes poor performance. In this work, we constructed Sundial, a distributed concurrency control algorithm that applies logical timestamps seamlessly with a cache protocol, and works in a hybrid fashion where an optimistic approach is combined with lock-based schemes. Sundial tackles the inefficiency problem in two ways. Firstly, Sundial decides the order of transactions on the fly. Transactions get their commit timestamp according to their data access traces. Each data item in the database has logical leases maintained by the system. A lease corresponds to a version of the item. At any logical time point, only a single transaction holds the 'lease' for any particular data item. Therefore, lease holders do not have to worry about someone else writing to the item because in the logical timeline, the data writer needs to acquire a new lease which is disjoint from the holder's. This lease information is used to calculate the logical commit time for transactions. Secondly, Sundial has a novel caching scheme that works together with logical leases. The scheme allows the local data server to automatically cache data from the remote server while preserving data coherence. We benchmarked Sundial along with state-of-the-art distributed transactional concurrency control protocols. On YCSB, Sundial outperforms the second best protocol by 57% under high data access contention. On TPC-C, Sundial has a 34% improvement over the state-of-the-art candidate. Our caching scheme has performance gain comparable with hand-optimized data replication. With high access skew, it speeds up the workload by up to 4.6x.
"This work was supported (in part) by the U.S. National Science Foundation (CCF-1438955)"
by Yu Xia.
S.M.
S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Bernabéu-Aubán, José Manuel. "Location finding algorithms for distributed systems". Diss., Georgia Institute of Technology, 1988. http://hdl.handle.net/1853/32951.
Bennett, John K. "Distributed Smalltalk : inheritance and reactiveness in distributed systems /". Thesis, Connect to this title online; UW restricted, 1988. http://hdl.handle.net/1773/6923.
Cannalire, Pietro. "Geo-distributed multi-layer stream aggregation". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230217.
Standard processing architectures are sufficient to meet the needs of many applications through the use of existing stream processing frameworks that support distributed data processing. In specific cases, geographically distributed data sources may require that data processing be spread over a large area by means of a geographically distributed architecture. The problem addressed in this work is the reduction of continuous data transfer in a network with a geo-distributed architecture. Reduced data transfer can be decisive for lowering bandwidth costs, since access to links located in the middle of a network can be expensive and grows further with increasing data transfer. In this work, we aim to create a new concept for establishing geographically distributed architectures using Apache Spark Structured Streaming and Apache Kafka. The features and prerequisites needed for an algorithm to run on a geographically distributed architecture are provided. The algorithms to be run on this architecture apply “windowing synopsing” and “data synopses” techniques to produce a summary of the input data and to handle problems related to the geographically distributed architecture. Computation of the average and the Misra-Gries algorithm are implemented to test the constructed architecture. This thesis contributes a new model for building geographically distributed architectures. Experimental results show that computation time is reduced by 70% on average for the algorithms running on top of the geo-distributed architecture compared to the distributed configuration. Similarly, the amount of data exchanged over the network is reduced by 99% on average compared to the distributed setup.
Gater, Christian. "Fault-tolerant distributed measurement systems". Thesis, University of Edinburgh, 1987. http://hdl.handle.net/1842/16990.
Laitala, J. (Joni). "Metadata management in distributed file systems". Bachelor's thesis, University of Oulu, 2017. http://urn.fi/URN:NBN:fi:oulu-201709092881.
Khalidi, M. Yousef Amin. "Hardware support for distributed object-based systems". Diss., Georgia Institute of Technology, 1989. http://hdl.handle.net/1853/8192.
張立新 and Lap-sun Cheung. "Load balancing in distributed object computing systems". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31224179.
VASCONCELOS, RAFAEL OLIVEIRA. "AN EFFICIENT APPROACH TO COORDINATED RECONFIGURATION IN DISTRIBUTED DATA STREAM SYSTEMS". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2017. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=30660@1.
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
While many data stream systems have to provide continuous (24x7) services with no acceptable downtime, they also have to cope with changes in their execution environments and in the requirements that they must comply with (e.g., moving from an on-premises architecture to a cloud system, changing the network technology, adding new functionality or modifying existing parts). On the one hand, dynamic software reconfiguration (i.e., the capability of evolving on the fly) is a desirable feature. On the other hand, stream systems may suffer from the disruption and overhead caused by the reconfiguration. Because the system must be reconfigured (i.e., evolved) while it must not be disrupted (i.e., blocked), consistent and non-disruptive reconfiguration is still considered an open problem. This thesis presents and validates a non-quiescent approach for dynamic software reconfiguration that preserves the consistency of distributed data stream processing systems. Unlike many works that require the system to reach a safe state (e.g., quiescence) before performing a reconfiguration, the proposed approach enables the system to smoothly evolve (i.e., be reconfigured) in a non-disruptive way without reaching quiescence. The evaluation indicates that the proposed approach supports consistent distributed reconfiguration and has negligible impact on availability and performance. Furthermore, the implementation of the proposed approach showed better performance results in all experiments than the quiescent approach and Upstart.
Reale, Andrea <1986>. "Quality of Service in Distributed Stream Processing for large scale Smart Pervasive Environments". Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6390/1/main.pdf.
Reale, Andrea <1986>. "Quality of Service in Distributed Stream Processing for large scale Smart Pervasive Environments". Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6390/.
Kotto, Kombi Roland. "Distributed query processing over fluctuating streams". Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEI050/document.
In a Big Data context, stream processing has become a very active research domain. In order to manage ephemeral data (Velocity) arriving at high rates (Volume), specific solutions, called data stream management systems (DSMSs), have been developed. DSMSs take as input queries, called continuous queries, defined on a set of data streams. A continuous query generates new results as long as new data arrive as input. In many application domains, data streams have input rates and value distributions that change over time. These variations may significantly affect the processing requirements of each continuous query. This thesis takes place in the ANR project Socioplug (ANR-13-INFR-0003). In this context, we consider a collaborative platform for stream processing. Each user can submit multiple continuous queries and contributes to the execution support of the platform. However, as each processing unit supporting treatments has limited resources in terms of CPU and memory, a significant increase in input rate may cause congestion of the system. The problem is then how to dynamically adjust resource usage to the processing requirements of each continuous query. This raises several challenges: i) how to detect a need for reconfiguration? ii) when to reconfigure the system to avoid congestion at runtime? In this work, we are interested in the different processing steps involved in the treatment of a continuous query over a distributed infrastructure. From this global analysis, we extract mechanisms enabling dynamic adaptation of resource usage for each continuous query. We focus on automatic parallelization, or auto-parallelization, of the operators composing the execution plan of a continuous query. We suggest an original approach based on the monitoring of operators and an estimation of processing requirements in the near future.
Thus, we can increase (scale-out) or decrease (scale-in) the parallelism degree of operators in a proactive manner, so that resource usage fits processing requirements dynamically. Compared to a static configuration defined by an expert, we show that it is possible to avoid congestion of the system in many cases, or to delay it in the most critical cases. Moreover, we show that resource usage can be reduced significantly while delivering equivalent throughput and result quality. We also suggest combining this approach with complementary mechanisms for dynamic adaptation of continuous queries at runtime. These different approaches have been implemented within a widely used DSMS and have been tested over multiple reproducible micro-benchmarks.
Kohli, Prince. "User-level state sharing in distributed systems". Diss., Georgia Institute of Technology, 1996. http://hdl.handle.net/1853/9170.
Brito, Andrey. "Speculation in Parallel and Distributed Event Processing Systems". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2010. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-38911.
Bhasker, Bharat. "Query processing in heterogeneous distributed database management systems". Diss., Virginia Tech, 1992. http://hdl.handle.net/10919/39437.
Ph. D.
Elmagarmid, Ahmed Khalifa. "Deadlock detection and resolution in distributed processing systems /". The Ohio State University, 1985. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487261919110166.
DI, SALVO ANDREA. "CMOS distributed signal processing systems for radiation sensors". Doctoral thesis, Politecnico di Torino, 2022. http://hdl.handle.net/11583/2957742.
Ebrahimian, Mohammad Reza. "Power system operations : state estimation distributed processing /". Digital version accessible at:, 1999. http://wwwlib.umi.com/cr/utexas/main.
Peiro, Sajjad Hooman. "Towards Unifying Stream Processing over Central and Near-the-Edge Data Centers". Licentiate thesis, KTH, Programvaruteknik och Datorsystem, SCS, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-193582.
QC 20161005
Sun, Yudong. "A distributed object model for solving irregularly structured problems on distributed systems /". Hong Kong : University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B23501662.
Martin, André. "Minimizing Overhead for Fault Tolerance in Event Stream Processing Systems". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-210251.