Dissertations / Theses on the topic 'Large Scale Applications Implementing'

To see the other types of publications on this topic, follow the link: Large Scale Applications Implementing.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Large Scale Applications Implementing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses across a wide variety of disciplines and organise your bibliography correctly.

1

Smaragdakis, Ioannis. "Implementing large-scale object-oriented components." Digital version accessible at: http://wwwlib.umi.com/cr/utexas/main, 1999.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Martínez, Trujillo Andrea. "Dynamic Tuning for Large-Scale Parallel Applications." Doctoral thesis, Universitat Autònoma de Barcelona, 2013. http://hdl.handle.net/10803/125872.

Full text
Abstract:
The current large-scale computing era is characterised by parallel applications running on many thousands of cores. However, the performance obtained when executing these applications is not always what is expected. Dynamic tuning is a powerful technique which can be used to reduce the gap between the real and expected performance of parallel applications. Currently, the majority of approaches that offer dynamic tuning follow a centralised scheme, where a single analysis module, responsible for controlling the entire parallel application, can become a bottleneck in large-scale contexts. The main contribution of this thesis is a novel model that enables decentralised dynamic tuning of large-scale parallel applications. Application decomposition and an abstraction mechanism are the two key concepts which support this model. The decomposition allows a parallel application to be divided into disjoint subsets of tasks which are analysed and tuned separately. Meanwhile, the abstraction mechanism permits these subsets to be viewed as a single virtual application so that global performance improvements can be achieved. The model is designed as a hierarchical tuning network of distributed analysis modules. The topology of this tuning network can be configured to accommodate the size of the parallel application and the complexity of the tuning strategy being employed. It is from this adaptability that the model's scalability arises. To fully exploit this adaptable topology, this work proposes a method that calculates tuning network topologies composed of the minimum number of analysis modules required to provide effective dynamic tuning. The proposed model has been implemented in the form of ELASTIC, an environment for large-scale dynamic tuning. ELASTIC presents a plugin architecture, which allows different performance analysis and tuning strategies to be applied. Using ELASTIC, an experimental evaluation has been carried out on a synthetic and a real parallel application. The results show that the proposed model, embodied in ELASTIC, is able not only to scale to meet the demands of dynamically tuning thousands of processes, but also to effectively improve the performance of these applications.
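To make the idea of sizing a hierarchical tuning network concrete, the sketch below builds a tree of analysis modules bottom-up under the simplifying assumption that each module can manage at most `fan_out` children (a stand-in for the application size and tuning-strategy complexity discussed above). It is illustrative only and not the topology-calculation method proposed in the thesis; the function and parameter names are invented.

```python
import math

def tuning_network_topology(num_tasks: int, fan_out: int):
    """Build a hierarchical tuning-network topology bottom-up.

    Each analysis module manages at most `fan_out` children (application tasks
    at the lowest level, analysis modules above).  Returns the number of
    modules per level, root level last.
    """
    levels = []
    nodes = num_tasks
    while nodes > 1:
        modules = math.ceil(nodes / fan_out)   # minimum modules for this level
        levels.append(modules)
        nodes = modules                        # these modules become children of the next level
    return levels

# Example: 4096 MPI tasks, each module able to analyse up to 64 children.
print(tuning_network_topology(4096, 64))       # -> [64, 1]
```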
APA, Harvard, Vancouver, ISO, and other styles
3

Dacosta, Italo. "Practical authentication in large-scale internet applications." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44863.

Full text
Abstract:
Due to their massive user base and request load, large-scale Internet applications have mainly focused on goals such as performance and scalability. As a result, many of these applications rely on weaker but more efficient and simpler authentication mechanisms. However, as recent incidents have demonstrated, powerful adversaries are exploiting the weaknesses in such mechanisms. While more robust authentication mechanisms exist, most of them fail to address the scale and security needs of these large-scale systems. In this dissertation we demonstrate that by taking into account the specific requirements and threat model of large-scale Internet applications, we can design authentication protocols for such applications that are not only more robust but also have low impact on performance, scalability and existing infrastructure. In particular, we show that there is no inherent conflict between stronger authentication and other system goals. For this purpose, we have designed, implemented and experimentally evaluated three robust authentication protocols: Proxychain, for SIP-based VoIP authentication; One-Time Cookies (OTC), for Web session authentication; and Direct Validation of SSL/TLS Certificates (DVCert), for server-side SSL/TLS authentication. These protocols not only offer better security guarantees, but they also have low performance overheads and do not require additional infrastructure. In so doing, we provide robust and practical authentication mechanisms that can improve the overall security of large-scale VoIP and Web applications.
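As an illustration of the kind of lightweight per-request authentication the abstract refers to, the following sketch binds an HMAC token to a single request using a per-session secret. It is a simplified stand-in: the names (`make_token`, `verify_token`) and the exact fields covered are assumptions, not the actual One-Time Cookies construction evaluated in the dissertation.

```python
import hmac, hashlib, os, time

def issue_session_key() -> bytes:
    """Server-side secret established once at login and stored with the session."""
    return os.urandom(32)

def make_token(session_key: bytes, session_id: str, method: str, url: str) -> str:
    """Bind a token to one specific request so a captured token cannot be replayed elsewhere."""
    ts = str(int(time.time()))
    msg = "|".join([session_id, method, url, ts]).encode()
    mac = hmac.new(session_key, msg, hashlib.sha256).hexdigest()
    return f"{ts}:{mac}"

def verify_token(session_key: bytes, session_id: str, method: str, url: str,
                 token: str, max_age: int = 60) -> bool:
    ts, mac = token.split(":")
    if int(time.time()) - int(ts) > max_age:        # reject stale tokens
        return False
    expected = hmac.new(session_key,
                        "|".join([session_id, method, url, ts]).encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected)
```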
APA, Harvard, Vancouver, ISO, and other styles
4

Roy, Yagnaseni. "Modeling nanofiltration for large scale desalination applications." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100096.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 91-94).
The Donnan Steric Pore Model with dielectric exclusion (DSPM-DE) is implemented over flat-sheet and spiral-wound leaves to develop a comprehensive model for nanofiltration modules. This model allows the user to gain insight into the physics of the nanofiltration process by allowing one to adjust and investigate the effects of membrane charge, pore radius, and other membrane characteristics. The study shows how operating conditions such as feed flow rate and pressure affect the recovery ratio and solute rejection across the membrane. A comparison is made between the results for the flat-sheet and spiral-wound configurations. The comparison showed that for the spiral-wound leaf, the maximum values of transmembrane pressure, flux and velocity occur at the feed entrance (near the permeate exit), and the lowest values of these quantities occur at the diametrically opposite corner. This is in contrast to the flat-sheet leaf, where all the quantities vary only in the feed flow direction. However, it is found that the extent of variation of these quantities along the permeate flow direction in the spiral-wound membrane is negligibly small in most cases. Also, for identical geometries and operating conditions, the flat-sheet and spiral-wound configurations give similar results. Thus the computationally expensive and complex spiral-wound model can be replaced by the flat-sheet model for a variety of purposes. In addition, the model was used to predict the performance of a seawater nanofiltration system and was validated against data obtained from a large-scale seawater desalination plant, thereby establishing a reliable model for desalination using nanofiltration.
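For reference, the ion transport at the core of DSPM-DE-type models is usually expressed with the extended Nernst-Planck equation; a standard textbook form (notation may differ slightly from the thesis) is:

```latex
% Flux of ion i inside the membrane pore: diffusion + electromigration + convection
j_i = -D_{i,p}\,\frac{dc_i}{dx}
      \;-\; \frac{z_i\, c_i\, D_{i,p}\, F}{R\,T}\,\frac{d\psi}{dx}
      \;+\; K_{i,c}\, c_i\, J_v
```

where D_{i,p} is the hindered pore diffusivity, z_i the ion valence, psi the electric potential inside the pore, K_{i,c} the convective hindrance factor, and J_v the volumetric permeate flux.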
by Yagnaseni Roy.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
5

Huang, Jen-Cheng. "Efficient simulation techniques for large-scale applications." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53963.

Full text
Abstract:
Architecture simulation is an important performance modeling approach. Modeling hardware components with sufficient detail helps architects identify both hardware and software bottlenecks. However, the major issue with architectural simulation is the huge slowdown compared to native execution. The slowdown becomes higher for emerging workloads that feature high throughput and massive parallelism, such as GPGPU kernels. In this dissertation, three simulation techniques are proposed to simulate emerging GPGPU kernels and data analytic workloads efficiently. First, TBPoint reduces the number of simulated instructions of GPGPU kernels using inter-launch and intra-launch sampling approaches. Second, GPUmech improves the simulation speed of GPGPU kernels by abstracting the simulation model using functional simulation and analytical modeling. Finally, SimProf applies stratified random sampling with performance counters to select representative simulation points for data analytic workloads in order to deal with data-dependent performance. This dissertation presents techniques that can be used to simulate emerging large-scale workloads accurately and efficiently.
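The stratified-sampling idea behind SimProf can be illustrated with a short sketch: intervals of execution are clustered by their performance-counter vectors and the simulation budget is spread across the resulting strata. This is an assumed, simplified rendering (the function and parameter names are invented), not the actual SimProf implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def pick_simulation_points(counters: np.ndarray, num_strata: int, budget: int, seed: int = 0):
    """Stratified random sampling of simulation points (illustrative sketch).

    `counters` is an (intervals x features) matrix of per-interval performance-counter
    readings.  Intervals are grouped into strata with similar behaviour, and the
    simulation budget is split across strata proportionally to their size.
    """
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=num_strata, n_init=10, random_state=seed).fit_predict(counters)
    chosen = []
    for s in range(num_strata):
        members = np.flatnonzero(labels == s)
        k = max(1, round(budget * len(members) / len(counters)))
        chosen.extend(rng.choice(members, size=min(k, len(members)), replace=False))
    return sorted(chosen)
```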
APA, Harvard, Vancouver, ISO, and other styles
6

Verdugo, Retamal Cristian Andrés. "Photovoltaic power converter for large scale applications." Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/672343.

Full text
Abstract:
Most large-scale photovoltaic systems are based on centralized configurations with voltage source converters of two or three output voltage levels connected to photovoltaic panels. With the development of multilevel converters, new topologies have emerged to replace current configurations in large-scale photovoltaic applications, reducing filter requirements on the ac side, increasing the voltage level of operation and improving power quality. One of the main challenges of implementing multilevel converters in large-scale photovoltaic power plants is the appearance of high leakage currents and floating voltages due to the significant number of power modules connected in series. To solve this issue, multilevel converters have introduced high- or low-frequency transformers, which provide inherent galvanic isolation to the photovoltaic panels. The Cascaded H-Bridge converter (CHB) with high-frequency transformers in a second conversion stage has provided a promising solution for large-scale applications, since it eliminates the floating voltage problem and provides an isolated control stage for each dc side of the power modules. In an effort to integrate ac transformers and reduce the need for a second conversion stage, Cascaded Transformer Multilevel Inverters (CTMI) have been proposed for photovoltaic applications. These configurations use the secondary side of the ac transformers to create the series connection, while the primary side is connected to each power module, satisfying isolation requirements and providing different possibilities of winding connections for symmetrical and asymmetrical configurations. Based on the requirements of multilevel converters for large-scale photovoltaic applications, the main goal of this PhD dissertation is to develop a new multilevel converter which provides galvanic isolation to all power modules, while allowing an independent control algorithm for their power generation. The proposed configuration is called the Isolated Multi-Modular Converter (IMMC) and provides galvanic isolation through ac transformers. The IMMC comprises two groups of series-connected power modules, referred to as arms, which are electrically interconnected in parallel. The power modules are based on three-phase voltage source converters connected to individual groups of photovoltaic panels on the dc side, while the ac side is connected to three-phase low-frequency transformers. Therefore, several isolated modules can be connected in series. Because the power generated by photovoltaic panels may be affected by environmental conditions, power modules are prone to generate different power levels. This scenario must be covered by the IMMC, thus providing high flexibility in power regulation. In this regard, this PhD dissertation proposes two control strategies embedded in each power module, whose role is to control the power flow based on the dc voltage level and the arm current. The Amplitude Voltage Compensation (AVC) regulates the amplitude of the modulated voltage, while the Quadrature Voltage Compensation (QVC) regulates its phase angle by introducing a circulating current flowing through the arms. Additionally, it is demonstrated that combining both control strategies increases the capability to withstand power imbalances, providing higher flexibility. The IMMC was modelled and validated via simulation results, and a control algorithm was proposed to regulate the total power generated.
A downscaled 10 kW experimental setup was built to support the analysis demonstrated via simulation. This study considers an IMMC connected to the electrical grid, operating in balanced and imbalanced power scenarios, demonstrating the converter's flexibility for implementation in large-scale photovoltaic applications.
Sistemes d'energia elèctrica
APA, Harvard, Vancouver, ISO, and other styles
7

Branco, Miguel. "Distributed data management for large scale applications." Thesis, University of Southampton, 2009. https://eprints.soton.ac.uk/72283/.

Full text
Abstract:
Improvements in data storage and network technologies, the emergence of new high-resolution scientific instruments, the widespread use of the Internet and the World Wide Web and even globalisation have contributed to the emergence of new large-scale data-intensive applications. These applications require new systems that allow users to store, share and process data across computing centres around the world. Worldwide distributed data management is particularly important when there is a lot of data, more than can fit in a single computer or even in a single data centre. Designing systems to cope with the demanding requirements of these applications is the focus of the present work. This thesis presents four contributions. First, it introduces a set of design principles that can be used to create distributed data management systems for data-intensive applications. Second, it describes an architecture and implementation that follows the proposed design principles, and which results in a scalable, fault-tolerant and secure system. Third, it presents the system evaluation, which occurred under real operational conditions using close to one hundred computing sites and more than 14 petabytes of data. Fourth, it proposes novel algorithms to model the behaviour of file transfers on a wide-area network. This work also presents a detailed description of the problem of managing distributed data, ranging from the collection of requirements to the identification of the uncertainty that underlies a large distributed environment. This includes a critique of existing work and the identification of practical limits to the development of transfer algorithms on a shared distributed environment. The motivation for this work has been the ATLAS Experiment for the Large Hadron Collider (LHC) at CERN, where the author was responsible for the development of the data management middleware.
APA, Harvard, Vancouver, ISO, and other styles
8

Van, Mai Vien. "Large-Scale Optimization With Machine Learning Applications." Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263147.

Full text
Abstract:
This thesis aims at developing efficient algorithms for solving some fundamental engineering problems in data science and machine learning. We investigate a variety of acceleration techniques for improving the convergence times of optimization algorithms. First, we examine how problem structure can be exploited to accelerate the solution of highly structured problems such as the generalized eigenvalue problem and elastic net regression. We then consider Anderson acceleration, a generic and parameter-free extrapolation scheme, and show how it can be adapted to accelerate practical convergence of proximal gradient methods for a broad class of non-smooth problems. For all the methods developed in this thesis, we design novel algorithms, perform mathematical analysis of convergence rates, and conduct practical experiments on real-world data sets.
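As a pointer to how Anderson acceleration wraps a fixed-point iteration such as a proximal gradient map, here is a minimal sketch; the memory size `m`, the ridge-regularised least-squares solve, and the function names are simplifying assumptions, not the algorithms analysed in the thesis.

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, iters=100, reg=1e-10):
    """Anderson acceleration of a fixed-point iteration x <- g(x) (illustrative sketch).

    Keeps the last `m` iterates, solves a small least-squares problem on their
    residuals, and combines the mapped iterates with the resulting weights.
    """
    xs, gs = [x0], [g(x0)]
    x = gs[-1]
    for _ in range(iters):
        xs.append(x)
        gs.append(g(x))
        k = min(m, len(xs))
        X = np.column_stack(xs[-k:])
        G = np.column_stack(gs[-k:])
        F = G - X                                  # residuals f_i = g(x_i) - x_i
        # Solve min_a ||F a|| subject to sum(a) = 1 via regularised normal equations.
        A = F.T @ F + reg * np.eye(k)
        a = np.linalg.solve(A, np.ones(k))
        a /= a.sum()
        x = G @ a                                  # accelerated iterate
    return x
```

The map `g` here could be, for instance, the proximal-gradient step of a lasso or elastic net problem; only the fixed-point map changes.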

QC 20191105

APA, Harvard, Vancouver, ISO, and other styles
9

McKenzie, Donald. "Modeling large-scale fire effects : concepts and applications /." Thesis, Connect to this title online; UW restricted, 1998. http://hdl.handle.net/1773/5602.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Lu, Haihao Ph D. Massachusetts Institute of Technology. "Large-scale optimization Methods for data-science applications." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122272.

Full text
Abstract:
Thesis: Ph. D. in Mathematics and Operations Research, Massachusetts Institute of Technology, Department of Mathematics, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 203-211).
In this thesis, we present several contributions to large-scale optimization methods with applications in data science and machine learning. In the first part, we present new computational methods and associated computational guarantees for solving convex optimization problems using first-order methods. We consider the general convex optimization problem, where we presume knowledge of a strict lower bound (as happens in empirical risk minimization in machine learning). We introduce a new functional measure called the growth constant for the convex objective function, which measures how quickly the level sets grow relative to the function value and which plays a fundamental role in the complexity analysis. Based on this measure, we present new computational guarantees for both smooth and non-smooth convex optimization that improve existing computational guarantees in several ways, most notably when the initial iterate is far from the optimal solution set.
The usual approach to developing and analyzing first-order methods for convex optimization assumes that either the gradient of the objective function is uniformly continuous (in the smooth setting) or the objective function itself is uniformly continuous. However, in many settings, especially in machine learning applications, the convex function is neither: examples include the Poisson linear inverse model, the D-optimal design problem, and the support vector machine problem. In the second part, we develop notions of relative smoothness, relative continuity and relative strong convexity that are determined relative to a user-specified "reference function" (which should be computationally tractable for algorithms), and we show that many differentiable convex functions are relatively smooth or relatively continuous with respect to a correspondingly fairly simple reference function. We extend the mirror descent algorithm to this new setting, with associated computational guarantees.
Gradient Boosting Machine (GBM), introduced by Friedman, is an extremely powerful supervised learning algorithm that is widely used in practice -- it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In the third part, we propose the Randomized Gradient Boosting Machine (RGBM) and the Accelerated Gradient Boosting Machine (AGBM). RGBM leads to significant computational gains compared to GBM by using a randomization scheme to reduce the search in the space of weak learners. AGBM incorporates Nesterov's acceleration techniques into the design of GBM, and is the first GBM-type algorithm with a theoretically justified accelerated convergence rate. We demonstrate the effectiveness of RGBM and AGBM over GBM in obtaining a model with good training and/or testing data fidelity.
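A tiny illustration of the mirror descent setting mentioned above: with the entropy mirror map on the probability simplex, the Bregman proximal step reduces to a multiplicative update. This is a generic textbook instance under assumed step sizes, not the specific algorithms or guarantees developed in the thesis.

```python
import numpy as np

def mirror_descent_simplex(grad, x0, steps=200, lr=0.5):
    """Mirror descent on the probability simplex with the entropy mirror map.

    Update: x_{k+1} proportional to x_k * exp(-lr * grad(x_k)), i.e. the Bregman
    proximal step in which the divergence D_h is the KL divergence.
    """
    x = np.asarray(x0, dtype=float)
    x = x / x.sum()
    for _ in range(steps):
        g = grad(x)
        x = x * np.exp(-lr * g)   # multiplicative (exponentiated-gradient) update
        x /= x.sum()
    return x

# Toy use: minimise f(x) = 0.5 * ||A x - b||^2 over the simplex.
A = np.array([[1.0, 2.0, 0.5], [0.0, 1.0, 3.0]])
b = np.array([1.0, 1.0])
sol = mirror_descent_simplex(lambda x: A.T @ (A @ x - b), np.ones(3) / 3)
```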
by Haihao Lu.
Ph. D. in Mathematics and Operations Research
Massachusetts Institute of Technology, Department of Mathematics
APA, Harvard, Vancouver, ISO, and other styles
11

Justinia, Taghreed. "Implementing large-scale healthcare information systems : the technological, managerial and behavioural issues." Thesis, Swansea University, 2009. https://cronfa.swan.ac.uk/Record/cronfa42224.

Full text
Abstract:
This study investigated the challenges that a national healthcare organisation in Saudi Arabia had to overcome in order to achieve a nationwide large-scale healthcare information system implementation. The study also examined the implications of those issues for the applicability of organisational change management models in healthcare systems implementations. The project's focus on the implementation process directed the methodology towards a qualitative approach. Semi-structured, in-depth interviews were used. Thirty-two participants were interviewed. They were members of the organisation who were directly involved with the implementation, either as Information Technology executives and managers, Information Technology analysts and implementers, senior hospital executives from clinical areas, or other stakeholders from various departments. The data were systematically analysed using an original 'five-stage analysis framework' specifically designed for this study. This led to the inductive identification of forty codes, which were further refined and structured through additional stages of analysis influenced by Grounded Theory. Finally, as observed within the interviews, the most significant challenges were categorised under three broad interconnected themes: Information Technology and Systems (internal and external issues), Managerial Affairs (managing the project and resources), and Behavioural Issues (leadership and change management structures). These three themes were further structured, leading to a detailed discussion of the findings. While the collection of data was driven by questions on challenges typically associated with healthcare systems implementations, the findings divulged a set of unique problems for this Saudi healthcare organisation. Some challenges were specific to it because of its nature, resources (financial and human), size, distribution of sites, project scale, regional setting, and political atmosphere, while others were more generic problems typical of healthcare systems implementations. What resulted from this implementation was a model for leading change in healthcare systems implementations that could be used to guide IT implementations in healthcare organisations elsewhere.
APA, Harvard, Vancouver, ISO, and other styles
12

Wang, Xudong. "Large-Scale Patterned Oxide Nanostructures: Fabrication, Characterization and Applications." Diss., Available online, Georgia Institute of Technology, 2005. http://etd.gatech.edu/theses/available/etd-11212005-142143/.

Full text
Abstract:
Thesis (Ph. D.)--Materials Science and Engineering, Georgia Institute of Technology, 2006.
Wang, Zhong Lin, Committee Chair ; Summers, Christopher J., Committee Co-Chair ; Wong, C. P., Committee Member ; Dupuis, Russell D., Committee Member ; Wagner, Brent, Committee Member
APA, Harvard, Vancouver, ISO, and other styles
13

Morari, Alessandro. "Scalable system software for high performance large-scale applications." Doctoral thesis, Universitat Politècnica de Catalunya, 2014. http://hdl.handle.net/10803/144564.

Full text
Abstract:
In recent decades, high-performance large-scale systems have been a fundamental tool for scientific discovery and engineering advances. The sustained growth of supercomputing performance and the concurrent reduction in cost have made this technology available to a large number of scientists and engineers working on many different problems. The design of next-generation supercomputers will include traditional HPC requirements as well as new requirements to handle data-intensive computations. Data-intensive applications will hence play an important role in a variety of fields, and are the current focus of several research trends in HPC. Due to the challenges of scalability and power efficiency, the next generation of supercomputers needs a redesign of the whole software stack. Being at the bottom of the software stack, system software is expected to change drastically to support the upcoming hardware and to meet new application requirements. This PhD thesis addresses the scalability of system software. The thesis starts at the operating system level: first studying general-purpose OSs (e.g., Linux) and then lightweight kernels (e.g., CNK). Then, we focus on the runtime system: we implement a runtime system for distributed memory systems that includes many of the system services required by next-generation applications. Finally, we focus on hardware features that can be exploited at user level to improve application performance, and potentially be included in our advanced runtime system. The thesis contributions are the following. Operating system scalability: we provide an accurate study of the scalability problems of modern operating systems for HPC. We design and implement a methodology whereby detailed quantitative information may be obtained for each OS noise event. We validate our approach by comparing it to other well-known standard techniques for analyzing OS noise, such as FTQ (Fixed Time Quantum). Evaluation of address translation management for a lightweight kernel: we provide a performance evaluation of different TLB management approaches -- dynamic memory mapping, static memory mapping with replaceable TLB entries, and static memory mapping with fixed TLB entries (no TLB misses) -- on an IBM BlueGene/P system. Runtime system scalability: we show that a runtime system can efficiently incorporate system services and improve scalability for a specific class of applications. We design and implement a full-featured runtime system and programming model to execute irregular applications on a commodity cluster. The runtime library, called the Global Memory and Threading library (GMT), integrates a locality-aware Partitioned Global Address Space communication model with a fork/join program structure. It supports massive lightweight multi-threading, overlapping of communication and computation, and aggregation of small messages to tolerate network latencies. We compare GMT to other PGAS models, hand-optimized MPI code and custom architectures (Cray XMT) on a set of large-scale irregular applications: breadth-first search, random walk and concurrent hash map access. Our runtime system shows performance orders of magnitude higher than other solutions on commodity clusters and competitive with custom architectures. User-level scalability exploiting hardware features: we show the high complexity of low-level hardware optimizations for single applications, as a motivation to incorporate this logic into an adaptive runtime system.
We evaluate the effects of a controllable hardware-thread priority mechanism that controls the rate at which each hardware thread decodes instructions on IBM POWER5 and POWER6 processors. Finally, we show how to effectively exploit cache locality and the network-on-chip on the Tilera many-core architecture to improve intra-core scalability.
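The FTQ (Fixed Time Quantum) technique mentioned above is easy to sketch: in each fixed-length quantum, count how many identical units of work complete, and attribute the shortfall in slow quanta to OS noise. The snippet below is an illustrative user-space approximation (quantum length and work size are arbitrary), not the measurement methodology developed in the thesis.

```python
import time

def ftq(quantum_s=0.001, num_quanta=1000, work_chunk=200):
    """Fixed Time Quantum-style noise probe (illustrative sketch).

    In each fixed-length quantum, count how many small work units complete.
    Quanta with fewer completed units were interrupted by OS activity
    (daemons, interrupts, scheduler events).
    """
    counts = []
    for _ in range(num_quanta):
        end = time.perf_counter() + quantum_s
        done = 0
        while time.perf_counter() < end:
            s = 0
            for i in range(work_chunk):   # fixed unit of work
                s += i * i
            done += 1
        counts.append(done)
    return counts

samples = ftq()
mean = sum(samples) / len(samples)
print(mean, min(samples), max(samples))   # the spread indicates noise
```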
APA, Harvard, Vancouver, ISO, and other styles
14

Du, Jian, and 杜健. "Distributed estimation in large-scale networks : theories and applications." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hdl.handle.net/10722/197090.

Full text
Abstract:
Parameter estimation plays a key role in many signal processing applications. Traditional parameter estimation relies on centralized methods, which require gathering all information dispersed over the network in a central processing unit. As the scale of the network increases, centralized estimation is not preferred, since it requires not only knowledge of the network topology but also heavy communication from peripheral nodes to the central processing unit. Besides, computation at the control center cannot scale indefinitely with the network size. Therefore, distributed estimation, which involves only local computation at each node and limited information exchange between immediate neighbouring nodes, is needed. In this thesis, for local observations in the form of a pairwise linear model corrupted by Gaussian noise, the belief propagation (BP) algorithm is investigated to perform distributed estimation. It involves only iterative updating of the estimates with local message exchange between immediate neighboring nodes. Since convergence has always been the biggest concern when using BP, we establish the convergence properties of asynchronous vector-form Gaussian BP under the pairwise model. It is shown analytically that, under mild conditions, the asynchronous BP algorithm converges to the optimal estimates with the estimation mean square error (MSE) at each node approaching the centralized Bayesian Cramér-Rao bound (BCRB) regardless of the network topology. The proposed framework encompasses both synchronous and asynchronous algorithms for distributed estimation and is robust to random link failures. Two challenging parameter estimation problems in large-scale networks, i.e., network-wide distributed carrier frequency offset (CFO) estimation and global clock synchronization in sensor networks, are studied based on BP. The proposed algorithms do not require any centralized information processing or knowledge of the global network topology and are scalable with the network size. Simulation results further verify the established theoretical analyses: the proposed algorithms always converge to the optimal estimates regardless of network topology. Simulations also demonstrate that the MSE at each node approaches the corresponding centralized CRB within a few iterations of message exchange. Furthermore, distributed estimation is studied for the linear model with unknown coefficients. Such a problem is challenging even for centralized estimation due to the nonlinear nature of the observation model. One problem following this model is power state estimation with unknown sampling phase error. In this thesis, a distributed estimation scheme is proposed based on variational inference with a parallel update schedule and limited message exchange between neighboring areas, and its convergence is guaranteed. Simulation results show that after convergence the proposed algorithm performs very close to the ideal case, which assumes perfect synchronization and centralized information processing.
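A compact sketch of Gaussian BP under a pairwise model helps fix ideas: each node keeps only (precision, precision-weighted mean) messages to and from its neighbours, and at convergence the local means match the centralized solution of A x = b. This is a generic scalar-variable illustration, not the vector-form algorithm or convergence analysis from the thesis.

```python
import numpy as np

def gaussian_bp(A, b, iters=50):
    """Synchronous Gaussian belief propagation for a pairwise model (illustrative).

    Solves A x = b by message passing between neighbouring nodes only; for
    well-behaved A (e.g. diagonally dominant) the per-node means converge to
    the centralized solution.  Messages are (precision, precision*mean) pairs.
    """
    n = len(b)
    P = np.zeros((n, n))   # P[i, j]: precision of the message i -> j
    h = np.zeros((n, n))   # h[i, j]: precision-weighted mean of the message i -> j
    nbrs = [[j for j in range(n) if j != i and A[i, j] != 0] for i in range(n)]

    for _ in range(iters):
        P_new, h_new = np.zeros_like(P), np.zeros_like(h)
        for i in range(n):
            for j in nbrs[i]:
                Pi = A[i, i] + sum(P[k, i] for k in nbrs[i] if k != j)
                hi = b[i] + sum(h[k, i] for k in nbrs[i] if k != j)
                P_new[i, j] = -A[i, j] ** 2 / Pi
                h_new[i, j] = -A[i, j] * hi / Pi
        P, h = P_new, h_new

    Pi = np.array([A[i, i] + sum(P[k, i] for k in nbrs[i]) for i in range(n)])
    hi = np.array([b[i] + sum(h[k, i] for k in nbrs[i]) for i in range(n)])
    return hi / Pi   # per-node posterior means

A = np.array([[3.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])
print(gaussian_bp(A, b), np.linalg.solve(A, b))   # the two should agree
```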
published_or_final_version
Electrical and Electronic Engineering
Doctoral
Doctor of Philosophy
APA, Harvard, Vancouver, ISO, and other styles
15

Deng, Jie. "Profiling large-scale live video streaming and distributed applications." Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/43948.

Full text
Abstract:
Today, distributed applications run at data centre and Internet scales, from intensive data analysis, such as MapReduce, to the dynamic demands of a worldwide audience, such as YouTube. The network is essential to these applications at both scales. To provide adequate support, we must understand the full requirements of the applications, which are revealed by their workloads. In this thesis, we study distributed system applications at different scales to enrich this understanding. Large-scale Internet applications have been studied for years, such as social networking services (SNS), video on demand (VoD), and content delivery networks (CDN). An emerging type of video broadcasting on the Internet featuring crowdsourced live video streaming has garnered attention, allowing platforms such as Twitch to attract over 1 million concurrent users globally. To better understand Twitch, we collected real-time popularity data combined with metadata about the contents and found that the broadcasters, rather than the content, drive its popularity. Unlike YouTube and Netflix, where content can be cached, video streaming on Twitch is generated instantly and needs to be delivered to users immediately to enable real-time interaction. Thus, we performed a large-scale measurement of Twitch's content location, revealing the global footprint of its infrastructure as well as discovering the dynamic stream hosting and client redirection strategies that helped Twitch serve millions of users at scale. We next consider applications that run inside the data centre. Distributed computing applications rely heavily on the network due to data transmission needs and the scheduling of resources and tasks. One successful application, called Hadoop, has been widely deployed for Big Data processing. However, little work has been devoted to understanding its network. We found that Hadoop's behaviour is limited by the hardware resources and the processing jobs presented. Thus, after characterising the Hadoop traffic on our testbed with a set of benchmark jobs, we built a simulator to reproduce Hadoop's job traffic. With the simulator, users can investigate the connections between Hadoop traffic and network performance without additional hardware cost. Different network components can be added to investigate the performance, such as network topologies, queue policies, and transport layer protocols. In this thesis, we extended the knowledge of networking by investigating two widely used applications in the data centre and at Internet scale. We (i) studied the most popular live video streaming platform, Twitch, as a new type of Internet-scale distributed application, revealing that broadcaster factors drive the popularity of such platforms; (ii) discovered the footprint of the Twitch streaming infrastructure and its dynamic stream hosting and client redirection strategies, providing an in-depth example of video streaming delivery at Internet scale; (iii) investigated the traffic generated by a distributed application by characterising the traffic of Hadoop under various parameters; and (iv) with such knowledge, built a simulation tool so that users can efficiently investigate the performance of different network components under a distributed application.
APA, Harvard, Vancouver, ISO, and other styles
16

Fragkos, I. "Large-scale optimisation in operations management : algorithms and applications." Thesis, University College London (University of London), 2014. http://discovery.ucl.ac.uk/1413951/.

Full text
Abstract:
The main contributions of this dissertation are the design, development and application of optimisation methodology, models and algorithms for large-scale problems arising in Operations Management. The first chapter introduces constraint transformations and valid inequalities that enhance the performance of column generation and Lagrange relaxation. I establish theoretical connections with dual-space reduction techniques and develop a novel algorithm that combines Lagrange relaxation and column generation. This algorithm is embedded in a branch-and-price scheme, which combines large neighbourhood and local search to generate upper bounds. Computational experiments on capacitated lot sizing show significant improvements over existing methodologies. The second chapter introduces a Horizon-Decomposition approach that partitions the problem horizon in contiguous intervals. In this way, subproblems identical to the original problem but of smaller size are created. The size of the master problem and the subproblems are regulated via two scalar parameters, giving rise to a family of reformulations. I investigate the efficiency of alternative parameter configurations empirically. Computational experiments on capacitated lot sizing demonstrate superior performance against commercial solvers. Finally, extensions to generic mathematical programs are presented. The final chapter shows how large-scale optimisation methods can be applied to complex operational problems, and presents a modelling framework for scheduling the transhipment operations of the Noble Group, a global supply chain manager of energy products. I focus on coal operations, where coal is transported from mines to vessels using barges and floating cranes. Noble pay millions of dollars in penalties for delays, and for additional resources hired to minimize the impact of delays. A combination of column generation and dedicated heuristics reduces the cost of penalties and additional resources, and improves the efficiency of the operations. Noble currently use the developed framework, and report significant savings attributed to it.
APA, Harvard, Vancouver, ISO, and other styles
17

Uppala, Roshni. "Simulating Large Scale Memristor Based Crossbar for Neuromorphic Applications." University of Dayton / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1429296073.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Benjaminsson, Simon. "On large-scale neural simulations and applications in neuroinformatics." Doctoral thesis, KTH, Beräkningsbiologi, CB, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-122190.

Full text
Abstract:
This thesis consists of three parts related to the in silico study of the brain: technologies for large-scale neural simulations, neural algorithms and models and applications in large-scale data analysis in neuroinformatics. All parts rely on the use of supercomputers. A large-scale neural simulator is developed where techniques are explored for the simulation, analysis and visualization of neural systems on a high biological abstraction level. The performance of the simulator is investigated on some of the largest supercomputers available. Neural algorithms and models on a high biological abstraction level are presented and simulated. Firstly, an algorithm for structural plasticity is suggested which can set up connectivity and response properties of neural units from the statistics of the incoming sensory data. This can be used to construct biologically inspired hierarchical sensory pathways. Secondly, a model of the mammalian olfactory system is presented where we suggest a mechanism for mixture segmentation based on adaptation in the olfactory cortex. Thirdly, a hierarchical model is presented which uses top-down activity to shape sensory representations and which can encode temporal history in the spatial representations of populations. Brain-inspired algorithms and methods are applied to two neuroinformatics applications involving large-scale data analysis. In the first application, we present a way to extract resting-state networks from functional magnetic resonance imaging (fMRI) resting-state data where the final extraction step is computationally inexpensive, allowing for rapid exploration of the statistics in large datasets and their visualization on different spatial scales. In the second application, a method to estimate the radioactivity level in arterial plasma from segmented blood vessels from positron emission tomography (PET) images is presented. The method outperforms previously reported methods to a degree where it can partly remove the need for invasive arterial cannulation and continuous sampling of arterial blood during PET imaging. In conclusion, this thesis provides insights into technologies for the simulation of large-scale neural models on supercomputers, their use to study mechanisms for the formation of neural representations and functions in hierarchical sensory pathways using models on a high biological abstraction level and the use of large-scale, fine-grained data analysis in neuroinformatics applications.

QC 20130515

APA, Harvard, Vancouver, ISO, and other styles
19

Pottier, Loïc. "Co-scheduling for large-scale applications : memory and resilience." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEN039/document.

Full text
Abstract:
This thesis explores co-scheduling problems in the context of large-scale applications, with two main focuses: the memory side, in particular the cache memory, and the resilience side. With the recent advent of many-core architectures such as chip multiprocessors (CMP), the number of processing units is increasing. In this context, the benefits of co-scheduling techniques have been demonstrated. Recall that the main idea behind co-scheduling is to execute applications concurrently rather than in sequence in order to improve the global throughput of the platform. But sharing resources often generates interference. With the growing number of processing units accessing the same last-level cache, this interference among co-scheduled applications becomes critical. In addition, with that increasing number of processors, the probability of a failure increases too. Resilience aspects must be taken into account, especially for co-scheduling, because failure-prone resources might be shared between applications. On the memory side, we focus on interference in the last-level cache; one solution used to reduce this interference is cache partitioning. Extensive simulations demonstrate the usefulness of co-scheduling when our efficient cache partitioning strategies are deployed. We also investigate the same problem on a real cache-partitioned chip multiprocessor, using the Cache Allocation Technology recently provided by Intel. Still on the memory side, we then study how to model and schedule task graphs on new many-core architectures, such as the Knights Landing architecture. These architectures offer a new level in the memory hierarchy through a new on-package high-bandwidth memory. Current approaches usually do not take this new memory level into account, yet new scheduling algorithms and data partitioning schemes are needed to take advantage of this deep memory hierarchy. On the resilience side, we explore the impact of failures on co-scheduling performance. The co-scheduling approach has been demonstrated in a fault-free context, but large-scale computer systems are confronted with frequent failures, and resilience techniques must be employed for large applications to execute efficiently. Indeed, failures may create severe imbalance between applications and significantly degrade performance. We aim at minimizing the expected completion time of a set of co-scheduled applications in a failure-prone context by redistributing processors.
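Cache partitioning, mentioned above as the tool for taming last-level-cache interference, is often explained with a utility-based greedy allocation of cache ways. The sketch below is that classic heuristic under assumed per-application miss-rate curves; it is not one of the co-scheduling strategies developed in the thesis.

```python
def partition_cache_ways(miss_curves, total_ways):
    """Greedy utility-based partitioning of LLC ways among co-scheduled apps.

    `miss_curves[a][w]` is the miss count of application `a` when given `w` ways
    (index 0 means zero ways).  Each remaining way goes to the application with
    the largest marginal reduction in misses.
    """
    alloc = [0] * len(miss_curves)
    for _ in range(total_ways):
        gains = [
            curve[alloc[a]] - curve[alloc[a] + 1] if alloc[a] + 1 < len(curve) else 0
            for a, curve in enumerate(miss_curves)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        alloc[best] += 1
    return alloc

# Toy example: two applications with different sensitivity to cache space.
curves = [
    [100, 60, 40, 30, 25, 22, 21, 20, 20],   # cache-sensitive application
    [50, 48, 47, 46, 46, 46, 46, 46, 46],    # streaming application, little reuse
]
print(partition_cache_ways(curves, 8))        # most ways go to the first application
```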
APA, Harvard, Vancouver, ISO, and other styles
20

Nie, Bin. "GPGPU Reliability Analysis: From Applications to Large Scale Systems." W&M ScholarWorks, 2019. https://scholarworks.wm.edu/etd/1563898932.

Full text
Abstract:
Over the past decade, GPUs have become an integral part of mainstream high-performance computing (HPC) facilities. Since applications running on HPC systems are usually long-running, any error or failure could result in a significant loss of scientific productivity and system resources. Even worse, since HPC systems face severe resilience challenges as they progress towards exascale computing, it is imperative to develop a better understanding of the reliability of GPUs. This dissertation fills this gap by providing an understanding of the effects of soft errors on the entire system and on specific applications. To understand system-level reliability, a large-scale study of GPU soft errors in the field is conducted. The occurrences of GPU soft errors are linked to several temporal and spatial features, such as specific workloads, node location, temperature, and power consumption. Further, machine learning models are proposed to predict error occurrences on GPU nodes so as to proactively and dynamically turn the costly error protection mechanisms on or off based on the prediction results. To understand the effects of soft errors at the application level, an effective fault-injection framework is designed to characterise the reliability and resilience of GPGPU applications. This framework is effective in reducing the tremendous number of fault injection locations to a manageable size while still preserving remarkable accuracy. The framework is validated with both single-bit and multi-bit fault models for various GPGPU benchmarks. Lastly, taking advantage of the proposed fault-injection framework, this dissertation develops a hierarchical approach to understanding the error resilience characteristics of GPGPU applications at the kernel, CTA, and warp levels. In addition, given that some corrupted application outputs due to soft errors may be acceptable, we present a use case showing how to enable low-overhead yet reliable GPU computing for GPGPU applications.
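The single-bit fault model used by such fault-injection frameworks can be illustrated with a few lines of host-side code that flip one bit of one value in an intermediate result and compare the corrupted output against a golden run. This is a simplified CPU-side sketch (the function names and the toy "kernel" are invented), not the GPU-level framework described in the dissertation.

```python
import numpy as np

def flip_bit(value, bit):
    """Flip one bit of a float32 value (single-bit fault model)."""
    arr = np.array([value], dtype=np.float32)
    bits = arr.view(np.uint32)
    bits[0] ^= np.uint32(1 << bit)
    return arr[0]

def inject_fault(array, seed=0):
    """Inject one bit flip at a random element and bit position of an intermediate result."""
    rng = np.random.default_rng(seed)
    out = array.astype(np.float32).copy()
    idx = int(rng.integers(out.size))
    bit = int(rng.integers(32))
    out.flat[idx] = flip_bit(out.flat[idx], bit)
    return out

# Toy "kernel": compare the golden output against the faulty run to classify the
# outcome (masked, tolerable deviation, or silent data corruption).
x = np.linspace(0.0, 1.0, 1024, dtype=np.float32)
golden = np.cumsum(x * x)
faulty_x = inject_fault(x)
faulty = np.cumsum(faulty_x * faulty_x)
print(np.max(np.abs(golden - faulty)))   # deviation used to classify the outcome
```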
APA, Harvard, Vancouver, ISO, and other styles
21

Stroud, Caleb Zachary. "Implementing Differential Privacy for Privacy Preserving Trajectory Data Publication in Large-Scale Wireless Networks." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/84548.

Full text
Abstract:
Wireless networks collect vast amounts of log data concerning usage of the network. These data aid in informing operational needs related to performance, maintenance, etc., but they are also useful for outside researchers analyzing network operation and user trends. Releasing such information to these outside researchers poses a threat to the privacy of users. The dueling needs for utility and privacy must be addressed. This thesis studies the concept of differential privacy for fulfilling the goals of releasing high-utility data to researchers while maintaining user privacy. The focus is specifically on physical user trajectories in authentication manager log data, since this is a rich type of data that is useful for trend analysis. Authentication manager log data are produced when devices connect to physical access points (APs), and trajectories are sequences of these spatiotemporal connections from one AP to another for the same device. The goal is pursued with a variable-length n-gram model that creates a synthetic database which can be easily ingested by researchers. We found that the chosen algorithm has shortcomings when applied to the chosen data, but differential privacy itself can still be used to release sanitized datasets while maintaining utility if the data have low sparsity.
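The basic building block of n-gram-based differentially private release is perturbing transition counts with Laplace noise before synthesizing trajectories. The sketch below shows that step for fixed-length n-grams under a simplifying sensitivity assumption; the thesis's variable-length n-gram model and its exact privacy accounting are not reproduced here, and the names are illustrative.

```python
import random
from collections import Counter

def noisy_ngram_counts(trajectories, n=2, epsilon=1.0, seed=0):
    """Release AP-to-AP n-gram counts perturbed with Laplace noise (illustrative sketch)."""
    rng = random.Random(seed)
    counts = Counter()
    for traj in trajectories:
        for i in range(len(traj) - n + 1):
            counts[tuple(traj[i:i + n])] += 1
    # Simplifying assumption: each user contributes at most `max_contrib` n-grams,
    # so the L1 sensitivity of the count vector is bounded by that value.
    max_contrib = max(len(t) - n + 1 for t in trajectories)
    scale = max_contrib / epsilon
    # Difference of two exponentials with mean `scale` is Laplace noise of scale `scale`.
    laplace = lambda: rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return {gram: max(0.0, c + laplace()) for gram, c in counts.items()}

trajs = [["AP1", "AP2", "AP3"], ["AP1", "AP2", "AP4"], ["AP2", "AP3", "AP4"]]
print(noisy_ngram_counts(trajs, epsilon=0.5))
```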
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
22

Murphy, Kris. "A THEORY OF STEERING COMMITTEE CAPABILITIES FOR IMPLEMENTING LARGE SCALE ENTERPRISE-WIDE INFORMATION SYSTEMS." Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1458218732.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Feldman, Charlotte Hannah. "Smart X-ray optics for large and small scale applications." Thesis, University of Leicester, 2009. http://hdl.handle.net/2381/7833.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Muresan, Adrian. "Scheduling and deployment of large-scale applications on Cloud platforms." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2012. http://tel.archives-ouvertes.fr/tel-00786475.

Full text
Abstract:
Infrastructure as a service (IaaS) Cloud platforms are increasingly used in the IT industry. IaaS platforms are providers of virtual resources from a catalogue of predefined types. Improvements in virtualization technology make it possible to create and destroy virtual machines on the fly, with low overhead. As a result, the great benefit of IaaS platforms is the ability to scale a virtual platform on the fly, while only paying for the used resources. From a research point of view, IaaS platforms raise new questions in terms of making efficient virtual platform scaling decisions and then efficiently scheduling applications on dynamic platforms. The current thesis is a step towards exploring and answering these questions. The first contribution of the current work is focused on resource management. We have worked on the topic of automatically scaling cloud client applications to meet changing platform usage. Various studies have shown self-similarities in web platform traffic, which implies the existence of usage patterns that may or may not be periodical. We have developed an automatic platform scaling strategy that predicts platform usage by identifying non-periodic usage patterns and extrapolating future platform usage based on them. Next we have focused on extending an existing grid platform with on-demand resources from an IaaS platform. We have developed an extension to the DIET (Distributed Interactive Engineering Toolkit) middleware that uses a virtual-market-based approach to perform resource allocation. Each user is given a sum of virtual currency that he will use for running his tasks. This mechanism helps ensure fair platform sharing between users. The third and final contribution targets application management for IaaS platforms. We have studied and developed an allocation strategy for budget-constrained workflow applications that target IaaS Cloud platforms. The workflow abstraction is very common amongst scientific applications; it is easy to find examples in any field, from bioinformatics to geology. In this work we have considered a general model of workflow applications that comprise parallel tasks and permit non-deterministic transitions. We have elaborated two budget-constrained allocation strategies for this type of workflow. The problem is a bi-criteria optimization problem, as we are optimizing both budget and workflow makespan. This work has been practically validated by implementing it on top of the Nimbus open source cloud platform and the DIET MADAG workflow engine. It is being tested with a cosmological simulation workflow application called RAMSES. This is a parallel MPI application that, as part of this work, has been ported for execution on dynamic virtual platforms. Both theoretical simulations and practical experiments have shown encouraging results and improvements.
APA, Harvard, Vancouver, ISO, and other styles
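A note on the scaling idea in the Muresan entry above: the abstract only sketches the strategy (predict future platform usage, then resize the virtual platform accordingly). The Python fragment below is a deliberately simplified illustration of that decision loop, assuming a naive moving-average forecast, a hypothetical per-VM capacity and a hysteresis margin; it is not the pattern-based predictor developed in the thesis.

import math
from collections import deque

def forecast_load(history, window=6):
    # Naive moving-average forecast of the next-interval load; a stand-in
    # for the pattern-based prediction described in the thesis.
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

def scaling_decision(history, current_vms, capacity_per_vm=100.0, margin=0.15):
    # Decide how many VMs to provision for the next interval.
    # capacity_per_vm (requests/s per VM) and margin are illustrative values.
    predicted = forecast_load(history)
    needed = max(1, math.ceil(predicted / capacity_per_vm))
    upper = max(1, round(needed * (1 + margin)))
    if current_vms < needed:       # under-provisioned: scale out
        return needed
    if current_vms > upper:        # over-provisioned: scale in
        return upper
    return current_vms             # inside the hysteresis band: do nothing

# Toy usage: recent load samples in requests per second.
load = deque([120, 135, 150, 170, 160, 180], maxlen=24)
print(scaling_decision(load, current_vms=1))   # -> 2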
25

Wang, Jiechao. "Approaches for contextualization and large-scale testing of mobile applications." Thesis, Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49142.

Full text
Abstract:
In this thesis, we focused on two problems in mobile application development: contextualization and large-scale testing. We identified the limitations of current contextualization and testing solutions. On one hand, advanced-remote-computing-based mobilization does not provide context awareness to the mobile applications it mobilizes, so we presented contextify to provide context awareness to them without rewriting the applications or changing their source code. Evaluation results and user surveys showed that contextify-contextualized applications reduce users' time and effort to complete tasks. On the other hand, current mobile application testing solutions cannot conduct tests at the UI level and in a large-scale manner simultaneously, so we presented and implemented automated cloud computing (ACT) to achieve this goal. Evaluation results showed that ACT can support a large number of users and is stable, cost-efficient, and time-efficient.
APA, Harvard, Vancouver, ISO, and other styles
26

Kim, Daeki. "Large scale transportation service network design : models, algorithms and applications." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/10366.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Hurwitz, Jeremy Scott. "Error-correcting codes and applications to large scale classification systems." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/53140.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.
Includes bibliographical references (p. 37-39).
In this thesis, we study the performance of distributed output coding (DOC) and error-correcting output coding (ECOC) as potential methods for expanding the class of tractable machine-learning problems. Using distributed output coding, we were able to scale a neural-network-based algorithm to handle nearly 10,000 output classes. In particular, we built a prototype OCR engine for Devanagari and Korean texts based upon distributed output coding. We found that the resulting classifiers performed better than existing algorithms, while maintaining small size. Error-correction, however, was found to be ineffective at increasing the accuracy of the ensemble. For each language, we also tested the feasibility of automatically finding a good codebook. Unfortunately, the results in this direction were primarily negative.
by Jeremy Scott Hurwitz.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
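Background sketch for the entry above: error-correcting output coding reduces a multi-class problem to several binary problems by giving each class a binary codeword; at prediction time, the outputs of the binary learners are matched to the nearest codeword in Hamming distance. The Python sketch below illustrates only this generic encode/decode step; the codebook and the fit_binary trainer are made-up placeholders, not the neural-network dichotomizers or codebooks used in the thesis.

import numpy as np

# Hypothetical 5-bit codebook: one row per class, one column per binary learner.
CODEBOOK = np.array([
    [0, 0, 1, 1, 0],   # class 0
    [1, 0, 0, 1, 1],   # class 1
    [1, 1, 1, 0, 0],   # class 2
])

def train_binary_learners(X, y, fit_binary):
    # Train one binary learner per codebook column; fit_binary is any
    # user-supplied trainer returning a callable predict(X) -> {0,1} array.
    return [fit_binary(X, CODEBOOK[y, col]) for col in range(CODEBOOK.shape[1])]

def decode(learners, X):
    # Label each sample with the class whose codeword is closest in
    # Hamming distance to the vector of binary predictions.
    bits = np.column_stack([clf(X) for clf in learners])          # (n, n_bits)
    dists = (bits[:, None, :] != CODEBOOK[None, :, :]).sum(-1)    # (n, n_classes)
    return dists.argmin(axis=1)

# Toy usage with a trivial nearest-mean binary learner.
def fit_binary(X, targets):
    mean1 = X[targets == 1].mean(axis=0)
    mean0 = X[targets == 0].mean(axis=0)
    return lambda Z: (np.linalg.norm(Z - mean1, axis=1)
                      < np.linalg.norm(Z - mean0, axis=1)).astype(int)

X = np.array([[0.0, 0], [0.1, 0], [5, 5], [5.1, 5], [0, 9], [0.2, 9]])
y = np.array([0, 0, 1, 1, 2, 2])
learners = train_binary_learners(X, y, fit_binary)
print(decode(learners, X))   # recovers [0 0 1 1 2 2] on this toy data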
28

Lee, John Jaesung. "Efficient object recognition and image retrieval for large-scale applications." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/45637.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 91-93).
Algorithms for recognition and retrieval tasks generally call for both speed and accuracy. When scaling up to very large applications, however, we encounter additional significant requirements: adaptability and scalability. In many real-world systems, large numbers of images are constantly added to the database, requiring the algorithm to quickly tune itself to recent trends so it can serve queries more effectively. Moreover, the systems need to be able to meet the demands of simultaneous queries from many users. In this thesis, I describe two new algorithms intended to meet these requirements and give an extensive experimental evaluation for both. The first algorithm constructs an adaptive vocabulary forest, which is an efficient image-database model that grows and shrinks as needed while adapting its structure to tune itself to recent trends. The second algorithm is a method for efficiently performing classification tasks by comparing query images to only a fixed number of training examples, regardless of the size of the image database. These two methods can be combined to create a fast, adaptable, and scalable vision system suitable for large-scale applications. I also introduce LIBPMK, a fast implementation of common computer vision processing pipelines such as that of the pyramid match kernel. This implementation was used to build several successful interactive applications as well as batch experiments for research settings. This implementation, together with the two new algorithms introduced in this thesis, is a step toward meeting the speed, adaptability, and scalability requirements of practical large-scale vision systems.
by John Jaesung Lee.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
29

Zhou, Yi (Software engineer). "Uncertainty Evaluation in Large-scale Dynamical Systems: Theory and Applications." Thesis, University of North Texas, 2014. https://digital.library.unt.edu/ark:/67531/metadc700073/.

Full text
Abstract:
Significant research efforts have been devoted to large-scale dynamical systems, with the aim of understanding their complicated behaviors and managing their responses in real-time. One pivotal technological obstacle in this process is the existence of uncertainty. Although many of these large-scale dynamical systems function well in the design stage, they may easily fail when operating in realistic environments, where environmental uncertainties modulate system dynamics and complicate real-time prediction and management tasks. This dissertation aims to develop systematic methodologies to evaluate the performance of large-scale dynamical systems under uncertainty, as a step toward real-time decision support. Two uncertainty evaluation approaches are pursued: the analytical approach and the effective simulation approach. The analytical approach abstracts the dynamics of original stochastic systems, and develops tractable analysis (e.g., jump-linear analysis) for the approximated systems. Despite the potential bias introduced in the approximation process, the analytical approach provides rich insights valuable for evaluating and managing the performance of large-scale dynamical systems under uncertainty. When a system’s complexity and scale are beyond tractable analysis, the effective simulation approach becomes very useful. The effective simulation approach aims to use a few smartly selected simulations to quickly evaluate a complex system’s statistical performance. This approach was originally developed to evaluate a single uncertain variable. This dissertation extends the approach to be scalable and effective for evaluating large-scale systems under a large number of uncertain variables. While a large portion of this dissertation focuses on the development of generic methods and theoretical analysis applicable to a broad range of large-scale dynamical systems, many results are illustrated through a representative large-scale application in strategic air traffic management, which is concerned with designing robust management plans subject to a wide range of weather possibilities at 2-15 hours of look-ahead time.
APA, Harvard, Vancouver, ISO, and other styles
30

Mahadevan, Karthikeyan. "Estimating reliability impact of biometric devices in large scale applications." Morgantown, W. Va. : [West Virginia University Libraries], 2003. http://etd.wvu.edu/templates/showETD.cfm?recnum=3096.

Full text
Abstract:
Thesis (M.S.)--West Virginia University, 2003.
Title from document title page. Document formatted into pages; contains vii, 66 p. : ill. (some col.). Vita. Includes abstract. Includes bibliographical references (p. 62-64).
APA, Harvard, Vancouver, ISO, and other styles
31

Wehbe, Diala. "Simulations and applications of large-scale k-determinantal point processes." Thesis, Lille 1, 2019. http://www.theses.fr/2019LIL1I012/document.

Full text
Abstract:
With the exponentially growing amount of data, sampling remains the most relevant method for learning about populations. Sometimes a larger sample is needed to generate more precise results and to exclude the possibility of missing key information; the difficulty is that sampling a very large number of objects can be prohibitively time-consuming. In this thesis, our aim is to build bridges between applications of statistics and the k-Determinantal Point Process (k-DPP), which is defined through a matrix kernel and models only sets of cardinality k. We propose three complementary applications for sampling large data sets based on k-DPPs. The goal is to select diverse sets that cover a much greater set of objects in polynomial time, which can be achieved by constructing different Markov chains that have the k-DPPs as their stationary distributions. The first application consists in sampling a subset of species in a phylogenetic tree while avoiding redundancy. By defining the k-DPP via an intersection kernel, the results provide a fast mixing sampler for the k-DPP, with a polynomial bound on the mixing time that depends on the height of the phylogenetic tree. The second application clarifies how k-DPPs offer a powerful approach to finding a diverse subset of nodes in a large connected graph, which gives an outline of the different types of information related to the ground set. A polynomial bound on the mixing time of the proposed Markov chain is given, where the kernel used is the Moore-Penrose pseudo-inverse of the normalized Laplacian matrix; the resulting mixing time is attained under certain conditions on the eigenvalues of the Laplacian matrix. The third application uses the fixed-cardinality DPP in experimental design as a tool to study Latin Hypercube Sampling (LHS) designs of order n and dimension d. The key is to propose a DPP kernel that establishes negative correlations between the selected points and preserves the constraint of the design, namely that each point occurs exactly once in each hyperplane. By creating a new Markov chain that has the n-DPP as its stationary distribution, we determine the number of steps required to build an LHS in accordance with the n-DPP.
APA, Harvard, Vancouver, ISO, and other styles
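All three applications in the Wehbe entry above rely on the same primitive: a Markov chain whose stationary distribution is a k-DPP, i.e. P(S) proportional to det(L_S) over subsets S of fixed size k. A commonly used chain of this kind is the swap (exchange) chain: one element inside the current set is proposed to be exchanged with one outside, and the move is accepted with a Metropolis ratio of the two subset determinants. The Python sketch below illustrates this generic chain on an arbitrary positive semi-definite kernel; it does not reproduce the specific kernels (intersection kernel, Laplacian pseudo-inverse, LHS-preserving kernel) or the mixing-time analysis from the thesis.

import numpy as np

def kdpp_swap_chain(L, k, n_steps, seed=None):
    # Metropolis swap chain targeting the k-DPP P(S) ~ det(L[S, S]).
    # L is an (n, n) positive semi-definite kernel matrix.
    rng = np.random.default_rng(seed)
    n = L.shape[0]
    S = set(rng.choice(n, size=k, replace=False).tolist())

    def weight(subset):
        idx = np.fromiter(subset, dtype=int)
        return max(np.linalg.det(L[np.ix_(idx, idx)]), 0.0)

    w_cur = weight(S)
    for _ in range(n_steps):
        i = rng.choice(list(S))                    # element proposed for removal
        j = rng.choice(list(set(range(n)) - S))    # element proposed for addition
        S_new = (S - {i}) | {j}
        w_new = weight(S_new)
        # Accept with probability min(1, det ratio); the proposal is symmetric.
        if w_cur == 0.0 or rng.random() < min(1.0, w_new / w_cur):
            S, w_cur = S_new, w_new
    return sorted(S)

# Toy usage: a random PSD kernel on 10 items, sample a diverse set of 3.
A = np.random.default_rng(0).normal(size=(10, 10))
print(kdpp_swap_chain(A @ A.T, k=3, n_steps=500, seed=1))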
32

Biel, Martin. "Distributed Stochastic Programming with Applications to Large-Scale Hydropower Operations." Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263616.

Full text
Abstract:
Stochastic programming is a subfield of mathematical programming concerned with optimization problems subjected to uncertainty. Many engineering problems with random elements can be accurately modeled as a stochastic program. In particular, decision problems associated with hydropower operations motivate the application of stochastic programming. When complex decision-support problems are considered, the corresponding stochastic programming models often grow too large to store and solve on a single computer. This clarifies the need for parallel approaches that could enable efficient treatment of large-scale stochastic programs in a distributed environment. In this thesis, we develop mathematical and computational tools in order to facilitate distributed stochastic programs that can be efficiently stored and solved. First, we present a software framework for stochastic programming implemented in the Julia language. A key feature of the framework is the support for distributing stochastic programs in memory. Moreover, the framework includes a large set of structure-exploiting algorithms for solving stochastic programming problems. These algorithms are based on the classical L-shaped and progressive-hedging algorithms and can run in parallel on distributed stochastic programs. The distributed performance of our software tools is improved by exploring algorithmic innovations and software patterns. We present the architecture of the framework and highlight key implementation details. Finally, we provide illustrative examples of stochastic programming functionality and benchmarks on large-scale problems. Then, we pursue further algorithmic improvements to the distributed L-shaped algorithm. Specifically, we consider the use of dynamic cut aggregation. We develop theoretical results on convergence and complexity and then showcase performance improvements in numerical experiments. We suggest several aggregation schemes that are based on parameterized selection rules. Before we perform large-scale experiments, the aggregation parameters are determined by a tuning procedure. In brief, cut aggregation can yield major performance improvements to L-shaped algorithms in distributed settings. Finally, we consider an application to hydropower operations. The day-ahead planning problem involves specifying optimal order volumes in a deregulated electricity market, without knowledge of the next-day market price, and then optimizing the hydropower production. We provide a detailed introduction to the day-ahead model and explain how we can implement it with our computational tools. This covers a complete procedure of gathering data, generating forecasts from the data, and finally formulating and solving a stochastic programming model of the day-ahead problem. Using a sample-based algorithm that internally relies on our structure-exploiting solvers, we obtain tight confidence intervals around the optimal solution of the day-ahead problem.
APA, Harvard, Vancouver, ISO, and other styles
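For readers unfamiliar with the L-shaped method named in the entry above: it is a cutting-plane scheme in which scenario subproblems are solved independently (which is what makes distribution natural) and their values and subgradients are aggregated into optimality cuts that refine the first-stage decision. The sketch below is a minimal single-machine illustration under strong simplifying assumptions: a one-dimensional first stage, a newsvendor-style recourse with closed-form subgradients, and a grid search standing in for the master LP. It shows the flavor of the iteration, not the Julia framework or the cut-aggregation schemes developed in the thesis.

import numpy as np

# Newsvendor-style two-stage problem: choose capacity x at unit cost c,
# pay penalty q per unit of unmet demand d_s in scenario s.
c, q = 1.0, 4.0
demands = np.array([40.0, 55.0, 70.0, 90.0])      # equiprobable scenarios

def recourse(x, d):
    # Second-stage cost and a subgradient with respect to x for one scenario.
    shortfall = max(d - x, 0.0)
    return q * shortfall, (-q if shortfall > 0 else 0.0)

def l_shaped(grid=np.linspace(0.0, 120.0, 1201), n_iter=20):
    cuts = []                                      # list of (intercept, slope)
    x = grid[0]
    for _ in range(n_iter):
        # Solve all scenario subproblems at the current first-stage decision.
        vals, subgrads = zip(*(recourse(x, d) for d in demands))
        Q, g = np.mean(vals), np.mean(subgrads)
        cuts.append((Q - g * x, g))                # theta >= Q + g * (x' - x)
        # Master problem: min_x c*x + theta subject to the cuts, on a grid.
        theta = np.max([a + b * grid for a, b in cuts], axis=0)
        x_new = grid[np.argmin(c * grid + theta)]
        if np.isclose(x_new, x):
            break
        x = x_new
    return x

print("approx. optimal first-stage decision:", l_shaped())   # about 70.0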
33

Kutlu, Mucahid. "Parallel Processing of Large Scale Genomic Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1436355132.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Wessman, Love, and Niklas Wessman. "Threat modeling of large-scale computer systems : Implementing and evaluating threat modeling at Company X." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280099.

Full text
Abstract:
Threat modeling is a growing field in cyber security. As computer systems grow larger and more complicated, understanding how to model and protect these systems becomes ever more important, and threat modeling is a method well suited to this task. The two primary motivations behind this research are to strengthen the cyber security at Company X and to help the development of threat modeling, which in turn can help strengthen the field of cyber security. The main question investigated is what results can be achieved by applying the KTH Threat Modeling Method to specific systems used by Company X. This question is answered by implementing the method on the specified systems; the experience and results of that implementation are then used to evaluate the threat modeling method. The produced model implies that the biggest risks in the investigated systems are the connected smoke sensor and the smart meter that measures water and electricity usage. Among the recommendations given are to protect against SQL injection by keeping the systems up to date and validating input. The main impression from implementing the threat modeling method at Company X is that the method is easy to use, learn, and understand. Another result is that the more information one has about the systems in the IT infrastructure being investigated, the more precise the threat model can become. The method is ideally used with a focus on pure, interconnected software implementations, rather than for modeling several non-connected systems in a single iteration of the method, which is what this report does. To make the method easier to teach and spread, a comprehensive written source such as a book could be used. To improve the method itself, the inclusion of automated attack simulation and modeling tools is suggested. Lastly, the KTH Threat Modeling Method is an iterative process, which can and should be improved by continuously iterating over the model, going deeper with every step; the body of work presented in this report is a first iteration of this ongoing process. The findings of this report point to the fact that while the KTH Threat Modeling Method is already a mature method fully able to produce meaningful threat modeling results, there are still aspects that could be improved or added which would increase the overall strength of the method.
APA, Harvard, Vancouver, ISO, and other styles
35

Papacharalampos, Georgios. "Small scale/large scale MFC stacks for improved power generation and implementation in robotic applications." Thesis, University of the West of England, Bristol, 2016. http://eprints.uwe.ac.uk/27396/.

Full text
Abstract:
Microbial Fuel Cells (MFCs) are biological electrical generators, or batteries, that have been shown to be able to energise electronic devices solely from the breakdown of organic matter found in wastewater. The power generated from a single unit is currently insufficient to run standard electronics, hence alternative strategies are needed for stepping up their performance to functional levels. This line of work deals with MFC miniaturisation; their proliferation into large stacks; power improvement by using new electrode components; and finally a novel method of energy harvesting that enhances the operation of a self-sustainable robotic platform. A new small-scale MFC design was developed using 3D printing technology that outperformed a pre-existing MFC of the same volume (6.25 mL), highlighting the importance of reactor configuration and material selection. Furthermore, improvements were made by the use of a cathode electrode that facilitates a higher rate of oxygen reduction reaction (ORR) due to the high-surface-area carbon nanoparticles coated on the outer layer. Consequently, a 24-MFC stack was built to simulate a small-scale wastewater treatment system. The MFC units were connected in various arrangements, both fluidically as a series of cascades and electrically in parallel or in series, to identify the best possible configuration for organic content reduction and power output. Results suggest that in-parallel connections allow for higher waste removal, and that adding extra units to a cascade is a possible way to ensure that the organic content of the feedstock is always reduced to below the set or permitted levels for environmental discharge. Finally, a new method of fault-proof energy harvesting in stacks was devised and developed, producing a unique energy-autonomous harvester that requires no voltage boosting and achieves efficiencies above 90%. This thesis concludes with the transferability of the above findings to a robotic test platform which demonstrates energy-autonomous behaviour and highlights the synergy between the bacterial engine and the mechatronics.
APA, Harvard, Vancouver, ISO, and other styles
36

Nytén, Anton. "Low-Cost Iron-Based Cathode Materials for Large-Scale Battery Applications." Doctoral thesis, Uppsala University, Department of Materials Chemistry, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6842.

Full text
Abstract:

There are today clear indications that the Li-ion battery of the type currently used worldwide in mobile phones and laptops is also destined to soon become the battery of choice in more energy-demanding concepts such as electric and electric hybrid vehicles (EVs and EHVs). Since the currently used cathode materials (typically of the Li(Ni,Co)O2 type) are too expensive for large-scale applications, these new batteries will have to exploit some much cheaper transition metal. Ideally, this should be the very cheapest, iron (Fe), in combination with a graphite (C) based anode. In this context, the obvious Fe-based active cathode of choice appears to be LiFePO4. A second and in some ways even more attractive material, Li2FeSiO4, has emerged during the course of this work.

An effort has here been made to understand the Li extraction/insertion mechanism on electrochemical cycling of Li2FeSiO4. A fascinating picture has emerged (following a complex combination of Mössbauer, X-ray diffraction and electrochemical studies) in which the material is seen to cycle between Li2FeSiO4 and LiFeSiO4, but with the structure of the original Li2FeSiO4 transforming from a metastable short-range ordered solid-solution into a more stable long-range ordered structure during the first cycle. Density Functional Theory calculations on Li2FeSiO4 and the delithiated LiFeSiO4 structure provide an interesting insight into the experimental result.

Photoelectron spectroscopy was used to study the surface chemistry of both carbon-treated LiFePO4 and Li2FeSiO4 after electrochemical cycling. The surface layer on both materials was concluded to be very thin, with incomplete coverage, giving the promise of good long-term cycling.

LiFePO4 and Li2FeSiO4 should both be seen as highly promising candidates as positive-electrode materials for large-scale Li-ion battery applications.

APA, Harvard, Vancouver, ISO, and other styles
37

Gutin, Eli. "Practical applications of large-scale stochastic control for learning and optimization." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/120191.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 183-188).
This thesis explores a variety of techniques for large-scale stochastic control. These range from simple heuristics that are motivated by the problem structure and are amenable to analysis, to more general deep reinforcement learning (RL) which applies to broader classes of problems but is trickier to reason about. In the first part of this thesis, we explore a lesser-known application of stochastic control in multi-armed bandits. By assuming a Bayesian statistical model, we get enough problem structure so that we can formulate an MDP to maximize total rewards. If the objective involved total discounted rewards over an infinite horizon, then the celebrated Gittins index policy would be optimal. Unfortunately, the analysis there does not carry over to the non-discounted, finite-horizon problem. In this work, we propose a tightening sequence of 'optimistic' approximations to the Gittins index. We show that the use of these approximations together with the use of an increasing discount factor appears to offer a compelling alternative to state-of-the-art algorithms. We prove that these optimistic indices constitute a regret-optimal algorithm, in the sense of meeting the Lai-Robbins lower bound, including matching constants. The second part of the thesis focuses on the collateral management problem (CMP). In this work, we study the CMP, faced by a prime brokerage, through the lens of multi-period stochastic optimization. We find that, for a large class of CMP instances, algorithms that select collateral based on appropriately computed asset prices are near-optimal. In addition, we back-test the method on data from a prime brokerage and find substantial increases in revenue. Finally, in the third part, we propose novel deep reinforcement learning (DRL) methods for option pricing and portfolio optimization problems. Our work on option pricing enables one to compute tighter confidence bounds on the price than existing techniques, using the same number of Monte Carlo samples. We also examine constrained portfolio optimization problems and test out policy gradient algorithms that work with somewhat different objective functions. These new objectives measure the performance of a projected version of the policy and penalize constraint violation.
by Eli Gutin.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
38

McVay, Elaine D. "Large scale applications of 2D materials for sensing and energy harvesting." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111925.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references.
In this project we demonstrate the fabrication and characterization of printed reduced graphene oxide strain sensors, Chemical Vapor Deposition (CVD) 2D material transistors, and tungsten diselenide (WSe₂) photovoltaic devices that were produced through a combination of printing and conventional microfabrication processes. Each of these components was designed with the purpose of fitting into a "smart skin" system that could be discretely integrated into and sense its environment. This thesis document will describe the modification of a 3D printer to give it inkjet capabilities that allow for the direct deposition of graphene oxide flakes onto a 3D printed surface. These graphene oxide flake traces were then reduced, making them more conductive and able to function as strain sensors. Next, this thesis will discuss the development of CVD molybdenum disulfide (MoS₂) and CVD graphene transistors and how they can be modified to function as chemical sensors. Finally, this work will detail the steps taken to design, fabricate, and test a WSe₂ photovoltaic device which is composed of a printed active layer. In summary, these devices can fit into the sensing, communication, and energy harvesting blocks required in realizing a ubiquitous sensing system.
by Elaine D. McVay.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
39

Ezeozue, Chidube Donald. "Large-scale consensus clustering and data ownership considerations for medical applications." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/86273.

Full text
Abstract:
Thesis: S.M. in Technology and Policy, Massachusetts Institute of Technology, Engineering Systems Division, Technology and Policy Program, 2013.
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 97-101).
An intersection of events has led to a massive increase in the amount of medical data being collected from patients inside and outside the hospital. These events include the development of new sensors, the continuous decrease in the cost of data storage, the development of Big Data algorithms in other domains and the Health Information Technology for Economic and Clinical Health (HITECH) Act's $20 billion incentive for hospitals to install and use Electronic Health Record (EHR) systems. The data being collected presents an excellent opportunity to improve patient care. However, this opportunity is not without its challenges. Some of the challenges are technical in nature, not the least of which is how to efficiently process such massive amounts of data. At the other end of the spectrum, there are policy questions that deal with data privacy, confidentiality and ownership to ensure that research continues unhindered while preserving the rights and interests of the stakeholders involved. This thesis addresses both ends of the challenge spectrum. First of all, we design and implement a number of methods for automatically discovering groups within large amounts of data, otherwise known as clustering. We believe this technique would prove particularly useful in identifying patient states, segregating cohorts of patients, and generating hypotheses. Specifically, we scale a popular clustering algorithm, Expectation-Maximization (EM) for Gaussian Mixture Models, to be able to run on a cloud of computers. We also give a lot of attention to the idea of Consensus Clustering, which allows multiple clusterings to be merged into a single ensemble clustering. Here, we scale one existing consensus clustering algorithm, which relies on EM for multinomial mixture models. We also develop and implement a more general framework for retrofitting any consensus clustering algorithm and making it amenable to streaming data as well as distribution on a cloud. On the policy end of the spectrum, we argue that the issue of data ownership is essential and highlight how the law in the United States has handled this issue in the past several decades, focusing on common law and state law approaches. We proceed to identify the flaws, especially the fragmentation, in the current system and make recommendations for a more equitable and efficient policy stance. The recommendations center on codifying the policy stance in Federal Law and allocating the property rights of the data to both the healthcare provider and the patient.
by Chidube Donald Ezeozue.
S.M. in Technology and Policy
S.M.
APA, Harvard, Vancouver, ISO, and other styles
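The clustering part of the Ezeozue entry above scales Expectation-Maximization for Gaussian mixture models across a cloud; the distributed and consensus layers are the contribution, but the underlying per-iteration updates are the standard E-step (responsibilities) and M-step (parameter re-estimation). Those single-machine updates are sketched below with diagonal covariances for brevity; this is textbook EM, not the distributed implementation from the thesis.

import numpy as np

def em_gmm(X, k, n_iter=100, seed=0):
    # Plain EM for a Gaussian mixture with diagonal covariances.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]            # initial means
    var = np.tile(X.var(axis=0), (k, 1))               # initial variances
    pi = np.full(k, 1.0 / k)                           # mixing weights
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i).
        log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                         + np.log(2 * np.pi * var)).sum(-1)
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)      # numerical stability
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var

# Toy usage: two well-separated blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
print(em_gmm(X, k=2)[1])   # estimated means land near the two blob centers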
40

Scarciotti, Giordano. "Approximation, analysis and control of large-scale systems : theory and applications." Thesis, Imperial College London, 2016. http://hdl.handle.net/10044/1/30781.

Full text
Abstract:
This work presents some contributions to the fields of approximation, analysis and control of large-scale systems. Consequently, the thesis consists of three parts. The first part covers approximation topics and includes several contributions to the area of model reduction. Firstly, model reduction by moment matching for linear and nonlinear time-delay systems, including neutral differential time-delay systems with discrete-delays and distributed delays, is considered. Secondly, a theoretical framework and a collection of techniques to obtain reduced order models by moment matching from input/output data for linear (time-delay) systems and nonlinear (time-delay) systems is presented. The theory developed is then validated with the introduction and use of a low-complexity algorithm for the fast estimation of the moments of the NETS-NYPS benchmark interconnected power system. Then, the model reduction problem is solved when the class of input signals generated by a linear exogenous system which does not have an implicit (differential) form is considered. The work regarding the topic of approximation is concluded with a chapter covering the problem of model reduction for linear singular systems. The second part of the thesis, which concerns the area of analysis, consists of two very different contributions. The first proposes a new "discontinuous phasor transform" which allows to analyze in closed-form the steady-state behavior of discontinuous power electronic devices. The second presents in a unified framework a class of theorems inspired by the Krasovskii-LaSalle invariance principle for the study of "liminf" convergence properties of solutions of dynamical systems. Finally, in the last part of the thesis the problem of finite-horizon optimal control with input constraints is studied and a methodology to compute approximate solutions of the resulting partial differential equation is proposed.
APA, Harvard, Vancouver, ISO, and other styles
41

Moise, Diana Maria. "Optimizing data management for MapReduce applications on large-scale distributed infrastructures." Thesis, Cachan, Ecole normale supérieure, 2011. http://www.theses.fr/2011DENS0067/document.

Full text
Abstract:
Data-intensive applications are nowadays widely used in various domains to extract and process information, to design complex systems, to perform simulations of real models, etc. These applications exhibit challenging requirements in terms of both storage and computation. Specialized abstractions like Google's MapReduce were developed to efficiently manage the workloads of data-intensive applications. The MapReduce abstraction has revolutionized the data-intensive community and has rapidly spread to various research and production areas. An open-source implementation of Google's abstraction was provided by Yahoo! through the Hadoop project. This framework is considered the reference MapReduce implementation and is currently heavily used for various purposes and on several infrastructures. To achieve high-performance MapReduce processing, we propose a concurrency-optimized file system for MapReduce frameworks. As a starting point, we rely on BlobSeer, a framework that was designed as a solution to the challenge of efficiently storing data generated by data-intensive applications running at large scales. We have built the BlobSeer File System (BSFS), with the goal of providing high throughput under heavy concurrency to MapReduce applications. We also study several aspects related to intermediate data management in MapReduce frameworks. We investigate the requirements of MapReduce intermediate data at two levels: inside the same job, and during the execution of pipeline applications. Finally, we show how BSFS can enable extensions to the de facto MapReduce implementation, Hadoop, such as support for the append operation. This work also comprises the evaluation and the results obtained in the context of grid and cloud environments.
APA, Harvard, Vancouver, ISO, and other styles
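For readers unfamiliar with the MapReduce abstraction that the BlobSeer/BSFS work above targets, its essence is just two user-supplied functions: map emits intermediate key/value pairs and reduce aggregates all values that share a key; the framework handles partitioning, shuffling and storage (the part BSFS optimizes). A toy, in-memory version of the programming model, with the canonical word-count example, is sketched below; it says nothing about Hadoop's or BSFS's actual APIs.

from collections import defaultdict
from itertools import chain

def run_mapreduce(records, map_fn, reduce_fn):
    # Toy sequential MapReduce: map, shuffle by key, then reduce.
    # Map phase: each record yields (key, value) pairs.
    intermediate = chain.from_iterable(map_fn(r) for r in records)
    # Shuffle phase: group values by key (normally done by the framework).
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    # Reduce phase: aggregate each key's values.
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Canonical word count.
def wc_map(line):
    for word in line.lower().split():
        yield word, 1

def wc_reduce(word, counts):
    return sum(counts)

docs = ["the quick brown fox", "the lazy dog", "the quick dog"]
print(run_mapreduce(docs, wc_map, wc_reduce))
# {'the': 3, 'quick': 2, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 2}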
42

Asif, Fayyaz Muhammad. "Achieving Robust Self Management for Large Scale Distributed Applications using Management Elements." Thesis, KTH, School of Information and Communication Technology (ICT), 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-24229.

Full text
Abstract:

Autonomic computing is an approach proposed by IBM that enables a system to self-configure, self-heal, self-optimize, and self-protect, usually referred to as self-* or self-management. Humans should only specify higher-level policies to guide the self-* behavior of the system.

Self-management is achieved using control feedback loops that consist of four stages: monitor, analyze, plan, and execute. Management is more challenging in dynamic distributed environments where resources can join, leave, and fail. To address this problem, a Distributed Component Management System (DCMS), a.k.a. Niche, is being developed at KTH and SICS (Swedish Institute of Computer Science). DCMS provides abstractions that enable the construction of distributed control feedback loops. Each loop consists of a number of management elements (MEs) that perform one or more of the four stages of a control loop mentioned above.

The current implementation of DCMS assumes that management elements (MEs) are deployed on stable nodes that do not fail. This assumption is difficult to guarantee in many environments and application scenarios. One solution to this limitation is to replicate MEs so that if one fails, other MEs can continue working and restore the failed one. The problem is that MEs are stateful: we need to keep the state consistent among replicas, make sure that all events are processed (nothing is lost), and ensure that all actions are applied exactly once.

This report explains a proposal for the replication of stateful MEs under the DCMS framework. For improved scalability, load balancing and fault tolerance, different breakthroughs in the field of replicated state machines have been taken into account and discussed in this report. Chord has been used as the underlying structured overlay network (SON). This report also describes a prototype implementation of this proposal and discusses the results.

APA, Harvard, Vancouver, ISO, and other styles
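The control feedback loops referred to in the entry above follow the monitor-analyze-plan-execute structure from autonomic computing. As a purely illustrative, single-process rendering of the four stages (the management-element roles that DCMS/Niche distributes and, in this thesis, replicates), a toy loop might look as follows; the metric names, thresholds and actions are invented and do not reflect the Niche API.

import random
import time

def monitor():
    # Collect a sensor reading from the managed system (stubbed here).
    return {"cpu_load": random.uniform(0.0, 1.0)}

def analyze(metrics, high=0.8, low=0.2):
    # Turn raw metrics into a symptom the planner understands.
    if metrics["cpu_load"] > high:
        return "overloaded"
    if metrics["cpu_load"] < low:
        return "underloaded"
    return None

def plan(symptom):
    # Choose a corrective action for the detected symptom.
    return {"overloaded": "add_worker", "underloaded": "remove_worker"}.get(symptom)

def execute(action):
    # Apply the action through the system's actuators (stubbed here).
    if action:
        print("executing:", action)

def control_loop(iterations=5, period=0.1):
    for _ in range(iterations):
        execute(plan(analyze(monitor())))
        time.sleep(period)

control_loop()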
43

Putthividhya, Wanida. "Quality of service (QoS) support for multimedia applications in large-scale networks." [Ames, Iowa : Iowa State University], 2006.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
44

Seal, Sudip K. "Parallel methods for large-scale applications in computational electromagnetics and materials science." [Ames, Iowa : Iowa State University], 2007.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
45

Tor, Ali Hakan. "Derivative Free Algorithms For Large Scale Non-smooth Optimization And Their Applications." PhD thesis, METU, 2013. http://etd.lib.metu.edu.tr/upload/12615700/index.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Liu, X. "Kernel methods for large-scale process monitoring with applications in power systems." Thesis, Queen's University Belfast, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.517392.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Wiersma, Mark Edward. "Standards and benchmark tests for evaluating large scale manipulators with construction applications." Thesis, Monterey, California. Naval Postgraduate School, 1995. http://hdl.handle.net/10945/26265.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Sunderland, Andrew Gareth. "Large scale applications on distributed-memory parallel computers using efficient numerical methods." Thesis, University of Liverpool, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.367976.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Wollenweber, Fritz Georg. "Programming environments and tools for massively parallel computers and large scale applications." Thesis, University of Southampton, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.266943.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Eltayeb, Mohammed Soleiman. "Efficient data scheduling for real-time large-scale data-intensive distributed applications." The Ohio State University, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=osu1095719463.

Full text
APA, Harvard, Vancouver, ISO, and other styles