Dissertations / Theses on the topic '080201 Analysis of Algorithms and Complexity'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 39 dissertations / theses for your research on the topic '080201 Analysis of Algorithms and Complexity.'

Next to every source in the list of references there is an 'Add to bibliography' button. Press it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Richey, Michael Bruce. "Combinatorial optimization on series-parallel graphs : algorithms and complexity." Diss., Georgia Institute of Technology, 1985. http://hdl.handle.net/1853/24542.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Nordin, Henrik, and Kevin Jouper. "Performance analysis of multithreaded sorting algorithms." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-10404.

Full text
Abstract:
Context. Almost all modern computers have a CPU with multiple cores, providing extra computational power. In the new age of big data, parallel execution is essential to improve performance to an acceptable level. With parallelisation come new challenges that need to be considered. Objectives. In this work, parallel algorithms are compared and analysed in relation to their sequential counterparts, using the Java platform. Through this, the potential speedup from multithreading and the factors that affect performance are identified. In addition, source code for multithreaded algorithms with proven time complexities is provided. Methods. A literature study was conducted to gain knowledge and a deeper understanding of sorting algorithms and the area of parallel computing. An experiment followed in which a set of algorithms was implemented and data was gathered through benchmarking and testing. The gathered data was studied and analysed together with its corresponding source code to validate the parallelisation. Results. Multithreading does improve performance, with two threads on average providing a speedup of up to 2x and four threads up to 3x. However, the potential speedup is bound by the available physical threads of the CPU and depends on balancing the workload. Conclusions. Workload balancing and the use of the correct number of threads in relation to the problem to be solved need to be carefully considered in order to utilise the extra resources to their full potential.
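For illustration, the following is a minimal sketch of the kind of benchmark this thesis describes, written in Python rather than the thesis's Java code; the chunk count, input size and merge strategy are assumptions chosen for brevity. In CPython the inter-process copying and the serial merge can easily dominate, which itself illustrates the point about workload balance and overheads.

```python
import heapq
import random
import time
from multiprocessing import Pool

def parallel_sort(data, workers=4):
    """Sort chunks in separate processes, then k-way merge the sorted chunks."""
    size = len(data) // workers
    chunks = [data[i * size:(i + 1) * size] for i in range(workers - 1)]
    chunks.append(data[(workers - 1) * size:])
    with Pool(workers) as pool:
        sorted_chunks = pool.map(sorted, chunks)   # parallel phase
    return list(heapq.merge(*sorted_chunks))       # serial merge phase

def bench(fn, data):
    start = time.perf_counter()
    result = fn(data)
    elapsed = time.perf_counter() - start
    assert result == sorted(data)                  # sanity check
    return elapsed

if __name__ == "__main__":
    data = [random.random() for _ in range(2_000_000)]
    t_seq = bench(sorted, data)
    t_par = bench(parallel_sort, data)
    # Whether the parallel version wins depends on core count, data size and
    # the cost of pickling/merging -- exactly the balancing issue discussed above.
    print(f"sequential: {t_seq:.2f}s  parallel: {t_par:.2f}s  speedup: {t_seq / t_par:.2f}x")
```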
APA, Harvard, Vancouver, ISO, and other styles
3

Castura, Jeff. "Performance analysis and optimization of reduced complexity Low Density Parity Check decoding algorithms." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0017/MQ53426.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Powell, David Richard 1973. "Algorithms for sequence alignment." Monash University, School of Computer Science and Software Engineering, 2001. http://arrow.monash.edu.au/hdl/1959.1/8051.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Philips, Petra Camilla. "Data-Dependent Analysis of Learning Algorithms." The Australian National University, Research School of Information Sciences and Engineering, 2005. http://thesis.anu.edu.au./public/adt-ANU20050901.204523.

Full text
Abstract:
This thesis studies the generalization ability of machine learning algorithms in a statistical setting. It focuses on the data-dependent analysis of the generalization performance of learning algorithms in order to make full use of the potential of the actual training sample from which these algorithms learn.¶ First, we propose an extension of the standard framework for the derivation of generalization bounds for algorithms taking their hypotheses from random classes of functions. This approach is motivated by the fact that the function produced by a learning algorithm based on a random sample of data depends on this sample and is therefore a random function. Such an approach avoids the detour of the worst-case uniform bounds as done in the standard approach. We show that the mechanism which allows one to obtain generalization bounds for random classes in our framework is based on a “small complexity” of certain random coordinate projections. We demonstrate how this notion of complexity relates to learnability and how one can explore geometric properties of these projections in order to derive estimates of rates of convergence and good confidence interval estimates for the expected risk. We then demonstrate the generality of our new approach by presenting a range of examples, among them the algorithm-dependent compression schemes and the data-dependent luckiness frameworks, which fall into our random subclass framework.¶ Second, we study in more detail generalization bounds for a specific algorithm which is of central importance in learning theory, namely the Empirical Risk Minimization algorithm (ERM). Recent results show that one can significantly improve the high-probability estimates for the convergence rates for empirical minimizers by a direct analysis of the ERM algorithm. These results are based on a new localized notion of complexity of subsets of hypothesis functions with identical expected errors and are therefore dependent on the underlying unknown distribution. We investigate the extent to which one can estimate these high-probability convergence rates in a data-dependent manner. We provide an algorithm which computes a data-dependent upper bound for the expected error of empirical minimizers in terms of the “complexity” of data-dependent local subsets. These subsets are sets of functions of empirical errors of a given range and can be determined based solely on empirical data. We then show that recent direct estimates, which are essentially sharp estimates on the high-probability convergence rate for the ERM algorithm, can not be recovered universally from empirical data.
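For orientation, the worst-case uniform bounds that this data-dependent analysis seeks to avoid typically take the following standard Rademacher-complexity form (quoted as background for losses bounded in [0, 1], not as a result of the thesis): with probability at least 1 − δ over an i.i.d. sample of size n, every hypothesis f in the class F satisfies

```latex
R(f) \;\le\; \widehat{R}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}},
```

where R is the expected risk, R̂_n the empirical risk, and ℜ_n(F) the Rademacher complexity of the (loss) class. The random-subclass and localized analyses described in the abstract replace the class-level complexity term with quantities that depend on the actual sample.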
APA, Harvard, Vancouver, ISO, and other styles
6

Ho, Lester Tse Wee. "Self-organising algorithms for fourth generation wireless networks and its analysis using complexity metrics." Thesis, Queen Mary, University of London, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.407388.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Kenny, Robert. "Orbit complexity and computable Markov partitions." University of Western Australia. School of Mathematics and Statistics, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0231.

Full text
Abstract:
Markov partitions provide a 'good' mechanism of symbolic dynamics for uniformly hyperbolic systems, forming the classical foundation for the thermodynamic formalism in this setting, and remaining useful in the modern theory. Usually, however, one takes Bowen's 1970s general construction for granted, or restricts to cases with simpler geometry (as on surfaces) or more algebraic structure. This thesis examines several questions on the algorithmic content of (topological) Markov partitions, starting with the pointwise, entropy-like, topological conjugacy invariant known as orbit complexity. The relation between orbit complexity definitions of Brudno and Galatolo is examined in general compact spaces, and used in Theorem 2.0.9 to bound the decrease in some of these quantities under semiconjugacy. A corollary, and a pointwise analogue of facts about metric entropy, is that any Markov partition produces symbolic dynamics matching the original orbit complexity at each point. A Lebesgue-typical value for orbit complexity near a hyperbolic attractor is also established (with some use of Brin-Katok local entropy), and is technically distinct from typicality statements discussed by Galatolo, Bonanno and their co-authors. Both our results are proved adapting classical arguments of Bowen for entropy. Chapters 3 and onwards consider the axiomatisation and computable construction of Markov partitions. We propose a framework of 'abstract local product structures'
APA, Harvard, Vancouver, ISO, and other styles
8

Wei, Ke. "Efficient algorithms for compressed sensing and matrix completion." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:0e2e72fb-dd0c-457b-a0a5-f91c5212f5f5.

Full text
Abstract:
Compressed sensing and matrix completion are two new data acquisition techniques whose efficiency is achieved by exploring low dimensional structures in high dimensional data. Despite the combinatorial nature of compressed sensing and matrix completion, there has been significant development of computationally efficient algorithms which can produce accurate desired solutions to these problems. In this thesis, we are concerned with the development of low per iteration computational complexity algorithms for compressed sensing and matrix completion. First, we derive a locally optimal stepsize selection rule for the simplest iterative hard thresholding algorithm for matrix completion, and obtain a simple yet efficient algorithm. It is observed to have average case performance superior in some aspects to other matrix completion algorithms. To balance the fast convergence rates of more sophisticated recovery algorithms with the low per iteration computational cost of simple line-search algorithms, we introduce a family of conjugate gradient iterative hard thresholding algorithms for both compressed sensing and matrix completion. The theoretical results establish recovery guarantees for the restarted and projected variants of the algorithms, while the empirical performance comparisons establish significant computational advantages of the proposed methods over other hard thresholding algorithms. Finally, we introduce an alternating steepest descent method and a scaled variant especially designed for the matrix completion problem based on a simple factorization model of the low rank matrix. The computational efficacy of this method is achieved by reducing the high per iteration computational cost of the second order method and fully exploring the numerical linear algebra structure in the algorithm. Empirical evaluations establish the effectiveness of the proposed algorithms, compared with other state-of-the-art algorithms.
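As background for the algorithms discussed above, here is a minimal numpy sketch of plain iterative hard thresholding for matrix completion (a gradient step on the observed entries followed by a rank-r SVD projection). It is illustrative only: the fixed stepsize `alpha`, the rank `r` and the iteration count are assumptions, whereas the thesis derives a locally optimal stepsize and conjugate-gradient accelerated variants.

```python
import numpy as np

def iht_matrix_completion(M_obs, mask, r, alpha=1.0, iters=200):
    """X_{k+1} = P_r( X_k + alpha * mask * (M_obs - X_k) ),
    where P_r projects onto rank-r matrices via a truncated SVD."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        G = mask * (M_obs - X)                       # residual on observed entries only
        U, s, Vt = np.linalg.svd(X + alpha * G, full_matrices=False)
        X = (U[:, :r] * s[:r]) @ Vt[:r]              # hard thresholding: keep top-r triplets
    return X

# Tiny synthetic example: a rank-2 matrix with roughly half its entries observed.
rng = np.random.default_rng(0)
A, B = rng.standard_normal((60, 2)), rng.standard_normal((2, 40))
M = A @ B
mask = (rng.random(M.shape) < 0.5).astype(float)
X_hat = iht_matrix_completion(mask * M, mask, r=2)
print("relative error:", np.linalg.norm(X_hat - M) / np.linalg.norm(M))
```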
APA, Harvard, Vancouver, ISO, and other styles
9

Starrett, Dean. "Optimal Alignment of Multiple Sequence Alignments." Diss., The University of Arizona, 2008. http://hdl.handle.net/10150/194840.

Full text
Abstract:
An essential tool in biology is the alignment of multiple sequences. Biologists use multiple sequence alignments for tasks such as predicting protein structure and function, reconstructing phylogenetic trees, and finding motifs. Constructing high-quality multiple alignments is computationally hard, both in theory and in practice, and is typically done using heuristic methods. The majority of state-of-the-art multiple alignment programs employ a form and polish strategy, where in the construction phase, an initial multiple alignment is formed by progressively merging smaller alignments, starting with single sequences. Then in a local-search phase, the resulting alignment is polished by repeatedly splitting it into smaller alignments and re-merging. This merging of alignments, the basic computational problem in the construction and local-search phases of the best multiple alignment heuristics, is called the Aligning Alignments Problem. Under the sum-of-pairs objective for scoring multiple alignments, this problem may seem to be a simple extension of two-sequence alignment. It is proven here, however, that with affine gap costs (which are recognized as necessary to get biologically-informative alignments) the problem is NP-complete when gaps are counted exactly. Interestingly, this form of multiple alignment is polynomial-time solvable when we relax the exact count, showing that exact gap counts themselves are inherently hard in multiple sequence alignment. Unlike general multiple alignment however, we show that Aligning Alignments with affine gap costs and exact counts is tractable in practice, by demonstrating an effective algorithm and a fast implementation. Our software AlignAlign is both time- and space-efficient on biological data. Computational experiments on biological data show instances derived from standard benchmark suites can be optimally aligned with surprising efficiency, and experiments on simulated data show the time and space both scale well.
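The basic two-sequence building block behind the merging described above is alignment under affine gap costs. The following is a minimal sketch of Gotoh's dynamic program for two plain sequences (scores and gap parameters are arbitrary choices); it is not the thesis's Aligning Alignments algorithm, which operates on whole alignments and tracks exact gap counts.

```python
import numpy as np

NEG = -1e9  # effectively minus infinity

def affine_gap_align(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Optimal pairwise alignment score with affine gap cost
    gap_open + gap_extend * (length - 1), in O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    M = np.full((n + 1, m + 1), NEG)  # best score ending in a (mis)match
    X = np.full((n + 1, m + 1), NEG)  # best score ending with a gap in b
    Y = np.full((n + 1, m + 1), NEG)  # best score ending with a gap in a
    M[0, 0] = 0.0
    for i in range(1, n + 1):
        X[i, 0] = -gap_open - gap_extend * (i - 1)
    for j in range(1, m + 1):
        Y[0, j] = -gap_open - gap_extend * (j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i, j] = max(M[i - 1, j - 1], X[i - 1, j - 1], Y[i - 1, j - 1]) + s
            X[i, j] = max(M[i - 1, j] - gap_open, X[i - 1, j] - gap_extend)
            Y[i, j] = max(M[i, j - 1] - gap_open, Y[i, j - 1] - gap_extend)
    return max(M[n, m], X[n, m], Y[n, m])

print(affine_gap_align("ACACACTA", "AGCACACA"))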
APA, Harvard, Vancouver, ISO, and other styles
10

Osbild, Ralf [Verfasser], and Kurt [Akademischer Betreuer] Mehlhorn. "General analysis tool box for controlled perturbation algorithms and complexity and computation of Θ-guarded regions / Ralf Osbild. Betreuer: Kurt Mehlhorn." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2013. http://d-nb.info/1053634994/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Saročka, Gediminas. "Rūšiavimo algoritmų vizualizavimas ir sudėtingumo analizė." Bachelor's thesis, Lithuanian Academic Libraries Network (LABT), 2012. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2012~D_20120702_130011-99489.

Full text
Abstract:
Complexity analyses of sorting algorithms can be found without difficulty, so the main idea of this work was to create a visualisation of sorting algorithms. The work implements a visualisation of three simple sorting algorithms (insertion, bubble and selection sort) and two fast sorting algorithms (Shell sort and merge sort). The program can also measure the running time of the sorting algorithms, which can be used for further complexity analysis.
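As a minimal illustration of the timing side of such a tool (Python rather than the tool's own code; the input sizes and the pair of algorithms compared are arbitrary choices), the quadratic growth of a simple sort can be contrasted with an O(n log n) sort as follows:

```python
import random
import time

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

for n in (1_000, 2_000, 4_000):
    data = [random.random() for _ in range(n)]
    for sort_fn in (insertion_sort, merge_sort):
        t0 = time.perf_counter()
        sort_fn(data)
        print(f"{sort_fn.__name__:<15} n={n:<6} {time.perf_counter() - t0:.4f}s")
```

Doubling n roughly quadruples the insertion sort time while only a bit more than doubling the merge sort time, which is the behaviour a complexity analysis predicts.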
APA, Harvard, Vancouver, ISO, and other styles
12

Singh, Manjeet. "Mathematical Models, Heuristics and Algorithms for Efficient Analysis and Performance Evaluation of Job Shop Scheduling Systems Using Max-Plus Algebraic Techniques." Ohio University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1386087325.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Cunial, Fabio. "Analysis of the subsequence composition of biosequences." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44716.

Full text
Abstract:
Measuring the amount of information and of shared information in biological strings, as well as relating information to structure, function and evolution, are fundamental computational problems in the post-genomic era. Classical analyses of the information content of biosequences are grounded in Shannon's statistical telecommunication theory, while the recent focus is on suitable specializations of the notions introduced by Kolmogorov, Chaitin and Solomonoff, based on data compression and compositional redundancy. Symmetrically, classical estimates of mutual information based on string editing are currently being supplanted by compositional methods hinged on the distribution of controlled substructures. Current compositional analyses and comparisons of biological strings are almost exclusively limited to short sequences of contiguous solid characters. Comparatively little is known about longer and sparser components, both from the point of view of their effectiveness in measuring information and in separating biological strings from random strings, and from the point of view of their ability to classify and to reconstruct phylogenies. Yet, sparse structures are suspected to grasp long-range correlations and, at short range, they are known to encode signatures and motifs that characterize molecular families. In this thesis, we introduce and study compositional measures based on the repertoire of distinct subsequences of any length, but constrained to occur with a predefined maximum gap between consecutive symbols. Such measures highlight previously unknown laws that relate subsequence abundance to string length and to the allowed gap, across a range of structurally and functionally diverse polypeptides. Measures on subsequences are capable of separating only few amino acid strings from their random permutations, but they reveal that random permutations themselves amass along previously undetected, linear loci. This is perhaps the first time in which the vocabulary of all distinct subsequences of a set of structurally and functionally diverse polypeptides is systematically counted and analyzed. Another objective of this thesis is measuring the quality of phylogenies based on the composition of sparse structures. Specifically, we use a set of repetitive gapped patterns, called motifs, whose length and sparsity have never been considered before. We find that extremely sparse motifs in mitochondrial proteomes support phylogenies of comparable quality to state-of-the-art string-based algorithms. Moving from maximal motifs -- motifs that cannot be made more specific without losing support -- to a set of generators with decreasing size and redundancy, generally degrades classification, suggesting that redundancy itself is a key factor for the efficient reconstruction of phylogenies. This is perhaps the first time in which the composition of all motifs of a proteome is systematically used in phylogeny reconstruction on a large scale. Extracting all maximal motifs, or even their compact generators, is infeasible for entire genomes. In the last part of this thesis, we study the robustness of measures of similarity built around the dictionary of LZW -- the variant of the LZ78 compression algorithm proposed by Welch -- and of some of its recently introduced gapped variants. These algorithms use a very small vocabulary, they perform linearly in the input strings, and they can be made even faster than LZ77 in practice. 
We find that dissimilarity measures based on maximal strings in the dictionary of LZW support phylogenies that are comparable to state-of-the-art methods on test proteomes. Introducing a controlled proportion of gaps does not degrade classification, and allows to discard up to 20% of each input proteome during comparison.
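A minimal sketch of the kind of dictionary-based dissimilarity discussed above: build the LZW dictionary of each string and compare the two phrase sets with a Jaccard-style distance. This illustrates the general idea only; it is not the thesis's maximal-string measure or its gapped variants, and the example strings are arbitrary.

```python
def lzw_phrases(s):
    """Return the set of phrases inserted into the LZW dictionary while parsing s."""
    dictionary = {c: True for c in set(s)}  # seed with the single characters of s
    phrases = set()
    w = ""
    for c in s:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            dictionary[wc] = True
            phrases.add(wc)
            w = c
    return phrases

def lzw_distance(s, t):
    """Jaccard-style dissimilarity between the LZW phrase sets of s and t."""
    P, Q = lzw_phrases(s), lzw_phrases(t)
    return 1.0 - len(P & Q) / len(P | Q) if (P or Q) else 0.0

print(lzw_distance("MKVLATLLLLGA", "MKVLATLMLLGA"))    # similar strings -> smaller distance
print(lzw_distance("MKVLATLLLLGA", "GATTACAGATTACA"))  # dissimilar strings -> larger distance
```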
APA, Harvard, Vancouver, ISO, and other styles
14

Riegel, Ryan Nelson. "Generalized N-body problems: a framework for scalable computation." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50269.

Full text
Abstract:
In the wake of the Big Data phenomenon, the computing world has seen a number of computational paradigms developed in response to the sudden need to process ever-increasing volumes of data. Most notably, MapReduce has proven quite successful in scaling out an extensible class of simple algorithms to even hundreds of thousands of nodes. However, there are some tasks---even embarrassingly parallelizable ones---that neither MapReduce nor any existing automated parallelization framework is well-equipped to perform. For instance, any computation that (naively) requires consideration of all pairs of inputs becomes prohibitively expensive even when parallelized over a large number of worker nodes. Many of the most desirable methods in machine learning and statistics exhibit these kinds of all-pairs or, more generally, all-tuples computations; accordingly, their application in the Big Data setting may seem beyond hope. However, a new algorithmic strategy inspired by breakthroughs in computational physics has shown great promise for a wide class of computations dubbed generalized N-body problems (GNBPs). This strategy, which involves the simultaneous traversal of multiple space-partitioning trees, has been applied to a succession of well-known learning methods, accelerating each asymptotically and by orders of magnitude. Examples of these include all-k-nearest-neighbors search, k-nearest-neighbors classification, k-means clustering, EM for mixtures of Gaussians, kernel density estimation, kernel discriminant analysis, kernel machines, particle filters, the n-point correlation, and many others. For each of these problems, no overall faster algorithms are known. Further, these dual- and multi-tree algorithms compute either exact results or approximations to within specified error bounds, a rarity amongst fast methods. This dissertation aims to unify a family of GNBPs under a common framework in order to ease implementation and future study. We start by formalizing the problem class and then describe a general algorithm, the generalized fast multipole method (GFMM), capable of solving all problems that fit the class, though with varying degrees of speedup. We then show O(N) and O(log N) theoretical run-time bounds that may be obtained under certain conditions. As a corollary, we derive the tightest known general-dimensional run-time bounds for exact all-nearest-neighbors and several approximated kernel summations. Next, we implement a number of these algorithms in a commercial database, empirically demonstrating dramatic asymptotic speedup over their conventional SQL implementations. Lastly, we implement a fast, parallelized algorithm for kernel discriminant analysis and apply it to a large dataset (40 million points in 4D) from the Sloan Digital Sky Survey, identifying approximately one million quasars with high accuracy. This exceeds the previous largest catalog of quasars in size by a factor of ten and has since been used in a follow-up study to confirm the existence of dark energy.
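As a point of reference for the all-nearest-neighbours computations mentioned above, even a single space-partitioning tree avoids the naive all-pairs scan; the dual- and multi-tree algorithms in the dissertation go further by traversing several trees simultaneously. A minimal single-tree sketch using SciPy's k-d tree (point count, dimension and k are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((100_000, 4))        # 100k points in 4 dimensions

tree = cKDTree(points)                   # build the space-partitioning tree
dist, idx = tree.query(points, k=6)      # k=6: each point itself plus its 5 nearest neighbours
neighbours = idx[:, 1:]                  # drop the self-match in column 0
print(neighbours.shape)                  # (100000, 5)
```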
APA, Harvard, Vancouver, ISO, and other styles
15

Nirenstein, Shaun. "Fast and Accurate Visibility Preprocessing." Thesis, University of Cape Town, 2003. http://pubs.cs.uct.ac.za/archive/00000101/.

Full text
Abstract:
Visibility culling is a means of accelerating the graphical rendering of geometric models. Invisible objects are efficiently culled to prevent their submission to the standard graphics pipeline. It is advantageous to preprocess scenes in order to determine invisible objects from all possible camera views. This information is typically saved to disk and may then be reused until the model geometry changes. Such preprocessing algorithms are therefore used for scenes that are primarily static. Currently, the standard approach to visibility preprocessing algorithms is to use a form of approximate solution, known as conservative culling. Such algorithms over-estimate the set of visible polygons. This compromise has been considered necessary in order to perform visibility preprocessing quickly. These algorithms attempt to satisfy the goals of both rapid preprocessing and rapid run-time rendering. We observe, however, that there is a need for algorithms with superior performance in preprocessing, as well as for algorithms that are more accurate. For most applications these features are not required simultaneously. In this thesis we present two novel visibility preprocessing algorithms, each of which is strongly biased toward one of these requirements. The first algorithm has the advantage of performance. It executes quickly by exploiting graphics hardware. The algorithm also has the features of output sensitivity (to what is visible), and a logarithmic dependency in the size of the camera space partition. These advantages come at the cost of image error. We present a heuristic guided adaptive sampling methodology that minimises this error. We further show how this algorithm may be parallelised and also present a natural extension of the algorithm to five dimensions for accelerating generalised ray shooting. The second algorithm has the advantage of accuracy. No over-estimation is performed, nor are any sacrifices made in terms of image quality. The cost is primarily that of time. Despite the relatively long computation, the algorithm is still tractable and on average scales slightly superlinearly with the input size. This algorithm also has the advantage of output sensitivity. This is the first known tractable exact solution to the general 3D from-region visibility problem. In order to solve the exact from-region visibility problem, we had to first solve a more general form of the standard stabbing problem. An efficient solution to this problem is presented independently.
APA, Harvard, Vancouver, ISO, and other styles
16

Dragnić, Nataša. "Konstrukcija i analiza klaster algoritma sa primenom u definisanju bihejvioralnih faktora rizika u populaciji odraslog stanovništva Srbije." PhD thesis, Univerzitet u Novom Sadu, Doktorske disertacije iz interdisciplinarne odnosno multidisciplinarne oblasti na Univerzitetu u Novom Sadu, 2016. https://www.cris.uns.ac.rs/record.jsf?recordId=99629&source=NDLTD&language=en.

Full text
Abstract:
Cluster analysis has a long history, and a large number of clustering techniques have been developed in many areas; however, significant challenges remain. This thesis provides an introduction to the nonsmooth optimisation approach to clustering, with reference to the clustering of large datasets. These optimisation-based clustering algorithms, however, work much better when a dataset contains only vectors with continuous features. One of the main challenges is the clustering of large datasets with categorical and mixed (numerical and categorical) data. Dealing with a large number of instances (objects) and a large number of dimensions (variables) can be problematic because of time complexity. One way to address this is to reduce the number of instances without loss of information. The first aim of this thesis was to compare the results of cluster algorithms on the whole dataset and on simple random samples with categorical and mixed data, in terms of validity, for different numbers of clusters and different sample sizes. There were no significant differences (p>0.05) between the results obtained on samples of size 0.03m, 0.05m, 0.1m and 0.3m (where m is the size of the dataset) and on the whole dataset. The second aim was to develop an efficient clustering procedure for large datasets with categorical and mixed (numerical and categorical) values. The proposed procedure consists of the following steps: 1. clustering on simple random samples of a given cardinality; 2. finding the best cluster solution on a sample (by an appropriate validity measure); 3. using the cluster centres from this sample to cluster the remaining data. The third aim was to examine the clustering of four lifestyle risk factors and the variation across different socio-demographic groups in the Serbian adult population. Cluster analysis was carried out on a large representative sample of Serbian adults aged 20 and over. Five homogeneous health-behaviour clusters with specific combinations of risk factors were identified: 'No risk behaviours', 'Harmful alcohol use and other risk behaviours', 'Unhealthy diet and other risk behaviours', 'Insufficient physical activity' and 'Smoking'. Results of multinomial logistic regression indicated that single, less educated adults with low socio-economic status living in the region of Vojvodina are most likely to belong to the clusters with a high-risk profile.
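A minimal numpy sketch of the proposed sample-then-assign workflow (cluster a simple random sample, then assign the rest of the data to the nearest centre). For brevity it uses plain k-means on numeric data and a single sample, whereas the thesis targets categorical and mixed data and selects the best sample solution with a validity measure; the sample fraction and k are illustrative.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return centers, labels

def cluster_large_dataset(X, k, sample_frac=0.05, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), int(sample_frac * len(X)), replace=False)
    centers, _ = kmeans(X[idx], k)                     # steps 1-2: cluster the random sample
    d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return np.argmin(d, axis=1), centers               # step 3: assign the remaining data

X = np.random.default_rng(1).normal(size=(200_000, 3))
labels, centers = cluster_large_dataset(X, k=5)
print(np.bincount(labels))
```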
APA, Harvard, Vancouver, ISO, and other styles
17

Taeihagh, Araz. "A novel approach for the development of policies for socio-technical systems." Thesis, University of Oxford, 2011. http://ora.ox.ac.uk/objects/uuid:0183f800-51bf-4e4d-abba-cd91b7bf48f0.

Full text
Abstract:
The growth in the interdependence and complexity of socio-technical systems requires the development of tools and techniques to aid in the formulation of better policies. The efforts of this research focus towards developing methodologies and support tools for better policy design and formulation. In this thesis, a new framework and a systematic approach for the formulation of policies are proposed. Focus has been directed to the interactions between policy measures, inspired by concepts in process design and network analysis. Furthermore, we have developed an agent-based approach to create a virtual environment for the exploration and analysis of different configurations of policy measures in order to build policy packages and test the effects of changes and uncertainties while formulating policies. By developing systematic approaches for the formulation and analysis of policies it is possible to analyse different configuration alternatives in greater depth, examine more alternatives and decrease the time required for the overall analysis. Moreover, it is possible to provide real-time assessment and feedback to the domain experts on the effect of changes in the configurations. These efforts ultimately help in forming more effective policies with synergistic and reinforcing attributes while avoiding internal contradictions. This research constitutes the first step towards the development of a general family of computer-based systems that support the design of policies. The results from this research also demonstrate the usefulness of computational approaches in addressing the complexity inherent in the formulation of policies. As a proof of concept, the proposed framework and methodologies have been applied to the formulation of policies that deal with transportation issues and emission reduction, but can be extended to other domains.
APA, Harvard, Vancouver, ISO, and other styles
18

Fujdiak, Radek. "Analýza a optimalizace datové komunikace pro telemetrické systémy v energetice." Doctoral thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-358408.

Full text
Abstract:
Telemetry system, Optimisation, Sensoric networks, Smart Grid, Internet of Things, Sensors, Information security, Cryptography, Cryptography algorithms, Cryptosystem, Confidentiality, Integrity, Authentication, Data freshness, Non-Repudiation.
APA, Harvard, Vancouver, ISO, and other styles
19

Evans, Patricia Anne. "Algorithms and complexity for annotated sequence analysis." Thesis, 1999. https://dspace.library.uvic.ca//handle/1828/8864.

Full text
Abstract:
Molecular biologists use algorithms that compare and otherwise analyze sequences that represent genetic and protein molecules. Most of these algorithms, however, operate on the basic sequence and do not incorporate the additional information that is often known about the molecule and its pieces. This research describes schemes to combinatorially annotate this information onto sequences so that it can be analyzed in tandem with the sequence; the overall result would thus reflect both types of information about the sequence. These annotation schemes include adding colours and arcs to the sequence. Colouring a sequence would produce a same-length sequence of colours or other symbols that highlight or label parts of the sequence. Arcs can be used to link sequence symbols (or coloured substrings) to indicate molecular bonds or other relationships. Adding these annotations to sequence analysis problems such as sequence alignment or finding the longest common subsequence can make the problem more complex, often depending on the complexity of the annotation scheme. This research examines the different annotation schemes and the corresponding problems of verifying annotations, creating annotations, and finding the longest common subsequence of pairs of sequences with annotations. This work involves both the conventional complexity framework and parameterized complexity, and includes algorithms and hardness results for both frameworks. Automata and transducers are created for some annotation verification and creation problems. Different restrictions on layered substring and arc annotation are considered to determine what properties an annotation scheme must have to make its incorporation feasible. Extensions to the algorithms that use weighting schemes are explored.
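As a baseline for the annotated variants studied above, the classical longest common subsequence recurrence on plain (unannotated) sequences, which the annotation schemes extend with colours and arcs, can be sketched as follows; the example strings are arbitrary.

```python
def lcs_length(a, b):
    """Classical O(len(a)*len(b)) dynamic program: L[i][j] is the LCS length of a[:i] and b[:j]."""
    L = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[len(a)][len(b)]

print(lcs_length("GATTACA", "GCATGCA"))  # 5
```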
Graduate
APA, Harvard, Vancouver, ISO, and other styles
20

Sani, Habiba M., Ci Lei, and Daniel Neagu. "Computational complexity analysis of decision tree algorithms." 2018. http://hdl.handle.net/10454/16762.

Full text
Abstract:
The decision tree is a simple but powerful learning technique, considered one of the best-known learning algorithms that have been successfully used in practice for various classification tasks. Decision trees have the advantage of producing a comprehensible classification model with satisfactory accuracy levels in several application domains. In recent years, the volume of data available for learning has been increasing dramatically. As a result, many application domains are faced with large amounts of data, posing a major bottleneck on the computability of learning techniques. There are different implementations of the decision tree using different techniques. In this paper, we theoretically and experimentally study and compare the computational power of the most common classical top-down decision tree algorithms (C4.5 and CART). This work can serve as a review that analyses the computational complexity of existing decision tree classifier algorithms, to gain understanding of their operational steps with the aim of optimizing the learning algorithm for large datasets.
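As an illustration of the kind of estimate such an analysis produces (a standard back-of-envelope bound quoted as background, not a result taken from the paper): for n training samples and m attributes, if attribute values are presorted and the splits keep the tree roughly balanced, every sample appears in exactly one node per level and each level scans all m attributes, so building the tree costs roughly

```latex
T(n, m) \;\approx\; \underbrace{O(m\, n \log n)}_{\text{presorting}} \;+\; \underbrace{O(m\, n) \cdot O(\log n)}_{\text{split search over } O(\log n) \text{ levels}} \;=\; O(m\, n \log n).
```

Unbalanced splits or re-sorting at every node push this towards O(m n^2) or worse, which is one reason implementations such as C4.5 and CART can behave quite differently on large datasets.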
APA, Harvard, Vancouver, ISO, and other styles
21

Bernstein, Daniel S. "Complexity analysis and optimal algorithms for decentralized decision making." 2005. https://scholarworks.umass.edu/dissertations/AAI3193879.

Full text
Abstract:
Coordination of distributed entities is required for problems arising in many areas, including multi-robot systems, networking applications, e-commerce applications, and the control of autonomous space vehicles. Because decisions must often be made without a global view of the system, coordination can be difficult. This dissertation focuses on the development of principled tools for solving problems of distributed decision making. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). This framework is very general, incorporating stochastic action effects, uncertainty about the system state, and limited communication. It has been adopted for use in the fields of control theory, operations research, and artificial intelligence. Despite this fact, a number of fundamental questions about the computational aspects of the model have gone unanswered. One contribution of this thesis is an analysis of the worst-case complexity of solving DEC-POMDPs. It was previously established that for a single agent, the finite-horizon version of the problem is PSPACE-complete. We show that the general problem is NEXP-complete, even if there are only two agents whose observations together determine the system state. This complexity result illustrates a fundamental difference between single agent and multiagent decision-making problems. In contrast to the single agent problem, the multiagent problem provably does not admit a polynomial-time algorithm. Furthermore, assuming that EXP and NEXP are distinct, the problem requires super-exponential time to solve in the worst case. A second contribution is an optimal policy iteration algorithm for solving DEC-POMDPs. Stochastic finite-state controllers are used to represent policies. A controller can include a correlation device, which allows agents to correlate their actions without communicating during execution. The algorithm alternates between expanding the controller and performing value-preserving transformations, which modify a controller without sacrificing value. We present two efficient value-preserving transformations, one which can reduce the size of the controller and another which can improve its value while keeping the size fixed. Our policy iteration algorithm serves as the first nontrivial exact algorithm for DEC-POMDPs.
APA, Harvard, Vancouver, ISO, and other styles
22

Pourhassan, Mojgan. "Parameterised complexity analysis of evolutionary algorithms for combinatorial optimization problems." Thesis, 2017. http://hdl.handle.net/2440/109799.

Full text
Abstract:
Evolutionary algorithms are general problem solvers that have been successfully used in solving combinatorial optimization problems. However, due to the great amount of randomness in these algorithms, theoretical understanding of them is quite challenging. In this thesis we analyse the parameterized complexity of evolutionary algorithms on combinatorial optimization problems. Studying the parameterized complexity of these algorithms can help us understand how different parameters of problems influence the runtime behaviour of the algorithm and consequently lead us in finding better performing algorithms. We focus on two NP-hard combinatorial optimization problems; the generalized travelling salesman problem (GTSP) and the vertex cover problem (VCP). For solving the GTSP, two hierarchical approaches with different neighbourhood structures have been proposed in the literature. In this thesis, local search algorithms and simple evolutionary algorithms based on these approaches are investigated from a theoretical perspective and complementary abilities of the two approaches are pointed out by presenting instances where they mutually outperform each other. After investigating the runtime behaviour of the mentioned randomised algorithms on GTSP, we turn our attention to the VCP. Evolutionary multi-objective optimization for the classical vertex cover problem has been previously analysed in the context of parameterized complexity analysis. We extend the analysis to the weighted version of the problem. We also examine a dynamic version of the classical problem and analyse evolutionary algorithms with respect to their ability to maintain a 2-approximation. Inspired by the concept of duality, an edge-based evolutionary algorithm for solving the VCP has been introduced in the literature. Here we show that this edge-based EA is able to maintain a 2-approximation solution in the dynamic setting. Moreover, using the dual form of the problem, we extend the edge-based approach to the weighted vertex cover problem.
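For reference, the classical edge-based construction that underlies the 2-approximation discussed above: take the endpoints of any maximal matching, which always yields a vertex cover of size at most twice the optimum. A minimal sketch (the example graph is arbitrary, and this is the deterministic baseline rather than the evolutionary algorithms analysed in the thesis):

```python
def two_approx_vertex_cover(edges):
    """Return a vertex cover of size at most 2*OPT by taking the endpoints
    of a greedily built maximal matching."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge not yet covered: take both endpoints
            cover.add(u)
            cover.add(v)
    return cover

edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
print(two_approx_vertex_cover(edges))  # e.g. {1, 2, 3, 4}; the optimum here is {1, 4}
```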
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2017.
APA, Harvard, Vancouver, ISO, and other styles
23

Chin, Kien-Weh (陳建文). "Algorithms and Complexity Analysis of the Symmetry Number Problem for Graphs." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/04362411441823877826.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Electrical Engineering
Academic year 86 (ROC calendar, 1997-98)
Graphs are known to be useful for modeling various scientific/engineering problems in the real world. Because of the popularity of graphs, graph drawing has emerged as a research topic of great importance in graph theory. Although a lot of graph drawing algorithms have been proposed, few of them explicitly target graph drawings with a high degree of symmetry. In this thesis, we define the symmetry number to measure the degree of symmetry of graphs, and discuss the complexity of deciding the symmetry number of graphs. We also propose a polynomial time algorithm to decide the maximum symmetric subtree of a given tree.
APA, Harvard, Vancouver, ISO, and other styles
24

Philips, Petra. "Data-Dependent Analysis of Learning Algorithms." PhD thesis, 2005. http://hdl.handle.net/1885/47998.

Full text
Abstract:
This thesis studies the generalization ability of machine learning algorithms in a statistical setting. It focuses on the data-dependent analysis of the generalization performance of learning algorithms in order to make full use of the potential of the actual training sample from which these algorithms learn.¶ First, we propose an extension of the standard framework for the derivation of generalization bounds for algorithms taking their hypotheses from random classes of functions. ... ¶ Second, we study in more detail generalization bounds for a specific algorithm which is of central importance in learning theory, namely the Empirical Risk Minimization algorithm (ERM). ...
APA, Harvard, Vancouver, ISO, and other styles
25

"Performance Analysis of Low-Complexity Resource-Allocation Algorithms in Stochastic Networks Using Fluid Models." Doctoral diss., 2015. http://hdl.handle.net/2286/R.I.36414.

Full text
Abstract:
Resource allocation in communication networks aims to assign various resources such as power, bandwidth and load in a fair and economic fashion so that the networks can be better utilized and shared by the communicating entities. The design of efficient resource-allocation algorithms is, however, becoming more and more challenging due to the precipitously increasing scale of the networks. This thesis strives to understand how to design such low-complexity algorithms with performance guarantees. In the first part, the link scheduling problem in wireless ad hoc networks is considered. The scheduler is in charge of finding a set of wireless data links to activate at each time slot, with consideration of wireless interference, traffic dynamics, network topology and quality-of-service (QoS) requirements. Two different yet essential scenarios are investigated: the first is when each packet has a specific deadline after which it will be discarded; the second is when each packet traverses the network in multiple hops instead of leaving the network after a one-hop transmission. In both scenarios the links need to be carefully scheduled to avoid starvation of users and congestion on links. One greedy algorithm is analyzed in each of the two scenarios, and performance guarantees in terms of network throughput are derived. In the second part, the load-balancing problem in parallel computing is studied. Tasks arrive in batches, and the duty of the load balancer is to place the tasks on the machines such that minimum queueing delay is incurred. Due to the huge size of modern data centers, sampling the status of all machines may result in significant overhead. Consequently, an algorithm based on limited queue information at the machines is examined, its asymptotic delay performance is characterized, and it is shown that the proposed algorithm achieves the same delay with remarkably less sampling overhead than the well-known power-of-two-choices algorithm. Two messages of the thesis are the following: greedy algorithms can work well in a stochastic setting; and the fluid model can be useful in "derandomizing" the system and revealing the nature of the algorithm.
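A minimal simulation of the "power of d choices" idea referenced above (sample d queues at random and send the task to the shortest), contrasted with a single random choice. The arrival and service parameters are illustrative assumptions, and this is the classical baseline rather than the thesis's limited-information algorithm.

```python
import random

def simulate(n_servers=100, n_tasks=200_000, d=2, service=1.0, arrivals_per_step=95, seed=0):
    """Each step: route `arrivals_per_step` tasks, each to the shortest of d sampled queues,
    then every server completes `service` units of work. Returns the worst backlog seen."""
    rng = random.Random(seed)
    queues = [0.0] * n_servers
    worst = 0.0
    for _ in range(n_tasks // arrivals_per_step):
        for _ in range(arrivals_per_step):
            choices = rng.sample(range(n_servers), d)
            target = min(choices, key=lambda i: queues[i])
            queues[target] += 1.0
        queues = [max(0.0, q - service) for q in queues]
        worst = max(worst, max(queues))
    return worst

print("d=1 (random choice): max backlog", simulate(d=1))
print("d=2 (two choices):   max backlog", simulate(d=2))
```

With load just below capacity, sampling only two queues already keeps the worst backlog far smaller than a single random choice, which is the qualitative effect the fluid-model analysis captures.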
Dissertation/Thesis
Doctoral Dissertation Electrical Engineering 2015
APA, Harvard, Vancouver, ISO, and other styles
26

Vasudevan, Shobha. "High level static analysis of system descriptions for taming verification complexity." 2007. http://hdl.handle.net/2152/15978.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Zhang, Rui. "Model selection techniques for kernel-based regression analysis using information complexity measure and genetic algorithms." 2007. http://etd.utk.edu/2007/ZhangRui.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Hsu, Chao-Yuan (許兆元). "ICI Mitigation for High-mobility SISO/MIMO OFDM Systems: Low-complexity Algorithms and Performance Analysis." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/41881546213151002873.

Full text
Abstract:
Ph.D. dissertation
National Chiao Tung University
Department of Communication Engineering
Academic year 97 (ROC calendar, 2008-09)
In orthogonal frequency-division multiplexing (OFDM) systems, it is generally assumed that the channel response is static over an OFDM symbol period. However, the assumption does not hold in high-mobility environments. As a result, intercarrier interference (ICI) is induced and the system performance is degraded. A simple remedy for this problem is the application of the zero-forcing (ZF) and minimum mean square error (MMSE) equalizers. Unfortunately, the direct ZF method requires the inversion of an NxN ICI matrix, where N is the number of subcarriers. When N is large, the computational complexity can become prohibitively high. As for the direct MMSE method, in addition to an NxN matrix inverse, it requires an extra NxN matrix multiplication, making the required computational complexity higher than that of the direct ZF method. In this dissertation, we first propose a low-complexity ZF method to solve the problem in single-input single-output (SISO) OFDM systems. The main idea is to explore the special structure inherent in the ICI matrix and to apply Newton's iteration for matrix inversion. With our formulation, fast Fourier transforms (FFTs) can be used in the iterative process, reducing the complexity from O(N^3) to O(N log_2 N). Also, the required number of iterations is typically one or two. We also analyze the convergence behavior of the proposed method and derive the theoretical output signal-to-interference-plus-noise ratio (SINR). For the MMSE method, we first reformulate the MMSE solution in a way that the extra matrix multiplication can be avoided. Similar to the ZF method, we then exploit the structure of the ICI matrix and apply Newton's iteration to reduce the complexity of the matrix inversion. For a multiple-input multiple-output (MIMO) OFDM system, the required complexity of the ZF and MMSE methods becomes even more prohibitive. We then extend the proposed ZF and MMSE methods for SISO-OFDM systems to MIMO-OFDM systems. It turns out that the computational complexity can be reduced even more significantly. Simulation results show that the performance of the proposed methods is almost as good as that of the direct ZF and MMSE methods, while the required computational complexity is reduced dramatically. Finally, we explore the application of the proposed methods to mobility-induced ICI mitigation in OFDM multiple access (OFDMA) systems, and to carrier frequency offset (CFO) induced ICI mitigation in OFDMA uplink systems. As in OFDM systems, the proposed methods effectively reduce the required computational complexity.
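The core numerical idea referenced above, in its generic (structure-agnostic) form, is Newton's iteration (Newton-Schulz) for a matrix inverse. The dense sketch below is for illustration only; the dissertation's O(N log_2 N) cost comes from carrying out the matrix products implicitly with FFTs by exploiting the ICI matrix structure, which this example does not attempt. The matrix size and iteration count are arbitrary.

```python
import numpy as np

def newton_inverse(A, iters=8):
    """Newton-Schulz iteration X_{k+1} = X_k (2I - A X_k), which converges
    quadratically to A^{-1} whenever ||I - A X_0|| < 1."""
    n = A.shape[0]
    # Classical safe initialisation: X_0 = A^H / (||A||_1 * ||A||_inf).
    X = A.conj().T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(n, dtype=A.dtype)
    for _ in range(iters):
        X = X @ (2 * I - A @ X)
    return X

# Example: a diagonally dominant 64x64 matrix, loosely mimicking a near-diagonal ICI matrix.
A = np.eye(64) + 0.05 * np.random.default_rng(0).standard_normal((64, 64))
X = newton_inverse(A)
print(np.linalg.norm(X @ A - np.eye(64)))   # close to machine precision
```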
APA, Harvard, Vancouver, ISO, and other styles
29

Al Herz, Ahmed I. "Approximation Algorithms for Maximum Vertex-Weighted Matching." Thesis, 2019.

Find full text
Abstract:
We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Vertex-weighted matchings arise in many applications, including internet advertising, facility scheduling, constraint satisfaction, the design of network switches, and computation of sparse bases for the null space or the column space of a matrix. Let m be the number of edges, n number of vertices, and D the maximum degree of a vertex in the graph. We design two exact algorithms for the MVM problem with time complexities of O(mn) and O(Dmn). The new exact algorithms use a maximum cardinality matching as an initial matching, after which the weight of the matching is increased using weight-increasing paths.

Although MVM problems can be solved exactly in polynomial time, exact MVM algorithms are still slow in practice for large graphs with millions and even billions of edges. Hence we investigate several approximation algorithms for MVM in this thesis. First we show that a maximum vertex-weighted matching can be approximated within an approximation ratio of k/(k + 1), arbitrarily close to one, where k is related to the length of augmenting or weight-increasing paths searched by the algorithm. We identify two main approaches for designing approximation algorithms for MVM. The first approach is direct; vertices are sorted in non-increasing order of weights, and then the algorithm searches for augmenting paths of restricted length that reach a heaviest vertex. (In this approach each vertex is processed once.) The second approach repeatedly searches for augmenting paths and increasing paths, again of restricted length, until none can be found. In this second, iterative approach, a vertex may need to be processed multiple times. We design two approximation algorithms based on the direct approach with approximation ratios of 1/2 and 2/3. The time complexity of the 1/2-approximation algorithm is O(m + n log n), and that of the 2/3-approximation algorithm is O(m log D). Employing the second approach, we design 1/2- and 2/3-approximation algorithms for MVM with time complexities of O(Dm) and O(D^2 m), respectively. We show that the iterative algorithm can be generalized to find a k/(k+1)-approximate MVM with a time complexity of O(D^k m). In addition, we design parallel 1/2- and 2/3-approximation algorithms for a shared memory programming model, and introduce a new technique for locking augmenting paths to avoid deadlock and related problems.

MVM problems may be solved using algorithms for the maximum edge-weighted matching (MEM) by assigning to each edge a weight equal to the sum of the vertex weights on its endpoints. However, our results will show that this is one way to generate MEM problems that are difficult to solve. On such problems, exact MEM algorithms may require run times that are a factor of a thousand or more larger than the time of an exact MVM algorithm. Our results show the competitiveness of the new exact algorithms by demonstrating that they outperform MEM exact algorithms. Specifically, our fastest exact algorithm runs faster than the fastest MEM implementation by a factor of 37 and 18 on geometric mean, using two different sets of weights on our test problems. In some instances, the factor can be higher than 500. Moreover, extensive experimental results show that the MVM approximation algorithm outperforms an MEM approximation algorithm with the same approximation ratio, with respect to matching weight and run time. Indeed, our results show that the MVM approximation algorithm outperforms the corresponding MEM algorithm with respect to these metrics in both serial and parallel settings.
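A minimal sketch of the "direct" greedy idea described above: process vertices in non-increasing weight order and match each one to an available (here, heaviest) unmatched neighbour. This illustrates the approach only; it is not the thesis's exact algorithms or their 2/3- and k/(k+1)-approximation refinements, and the example graph and weights are arbitrary.

```python
def greedy_mvm(weights, adj):
    """weights: dict vertex -> nonnegative weight; adj: dict vertex -> iterable of neighbours.
    Returns a matching as a dict vertex -> partner (each matched edge appears twice)."""
    mate = {}
    for u in sorted(weights, key=weights.get, reverse=True):  # heaviest vertices first
        if u in mate:
            continue
        candidates = [v for v in adj.get(u, ()) if v not in mate]
        if candidates:
            v = max(candidates, key=weights.get)              # heaviest free neighbour
            mate[u], mate[v] = v, u
    return mate

weights = {"a": 9, "b": 7, "c": 5, "d": 3}
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
matching = greedy_mvm(weights, adj)
print(matching, "matched weight =", sum(weights[v] for v in matching))
```

On this example the greedy matches a-b for weight 16, while the optimum (a-c and b-d) has weight 24, consistent with a 1/2 guarantee but not optimality.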
APA, Harvard, Vancouver, ISO, and other styles
30

Woolway, Matthew. "Computational approaches in compressed sensing." Thesis, 2014. http://hdl.handle.net/10539/15334.

Full text
Abstract:
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2014.
This thesis aims to provide a summary of computational approaches to solving the Compressed Sensing problem. The theoretical problem of solving systems of linear equations has long been investigated in the academic literature. Compressed Sensing is a relatively new application of this problem, one with the ability to change the way in which we obtain and process signals. Under the assumption of sparse signals, Compressed Sensing is able to recover signals sampled at a rate much lower than the current Shannon/Nyquist sampling rate. The primary goal of this thesis is to describe the major algorithms currently used in the Compressed Sensing problem. This is done as a means to provide the reader with sufficient up-to-date knowledge of current approaches as well as their means of implementation, on central processing units (CPUs) and graphical processing units (GPUs), when considering computational concerns such as computational time, storage requirements and parallelisability.
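A minimal sketch of one standard reconstruction algorithm in this area, iterative soft-thresholding (ISTA) for the l1-regularised least-squares formulation of sparse recovery. The problem sizes, sparsity level and regularisation weight below are illustrative assumptions, and this is one of several algorithm families a survey of the field would cover.

```python
import numpy as np

def ista(A, y, lam=0.05, iters=500):
    """Minimise 0.5*||Ax - y||^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x + A.T @ (y - A @ x) / L      # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)   # soft threshold
    return x

rng = np.random.default_rng(0)
n, m, k = 400, 100, 8                      # ambient dimension, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = ista(A, y)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```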
APA, Harvard, Vancouver, ISO, and other styles
31

Sauer, Paul Van der Merwe. "The complexity of unavoidable word patterns." Thesis, 2019. http://hdl.handle.net/10500/27343.

Full text
Abstract:
Bibliography: pages 192-195
The avoidability, or unavoidability of patterns in words over finite alphabets has been studied extensively. The word α over a finite set A is said to be unavoidable for an infinite set B+ of nonempty words over a finite set B if, for all but finitely many elements w of B+, there exists a semigroup morphism φ ∶ A+ → B+ such that φ(α) is a factor of w. In this treatise, we start by presenting a historical background of results that are related to unavoidability. We present and discuss the most important theorems surrounding unavoidability in detail. We present various complexity-related properties of unavoidable words. For words that are unavoidable, we provide a constructive upper bound to the lengths of words that avoid them. In particular, for a pattern α of length n over an alphabet of size r, we give a concrete function N(n, r) such that no word of length N(n, r) over the alphabet of size r avoids α. A natural subsequent question is how many unavoidable words there are. We show that the fraction of words that are unavoidable drops exponentially fast in the length of the word. This allows us to calculate an upper bound on the number of unavoidable patterns for any given finite alphabet. Subsequently, we investigate computational aspects of unavoidable words. In particular, we exhibit concrete algorithms for determining whether a word is unavoidable. We also prove results on the computational complexity of the problem of determining whether a given word is unavoidable. Specifically, the NP-completeness of the aforementioned problem is established.
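For orientation, the central classical objects behind these results (standard background, not results of the thesis) are the Zimin words over variables x_1, x_2, ..., defined by

```latex
Z_1 = x_1, \qquad Z_{n+1} = Z_n \, x_{n+1} \, Z_n,
```

so that, for example, Z_3 = x_1 x_2 x_1 x_3 x_1 x_2 x_1. Each Z_n is unavoidable, and by the Bean-Ehrenfeucht-McNulty/Zimin theorem a pattern with n distinct variables is unavoidable if and only if Z_n encounters it, i.e., some non-erasing morphic image of the pattern occurs as a factor of Z_n.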
Decision Sciences
D. Phil. (Operations Research)
APA, Harvard, Vancouver, ISO, and other styles
32

Medina, Raúl Ezequiel. "Optimización de dominios de Planning." Bachelor's thesis, 2016. http://hdl.handle.net/11086/3216.

Full text
Abstract:
Thesis (Licentiate in Computer Science)--Universidad Nacional de Córdoba, Facultad de Matemática, Astronomía, Física y Computación, 2016.
This work describes an optimization technique for planning domains. It begins with a general introduction to artificial intelligence, then reviews the planning problem and its most important historical milestones up to the state of the art. The PDDL language is introduced in order to present the new domains that were implemented. The proposed optimization technique, which includes the Split and Unsplit algorithms, is then described. To study its performance, benchmarks are run on the implemented domains and on a classical domain, and the results are analysed to draw conclusions about the technique as a whole.
APA, Harvard, Vancouver, ISO, and other styles
33

(9741149), Lintao Ye. "Algorithmic and Graph-Theoretic Approaches for Optimal Sensor Selection in Large-Scale Systems." Thesis, 2020.

Find full text
Abstract:
Using sensor measurements to estimate the states and parameters of a system is a fundamental task in understanding the behavior of the system. Moreover, as modern systems grow rapidly in scale and complexity, it is not always possible to deploy sensors to measure all of the states and parameters of the system, due to cost and physical constraints. Therefore, selecting an optimal subset of all the candidate sensors to deploy and gather measurements of the system is an important and challenging problem. In addition, the systems may be targeted by external attackers who attempt to remove or destroy the deployed sensors. This further motivates the formulation of resilient sensor selection strategies. In this thesis, we address the sensor selection problem under different settings as follows.

First, we consider the optimal sensor selection problem for linear dynamical systems with stochastic inputs, where the Kalman filter is applied based on the sensor measurements to give an estimate of the system states. The goal is to select a subset of sensors under certain budget constraints such that the trace of the steady-state error covariance of the Kalman filter with the selected sensors is minimized. We characterize the complexity of this problem by showing that the Kalman filtering sensor selection problem is NP-hard and cannot be approximated within any constant factor in polynomial time for general systems. We then consider the optimal sensor attack problem for Kalman filtering. The Kalman filtering sensor attack problem is to attack a subset of selected sensors under certain budget constraints in order to maximize the trace of the steady-state error covariance of the Kalman filter with sensors after the attack. We show that the same results as the Kalman filtering sensor selection problem also hold for the Kalman filtering sensor attack problem. Having shown that the general sensor selection and sensor attack problems for Kalman filtering are hard to solve, our next step is to consider special classes of the general problems. Specifically, we consider the underlying directed network corresponding to a linear dynamical system and investigate the case when there is a single node of the network that is affected by a stochastic input. In this setting, we show that the corresponding sensor selection and sensor attack problems for Kalman filtering can be solved in polynomial time. We further study the resilient sensor selection problem for Kalman filtering, where the problem is to find a sensor selection strategy under sensor selection budget constraints such that the trace of the steady-state error covariance of the Kalman filter is minimized after an adversary removes some of the deployed sensors. We show that the resilient sensor selection problem for Kalman filtering is NP-hard, and provide a pseudo-polynomial-time algorithm to solve it optimally.
Next, we consider the sensor selection problem for binary hypothesis testing. The problem is to select a subset of sensors under certain budget constraints such that a certain metric of the Neyman-Pearson (resp., Bayesian) detector corresponding to the selected sensors is optimized. We show that this problem is NP-hard if the objective is to minimize the miss probability (resp., error probability) of the Neyman-Pearson (resp., Bayesian) detector. We then consider three optimization objectives based on the Kullback-Leibler distance, J-Divergence and Bhattacharyya distance, respectively, in the hypothesis testing sensor selection problem, and provide performance bounds on greedy algorithms when applied to the sensor selection problem associated with these optimization objectives.
Moving beyond the binary hypothesis setting, we also consider the setting where the true state of the world comes from a set that can have cardinality greater than two. A Bayesian approach is then used to learn the true state of the world based on the data streams provided by the data sources. We formulate the Bayesian learning data source selection problem under this setting, where the goal is to minimize the cost spent on the data sources such that the learning error is within a certain range. We show that the Bayesian learning data source selection is also NP-hard, and provide greedy algorithms with performance guarantees.
Finally, in light of the COVID-19 pandemic, we study the parameter estimation measurement selection problem for epidemics spreading in networks. Here, the measurements (with certain costs) are collected by conducting virus and antibody tests on the individuals in the epidemic spread network. The goal of the problem is then to optimally estimate the parameters (i.e., the infection rate and the recovery rate of the virus) in the epidemic spread network, while satisfying the budget constraint on collecting the measurements. Again, we show that the measurement selection problem is NP-hard, and provide approximation algorithms with performance guarantees.
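As a hedged illustration of the greedy strategies analysed for problems of this kind (a generic sketch, not the thesis's own algorithms or guarantees), the code below greedily adds the sensor that most reduces the trace of the steady-state Kalman error covariance, computed via the filtering algebraic Riccati equation in SciPy. The system matrices and budget are placeholders.

import numpy as np
from scipy.linalg import solve_discrete_are

def steady_state_trace(A, C, Q, R):
    # Trace of the steady-state (prediction) error covariance of the Kalman filter
    # for x_{k+1} = A x_k + w_k, y_k = C x_k + v_k, with w ~ (0, Q) and v ~ (0, R).
    P = solve_discrete_are(A.T, C.T, Q, R)
    return float(np.trace(P))

def greedy_sensor_selection(A, C, Q, R, budget):
    # Greedy heuristic: repeatedly add the sensor (row of C) that gives the largest
    # decrease in the steady-state error covariance trace.  The exact problem is
    # NP-hard in general, so no optimality guarantee is claimed here.
    selected = []
    for _ in range(budget):
        best_j, best_val = None, np.inf
        for j in range(C.shape[0]):
            if j in selected:
                continue
            idx = selected + [j]
            try:
                val = steady_state_trace(A, C[idx, :], Q, R[np.ix_(idx, idx)])
            except (np.linalg.LinAlgError, ValueError):
                continue  # Riccati equation has no stabilising solution for this subset
            if val < best_val:
                best_j, best_val = j, val
        if best_j is None:
            break
        selected.append(best_j)
    return selected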
APA, Harvard, Vancouver, ISO, and other styles
34

Tilak, Omkar Jayant. "Decentralized and Partially Decentralized Multi-Agent Reinforcement Learning." 2013. http://hdl.handle.net/1805/3462.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Multi-agent systems consist of multiple agents that interact and coordinate with each other to work towards a certain goal. Such systems arise naturally in a variety of domains such as robotics, telecommunications, and economics. The dynamic and complex nature of these systems requires the agents to learn optimal solutions on their own instead of following a pre-programmed strategy. Reinforcement learning provides a framework in which agents learn optimal behavior based on the response obtained from the environment. In this thesis, we propose various novel decentralized, learning-automaton-based algorithms which can be employed by a group of interacting learning automata. We propose a completely decentralized version of the estimator algorithm. Compared to the completely centralized versions proposed before, this completely decentralized version proves to be a great improvement in terms of space complexity and convergence speed. The decentralized learning algorithm was applied, for the first time, to the domains of distributed object tracking and distributed watershed management. The results obtained in these experiments show the usefulness of decentralized estimator algorithms for solving complex optimization problems. Taking inspiration from the completely decentralized learning algorithm, we propose the novel concept of partial decentralization. Partial decentralization bridges the gap between the completely decentralized and completely centralized algorithms and thus forms a comprehensive and continuous spectrum of multi-agent algorithms for learning automata. To demonstrate the applicability of partial decentralization, we employ a partially decentralized team of learning automata to control multi-agent Markov chains. More flexibility, expressiveness and flavor can be added to the partially decentralized framework by allowing different decentralized modules to engage in different types of games. We propose the novel framework of heterogeneous games of learning automata, which allows the learning automata to engage in disparate games under the same formalism, and we propose an algorithm to control dynamic zero-sum games using heterogeneous games of learning automata.
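For readers unfamiliar with learning automata, the following is a minimal sketch of the classic linear reward-inaction (L_R-I) update that underlies schemes of this kind; it is a generic textbook version, not the decentralized estimator algorithms proposed in the thesis, and the toy environment is purely illustrative.

import random

class LinearRewardInaction:
    # Keeps a probability vector over actions; favourable responses shift probability
    # mass towards the chosen action, unfavourable responses change nothing ("inaction").
    def __init__(self, n_actions, learning_rate=0.05):
        self.p = [1.0 / n_actions] * n_actions
        self.a = learning_rate

    def choose(self):
        return random.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, favourable):
        if not favourable:
            return
        for i in range(len(self.p)):
            if i == action:
                self.p[i] += self.a * (1.0 - self.p[i])
            else:
                self.p[i] *= 1.0 - self.a

# Illustrative two-action environment where action 0 is rewarded more often.
automaton = LinearRewardInaction(n_actions=2)
for _ in range(2000):
    act = automaton.choose()
    reward = random.random() < (0.8 if act == 0 else 0.4)
    automaton.update(act, reward)
print(automaton.p)  # probability of action 0 should approach 1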
APA, Harvard, Vancouver, ISO, and other styles
35

(10223831), Yuankun Fu. "Accelerated In-situ Workflow of Memory-aware Lattice Boltzmann Simulation and Analysis." Thesis, 2021.

Find full text
Abstract:
As high performance computing systems advance from petascale to exascale, scientific workflows that integrate simulation with visualization/analysis are a key factor influencing scientific campaigns. As one class of campaigns for studying fluid behavior, computational fluid dynamics (CFD) simulations have progressed rapidly over the past several decades and have revolutionized our lives in many fields. The lattice Boltzmann method (LBM) is an evolving CFD approach that significantly reduces the complexity of conventional CFD methods and can simulate complex fluid flow phenomena at a lower computational cost. This research focuses on accelerating the workflow of LBM simulation and data analysis.

I start my research by looking at how to effectively integrate each component of a workflow at extreme scales. First, we design an in-situ workflow benchmark that integrates seven state-of-the-art in-situ workflow systems with three synthetic applications, two real-world CFD applications, and the corresponding data analysis. Detailed performance analysis using visualized tracing then shows that even the fastest existing workflow system still has 42% overhead. I therefore develop a novel minimized end-to-end workflow system, Zipper, which combines the fine-grain task parallelism of full asynchrony with pipelining. I also design a novel concurrent data transfer optimization method, which employs a multi-threaded work-stealing algorithm to transfer data over both the network and the parallel file system. It significantly reduces the data transfer time, by up to 32%, especially when the simulation application is stalled. An investigation of the speedup using OmniPath network tools shows that network congestion is alleviated by up to 80%. Finally, the scalability of the Zipper system is verified by a performance model and various large-scale workflow experiments on two HPC systems using up to 13,056 cores. Zipper is the fastest workflow system and outperforms the second fastest by up to 2.2 times.

After minimizing the end-to-end time of the LBM workflow, I turn to accelerating the memory-bound LBM algorithms. We first design novel parallel 2D memory-aware LBM algorithms, and I then extend the design to 3D memory-aware LBM algorithms that combine single-copy distribution, single sweep, the swap algorithm, prism traversal, and the merging of multiple temporal time steps. Strong-scalability experiments on three HPC systems show that the 2D and 3D memory-aware LBM algorithms outperform the existing fastest LBM by up to 4 times and 1.9 times, respectively. The reasons for the speedup are illustrated by theoretical algorithm analysis. Experimental roofline charts on modern CPU architectures show that the memory-aware LBM algorithms can improve the arithmetic intensity (AI) of the fastest existing LBM by up to 4.6 times.
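To indicate why reducing memory traffic pays off for a memory-bound kernel such as LBM, here is a back-of-the-envelope roofline estimate. The per-cell flop and byte counts and the machine peaks below are hypothetical placeholders, not figures from the thesis.

def roofline_runtime(flops, bytes_moved, peak_flops, peak_bandwidth):
    # Roofline model: attainable performance is capped either by the compute peak
    # or by arithmetic intensity (flops per byte) times the memory bandwidth.
    ai = flops / bytes_moved
    attainable = min(peak_flops, ai * peak_bandwidth)  # flop/s
    return flops / attainable                          # seconds per cell update

# Hypothetical per-cell costs for a D3Q19 lattice update (19 distributions, 8-byte reals):
# a two-copy scheme streams roughly twice the data of a fused, single-copy scheme.
two_copy = roofline_runtime(250, 4 * 19 * 8, peak_flops=2e12, peak_bandwidth=2e11)
single_copy = roofline_runtime(250, 2 * 19 * 8, peak_flops=2e12, peak_bandwidth=2e11)
print(f"single-copy variant is ~{two_copy / single_copy:.1f}x faster in this model")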
APA, Harvard, Vancouver, ISO, and other styles
36

(9179300), Evgenia-Maria Kontopoulou. "RANDOMIZED NUMERICAL LINEAR ALGEBRA APPROACHES FOR APPROXIMATING MATRIX FUNCTIONS." Thesis, 2020.

Find full text
Abstract:
This work explores how randomization can be exploited to deliver sophisticated algorithms with provable bounds for: (i) the approximation of matrix functions, such as the log-determinant and the von Neumann entropy; and (ii) the low-rank approximation of matrices. Our algorithms are inspired by recent advances in Randomized Numerical Linear Algebra (RandNLA), an interdisciplinary research area that exploits randomization as a computational resource to develop improved algorithms for large-scale linear algebra problems. The main goal of this work is to encourage the practical use of RandNLA approaches to solve Big Data bottlenecks at industrial level. Our extensive evaluation tests are complemented by a thorough theoretical analysis that proves the accuracy of the proposed algorithms and highlights their scalability as the volume of data increases. Finally, the low computational time and memory consumption, combined with simple implementation schemes that can easily be extended in parallel and distributed environments, render our algorithms suitable for use in the development of highly efficient real-world software.
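As an indication of the flavour of RandNLA algorithms, here is a generic sketch in the style of the standard randomized range finder plus small dense SVD (Halko-Martinsson-Tropp); it is not the specific estimators developed in this work, and the test matrix is purely illustrative.

import numpy as np

def randomized_svd(A, rank, oversample=10, n_power_iter=2, seed=0):
    # Randomized range finder + small dense SVD for a rank-`rank` approximation of A.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, rank + oversample))  # Gaussian test matrix
    Y = A @ Omega                                        # sample the range of A
    for _ in range(n_power_iter):                        # power iterations sharpen the subspace
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                               # orthonormal basis of the sampled range
    B = Q.T @ A                                          # project A onto that subspace
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank, :]

# Illustrative check on a random low-rank matrix: the relative error should be tiny.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))
U, s, Vt = randomized_svd(A, rank=40)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))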

APA, Harvard, Vancouver, ISO, and other styles
37

Rodrigues, David Manuel de Sousa. "Detecção de comunidades no sistema de correio electrónico universitário." Master's thesis, 2009. http://hdl.handle.net/10071/2874.

Full text
Abstract:
The study of systems structured as social networks has seen numerous developments in the application of graph theory to the social sciences. One aspect that has received considerable attention in recent years is the detection of modules, or communities, in social networks, and several algorithms and strategies have been developed to identify the structure underlying social interactions. Through a case study, we show the existence of informal communication communities that use the ISCTE e-mail network, by applying hierarchical community detection algorithms. We analyse the hierarchical structure of the network through k-cores and, using the clique percolation method, verify that the informal communication communities cut across the boundaries of the institutional departments. To the detected communities we apply a variation-of-information measure to determine the distance between the different departments. We build a multi-agent simulation model, CIUCEU, that mimics the informal communication system over e-mail, which allows us to verify the influence of the agents' "social" neighbourhood on the creation and maintenance of the structure of the ISCTE teachers' network. We also analyse the use of simulations fed by real data, drawing conclusions about the implications of using real data for the sampling and design of multi-agent simulations.
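A hedged sketch of the two graph-theoretic tools mentioned above (k-cores and clique percolation) using NetworkX; the tiny edge list stands in for the anonymised e-mail exchanges and is purely illustrative.

import networkx as nx
from networkx.algorithms.community import k_clique_communities

# Toy stand-in for anonymised sender/recipient pairs of the e-mail network.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "f")]
G = nx.Graph(edges)

# k-core decomposition exposes the hierarchical (nested) structure of the network.
print("core numbers:", nx.core_number(G))

# Clique percolation with k = 3: communities are unions of adjacent triangles,
# which is how informal groups can be seen to cut across formal departments.
print("3-clique communities:", [sorted(c) for c in k_clique_communities(G, 3)])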
APA, Harvard, Vancouver, ISO, and other styles
38

Bhattacharya, Indranil. "Feature Selection under Multicollinearity & Causal Inference on Time Series." Thesis, 2017. http://etd.iisc.ernet.in/2005/3980.

Full text
Abstract:
In this work, we study and extend algorithms for sparse regression and causal inference problems. Both problems are fundamental in the area of data science. The goal of the regression problem is to find the "best" relationship between an output variable and the input variables, given samples of the input and output values. We consider sparse regression under a high-dimensional linear model with strongly correlated variables, a situation that cannot be handled well by many existing model selection algorithms. We study the performance of popular feature selection algorithms such as LASSO, Elastic Net, BoLasso and Clustered Lasso, as well as Projected Gradient Descent algorithms, under this setting in terms of their running time, stability and consistency in recovering the true support. We also propose a new feature selection algorithm, BoPGD, which first clusters the features based on their sample correlation and then performs sparse estimation using a bootstrapped variant of the projected gradient descent method with projection onto the non-convex L0 ball. We characterize the efficiency and consistency of our algorithm through a host of experiments on both synthetic and real-world datasets. Discovering causal relationships, beyond mere correlation, is widely recognized as a fundamental problem. Causal inference problems use observations to infer the underlying causal structure of the data-generating process. The input to these problems is either a multivariate time series or i.i.d. sequences, and the output is a feature causal graph in which the nodes correspond to the variables and the edges capture the direction of causality. For high-dimensional datasets, determining the causal relationships becomes a challenging task because of the curse of dimensionality. Graphical modeling of temporal data based on the concept of "Granger causality" has gained much attention in this context. The blend of Granger methods with model selection techniques, such as LASSO, enables efficient discovery of a "sparse" subset of causal variables in high-dimensional settings. However, these temporal causal methods use an input parameter L, the maximum time lag: the maximum gap in time between the occurrence of the output phenomenon and the causal input stimulus. In many situations of interest, however, the maximum time lag is not known, and indeed, finding the range of causal effects is an important problem. In this work, we propose and evaluate a data-driven and computationally efficient method for Granger causality inference in the Vector Auto-Regressive (VAR) model without foreknowledge of the maximum time lag. We present two algorithms, Lasso Granger++ and Group Lasso Granger++, which not only construct the hypothesis feature causal graph but also simultaneously estimate a value of the maximum lag (L) for each variable by balancing the trade-off between "goodness of fit" and "model complexity".
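A hedged sketch of the basic Lasso-Granger building block with a fixed maximum lag L (i.e. without the automatic per-variable lag estimation that distinguishes Lasso Granger++): variables whose lagged coefficients are all shrunk to zero are treated as non-causes of the target. The function name and parameters are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Lasso

def lasso_granger(X, target, max_lag, alpha=0.1):
    # X: (T, p) multivariate time series.  Regress X[:, target] on the first
    # `max_lag` lags of every series; the Lasso penalty yields a sparse set of
    # candidate Granger causes for the target variable.
    T, p = X.shape
    design, y = [], []
    for t in range(max_lag, T):
        design.append(X[t - max_lag:t][::-1].ravel())  # lags 1..max_lag of all p series
        y.append(X[t, target])
    model = Lasso(alpha=alpha).fit(np.asarray(design), np.asarray(y))
    coefs = model.coef_.reshape(max_lag, p)            # coefs[l-1, j]: effect of X_j at lag l
    parents = [j for j in range(p) if np.any(np.abs(coefs[:, j]) > 1e-8)]
    return parents, coefs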
APA, Harvard, Vancouver, ISO, and other styles
39

Klimanis, Nils. "Generic Programming and Algebraic Multigrid for Stabilized Finite Element Methods." Doctoral thesis, 2006. http://hdl.handle.net/11858/00-1735-0000-0006-B38C-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles