Dissliterature listing — Dissertations on the topic "Time series search"

To see the other types of publications on this topic, follow the link: Time series search.

Format your source in APA, MLA, Chicago, Harvard, and other citation styles

Consult the top 40 dissertations for your research on the topic "Time series search".

Next to every source in the bibliography there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style of your choice: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication as a .pdf file and read its abstract online, where these are available in the metadata.

Browse dissertations from a wide variety of disciplines and compose your bibliography correctly.

1

Barsk, Viktor. "Time Series Search Using Traits." Thesis, Umeå universitet, Institutionen för datavetenskap, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-128580.

Abstract:
Time series data occur in many real-world applications. For example, a system might have a database with a large number of time series, and a user could have a query like "Find all stocks that behave 'similarly' to stock A." The meaning of "similarly" can vary between users, use cases and domains. The goal of this thesis is to develop a method for time series search that can search based on domain-specific patterns. We call these domain-specific patterns traits. We have chosen to apply a trait-based approach on top of an interest-point-based search method. First the search is conducted using an interest-point method, and then the results are ranked using the traits. The traits are extracted from sections of the time series and converted to strings representing their structure. The strings are then compared using Levenshtein distance to rank the search results. We have developed two types of traits. The new time series search method can be useful in many applications where a user is not looking for point-wise similarity, but rather at the general structure and some specific patterns. A trait-based approach translates better to how a user perceives time series search. The method can also yield more relevant results, since it can find results that a classic point-wise search would rule out.
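The trait-string ranking step described above can be sketched in a few lines of Python. This is a generic illustration of Levenshtein-distance ranking, not code from the thesis; the function names and the one-letter trait alphabet (e.g. "U"/"D" for up/down segments) are invented for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two trait strings (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def rank_by_trait(query_trait: str, candidates: dict) -> list:
    """Order candidate series (name -> trait string) by edit distance to the query."""
    return sorted(candidates, key=lambda name: levenshtein(query_trait, candidates[name]))
```

For a query trait string such as "UDU", candidates whose trait strings need fewer edits rank first; exact structural matches come out on top even when the raw values differ point-wise.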
2

Xia, Betty Bin. "Similarity search in time series data sets." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp04/mq24275.pdf.

3

Bodwick, M. K. "Multivariate time series : The search for structure." Thesis, Lancaster University, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.233978.

4

Ahsan, Ramoza. "Time Series Data Analytics." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/529.

Abstract:
Given the ubiquity of time series data and the exponential growth of databases, there has recently been an explosion of interest in time series data mining. Finding similar trends and patterns among time series data is critical for many applications, ranging from financial planning, weather forecasting and stock analysis to policy making. With time series being high-dimensional objects, detection of similar trends, especially at the granularity of subsequences or among time series of different lengths and temporal misalignments, incurs prohibitively high computation costs. Finding trends using non-metric correlation measures further compounds the complexity, as traditional pruning techniques cannot be directly applied. My dissertation addresses these challenges while meeting the need to achieve near real-time responsiveness. First, for retrieving exact similarity results using Lp-norm distances, we design a two-layered time series index for subsequence matching. Time series relationships are compactly organized in a directed acyclic graph embedded with similarity vectors capturing subsequence similarities. Powerful pruning strategies leveraging the graph structure greatly reduce the number of time series as well as subsequence comparisons, resulting in a speed-up of several orders of magnitude. Second, to support a rich diversity of correlation analytics operations, we compress time series into Euclidean-based clusters augmented by a compact overlay graph encoding correlation relationships. Such a framework supports a rich variety of operations, including retrieving positive or negative correlations, self-correlations, and finding groups of correlated sequences. Third, to support flexible similarity specification using computationally expensive warped distances such as Dynamic Time Warping, we design data reduction strategies that leverage the inexpensive Euclidean distance, with subsequent time-warped matching on the reduced data.
This facilitates the comparison of sequences of different lengths and with flexible alignment, still within a few seconds of response time. Comprehensive experimental studies using real-world and synthetic datasets demonstrate the efficiency, effectiveness and quality of the results achieved by our proposed techniques as compared to state-of-the-art methods.
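The two-stage idea in the third contribution (a cheap Euclidean pass first, exact time-warped matching only on the survivors) can be sketched as follows. This is a minimal generic illustration, assuming equal-length series for the Euclidean filter; it is not the dissertation's actual index or pruning machinery, and the shortlist size is an invented parameter.

```python
import math


def euclidean(a, b):
    """Pointwise L2 distance; assumes equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def search(query, database, shortlist=3):
    """Cheap Euclidean pass to pick a shortlist, exact DTW only on that shortlist."""
    coarse = sorted(database, key=lambda k: euclidean(query, database[k]))[:shortlist]
    return min(coarse, key=lambda k: dtw(query, database[k]))
```

Because DTW is quadratic per comparison, doing it only on the Euclidean shortlist rather than the whole database is what makes interactive response times plausible.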
5

Bardwell, Lawrence. "Efficient search methods for high dimensional time-series." Thesis, Lancaster University, 2018. http://eprints.lancs.ac.uk/89685/.

Abstract:
This thesis develops efficient methodology for analysing high-dimensional time series, with the aim of detecting structural changes in the properties of a time series that may affect only a subset of dimensions. Firstly, we develop a Bayesian approach to analysing multiple time series with the aim of detecting abnormal regions: regions where the properties of the data change from some normal or baseline behaviour. We allow for the possibility that such changes will be present in only a, potentially small, subset of the time series. A motivating application for this problem comes from detecting copy number variations (CNVs) in genetics, using data from multiple individuals. Secondly, we present a novel approach to detecting sets of most recent changepoints in panel data, which aims to pool information across time series so that we preferentially infer a most recent change at the same time point in multiple series. Lastly, an approach to fitting a sequence of piecewise linear segments to a univariate time series is considered. Two practically useful constraints are imposed on the resulting segmentation: (i) the segmentation must be robust to the presence of outliers; (ii) continuity between the linear segments is enforced at the changepoint locations. These constraints add significantly to the computational complexity of the resulting recursive solution, and several steps are investigated to reduce the computational burden.
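As a toy illustration of the changepoint problems studied here, a single mean shift can be located by an exhaustive scan over candidate split points, choosing the split that minimises the within-segment squared error. This sketch is not Bardwell's Bayesian or most-recent-changepoint method; it only shows the basic cost-minimisation idea that such methods build on.

```python
def best_changepoint(xs):
    """Return the split index minimising within-segment squared error
    for a single mean shift (exhaustive scan; illustrative only)."""
    def sse(seg):
        # squared error of a segment around its own mean
        if not seg:
            return 0.0
        mu = sum(seg) / len(seg)
        return sum((v - mu) ** 2 for v in seg)
    return min(range(1, len(xs)), key=lambda t: sse(xs[:t]) + sse(xs[t:]))
```

Real methods replace the exhaustive scan with recursions that are pruned or shared across series, which is where the efficiency questions in the thesis arise.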
6

Schäfer, Patrick. "Scalable time series similarity search for data analytics." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät, 2015. http://dx.doi.org/10.18452/17338.

Abstract:
A time series is a collection of values sequentially recorded from sensors or live observations over time. Sensors for recording time series have become cheap and omnipresent. While data volumes explode, research in the field of time series data analytics has focused on the availability of (a) pre-processed and (b) moderately sized time series datasets in the last decades. The analysis of real world datasets raises two major problems: Firstly, state-of-the-art similarity models require the time series to be pre-processed. Pre-processing aims at extracting approximately aligned characteristic subsequences and reducing noise. It is typically performed by a domain expert, may be more time consuming than the data mining part itself, and simply does not scale to large data volumes. Secondly, time series research has been driven by accuracy metrics and not by reasonable execution times for large data volumes. This results in quadratic to biquadratic computational complexities of state-of-the-art similarity models. This dissertation addresses both issues by introducing a symbolic time series representation and three different similarity models. These contribute to state of the art by being pre-processing-free, noise-robust, and scalable. Our experimental evaluation on 91 real-world and benchmark datasets shows that our methods provide higher accuracy for most datasets when compared to 15 state-of-the-art similarity models. Meanwhile they are up to three orders of magnitude faster, require less pre-processing for noise or alignment, or scale to large data volumes.
7

Mitchell, F. "Painless knowledge acquisition for time series data." Thesis, University of Aberdeen, 1997. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU100889.

Abstract:
Knowledge Acquisition has long been acknowledged as the bottleneck in producing Expert Systems. This is because, until relatively recently, the KA (Knowledge Acquisition) process has concentrated on extracting knowledge from a domain expert, which is a very time-consuming process. Support tools have been constructed to help with this process, but they have not been able to reduce the time radically. However, in many domains the expert is not the only source of knowledge, nor indeed the best one. This is particularly true in industrial settings where performance information is routinely archived. This information, if processed correctly, can provide a substantial part of the knowledge required to build a KB (Knowledge Base). In this thesis I discuss current KA approaches and then outline a methodology that uses KD (Knowledge Discovery) techniques to mine archived time series data and produce fault detection and diagnosis KBs with minimal expert input. This methodology is implemented in the TIGON system, which is the focus of this thesis. TIGON uses archived information (in TIGON's case, from a gas turbine engine) along with guidance from the expert to produce KBs for detecting and diagnosing faults in a gas turbine engine. TIGON's performance is analysed in some detail, and a comparison with other related work is included.
8

Charapko, Aleksey. "Time Series Similarity Search in Distributed Key-Value Data Stores Using R-Trees." UNF Digital Commons, 2015. http://digitalcommons.unf.edu/etd/565.

Abstract:
Time series data are sequences of data points collected at certain time intervals. Advances in mobile and sensor technologies have led to rapid growth in the amount of available time series data. The ability to search large time series data sets can be extremely useful in many applications. In healthcare, a system monitoring vital signs can search past data and identify possible health-threatening conditions. In engineering, a system can analyze the performance of complicated equipment and identify possible failure situations or maintenance needs based on historical data. Existing search methods for time series data are limited in many ways. Systems utilizing memory-bound or disk-bound indexes are restricted by the resources of a single machine or hard drive. Systems that do not use indexes must scan the entire database whenever a search is requested. The proposed system uses a multidimensional index in a distributed storage environment to move beyond the bounds of one physical machine and allow for high data scalability. Utilizing an index allows the system to locate patterns similar to the query without having to examine the entire dataset, which can significantly reduce the amount of computing resources required. The system uses an Apache HBase distributed key-value database to store the index and time series data across a cluster of machines. Evaluations were conducted to examine the system's performance using synthesized data of up to 30 million data points. The evaluation results showed that, despite some drawbacks inherited from the R-tree data structure, the system can efficiently search and retrieve patterns in large time series datasets.
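The pruning idea behind an R-tree-style index can be illustrated with a crude sketch: summarise each series by per-chunk (min, max) bounds, then discard any candidate whose bounds are already too far from the query's, without touching the raw data. The chunking scheme and all names here are invented for the example; they are not the system's actual HBase/R-tree implementation.

```python
def mbr(series, dims=4):
    """Split a series into `dims` chunks and keep (min, max) per chunk --
    a crude minimum bounding rectangle, as an R-tree leaf might store."""
    step = len(series) // dims
    chunks = [series[i * step:(i + 1) * step] for i in range(dims)]
    return [(min(c), max(c)) for c in chunks]


def lb_box(query_box, cand_box):
    """A (weak) lower bound on squared distance between any two series
    lying inside the respective boxes; 0 where the boxes overlap."""
    total = 0.0
    for (qlo, qhi), (clo, chi) in zip(query_box, cand_box):
        if clo > qhi:
            total += (clo - qhi) ** 2
        elif chi < qlo:
            total += (qlo - chi) ** 2
    return total


def prune(query, database, threshold, dims=4):
    """Keep only candidates whose bounding boxes could be within `threshold`."""
    qbox = mbr(query, dims)
    return [k for k, s in database.items() if lb_box(qbox, mbr(s, dims)) <= threshold]
```

Only the survivors of `prune` need an exact distance computation, which is what lets an indexed system avoid scanning the whole dataset.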
9

Muhammad, Fuad Muhammad Marwan. "Similarity Search in High-dimensional Spaces with Applications to Time Series Data Mining and Information Retrieval." Phd thesis, Université de Bretagne Sud, 2011. http://tel.archives-ouvertes.fr/tel-00619953.

Abstract:
We present one of the main problems in information retrieval and data mining: the similarity search problem. We approach this problem from an essentially metric perspective. We focus on time series data, but our overall objective is to develop methods and algorithms that can be extended to other data types. We study new methods for handling the similarity search problem in high-dimensional spaces. The new methods and algorithms we introduce are extensively tested, and they show superiority over the other methods and algorithms in the literature.
10

Schäfer, Patrick [Verfasser], Alexander [Akademischer Betreuer] Reinefeld, Ulf [Akademischer Betreuer] Leser, and Artur [Akademischer Betreuer] Andrzejak. "Scalable time series similarity search for data analytics / Patrick Schäfer. Gutachter: Alexander Reinefeld ; Ulf Leser ; Artur Andrzejak." Berlin : Mathematisch-Naturwissenschaftliche Fakultät, 2015. http://d-nb.info/1078309620/34.

11

Arzoky, Mahir. "Munch : an efficient modularisation strategy on sequential source code check-ins." Thesis, Brunel University, 2015. http://bura.brunel.ac.uk/handle/2438/13808.

Abstract:
As developers create increasingly sophisticated applications, software systems are growing in both complexity and size. When source code is easy to understand, the system can be more maintainable, which leads to reduced costs. Better-structured code can also allow new requirements to be introduced more efficiently and with fewer issues. However, the maintenance and evolution of systems can be frustrating: it is difficult for developers to keep a fixed understanding of the system's structure, as the structure can change during maintenance. Software module clustering is the process of automatically partitioning the structure of the system using low-level dependencies in the source code, to improve the system's structure. A large number of studies have used the search-based software engineering approach to solve the software module clustering problem. A software clustering tool, Munch, was developed and employed in this study to modularise a unique dataset of sequential source code software versions. The tool is based on search-based software engineering techniques. It consists of a number of components, including the clustering algorithm and a number of different fitness functions and metrics used for measuring and assessing the quality of the clustering decompositions. The tool provides a framework for evaluating a number of clustering techniques and strategies. The dataset used in this study was provided by Quantel Limited; it comes from processed source code of a product-line-architecture library that has delivered numerous products. The dataset analysed is the persistence engine used by all products, comprising over 0.5 million lines of C++ across 503 software versions. This study investigates whether search-based software clustering approaches can help stakeholders understand how the inter-class dependencies of a software system change over time.
It performs efficient modularisation on a time series of source code relationships, taking advantage of the fact that the nearer the source code is in time, the more similar the modularisations are expected to be. The study introduces a seeding concept and highlights how it can be used to significantly reduce the runtime of the modularisation. The dataset is not treated as a set of separate modularisation problems; instead, the result of modularising the previous graph is used to give the next graph a head start. Code structure and sequence are used to obtain more effective modularisation and to reduce the runtime of the process. To evaluate the efficiency of the modularisation, numerous experiments were conducted on the dataset; their results present strong evidence in support of the seeding strategy. To reduce the runtime further, statistical techniques for controlling the number of iterations of the modularisation, based on the similarities between time-adjacent graphs, are introduced. The convergence of the heuristic search technique is examined, and a number of stopping criteria are estimated and evaluated. Extensive experiments were conducted on the time-series dataset, and evidence is presented to support the proposed techniques. In addition, this thesis investigates and evaluates the starting clustering arrangement of Munch's clustering algorithm, introducing and experimenting with a number of starting clustering arrangements, including a uniformly random clustering arrangement strategy. Moreover, this study investigates whether the dataset used for the modularisation resembles a random graph by computing the probabilities of observing certain connectivity. The thesis demonstrates that modularisation is not possible with data that resembles random graphs, and that the dataset being used does not resemble a random graph except in small sections where there were large maintenance activities.
Furthermore, it explores how the random-graph metric can be used as a tool to indicate areas of interest in the dataset without the need to run the modularisation. Finally, although a huge amount of software code has been and will be developed, very little has been learnt from how that code evolves over time. The intention of this study is also to help developers and stakeholders model the internal software, to aid in modelling development trends and biases, and to try to predict the occurrence of large changes and potential refactorings. To this end, industrial feedback on the research was obtained. The thesis presents work on the detection of refactoring activities and discusses possible applications of the findings of this research in industrial settings.
12

Zhang, Zhu, Xiaolong Zheng, Daniel Dajun Zeng, and Scott J. Leischow. "Tracking Dabbing Using Search Query Surveillance: A Case Study in the United States." JMIR PUBLICATIONS, INC, 2016. http://hdl.handle.net/10150/621512.

Abstract:
Background: Dabbing is an emerging method of marijuana ingestion. However, little is known about dabbing owing to limited surveillance data on dabbing. Objective: The aim of the study was to analyze Google search data to assess the scope and breadth of information seeking on dabbing. Methods: Google Trends data about dabbing and related topics (eg, electronic nicotine delivery system [ENDS], also known as e-cigarettes) in the United States between January 2004 and December 2015 were collected by using relevant search terms such as "dab rig." The correlation between dabbing (including topics: dab and hash oil) and ENDS (including topics: vaping and e-cigarette) searches, the regional distribution of dabbing searches, and the impact of cannabis legalization policies on geographical location in 2015 were analyzed. Results: Searches regarding dabbing increased in the United States over time, with 1,526,280 estimated searches during 2015. Searches for dab and vaping have very similar temporal patterns, where the Pearson correlation coefficient (PCC) is .992 (P<.001). Similar phenomena were also obtained in searches for hash oil and e-cigarette, in which the corresponding PCC is .931 (P<.001). Dabbing information was searched more in some western states than other regions. The average dabbing searches were significantly higher in the states with medical and recreational marijuana legalization than in the states with only medical marijuana legalization (P=.02) or the states without medical and recreational marijuana legalization (P=.01). Conclusions: Public interest in dabbing is increasing in the United States. There are close associations between dabbing and ENDS searches. The findings suggest greater popularity of dabs in the states that legalized medical and recreational marijuana use. 
This study proposes a novel and timely approach to cannabis surveillance; its findings can enhance the understanding of the popularity of dabbing and provide insights for future research and informed policy making.
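The Pearson correlation coefficients reported above (e.g. the .992 PCC between dab and vaping search volumes) follow the standard formula, which can be written out directly. This is a generic implementation of the textbook formula, not the study's analysis code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A value near +1, as reported for dab vs. vaping, means the two weekly search-volume series rise and fall almost in lockstep; the significance test on top of this would normally come from a statistics library.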
13

Kemp, Kirsty M. "Temporal dynamics in the deep sea : time-series at food falls, seasonality in condition of grenadiers, and tides as time signals." Thesis, University of Aberdeen, 2006. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU222698.

Abstract:
The deep demersal community of the bathyal Porcupine Seabight is subject to environmental forcing on diel, seasonal and annual scales, in addition to the stochastic and transient influence of nutritional windfalls from the photic zone. The current regime at bathyal depth in the Porcupine Seabight is characterised by oscillations in current flow with periods of 12.4h and 14.8d. Increased current velocity and particle suspension in summer months synchronises well with the seasonal input of phytoplankton to the seafloor. These physical characteristics may constitute time signals in the deep ocean environment. Consumption and succession processes at bathyal food falls in the North Atlantic are suggestive of a fundamental difference in the community response between the North Atlantic and North Pacific oceans. The sinking of small cetacean carcasses constitutes a transient environmental impact on the local community structure which is not limited to the scavenging fauna. There is limited evidence of a response to the seasonal increase in available organic carbon in the white muscle of North Atlantic macrourids. This is in accordance with results from Pacific macrourids and suggests that the seasonal food pulse, experienced by the deep benthos under productive surface waters, is not greatly manifested at higher trophic levels. The successful adaptation of existing baited camera technology to incorporate an autonomous periodic bait-release system has enabled long-term high frequency time-series observations of deep-sea scavenging demersal fish and crustaceans to be made for the first time. An understanding of temporal environmental cues, and of the resultant interactions between organisms and their environment, effectively pervades the study of any aspect of organismal or population ecology.
14

Matowe, Lloyd K. "An evaluation of the use of time series analysis designs in clinical guidelines implementation studies." Thesis, University of Aberdeen, 2001. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU137968.

Abstract:
Time series analysis designs strengthen before-and-after studies and are regarded as easy and cheap to use. These designs have recently become more popular in guideline implementation studies, but there is suspicion that they are used inappropriately or without sufficient understanding of the underlying methodology. In this thesis, we attempt to evaluate their use by means of a systematic review of published studies, and by actively using time series analysis to evaluate the effect of dissemination of the 3rd edition of the Royal College of Radiologists' guidelines on imaging referral patterns from primary care in the North East of Scotland. The systematic review established that many time series studies are indeed conducted inappropriately, with key issues such as the use of an adequate number of data points and adjustment for trends not taken into consideration; often, results are assessed using non-statistical analyses. Our findings suggest that there should be an increased awareness among investigators of the correct statistical techniques for performing and analysing time series analyses. The guideline evaluation study found that passive dissemination of the imaging guideline in the North East of Scotland did not affect GPs' imaging referral patterns, which may suggest the need for reinforcement with more active dissemination strategies. It also established that time series analysis can be complex, requiring a clear understanding before use if researchers are to get the best from it. Compared to time series analysis, before-and-after studies were shown to be unreliable, with the potential to give misleading results.
15

Granell, Albin, and Filip Carlsson. "How Google Search Trends Can Be Used as Technical Indicators for the S&P500-Index : A Time Series Analysis Using Granger’s Causality Test." Thesis, KTH, Matematisk statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-228740.

Abstract:
This thesis studies whether Google search trends can be used as indicators of movements in the S&P 500 index. Using Granger's causality test, the level of causality between movements in the S&P 500 index and Google search volumes for certain keywords is analyzed. The result of the analysis is used to form an investment strategy based entirely on Google search volumes, which is then backtested over a five-year period using historic data. The causality tests show that 8 of 30 words indicate causality at a 10% level of significance, and one word, mortgage, indicates causality at a 1% level of significance. Several investment strategies based on search volumes yield higher returns than the index itself over the considered five-year period, with the best-performing strategy beating the index by over 60 percentage points.
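Granger's test asks whether adding lagged values of x to an autoregression of y significantly reduces the residual sum of squares, judged by an F-statistic. The thesis applies the standard multi-lag test (as implemented in common statistics libraries); the single-lag, from-scratch version below, with a hand-rolled least-squares solver, is only an illustration of the mechanics.

```python
def ols_rss(y, X):
    """Residual sum of squares of the least-squares fit y ~ X (X: list of rows)."""
    k = len(X[0])
    # normal equations A beta = b
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return sum((yi - sum(bc * xc for bc, xc in zip(beta, row))) ** 2
               for yi, row in zip(y, X))


def granger_f(y, x):
    """Lag-1 Granger F-statistic: does x[t-1] help predict y[t]
    beyond a constant and y[t-1]?"""
    yt = y[1:]
    restricted = [[1.0, y[t]] for t in range(len(y) - 1)]
    unrestricted = [[1.0, y[t], x[t]] for t in range(len(y) - 1)]
    rss_r, rss_u = ols_rss(yt, restricted), ols_rss(yt, unrestricted)
    n, q, k = len(yt), 1, 3
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))
```

In practice one would compare the F-statistic against the F(q, n - k) distribution to obtain the 10% or 1% significance levels quoted in the abstract, and repeat the test for each lag order considered.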
16

Jiao, Yang. "Applications of artificial intelligence in e-commerce and finance." Thesis, Evry, Institut national des télécommunications, 2018. http://www.theses.fr/2018TELE0002/document.

Abstract:
Artificial intelligence has penetrated every aspect of our lives in this era of Big Data. It has brought revolutionary changes to various sectors, including e-commerce and finance. In this thesis, we present four applications of AI which improve existing goods and services, enable automation, and greatly increase the efficiency of many tasks in both domains. Firstly, we improve the product search service offered by most e-commerce sites by using a novel term-weighting scheme to better assess term importance within a search query. Then we build a predictive model of daily sales using a time series forecasting approach, and leverage the predicted results to rank product search results in order to maximize the revenue of a company. Next, we present the product categorization challenge we held online and analyze the winning solutions, consisting of state-of-the-art classification algorithms, on our real dataset. Finally, we combine the skills acquired previously in time-series-based sales prediction and classification to predict one of the most difficult but also most attractive time series: stocks. We perform an extensive study of every single stock in the S&P 500 index using four state-of-the-art classification algorithms and report very promising results.
17

Jiao, Yang. "Applications of artificial intelligence in e-commerce and finance." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2018. http://www.theses.fr/2018TELE0002.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
L'Intelligence Artificielle est présente dans tous les aspects de notre vie à l'ère du Big Data. Elle a entraîné des changements révolutionnaires dans divers secteurs, dont le commerce électronique et la finance. Dans cette thèse, nous présentons quatre applications de l'IA qui améliorent les biens et services existants, permettent l'automatisation et augmentent considérablement l'efficacité de nombreuses tâches dans les deux domaines. Tout d'abord, nous améliorons le service de recherche de produits offert par la plupart des sites de commerce électronique en utilisant un nouveau système de pondération des termes pour mieux évaluer l'importance des termes dans une requête de recherche. Ensuite, nous construisons un modèle prédictif sur les ventes quotidiennes en utilisant une approche de prévision des séries temporelles et tirons parti des résultats prévus pour classer les résultats de recherche de produits afin de maximiser les revenus d'une entreprise. Ensuite, nous proposons la difficulté de la classification des produits en ligne et analysons les solutions gagnantes, consistant en des algorithmes de classification à la pointe de la technologie, sur notre ensemble de données réelles. Enfin, nous combinons les compétences acquises précédemment à partir de la prédiction et de la classification des ventes basées sur les séries temporelles pour prédire l'une des séries temporelles les plus difficiles mais aussi les plus attrayantes : le stock. Nous effectuons une étude approfondie sur chaque titre de l'indice S&P 500 en utilisant quatre algorithmes de classification à la pointe de la technologie et nous publions des résultats très prometteurs
Artificial Intelligence has penetrated every aspect of our lives in this era of Big Data. It has brought revolutionary changes to various sectors, including e-commerce and finance. In this thesis, we present four applications of AI which improve existing goods and services, enable automation, and greatly increase the efficiency of many tasks in both domains. Firstly, we improve the product search service offered by most e-commerce sites by using a novel term weighting scheme to better assess term importance within a search query. Then we build a predictive model of daily sales using a time series forecasting approach and leverage the predicted results to rank product search results in order to maximize the revenue of a company. Next, we present the product categorization challenge we held online and analyze the winning solutions, consisting of state-of-the-art classification algorithms, on our real dataset. Finally, we combine the skills acquired previously from time-series-based sales prediction and classification to predict one of the most difficult but also most attractive time series: stock prices. We perform an extensive study on every single stock of the S&P 500 index using four state-of-the-art classification algorithms and report very promising results.
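The abstract above mentions a term weighting scheme for product search queries. The thesis proposes a novel scheme whose formula is not given here; as a point of reference only, a minimal sketch of the classic TF-IDF baseline is shown below (the function name and the smoothing are illustrative assumptions).

```python
import math
from collections import Counter

def tf_idf(query_terms, documents):
    """Classic TF-IDF weights for the terms of a search query.

    Only the standard baseline scheme is sketched here; the thesis's
    novel weighting scheme is not reproduced.
    """
    n_docs = len(documents)
    # document frequency: in how many documents each term appears
    df = Counter()
    for doc in documents:
        for term in set(doc):
            df[term] += 1
    tf = Counter(query_terms)
    weights = {}
    for term in query_terms:
        # smoothed IDF so unseen terms do not divide by zero
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1
        weights[term] = tf[term] * idf
    return weights
```

Rare terms receive higher weights than common ones, which is the basic behavior any refined scheme builds on.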
18

Serrà, Julià Joan. "Identification of versions of the same musical composition by processing audio descriptions." Doctoral thesis, Universitat Pompeu Fabra, 2011. http://hdl.handle.net/10803/22674.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
This work focuses on the automatic identification of musical piece versions (alternate renditions of the same musical composition like cover songs, live recordings, remixes, etc.). In particular, we propose two core approaches for version identification: model-free and model-based ones. Furthermore, we introduce the use of post-processing strategies to improve the identification of versions. For all that we employ nonlinear signal analysis tools and concepts, complex networks, and time series models. Overall, our work brings automatic version identification to an unprecedented stage where high accuracies are achieved and, at the same time, explores promising directions for future research. Although our steps are guided by the nature of the considered signals (music recordings) and the characteristics of the task at hand (version identification), we believe our methodology can be easily transferred to other contexts and domains.
Aquest treball es centra en la identificació automàtica de versions musicals (interpretacions alternatives d'una mateixa composició: 'covers', directes, remixos, etc.). En concret, proposem dos tipus d'estratègies: la lliure de model i la basada en models. També introduïm tècniques de post-processat per tal de millorar la identificació de versions. Per fer tot això emprem conceptes relacionats amb l'anàlisi no linial de senyals, xarxes complexes i models de sèries temporals. En general, el nostre treball porta la identificació automàtica de versions a un estadi sense precedents on s'obtenen bons resultats i, al mateix temps, explora noves direccions de futur. Malgrat que els passos que seguim estan guiats per la natura dels senyals involucrats (enregistraments musicals) i les característiques de la tasca que volem solucionar (identificació de versions), creiem que la nostra metodologia es pot transferir fàcilment a altres àmbits i contextos.
19

Andrade, Claudinei Garcia de. "Consultas por similaridade e mineração de regras de associação: maximizando o conhecimento extraído de séries temporais." Universidade Federal de São Carlos, 2014. https://repositorio.ufscar.br/handle/ufscar/583.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Made available in DSpace on 2016-06-02T19:06:18Z (GMT). No. of bitstreams: 1 6337.pdf: 1365151 bytes, checksum: 464969011137271e4d5d5088872c236b (MD5) Previous issue date: 2014-08-28
Time series analysis presents certain challenges, whether in the difficulty of manipulating the data, which demands a large computational cost, or in the difficulty of finding subsequences that share the same characteristics. Nevertheless, this analysis is important for understanding the evolution of various phenomena such as climate change and fluctuations in financial markets, among others. This project proposed the development of a method for performing similarity queries in time series that achieves better performance and accuracy than the state of the art, as well as a method for mining association rules in series using similarity. The experiments performed applied the proposed methods to real data sets, yielding relevant knowledge and indicating that both methods are suitable for similarity analysis of one-dimensional and multidimensional time series.
A análise de séries temporais apresenta certos desafios. Seja pela dificuldade na manipulação dos dados, por exigir um grande custo computacional, ou mesmo pela dificuldade de se encontrar subsequências que apresentam as mesmas características. No entanto, essa análise é importante para o entendimento da evolução de diversos fenômenos como as mudanças climáticas, as variações no mercado financeiro entre outros. Este projeto de mestrado propôs o desenvolvimento de um método para a realização de consultas por similaridade em séries temporais que apresentam melhor desempenho e acurácia que o estado-da-arte e um método de mineração de regras de associação em séries utilizando similaridade. Os experimentos feitos aplicaram os métodos propostos em conjuntos de dados reais, trazendo conhecimento relevante, indicando que os métodos são adequados para análise por similaridade de séries temporais unidimensionais e multidimensionais.
20

Vuillemin, Benoit. "Recherche de règles de prédiction dans un contexte d'Intelligence Ambiante." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSE1120.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Cette thèse traite du sujet de l’intelligence ambiante, fusion entre l’intelligence artificielle et l’internet des objets. L’objectif de ce travail est d’extraire des règles de prédiction à partir des données fournies par les objets connectés dans un environnement, afin de proposer aux utilisateurs des automatisations. Notre principale motivation repose sur la confidentialité, les interactions entre utilisateurs et l’explicabilité du fonctionnement du système. Dans ce contexte, plusieurs contributions ont été apportées. La première est une architecture d’intelligence ambiante qui fonctionne localement et traite les données provenant d’un seul environnement connecté. La seconde est un processus de discrétisation sans a priori sur les données d’entrée, permettant de prendre en compte les différentes données provenant de divers objets. La troisième est un nouvel algorithme de recherche de règles sur une série temporelle, qui évite les limitations des algorithmes de l’état de l’art. L’approche a été validée par des tests sur deux bases de données réelles. Enfin, les perspectives de développement du système sont présentées
This thesis deals with the subject of Ambient Intelligence, the fusion between Artificial Intelligence and the Internet of Things. The goal of this work is to extract prediction rules from the data provided by connected objects in an environment, in order to propose automations to users. Our main concerns are privacy, user interactions, and the explainability of the system's operation. In this context, several contributions were made. The first is an ambient intelligence architecture that operates locally and processes data from a single connected environment. The second is a discretization process without a priori assumptions on the input data, allowing different kinds of data from various objects to be taken into account. The third is a new algorithm for searching rules over a time series, which avoids the limitations of state-of-the-art algorithms. The approach was validated by tests on two real databases. Finally, prospects for future developments of the system are presented.
21

Vermoyal, Marie-Corinne. "La série adjectivale dans A la Recherche du Temps Perdu. Du fait de langue au fait de vision : « Cette multiforme et puissante unité »." Thesis, Paris 4, 2015. http://www.theses.fr/2015PA040118.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
La série adjectivale est une structure bien connue des lecteurs de Proust. La Recherche comprend plus de trois mille séries adjectivales ; elles combinent deux, trois, jusqu’à dix-sept adjectifs. Les variations sémantiques et syntaxiques sont nombreuses entre les séries. Devons-nous parler de série adjectivale ou de séries, au pluriel ? Quel point commun ont toutes ces séries entre elles ? Peut-on dire que la série adjectivale est une figure de style ?Notre but sera de montrer comment s’articulent ces deux phénomènes stylistiques qui font de la série adjectivale une microstructure complexe : l’effet artiste et le fait de vision. Notre thèse consiste à démontrer que la série adjectivale est un phénomène stylistique, un fait qui ne peut se comprendre qu’en analysant le fonctionnement cognitif d’une vision du monde. Dans une première partie nous analysons la série adjectivale comme un fait de langue, une syntaxe complexe ; dans une seconde partie, nous étudions les effets stylistiques produits par la série adjectivale ; puis nous démontrons que le fait syntaxique est l’expression d’un rapport phénoménologique entre le narrateur et le monde qui l’entoure
Adjectival series are a structure well known to Proust's readers. We find more than three thousand adjectival series in In Search of Lost Time; some combine two, three, or four adjectives, up to seventeen, and we notice semantic variations and syntactic differences between them. Should we speak of the adjectival series or of adjectival series, in the plural? What do these series have in common? Is the adjectival series a stylistic figure? We aim to prove that the adjectival series partakes of two stylistic phenomena: the artistic writing effect and the vision of the world. We analyse this stylistic fact in the light of psychomechanical linguistics, as the expression of an original way of perceiving. In the first part of the research we show that the adjectival series is a complex syntactic fact; in the second part we analyse the adjectival series as a stylistic effect; then we demonstrate that the syntactic fact expresses a phenomenological link between the narrator and the world.
22

Schroeder, Pascal. "Performance guaranteeing algorithms for solving online decision problems in financial systems." Electronic Thesis or Diss., Université de Lorraine, 2019. http://www.theses.fr/2019LORR0143.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Cette thèse contient quelques problèmes de décision financière en ligne et des solutions. Les problèmes sont formulés comme des problèmes en ligne (OP) et des algorithmes en ligne (OA) sont créés pour résoudre. Comme il peut y avoir plusieurs OAs pour le même OP, il doit y avoir un critère afin de pouvoir faire des indications au sujet de la qualité d’un OA. Dans cette thèse ces critères sont le ratio compétitif (c), la différence compétitive (cd) et la performance numérique. Un OA qui a un c ou cd plus bas qu’un autre est à préférer. Un OA qui possède le c le plus petit est appelé optimal. Nous considérons les OPs suivants. Le problème de conversion en ligne (OCP), le problème de sélection de portefeuille en ligne (PSP) et le problème de gestion de trésorerie en ligne (CMP). Après le premier chapitre d’introduction, les OPs, la notation et l’état des arts dans le champ des OPs sont présentés. Dans le troisième chapitre on résoudre trois variantes des OCP avec des prix interconnectés. En Chapitre 4 on travaille encore sur le problème de recherche de série chronologie avec des prix interconnectés et on construit des nouveaux OAs. À la fin de ce chapitre l’OA k-DIV est créé pour le problème de recherche générale des k maximums. k-DIV est aussi optimal. En Chapitre 5 on résout le PSP avec des prix interconnectés. L’OA créé s’appelle OPIP et est optimal. En utilisant les idées de OPIP, on construit un OA pour le problème d’échange bidirectionnel qui s’appelle OCIP et qui est optimal. Avec OPIP, on construit un OA optimal pour le problème de recherche bidirectionnel (OA BUND) sachant les valeurs de θ_1 et θ_2. Pour des valeurs inconnues, on construit l’OA RUN qui est aussi optimal. Les chapitres 6 et 7 traitent sur le CMP. Dans les deux chapitres il y a des tests numériques afin de pouvoir comparer la performance des OAs nouveaux avec celle des OAs déjà établies. En Chapitre 6 on construit des OAs optimaux ; en chapitre 7 on construit des OA qui minimisent cd. 
L’OA BCSID résoudre le CMP avec des demandes interconnectées ; LOA aBBCSID résoudre le problème lorsqu’ on connaît les valeurs de θ_1,θ_2,m et M. L’OA n’est pas optimal. En Chapitre 7 on résout le CMP par rapport à m et M en minimisant cd (OA MRBD). Ensuite on construit l’OA HMRID et l’OA MRID pour des demandes interconnectées. MRID minimise cd et HMRID est un bon compromis entre la performance numérique et la minimisation de cd
This thesis contains several online financial decision problems and their solutions. The problems are formulated as online problems (OPs) and online algorithms (OAs) are created to solve them. Because there can be various OAs for the same OP, there must be criteria with which one can make statements about the quality of an OA. In this thesis these criteria are the competitive ratio (c), the competitive difference (cd) and the numerical performance. An OA with a lower c is preferable to one with a higher value; an OA that has the lowest c is called optimal. We consider the following OPs: the online conversion problem (OCP), the online portfolio selection problem (PSP) and the cash management problem (CMP). After the introductory chapter, the OPs, the notation and the state of the art in the field of OPs are presented. In the third chapter, three variants of the OCP with interrelated prices are solved. In the fourth chapter the time series search with interrelated prices is revisited and new algorithms are created. At the end of the chapter, the optimal OA k-DIV for the general k-max search with interrelated prices is developed. In Chapter 5 the PSP with interrelated prices is solved. The created OA OPIP is optimal. Using the idea of OPIP, an optimal OA for two-way trading is created (OCIP). Having OCIP, an optimal OA for the bi-directional search knowing the values of θ_1 and θ_2 is created (BUND). For unknown θ_1 and θ_2, the optimal OA RUN is created. The chapter ends with an empirical (for OPIP) and experimental (for OCIP, BUND and RUN) testing. Chapters 6 and 7 deal with the CMP. In both of them, numerical testing is done in order to compare the numerical performance of the new OAs to that of the already established ones. In Chapter 6 an optimal OA is constructed; in Chapter 7, OAs are designed which minimize cd. The OA BCSID solves the CMP with interrelated demands to optimality.
The OA aBBCSID solves the CMP when the values of θ_1, θ_2, m and M are known; however, this OA is not optimal. In Chapter 7 the CMP is solved, knowing m and M and minimizing cd (OA MRBD). For interrelated demands, a heuristic OA (HMRID) and a cd-minimizing OA (MRID) are presented. HMRID is a good compromise between numerical performance and the minimization of cd. The thesis concludes with a short discussion of shortcomings of the considered OPs and the created OAs, followed by some remarks about future research possibilities in this field.
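The time series search setting treated in Chapters 3 and 4 builds on the classic one-way search problem. A minimal sketch of the textbook reservation-price policy follows: with only the price bounds m and M known, accept the first price at or above sqrt(m*M), which attains competitive ratio sqrt(M/m). The thesis's algorithms for interrelated prices (k-DIV, OPIP, BUND, RUN) refine this idea and are not reproduced here.

```python
import math

def reservation_price_search(prices, m, M):
    """One-way time series search with the classic reservation-price
    policy: accept the first price >= sqrt(m * M), otherwise the last.

    This is the textbook baseline algorithm, not one of the thesis's
    OAs for interrelated prices.
    """
    reservation = math.sqrt(m * M)
    for t, p in enumerate(prices):
        if p >= reservation:
            return t, p  # sell as soon as the threshold is met
    # the horizon ends: the player is forced to accept the last price
    return len(prices) - 1, prices[-1]
```

For m = 10 and M = 40 the reservation price is 20, so the player sells at the first quote of at least 20.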
23

"Efficient similarity search in time series data." Thesis, 2007. http://library.cuhk.edu.hk/record=b6074201.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Time series data is ubiquitous in the real world, and similarity search in time series data is of great importance to many applications. This problem consists of two major parts: how to define the similarity between time series and how to search for similar time series efficiently. As for the similarity measure, the Euclidean distance is a good starting point; however, it also has several limitations. First, it is sensitive to shifting and scaling transformations. Under a geometric model, we analyze this problem extensively and propose an angle-based similarity measure which is invariant to shifting and scaling. We then extend the conical index to support the proposed angle-based similarity measure efficiently. Besides distortions along the amplitude axis, the Euclidean distance is also sensitive to distortion along the time axis; the Dynamic Time Warping (DTW) distance is a very good similarity measure which is invariant to time distortion. However, the time complexity of DTW is high, which inhibits its application on large datasets. An index method under the DTW distance is a common solution to this problem, and the lower-bound technique plays an important role in the indexing of DTW. We explain the existing lower-bound functions under a unified framework and propose a group of new lower-bound functions which are much tighter. Based on the proposed lower-bound functions, an efficient index structure under the DTW distance is implemented. In spite of the great success of DTW, it is not very suitable for the time scaling search problem, where the time distortion is too large. We modify the traditional DTW distance and propose the Segment-wise Time Warping (STW) distance to adapt to the time scaling search problem. Finally, we devise an efficient search algorithm for the problem of online pattern detection in data streams under the DTW distance.
Zhou, Mi.
"January 2007."
Adviser: Man Hon Wong.
Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6100.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2007.
Includes bibliographical references (p. 167-180).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract in English and Chinese.
School code: 1307.
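The DTW distance discussed in the abstract above is defined by a standard dynamic-programming recurrence; a minimal sketch follows. The thesis's contributions (lower-bound functions and index structures built on top of this measure) are not shown.

```python
def dtw(a, b):
    """Dynamic Time Warping distance between two 1-D sequences,
    computed with the standard O(len(a) * len(b)) dynamic program.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # stretch a
                                 d[i][j - 1],      # stretch b
                                 d[i - 1][j - 1])  # match step
    return d[n][m]
```

Unlike the Euclidean distance, DTW can align sequences of different lengths, so a repeated sample contributes no extra cost.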
24

GHIRETTI, ALESSANDRO. "Robust time series analysis with the Forward Search." Doctoral thesis, 2019. http://hdl.handle.net/2158/1150839.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
25

Kadiyala, Srividya. "Efficient and scalable search for similar patterns in time series data." Thesis, 2006. http://spectrum.library.concordia.ca/8949/1/MR14323.pdf.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
The popularity of time series databases for predicting future events and trends in applications such as market analysis and weather forecasting requires the development of more reliable, fast, and memory-efficient indexes. In this thesis, we consider searching for similar patterns in time series data for variable-length queries. Recently an indexing technique called Multi-Resolution Index (MRI) has been proposed to solve this problem [Kah01, Kah04], which uses compression to reduce the index size. However, the processor workload and memory curtail the opportunity of utilizing compression as an additional step. Motivated by the need and the limitations of existing techniques, the main objective of this thesis is to develop an alternative multi-resolution index structure and algorithm, to which we refer as Compact MRI (CMRI). This new technique takes advantage of an existing dimensionality reduction technique called Adaptive Piecewise Constant Approximation (APCA) [Keo01]. The advantages of CMRI are that it uses less space without requiring any compression and achieves high precision. We have implemented MRI and CMRI and performed extensive experiments to compare them. To evaluate the precision and performance of CMRI, we have used both real and synthetic data and compared the results with MRI. The experimental results indicate that CMRI improves precision, ranging from 0.75 to 0.89 on real data and from 0.80 to 0.95 on synthetic data. Furthermore, CMRI is superior to MRI in performance, as the number of disk I/Os required by CMRI is close to minimal. Compared to a sequential scan, CMRI is 4 to 30 times faster, observed on both real and synthetic data.
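CMRI builds on the APCA dimensionality reduction cited above. As a simplified illustration of the reduction idea only, the fixed-width variant (Piecewise Aggregate Approximation, PAA) is sketched below; APCA itself chooses segment boundaries adaptively, which this sketch does not attempt.

```python
def paa(series, n_segments):
    """Piecewise Aggregate Approximation: represent the series by the
    mean of each of n_segments roughly equal-width segments.

    A fixed-width stand-in for APCA, which uses variable-width
    segments; shown only to illustrate the reduction idea.
    """
    n = len(series)
    out = []
    for k in range(n_segments):
        lo = k * n // n_segments        # segment start index
        hi = (k + 1) * n // n_segments  # segment end index (exclusive)
        seg = series[lo:hi]
        out.append(sum(seg) / len(seg))
    return out
```

A query and a stored series can then be compared in the reduced space first, pruning most candidates before any full-resolution comparison.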
26

Chinag, Yueh-Huey, and 江玥慧. "Similarity Search in Time-Series Databases:Using the Database of TAIWAN Stock Price." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/00499608218099710014.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Master's thesis
National Taiwan University
Department of Information Management
86
Database technology is evolving along two lines. First, the data types stored in databases are becoming more complex, such as time-series, spatial, and multimedia data. Second, the development of similarity search can make database systems smarter and more flexible. Based on these two aspects, this thesis proposes a model that can process similarity search in time-series databases. The model transforms a great amount of time-series data into parallelograms and processes a fast similarity search with these parallelograms. The thesis applies the model to the TAIWAN stock price database. Finally, a comparison is made between this model and the R-tree model proposed by past research. The results reveal that this model is more efficient in searching and indexing and saves more storage space.
27

Jamiolkowski, Viktor. "Development of a Time Series Similarity Search Application for Unlabeled Glucose Measurements." Master's thesis, 2021. http://hdl.handle.net/10362/127132.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
28

SUNEJA, KRITI. "FPGA BASED HARDWARE DESIGN OF SIMILARITY SEARCH ALGORITHMS FOR TIME SERIES PROCESSING APPLICATIONS." Thesis, 2016. http://dspace.dtu.ac.in:8080/jspui/handle/repository/14745.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Time series is a huge collection of data indexed sequentially with respect to time. It is being produced at an extremely high rate in almost every domain, including the stock market, the music industry, the biomedical industry, etc. Data mining from a temporal database requires similarity measures which can distinguish between two or more time series. Many distance functions exist, such as Dynamic Time Warping, Edit Distance on Real Sequences, and Move Split Merge, which work efficiently in software for retrieving similar temporal sequences. Since a time series is a massive dataset, features need to be extracted before its analysis. In this work, five similarity measures have been synthesized for the device xc3s400-4-pq208 in Xilinx with the Verilog Hardware Description Language (HDL), and a comparison has been made to show the outperformance of one over the others based on the critical parameters of hardware utilization and delay. The purpose behind this project is to make these similarity measures available as portable devices for time series analysis in various domains. Simulations were performed in ModelSim. To compare the efficacy of these similarity measures in distinguishing time series, an application detecting plagiarism in music has been implemented in MATLAB, where all five algorithms were used to compute the distance between plagiarized, unplagiarized, and identical pairs of songs. The algorithm which could clearly distinguish these three sets of data, and also performed fairly well in hardware, was given the highest score for use as a separate entity in real-time applications. Also, a comparison was made between the execution time in hardware and in software to confirm the speed-up of the FPGA-implemented algorithms over software. The results showed that while hardware-implemented DTW can attain a maximum frequency of 18.9 MHz, MATLAB-implemented DTW reaches only 9.6 KHz for a four-element sequence.
The obtained results suggest that DTW was best for plagiarism detection and LCSS stood second. However, LCSS performed best in hardware utilization and delay; thus, it is the best-fit algorithm for commercial use.
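Of the five measures compared in this thesis, LCSS is straightforward to sketch in software. Below is a common numeric-series formulation in which two points match when they differ by at most a tolerance eps; the tolerance and the normalization are illustrative choices, not the thesis's hardware design.

```python
def lcss_similarity(a, b, eps=0.5):
    """Longest Common Subsequence similarity for numeric series:
    points 'match' when they differ by at most eps. The LCSS length
    is normalized by the shorter series' length, giving a score in
    [0, 1]. The eps value is an illustrative assumption.
    """
    n, m = len(a), len(b)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                L[i][j] = L[i - 1][j - 1] + 1   # matched pair extends LCSS
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m] / min(n, m)
```

The same dynamic-programming skeleton underlies the DTW and EDR measures named in the abstract, which is what makes a shared hardware comparison meaningful.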
29

Huang, Yi-Lun, and 黃義倫. "A Framework for Efficient Similarity Search over Power Time Series considering Data Security." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/41723197646565532351.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Master's thesis
National Chung Hsing University
Department of Computer Science and Engineering
105
In the forthcoming years, the wide deployment of smart grids can be foreseen. Smart grids, providing the capability of measuring real-time electricity usage, play an important role in future intelligent cities. From this viewpoint, mining and analyzing the energy usage data from smart grids bring opportunities for smart energy management and adaptive power generation. However, storing the massive data from a city-scale smart grid can be a challenge. One solution is to leverage cloud storage for smart grid data. However, storing power usage data in a cloud raises privacy concerns; by analyzing the power meter data, one can readily discover the activity pattern of a user. An idea is to encrypt the smart grid data stored in the cloud, but this raises the feasibility concern of mining and analyzing the encrypted data. Aiming at this issue, in this thesis we propose a framework that protects data privacy and preserves the flexibility of efficiently issuing similarity queries over encrypted data stored in cloud storage. Our framework, based on Discrete Fourier Transform (DFT) and Locality Sensitive Hashing (LSH) techniques, provides an approximation mechanism for computing the results of similarity queries. Experiments with data collected from actually deployed smart meters were conducted, and the results demonstrate the feasibility of the proposed framework.
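The DFT side of such a framework can be illustrated with a truncated-coefficient signature: keep only the first few Fourier coefficients of each series and compare those. The sketch below is a simplified stand-in; the LSH stage and the encryption handling of the thesis are omitted, and the function names are assumptions.

```python
import cmath

def dft_signature(series, k):
    """First k complex DFT coefficients of a series, used as a
    compact signature for similarity search (a simplified stand-in
    for the thesis's DFT+LSH pipeline)."""
    n = len(series)
    coeffs = []
    for f in range(k):
        c = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, x in enumerate(series)) / n
        coeffs.append(c)
    return coeffs

def signature_distance(a, b, k=4):
    """Euclidean distance between truncated DFT signatures. By
    Parseval's theorem this lower-bounds a scaled version of the
    true distance, so as a filter it causes no false dismissals."""
    sa, sb = dft_signature(a, k), dft_signature(b, k)
    return sum(abs(x - y) ** 2 for x, y in zip(sa, sb)) ** 0.5
```

In a real deployment, hashing these low-dimensional signatures with LSH lets the server answer approximate similarity queries without seeing the raw readings.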
30

Yen, Yu-Wen, and 嚴昱文. "An Adaptive Learning Object Management and Search Mechanism based on Time-Series Mining." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/92568017886315087602.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Doctoral dissertation
Tamkang University
PhD Program, Department of Computer Science and Information Engineering
102
Recent advances in information technology have turned the World Wide Web into the main platform for interactions, where participants (users and their corresponding events) are triggered. Although the participants vary with the scenario, a considerable amount of data is generated. This phenomenon causes complexity in information retrieval, management, and reuse, and meanwhile diminishes the value of this data. In this thesis, we attempt to achieve efficient management of user-generated data and its derivative contexts for human support. The thesis concentrates on the meaningful reuse of user-generated data, especially its usage for learning purposes, through an efficient and purpose-built data management process. First, an intelligent state machine, which is the essence of the user-generated data processing scenario, was developed to identify relations of data and its derivative contexts, especially those that are frequently accessed and time-sensitive. To improve the accuracy of data correlation modeling, a temporal mining algorithm is then defined. This algorithm is applied to highlight the event that a data item is being accessed, and further examines its relative attributes with other correlated items. Last but not least, we present a conceptual scenario of human-centric search to demonstrate the proposed approach. The performance and feasibility are revealed by experiments conducted on data collected from open social networks (e.g., Facebook, Twitter, etc.) over the past few years, covering around 500 users and 8,000,000 shared contents.
31

Liao, Wei-Ju, and 廖韋茹. "Adjacent Difference Value (ADV) Method for Dynamic Segmentation in Time Series Data Search." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/shpc5x.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Master's thesis
National Taiwan University of Science and Technology
Department of Industrial Management
106
Heuristic Ordered Time series Symbolic Aggregate approXimation (HOT SAX) is a well-known symbolic representation approach used to detect anomalies in time series data. Time series anomalies are unusual patterns which do not follow the tendency of the data; in other words, they are the subsequences that are least similar to the other subsequences. However, time series data usually cover a long interval of time, and handling the large amount of data is the first difficulty to be overcome. Since HOT SAX reduces the dimensionality of the data and then searches for anomalies in the reduced data with two heuristic algorithms, it can detect anomalies efficiently. Nevertheless, because HOT SAX searches for anomalies by sliding a fixed-length window over the entire data, the detected anomalies change when a different window length is set, and the optimal window length is hard to determine. The determination of the optimal sliding-window length is therefore the main concern with HOT SAX. To address it, this research proposes the Adjacent Difference Value (ADV) segmentation method, which segments the time series dynamically without setting any parameter. Essentially, ADV partitions the data into multiple subsequences of different lengths based on the transitions between data points. To complete the anomaly detection, FastDTW is used to compare the similarity between every pair of subsequences. The experiments demonstrate that ADV is a simple and efficient method, and the comparison with HOT SAX shows that ADV is genuinely useful and can detect anomalies with better computational efficiency.
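The abstract does not give the exact ADV cutting rule, so the sketch below encodes one plausible reading: cut the series wherever the adjacent difference exceeds a multiple of the mean adjacent difference. Treat this as an illustration of difference-driven dynamic segmentation under that assumption, not as the thesis's algorithm (the `factor` default in particular is an invented knob).

```python
def adv_segment(series, factor=2.0):
    """Split a series at points where the absolute difference between
    adjacent values exceeds `factor` times the mean absolute adjacent
    difference.

    One plausible reading of adjacent-difference segmentation; the
    thesis's exact ADV rule is not specified in the abstract.
    """
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    if not diffs:
        return [series]
    threshold = factor * sum(diffs) / len(diffs)
    segments, start = [], 0
    for i, d in enumerate(diffs):
        if d > threshold:  # large jump: cut between positions i and i+1
            segments.append(series[start:i + 1])
            start = i + 1
    segments.append(series[start:])
    return segments
```

Each resulting variable-length subsequence can then be compared against the others with FastDTW, as the abstract describes.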
32

Ruan, Zheng-Zhi, and 阮正治. "Using Genetic Algorithms to Search for the Structure Change of Non-linear Time Series." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/91317582045411619002.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
33

Juan, Cheng-Chi, and 阮正治. "Using Genetic Algorithms to Search for the Structure Change of Non-linear Time Series." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/17420627073865679811.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
34

Cheng, Chuam-Yao, and 鄭傳耀. "A Novel Stock Index Forecasting Model Based on Improved Fuzzy Time Series and Group Search Optimizer." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/7bcrjj.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Master's thesis
National Chin-Yi University of Technology
Department of Computer Science and Information Engineering
Academic year 103
In this paper, we propose a new improved fuzzy time series method for forecasting the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index). The method combines the traditional fuzzy time series with ratio values and an improved group search optimizer. First, we use the traditional fuzzy time series, together with linguistic-variable analysis from fuzzy logic theory, to model the uncertain data as fuzzified variables. Next, we calculate the ratio between adjacent data points in the time series to obtain the ratio values. Each ratio value corresponds to a fuzzy logical relationship (FLR), so each fuzzy logical relationship has a different value, and the fuzzy logical relationships form fuzzy logical relationship groups (FLRGs). From this information we establish a fuzzy rule table for forecasting; during the forecasting process, the algorithm constantly adjusts the interval ranges. To verify the efficacy of the proposed method, we take the TAIEX forecast as the experiment. The experimental results show that the proposed method achieves more accurate predictions than the other methods and is also computationally simpler.
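The ratio step described above is simple to state concretely: each ratio relates an observation to its predecessor, and it is these values that get fuzzified into fuzzy logical relationships. A tiny sketch (the function name is illustrative):

```python
def ratio_values(series):
    """Ratio of each observation to its predecessor; these values are
    what get fuzzified to form the fuzzy logical relationships (FLRs)."""
    return [b / a for a, b in zip(series, series[1:])]

# e.g. an index rising 10% per period yields constant ratio values
ratios = ratio_values([100.0, 110.0, 121.0])
```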
35

Fan, Cheng-Chung, and 范正忠. "Similarity Search in Time-series Data by HAAR Wavelet Transform--Taking TAIWAN Stock Market for Example." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/77843509749021335582.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Information Management
Academic year 89
Time series data searching and mining has become a growing trend in temporal/time-series database management systems, so how to store and retrieve time-series data has become an important research topic. To increase performance, many proposed approaches use the Discrete Fourier Transform (DFT) to transform a time series into a lower-dimensional feature vector. In this thesis, we propose an approach to index time series data by the Haar Wavelet Transform (HWT). In the proposed approach, a time series is transformed into a feature vector by the HWT, and the first 8 to 11 HWT coefficients are selected as an index. The indices are then stored in an SR-Tree. To compare the performance of the proposed approach with that of the DFT approach, we use two types of queries, range queries and nearest-neighbor queries, in experiments on a set of time series data. The experimental results show that the HWT-based approach outperforms the DFT-based approach in terms of precision and number of disk accesses.
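The indexing idea, keeping only the first few Haar coefficients as a low-dimensional feature vector, can be sketched as follows. This is a sketch assuming power-of-two lengths; the coefficient ordering and averaging convention shown are one common choice, not necessarily the thesis's exact normalization:

```python
import numpy as np

def haar_coeffs(series):
    """Full Haar wavelet decomposition: returns coefficients ordered
    [overall average, coarsest detail, ..., finest details].
    Input length must be a power of two."""
    x = np.asarray(series, dtype=float)
    details = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / 2.0   # pairwise averages
        det = (x[0::2] - x[1::2]) / 2.0   # pairwise differences
        details.append(det)
        x = avg
    # x now holds the single overall average; coarse coefficients first
    return np.concatenate([x] + details[::-1])

def haar_index(series, k=8):
    """Low-dimensional index: keep only the first k Haar coefficients."""
    return haar_coeffs(series)[:k]
```

The truncated vectors are then inserted into a spatial index such as the SR-Tree, and range/nearest-neighbor queries are answered in the reduced space.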
36

Syu, Yang, and 許揚. "Search Based Approach for Dynamic QoS Time Series Forecasting of Web Services by Using Genetic Programming." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/5zyegg.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Doctoral dissertation
National Taipei University of Technology
Department of Computer Science and Information Engineering
Academic year 105
Currently, many service operations performed in service-oriented software engineering (SOSE), such as service composition and discovery, depend heavily on Quality of Service (QoS). Due to factors such as varying loads, the real values of some dynamic QoS attributes (e.g., response time and availability) change over time. However, most existing QoS-based studies and approaches do not consider such changes; instead, they rely on the unrealistic, static QoS information provided by service providers, which may seriously impair their outcomes. To predict dynamic QoS values, our objective is to devise an approach that generates a predictor performing QoS forecasting based on past QoS observations. We use genetic programming (GP), a type of evolutionary computation used in search-based software engineering (SBSE), to forecast the QoS attributes of web services. In our proposed approach, GP searches for and evolves expression-based, one-step-ahead QoS predictors. To evaluate the performance (accuracy) of our GP-based approach, we also implement most current time series forecasting methods; a comparison between our approach and these methods is discussed in the context of real-world QoS data. Compared with common time series forecasting methods, our approach is found to be the most suitable and stable solution for the defined QoS forecasting problem. In addition to the numerical results of the experiments, we analyze and describe in detail the advantages and benefits of using GP for QoS forecasting. Possible threats to the validity of the GP approach and of its use in SBSE are also discussed and evaluated. This dissertation thoroughly demonstrates that under realistic conditions (with real-world QoS data), the proposed GP-based QoS forecasting approach provides effective, efficient, and accurate forecasting and can be considered an instance of SBSE.
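A core ingredient of any such search-based approach is the fitness function that scores a candidate predictor by its one-step-ahead error over the QoS history. A minimal sketch, in which the candidate "expressions" and window size are toy stand-ins for what GP would actually evolve:

```python
import numpy as np

def one_step_rmse(predictor, history, window=3):
    """Fitness for a candidate predictor: RMSE of one-step-ahead
    forecasts, each made from a sliding window of past observations."""
    errs = []
    for t in range(window, len(history)):
        pred = predictor(history[t - window:t])
        errs.append((pred - history[t]) ** 2)
    return float(np.sqrt(np.mean(errs)))

# Two toy candidate "expressions" (hypothetical stand-ins for evolved trees):
last_value = lambda w: w[-1]        # naive persistence
mean_value = lambda w: np.mean(w)   # moving average

qos = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6])  # rising response times
```

For this steadily trending series, persistence scores a lower (better) fitness than the lagging moving average; GP's selection pressure works on exactly this kind of comparison.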
37

Kusiak, Caroline. "Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data." 2018. https://scholarworks.umass.edu/masters_theses_2/708.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Dengue fever affects over 390 million people annually worldwide and is of particular concern in Southeast Asia where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to Public Health officials, however many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google Search trends can improve disease predictions in settings with severely underreported data. We compare penalized regression approaches to seasonal baseline models and illustrate that incorporation of search data can improve prediction error. This builds on previous research showing that search data and recent surveillance data together can be used to create accurate forecasts for diseases such as influenza and dengue fever. This work shows that even in settings where timely surveillance data is not available, using search data in real-time can produce more accurate short-term forecasts than a seasonal baseline prediction. However, forecast accuracy degrades the further into the future the forecasts go. The relative accuracy of these forecasts compared to a seasonal average forecast varies depending on location. Overall, these data and models can improve short-term public health situational awareness and should be incorporated into larger real-time forecasting efforts.
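The modeling shape described, regressing next-period case counts on lagged surveillance data plus a search-trend covariate with a penalty on the coefficients, can be sketched with a closed-form ridge fit. All data below is synthetic and the thesis uses LASSO-style penalties on real Google search and surveillance data, so this is only an illustration of the setup:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Toy setup: two years of weekly counts with annual seasonality
rng = np.random.default_rng(0)
t = np.arange(104)
cases = 50 + 20 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 2, len(t))
search = cases + rng.normal(0, 1, len(t))   # search volume tracks cases

X = np.column_stack([cases[:-1], search[:-1]])  # features at week t
y = cases[1:]                                   # target at week t+1
w = ridge_fit(X, y, lam=0.1)
pred = X @ w                                    # one-step-ahead forecasts
```

In the real-time setting, `search` would be available before the delayed surveillance counts, which is what makes the covariate valuable.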
38

Lutz, Ronny Bernd. "The search for substellar companions to subdwarf B stars in connection with evolutionary aspects." Doctoral thesis, 2011. http://hdl.handle.net/11858/00-1735-0000-0006-B53E-1.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
39

Liu, Ta-jen, and 劉大仁. "Speed-up Algorithms for Similarity Searches in Time Series Data and It’s Application to Content-based Music Retrieval." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/69072558927803968595.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Master's thesis
National Kaohsiung First University of Science and Technology
Institute of Computer and Communication Engineering
Academic year 92
In this thesis, two algorithms for similarity search in time-series data, applied to content-based music retrieval, are proposed. A content-based music retrieval (CBMR) system is an innovative way of retrieving songs by melodies rather than by keywords. Inspired by previous research, this study aims at enhancing the performance of CBMR systems in both retrieval accuracy and execution speed by using dynamic time warping with two-level filtering techniques. The similarity measure plays an important role in processing time-series data, which is known to incur high computational complexity due to its high-dimensional characteristics. For this purpose, we propose two filtering methods to reduce the dimensionality of time-series data and speed up the computation of similarity between two time series. To verify the effectiveness of the proposed methods, an application to CBMR is also presented. Experimental results show that the proposed methods achieve good performance in terms of both computational complexity and retrieval accuracy.
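Dynamic time warping, the similarity measure that the two filtering levels accelerate, is the classic quadratic dynamic program; the cost of DTW is exactly why lower-dimensional filters are applied first. A textbook implementation (not the thesis's filtered variant):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences:
    full O(len(a) * len(b)) dynamic program, absolute-difference cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of match / insertion / deletion
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because DTW tolerates local stretching, a hummed query that holds a note longer than the stored melody can still match exactly, e.g. `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero.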
40

McIlhagga, William H. "Serial correlations and 1/f power spectra in visual search reaction times." 2008. http://hdl.handle.net/10454/4761.

Full text source
Citation styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
In a visual search experiment, the subject must find a target item hidden in a display of other items, and their performance is measured by their reaction time (RT). Here I look at how visual search reaction times are correlated with past reaction times. Target-absent RTs (i.e. RTs to displays that have no target) are strongly correlated with past target-absent RTs and, treated as a time series, have a 1/f power spectrum. Target-present RTs, on the other hand, are effectively uncorrelated with past RTs. A model for visual search is presented which generates search RTs with this pattern of correlations and power spectra. In the model, search is conducted by matching search items up with "categorizers," which take a certain time to categorize each item as target or distractor; the RT is the sum of categorization times. The categorizers are drawn at random from a pool of active categorizers. After each search, some of the categorizers in the active pool are replaced with categorizers drawn from a larger population of unused categorizers. The categorizers that are not replaced are responsible for the RT correlations and the 1/f power spectrum.
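The pool-replacement mechanism is easy to simulate: because only a small fraction of the active pool turns over after each trial, the pool composition drifts slowly, consecutive RTs share categorizers, and positive serial correlation emerges. A toy sketch (parameter values are illustrative, not fitted to the paper's data):

```python
import numpy as np

def simulate_rts(n_trials=1000, n_items=10, pool_size=50, n_replace=2, rng=None):
    """Each trial samples n_items categorizers from the active pool and
    sums their categorization times to get the RT; after each trial,
    n_replace pool members are swapped for fresh ones, so the pool
    composition (and hence the RT level) drifts slowly across trials."""
    rng = rng or np.random.default_rng(7)
    pool = rng.exponential(100.0, pool_size)          # categorization times (ms)
    rts = np.empty(n_trials)
    for t in range(n_trials):
        chosen = rng.choice(pool_size, n_items, replace=False)
        rts[t] = pool[chosen].sum()                   # RT = sum of item times
        refresh = rng.choice(pool_size, n_replace, replace=False)
        pool[refresh] = rng.exponential(100.0, n_replace)
    return rts

rts = simulate_rts()
lag1 = np.corrcoef(rts[:-1], rts[1:])[0, 1]           # positive serial correlation
```

Slower pool turnover (smaller `n_replace`) strengthens the correlations; the paper's full model additionally accounts for the 1/f shape of the spectrum.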

To the bibliography