Journal articles on the topic 'Noisy Time Series Clustering'

1. Tkachenko, Anastasiia Yevhenivna, Liudmyla Olehivna Kyrychenko, and Tamara Anatoliivna Radyvylova. "Clustering Noisy Time Series." System technologies 3, no. 122 (October 10, 2019): 133–39. http://dx.doi.org/10.34185/1562-9945-3-122-2019-15.

Abstract:
One of the urgent tasks of machine learning is the clustering of objects. Time series clustering is used both as an independent research technique and as part of more complex data mining methods, such as rule discovery, classification, and anomaly detection. A comparative analysis of clustering noisy time series is carried out. The clustering sample contained time series of various types, among which there were atypical objects. Clustering was performed by the k-means and DBSCAN methods using various distance functions for time series. A numerical experiment investigated the application of the k-means and DBSCAN methods to model time series with additive white noise. The sample consisted of m time series of various types: harmonic realizations, parabolic realizations, and “bursts”. DBSCAN and k-means with different distance functions were compared; the best results were shown by the DBSCAN method with the Euclidean metric and the CID function. Analysis of the clustering results reveals the key difference between the methods: if the number of clusters can be determined and atypical time series need not be separated, the k-means method shows fairly good results; if there is no information on the number of clusters and atypical series must be isolated, the DBSCAN method is advisable.
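
To make the comparison concrete, here is a minimal sketch (assuming the abstract's "CID function" refers to the complexity-invariant distance of Batista et al.) that clusters noisy synthetic series of the three named types with k-means and with DBSCAN over a precomputed CID matrix; the series shapes, noise level, and eps heuristic are illustrative guesses, not the authors' exact setup:

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

def cid(x, y):
    # Complexity-invariant distance: Euclidean distance scaled by the
    # ratio of the two series' complexity estimates (Batista et al.).
    ce_x = np.sqrt(np.sum(np.diff(x) ** 2))
    ce_y = np.sqrt(np.sum(np.diff(y) ** 2))
    return np.linalg.norm(x - y) * max(ce_x, ce_y) / max(min(ce_x, ce_y), 1e-12)

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 100)
series = np.vstack(
    [np.sin(t) + 0.2 * rng.standard_normal(100) for _ in range(10)]            # harmonic
    + [(t - np.pi) ** 2 / 5 + 0.2 * rng.standard_normal(100) for _ in range(10)]  # parabolic
    + [np.where(rng.random(100) > 0.95, 5.0, 0.0)
       + 0.2 * rng.standard_normal(100) for _ in range(10)]                    # "bursts"
)

# k-means needs the number of clusters up front; DBSCAN instead labels
# low-density (atypical) series as -1.
km_labels = KMeans(n_clusters=3, n_init=10).fit_predict(series)

n = len(series)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = cid(series[i], series[j])
db_labels = DBSCAN(eps=np.percentile(dist[dist > 0], 10), min_samples=3,
                   metric="precomputed").fit_predict(dist)
print(km_labels, db_labels, sep="\n")
```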

2. Yelibi, Lionel, and Tim Gebbie. "Agglomerative likelihood clustering." Journal of Statistical Mechanics: Theory and Experiment 2021, no. 11 (November 1, 2021): 113408. http://dx.doi.org/10.1088/1742-5468/ac3661.

Abstract:
We consider the problem of fast time-series data clustering. Building on previous work modeling the correlation-based Hamiltonian of spin variables, we present an updated fast, inexpensive agglomerative likelihood clustering algorithm (ALC). The method replaces the optimized genetic-algorithm-based approach (f-SPC) with an agglomerative recursive merging framework inspired by previous work in econophysics and community detection. The method is tested on noisy synthetic correlated time-series datasets with a built-in cluster structure to demonstrate that the algorithm produces meaningful non-trivial results. We apply it to time-series datasets as large as 20,000 assets and argue that ALC can reduce computation time and resource usage for large-scale time-series clustering while remaining serial, and hence has no obvious parallelization requirement. Because the algorithm requires no prior information about the number of clusters, it can be an effective choice for state detection and online learning in a fast nonlinear data environment.

3. Zhang, Zheng, Ping Tang, Lianzhi Huo, and Zengguang Zhou. "MODIS NDVI time series clustering under dynamic time warping." International Journal of Wavelets, Multiresolution and Information Processing 12, no. 05 (September 2014): 1461011. http://dx.doi.org/10.1142/s0219691314610116.

Abstract:
For MODIS NDVI time series with cloud noise and time distortion, we propose an effective time series clustering framework including a similarity measure, prototype calculation, a clustering algorithm, and cloud noise handling. The core of this framework is the dynamic time warping (DTW) distance and its corresponding averaging method, DTW barycenter averaging (DBA). We used 12 years of MODIS NDVI time series to perform annual land-cover clustering in the Poyang Lake Wetlands. The experimental results show that our method performs better than classic clustering based on the ordinary Euclidean distance.
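
The similarity measure at the core of this framework is standard enough to sketch; below is a textbook dynamic-programming DTW in plain NumPy (the DBA averaging and cloud-noise handling from the paper are not reproduced, and the 46-step profile length is only an assumption about MODIS compositing):

```python
import numpy as np

def dtw(x, y):
    # Classic O(len(x) * len(y)) dynamic-programming DTW distance.
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two NDVI-like annual profiles, one shifted in time: DTW stays small
# where Euclidean distance would be inflated by the phase shift.
t = np.linspace(0, 1, 46)
a = np.exp(-((t - 0.5) ** 2) / 0.02)
b = np.exp(-((t - 0.6) ** 2) / 0.02)
print(dtw(a, b), np.linalg.norm(a - b))
```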

4. Zhang, Yunsheng, Qingzhang Shi, Jiawei Zhu, Jian Peng, and Haifeng Li. "Time Series Clustering with Topological and Geometric Mixed Distance." Mathematics 9, no. 9 (May 6, 2021): 1046. http://dx.doi.org/10.3390/math9091046.

Abstract:
Time series clustering is an essential ingredient of unsupervised learning techniques. It provides an understanding of the intrinsic properties of data upon exploiting similarity measures. Traditional similarity-based methods usually consider local geometric properties of raw time series or the global topological properties of time series in the phase space. In order to overcome their limitations, we put forward a time series clustering framework, referred to as time series clustering with Topological-Geometric Mixed Distance (TGMD), which jointly considers local geometric features and global topological characteristics of time series data. More specifically, persistent homology is employed to extract topological features of time series and to compute topological similarities among persistence diagrams. The geometric properties of raw time series are captured by using shape-based similarity measures such as Euclidean distance and dynamic time warping. The effectiveness of the proposed TGMD method is assessed by extensive experiments on synthetic noisy biological and real time series data. The results reveal that the proposed mixed distance-based similarity measure can lead to promising results and that it performs better than standard time series analysis techniques that consider only topological or geometrical similarity.
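
The mixing step itself is easy to illustrate. Here is a sketch under the assumption that the topological distances (e.g. between persistence diagrams) and the geometric distances (e.g. DTW or Euclidean) have already been computed as square matrices; the convex weight alpha and the min-max normalization are my assumptions, not necessarily the paper's exact TGMD recipe:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def mixed_distance(d_topo, d_geo, alpha=0.5):
    # Convex combination of two min-max-normalized distance matrices.
    def norm(d):
        span = d.max() - d.min()
        return (d - d.min()) / span if span > 0 else d
    return alpha * norm(d_topo) + (1 - alpha) * norm(d_geo)

# Toy symmetric distance matrices standing in for the real ones.
rng = np.random.default_rng(1)
def toy_dist(n=20):
    d = np.abs(rng.standard_normal((n, n)))
    d = (d + d.T) / 2
    np.fill_diagonal(d, 0.0)
    return d

D = mixed_distance(toy_dist(), toy_dist(), alpha=0.4)
labels = fcluster(linkage(squareform(D, checks=False), method="average"),
                  t=3, criterion="maxclust")
```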

5. Zhang, Zheng, Ping Tang, Weixiong Zhang, and Liang Tang. "Satellite Image Time Series Clustering via Time Adaptive Optimal Transport." Remote Sensing 13, no. 19 (October 6, 2021): 3993. http://dx.doi.org/10.3390/rs13193993.

Abstract:
Satellite Image Time Series (SITS) have become more accessible in recent years, and SITS analysis has attracted increasing research interest. Given that labeled SITS training samples are time- and effort-consuming to acquire, clustering or unsupervised analysis methods need to be developed. The similarity measure is critical for clustering; however, currently established methods, represented by Dynamic Time Warping (DTW), still exhibit several issues when coping with SITS, such as pathological alignment, sensitivity to spike noise, and limited capacity. In this paper, we introduce a new time series similarity measure named time adaptive optimal transport (TAOT) for SITS clustering. TAOT inherits several promising properties of optimal transport for comparing time series. Statistical and visual results on two real SITS datasets with two different settings demonstrate that TAOT can effectively alleviate the issues of DTW and further improve clustering accuracy. Thus, TAOT can serve as a usable tool to explore the potential of precious SITS data.
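
TAOT itself is not spelled out in the abstract, but the underlying optimal-transport idea can be sketched with SciPy's 1-D Wasserstein distance, treating time steps as support points and normalized reflectance values as mass; this plain OT distance is only a rough stand-in for the paper's time-adaptive formulation:

```python
import numpy as np
from scipy.stats import wasserstein_distance

t = np.arange(23.0)  # e.g. ~16-day composites over one year (an assumption)
x = np.exp(-((t - 9) ** 2) / 8.0)
y = np.exp(-((t - 12) ** 2) / 8.0)
y[5] += 3.0  # spike noise: transport spreads its cost rather than
             # forcing a hard DTW-style alignment of the spike

d = wasserstein_distance(t, t, u_weights=x / x.sum(), v_weights=y / y.sum())
print(d)
```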

6. Jacob, Rinku, K. P. Harikrishnan, R. Misra, and G. Ambika. "Weighted recurrence networks for the analysis of time-series data." Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 475, no. 2221 (January 2019): 20180256. http://dx.doi.org/10.1098/rspa.2018.0256.

Abstract:
Recurrence networks (RNs) have become very popular tools for the nonlinear analysis of time-series data. They are unweighted and undirected complex networks constructed from time series with specific criteria. In this work, we propose a method to construct a ‘weighted recurrence network’ from a time series and show that it can reveal useful information regarding the structure of a chaotic attractor which the usual unweighted RN cannot provide. In particular, a network measure, the node strength distribution, from every chaotic attractor follows a power law (with an exponential cut-off at the tail) with an index characteristic of the fractal structure of the attractor. This provides a new class among complex networks to which networks from all standard chaotic attractors are found to belong. Two other prominent network measures, the clustering coefficient and the characteristic path length, are generalized, and their utility in discriminating chaotic dynamics from noise is highlighted. As an application of the proposed measure, we present an analysis of variable-star light curves whose behaviour has been reported to be strange non-chaotic in a recent study. Our numerical results indicate that the weighted recurrence network and the associated measures can become potentially important tools for the analysis of short and noisy time series from the real world.
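
A minimal sketch of the construction, assuming a standard delay embedding, a percentile recurrence threshold, and inverse-distance edge weights (the paper's exact weighting rule may differ):

```python
import numpy as np

def embed(x, dim=3, tau=5):
    # Delay embedding: rows are reconstructed state vectors.
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

rng = np.random.default_rng(9)
x = np.sin(np.linspace(0, 30, 500)) + 0.1 * rng.standard_normal(500)
V = embed(x)
D = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=-1)

eps = np.percentile(D[D > 0], 5)   # recurrence threshold (a common heuristic)
W = np.zeros_like(D)
mask = (D < eps) & (D > 0)
W[mask] = 1.0 / D[mask]            # inverse-distance edge weights (an assumption)
strength = W.sum(axis=1)           # node strength; the paper reports a power law
```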

7. D’Urso, Pierpaolo, Livia De Giovanni, Riccardo Massari, and Dario Di Lallo. "Noise fuzzy clustering of time series by autoregressive metric." METRON 71, no. 3 (November 2013): 217–43. http://dx.doi.org/10.1007/s40300-013-0024-x.

8. Huang, Mengxing, Qili Bao, Yu Zhang, and Wenlong Feng. "A Hybrid Algorithm for Forecasting Financial Time Series Data Based on DBSCAN and SVR." Information 10, no. 3 (March 7, 2019): 103. http://dx.doi.org/10.3390/info10030103.

Abstract:
Financial prediction is an important research field in financial time series data mining. Clustering massive financial time series data has always been a problem: conventional clustering algorithms are essentially designed for static data and are therefore impractical for time series, which results in poor clustering accuracy in several financial forecasting models. In this paper, a new hybrid algorithm is proposed based on Optimization of Initial Points and Variable-Parameter Density-Based Spatial Clustering of Applications with Noise (OVDBCSAN) and support vector regression (SVR). The initial points and the global DBSCAN parameters ε and MinPts are optimized so that datasets of different densities can be clustered with appropriate parameters. The algorithm can find a large number of similar classes and then establish regression prediction models. It was tested extensively using real-world time series datasets from Ping An Bank, the Shanghai Stock Exchange, and the Shenzhen Stock Exchange to evaluate accuracy. The evaluation showed that our approach has major potential in clustering massive financial time series data, thereby improving the accuracy of predictions of stock prices and financial indexes.
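
The OVDBCSAN optimization is not specified in the abstract, but a common related heuristic is to pick ε from the sorted k-nearest-neighbor distance curve with k = MinPts; the crude largest-jump knee detector below is my stand-in, not the authors' procedure:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def pick_eps(X, min_pts=5):
    # Sorted k-distance curve; its "knee" is a common choice of eps.
    nn = NearestNeighbors(n_neighbors=min_pts).fit(X)
    dist, _ = nn.kneighbors(X)
    k_dist = np.sort(dist[:, -1])
    return k_dist[np.argmax(np.diff(k_dist))]  # crude knee: largest jump

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.8, (100, 2))])
eps = pick_eps(X, min_pts=5)
labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
```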

9. Li, Haibo, Cheng Wang, Gengqian Wei, and Sina Xu. "Mining the Coopetition Relationship of Urban Public Traffic Lines Based on Time Series Correlation." Journal of Physics: Conference Series 2138, no. 1 (December 1, 2021): 012005. http://dx.doi.org/10.1088/1742-6596/2138/1/012005.

Abstract:
Along with the evolution of passenger flows within cities, the coordination between public traffic lines should be continuously optimized with respect to the spatial distribution of the flow, even though the lines were planned well when first constructed. Determining the coopetition between bus lines is critical for continuously optimizing a transit network. This study proposes MCBTC, a method for Mining the Coopetition relationship between Bus lines based on Time series Correlation of passenger flow. First, noisy, inconsistent, or missing data are eliminated to obtain a passenger flow time series, and a merging algorithm is proposed to extract the Line Passenger Flow Time Series (LPFTS) by merging the passenger flows of adjacent buses from the same line. Then, a clustering algorithm is proposed to calculate the positive and negative correlation sequence sets, which represent the competition and cooperation relationships, respectively. The MCBTC method has been tested on a practical data set, and the results show that it is very promising.

10. Kuschnerus, Mieke, Roderik Lindenbergh, and Sander Vos. "Coastal change patterns from time series clustering of permanent laser scan data." Earth Surface Dynamics 9, no. 1 (February 19, 2021): 89–103. http://dx.doi.org/10.5194/esurf-9-89-2021.

Abstract:
Sandy coasts are constantly changing environments governed by complex, interacting processes. Permanent laser scanning is a promising technique to monitor such coastal areas and to support analysis of geomorphological deformation processes. This novel technique delivers 3-D representations of the coast at hourly temporal and centimetre spatial resolution and allows us to observe small-scale changes in elevation over extended periods of time. These observations have the potential to improve understanding and modelling of coastal deformation processes. However, to be of use to coastal researchers and coastal management, an efficient way to find and extract deformation processes from the large spatiotemporal data set is needed. To enable automated data mining, we extract time series of surface elevation and use unsupervised learning algorithms to derive a partitioning of the observed area according to change patterns. We compare three well-known clustering algorithms (k-means clustering, agglomerative clustering and density-based spatial clustering of applications with noise; DBSCAN), apply them on the set of time series and identify areas that undergo similar evolution during 1 month. We test if these algorithms fulfil our criteria for suitable clustering on our exemplary data set. The three clustering methods are applied to time series over 30 d extracted from a data set of daily scans covering about 2 km of coast in Kijkduin, the Netherlands. A small section of the beach, where a pile of sand was accumulated by a bulldozer, is used to evaluate the performance of the algorithms against a ground truth. The k-means algorithm and agglomerative clustering deliver similar clusters, and both allow us to identify a fixed number of dominant deformation processes in sandy coastal areas, such as sand accumulation by a bulldozer or erosion in the intertidal area. The level of detail found with these algorithms depends on the choice of the number of clusters k. The DBSCAN algorithm finds clusters for only about 44 % of the area and turns out to be more suitable for the detection of outliers, caused, for example, by temporary objects on the beach. Our study provides a methodology to efficiently mine a spatiotemporal data set for predominant deformation patterns with the associated regions where they occur.
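
This three-way comparison is easy to reproduce in miniature with scikit-learn; the toy per-location elevation series below (accretion, erosion, stable, plus noise) are placeholders for the real laser-scan time series, and the eps/min_samples values are illustrative:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans

rng = np.random.default_rng(2)
days = np.arange(30)
X = np.vstack(
    [0.01 * days + 0.02 * rng.standard_normal(30) for _ in range(50)]     # accretion
    + [-0.01 * days + 0.02 * rng.standard_normal(30) for _ in range(50)]  # erosion
    + [0.02 * rng.standard_normal(30) for _ in range(50)]                 # stable
)

print(KMeans(n_clusters=3, n_init=10).fit_predict(X)[:5])
print(AgglomerativeClustering(n_clusters=3).fit_predict(X)[:5])
print(DBSCAN(eps=0.15, min_samples=5).fit_predict(X)[:5])  # -1 marks outliers
```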

11. Kim, Dae-Won, Pavlos Protopapas, and Rahul Dave. "De-Trending Time Series Data for Variability Surveys." Proceedings of the International Astronomical Union 4, S253 (May 2008): 370–73. http://dx.doi.org/10.1017/s1743921308026677.

Abstract:
We present an algorithm for the removal of trends in time series data. The trends could be caused by various systematic and random noise sources such as cloud passages, change of airmass, or CCD noise. In order to determine the trends, we select template stars based on a hierarchical clustering algorithm. The hierarchy tree is constructed using the similarity matrix of the stars' light curves, whose elements are the Pearson correlation values. A new bottom-up merging algorithm is developed to extract clusters of template stars that are highly correlated among themselves and may thus be used to identify the trends. We then use multiple linear regression to de-trend all individual light curves based on these determined trends. Experimental results with simulated light curves which contain artificial trends and events are presented. We also applied our algorithm to TAOS (Taiwan-American Occultation Survey) wide-field data observed with a 0.5 m f/1.9 telescope equipped with a 2k-by-2k CCD. With our approach, we successfully removed trends and increased the signal-to-noise ratio in TAOS light curves.
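
A loose sketch of the two stages: hierarchical clustering of light curves on a 1 − Pearson-correlation distance to find a mutually correlated template set, then least-squares removal of the template trend from each curve. The paper's specific bottom-up merging rule is replaced here by ordinary average linkage, so treat this as an approximation:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(3)
trend = np.cumsum(rng.standard_normal(200)) * 0.05
curves = trend + 0.1 * rng.standard_normal((40, 200))  # 40 stars, one shared trend

# Cluster on correlation distance to find template stars.
D = 1.0 - np.corrcoef(curves)
np.fill_diagonal(D, 0.0)
labels = fcluster(linkage(squareform(D, checks=False), "average"),
                  t=2, criterion="maxclust")

# Template = mean of the largest (most mutually correlated) cluster.
biggest = np.bincount(labels).argmax()
template = curves[labels == biggest].mean(axis=0)

# De-trend each light curve by linear regression against the template.
A = np.column_stack([template, np.ones_like(template)])
detrended = [c - A @ np.linalg.lstsq(A, c, rcond=None)[0] for c in curves]
```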

12. Qin, Xingli, Jie Yang, Pingxiang Li, Weidong Sun, and Wei Liu. "A Novel Relational-Based Transductive Transfer Learning Method for PolSAR Images via Time-Series Clustering." Remote Sensing 11, no. 11 (June 6, 2019): 1358. http://dx.doi.org/10.3390/rs11111358.

Abstract:
The combination of transfer learning and remote sensing image processing can effectively improve the automation of information extraction from remote sensing time series. However, in processing polarimetric synthetic aperture radar (PolSAR) time-series images, existing transfer learning methods often cannot make full use of the time-series information of the images and rely too heavily on labeled samples in the target domain. Furthermore, the speckle noise inherent in synthetic aperture radar (SAR) imagery aggravates the difficulty of manually selecting labeled samples, so these methods have difficulty meeting the processing requirements of large data volumes and high efficiency. In view of these problems, and of the spatio-temporal relational knowledge of objects in time-series images, this paper introduces the theory of time-series clustering and proposes a new three-phase time-series clustering algorithm. By making full use of the inherent characteristics of PolSAR images, this algorithm can accurately transfer the labels of source-domain samples to those samples that have not changed over the whole time series, without relying on labeled samples in the target domain, thereby realizing transductive sample label transfer for PolSAR time-series images. Experiments were carried out on three different sets of PolSAR time-series images, and the proposed method was compared with two existing methods. The experimental results showed that the transfer precision of the proposed method reaches a high level for different data and different objects and that it performs significantly better than the existing methods. With strong reliability and practicability, the proposed method can provide a new solution for rapid information extraction from remote sensing image time series.

13. DelSole, Timothy, and Michael K. Tippett. "Comparing climate time series – Part 1: Univariate test." Advances in Statistical Climatology, Meteorology and Oceanography 6, no. 2 (October 12, 2020): 159–75. http://dx.doi.org/10.5194/ascmo-6-159-2020.

Abstract:
This paper proposes a new approach to detecting and describing differences in stationary processes. The approach is equivalent to comparing auto-covariance functions or power spectra. The basic idea is to fit an autoregressive model to each time series and then test whether the model parameters are equal. The likelihood ratio test for this hypothesis has appeared in the statistics literature, but the resulting test depends on maximum likelihood estimates, which are biased, neglect differences in noise parameters, and utilize sampling distributions that are valid only for large sample sizes. This paper derives a likelihood ratio test that corrects for bias, detects differences in noise parameters, and can be applied to small samples. Furthermore, if a significant difference is detected, we propose new methods to diagnose and visualize those differences. Specifically, the test statistic can be used to define a “distance” between two autoregressive processes, which in turn can be used for clustering analysis in multi-model comparisons. A multidimensional scaling technique is used to visualize the similarities and differences between time series. We also propose diagnosing differences in stationary processes by identifying initial conditions that optimally separate predictable responses. The procedure is illustrated by comparing simulations of an Atlantic Meridional Overturning Circulation (AMOC) index from 10 climate models in Phase 5 of the Coupled Model Intercomparison Project (CMIP5). Significant differences between most AMOC time series are detected. The main exceptions are time series from CMIP models from the same institution. Differences in stationary processes are explained primarily by differences in the mean square error of 1-year predictions and by differences in the predictability (i.e., R-square) of the associated autoregressive models.
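
A sketch of the pipeline's skeleton: fit a low-order autoregressive model to each series, measure dissimilarity between the fitted parameter vectors, and embed with multidimensional scaling. The naive Euclidean parameter gap below is a placeholder for the paper's bias-corrected likelihood ratio statistic:

```python
import numpy as np
from sklearn.manifold import MDS
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(4)

def ar1(phi, n=500):
    # Simulate an AR(1) process with unit-variance innovations.
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

series = [ar1(0.2) for _ in range(5)] + [ar1(0.8) for _ in range(5)]
params = np.array([AutoReg(s, lags=2).fit().params for s in series])

# Pairwise "distances" between AR parameter vectors, then a 2-D MDS map.
D = np.linalg.norm(params[:, None, :] - params[None, :, :], axis=-1)
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)
```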

14. An, Lingling, and R. W. Doerge. "Dynamic Clustering of Gene Expression." ISRN Bioinformatics 2012 (October 16, 2012): 1–12. http://dx.doi.org/10.5402/2012/537217.

Abstract:
It is well accepted that genes are simultaneously involved in multiple biological processes and that genes are coordinated over the duration of such events. Unfortunately, clustering methodologies that group genes for the purpose of novel gene discovery fail to acknowledge the dynamic nature of biological processes and provide static clusters, even when the expression of genes is assessed across time or developmental stages. By taking advantage of techniques and theories from time frequency analysis, periodic gene expression profiles are dynamically clustered based on the assumption that different spectral frequencies characterize different biological processes. A two-step cluster validation approach is proposed to statistically estimate both the optimal number of clusters and to distinguish significant clusters from noise. The resulting clusters reveal coordinated coexpressed genes. This novel dynamic clustering approach has broad applicability to a vast range of sequential data scenarios where the order of the series is of interest.

15. Kiani, Khurshid M. "Forecasting Forward Exchange Rate Risk Premium in Singapore Dollar/US Dollar Exchange Rate Market." Singapore Economic Review 54, no. 02 (June 2009): 283–98. http://dx.doi.org/10.1142/s0217590809003288.

Abstract:
In this research, monthly forward exchange rates are evaluated for the possible existence of time-varying risk premia in Singapore forward foreign exchange rates against the US dollar. The time-varying risk premia in the Singapore dollar are modeled using non-Gaussian signal-plus-noise models that encompass non-normality and time-varying volatility. The results from the signal-plus-noise models show statistically significant evidence of a time-varying risk premium in Singapore forward exchange rates, although we failed to reject the hypothesis of no risk premium in the series. The results from the Gaussian versions of these models are not much different and are in line with Wolff (1987), who used the same methodology in Gaussian settings. Our results show statistically significant evidence of volatility clustering and non-normality in Singapore forward foreign exchange rates. Additional tests on the series show that excluding conditional heteroskedasticity from the signal-plus-noise models leads to false statistical inferences.

16. Zhao, Xiaofei, Caiyi Hu, Zhao Liu, and Yangyang Meng. "Weighted Dynamic Time Warping for Grid-Based Travel-Demand-Pattern Clustering: Case Study of Beijing Bicycle-Sharing System." ISPRS International Journal of Geo-Information 8, no. 6 (June 16, 2019): 281. http://dx.doi.org/10.3390/ijgi8060281.

Abstract:
Many kinds of spatial–temporal data collected by transportation systems, such as user order systems or automated fare-collection (AFC) systems, can be discretized and converted into time-series data. With the technique of time-series data mining, certain travel-demand patterns of different areas in the city can be detected. This study proposes a data-mining model for understanding the patterns and regularities of human activities in urban areas from spatiotemporal datasets. This model uses a grid-based method to convert spatiotemporal point datasets into discretized temporal sequences. Time-series analysis technique dynamic time warping (DTW) is then used to describe the similarity between travel-demand sequences, while the clustering algorithm density-based spatial clustering of applications with noise (DBSCAN), based on modified DTW, is used to detect clusters among the travel-demand samples. Four typical patterns are found, including balanced and unbalanced cases. These findings can help to understand the land-use structure and commuting activities of a city. The results indicate that the grid-based model and time-series analysis model developed in this study can effectively uncover the spatiotemporal characteristics of travel demand from usage data in public transportation systems.

17. Peng, Kaijun, Jieqing Tan, and Guochang Zhang. "A Method of Curve Reconstruction Based on Point Cloud Clustering and PCA." Symmetry 14, no. 4 (April 2, 2022): 726. http://dx.doi.org/10.3390/sym14040726.

Abstract:
In many application fields (closed-curve noisy data reconstruction, time series data fitting, image edge smoothing, skeleton extraction, etc.), curve reconstruction from noisy data has always been a popular but challenging problem. Within a single domain there are many methods for curve reconstruction from noisy data, but methods suitable for multi-domain curve reconstruction have received much less attention in the literature. More importantly, the existing methods are time-consuming when dealing with large data volumes and high-density point cloud curve reconstruction. For this reason, we propose a curve fitting algorithm suitable for many fields with low time consumption. In this paper, a curve reconstruction method based on clustering and point cloud principal component analysis is proposed. First, the point cloud is clustered by the k-means++ algorithm. Second, a denoising method based on point cloud principal component analysis is proposed to obtain the interpolation nodes for curve subdivision. Finally, the fitting curve is obtained by the parametric curve subdivision method. Comparative experiments show that our method is superior to classical fitting methods in terms of time consumption and effect. In addition, our method is not constrained by the shape of the point cloud and can be applied to time series data, image thinning, and edge smoothing.
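
A sketch of the first two steps (the curve-subdivision stage is omitted): k-means++ partitioning of a noisy point cloud, then per-cluster projection onto the first principal axis as the PCA denoising step; the cluster count and noise level are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
t = np.linspace(0, 2 * np.pi, 600)
cloud = np.column_stack([np.cos(t), np.sin(t)]) + 0.05 * rng.standard_normal((600, 2))

km = KMeans(n_clusters=30, init="k-means++", n_init=5).fit(cloud)
nodes = []
for k in range(30):
    pts = cloud[km.labels_ == k]
    mu = pts.mean(axis=0)
    # First principal axis of the cluster via SVD of the centered points.
    _, _, vt = np.linalg.svd(pts - mu, full_matrices=False)
    proj = mu + ((pts - mu) @ vt[0])[:, None] * vt[0]  # denoised points
    nodes.append(proj.mean(axis=0))                    # interpolation node
```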

18. Witt, Annette, Bruce D. Malamud, Clara Mangili, and Achim Brauer. "Analysis and modelling of a 9.3 kyr palaeoflood record: correlations, clustering, and cycles." Hydrology and Earth System Sciences 21, no. 11 (November 14, 2017): 5547–81. http://dx.doi.org/10.5194/hess-21-5547-2017.

Abstract:
In this paper, we present a unique 9.5 m palaeo-lacustrine record of 771 palaeofloods which occurred over a period of 9.3 kyr in the Piànico–Sèllere Basin (southern Alps) during an interglacial period in the Pleistocene (sometime from 780 to 393 ka) and analyse its correlation, clustering, and cyclicity properties. We first examine correlations by applying power-spectral analysis and detrended fluctuation analysis (DFA) to a time series of the number of floods per decade, and find weak long-range persistence: a power-spectral exponent β_PS ≈ 0.39 and an equivalent power-spectral exponent from DFA, β_DFA ≈ 0.25. We then examine clustering using the one-point probability distribution of the inter-flood intervals and find that the palaeofloods cluster in time, as they are Weibull distributed with a shape parameter k_W = 0.78. We then examine cyclicity in the time series of the number of palaeofloods per year and find a period of about 2030 years. Using these characterizations of the correlation, clustering, and cyclicity in the original palaeoflood time series, we create a model consisting of the superposition of a fractional Gaussian noise (FGN) with a 2030-year periodic component, to which a peaks-over-threshold (POT) filter is then applied. We use this POT(FGN + Period) model to create 2,600,000 synthetic realizations of the same length as our original palaeoflood time series, but with varying intensity of periodicity and persistence, and find optimized model parameters that are congruent with our original palaeoflood series. We create long realizations of our optimized palaeoflood model and find a high temporal variability of the flood frequency, which can take values of between 0 and more than 30 floods per century. Finally, we show the practical utility of our optimized model realizations to calculate the uncertainty of the forecasted number of floods per century given the number of floods in the preceding century. A key finding of our paper is that neither fractional noise behaviour nor cyclicity alone is sufficient to model the frequency fluctuations of our large and continuous palaeoflood record; rather, a model based on fractional noise superimposed with a long-range periodicity is necessary.
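
The clustering test used here is simple to apply to any event record: fit a Weibull distribution to the inter-event intervals and read temporal clustering off a shape parameter below 1. The synthetic intervals below merely stand in for the 771 palaeofloods:

```python
import numpy as np
from scipy.stats import weibull_min

# Placeholder intervals drawn with the paper's reported shape k_W = 0.78;
# in practice these would be the observed inter-flood intervals.
intervals = weibull_min.rvs(0.78, scale=12.0, size=771, random_state=7)

shape, loc, scale = weibull_min.fit(intervals, floc=0)
print(f"Weibull shape k_W = {shape:.2f}")  # < 1 indicates clustered events
```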

19. Booy, C., and D. R. Morgan. "The effect of clustering of flood peaks on a flood risk analysis for the Red River." Canadian Journal of Civil Engineering 12, no. 1 (March 1, 1985): 150–65. http://dx.doi.org/10.1139/l85-015.

Abstract:
The nearly 100 year record of spring flood peaks on the Red River at Winnipeg, Manitoba, shows a clustering of high annual peak flows that is possibly, but not likely, due to chance. A similar degree of clustering has been observed in other long-term geophysical records. It can be measured by means of the Hurst statistic. Clustering increases the uncertainty in the parameters of the probability distribution of peak flows estimated from the record. As such it profoundly affects the weight that must be given to the unusually high historical floods that preceded the period of record, in particular the 1826 and the 1852 floods. Incorporating this historical information in the probability analysis requires a time series model that tends to produce the appropriate degree of clustering. A fractional noise model was adopted for this purpose. Bayes' theorem was then used to update the distribution parameters, obtained from the record, with the additional information about the historical floods. The result shows the flood risk to the City of Winnipeg and the Red River Valley to be substantially higher than was estimated by conventional methods that assume serial independence of the peak flows. Key words: Red River floods, flood risk, historical floods, Hurst phenomenon, fractional noise, Bayesian probability distribution, Bayesian updating, time series.

20. Feng, Chen, and Bo He. "Construction of complex networks from time series based on the cross correlation interval." Open Physics 15, no. 1 (April 30, 2017): 253–60. http://dx.doi.org/10.1515/phys-2017-0028.

Abstract:
In this paper, a new approach to map time series into complex networks based on the cross correlation interval is proposed for the analysis of dynamic states of time series on different scales. In the proposed approach, a time series is divided into time series segments and each segment is reconstructed to a phase space defined as a node of the complex network. The cross correlation interval, which characterizes the degree of correlation between two phase spaces, is computed as the distance between the two nodes. The clustering coefficient and efficiency are used to determine an appropriate threshold for the construction of a complex network that can effectively describe the dynamic states of a complex system. In order to verify the efficiency of the proposed approach, complex networks are constructed for time series generated from the Lorenz system, for white Gaussian noise time series and for sea clutter time series. The experimental results have demonstrated that nodes in different communities represent different dynamic states. Therefore, the proposed approach can be used to uncover the dynamic characteristics of complex systems.

21. Roushangar, Kiyoumars, Vahid Nourani, and Farhad Alizadeh. "A multiscale time-space approach to analyze and categorize the precipitation fluctuation based on the wavelet transform and information theory concept." Hydrology Research 49, no. 3 (February 12, 2018): 724–43. http://dx.doi.org/10.2166/nh.2018.143.

Abstract:
The present study proposes a time-space framework using a discrete wavelet transform-based multiscale entropy (DWE) approach to analyze and spatially categorize precipitation variation in Iran. To this end, historical monthly precipitation time series during 1960–2010 from 31 rain gauges were used. First, a wavelet-based de-noising approach was applied to diminish the effect of noise in the precipitation time series, which may affect the entropy values. Next, Daubechies (db) mother wavelets (db5–db10) were used to decompose the precipitation time series. Subsequently, the entropy concept was applied to the sub-series to measure uncertainty and disorderliness at multiple scales. According to the pattern of entropy across scales, each cluster was assigned an entropy signature that provided an estimate of the entropy pattern of precipitation in that cluster. Spatial categorization of the rain gauges was performed using the DWE values as input to k-means and self-organizing map (SOM) clustering techniques. According to the evaluation criteria, k-means with a cluster number of 5 (Silhouette coefficient = 0.33, Davies–Bouldin = 1.18, and Dunn index = 1.52) performed better in determining homogeneous areas. Finally, investigating the spatial structure of precipitation variation revealed that the DWE had a decreasing relationship with longitude and an increasing relationship with latitude in Iran.
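
The DWE feature is straightforward to sketch with PyWavelets, assuming a Shannon-type entropy over the relative energies of the decomposition levels (the paper evaluates db5–db10 and feeds the entropies to k-means/SOM clustering, which is omitted here):

```python
import numpy as np
import pywt

def wavelet_entropy(x, wavelet="db5", level=4):
    # Shannon entropy of the relative energy carried by each wavelet scale.
    coeffs = pywt.wavedec(x, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    p = energies / energies.sum()
    return -np.sum(p * np.log(p + 1e-12))

rng = np.random.default_rng(8)
monthly_precip = rng.gamma(2.0, 20.0, 612)  # 51 years of monthly values, a placeholder
print(wavelet_entropy(monthly_precip))
```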

22. Liu, Fuchen, David Choi, Lu Xie, and Kathryn Roeder. "Global spectral clustering in dynamic networks." Proceedings of the National Academy of Sciences 115, no. 5 (January 16, 2018): 927–32. http://dx.doi.org/10.1073/pnas.1718449115.

Abstract:
Community detection is challenging when the network structure is estimated with uncertainty. Dynamic networks present additional challenges but also add information across time periods. We propose a global community detection method, persistent communities by eigenvector smoothing (PisCES), that combines information across a series of networks, longitudinally, to strengthen the inference for each period. Our method is derived from evolutionary spectral clustering and degree correction methods. Data-driven solutions to the problem of tuning parameter selection are provided. In simulations we find that PisCES performs better than competing methods designed for a low signal-to-noise ratio. Recently obtained gene expression data from rhesus monkey brains provide samples from finely partitioned brain regions over a broad time span including pre- and postnatal periods. Of interest is how gene communities develop over space and time; however, once the data are divided into homogeneous spatial and temporal periods, sample sizes are very small, making inference quite challenging. Applying PisCES to medial prefrontal cortex in monkey rhesus brains from near conception to adulthood reveals dense communities that persist, merge, and diverge over time and others that are loosely organized and short lived, illustrating how dynamic community detection can yield interesting insights into processes such as brain development.

23. Yuan, Jili, Xiaolei Lv, Fangjia Dou, and Jingchuan Yao. "Change Analysis in Urban Areas Based on Statistical Features and Temporal Clustering Using TerraSAR-X Time-Series Images." Remote Sensing 11, no. 8 (April 16, 2019): 926. http://dx.doi.org/10.3390/rs11080926.

Abstract:
The existing unsupervised multitemporal change detection approaches for synthetic aperture radar (SAR) images based on the pixel level usually suffer from the serious influence of speckle noise, and the classification accuracy of temporal change patterns is liable to be affected by the generation method of similarity matrices and the pre-specified cluster number. To address these issues, a novel time-series change detection method with high efficiency is proposed in this paper. Firstly, spatial feature extraction using local statistical information on patches is conducted to reduce the noise and for subsequent temporal grouping. Secondly, a density-based clustering method is adopted to categorize the pixel series in the temporal dimension, in view of its efficiency and robustness. Change detection and classification results are then obtained by a fast differential strategy in the final step. The experimental results and analysis of synthetic and realistic time-series SAR images acquired by TerraSAR-X in urban areas demonstrate the effectiveness of the proposed method, which outperforms other approaches in terms of both qualitative results and quantitative indices of macro F1-scores and micro F1-scores. Furthermore, we make the case that more temporal change information for buildings can be obtained, which includes when the first and last detected change occurred and the frequency of changes.

24. Feng, Boqing, Mohan Liu, and Jiuqiang Jin. "Density Space Clustering Algorithm Based on Users Behaviors." Journal of Computers (電腦學刊) 33, no. 2 (April 2022): 201–9. http://dx.doi.org/10.53106/199115992022043302018.

Abstract:
At present, insider threat detection requires a series of complex projects and has certain limitations in practical applications; in order to reduce the complexity of the model, most studies ignore the timing of user behavior and fail to identify internal attacks that last for a period of time. In addition, companies usually categorize the behavior data generated by all users and store them in different databases. How to collaboratively process large-scale heterogeneous log files and extract characteristic data that accurately reflects user behavior is a difficult point in current research. In order to optimize the parameter selection of the DBSCAN algorithm, this paper proposes a Psychometric Data & Attack Threat Density-Based Spatial Clustering of Applications with Noise algorithm (PD&AT-DBSCAN). This algorithm can improve the accuracy of clustering results. The simulation results show that it is better than the traditional DBSCAN algorithm in terms of the Rand index and normalized mutual information.

25. Guo, Ziyan, Kang Yang, Chang Liu, Xin Lu, Liang Cheng, and Manchun Li. "Mapping National-Scale Croplands in Pakistan by Combining Dynamic Time Warping Algorithm and Density-Based Spatial Clustering of Applications with Noise." Remote Sensing 12, no. 21 (November 6, 2020): 3644. http://dx.doi.org/10.3390/rs12213644.

Abstract:
Croplands are commonly mapped using time series of remotely sensed images. The dynamic time warping (DTW) algorithm is an effective method for realizing this. However, DTW algorithm faces the challenge of capturing complete and accurate representative cropland time series on a national scale, especially in Asian countries where climatic and topographic conditions, cropland types, and crop growth patterns vary significantly. This study proposes an automatic cropland extraction method based on the DTW algorithm and density-based spatial clustering of applications with noise (DBSCAN), hereinafter referred to as ACE-DTW, to map croplands in Pakistan in 2015. First, 422 frames of multispectral Landsat-8 satellite images were selected from the Google Earth Engine to construct monthly normalized difference vegetation index (NDVI) time series. Next, a total of 2409 training samples of six land cover types were generated randomly and explained visually using high-resolution remotely sensed images. Then, a multi-layer DBSCAN was used to classify NDVI time series of training samples into different categories automatically based on their pairwise DTW distances, and the mean NDVI time series of each category was used as the standard time series to represent the characteristics of that category. These standard time series attempted to represent cropland information and maximally distinguished croplands from other possible interference land cover types. Finally, image pixels were classified as cropland or non-cropland based on their DTW distances to the standard time series of the six land cover types. The overall cropland extraction accuracy of ACE-DTW was 89.7%, which exceeded those of other supervised classifiers (classification and regression trees: 78.2%; support vector machines: 78.8%) and existing global cropland datasets (Finer Resolution Observation and Monitoring of Global Land Cover: 87.1%; Global Food Security Support Analysis Data: 83.1%). Further, ACE-DTW could produce relatively complete time series of variable cropland types, and thereby provide a significant advantage in mountain regions with small, fragmented croplands and plain regions with large, high-density patches of croplands.

26. Jiang, Xuchu, Peiyao Wei, Yiwen Luo, and Ying Li. "Air Pollutant Concentration Prediction Based on a CEEMDAN-FE-BiLSTM Model." Atmosphere 12, no. 11 (November 3, 2021): 1452. http://dx.doi.org/10.3390/atmos12111452.

Abstract:
The concentration series of PM2.5 (particulate matter ≤ 2.5 μm) is nonlinear, nonstationary, and noisy, making it difficult to predict accurately. This paper presents a new PM2.5 concentration prediction method based on a hybrid model of complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and bi-directional long short-term memory (BiLSTM). The new method was applied to predict the same kind of particulate pollutant PM10 and heterogeneous gas pollutant O3, proving that the prediction method has strong generalization ability. First, CEEMDAN was used to decompose PM2.5 concentrations at different frequencies. Then, the fuzzy entropy (FE) value of each decomposed wave was calculated, and the near waves were combined by K-means clustering to generate the input sequence. Finally, the combined sequences were put into the BiLSTM model with multiple hidden layers for training. We predicted the PM2.5 concentrations of Seoul Station 116 by the hour, with values of the root mean square error (RMSE), the mean absolute error (MAE), and the symmetric mean absolute percentage error (SMAPE) as low as 2.74, 1.90, and 13.59%, respectively, and an R2 value as high as 96.34%. The “CEEMDAN-FE” decomposition-merging technology proposed in this paper can effectively reduce the instability and high volatility of the original data, overcome data noise, and significantly improve the model’s performance in predicting the real-time concentrations of PM2.5.

27. Tingting, Li, Luo Chao, and Shao Rui. "The structure and dynamics of granular complex networks deriving from financial time series." International Journal of Modern Physics C 31, no. 06 (May 7, 2020): 2050087. http://dx.doi.org/10.1142/s0129183120500874.

Abstract:
High noise and strong volatility are typical characteristics of financial time series. Combined with the pseudo-randomness, nonstationarity, and self-similarity they exhibit on different time scales, this makes pattern analysis of financial time series a challenging issue. Different from existing works, in this paper financial time series are converted into granular complex networks, from which the structure and dynamics of the network models are revealed. Using variable-length division, an extended polar fuzzy information granule (FIG) method is used to construct granular complex networks from financial time series. Considering the temporal characteristics of sequential data, static networks and temporal networks are studied, respectively. For the static network model, features of the topological structure of granular complex networks, such as distribution, clustering, and betweenness centrality, are discussed. Besides, using a Markov chain model, the transfer processes among different granules are investigated, where the fluctuation pattern of the data in the coming step can be evaluated from the transfer probability of two consecutive granules. The Shanghai Composite Index and foreign exchange data are used as two real-life examples for the discussion.

28. Chen, Bo, Tianyi Hu, Zishen Huang, and Chunhui Fang. "A spatio-temporal clustering and diagnosis method for concrete arch dams using deformation monitoring data." Structural Health Monitoring 18, no. 5-6 (September 26, 2018): 1355–71. http://dx.doi.org/10.1177/1475921718797949.

Abstract:
The timely analysis of deformation monitoring data and reasonable diagnosis of structural health are key tasks in dam health monitoring studies. This article presents a spatio-temporal clustering and health diagnosis method for super-high concrete arch dams that uses deformation monitoring data obtained from plumb meters. The spatio-temporal expression of the deformation monitoring data is proposed first by upgrading a punctuated time series to a curved panel time series, including cross-sectional, dam-axial, and temporally changing directions. Second, a comprehensive similarity indicator covering three aspects, namely the absolute distance, incremental distance, and growth rate distance, is constructed after a deep discussion of deformation similarity characteristics both temporally and spatially. Next, a temporal clustering method is proposed that keeps the key features, namely extreme points and turning points, while eliminating extraneous details, namely noise points. Finally, the optimal spatio-temporal clustering of dam deformation is achieved by designing a multi-scale fuzzy C-means data mining method and its iterative algorithm. The proposed method is applied to the Jinping-I hydraulic structure, which is the highest concrete arch dam in the world. The clustering results are quite sensitive to the weight coefficients of the comprehensive similarity indicator and to the cluster number of the fuzzy C-means method. The dam deformation behaviors in high-water-level, water-falling, and low-water-level periods are analyzed and diagnosed. The advantage of the proposed method is verified by a comparative analysis of dam health diagnosis results obtained from ordinary deformation distribution figures and from the spatio-temporal clustering figures. The proposed method will facilitate the recognition of abnormal deformation areas and the associated safety diagnoses.

29. Ren, Yan, Christian I. Hong, Sookkyung Lim, and Seongho Song. "Finding Clocks in Genes: A Bayesian Approach to Estimate Periodicity." BioMed Research International 2016 (2016): 1–14. http://dx.doi.org/10.1155/2016/3017475.

Abstract:
Identification of rhythmic gene expression from metabolic cycles to circadian rhythms is crucial for understanding the gene regulatory networks and functions of these biological processes. Recently, two algorithms, JTK_CYCLE and ARSER, have been developed to estimate periodicity of rhythmic gene expression. JTK_CYCLE performs well for long or less noisy time series, while ARSER performs well for detecting a single rhythmic category. However, observing gene expression at high temporal resolution is not always feasible, and many scientists are interested in exploring both ultradian and circadian rhythmic categories simultaneously. In this paper, a new algorithm, named autoregressive Bayesian spectral regression (ABSR), is proposed. It estimates the period of time-course experimental data and classifies gene expression profiles into multiple rhythmic categories simultaneously. Through the simulation studies, it is shown that ABSR substantially improves the accuracy of periodicity estimation and clustering of rhythmic categories as compared to JTK_CYCLE and ARSER for the data with low temporal resolution. Moreover, ABSR is insensitive to rhythmic patterns. This new scheme is applied to existing time-course mouse liver data to estimate period of rhythms and classify the genes into ultradian, circadian, and arrhythmic categories. It is observed that 49.2% of the circadian profiles detected by JTK_CYCLE with 1-hour resolution are also detected by ABSR with only 4-hour resolution.

30. Kannan, S. R., S. Ramthilagam, R. Devi, and T. P. Hong. "Fuzzy C-Means in Finding Subtypes of Cancers in Cancer Database." Journal of Innovative Optical Health Sciences 07, no. 01 (January 2014): 1450018. http://dx.doi.org/10.1142/s1793545814500187.

Abstract:
Finding subtypes of cancer in a breast cancer database is an extremely difficult task because of heavy noise from measurement error. Most recent clustering techniques for separating breast cancer databases into cancerous and noncancerous cases weigh down the interpretability of the structure. Hence, this paper seeks effective Fuzzy C-Means-based clustering techniques to identify the proper subtypes of cancer in a breast cancer database. The objective functions of the proposed Fuzzy C-Means clustering techniques incorporate a kernel-induced distance function, Renyi's entropy function, a weighted distance measure, and neighborhood-term-based spatial context. The effectiveness of the proposed methods is demonstrated through experimental work on a lung cancer database and the IRIS, Wine, Checkerboard, Time Series, and Yeast datasets. Finally, the proposed methods are implemented successfully to cluster the breast cancer database into cancerous and noncancerous cases. The clustering accuracy has been validated through an error matrix and the silhouette method.

31. Yonto, Daniel, L. Michele Issel, and Jean-Claude Thill. "Spatial Analytics Based on Confidential Data for Strategic Planning in Urban Health Departments." Urban Science 3, no. 3 (July 22, 2019): 75. http://dx.doi.org/10.3390/urbansci3030075.

Abstract:
Spatial data analytics can detect patterns of clustering of events in small geographies across an urban region. This study presents and demonstrates a robust research design to study the longitudinal stability of spatial clustering with small case numbers per census tract and assess the clustering changes over time across the urban environment to better inform public health policy making at the community level. We argue this analysis enables the greater efficiency of public health departments, while leveraging existing data and preserving citizen personal privacy. Analysis at the census tract level is conducted in Mecklenburg County, North Carolina, on hypertension during pregnancy compiled from 2011–2014 birth certificates. Data were derived from per year and per multi-year moving counts by aggregating spatially to census tracts and then assessed for clustering using global Moran’s I. With evidence of clustering, local indicators of spatial association are calculated to pinpoint hot spots, while time series data identified hot spot changes. Knowledge regarding the geographical distribution of diseases is essential in public health to define strategies that improve the health of populations and quality of life. Our findings support that spatial aggregation at the census tract level contributes to identifying the location of at-risk “hot spot” communities to refine health programs, while temporal windowing reduces random noise effects on spatial clustering patterns. With tight state budgets limiting health departments’ funds, using geographic analytics provides for a targeted and efficient approach to health resource planning.

32. Cheng, Zixuan, and Li Liu. "Brain Magnetic Resonance Imaging Segmentation Using Possibilistic Clustering Algorithm Combined with Context Constraints." Journal of Medical Imaging and Health Informatics 10, no. 7 (July 1, 2020): 1669–74. http://dx.doi.org/10.1166/jmihi.2020.3093.

Abstract:
Because the FCM method is simple and effective, a series of research results based on it are widely used in medical image segmentation. Compared with traditional FCM, the possibilistic clustering (PCM) algorithm drops the constraint that each sample's membership degrees must sum to one during the iterative process, which improves the clustering effect within a certain range. However, both methods use only the gray values of the image pixels in the iterative process, ignoring the contextual constraint relationships between pixels in high-dimensional images. Both are easily affected by image noise during segmentation, resulting in poor robustness, which affects segmentation accuracy in practical applications. To alleviate this problem, this paper introduces contextual constraint information into PCM and proposes a PCM algorithm combined with context constraints (CCPCM), successfully applying it to human brain MR image segmentation to further improve noise immunity and to extend the applicability of the new algorithm in the medical field. Simulation results on medical images show that, compared with previous classical clustering methods such as FCM and PCM, CCPCM has better immunity to different noises and yields clearer segmentation boundaries. At the same time, the CCPCM algorithm introduces an adaptive weighting mechanism for spatial neighbor information in the clustering process, which can adaptively adjust the constraint weight of the spatial information and optimize the clustering process, thus improving segmentation efficiency.
APA, Harvard, Vancouver, ISO, and other styles
34

Frizzo Stefenon, Stéfano, Roberto Zanetti Freire, Leandro dos Santos Coelho, Luiz Henrique Meyer, Rafael Bartnik Grebogi, William Gouvêa Buratto, and Ademir Nied. "Electrical Insulator Fault Forecasting Based on a Wavelet Neuro-Fuzzy System." Energies 13, no. 2 (January 19, 2020): 484. http://dx.doi.org/10.3390/en13020484.

Full text
Abstract:
The surface contamination of electrical insulators can increase the electrical conductivity of these components, which may lead to faults in the electrical power system. During inspections, ultrasound equipment is employed to detect defective insulators or those that may cause failures within a certain period. Assuming that the signal collected by the ultrasound device can be processed and used both for the detection of defective insulators and for the prediction of failures, this study starts by presenting an experimental procedure in which a contaminated insulator removed from the distribution line is used for data acquisition. Based on the obtained data set, an offline time series forecasting approach with an Adaptive Neuro-Fuzzy Inference System (ANFIS) was conducted. To improve the time series forecasting performance and to reduce noise, the Wavelet Packet Transform (WPT) was combined with the ANFIS model. Since the ANFIS model associated with WPT has distinct parameters to be adjusted, a complete evaluation of different model configurations was conducted. Three inference system structures were evaluated: grid partition, fuzzy c-means clustering, and subtractive clustering. A performance analysis focusing on computational effort and the coefficient of determination provided additional parameter configurations for the model. Taking into account both parametrical and statistical analysis, the Wavelet Neuro-Fuzzy System with fuzzy c-means showed that it is possible to achieve impressive accuracy, even when compared to classical approaches, in the prediction of electrical insulator conditions.
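A hedged sketch of WPT de-noising as a pre-processing step in the spirit of the pipeline above; the ANFIS forecasting stage is not shown. It assumes the PyWavelets package, and the wavelet choice, decomposition level, and threshold are illustrative, not the paper's settings.

```python
import numpy as np
import pywt

def wpt_denoise(signal, wavelet="db4", level=3, thresh=0.1):
    """Soft-threshold all wavelet-packet leaves, then reconstruct the signal."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    for node in wp.get_level(level, order="natural"):
        node.data = pywt.threshold(node.data, thresh, mode="soft")
    return wp.reconstruct(update=True)

noisy = np.sin(np.linspace(0, 8 * np.pi, 512)) + 0.3 * np.random.randn(512)
clean = wpt_denoise(noisy)  # smoother series to feed the forecaster
```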
APA, Harvard, Vancouver, ISO, and other styles
35

Hu, Xiang Ping. "Simulation and Application on Data Clustering Based on an Improved FCM Algorithm." Applied Mechanics and Materials 380-384 (August 2013): 1589–92. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.1589.

Full text
Abstract:
An improved data clustering algorithm based on the Fuzzy C-Means (FCM) algorithm was proposed to cluster data precisely and effectively, improving clustering performance as groundwork for applications such as fault diagnosis and target recognition. The traditional FCM algorithm has a fatal weakness: it is sensitive to initial values and noise. A chaotic differential evolution FCM algorithm was therefore proposed, exploiting the efficient global search capability of the differential evolution algorithm and the ergodic character of chaotic time series. The improved algorithm uses the Logistic chaotic map to search for the optimal solution, and chaos disturbance is introduced into the evolutionary population to compensate for the defects of the FCM algorithm. The new method overcomes both the initial-value sensitivity of FCM and the local convergence of genetic algorithms. Three types of typical vibration data from faulty engines were taken as examples for research and application. The simulation and application results show that the clustering performance of the improved FCM algorithm is much better than that of the traditional FCM algorithm, and the accuracy of fault diagnosis in the application increased by more than twenty percent, indicating good application prospects.
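For illustration, a sketch of the Logistic chaotic map used above to generate well-spread candidate solutions; the differential evolution loop and FCM itself are omitted, and the parameter values and search bounds are assumptions.

```python
import numpy as np

def logistic_sequence(x0=0.7, mu=4.0, n=100):
    """Chaotic sequence in (0, 1); mu = 4 gives fully chaotic behavior."""
    seq = np.empty(n)
    x = x0
    for i in range(n):
        x = mu * x * (1.0 - x)  # Logistic map iteration
        seq[i] = x
    return seq

# Map the chaotic sequence onto candidate cluster centers in [lo, hi].
lo, hi = 0.0, 10.0
candidates = lo + (hi - lo) * logistic_sequence(n=20)
```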
APA, Harvard, Vancouver, ISO, and other styles
36

Doan, C. D., S. Y. Liong, and Dulakshi S. K. Karunasinghe. "Derivation of effective and efficient data set with subtractive clustering method and genetic algorithm." Journal of Hydroinformatics 7, no. 4 (October 1, 2005): 219–33. http://dx.doi.org/10.2166/hydro.2005.0020.

Full text
Abstract:
Success of any forecasting model depends heavily on reliable historical data, among other factors. Data are needed to calibrate, fine-tune and verify any simulation model. However, data are very often contaminated with noise of different levels originating from different sources. This study proposes a scheme that extracts the most representative data from a raw data set. The Subtractive Clustering Method (SCM) and a Micro Genetic Algorithm (mGA) were used for this purpose. SCM (a) removes outliers and (b) discards unnecessary or superfluous points, while mGA, a search engine, determines the optimal values of SCM's parameter set. The scheme was demonstrated on: (1) Bangladesh water level forecasting with Neural Network and Fuzzy Logic and (2) forecasting of two chaotic river flow series (Wabash River at Mt. Carmel and Mississippi River at Vicksburg) with the phase space prediction method. The scheme was able to significantly reduce the data set while the forecasting models yielded prediction accuracy equal to or higher than models trained with the whole original data set. The resulting fuzzy logic model, for example, yields a smaller number of rules, which are easier for humans to interpret. In phase space prediction of chaotic time series, which is known to require a long data record, a data reduction of up to 40% barely affects the prediction accuracy.
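A minimal sketch of the SCM potential computation and first-center selection follows; the mGA parameter search and the potential-revision loop for subsequent centers are omitted, and the radius r_a is an assumption.

```python
# Subtractive clustering potential: P_i = sum_j exp(-4 ||x_i - x_j||^2 / r_a^2)
import numpy as np

def scm_potentials(X, ra=0.5):
    """Mountain potential of every point: denser neighborhoods score higher."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-4.0 * d2 / ra ** 2).sum(axis=1)

X = np.random.rand(200, 2)
p = scm_potentials(X)
first_center = X[np.argmax(p)]  # the single most representative data point
```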
APA, Harvard, Vancouver, ISO, and other styles
37

Jayawardena, Nirodha I., Jason West, Neda Todorova, and Bin Li. "Improved algorithm for cleaning high frequency data: An analysis of foreign currency." Corporate Ownership and Control 12, no. 3 (2015): 125–32. http://dx.doi.org/10.22495/cocv12i3c1p1.

Full text
Abstract:
High-frequency data are notorious for their noise and asynchrony, which may bias or contaminate the empirical analysis of prices and returns. In this study, we develop a novel data filtering approach that simultaneously addresses volatility clustering and irregular spacing, which are inherent characteristics of high-frequency data. Using high-frequency currency data collected at five-minute intervals, we find substantial microstructure noise coupled with random volatility clusters, and observe an extremely non-Gaussian distribution of returns. To process non-Gaussian high-frequency data for time series modelling, we propose two efficient and robust standardisation methods that cater for volatility clusters, clean the data, and achieve near-normal distributions. We show that the filtering process efficiently cleans high-frequency data for use in empirical settings while retaining the underlying distributional properties.
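As a hedged illustration of one standardisation idea the abstract describes, the snippet below scales returns by a local volatility estimate to tame volatility clusters; the window length and fat-tailed toy data are assumptions, not the paper's method or data.

```python
import numpy as np
import pandas as pd

returns = pd.Series(np.random.standard_t(df=3, size=2000))   # fat-tailed toy returns
local_vol = returns.rolling(window=60, min_periods=60).std() # local volatility estimate
standardised = (returns / local_vol).dropna()                # closer to unit variance, near-normal
```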
APA, Harvard, Vancouver, ISO, and other styles
38

Zhang, Xian, Diquan Li, Jin Li, Yong Li, Jialin Wang, Shanshan Liu, and Zhimin Xu. "Magnetotelluric Signal-Noise Separation Using IE-LZC and MP." Entropy 21, no. 12 (December 4, 2019): 1190. http://dx.doi.org/10.3390/e21121190.

Full text
Abstract:
Eliminating noise signals of the magnetotelluric (MT) method is bound to improve the quality of MT data. However, existing de-noising methods are designed for use on whole MT data sets, causing the loss of low-frequency information and severe distortion of the apparent resistivity-phase curve in low-frequency bands. In this paper, we used information entropy (IE), the Lempel–Ziv complexity (LZC), and matching pursuit (MP) to distinguish and suppress MT noise signals. Firstly, we extracted IE and LZC characteristic parameters from each segment of the MT time series. Then, the characteristic parameters were input into fuzzy c-means (FCM) clustering to automatically distinguish between signal and noise. Next, the MP de-noising algorithm was applied only to the MT signal segments identified as interference. Finally, the identified useful signal segments were combined with the de-noised segments to reconstruct the signal. The proposed method was validated through clustering analysis based on signal samples collected at the Qinghai test site and at measured sites, with results compared to those obtained using the remote reference method and independent use of the MP method. The findings show that strong interference is purposefully removed, and the apparent resistivity-phase curve is continuous and stable. Moreover, the processed data can accurately reflect the geoelectrical information and improve the level of geological interpretation.
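A self-contained sketch of the Lempel–Ziv complexity feature that, together with information entropy, feeds the FCM signal/noise discrimination above; binarising about the median is a common convention and an assumption here, as is the segment length.

```python
import numpy as np

def lempel_ziv_complexity(signal):
    """LZ76-style complexity: phrases in a left-to-right parse of the binarised signal."""
    med = np.median(signal)
    s = "".join("1" if v > med else "0" for v in signal)
    i, c, n = 0, 0, len(s)
    while i < n:
        l = 1
        # extend the current phrase while it already occurs earlier in the string
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1        # one new phrase found
        i += l
    return c

segment = np.random.randn(512)          # one MT time-series segment
print(lempel_ziv_complexity(segment))   # higher values = more irregular segment
```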
APA, Harvard, Vancouver, ISO, and other styles
39

Karasiak, N., M. Fauvel, J. F. Dejoux, C. Monteil, and D. Sheeren. "OPTIMAL DATES FOR DECIDUOUS TREE SPECIES MAPPING USING FULL YEARS SENTINEL-2 TIME SERIES IN SOUTH WEST FRANCE." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-3-2020 (August 3, 2020): 469–76. http://dx.doi.org/10.5194/isprs-annals-v-3-2020-469-2020.

Full text
Abstract:
Abstract. The free-to-use Sentinel-2 (S2) sensors, with a 5-day revisit time and high spatial resolution in 10 spectral bands, are a revolution in the remote sensing domain. Including 6 spectral bands in the near infrared, 3 of them dedicated to the red edge (where vegetation reflectance increases significantly), these European satellites are very promising for mapping tree species distribution at a national scale. Here, we study the contribution of three one-year S2 Satellite Image Time Series (SITS) for mapping deciduous species distribution in the southwest of France. The annual cycle of vegetation (phenology) can contribute to the identification of tree species: at some specific dates, species exhibit different phenological behaviours (senescence, flowering, ...). To train and validate the maps, we used the Support Vector Machine algorithm with a spatial cross-validation method. To train the algorithm with the same number of samples per species, we undersampled each class to the smallest class using a K-means clustering method. Moreover, Sequential Feature Selection (SFS) was implemented to detect the optimal dates per species. Our results are promising, with high accuracy for Red oak and Willow (average scores over the three one-year series of F1 = 0.99 and F1 = 0.94, respectively) based on the optimal dates. However, performances when using each full SITS are far below those of the optimal-dates models (average ΔF1 = 0.32). Except for Willow and Red oak, we did not find that the optimal dates were the same for each year. A perspective is to find an algorithm robust to temporal or spectral noise and to smooth the time series.
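A hedged sketch of sequential feature selection over acquisition dates in the spirit of the SFS step above; the data layout (one value per pixel per date), the labels, the scikit-learn SVC estimator, and the plain k-fold cv=3 (the paper uses spatial cross-validation) are all illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.svm import SVC

n_pixels, n_dates = 300, 12
X = np.random.rand(n_pixels, n_dates)  # e.g., one vegetation-index value per date
y = np.random.randint(0, 3, n_pixels)  # toy tree-species labels

# Forward selection: greedily add the dates that most improve CV accuracy.
sfs = SequentialFeatureSelector(SVC(kernel="rbf"), n_features_to_select=4,
                                direction="forward", cv=3)
sfs.fit(X, y)
optimal_dates = np.flatnonzero(sfs.get_support())  # indices of retained dates
```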
APA, Harvard, Vancouver, ISO, and other styles
40

Liu, Xuguang. "A Real-Time Detection Method for Abnormal Data of Internet of Things Sensors Based on Mobile Edge Computing." Mathematical Problems in Engineering 2021 (February 28, 2021): 1–7. http://dx.doi.org/10.1155/2021/6655346.

Full text
Abstract:
Aiming at the anomaly detection problem in sensor data, traditional algorithms usually focus only on the continuity of single-source data and ignore the spatiotemporal correlation between multisource data, which reduces detection accuracy to a certain extent. Besides, due to the rapid growth of sensor data, centralized cloud computing platforms cannot meet the real-time detection needs of large-scale abnormal data. To solve this problem, a real-time detection method for abnormal data of IoT sensors based on edge computing is proposed. Firstly, sensor data are represented as time series; the K-nearest neighbor (KNN) algorithm is then used to detect outliers and isolated groups in the data stream. Secondly, an improved DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is proposed that considers the spatiotemporal correlation between multisource data. Its parameters can be set according to sample characteristics within the window, overcoming the slow convergence of using global parameters on large samples, and it makes full use of data correlations to complete anomaly detection. Moreover, this paper proposes a distributed anomaly detection model for sensor data based on edge computing; it performs data processing on computing resources as close to the data source as possible, which improves the overall efficiency of data processing. Finally, simulation results show that the proposed method has higher computational efficiency and detection accuracy than traditional methods and is feasible in practice.
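A minimal sketch of the two-stage idea above: a KNN distance score flags point outliers, then DBSCAN groups the remaining stream by spatiotemporal proximity. The window construction, k, eps, min_samples, and the 95th-percentile cutoff are assumptions, not the paper's improved-DBSCAN parameterisation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import DBSCAN

readings = np.column_stack([np.linspace(0, 1, 500),   # normalised timestamp
                            np.random.randn(500)])     # sensor value

# Stage 1: distance to the k-th nearest neighbour as an outlier score.
k = 5
dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(readings).kneighbors(readings)
outlier_score = dist[:, -1]                            # column 0 is the point itself
inliers = readings[outlier_score < np.percentile(outlier_score, 95)]

# Stage 2: density clustering of the remaining points; label -1 marks noise.
labels = DBSCAN(eps=0.15, min_samples=10).fit_predict(inliers)
```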
APA, Harvard, Vancouver, ISO, and other styles
41

Hennessy, Lachlan, and James Macnae. "Source-dependent bias of sferics in magnetotelluric responses." GEOPHYSICS 83, no. 3 (May 1, 2018): E161—E171. http://dx.doi.org/10.1190/geo2017-0434.1.

Full text
Abstract:
The predominant signals of audio-frequency magnetotellurics (AMT) are called sferics, and they are generated by global lightning activity. When sferic signals are small or infrequent, measurement noise in electric and magnetic fields causes errors in estimated apparent resistivity and phase curves, leading to large model uncertainty. To reduce bias in apparent resistivity and phase, we use a global propagation model to link sferic signals in time-series AMT data with commercially available lightning source information, including strike time, location, and peak current. We then investigate relationships between lightning strike location, peak current, and the quality of the estimated apparent resistivity and phase curves using the bounded influence remote reference processing code. We use two empirical approaches to preprocessing time-series AMT data before estimation of apparent resistivity and phase: stitching and stacking (averaging). We find that for single-site AMT data, bias can be reduced by processing sferics from the closest and most powerful lightning strikes and omitting the lower-amplitude, signal-deficient segments in between. We hypothesized that bias can be further reduced by stacking sferics, on the assumptions that lightning dipole moments are log-normally distributed whereas the superposed noise is normally distributed. Due to interference between dissimilar sferic waveforms, we tested a hybrid stitching-stacking approach based on clustering sferics with a wavelet-based waveform similarity algorithm. Our results indicate that the best approach to reduce bias was to stitch the closest and highest-amplitude data.
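A toy sketch of the "stacking" idea above: averaging aligned sferic segments so that normally distributed noise cancels while the log-normally scaled signal survives. The waveform, noise level, and segment count are synthetic assumptions.

```python
import numpy as np

t = np.linspace(0, 1, 256)
template = np.exp(-40 * (t - 0.3) ** 2) * np.sin(60 * t)  # toy sferic waveform
segments = np.array([template * np.random.lognormal(0, 0.3)  # log-normal amplitude
                     + 0.5 * np.random.randn(256)             # Gaussian noise
                     for _ in range(50)])
stacked = segments.mean(axis=0)  # noise amplitude shrinks by ~1/sqrt(50)
```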
APA, Harvard, Vancouver, ISO, and other styles
42

Wang, Wenxuan, Xiaodong Zhu, Yanli Wang, and Bing Wu. "Route Identification Method for On-Ramp Traffic at Adjacent Intersections of Expressway Entrance." Journal of Advanced Transportation 2019 (December 4, 2019): 1–15. http://dx.doi.org/10.1155/2019/6960193.

Full text
Abstract:
To determine the control strategy at intersections adjacent to the expressway on-ramp, a route identification method based on empirical mode decomposition (EMD) and dynamic time warping (DTW) is established. First, the de-noising capability of the EMD method is applied to eliminate disturbances and extract the features and trends of the traffic data. Then, DTW is used to measure the similarity of traffic volume time series between intersection approaches and the expressway on-ramp. Next, a three-dimensional feature vector is built for every intersection approach traffic flow, comprising the DTW distance, the spatial distance between on-ramp and intersection approach, and the intersection traffic volume. The fuzzy c-means clustering method is employed to group intersection approaches into classes and identify the critical routes carrying the most traffic to the on-ramp. The traffic data are collected by inductive loops at the Xujiahui on-ramp of the North and South Viaduct Expressway and surrounding intersections in Shanghai, China. The results show that the proposed method can classify routes among intersections for different time periods of the day, and that the clustering result is significantly influenced by the three dimensions of the traffic flow feature vector. As an illustrative example, micro-simulation models are built with different control strategies. The simulation shows that coordinated control of the critical routes identified by the proposed method performs better than coordinated control of arterial roads. The conclusions demonstrate that the proposed route identification method can provide a theoretical basis for the coordinated control of traffic signals among intersections and the on-ramp.
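A self-contained sketch of the DTW distance used above to compare on-ramp and approach volume series; the EMD de-noising and fuzzy c-means stages are omitted, and the toy volume series are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of match, insertion, deletion
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

ramp = np.array([10, 14, 20, 28, 30, 26])      # toy on-ramp volumes
approach = np.array([8, 12, 22, 27, 29, 25])   # toy approach volumes
print(dtw_distance(ramp, approach))            # small value = similar profiles
```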
APA, Harvard, Vancouver, ISO, and other styles
43

Zhou, Tong, Xintao Liu, Zhen Qian, Haoxuan Chen, and Fei Tao. "Dynamic Update and Monitoring of AOI Entrance via Spatiotemporal Clustering of Drop-Off Points." Sustainability 11, no. 23 (December 3, 2019): 6870. http://dx.doi.org/10.3390/su11236870.

Full text
Abstract:
This paper proposes a novel method for dynamically extracting and monitoring the entrances of areas of interest (AOIs). Most AOIs in China, such as buildings and communities, are enclosed by walls and are only accessible via one or more entrances. The entrances are not accurately marked on most maps for route planning and navigation. In this work, the extraction scheme for the entrances is based on taxi trajectory data with a 30 s sampling interval. After fine-grained data cleaning, the positional accuracy of the drop-off points extracted from the taxi trajectory data is guaranteed. Next, the location of the entrances is extracted by combining density-based spatial clustering of applications with noise (DBSCAN) with the boundary of the AOI under the constraint of the road network. Based on the above processing, a dynamic update scheme for the entrances is designed. First, a time series analysis is conducted using the clusters of drop-off points within the adjacent AOI, and then a relative heat index (RHI) is applied to detect the recent access status (closed or open) of the entrances. The results show that the average accuracy of the current extraction algorithm is improved by 24.3% over the K-means algorithm, and the RHI reduces the limitation of map symbols in describing the access status. The proposed scheme can therefore help optimize the dynamic visualization of entry symbols in mobile navigation maps and facilitate human travel behavior and way-finding, which is of great help to sustainable urban development.
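A hedged sketch of clustering drop-off points with DBSCAN under a haversine metric, as in the entrance-extraction step above; the coordinates, eps (here roughly 50 m), and min_samples are illustrative assumptions, and the road-network constraint is not reproduced.

```python
import numpy as np
from sklearn.cluster import DBSCAN

dropoffs_deg = np.array([[31.2304, 121.4737],   # (lat, lon) of drop-off points
                         [31.2305, 121.4738],
                         [31.2306, 121.4736],
                         [31.2400, 121.4800]])
EARTH_RADIUS_M = 6_371_000
# haversine metric expects radians; eps is expressed as metres / Earth radius.
labels = DBSCAN(eps=50 / EARTH_RADIUS_M, min_samples=3,
                metric="haversine").fit_predict(np.radians(dropoffs_deg))
# labels >= 0 are candidate entrance clusters; -1 is noise.
```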
APA, Harvard, Vancouver, ISO, and other styles
44

Li, Jin, Xian Zhang, Jinzhe Gong, Jingtian Tang, Zhengyong Ren, Guang Li, Yanli Deng, and Jin Cai. "Signal-Noise Identification of Magnetotelluric Signals Using Fractal-Entropy and Clustering Algorithm for Targeted De-Noising." Fractals 26, no. 02 (April 2018): 1840011. http://dx.doi.org/10.1142/s0218348x1840011x.

Full text
Abstract:
A new technique is proposed for signal-noise identification and targeted de-noising of magnetotelluric (MT) signals. The method is based on fractal-entropy features and a clustering algorithm, which automatically identify signal sections corrupted by common interference (square, triangle and pulse waves), enabling targeted de-noising and preventing the loss of useful information in filtering. To implement the technique, four characteristic parameters — fractal box dimension (FBD), Higuchi fractal dimension (HFD), fuzzy entropy (FuEn) and approximate entropy (ApEn) — are extracted from the MT time series. The fuzzy c-means (FCM) clustering technique is used to analyze the characteristic parameters and automatically distinguish signals with strong interference from the rest. The wavelet threshold (WT) de-noising method is used only to suppress the identified strong interference in the selected signal sections. The technique is validated on signal samples with known interference before being applied to a set of field-measured MT/audio-magnetotelluric (AMT) data. Compared with the conventional de-noising strategy that blindly applies the filter to the overall dataset, the proposed method can automatically identify and purposefully suppress intermittent interference in the MT/AMT signal. The resulting apparent resistivity-phase curve is more continuous and smooth, and the slow-change trend in the low-frequency range is more precisely preserved. Moreover, the characteristics of the target-filtered MT/AMT signal are close to the essential characteristics of the natural field, and the result more accurately reflects the inherent electrical structure of the measured site.
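A sketch of the Higuchi fractal dimension (HFD), one of the four characteristic parameters listed above; kmax and the test signal are assumptions. FBD, FuEn, and ApEn would be computed analogously per segment before feeding FCM.

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Estimate HFD as the slope of log(curve length) vs log(1/k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)          # sub-series x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            # normalised length of the sub-curve starting at offset m
            lm = np.abs(np.diff(x[idx])).sum() * (n - 1) / (len(idx) - 1) / k / k
            lengths.append(lm)
        lk.append(np.mean(lengths))
    ks = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lk), 1)
    return slope

print(higuchi_fd(np.random.randn(1024)))  # white noise gives HFD near 2
```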
APA, Harvard, Vancouver, ISO, and other styles
45

Du, Congju, and Bin Tang. "Novel Unconventional-Active-Jamming Recognition Method for Wideband Radars Based on Visibility Graphs." Sensors 19, no. 10 (May 21, 2019): 2344. http://dx.doi.org/10.3390/s19102344.

Full text
Abstract:
Radar unconventional active jamming, including unconventional deceptive jamming and barrage jamming, poses a serious threat to wideband radars. This paper proposes an unconventional-active-jamming recognition method for wideband radar. In this method, the visibility algorithm, which converts radar time series into graphs called visibility graphs, is first given. Then, the visibility graph of the linear-frequency-modulation (LFM) signal is proved to be a regular graph, which theoretically justifies extracting features from visibility graphs. Accordingly, four features — average degree, average clustering coefficient, Newman assortativity coefficient, and normalized network-structure entropy — are extracted from the visibility graphs. Finally, a random-forests (RF) classifier is chosen for unconventional-active-jamming recognition. Experimental results show that the recognition probability was over 90% when the jamming-to-noise ratio (JNR) was above 0 dB.
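A minimal sketch of the natural visibility graph construction described above; graph features such as the average degree can then be computed on the adjacency matrix. The test series is a synthetic assumption.

```python
import numpy as np

def visibility_graph(x):
    """Adjacency matrix: i~j iff every bar between them stays below the sight line."""
    n = len(x)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            ks = np.arange(i + 1, j)
            # height of the straight sight line from (i, x[i]) to (j, x[j]) at each k
            line = x[j] + (x[i] - x[j]) * (j - ks) / (j - i)
            if np.all(x[ks] < line):
                adj[i, j] = adj[j, i] = True
    return adj

series = np.abs(np.random.randn(64))  # stand-in for a radar time series
avg_degree = visibility_graph(series).sum() / len(series)
```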
APA, Harvard, Vancouver, ISO, and other styles
46

Carta, Salvatore, Sergio Consoli, Luca Piras, Alessandro Sebastian Podda, and Diego Reforgiato Recupero. "Event detection in finance using hierarchical clustering algorithms on news and tweets." PeerJ Computer Science 7 (May 10, 2021): e438. http://dx.doi.org/10.7717/peerj-cs.438.

Full text
Abstract:
In the current age of overwhelming information and massive production of textual data on the Web, Event Detection has become an increasingly important task in various application domains. Several research branches have been developed to tackle the problem from different perspectives, including Natural Language Processing and Big Data analysis, with the goal of providing valuable resources to support decision-making in a wide variety of fields. In this paper, we propose a real-time domain-specific clustering-based event-detection approach that integrates textual information coming, on one hand, from traditional newswires and, on the other hand, from microblogging platforms. The goal of the implemented pipeline is twofold: (i) providing insights to the user about the relevant events that are reported in the press on a daily basis; (ii) alerting the user about potentially important and impactful events, referred to as hot events, for some specific tasks or domains of interest. The algorithm identifies clusters of related news stories published by globally renowned press sources, which guarantee authoritative, noise-free information about current affairs; subsequently, the content extracted from microblogs is associated to the clusters in order to gain an assessment of the relevance of the event in the public opinion. To identify the events of a day d, we create the lexicon by looking at news articles and stock data of previous days up to d−1. Although the approach can be extended to a variety of domains (e.g., politics, economy, sports), we hereby present a specific implementation in the financial sector. We validated our solution through a qualitative and quantitative evaluation, performed on the Dow Jones' Data, News and Analytics dataset, on a stream of messages extracted from the microblogging platform Stocktwits, and on the Standard & Poor's 500 index time series. The experiments demonstrate the effectiveness of our proposal in extracting meaningful information from real-world events and in spotting hot events in the financial sphere. An added value of the evaluation is given by the visual inspection of a selected number of significant real-world events, from the Brexit Referendum to the outbreak of the Covid-19 pandemic in early 2020.
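A hedged sketch of grouping related headlines by hierarchical clustering of TF-IDF vectors, in the spirit of the pipeline above; the headlines, the cosine metric, the average linkage, and the cut threshold are illustrative assumptions, not the paper's configuration.

```python
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from sklearn.feature_extraction.text import TfidfVectorizer

headlines = ["Central bank raises rates",
             "Rates lifted by the central bank",
             "Tech giant unveils new phone",
             "New smartphone launched by tech giant"]
vectors = TfidfVectorizer().fit_transform(headlines).toarray()
z = linkage(pdist(vectors, metric="cosine"), method="average")
clusters = fcluster(z, t=0.7, criterion="distance")  # same label = same event
```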
APA, Harvard, Vancouver, ISO, and other styles
47

Musin, Artur R. "Economic-mathematical model for predicting financial market dynamics." Statistics and Economics 15, no. 4 (September 4, 2018): 61–69. http://dx.doi.org/10.21686/2500-3925-2018-4-61-69.

Full text
Abstract:
Study purpose. Existing approaches to forecasting the dynamics of financial markets generally reduce to econometric calculations or technical analysis techniques, reflecting the preferences of specialists engaged in theoretical research and of professional market participants, respectively. The main purpose of this study is to develop a predictive economic-mathematical model that combines both approaches. In other words, the model should be estimable with traditional econometric methods while accounting for the effect on pricing of participants clustering around behavioural patterns, the basis of technical analysis. In addition, the model should take into account the phenomenon of historical trading levels and control for the influence they exert on price dynamics when the price enters local areas around these levels. Analysing price behaviour near recurring historical levels is a popular approach among professional market participants. An important criterion for the model's applicability by a wide range of interested specialists is the simplicity of its general functional form and, in particular, of its components. Materials and methods. The exchange rate of the pound sterling against the US dollar (GBP/USD) over the whole of 2017 was chosen as the financial series to forecast. The presented economic-mathematical model was estimated by a classical Kalman filter with an embedded neural network. These estimation tools were chosen for their capabilities in dealing with non-stationary, noisy financial market time series. In addition, the Kalman filter is a popular technique for estimating local-level models, whose principle is implemented in the model proposed in this article. Results. Simultaneous application of the Kalman filter and an artificial neural network yielded statistically significant estimates of all model coefficients. Applying the model to GBP/USD series from the test dataset demonstrated its high predictive ability compared with a random walk model, in particular in the percentage of correctly forecast directions. The results confirm that the constructed model effectively captures the structural features of the market and produces good forecasts of future price dynamics. Conclusion. The study focused on developing and improving the apparatus for forecasting financial market price dynamics. The economic-mathematical model presented in this paper can be used both by specialists carrying out theoretical studies of pricing in financial markets and by professional market participants forecasting the direction of future price movements. The high percentage of correctly forecast directions makes it possible to use the proposed model independently or as a confirmatory tool.
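For reference, a sketch of the plain local-level Kalman filter idea mentioned above; the paper's embedded neural-network component is not reproduced, and the noise variances and toy price path are assumptions.

```python
import numpy as np

def local_level_filter(y, q=1e-5, r=1e-2):
    """Filtered level for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t."""
    mu, p = y[0], 1.0                # initial state estimate and variance
    out = np.empty(len(y))
    for t, obs in enumerate(y):
        p = p + q                    # predict: state variance grows by q
        k = p / (p + r)              # Kalman gain
        mu = mu + k * (obs - mu)     # update with the new observation
        p = (1.0 - k) * p
        out[t] = mu
    return out

prices = np.cumsum(np.random.randn(500) * 0.01) + 1.30  # toy GBP/USD-like path
level = local_level_filter(prices)
```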
APA, Harvard, Vancouver, ISO, and other styles
48

Mele, Annalisa, Autilia Vitiello, Manuela Bonano, Andrea Miano, Riccardo Lanari, Giovanni Acampora, and Andrea Prota. "On the Joint Exploitation of Satellite DInSAR Measurements and DBSCAN-Based Techniques for Preliminary Identification and Ranking of Critical Constructions in a Built Environment." Remote Sensing 14, no. 8 (April 13, 2022): 1872. http://dx.doi.org/10.3390/rs14081872.

Full text
Abstract:
The need for widespread structural safety checks stimulates research into advanced techniques for structural monitoring at the scale of single constructions or wide areas. In this work, a strategy to preliminarily identify and rank possible critical constructions in a built environment is presented, based on the joint exploitation of satellite radar remote sensing measurements and artificial intelligence (AI) techniques. The satellite measurements are represented by the displacement time series obtained through the Differential Synthetic Aperture Radar Interferometry (DInSAR) technique known as the full resolution Small BAseline Subset (SBAS) approach, while the exploited AI technique is the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) methodology. The DBSCAN technique is applied to the SBAS-DInSAR products relevant to the achieved Persistent Scatterers (PSs) to identify clusters of pixels corresponding to buildings within the investigated area. The analysis of the deformation evolution of each building cluster is performed in terms of velocity rates and statistics on the DInSAR measurements. Synthetic deformation maps of the areas are then retrieved to identify critical buildings. The proposed methodology is applied to three areas within the city of Rome (Italy), imaged by the COSMO-SkyMed SAR satellite constellation from ascending and descending orbits (in the time interval 2011–2019). Starting from the DInSAR measurements, the DBSCAN algorithm provides the automatic clustering of buildings within the three selected areas. Exploiting the derived deformation maps of each study area, a preliminary identification and ranking of critical buildings is achieved, thus confirming the validity of the proposed approach.
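A hedged sketch of the cluster-then-rank idea above: DBSCAN groups persistent-scatterer pixels into building clusters, which are then ranked by mean deformation velocity. The coordinates, velocities, eps, and min_samples are synthetic assumptions, not SBAS-DInSAR products.

```python
import numpy as np
from sklearn.cluster import DBSCAN

coords = np.random.rand(500, 2) * 100   # toy PS positions in metres
vel = np.random.randn(500) * 2          # toy line-of-sight velocities, mm/yr
labels = DBSCAN(eps=5, min_samples=10).fit_predict(coords)

# Rank clusters (candidate buildings) by mean absolute velocity, worst first.
ranking = sorted({c: np.abs(vel[labels == c]).mean()
                  for c in set(labels) if c != -1}.items(),
                 key=lambda kv: -kv[1])
```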
APA, Harvard, Vancouver, ISO, and other styles
49

Zhang, Yongjun, and Guangheng Gao. "Optimization and Evaluation of an Intelligent Short-Term Blood Glucose Prediction Model Based on Noninvasive Monitoring and Deep Learning Techniques." Journal of Healthcare Engineering 2022 (April 11, 2022): 1–16. http://dx.doi.org/10.1155/2022/8956850.

Full text
Abstract:
Continuous noninvasive blood glucose monitoring and estimation management using photoplethysmography (PPG) technology faces a series of problems, such as substantial time variability, inaccuracy, and complex nonlinearity. This paper proposes a blood glucose (BG) prediction model for more precise prediction, based on BG series decomposition by complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and a gated recurrent unit (GRU) optimized by improved bacterial foraging optimization (IBFO). Hierarchical clustering recombines the decomposed BG series according to their sample entropy and their correlations with the original BG trends. Dynamic BG trends are regressed separately for each recombined BG series by the GRU model, whose structure and superparameters are optimized by IBFO, to realize more precise estimation. In experiments, optimized and basic LSTM, RNN, and support vector regression (SVR) models are compared to evaluate the performance of the proposed model. The experimental results indicate that the root mean square error (RMSE) and mean absolute percentage error (MAPE) of the 15-min IBFO-GRU prediction improve on average by about 13.1% and 18.4%, respectively, over the RNN and LSTM optimized by IBFO. Meanwhile, the proposed model improved the Clarke error grid results by about 2.6% and 5.0% compared with the IBFO-LSTM and IBFO-RNN in 30-min prediction, and by 4.1% and 6.6% in the 15-min-ahead forecast, respectively. The proposed CEEMDAN-IBFO-GRU model has high accuracy and adaptability and can effectively support early intervention to control the occurrence of hyperglycemic complications.
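A sketch of the sample entropy used above to group decomposed components before the GRU regression; m = 2 and r = 0.2·std follow common conventions, and this compact matrix formulation is an approximation of the textbook definition, not the paper's code.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) with r = r_factor * std(x) and Chebyshev distance."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()

    def match_count(mm):
        # all length-mm templates and their pairwise Chebyshev distances
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm + 1)])
        d = np.abs(templates[:, None, :] - templates[None, :, :]).max(axis=2)
        return (d <= r).sum() - len(templates)  # exclude self-matches

    b, a = match_count(m), match_count(m + 1)
    return -np.log(a / b)

imf = np.random.randn(300)   # stand-in for one decomposed BG component
print(sample_entropy(imf))   # lower values = more regular component
```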
APA, Harvard, Vancouver, ISO, and other styles
50

Jimoh, Biliaminu, Radu Mariescu-Istodor, and Pasi Fränti. "Is Medoid Suitable for Averaging GPS Trajectories?" ISPRS International Journal of Geo-Information 11, no. 2 (February 14, 2022): 133. http://dx.doi.org/10.3390/ijgi11020133.

Full text
Abstract:
Averaging GPS trajectories is needed in applications such as clustering and the automatic extraction of road segments. Calculating the mean of trajectories and other time series data is non-trivial and has been shown to be an NP-hard problem. The medoid has therefore been widely used as a practical alternative, partly because of its (assumed) better noise tolerance. In this paper, we study the usefulness of the medoid for solving the averaging problem with ten different trajectory similarity/distance measures. Our results show that the accuracy of the medoid depends mainly on the sample size. Compared to other averaging methods, its performance deteriorates especially when there are only a few samples from which the medoid must be selected. Another weakness is that the medoid inherits properties, such as the sampling frequency, of the arbitrarily selected sample. The choice of the trajectory distance function turns out to be less significant. For practical applications, averaging methods other than the medoid seem a better alternative for higher accuracy.
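A minimal sketch of medoid selection: the sample minimising the total distance to all others. Any trajectory distance could be plugged in; the Euclidean norm on equal-length toy trajectories used here is an assumption for brevity.

```python
import numpy as np

def medoid(items, dist):
    """Return the element with the smallest sum of distances to all others."""
    n = len(items)
    totals = [sum(dist(items[i], items[j]) for j in range(n)) for i in range(n)]
    return items[int(np.argmin(totals))]

# Toy trajectories: 20 points each, slightly shifted copies of one another.
trajs = [np.random.rand(20, 2) + k * 0.01 for k in range(5)]
rep = medoid(trajs, lambda a, b: np.linalg.norm(a - b))
```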
APA, Harvard, Vancouver, ISO, and other styles