Academic literature on the topic 'Datasety'


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Datasety.'


Journal articles on the topic "Datasety"

1

Almeida, Daniela, Dany Domínguez-Pérez, Ana Matos, Guillermin Agüero-Chapin, Yuselis Castaño, Vitor Vasconcelos, Alexandre Campos, and Agostinho Antunes. "Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus." Data 5, no. 4 (November 27, 2020): 110. http://dx.doi.org/10.3390/data5040110.

Abstract:
Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production.
2

Haider, S. A., and N. S. Patil. "Minimization of Datasets: Using a Master Interlinked Dataset." Indian Journal of Computer Science 3, no. 5 (October 1, 2018): 20. http://dx.doi.org/10.17010/ijcs/2018/v3/i5/138778.

3

Feng, Eric, and Xijin Ge. "DataViz: visualization of high-dimensional data in virtual reality." F1000Research 7 (October 23, 2018): 1687. http://dx.doi.org/10.12688/f1000research.16453.1.

Abstract:
Virtual reality (VR) simulations promote interactivity and immersion, and provide an opportunity that may help researchers gain insights from complex datasets. To explore the utility and potential of VR in graphically rendering large datasets, we have developed an application for immersive, 3-dimensional (3D) scatter plots. Developed using the Unity development environment, DataViz enables the visualization of high-dimensional data with the HTC Vive, a relatively inexpensive and modern virtual reality headset available to the general public. DataViz has the following features: (1) principal component analysis (PCA) of the dataset; (2) graphical rendering of said dataset’s 3D projection onto its first three principal components; and (3) intuitive controls and instructions for using the application. As a use case, we applied DataViz to visualize a single-cell RNA-Seq dataset. DataViz can help gain insights from complex datasets by enabling interaction with high-dimensional data.
4

Chang, Nai Chen, Elissa Aminoff, John Pyles, Michael Tarr, and Abhinav Gupta. "Scaling Up Neural Datasets: A public fMRI dataset of 5000 scenes." Journal of Vision 18, no. 10 (September 1, 2018): 732. http://dx.doi.org/10.1167/18.10.732.

5

Zhang, Yulian, and Shigeyuki Hamori. "Forecasting Crude Oil Market Crashes Using Machine Learning Technologies." Energies 13, no. 10 (May 13, 2020): 2440. http://dx.doi.org/10.3390/en13102440.

Abstract:
To the best of our knowledge, this study provides new insight into the forecasting of crude oil futures price crashes in America, employing a moving window. One is the fixed-length window and the other is the expanding-length window, which has never been reported in the past. We aimed to investigate if there is any difference when historical data are discarded. As the explanatory variables, we adapted 13 variables to obtain two datasets, 16 explanatory variables for Dataset1 and 121 explanatory variables for Dataset2. We try to observe results from the different-sized sets of explanatory variables. Specifically, we leverage the merits of a series of machine learning techniques, which include random forests, logistic regression, support vector machines, and extreme gradient boosting (XGBoost). Finally, we employ the evaluation metrics that are broadly used to assess the discriminatory power of imbalanced datasets. Our results indicate that we should occasionally discard distant historical data, and that XGBoost outperforms the other employed approaches, achieving a detection rate as high as 86% using the fixed-length moving window for Dataset2.
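The difference between the two window schemes mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `moving_windows`, its parameters, and the one-step-ahead setup are assumptions made for the example.

```python
from typing import Iterator, Tuple


def moving_windows(
    n_obs: int, window: int, expanding: bool = False
) -> Iterator[Tuple[range, int]]:
    """Yield (training_indices, test_index) pairs for one-step-ahead forecasts.

    Hypothetical helper: a fixed-length window discards the oldest
    observations, while an expanding window keeps all history.
    """
    if not 0 < window < n_obs:
        raise ValueError("window must be between 1 and n_obs - 1")
    for test_idx in range(window, n_obs):
        # Expanding: always start at 0. Fixed: keep only the latest `window` points.
        start = 0 if expanding else test_idx - window
        yield range(start, test_idx), test_idx


fixed = [(list(train), test) for train, test in moving_windows(6, 3)]
expanding = [(list(train), test) for train, test in moving_windows(6, 3, expanding=True)]
```

With six observations and a window of three, the fixed scheme always trains on exactly three points, whereas the expanding scheme trains on three, then four, then five.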
6

Wang, Juan, Zhibin Zhang, and Yanjuan Li. "Constructing Phylogenetic Networks Based on the Isomorphism of Datasets." BioMed Research International 2016 (2016): 1–7. http://dx.doi.org/10.1155/2016/4236858.

Abstract:
Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. So far, many methods have been presented in this area, of which the most efficient are based on the incompatible graph, such as CASS, LNETWORK, and BIMLR. This paper studies what the methods based on the incompatible graph have in common, the relationship between the incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find all the simplest datasets for a topology G and construct a network for every dataset. For any dataset C, we can compute a network from the network representing the simplest dataset that is isomorphic to C. This process saves time for the algorithms when constructing networks.
7

Xie, Yanqing, Zhengqiang Li, Weizhen Hou, Jie Guang, Yan Ma, Yuyang Wang, Siheng Wang, and Dong Yang. "Validation of FY-3D MERSI-2 Precipitable Water Vapor (PWV) Datasets Using Ground-Based PWV Data from AERONET." Remote Sensing 13, no. 16 (August 16, 2021): 3246. http://dx.doi.org/10.3390/rs13163246.

Abstract:
The medium resolution spectral imager-2 (MERSI-2) is one of the most important sensors onboard China’s latest polar-orbiting meteorological satellite, Fengyun-3D (FY-3D). The National Satellite Meteorological Center of China Meteorological Administration has developed four precipitable water vapor (PWV) datasets using five near-infrared bands of MERSI-2, including the P905 dataset, P936 dataset, P940 dataset and the fusion dataset of the above three datasets. For the convenience of users, we comprehensively evaluate the quality of these PWV datasets with the ground-based PWV data derived from Aerosol Robotic Network. The validation results show that the P905, P936 and fused PWV datasets have relatively large systematic errors (−0.10, −0.11 and −0.07 g/cm²), whereas the systematic error of the P940 dataset (−0.02 g/cm²) is very small. According to the overall accuracy of these four PWV datasets by our assessments, they can be ranked in descending order as P940 dataset, fused dataset, P936 dataset and P905 dataset. The root mean square error (RMSE), relative error (RE) and percentage of retrieval results with error within ±(0.05 + 0.10 × PWV_AERONET) (PER10) of the P940 PWV dataset are 0.24 g/cm², 0.10 and 76.36%, respectively. The RMSE, RE and PER10 of the P905 PWV dataset are 0.38 g/cm², 0.15 and 57.72%, respectively. In order to obtain a clearer understanding of the accuracy of these four MERSI-2 PWV datasets, we compare the accuracy of these four MERSI-2 PWV datasets with that of the widely used MODIS PWV dataset and AIRS PWV dataset. The results of the comparison show that the accuracy of the MODIS PWV dataset is not as good as that of all four MERSI-2 PWV datasets, due to the serious overestimation of the MODIS PWV dataset (0.40 g/cm²), and the accuracy of the AIRS PWV dataset is worse than that of the P940 and fused MERSI-2 PWV datasets. In addition, we analyze the error distribution of the four PWV datasets in different locations, seasons and water vapor content. Finally, the reason why the fused PWV dataset is not the one with the highest accuracy among the four PWV datasets is discussed.
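The validation metrics named in this abstract can be illustrated with a short sketch. It is not the authors' implementation: the function name `pwv_scores` and the sample values are made up for the example, and treating the systematic error as the mean of (retrieved − reference) is an assumption. RE is left out because the abstract does not define it.

```python
import math
from typing import Dict, Sequence


def pwv_scores(retrieved: Sequence[float], reference: Sequence[float]) -> Dict[str, float]:
    """Compare retrieved PWV against ground-based reference PWV (both in g/cm^2)."""
    if len(retrieved) != len(reference) or not retrieved:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [r - g for r, g in zip(retrieved, reference)]
    n = len(errors)
    # Assumption: systematic error is taken as the mean signed error (bias).
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # PER10: share of retrievals whose error lies within +/-(0.05 + 0.10 * reference).
    within = sum(abs(e) <= 0.05 + 0.10 * g for e, g in zip(errors, reference))
    return {"bias": bias, "rmse": rmse, "per10": 100.0 * within / n}


# Made-up sample values, for illustration only.
scores = pwv_scores(retrieved=[1.0, 2.5, 0.45], reference=[1.1, 2.0, 0.5])
```

In this toy sample, two of the three retrievals fall inside the tolerance band, so PER10 comes out at roughly 66.7%.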
8

Bahrami, Mostafa, Hossein Javadikia, and Ebrahim Ebrahimi. "APPLICATION OF PATTERN RECOGNITION TECHNIQUES FOR FAULT DETECTION OF CLUTCH RETAINER OF TRACTOR." Journal of Mechanical Engineering 47, no. 1 (May 1, 2018): 31–36. http://dx.doi.org/10.3329/jme.v47i1.35356.

Abstract:
This study develops a pattern recognition technique, based on a neural network, for fault diagnosis of the clutch retainer mechanism of the MF285 tractor. In this technique, time-domain features and frequency-domain features, consisting of the Fast Fourier Transform (FFT) phase angle and Power Spectral Density (PSD), are proposed to improve diagnostic ability. Three cases were used for signal processing: normal condition, bearing wear, and shaft wear. The data are divided into two parts: 70% form dataset1 and the remaining 30% form dataset2. First, the artificial neural networks (ANN) are trained on 60% of dataset1, validated on 20% of dataset1, and tested on 20% of dataset1. Then, to test the proposed model further, the network is simulated using dataset2. The results indicate that pattern recognition networks can accurately diagnose various faults of the MF285 tractor clutch retainer mechanism.
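The two-level partition described in this abstract (70/30 into dataset1 and dataset2, then 60/20/20 within dataset1) can be sketched as below. This is only an illustration: the function name `split_dataset`, the seeded shuffle, and the use of integer indices as stand-in samples are assumptions, and the abstract does not say how the authors actually assigned samples.

```python
import random
from typing import Dict, Iterable, List


def split_dataset(samples: Iterable[int], seed: int = 0) -> Dict[str, List[int]]:
    """Split samples 70/30, then split the 70% part 60/20/20."""
    shuffled = list(samples)
    # Assumption: samples are shuffled before splitting; the seed keeps it repeatable.
    random.Random(seed).shuffle(shuffled)

    cut = len(shuffled) * 7 // 10  # 70% -> dataset1, remaining 30% -> dataset2
    dataset1, dataset2 = shuffled[:cut], shuffled[cut:]

    n_train = len(dataset1) * 6 // 10  # 60% of dataset1 for training
    n_val = len(dataset1) * 2 // 10    # 20% of dataset1 for validation
    return {
        "train": dataset1[:n_train],
        "validation": dataset1[n_train:n_train + n_val],
        "test": dataset1[n_train + n_val:],  # remaining ~20% of dataset1
        "holdout": dataset2,                 # used only for the final simulation
    }


parts = split_dataset(range(100))
sizes = {name: len(items) for name, items in parts.items()}
```

For 100 samples this gives 42 training, 14 validation, 14 test, and 30 held-out samples, with every sample appearing in exactly one part.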
9

Bogaardt, Laurens, Romulo Goncalves, Raul Zurita-Milla, and Emma Izquierdo-Verdiguier. "Dataset Reduction Techniques to Speed Up SVD Analyses on Big Geo-Datasets." ISPRS International Journal of Geo-Information 8, no. 2 (January 26, 2019): 55. http://dx.doi.org/10.3390/ijgi8020055.

Abstract:
The Singular Value Decomposition (SVD) is a mathematical procedure with multiple applications in the geosciences. For instance, it is used in dimensionality reduction and as a support operator for various analytical tasks applicable to spatio-temporal data. Performing SVD analyses on large datasets, however, can be computationally costly, time consuming, and sometimes practically infeasible. However, techniques exist to arrive at the same output, or at a close approximation, which requires far less effort. This article examines several such techniques in relation to the inherent scale of the structure within the data. When the values of a dataset vary slowly, e.g., in a spatial field of temperature over a country, there is autocorrelation and the field contains large scale structure. Datasets do not need a high resolution to describe such fields and their analysis can benefit from alternative SVD techniques based on rank deficiency, coarsening, or matrix factorization approaches. We use both simulated Gaussian Random Fields with various levels of autocorrelation and real-world geospatial datasets to illustrate our study while examining the accuracy of various SVD techniques. As the main result, this article provides researchers with a decision tree indicating which technique to use when and predicting the resulting level of accuracy based on the dataset’s structure scale.
10

Yu, Ellen, Aparna Bhaskaran, Shang-Lin Chen, Zachary E. Ross, Egill Hauksson, and Robert W. Clayton. "Southern California Earthquake Data Now Available in the AWS Cloud." Seismological Research Letters 92, no. 5 (June 16, 2021): 3238–47. http://dx.doi.org/10.1785/0220210039.

Abstract:
The Southern California Earthquake Data Center is hosting its earthquake catalog and seismic waveform archive in the Amazon Web Services (AWS) Open Dataset Program (s3://scedc-pds; us-west-2 region). The cloud dataset’s high data availability and scalability facilitate research that uses large volumes of data and computationally intensive processing. We describe the data archive and our rationale for the formats and data organization. We provide two simple examples to show how storing the data in AWS Simple Storage Service can benefit the analysis of large datasets. We share usage statistics of our data during the first year in the AWS Open Dataset Program. We also discuss the challenges and opportunities of a cloud-hosted archive.

Dissertations / Theses on the topic "Datasety"

1

Zembjaková, Martina. "Prieskum a taxonómia sieťových forenzných nástrojov." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445488.

Abstract:
This master's thesis deals with a survey and taxonomy of network forensic tools. It describes basic information about network forensic analysis, including the process models, techniques, and data sources used in forensic analysis. The thesis then surveys existing taxonomies of network forensic tools, including a comparison of them, followed by a survey of network forensic tools. In addition to the tools mentioned in the survey of taxonomies, the network tools discussed include some further network tools. Next, the thesis describes and compares in detail the datasets that serve as the basis for analysis by the individual network tools. Based on the information obtained from the surveys, common use cases are proposed, and the tools are demonstrated as part of the description of each use case. To demonstrate the tools, newly created datasets are used in addition to publicly available ones; the new datasets are described in detail in a dedicated chapter. Based on the information obtained, a new taxonomy is proposed that is based on the use cases of the tools, unlike other taxonomies, which are based on NFAT and NSM tools, user interface, data capture, analysis, or type of forensic analysis.
2

Kratochvíla, Lukáš. "Trasování objektu v reálném čase." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403748.

Abstract:
Tracking a generic object in real time on a device with limited resources is difficult. Many algorithms addressing this problem already exist, and this thesis reviews them. Various approaches to the problem are discussed, including deep learning. Object representations, datasets, and evaluation metrics are presented. Many tracking algorithms are introduced; eight of them are implemented and evaluated on the VOT dataset.
3

Singh, Manjeet. "A Comparison of Rule Extraction Techniques with Emphasis on Heuristics for Imbalanced Datasets." Ohio University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1282139633.

4

Silva, Jesús, Palma Hugo Hernández, Núñez William Niebles, David Ovallos-Gazabon, and Noel Varela. "Parallel Algorithm for Reduction of Data Processing Time in Big Data." Institute of Physics Publishing, 2020. http://hdl.handle.net/10757/652134.

Abstract:
Technological advances have allowed to collect and store large volumes of data over the years. Besides, it is significant that today's applications have high performance and can analyze these large datasets effectively. Today, it remains a challenge for data mining to make its algorithms and applications equally efficient in the need of increasing data size and dimensionality [1]. To achieve this goal, many applications rely on parallelism, because it is an area that allows the reduction of cost depending on the execution time of the algorithms because it takes advantage of the characteristics of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that can be processed within each of the processing threads, synchronously and independently.
5

Munyombwe, Theresa. "The harmonisation of stroke datasets : a case study of four UK datasets." Thesis, University of Leeds, 2016. http://etheses.whiterose.ac.uk/13511/.

Abstract:
Longitudinal studies of stroke patients play a critical part in developing stroke prognostic models. Stroke longitudinal studies are often limited by small sample sizes, poor recruitment, and high attrition levels. Some of these limitations can be addressed by harmonising and pooling data from existing studies. Thus this thesis evaluated the feasibility of harmonising and pooling secondary stroke datasets to investigate the factors associated with disability after stroke. Data from the Clinical Information Management System for Stroke study (n=312), Stroke Outcome Study 1 (n=448), Stroke Outcome Study 2 (n=585), and the Leeds Sentinel Stroke National Audit (n=350) were used in this research. The research conducted in this thesis consisted of four stages. The first stage used the Data Schema and Harmonisation Platform for Epidemiological Research (DataSHaPER) approach to evaluate the feasibility of harmonising and pooling the four datasets that were used in this case study. The second stage evaluated the utility of using multi-group confirmatory factor analysis for testing measurement invariance of the GHQ-28 measure prior to pooling the datasets. The third stage evaluated the utility of using Item Response Theory (IRT) models and regression-based methods for linking disability outcome measures. The last stage synthesised the harmonised datasets using multi-group latent class analysis and multi-level Poisson models to investigate the factors associated with disability post-stroke. The main barrier encountered in pooling the four datasets was the heterogeneity in outcome measures. Pooling datasets was beneficial but there was a trade-off between increasing the sample size and losing important covariates. The findings from this present study suggested that the GHQ-28 measure was invariant across the SOS1 and SOS2 stroke cohorts, thus an integrative data analysis of the two SOS datasets was conducted. Harmonising measurement scales using IRT models and regression-based methods was effective for predicting group averages and not individual patient predictions. The analyses of harmonised datasets suggested an association of female gender with anxiety and depressive symptoms post-stroke. This research concludes that harmonising and pooling data from multiple stroke studies was beneficial but there were challenges in measurement comparability. Continued efforts should be made to develop a Data Schema for stroke to facilitate data sharing in stroke rehabilitation research.
6

Furman, Yoel Avraham. "Forecasting with large datasets." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:69f2833b-cc53-457a-8426-37c06df85bc2.

Abstract:
This thesis analyzes estimation methods and testing procedures for handling large data series. The first chapter introduces the use of the adaptive elastic net, and the penalized regression methods nested within it, for estimating sparse vector autoregressions. That chapter shows that under suitable conditions on the data generating process this estimation method satisfies an oracle property. Furthermore, it is shown that the bootstrap can be used to accurately conduct inference on the estimated parameters. These properties are used to show that structural VAR analysis can also be validly conducted, allowing for accurate measures of policy response. The strength of these estimation methods is demonstrated in a numerical study and on U.S. macroeconomic data. The second chapter continues in a similar vein, using the elastic net to estimate sparse vector autoregressions of realized variances to construct volatility forecasts. It is shown that the use of volatility spillovers estimated by the elastic net delivers substantial improvements in forecast ability, and can be used to indicate systemic risk among a group of assets. The model is estimated on realized variances of equities of U.S. financial institutions, where it is shown that the estimated parameters translate into two novel indicators of systemic risk. The third chapter discusses the use of the bootstrap as an alternative to asymptotic Wald-type tests. It is shown that the bootstrap is particularly useful in situations with many restrictions, such as tests of equal conditional predictive ability that make use of many orthogonal variables, or `test functions'. The testing procedure is analyzed in a Monte Carlo study and is used to test the relevance of real variables in forecasting U.S. inflation.
7

Mumtaz, Shahzad. "Visualisation of bioinformatics datasets." Thesis, Aston University, 2015. http://publications.aston.ac.uk/25261/.

Abstract:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design and transplantation rejection etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high dimensional biological dataset by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, where we use log-transformations at certain steps of expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants both for synthetic and electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the project map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way where appropriate noise models are used for each type of data in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM). 
We also propose to extend GGTM model to estimate feature saliencies while training a visualisation model and this is called GGTM with feature saliency (GGTM-FS). We demonstrate effectiveness of these proposed models both for synthetic and real datasets. We evaluate visualisation quality using quality metrics such as distance distortion measure and rank based measures: trustworthiness, continuity, mean relative rank errors with respect to data space and latent space. In cases where the labels are known we also use quality metrics of KL divergence and nearest neighbour classifications error in order to determine the separation between classes. We demonstrate the efficacy of these proposed models both for synthetic and real biological datasets with a main focus on the MHC class-I dataset.
8

Mazumdar, Suvodeep. "Visualising large semantic datasets." Thesis, University of Sheffield, 2013. http://etheses.whiterose.ac.uk/5932/.

Abstract:
This thesis aims at addressing a major issue in Semantic Web and organisational Knowledge Management: consuming large scale semantic data in a generic, scalable and pleasing manner. It proposes two solutions by de-constructing the issue into two sub problems: how can large semantic result sets be presented to users; and how can large semantic datasets be explored and queried. The first proposed solution is a dashboard-based multi-visualisation approach to present simultaneous views over different facets of the data. Challenges imposed by existing technology infrastructure resulted in the development of a set of design guidelines. These guidelines and lessons learnt from the development of the approach is the first contribution of this thesis. The next stage of research initiated with the formulation of design principles from aesthetic design, Visual Analytics and Semantic Web principles derived from the literature. These principles provide guidelines to developers for building generic visualisation solutions for large scale semantic data and constitute the next contribution of the thesis. The second proposed solution is an interactive node-link visualisation approach that presents semantic concepts and their relations enriched with statistics of the underlying data. This solution was developed with an explicit attention to the proposed design principles. The two solutions exploit basic rules and templates to translate low level user interactions into high level intents, and subsequently into formal queries in a generic manner. These translation rules and templates that enable generic exploration of large scale semantic data constitute the third contribution of the thesis. An iterative User-Centered Design methodology, with the active participation of nearly a hundred users including knowledge workers, managers, engineers, researchers and students over the duration of the research was employed to develop both solutions. 
The fourth contribution of this thesis is an argument for the continued active participation and involvement of all user communities to ensure the development of a highly effective, intuitive and appreciated solution.
9

De, León Eduardo Enrique. "Medical abstract inference dataset." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/119516.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (page 35).
In this thesis, I built a dataset for predicting clinical outcomes from medical abstracts and their titles. Medical Abstract Inference consists of 1,794 data points. Titles were filtered to include the abstract's reported medical intervention and clinical outcome. Data points were annotated with the intervention's effect on the outcome. The resulting labels were one of the following: increased, decreased, or had no significant difference on the outcome. In addition, rationale sentences were marked; these sentences supply the necessary supporting evidence for the overall prediction. Preliminary modeling was also done to evaluate the corpus. Preliminary models included top-performing Natural Language Inference models as well as rationale-based models and linear classifiers.
10

Schöner, Holger. "Working with real world datasets: preprocessing and prediction with large incomplete and heterogeneous datasets." [S.l.] : [s.n.], 2005. http://deposit.ddb.de/cgi-bin/dokserv?idn=973424672.


Books on the topic "Datasety"

1

Ullman, Jeffrey D., ed. Mining of massive datasets. Cambridge: Cambridge University Press, 2012.

2

Hutchinson, T. P. The datasets from CBDEA. Adelaide: Rumsby Scientific Publishing, 1993.

3

Milanović, Branko. Dataset - racial tension, volume 6. [Washington, D.C.]: World Bank, 2005.

4

Quiñonero-Candela, Joaquin. Dataset shift in machine learning. Cambridge, MA: MIT Press, 2009.

5

Rymer, Thomas E. Data management for large volume datasets. S.l.: s.n., 1986.

6

Drechsler, Jörg. Synthetic Datasets for Statistical Disclosure Control. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0326-5.

7

Ravindra Babu, T., M. Narasimha Murty, and S. V. Subrahmanya. Compression Schemes for Mining Large Datasets. London: Springer London, 2013. http://dx.doi.org/10.1007/978-1-4471-5607-9.

8

Scanlon, S. The UK's networked dataset revolution continues. London: Library Information Technology Centre, 1993.

9

Kauffmann, Mayeul, and North Atlantic Treaty Organization, Public Diplomacy Division, eds. Building and using datasets on armed conflicts. Amsterdam, Netherlands: IOS Press, 2008.

10

Johnson, Lisa Kaye. Computer analysis of remote sensing and geologic datasets. Washington: Washington State University Department of Geology, 1988.


Book chapters on the topic "Datasety"

1

Xing, Yujie, Itishree Mohallick, Jon Atle Gulla, Özlem Özgöbek, and Lemei Zhang. "An Educational News Dataset for Recommender Systems." In ECML PKDD 2020 Workshops, 562–70. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-65965-3_39.

Abstract:
Datasets are an integral part of contemporary research on recommender systems. However, few datasets are available for conventional recommender systems, and even fewer are available for contextualized (time- and location-dependent) news recommender systems. In this paper, we introduce an educational news dataset for recommender systems. This dataset is a refined version of the previously published Adressa dataset and is intended to support university students for educational purposes. We discuss the structure and purpose of the refined dataset in this paper.
2

Klonovs, Juris, Mohammad A. Haque, Volker Krueger, Kamal Nasrollahi, Karen Andersen-Ranberg, Thomas B. Moeslund, and Erika G. Spaich. "Datasets." In Distributed Computing and Monitoring Technologies for Older Patients, 85–94. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-27024-1_5.

3

Stowell, Sarah. "Datasets." In Using R for Statistics, 209–16. Berkeley, CA: Apress, 2014. http://dx.doi.org/10.1007/978-1-4842-0139-8_14.

4

Ahmed, Mahmuda, Sophia Karagiorgou, Dieter Pfoser, and Carola Wenk. "Datasets." In Map Construction Algorithms, 57–69. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-25166-0_5.

5

Luo, Ling. "Datasets." In Temporal Modelling of Customer Behaviour, 7–14. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-18289-2_2.

6

Yu, Johan. "Dataset." In Getting Started with Salesforce Einstein Analytics, 57–71. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5200-0_4.

7

Frericks, Sebastian. "Dataset." In Downfall of Large German Listed Companies, 31–38. Wiesbaden: Springer Fachmedien Wiesbaden, 2018. http://dx.doi.org/10.1007/978-3-658-24999-1_3.

8

Lantelme, Maximilian. "Dataset." In The Rise and Downfall of Germany’s Largest Family and Non-Family Businesses, 31–43. Wiesbaden: Springer Fachmedien Wiesbaden, 2016. http://dx.doi.org/10.1007/978-3-658-16169-9_4.

9

Ahad, Md Atiqur Rahman. "Action Datasets." In Atlantis Ambient and Pervasive Intelligence, 147–72. Paris: Atlantis Press, 2011. http://dx.doi.org/10.2991/978-94-91216-20-6_6.

10

Al-Awadhi, Fahimah. "Simulated Datasets." In Encyclopedia of Social Network Analysis and Mining, 1743–49. New York, NY: Springer New York, 2014. http://dx.doi.org/10.1007/978-1-4614-6170-8_164.


Conference papers on the topic "Datasety"

1

Devi, M. S. Girija, and Manisha J. Nene. "Scarce Attack Datasets and Experimental Dataset Generation." In 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). IEEE, 2018. http://dx.doi.org/10.1109/iceca.2018.8474612.

2

Swayamdipta, Swabha, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, and Yejin Choi. "Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.746.

3

Da Silva, Ronnypetson, Valter M. Filho, and Mario Souza. "Interaffection of Multiple Datasets with Neural Networks in Speech Emotion Recognition." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2020. http://dx.doi.org/10.5753/eniac.2020.12141.

Abstract:
Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use single datasets or train and evaluate the models separately when using multiple datasets. Those datasets are constructed with specific guidelines and the subjective nature of the labels for SER makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations for different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from others in different combinations of datasets and popular neural network architectures. We show that the longstanding belief of more data resulting in more general models doesn’t always hold for SER, as different dataset and meta-parameter combinations hold the best result for each of the analysed datasets.
4

Fregin, Andreas, Julian Muller, Ulrich Krebel, and Klaus Dietmayer. "The DriveU Traffic Light Dataset: Introduction and Comparison with Existing Datasets." In 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018. http://dx.doi.org/10.1109/icra.2018.8460737.

5

"PUBMED DATASET: A JAVA LIBRARY FOR AUTOMATIC CONSTRUCTION OF EVALUATION DATASETS." In International Conference on Bioinformatics Models, Methods and Algorithms. SciTePress - Science and Technology Publications, 2012. http://dx.doi.org/10.5220/0003797203430346.

6

Noon, Christian, and Eliot Winer. "A Study of Different Metamodeling Techniques for Conceptual Design." In ASME 2009 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. ASMEDC, 2009. http://dx.doi.org/10.1115/detc2009-86496.

Abstract:
Many high fidelity analysis tools including finite-element analysis and computational fluid dynamics have become an integral part of the design process. However, these tools were developed for detailed design and are inadequate for conceptual design due to complexity and turnaround time. With the development of more complex technologies and systems, decisions made earlier in the design process have become crucial to product success. Therefore, one possible alternative to high fidelity analysis tools for conceptual design is metamodeling. Metamodels generated upon high fidelity analysis datasets from previous design iterations show large potential to represent the overall trends of the dataset. To determine which metamodeling techniques were best suited to handle high fidelity datasets for conceptual design, an implementation scheme for incorporating Polynomial Response Surface (PRS) methods, Kriging Approximations, and Radial Basis Function Neural Networks (RBFNN) was developed. This paper presents the development of a conceptual design metamodeling strategy. Initially high fidelity legacy datasets were generated from FEA simulations. Metamodels were then built upon the legacy datasets. Finally, metamodel performance was evaluated based upon several dataset conditions including various sample sizes, dataset linearity, interpolation within a domain, and extrapolation outside a domain.
7

Soares, Álysson De Sá, Ricardo Batista Das Neves Junior, and Byron Leite Dantas Bezerra. "BID Dataset: a challenge dataset for document processing tasks." In Conference on Graphics, Patterns and Images. Sociedade Brasileira de Computação, 2020. http://dx.doi.org/10.5753/sibgrapi.est.2020.12997.

Abstract:
The digital relationship between companies and customers happens through online systems where consumers must upload their identification documents pictures to prove their identities. The existence of this large volume of document images encourages the research development to generate image processing systems to automate tasks usually performed by humans, such as Document Type Classification and Document Reading. The lack of identification documents public datasets delays the research development in document image processing because researchers need to attempt partnerships with private or governmental institutions to obtain the data or build their dataset. In this context, this work presents as main contributions a system to support the automatic creation of identification document public datasets and the Brazilian Identity Document Dataset (BID Dataset): the first Brazilian identification documents public dataset. To accomplish the current personal data privacy law, all information in the BID Dataset comes from fake data. This work aims to increase the velocity of research development in identification document image processing, considering that researchers will be able to use the BID Dataset to develop their research freely.
8

Chen, Zhanwen, Shiyao Li, Roxanne Rashedi, Xiaoman Zi, Morgan Elrod-Erickson, Bryan Hollis, Angela Maliakal, Xinyu Shen, Simeng Zhao, and Maithilee Kunda. "Characterizing Datasets for Social Visual Question Answering, and the New TinySocial Dataset." In 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob). IEEE, 2020. http://dx.doi.org/10.1109/icdl-epirob48136.2020.9278057.

9

Huy, Thach Nguyen, Sombut Foitong, Sornchai Udomthanapong, Ouen Pinngern, Sio-Iong Ao, Alan Hoi-Shou Chan, Hideki Katagiri, Osca Castillo, and Li Xu. "Effects of Distance between Classes and Training Dataset Size on Imbalance Datasets." In IAENG TRANSACTIONS ON ENGINEERING TECHNOLOGIES VOLUME I: Special Edition of the International MultiConference of Engineers and Computer Scientists 2008. AIP, 2009. http://dx.doi.org/10.1063/1.3078140.

10

Xu, Marie-Anne, and Rahul Khanna. "Importance of the Single-Span Task Formulation to Extractive Question-answering." In 6th International Conference on Computer Science, Engineering And Applications (CSEA 2020). AIRCC Publishing Corporation, 2020. http://dx.doi.org/10.5121/csit.2020.101809.

Abstract:
Recent progress in machine reading comprehension and question-answering has allowed machines to reach and even surpass human question-answering. However, the majority of these questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Among the three of BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset (21.492% F1) compared to the single-span source datasets (~33.36% F1). While the models tested on the source datasets were slightly fine-tuned, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods.

Reports on the topic "Datasety"

1

Bishnu, Pariyar. Nepal Energy Gardens Qualitative Dataset and Quantitative Survey Dataset. University of Leeds. [Dataset]. Unknown, 2015. http://dx.doi.org/10.35648/20.500.12413/11781/ii112.

2

Comin, Diego, and Bart Hobijn. The CHAT Dataset. Cambridge, MA: National Bureau of Economic Research, September 2009. http://dx.doi.org/10.3386/w15319.

3

Woods, Ken. LiDAR Datasets of Alaska. DGGS, June 2013. http://dx.doi.org/10.14509/lidar.

4

Ringgaard, Ida M., Kristine S. Madsen, Felix Müller, Laura Tuomi, Laura Rautiainen, and Marcello Passaro. Baltic+ SEAL: Dataset Description. ESA EO, March 2020. http://dx.doi.org/10.5270/esa.balticseal.ddv1.1.

5

Weeding, Jennifer, and Mark Greenwood. Equine Glucose Data [dataset]. Montana State University ScholarWorks, 2016. http://dx.doi.org/10.15788/m2qp4r.

6

Matey, James R., George W. Quinn, and Patrick J. Grother. IREX validation dataset 2019. Gaithersburg, MD: National Institute of Standards and Technology, September 2019. http://dx.doi.org/10.6028/nist.tn.2058.

7

Fitzhugh, Elizabeth, Eric Smith, and Sharon Ellis. 3D Visualizations of Abstract DataSets. Fort Belvoir, VA: Defense Technical Information Center, August 2010. http://dx.doi.org/10.21236/ada530801.

8

Alawini, Abdussalam. Identifying Relationships between Scientific Datasets. Portland State University Library, January 2000. http://dx.doi.org/10.15760/etd.2918.

9

Yurivilca, Rossemary. IDBG Climate Finance 2017 Dataset. Inter-American Development Bank, March 2019. http://dx.doi.org/10.18235/0001632.

10

Ó Carragáin, Eoghan, Nuno Lopes, Rebecca Grant, and Catherine Ryan. Using the Linked Logainm dataset. Royal Irish Academy, September 2013. http://dx.doi.org/10.3318/dri.loder.2013.3.
