Selection of scientific literature on the topic "Datasety"

Cite a source in APA, MLA, Chicago, Harvard, and other citation styles

Select a source type:

Consult the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Datasety".

Next to every work in the bibliography there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read an online annotation of the work, provided the relevant parameters are available in the metadata.

Journal articles on the topic "Datasety"

1

Almeida, Daniela, Dany Domínguez-Pérez, Ana Matos, Guillermin Agüero-Chapin, Yuselis Castaño, Vitor Vasconcelos, Alexandre Campos, and Agostinho Antunes. "Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus". Data 5, no. 4 (27.11.2020): 110. http://dx.doi.org/10.3390/data5040110.

Abstract:
Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production.
2

Haider, S. A., and N. S. Patil. "Minimization of Datasets: Using a Master Interlinked Dataset". Indian Journal of Computer Science 3, no. 5 (01.10.2018): 20. http://dx.doi.org/10.17010/ijcs/2018/v3/i5/138778.
3

Feng, Eric, and Xijin Ge. "DataViz: visualization of high-dimensional data in virtual reality". F1000Research 7 (23.10.2018): 1687. http://dx.doi.org/10.12688/f1000research.16453.1.

Abstract:
Virtual reality (VR) simulations promote interactivity and immersion, and provide an opportunity that may help researchers gain insights from complex datasets. To explore the utility and potential of VR in graphically rendering large datasets, we have developed an application for immersive, 3-dimensional (3D) scatter plots. Developed using the Unity development environment, DataViz enables the visualization of high-dimensional data with the HTC Vive, a relatively inexpensive and modern virtual reality headset available to the general public. DataViz has the following features: (1) principal component analysis (PCA) of the dataset; (2) graphical rendering of said dataset’s 3D projection onto its first three principal components; and (3) intuitive controls and instructions for using the application. As a use case, we applied DataViz to visualize a single-cell RNA-Seq dataset. DataViz can help gain insights from complex datasets by enabling interaction with high-dimensional data.
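The core preprocessing step described above, projecting a high-dimensional dataset onto its first three principal components before rendering the 3D scatter plot, can be sketched in a few lines. This is an illustrative sketch, not the DataViz (Unity) source; the function name and synthetic data are our own.

```python
import numpy as np

def project_to_3d(data: np.ndarray) -> np.ndarray:
    """Project `data` (n_samples x n_features) onto its first three
    principal components via SVD of the centered matrix."""
    centered = data - data.mean(axis=0)   # PCA requires centered data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T            # n_samples x 3 coordinates

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 20))       # stand-in for expression data
coords = project_to_3d(points)
print(coords.shape)                       # (100, 3)
```

Each row of `coords` would then become the position of one point in the immersive scatter plot.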
4

Chang, Nai Chen, Elissa Aminoff, John Pyles, Michael Tarr, and Abhinav Gupta. "Scaling Up Neural Datasets: A public fMRI dataset of 5000 scenes". Journal of Vision 18, no. 10 (01.09.2018): 732. http://dx.doi.org/10.1167/18.10.732.
5

Zhang, Yulian, and Shigeyuki Hamori. "Forecasting Crude Oil Market Crashes Using Machine Learning Technologies". Energies 13, no. 10 (13.05.2020): 2440. http://dx.doi.org/10.3390/en13102440.

Abstract:
To the best of our knowledge, this study provides new insight into the forecasting of crude oil futures price crashes in America using two moving-window schemes: a fixed-length window and an expanding-length window, the latter of which has not been reported before. We aimed to investigate whether discarding historical data makes any difference. As explanatory variables, we adopted 13 variables to obtain two datasets: 16 explanatory variables for Dataset1 and 121 explanatory variables for Dataset2. We observe results from these different-sized sets of explanatory variables. Specifically, we leverage the merits of a series of machine learning techniques, including random forests, logistic regression, support vector machines, and extreme gradient boosting (XGBoost). Finally, we employ evaluation metrics that are broadly used to assess discriminatory power on imbalanced datasets. Our results indicate that distant historical data should occasionally be discarded, and that XGBoost outperforms the other employed approaches, achieving a detection rate as high as 86% using the fixed-length moving window for Dataset2.
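The two window schemes contrasted in the abstract can be sketched as index ranges over a time series: the fixed-length window slides forward and discards old observations, while the expanding window keeps everything from the start. The window length and series size below are illustrative, not the authors' settings.

```python
def fixed_windows(n_obs: int, train_len: int):
    """Fixed-length window: the training span slides, dropping old data."""
    return [(start, start + train_len) for start in range(n_obs - train_len)]

def expanding_windows(n_obs: int, train_len: int):
    """Expanding window: the training span always starts at observation 0."""
    return [(0, end) for end in range(train_len, n_obs)]

print(fixed_windows(6, 3))      # [(0, 3), (1, 4), (2, 5)]
print(expanding_windows(6, 3))  # [(0, 3), (0, 4), (0, 5)]
```

In each scheme, a model trained on the span `(a, b)` would be used to forecast observation `b`, which is how the study compares keeping versus discarding distant history.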
6

Wang, Juan, Zhibin Zhang, and Yanjuan Li. "Constructing Phylogenetic Networks Based on the Isomorphism of Datasets". BioMed Research International 2016 (2016): 1–7. http://dx.doi.org/10.1155/2016/4236858.

Abstract:
Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. Many methods have been presented in this area, of which the most efficient are based on the incompatible graph, such as CASS, LNETWORK, and BIMLR. This paper studies what the methods based on the incompatible graph have in common, the relationship between the incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find all the simplest datasets for a topology G and construct a network for every such dataset. For any dataset C, we can then compute a network from the network representing the simplest dataset that is isomorphic to C. This process saves time for the algorithms when constructing networks.
7

Xie, Yanqing, Zhengqiang Li, Weizhen Hou, Jie Guang, Yan Ma, Yuyang Wang, Siheng Wang, and Dong Yang. "Validation of FY-3D MERSI-2 Precipitable Water Vapor (PWV) Datasets Using Ground-Based PWV Data from AERONET". Remote Sensing 13, no. 16 (16.08.2021): 3246. http://dx.doi.org/10.3390/rs13163246.

Abstract:
The Medium Resolution Spectral Imager-2 (MERSI-2) is one of the most important sensors onboard China's latest polar-orbiting meteorological satellite, Fengyun-3D (FY-3D). The National Satellite Meteorological Center of the China Meteorological Administration has developed four precipitable water vapor (PWV) datasets using five near-infrared bands of MERSI-2: the P905 dataset, the P936 dataset, the P940 dataset, and a fusion of the three. For the convenience of users, we comprehensively evaluate the quality of these PWV datasets against ground-based PWV data derived from the Aerosol Robotic Network (AERONET). The validation results show that the P905, P936, and fused PWV datasets have relatively large systematic errors (−0.10, −0.11, and −0.07 g/cm2), whereas the systematic error of the P940 dataset (−0.02 g/cm2) is very small. According to our assessment of the overall accuracy of these four PWV datasets, they can be ranked in descending order as the P940 dataset, the fused dataset, the P936 dataset, and the P905 dataset. The root mean square error (RMSE), relative error (RE), and percentage of retrieval results with error within ±(0.05 + 0.10 × PWV_AERONET) (PER10) of the P940 PWV dataset are 0.24 g/cm2, 0.10, and 76.36%, respectively. The RMSE, RE, and PER10 of the P905 PWV dataset are 0.38 g/cm2, 0.15, and 57.72%, respectively. To obtain a clearer understanding of the accuracy of these four MERSI-2 PWV datasets, we compare them with the widely used MODIS and AIRS PWV datasets. The comparison shows that the MODIS PWV dataset is less accurate than all four MERSI-2 PWV datasets, due to its serious overestimation (0.40 g/cm2), and that the AIRS PWV dataset is less accurate than the P940 and fused MERSI-2 PWV datasets. In addition, we analyze the error distribution of the four PWV datasets across locations, seasons, and water vapor content. Finally, we discuss why the fused PWV dataset is not the most accurate of the four.
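The abstract's three validation metrics, RMSE, relative error, and PER10 (the share of retrievals with error within ±(0.05 + 0.10 × PWV_AERONET)), can be sketched directly. The values below are illustrative stand-ins, not the paper's data.

```python
import numpy as np

def validate(pwv_sat: np.ndarray, pwv_aeronet: np.ndarray):
    """Return (RMSE, relative error, PER10) of a satellite PWV retrieval
    against AERONET ground truth."""
    err = pwv_sat - pwv_aeronet
    rmse = np.sqrt(np.mean(err ** 2))
    re = np.mean(np.abs(err) / pwv_aeronet)
    # PER10: percentage of retrievals within the +/-(0.05 + 0.10 * PWV) envelope
    per10 = 100.0 * np.mean(np.abs(err) <= 0.05 + 0.10 * pwv_aeronet)
    return rmse, re, per10

aeronet = np.array([1.0, 2.0, 3.0, 4.0])    # ground truth, g/cm2
satellite = np.array([1.1, 1.9, 3.5, 4.1])  # retrieved PWV, g/cm2
rmse, re, per10 = validate(satellite, aeronet)
print(round(rmse, 3), round(re, 3), per10)  # 0.265 0.085 75.0
```

Here the third retrieval (error 0.5 against a 0.35 envelope) is the only one outside the PER10 bound, giving 75%.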
8

Bahrami, Mostafa, Hossein Javadikia, and Ebrahim Ebrahimi. "APPLICATION OF PATTERN RECOGNITION TECHNIQUES FOR FAULT DETECTION OF CLUTCH RETAINER OF TRACTOR". Journal of Mechanical Engineering 47, no. 1 (01.05.2018): 31–36. http://dx.doi.org/10.3329/jme.v47i1.35356.

Abstract:
This study develops a pattern-recognition technique for fault diagnosis of the clutch retainer mechanism of the MF285 tractor using a neural network. In this technique, time-domain features and frequency-domain features, consisting of the Fast Fourier Transform (FFT) phase angle and Power Spectral Density (PSD), are proposed to improve diagnostic ability. Three different cases, namely normal condition, bearing wear, and shaft wear, were used for signal processing. The data are divided into two parts: 70% forms dataset1 and the remaining 30% forms dataset2. First, the artificial neural networks (ANN) are trained on 60% of dataset1, validated on 20% of dataset1, and tested on the remaining 20% of dataset1. Then, to further test the proposed model, the network is simulated using dataset2. The results indicate an effective ability to accurately diagnose various clutch retainer faults of the MF285 tractor using pattern recognition networks.
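The splitting scheme in the abstract (70% of the samples form dataset1 and 30% form dataset2, with dataset1 then split 60/20/20 into training, validation, and test sets) can be sketched as follows. The sample count and function name are illustrative, not the authors' code.

```python
import random

def split_signals(samples: list, seed: int = 0):
    """Shuffle, then split 70/30 into dataset1/dataset2, and split
    dataset1 further into 60/20/20 train/validation/test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    dataset1, dataset2 = shuffled[:cut], shuffled[cut:]
    n = len(dataset1)
    train = dataset1[: int(0.6 * n)]
    valid = dataset1[int(0.6 * n): int(0.8 * n)]
    test = dataset1[int(0.8 * n):]
    return train, valid, test, dataset2

train, valid, test, dataset2 = split_signals(list(range(100)))
print(len(train), len(valid), len(test), len(dataset2))  # 42 14 14 30
```

The held-out `dataset2` plays the role of a second, independent test of the trained network.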
9

Bogaardt, Laurens, Romulo Goncalves, Raul Zurita-Milla, and Emma Izquierdo-Verdiguier. "Dataset Reduction Techniques to Speed Up SVD Analyses on Big Geo-Datasets". ISPRS International Journal of Geo-Information 8, no. 2 (26.01.2019): 55. http://dx.doi.org/10.3390/ijgi8020055.

Abstract:
The Singular Value Decomposition (SVD) is a mathematical procedure with multiple applications in the geosciences. For instance, it is used in dimensionality reduction and as a support operator for various analytical tasks applicable to spatio-temporal data. Performing SVD analyses on large datasets, however, can be computationally costly, time consuming, and sometimes practically infeasible. However, techniques exist to arrive at the same output, or at a close approximation, which requires far less effort. This article examines several such techniques in relation to the inherent scale of the structure within the data. When the values of a dataset vary slowly, e.g., in a spatial field of temperature over a country, there is autocorrelation and the field contains large scale structure. Datasets do not need a high resolution to describe such fields and their analysis can benefit from alternative SVD techniques based on rank deficiency, coarsening, or matrix factorization approaches. We use both simulated Gaussian Random Fields with various levels of autocorrelation and real-world geospatial datasets to illustrate our study while examining the accuracy of various SVD techniques. As the main result, this article provides researchers with a decision tree indicating which technique to use when and predicting the resulting level of accuracy based on the dataset’s structure scale.
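One family of techniques the abstract alludes to, exploiting rank deficiency, can be sketched with a truncated SVD: when a field has large-scale structure (strong autocorrelation), a low-rank approximation captures nearly all of it. The synthetic field below is our own illustration, not the paper's data.

```python
import numpy as np

def truncated_svd(matrix: np.ndarray, k: int) -> np.ndarray:
    """Rank-k approximation built from the top-k singular triplets."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

rng = np.random.default_rng(1)
# A smooth, effectively rank-1 field plus small noise, mimicking an
# autocorrelated spatial field such as temperature over a country
field = np.outer(np.sin(np.linspace(0, 3, 50)), np.cos(np.linspace(0, 3, 40)))
field += 1e-3 * rng.normal(size=field.shape)
approx = truncated_svd(field, k=1)
rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
print(rel_err < 0.01)  # the rank-1 approximation retains almost all structure
```

The paper's decision tree tells the analyst when such a shortcut (or coarsening, or matrix factorization) is safe for a given structure scale.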
10

Yu, Ellen, Aparna Bhaskaran, Shang-Lin Chen, Zachary E. Ross, Egill Hauksson, and Robert W. Clayton. "Southern California Earthquake Data Now Available in the AWS Cloud". Seismological Research Letters 92, no. 5 (16.06.2021): 3238–47. http://dx.doi.org/10.1785/0220210039.

Abstract:
The Southern California Earthquake Data Center is hosting its earthquake catalog and seismic waveform archive in the Amazon Web Services (AWS) Open Dataset Program (s3://scedc-pds; us-west-2 region). The cloud dataset's high data availability and scalability facilitate research that uses large volumes of data and computationally intensive processing. We describe the data archive and our rationale for the formats and data organization. We provide two simple examples to show how storing the data in AWS Simple Storage Service can benefit the analysis of large datasets. We share usage statistics of our data during the first year in the AWS Open Dataset Program. We also discuss the challenges and opportunities of a cloud-hosted archive.

Dissertations on the topic "Datasety"

1

Zembjaková, Martina. "Prieskum a taxonómia sieťových forenzných nástrojov". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445488.

Abstract:
This master's thesis deals with a survey and taxonomy of network forensic tools. It describes basic information about network forensics, including the process models, techniques, and data sources used in forensic analysis. The thesis then surveys existing taxonomies of network forensic tools, including a comparison of them, followed by a survey of the network forensic tools themselves. Besides the tools mentioned in the taxonomy survey, the discussion also covers several additional network tools. Next, the thesis describes and compares in detail the datasets that serve as input for analysis with the individual network tools. Based on the information gathered in these surveys, common use cases are proposed, and the tools are demonstrated within the description of each use case. In addition to publicly available datasets, newly created datasets, described in detail in a dedicated chapter, are used to demonstrate the tools. Based on the information obtained, a new taxonomy is proposed that is grounded in tool use cases, in contrast to other taxonomies based on NFAT and NSM tools, user interface, data capture, analysis, or type of forensic analysis.
2

Kratochvíla, Lukáš. "Trasování objektu v reálném čase". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403748.

Abstract:
Tracking a generic object in real time on a device with limited resources is difficult. Many algorithms addressing this problem already exist, and this thesis reviews them. Various approaches to the problem are discussed, including deep learning. Object representations, datasets, and evaluation metrics are presented. Many tracking algorithms are introduced, eight of which are implemented and evaluated on the VOT dataset.
3

Singh, Manjeet. "A Comparison of Rule Extraction Techniques with Emphasis on Heuristics for Imbalanced Datasets". Ohio University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1282139633.
4

Silva, Jesús, Palma Hugo Hernández, Núñez William Niebles, David Ovallos-Gazabon, and Noel Varela. "Parallel Algorithm for Reduction of Data Processing Time in Big Data". Institute of Physics Publishing, 2020. http://hdl.handle.net/10757/652134.

Abstract:
Technological advances have allowed to collect and store large volumes of data over the years. Besides, it is significant that today's applications have high performance and can analyze these large datasets effectively. Today, it remains a challenge for data mining to make its algorithms and applications equally efficient in the need of increasing data size and dimensionality [1]. To achieve this goal, many applications rely on parallelism, because it is an area that allows the reduction of cost depending on the execution time of the algorithms because it takes advantage of the characteristics of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that can be processed within each of the processing threads, synchronously and independently.
5

Munyombwe, Theresa. "The harmonisation of stroke datasets: a case study of four UK datasets". Thesis, University of Leeds, 2016. http://etheses.whiterose.ac.uk/13511/.

Abstract:
Longitudinal studies of stroke patients play a critical part in developing stroke prognostic models. Stroke longitudinal studies are often limited by small sample sizes, poor recruitment, and high attrition levels. Some of these limitations can be addressed by harmonising and pooling data from existing studies. Thus this thesis evaluated the feasibility of harmonising and pooling secondary stroke datasets to investigate the factors associated with disability after stroke. Data from the Clinical Information Management System for Stroke study (n=312), Stroke Outcome Study 1(n=448), Stroke Outcome Study 2 (n=585), and the Leeds Sentinel Stroke National Audit (n=350) were used in this research. The research conducted in this thesis consisted of four stages. The first stage used the Data Schema and Harmonisation Platform for Epidemiological Research (DataSHaPER) approach to evaluate the feasibility of harmonising and pooling the four datasets that were used in this case study. The second stage evaluated the utility of using multi-group-confirmatory-factor analysis for testing measurement invariance of the GHQ-28 measure prior to pooling the datasets. The third stage evaluated the utility of using Item Response Theory (IRT) models and regression- based methods for linking disability outcome measures. The last stage synthesised the harmonised datasets using multi-group latent class analysis and multi-level Poisson models to investigate the factors associated with disability post-stroke. The main barrier encountered in pooling the four datasets was the heterogeneity in outcome measures. Pooling datasets was beneficial but there was a trade-off between increasing the sample size and losing important covariates. The findings from this present study suggested that the GHQ-28 measure was invariant across the SOS1 and SOS2 stroke cohorts, thus an integrative data analysis of the two SOS datasets was conducted. 
Harmonising measurement scales using IRT models and regression-based methods was effective for predicting group averages and not individual patient predictions. The analyses of harmonised datasets suggested an association of female gender with anxiety and depressive symptoms post-stroke. This research concludes that harmonising and pooling data from multiple stroke studies was beneficial but there were challenges in measurement comparability. Continued efforts should be made to develop a Data Schema for stroke to facilitate data sharing in stroke rehabilitation research.
6

Furman, Yoel Avraham. "Forecasting with large datasets". Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:69f2833b-cc53-457a-8426-37c06df85bc2.

Abstract:
This thesis analyzes estimation methods and testing procedures for handling large data series. The first chapter introduces the use of the adaptive elastic net, and the penalized regression methods nested within it, for estimating sparse vector autoregressions. That chapter shows that under suitable conditions on the data generating process this estimation method satisfies an oracle property. Furthermore, it is shown that the bootstrap can be used to accurately conduct inference on the estimated parameters. These properties are used to show that structural VAR analysis can also be validly conducted, allowing for accurate measures of policy response. The strength of these estimation methods is demonstrated in a numerical study and on U.S. macroeconomic data. The second chapter continues in a similar vein, using the elastic net to estimate sparse vector autoregressions of realized variances to construct volatility forecasts. It is shown that the use of volatility spillovers estimated by the elastic net delivers substantial improvements in forecast ability, and can be used to indicate systemic risk among a group of assets. The model is estimated on realized variances of equities of U.S. financial institutions, where it is shown that the estimated parameters translate into two novel indicators of systemic risk. The third chapter discusses the use of the bootstrap as an alternative to asymptotic Wald-type tests. It is shown that the bootstrap is particularly useful in situations with many restrictions, such as tests of equal conditional predictive ability that make use of many orthogonal variables, or `test functions'. The testing procedure is analyzed in a Monte Carlo study and is used to test the relevance of real variables in forecasting U.S. inflation.
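The first step of the sparse-VAR estimation described above is stacking lagged observations into a design matrix, to which a penalized regression such as the (adaptive) elastic net is then applied equation by equation. The construction below is a generic sketch under that reading, not the thesis's code; names and dimensions are our own.

```python
import numpy as np

def var_design(y: np.ndarray, p: int):
    """For a (T, k) multivariate series y, return (targets, X) for a VAR(p):
    row t of `targets` is y[t+p], regressed on y[t+p-1], ..., y[t]."""
    T, _ = y.shape
    # Column blocks: lag-1 values first, then lag-2, ..., lag-p
    X = np.hstack([y[p - lag - 1: T - lag - 1] for lag in range(p)])
    return y[p:], X

rng = np.random.default_rng(2)
series = rng.normal(size=(100, 3))        # stand-in for macro or volatility data
targets, X = var_design(series, p=2)
print(targets.shape, X.shape)             # (98, 3) (98, 6)
```

A sparse estimator then shrinks most of the `k * k * p` coefficients to zero, which is what makes the system tractable for large cross-sections.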
7

Mumtaz, Shahzad. "Visualisation of bioinformatics datasets". Thesis, Aston University, 2015. http://publications.aston.ac.uk/25261/.

Abstract:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design and transplantation rejection etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high dimensional biological dataset by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, where we use log-transformations at certain steps of expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants both for synthetic and electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the project map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way where appropriate noise models are used for each type of data in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM). 
We also propose to extend GGTM model to estimate feature saliencies while training a visualisation model and this is called GGTM with feature saliency (GGTM-FS). We demonstrate effectiveness of these proposed models both for synthetic and real datasets. We evaluate visualisation quality using quality metrics such as distance distortion measure and rank based measures: trustworthiness, continuity, mean relative rank errors with respect to data space and latent space. In cases where the labels are known we also use quality metrics of KL divergence and nearest neighbour classifications error in order to determine the separation between classes. We demonstrate the efficacy of these proposed models both for synthetic and real biological datasets with a main focus on the MHC class-I dataset.
8

Mazumdar, Suvodeep. "Visualising large semantic datasets". Thesis, University of Sheffield, 2013. http://etheses.whiterose.ac.uk/5932/.

Abstract:
This thesis aims at addressing a major issue in Semantic Web and organisational Knowledge Management: consuming large scale semantic data in a generic, scalable and pleasing manner. It proposes two solutions by de-constructing the issue into two sub problems: how can large semantic result sets be presented to users; and how can large semantic datasets be explored and queried. The first proposed solution is a dashboard-based multi-visualisation approach to present simultaneous views over different facets of the data. Challenges imposed by existing technology infrastructure resulted in the development of a set of design guidelines. These guidelines and lessons learnt from the development of the approach is the first contribution of this thesis. The next stage of research initiated with the formulation of design principles from aesthetic design, Visual Analytics and Semantic Web principles derived from the literature. These principles provide guidelines to developers for building generic visualisation solutions for large scale semantic data and constitute the next contribution of the thesis. The second proposed solution is an interactive node-link visualisation approach that presents semantic concepts and their relations enriched with statistics of the underlying data. This solution was developed with an explicit attention to the proposed design principles. The two solutions exploit basic rules and templates to translate low level user interactions into high level intents, and subsequently into formal queries in a generic manner. These translation rules and templates that enable generic exploration of large scale semantic data constitute the third contribution of the thesis. An iterative User-Centered Design methodology, with the active participation of nearly a hundred users including knowledge workers, managers, engineers, researchers and students over the duration of the research was employed to develop both solutions. 
The fourth contribution of this thesis is an argument for the continued active participation and involvement of all user communities to ensure the development of a highly effective, intuitive and appreciated solution.
9

De León, Eduardo Enrique. "Medical abstract inference dataset". Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/119516.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
In this thesis, I built a dataset for predicting clinical outcomes from medical abstracts and their titles. Medical Abstract Inference consists of 1,794 data points. Titles were filtered to include the abstract's reported medical intervention and clinical outcome. Data points were annotated with the intervention's effect on the outcome. The resulting labels were one of the following: increased, decreased, or had no significant difference on the outcome. In addition, rationale sentences were marked; these sentences supply the necessary supporting evidence for the overall prediction. Preliminary modeling was also done to evaluate the corpus. Preliminary models included top-performing Natural Language Inference models as well as rationale-based models and linear classifiers.
10

Schöner, Holger. "Working with real world datasets: preprocessing and prediction with large incomplete and heterogeneous datasets". [S.l.]: [s.n.], 2005. http://deposit.ddb.de/cgi-bin/dokserv?idn=973424672.

Books on the topic "Datasety"

1

Ullman, Jeffrey D., ed. Mining of massive datasets. Cambridge: Cambridge University Press, 2012.
2

Hutchinson, T. P. The datasets from CBDEA. Adelaide: Rumsby Scientific Publishing, 1993.

3

Milanović, Branko. Dataset - racial tension, volume 6. Washington, D.C.: World Bank, 2005.
4

Quiñonero-Candela, Joaquin. Dataset shift in machine learning. Cambridge, MA: MIT Press, 2009.

5

Rymer, Thomas E. Data management for large volume datasets. S.l.: s.n., 1986.
6

Drechsler, Jörg. Synthetic Datasets for Statistical Disclosure Control. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0326-5.

7

Ravindra Babu, T., M. Narasimha Murty, and S. V. Subrahmanya. Compression Schemes for Mining Large Datasets. London: Springer London, 2013. http://dx.doi.org/10.1007/978-1-4471-5607-9.
8

Scanlon, S. The UK's networked dataset revolution continues. London: Library Information Technology Centre, 1993.

9

Kauffmann, Mayeul, and North Atlantic Treaty Organization, Public Diplomacy Division, eds. Building and using datasets on armed conflicts. Amsterdam, Netherlands: IOS Press, 2008.
10

Johnson, Lisa Kaye. Computer analysis of remote sensing and geologic datasets. Washington: Washington State University Department of Geology, 1988.


Book chapters on the topic "Datasety"

1

Xing, Yujie, Itishree Mohallick, Jon Atle Gulla, Özlem Özgöbek und Lemei Zhang. „An Educational News Dataset for Recommender Systems“. In ECML PKDD 2020 Workshops, 562–70. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-65965-3_39.

Abstract:
Datasets are an integral part of contemporary research on recommender systems. However, few datasets are available for conventional recommender systems, and very limited datasets are available for contextualized (time- and location-dependent) news recommender systems. In this paper, we introduce an educational news dataset for recommender systems. This dataset is a refined version of the previously published Adressa dataset and is intended to support university students for educational purposes. We discuss the structure and purpose of the refined dataset in this paper.
2

Klonovs, Juris, Mohammad A. Haque, Volker Krueger, Kamal Nasrollahi, Karen Andersen-Ranberg, Thomas B. Moeslund and Erika G. Spaich. „Datasets“. In Distributed Computing and Monitoring Technologies for Older Patients, 85–94. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-27024-1_5.

3

Stowell, Sarah. „Datasets“. In Using R for Statistics, 209–16. Berkeley, CA: Apress, 2014. http://dx.doi.org/10.1007/978-1-4842-0139-8_14.

4

Ahmed, Mahmuda, Sophia Karagiorgou, Dieter Pfoser and Carola Wenk. „Datasets“. In Map Construction Algorithms, 57–69. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-25166-0_5.

5

Luo, Ling. „Datasets“. In Temporal Modelling of Customer Behaviour, 7–14. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-18289-2_2.

6

Yu, Johan. „Dataset“. In Getting Started with Salesforce Einstein Analytics, 57–71. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-5200-0_4.

7

Frericks, Sebastian. „Dataset“. In Downfall of Large German Listed Companies, 31–38. Wiesbaden: Springer Fachmedien Wiesbaden, 2018. http://dx.doi.org/10.1007/978-3-658-24999-1_3.

8

Lantelme, Maximilian. „Dataset“. In The Rise and Downfall of Germany’s Largest Family and Non-Family Businesses, 31–43. Wiesbaden: Springer Fachmedien Wiesbaden, 2016. http://dx.doi.org/10.1007/978-3-658-16169-9_4.

9

Ahad, Md Atiqur Rahman. „Action Datasets“. In Atlantis Ambient and Pervasive Intelligence, 147–72. Paris: Atlantis Press, 2011. http://dx.doi.org/10.2991/978-94-91216-20-6_6.

10

Al-Awadhi, Fahimah. „Simulated Datasets“. In Encyclopedia of Social Network Analysis and Mining, 1743–49. New York, NY: Springer New York, 2014. http://dx.doi.org/10.1007/978-1-4614-6170-8_164.


Conference papers on the topic "Datasety"

1

Devi, M. S. Girija, and Manisha J. Nene. „Scarce Attack Datasets and Experimental Dataset Generation“. In 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). IEEE, 2018. http://dx.doi.org/10.1109/iceca.2018.8474612.

2

Swayamdipta, Swabha, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith and Yejin Choi. „Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics“. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.746.

3

Da Silva, Ronnypetson, Valter M. Filho and Mario Souza. „Interaffection of Multiple Datasets with Neural Networks in Speech Emotion Recognition“. In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2020. http://dx.doi.org/10.5753/eniac.2020.12141.

Abstract:
Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use single datasets, or train and evaluate models separately when using multiple datasets. Those datasets are constructed with specific guidelines, and the subjective nature of SER labels makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations for different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from others in different combinations of datasets and popular neural network architectures. We show that the long-standing belief that more data results in more general models does not always hold for SER, as a different combination of datasets and meta-parameters yields the best result for each of the analysed datasets.
4

Fregin, Andreas, Julian Muller, Ulrich Krebel and Klaus Dietmayer. „The DriveU Traffic Light Dataset: Introduction and Comparison with Existing Datasets“. In 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018. http://dx.doi.org/10.1109/icra.2018.8460737.

5

„PUBMED DATASET: A JAVA LIBRARY FOR AUTOMATIC CONSTRUCTION OF EVALUATION DATASETS“. In International Conference on Bioinformatics Models, Methods and Algorithms. SciTePress - Science and Technology Publications, 2012. http://dx.doi.org/10.5220/0003797203430346.

6

Noon, Christian, and Eliot Winer. „A Study of Different Metamodeling Techniques for Conceptual Design“. In ASME 2009 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. ASMEDC, 2009. http://dx.doi.org/10.1115/detc2009-86496.

Abstract:
Many high fidelity analysis tools, including finite-element analysis and computational fluid dynamics, have become an integral part of the design process. However, these tools were developed for detailed design and are inadequate for conceptual design due to complexity and turnaround time. With the development of more complex technologies and systems, decisions made earlier in the design process have become crucial to product success. Therefore, one possible alternative to high fidelity analysis tools for conceptual design is metamodeling. Metamodels generated from high fidelity analysis datasets of previous design iterations show great potential to represent the overall trends of a dataset. To determine which metamodeling techniques were best suited to handling high fidelity datasets for conceptual design, an implementation scheme incorporating Polynomial Response Surface (PRS) methods, Kriging Approximations, and Radial Basis Function Neural Networks (RBFNN) was developed. This paper presents the development of a conceptual design metamodeling strategy. Initially, high fidelity legacy datasets were generated from FEA simulations. Metamodels were then built upon the legacy datasets. Finally, metamodel performance was evaluated under several dataset conditions, including various sample sizes, dataset linearity, interpolation within a domain, and extrapolation outside a domain.
7

Soares, Álysson De Sá, Ricardo Batista Das Neves Junior and Byron Leite Dantas Bezerra. „BID Dataset: a challenge dataset for document processing tasks“. In Conference on Graphics, Patterns and Images. Sociedade Brasileira de Computação, 2020. http://dx.doi.org/10.5753/sibgrapi.est.2020.12997.

Abstract:
The digital relationship between companies and customers happens through online systems where consumers must upload pictures of their identification documents to prove their identities. The existence of this large volume of document images encourages the development of image processing systems to automate tasks usually performed by humans, such as document type classification and document reading. The lack of public identification document datasets slows research in document image processing, because researchers must establish partnerships with private or governmental institutions to obtain the data, or build their own datasets. In this context, this work presents as its main contributions a system to support the automatic creation of public identification document datasets, and the Brazilian Identity Document Dataset (BID Dataset): the first public dataset of Brazilian identification documents. To comply with the current personal data privacy law, all information in the BID Dataset comes from fake data. This work aims to accelerate research in identification document image processing, considering that researchers will be able to use the BID Dataset to develop their research freely.
8

Chen, Zhanwen, Shiyao Li, Roxanne Rashedi, Xiaoman Zi, Morgan Elrod-Erickson, Bryan Hollis, Angela Maliakal, Xinyu Shen, Simeng Zhao and Maithilee Kunda. „Characterizing Datasets for Social Visual Question Answering, and the New TinySocial Dataset“. In 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob). IEEE, 2020. http://dx.doi.org/10.1109/icdl-epirob48136.2020.9278057.

9

Huy, Thach Nguyen, Sombut Foitong, Sornchai Udomthanapong, Ouen Pinngern, Sio-Iong Ao, Alan Hoi-Shou Chan, Hideki Katagiri, Osca Castillo and Li Xu. „Effects of Distance between Classes and Training Dataset Size on Imbalance Datasets“. In IAENG TRANSACTIONS ON ENGINEERING TECHNOLOGIES VOLUME I: Special Edition of the International MultiConference of Engineers and Computer Scientists 2008. AIP, 2009. http://dx.doi.org/10.1063/1.3078140.

10

Xu, Marie-Anne, and Rahul Khanna. „Importance of the Single-Span Task Formulation to Extractive Question-answering“. In 6th International Conference on Computer Science, Engineering And Applications (CSEA 2020). AIRCC Publishing Corporation, 2020. http://dx.doi.org/10.5121/csit.2020.101809.

Abstract:
Recent progress in machine reading comprehension and question-answering has allowed machines to reach and even surpass human performance. However, the majority of these questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Among the three BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new multi-span dataset (21.492% F1) compared to the single-span source datasets (~33.36% F1). While the models tested on the source datasets were slightly fine-tuned, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods.

Reports of organizations on the topic "Datasety"

1

Bishnu, Pariyar. Nepal Energy Gardens Qualitative Dataset and Quantitative Survey Dataset. University of Leeds. [Dataset]. Unknown, 2015. http://dx.doi.org/10.35648/20.500.12413/11781/ii112.

2

Comin, Diego, and Bart Hobijn. The CHAT Dataset. Cambridge, MA: National Bureau of Economic Research, September 2009. http://dx.doi.org/10.3386/w15319.

3

Woods, Ken. LiDAR Datasets of Alaska. DGGS, June 2013. http://dx.doi.org/10.14509/lidar.

4

Ringgaard, Ida M., Kristine S. Madsen, Felix Müller, Laura Tuomi, Laura Rautiainen and Marcello Passaro. Baltic+ SEAL: Dataset Description. ESA EO, March 2020. http://dx.doi.org/10.5270/esa.balticseal.ddv1.1.

5

Weeding, Jennifer, and Mark Greenwood. Equine Glucose Data [dataset]. Montana State University ScholarWorks, 2016. http://dx.doi.org/10.15788/m2qp4r.

6

Matey, James R., George W. Quinn und Patrick J. Grother. IREX validation dataset 2019. Gaithersburg, MD: National Institute of Standards and Technology, September 2019. http://dx.doi.org/10.6028/nist.tn.2058.

7

Fitzhugh, Elizabeth, Eric Smith und Sharon Ellis. 3D Visualizations of Abstract DataSets. Fort Belvoir, VA: Defense Technical Information Center, August 2010. http://dx.doi.org/10.21236/ada530801.

8

Alawini, Abdussalam. Identifying Relationships between Scientific Datasets. Portland State University Library, January 2000. http://dx.doi.org/10.15760/etd.2918.

9

Yurivilca, Rossemary. IDBG Climate Finance 2017 Dataset. Inter-American Development Bank, March 2019. http://dx.doi.org/10.18235/0001632.

10

Ó Carragáin, Eoghan, Nuno Lopes, Rebecca Grant and Catherine Ryan. Using the Linked Logainm dataset. Royal Irish Academy, September 2013. http://dx.doi.org/10.3318/dri.loder.2013.3.
