Theses on the topic "Data / features engineering"
See the top 50 works (undergraduate or doctoral theses) for research on the topic "Data / features engineering".
Mohammed, Hussein Syed. "Random feature subspace ensemble based approaches for the analysis of data with missing features /". Full text available online, 2006. http://www.lib.rowan.edu/find/theses.
Baik, Edward H. (Edward Hyeen). "Surface-based segmentation of volume data using texture features". Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43516.
Includes bibliographical references (p. 117-123).
by Edward H. Baik.
M.Eng.
Campbell, Richard John. "Recognition of free-form 3D objects in range data using global and local features /". The Ohio State University, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=osu1486397841221694.
Oldfield, Robin B. "Lithological mapping of Northwest Argentina with remote sensing data using tonal, textural and contextual features". Thesis, Aston University, 1988. http://publications.aston.ac.uk/14287/.
Mora, Omar Ernesto. "Morphology-Based Identification of Surface Features to Support Landslide Hazard Detection Using Airborne LiDAR Data". The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1429861576.
Fridley, Lila (Lila J.). "Improving online demand forecast using novel features in website data : a case study at Zara". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117976.
Thesis: S.M., Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, in conjunction with the Leaders for Global Operations Program at MIT, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 77).
The challenge of improving retail inventory customer service level while reducing costs is common across many retailers. This problem is typically addressed through efficient supply chain operations. This thesis discusses the development of new methodologies to predict e-commerce consumer demand for seasonal, short life-cycle articles. The new methodology incorporates novel data to predict demand of existing products through a bottom-up point forecast at the color and location level. It addresses the widely observed challenge of forecasting censored demand during a stock out. Zara introduces thousands of new items each season across over 2100 stores in 93 markets worldwide [1]. The Zara Distribution team is responsible for allocating inventory to each physical and e-commerce store. In line with Zara's quick to retail strategy, Distribution is flexible and responsive in forecasting store demand, with new styles arriving in stores twice per week [1]. The company is interested in improving the demand forecast by leveraging the novel e-commerce data that has become available since the launch of Zara.com in 2010 [2]. The results of this thesis demonstrate that the addition of new data to a linear regression model reduces prediction error by an average of 16% for e-commerce articles experiencing censored demand during a stock out, in comparison to traditional methods. Expanding the scope to all e-commerce articles, this thesis demonstrates that incorporating easily accessible web data yields an additional 2% error reduction on average for all articles on a color and location basis. Traditional methods to improve demand prediction have not before leveraged the expansive availability of e-commerce data, and this research presents a novel solution to the fashion forecasting challenge. 
This thesis project may additionally be used as a case-study for companies using subscriptions or an analogous tracking tool, as well as novel data features, in a user-friendly and implementable demand forecast model.
by Lila Fridley.
M.B.A.
S.M.
Wang, Ziang. "People Matching for Transportation Planning Using Optimized Features and Texel Camera Data for Sequential Estimation". DigitalCommons@USU, 2012. https://digitalcommons.usu.edu/etd/1298.
Katzwinkel, Tim, Bhavinbhai Patel, Alexander Schmid, Walter Schmidt, Justus Siebrecht, Manuel Löwer and Jörg Feldhusen. "Kosteneffiziente Technologien zur geometrischen Datenaufnahme im digitalen Reverse Engineering". Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-215118.
Fabijan, Aleksander. "Developing the right features : the role and impact of customer and product data in software product development". Licentiate thesis, Malmö högskola, Fakulteten för teknik och samhälle (TS), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-7794.
Erdogan, Ozgur. "Main Seismological Features Of Recently Compiled Turkish Strong Motion Database". Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/3/12609679/index.pdf.
Jin, Chao. "Methodology on Exact Extraction of Time Series Features for Robust Prognostics and Health Monitoring". University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504795992214385.
Mehta, Alok. "Evolving legacy system's features into fine-grained components using regression test-cases". Link to electronic thesis, 2002. http://www.wpi.edu/Pubs/ETD/Available/etd-1211102-163800.
Keywords: software maintenance; software evolution; regression test-cases; components; legacy system; incremental software evolution methodology; fine-grained components. Includes bibliographical references (p. 283-294).
Hounsell, Marcelo da Silva. "Feature-based validation reasoning for intent-driven engineering design". Thesis, Loughborough University, 1998. https://dspace.lboro.ac.uk/2134/33152.
Lee, Nien-Lung. "Feature Recognition From Scanned Data Points /". The Ohio State University, 1995. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487868114111376.
Davis, Jonathan J. "Machine learning and feature engineering for computer network security". Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/106914/1/Jonathan_Davis_Thesis.pdf.
Ramanayaka, Mudiyanselage Asanga. "Data Engineering and Failure Prediction for Hard Drive S.M.A.R.T. Data". Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1594957948648404.
Sarkar, Saurabh. "Feature Selection with Missing Data". University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1378194989.
Al-Sit, Waleed. "Automatic feature detection and interpretation in borehole data". Thesis, University of Liverpool, 2015. http://livrepository.liverpool.ac.uk/2014181/.
Abdalla, Hassan Shafik. "Development of a design for manufacture concurrent engineering system". Thesis, De Montfort University, 1995. http://hdl.handle.net/2086/4253.
Ni, Weizeng. "Ontology-based Feature Construction on Non-structured Data". University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340.
Sarkar, Biplab. "Modeling and manufacturing of multiple featured objects based on measurement data /". The Ohio State University, 1991. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487757723996478.
Muteba, Ben Ilunga. "Data Science techniques for predicting plant genes involved in secondary metabolites production". University of the Western Cape, 2018. http://hdl.handle.net/11394/7039.
Plant genome analysis is currently experiencing a boost due to reduced costs associated with the development of next generation sequencing technologies. Knowledge on genetic background can be applied to guide targeted plant selection and breeding, and to facilitate natural product discovery and biological engineering. In medicinal plants, secondary metabolites are of particular interest because they often represent the main active ingredients associated with health-promoting qualities. Plant polyphenols are a highly diverse family of aromatic secondary metabolites that act as antimicrobial agents, UV protectants, and insect or herbivore repellents. Most of the genome mining tools developed to understand genetic materials have very seldom addressed secondary metabolite genes and biosynthesis pathways. Little significant research has been conducted to study key enzyme factors that can predict a class of secondary metabolite genes from polyketide synthases. The objectives of this study were twofold: Primarily, it aimed to identify the biological properties of secondary metabolite genes and the selection of a specific gene, naringenin-chalcone synthase or chalcone synthase (CHS). The study hypothesized that data science approaches in mining biological data, particularly secondary metabolite genes, would enable the compulsory disclosure of some aspects of secondary metabolite (SM). Secondarily, the aim was to propose a proof of concept for classifying or predicting plant genes involved in polyphenol biosynthesis from data science techniques and convey these techniques in computational analysis through machine learning algorithms and mathematical and statistical approaches.
Three specific challenges experienced while analysing secondary metabolite datasets were: 1) class imbalance, which refers to a lack of proportionality among protein sequence classes; 2) high dimensionality, which refers to the large feature space that arises when analysing bioinformatics datasets; and 3) differences in protein sequence length, which refers to the fact that protein sequences have different lengths. Given these inherent issues, developing precise classification and statistical models is a challenge. Effective SM plant gene mining therefore requires dedicated data science techniques that can collect, prepare and analyse SM genes.
Khazem, Salim. "Apprentissage profond et traitement d'images pour la détection et la prédiction des nœuds au cœur des rondins". Electronic Thesis or Diss., CentraleSupélec, 2024. http://www.theses.fr/2024CSUP0016.
In the wood industry, the quality of logs is heavily influenced by their internal structure, particularly the distribution of defects, especially knots within the trees. Accurately detecting these knots, which result from branch growth, can significantly enhance the industry's efficiency by reducing waste and optimizing the quality of wood products. Traditionally, identifying knots and other internal characteristics of logs, such as centers and contours, requires specialized equipment like CT scanners, often combined with conventional computer vision approaches to obtain detailed images of the trees' internal structure. The main challenge is that such equipment is costly and not accessible to all companies, limiting its adoption in the industry. This thesis focuses on addressing this issue, particularly on detecting internal defects based on the external surface of logs. The initial goal is to automate the detection of various log characteristics. These characteristics will then be used to perform the main task, which involves utilizing contour variations to detect the distribution of internal defects. One of the contributions of this work is the automation of detecting the semantic characteristics of trees using X-ray images. We establish that deep learning-based methods can perform well in detection and generalize effectively to other species without requiring human expertise. We introduce three end-to-end pipelines for detecting different characteristics, namely tree biological centers, contours, and knots. The second significant contribution of this work is the development of a model for detecting internal defects based on the external surface. The model exclusively uses the fine contours of the log to predict the presence and distribution of internal knots, leveraging deep learning techniques. Initially, a recurrent convolutional model was employed to efficiently capture contour variations for inferring internal defects.
Subsequently, exploratory work was conducted, beginning with the development of a lightweight model for shape classification. This approach helped validate the underlying principles before extending it to the detection of internal defects, aiming to reduce model complexity without compromising result accuracy.
Null, Thomas Calvin. "Use of Self Organized Maps for Feature Extraction of Hyperspectral Data". MSSTATE, 2001. http://sun.library.msstate.edu/ETD-db/theses/available/etd-11082001-145530/.
Yeu, Yeon. "FEATURE EXTRACTION FROM HYPERSPECTRAL IMAGERY FOR OBJECT RECOGNITION". The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1306848130.
Cassabaum, Mary Lou. "Exploiting high dimensional data for signal characterization and classification in feature space". Diss., The University of Arizona, 2004. http://hdl.handle.net/10150/280592.
Li, Hua. "Feature Selection for High-risk Pattern Discovery in Medical Data". University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353154433.
Zhang, Yi. "Application of Hyper-geometric Hypothesis-based Quantication and Markov Blanket Feature Selection Methods to Generate Signals for Adverse Drug Reaction Detection". University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353343669.
Sharma, Jason P. (Jason Poonam) 1979. "Classification performance of support vector machines on genomic data utilizing feature space selection techniques". Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/87830.
Wu, You. "Feature Selection on High Dimensional Histogram Data to Improve Vehicle Components´ Life Length Prediction". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-428615.
He, Yi. "An Analysis of Airborne Data Collection Methods for Updating Highway Feature Inventory". DigitalCommons@USU, 2016. https://digitalcommons.usu.edu/etd/5016.
Allen, Andrew J. "Combining Machine Learning and Empirical Engineering Methods Towards Improving Oil Production Forecasting". DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2223.
Tennety, Chandu. "Machining Feature Recognition Using 2D Data of Extruded Operations in Solid Models". Ohio University / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1181406949.
Sivakumar, Krish. "CAD feature development and abstraction for process planning". Ohio : Ohio University, 1994. http://www.ohiolink.edu/etd/view.cgi?ohiou1180038784.
Song, Wen. "Planetary navigation activity recognition using wearable accelerometer data". Thesis, Kansas State University, 2013. http://hdl.handle.net/2097/15813.
Department of Electrical & Computer Engineering
Steve Warren
Activity recognition can be an important part of human health awareness. Many benefits can be generated from the recognition results, including knowledge of activity intensity as it relates to wellness over time. Various activity-recognition techniques have been presented in the literature, though most address simple activity-data collection and off-line analysis. More sophisticated real-time identification is less often addressed. Therefore, it is promising to consider the combination of current off-line, activity-detection methods with wearable, embedded tools in order to create a real-time wireless human activity recognition system with improved accuracy. Different from previous work on activity recognition, the goal of this effort is to focus on specific activities that an astronaut may encounter during a mission. Planetary navigation field test (PNFT) tasks are designed to meet this need. The approach used by the KSU team is to pre-record data on the ground in normal earth gravity and seek signal features that can be used to identify, and even predict, fatigue associated with these activities. The eventual goal is to then assess/predict the condition of an astronaut in a reduced-gravity environment using these predetermined rules. Several classic machine learning algorithms, including the k-Nearest Neighbor, Naïve Bayes, C4.5 Decision Tree, and Support Vector Machine approaches, were applied to these data to identify recognition algorithms suitable for real-time application. Graphical user interfaces (GUIs) were designed for both MATLAB and LabVIEW environments to facilitate recording and data analysis. Training data for the machine learning algorithms were recorded while subjects performed each activity, and then these identification approaches were applied to new data sets with an identification accuracy of around 86%. Early results indicate that a single three-axis accelerometer is sufficient to identify the occurrence of a given PNFT activity. 
A custom, embedded acceleration monitoring system employing ZigBee transmission is under development for future real-time activity recognition studies. A different GUI has been implemented for this system, which uses an on-line algorithm that will seek to identify activity at a refresh rate of 1 Hz.
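As a rough illustration of the kind of pipeline this abstract describes (features computed from three-axis accelerometer data, then a k-Nearest Neighbor classifier), the sketch below may help. It is not the thesis's method: the per-axis mean and standard deviation features, the window contents, the labels "rest" and "walk", and the function names `window_features` and `knn_predict` are all assumptions made for the example.

```python
import math
from collections import Counter


def window_features(samples):
    """Return [mean, std] for each axis of one window of (x, y, z) samples.

    The choice of mean and standard deviation is an assumption for this
    sketch; the abstract does not say which signal features were used.
    """
    feats = []
    for axis in zip(*samples):  # regroup the samples axis by axis
        mean = sum(axis) / len(axis)
        var = sum((v - mean) ** 2 for v in axis) / len(axis)
        feats.extend([mean, math.sqrt(var)])
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy windows: a still sensor versus an oscillating x axis.
rest_a = [(0.0, 0.0, 1.0)] * 4
rest_b = [(0.1, 0.0, 1.0)] * 4
walk_a = [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)] * 2
walk_b = [(0.8, 0.0, 1.0), (-0.8, 0.0, 1.0)] * 2

train = [
    (window_features(rest_a), "rest"),
    (window_features(rest_b), "rest"),
    (window_features(walk_a), "walk"),
    (window_features(walk_b), "walk"),
]

new_window = [(0.9, 0.0, 1.0), (-0.9, 0.0, 1.0)] * 2
print(knn_predict(train, window_features(new_window)))
```

A real-time variant would presumably compute the same features over a sliding window as samples arrive; the 1 Hz refresh rate mentioned in the abstract would then set how often `knn_predict` is called.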
Mortensen, Clifton H. "A Computational Fluid Dynamics Feature Extraction Method Using Subjective Logic". BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2208.
Chen, Yan. "Data Quality Assessment Methodology for Improved Prognostics Modeling". University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1330024393.
Yang, Yimin. "Exploring Hidden Coherent Feature Groups and Temporal Semantics for Multimedia Big Data Analysis". FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/2254.
Abid, Saad Bin, and Xian Wei. "Development of Software for Feature Model Rendering". Thesis, Jönköping University, JTH, Computer and Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-621.
This Master’s thesis aims to improve the management of artifacts in the context of a joint project between Jönköping University (with the SEMCO project) and an industrial partner, a company that develops software for safety components. The two parties have slightly different interests, but this project can serve both.
Feature modelling is now an efficient approach to domain analysis. The purpose of this thesis is to analyse four popular existing feature diagram notations, identify what they have in common, and draw on the results to suggest how existing notation systems can be used efficiently in different situations.
The software was developed from the knowledge established in the research analysis. Two notation systems suggested in the research part of the thesis report are implemented in the developed software, “NotationManager”. The development procedures are described, and the developers' choices are presented along with comparisons for the different situations.
The scope of both the research part and the development is discussed, and future work on the developed solution is suggested.
Hanley, John P. "A New Evolutionary Algorithm For Mining Noisy, Epistatic, Geospatial Survey Data Associated With Chagas Disease". ScholarWorks @ UVM, 2017. http://scholarworks.uvm.edu/graddis/727.
Zhou, Mu. "Knowledge Discovery and Predictive Modeling from Brain Tumor MRIs". Scholar Commons, 2015. http://scholarcommons.usf.edu/etd/5809.
Pookhao, Naruekamol. "Statistical Methods for Functional Metagenomic Analysis Based on Next-Generation Sequencing Data". Diss., The University of Arizona, 2014. http://hdl.handle.net/10150/320986.
Dill, Evan T. "Integration of 3D and 2D Imaging Data for Assured Navigation in Unknown Environments". Ohio University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1299616166.
Mizaku, Alda. "Biomolecular feature selection of colorectal cancer microarray data using GA-SVM hybrid and noise perturbation to address overfitting". Diss., Online access via UMI:, 2009.
Includes bibliographical references.
Bard, Ari. "Modeling and Predicting Heat Transfer Coefficients for Flow Boiling in Microchannels". Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1619091352188123.
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Henriksson, Erik, and Kristopher Werlinder. "Housing Price Prediction over Countrywide Data : A comparison of XGBoost and Random Forest regressor models". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302535.
The aim of this study is to compare and investigate how an XGBoost regressor and a Random Forest regressor perform in predicting house prices. This is done using two datasets. The comparison takes into account the models' training time, inference time and the three evaluation metrics R2, RMSE and MAPE. The datasets are described in detail, together with background on the regression models. The method comprises cleaning the datasets, searching for optimal hyperparameters for the models, and 5-fold cross-validation to achieve good predictions. The result of the study is that the XGBoost regressor performs better on both small and large datasets, but that it is overwhelmingly superior on large datasets. While the Random Forest model can achieve results similar to those of the XGBoost model, its training takes between 250 times as long and its inference time is about 40 times longer. This makes XGBoost particularly superior when large datasets are used.
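For readers unfamiliar with the three evaluation metrics named in this abstract, the following minimal sketch computes R2, RMSE and MAPE from their standard definitions. The function names and the sample prices are hypothetical and are not taken from the thesis, which may have used a library implementation instead.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(
        abs((t - p) / t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical house prices (arbitrary units) and model predictions.
prices = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

print(rmse(prices, predicted), mape(prices, predicted), r2(prices, predicted))
```

RMSE is expressed in the same units as the prices, MAPE is scale-free, and R2 compares the model against always predicting the mean, which is one reason studies of this kind tend to report all three together.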
Hu, Renjie. "Random neural networks for dimensionality reduction and regularized supervised learning". Diss., University of Iowa, 2019. https://ir.uiowa.edu/etd/6960.
Testo completoGe, Esther. "The query based learning system for lifetime prediction of metallic components". Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/18345/4/Esther_Ting_Ge_Thesis.pdf.
Testo completoGe, Esther. "The query based learning system for lifetime prediction of metallic components". Queensland University of Technology, 2008. http://eprints.qut.edu.au/18345/.