Dissertations / Theses on the topic 'Data / features engineering'
Consult the top 50 dissertations / theses for your research on the topic 'Data / features engineering.'
Mohammed, Hussein Syed. "Random feature subspace ensemble based approaches for the analysis of data with missing features." Full text available online, 2006. http://www.lib.rowan.edu/find/theses.
Baik, Edward H. (Edward Hyeen). "Surface-based segmentation of volume data using texture features." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43516.
Includes bibliographical references (p. 117-123).
by Edward H. Baik.
M.Eng.
Campbell, Richard John. "Recognition of free-form 3D objects in range data using global and local features." The Ohio State University, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=osu1486397841221694.
Oldfield, Robin B. "Lithological mapping of Northwest Argentina with remote sensing data using tonal, textural and contextual features." Thesis, Aston University, 1988. http://publications.aston.ac.uk/14287/.
Mora, Omar Ernesto. "Morphology-Based Identification of Surface Features to Support Landslide Hazard Detection Using Airborne LiDAR Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1429861576.
Fridley, Lila (Lila J.). "Improving online demand forecast using novel features in website data : a case study at Zara." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117976.
Thesis: S.M., Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, in conjunction with the Leaders for Global Operations Program at MIT, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 77).
The challenge of improving retail inventory customer service level while reducing costs is common across many retailers. This problem is typically addressed through efficient supply chain operations. This thesis discusses the development of new methodologies to predict e-commerce consumer demand for seasonal, short life-cycle articles. The new methodology incorporates novel data to predict demand of existing products through a bottom-up point forecast at the color and location level. It addresses the widely observed challenge of forecasting censored demand during a stock out. Zara introduces thousands of new items each season across over 2100 stores in 93 markets worldwide [1]. The Zara Distribution team is responsible for allocating inventory to each physical and e-commerce store. In line with Zara's quick to retail strategy, Distribution is flexible and responsive in forecasting store demand, with new styles arriving in stores twice per week [1]. The company is interested in improving the demand forecast by leveraging the novel e-commerce data that has become available since the launch of Zara.com in 2010 [2]. The results of this thesis demonstrate that the addition of new data to a linear regression model reduces prediction error by an average of 16% for e-commerce articles experiencing censored demand during a stock out, in comparison to traditional methods. Expanding the scope to all e-commerce articles, this thesis demonstrates that incorporating easily accessible web data yields an additional 2% error reduction on average for all articles on a color and location basis. Traditional methods to improve demand prediction have not before leveraged the expansive availability of e-commerce data, and this research presents a novel solution to the fashion forecasting challenge. 
This thesis project may additionally be used as a case-study for companies using subscriptions or an analogous tracking tool, as well as novel data features, in a user-friendly and implementable demand forecast model.
by Lila Fridley.
M.B.A.
S.M.
Wang, Ziang. "People Matching for Transportation Planning Using Optimized Features and Texel Camera Data for Sequential Estimation." DigitalCommons@USU, 2012. https://digitalcommons.usu.edu/etd/1298.
Katzwinkel, Tim, Bhavinbhai Patel, Alexander Schmid, Walter Schmidt, Justus Siebrecht, Manuel Löwer, and Jörg Feldhusen. "Kosteneffiziente Technologien zur geometrischen Datenaufnahme im digitalen Reverse Engineering." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-215118.
Fabijan, Aleksander. "Developing the right features : the role and impact of customer and product data in software product development." Licentiate thesis, Malmö högskola, Fakulteten för teknik och samhälle (TS), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-7794.
Erdogan, Ozgur. "Main Seismological Features Of Recently Compiled Turkish Strong Motion Database." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/3/12609679/index.pdf.
Jin, Chao. "Methodology on Exact Extraction of Time Series Features for Robust Prognostics and Health Monitoring." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504795992214385.
Mehta, Alok. "Evolving legacy system's features into fine-grained components using regression test-cases." Link to electronic thesis, 2002. http://www.wpi.edu/Pubs/ETD/Available/etd-1211102-163800.
Keywords: software maintenance; software evolution; regression test-cases; components; legacy system; incremental software evolution methodology; fine-grained components. Includes bibliographical references (p. 283-294).
Hounsell, Marcelo da Silva. "Feature-based validation reasoning for intent-driven engineering design." Thesis, Loughborough University, 1998. https://dspace.lboro.ac.uk/2134/33152.
Lee, Nien-Lung. "Feature Recognition From Scanned Data Points." The Ohio State University, 1995. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487868114111376.
Davis, Jonathan J. "Machine learning and feature engineering for computer network security." Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/106914/1/Jonathan_Davis_Thesis.pdf.
Ramanayaka, Mudiyanselage Asanga. "Data Engineering and Failure Prediction for Hard Drive S.M.A.R.T. Data." Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1594957948648404.
Sarkar, Saurabh. "Feature Selection with Missing Data." University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1378194989.
Al-Sit, Waleed. "Automatic feature detection and interpretation in borehole data." Thesis, University of Liverpool, 2015. http://livrepository.liverpool.ac.uk/2014181/.
Abdalla, Hassan Shafik. "Development of a design for manufacture concurrent engineering system." Thesis, De Montfort University, 1995. http://hdl.handle.net/2086/4253.
Ni, Weizeng. "Ontology-based Feature Construction on Non-structured Data." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340.
Sarkar, Biplab. "Modeling and manufacturing of multiple featured objects based on measurement data." The Ohio State University, 1991. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487757723996478.
Muteba, Ben Ilunga. "Data Science techniques for predicting plant genes involved in secondary metabolites production." University of the Western Cape, 2018. http://hdl.handle.net/11394/7039.
Plant genome analysis is currently experiencing a boost due to reduced costs associated with the development of next generation sequencing technologies. Knowledge of the genetic background can be applied to guide targeted plant selection and breeding, and to facilitate natural product discovery and biological engineering. In medicinal plants, secondary metabolites are of particular interest because they often represent the main active ingredients associated with health-promoting qualities. Plant polyphenols are a highly diverse family of aromatic secondary metabolites that act as antimicrobial agents, UV protectants, and insect or herbivore repellents. Most genome mining tools developed to understand genetic material have seldom addressed secondary metabolite genes and biosynthesis pathways. Little significant research has been conducted to study key enzyme factors that can predict a class of secondary metabolite genes from polyketide synthases. The objectives of this study were twofold. First, it aimed to identify the biological properties of secondary metabolite genes and to select a specific gene, naringenin-chalcone synthase or chalcone synthase (CHS). The study hypothesized that data science approaches to mining biological data, particularly secondary metabolite genes, would reveal some aspects of secondary metabolites (SM). Second, it aimed to propose a proof of concept for classifying or predicting plant genes involved in polyphenol biosynthesis using data science techniques, and to apply these techniques in computational analysis through machine learning algorithms and mathematical and statistical approaches.
Three specific challenges experienced while analysing secondary metabolite datasets were: 1) class imbalance, which refers to a lack of proportionality among protein sequence classes; 2) high dimensionality, which refers to the large feature space that arises when analysing bioinformatics datasets; and 3) differences in protein sequence length, that is, the fact that protein sequences have different lengths. Given these inherent issues, developing precise classification and statistical models is a challenge. Therefore, effective SM plant gene mining requires dedicated data science techniques that can collect, prepare and analyse SM genes.
Khazem, Salim. "Apprentissage profond et traitement d'images pour la détection et la prédiction des nœuds au cœur des rondins." Electronic Thesis or Diss., CentraleSupélec, 2024. http://www.theses.fr/2024CSUP0016.
In the wood industry, the quality of logs is heavily influenced by their internal structure, particularly the distribution of defects, especially knots within the trees. Accurately detecting these knots, which result from branch growth, can significantly enhance the industry's efficiency by reducing waste and optimizing the quality of wood products. Traditionally, identifying knots and other internal characteristics of logs, such as centers and contours, requires specialized equipment like CT scanners, often combined with conventional computer vision approaches to obtain detailed images of the trees' internal structure. The main challenge is that such equipment is costly and not accessible to all companies, limiting its adoption in the industry. This thesis focuses on addressing this issue, particularly on detecting internal defects based on the external surface of logs. The initial goal is to automate the detection of various log characteristics. These characteristics will then be used to perform the main task, which involves utilizing contour variations to detect the distribution of internal defects. One of the contributions of this work is the automation of detecting the semantic characteristics of trees using X-ray images. We establish that deep learning-based methods can perform well in detection and generalize effectively to other species without requiring human expertise. We introduce three end-to-end pipelines for detecting different characteristics, namely tree biological centers, contours, and knots. The second significant contribution of this work is the development of a model for detecting internal defects based on the external surface. The model exclusively uses the fine contours of the log to predict the presence and distribution of internal knots, leveraging deep learning techniques. Initially, a recurrent convolutional model was employed to efficiently capture contour variations for inferring internal defects.
Subsequently, exploratory work was conducted, beginning with the development of a lightweight model for shape classification. This approach helped validate the underlying principles before extending it to the detection of internal defects, aiming to reduce model complexity without compromising result accuracy.
Null, Thomas Calvin. "Use of Self Organized Maps for Feature Extraction of Hyperspectral Data." MSSTATE, 2001. http://sun.library.msstate.edu/ETD-db/theses/available/etd-11082001-145530/.
Yeu, Yeon. "FEATURE EXTRACTION FROM HYPERSPECTRAL IMAGERY FOR OBJECT RECOGNITION." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1306848130.
Cassabaum, Mary Lou. "Exploiting high dimensional data for signal characterization and classification in feature space." Diss., The University of Arizona, 2004. http://hdl.handle.net/10150/280592.
Li, Hua. "Feature Selection for High-risk Pattern Discovery in Medical Data." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353154433.
Zhang, Yi. "Application of Hyper-geometric Hypothesis-based Quantification and Markov Blanket Feature Selection Methods to Generate Signals for Adverse Drug Reaction Detection." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353343669.
Sharma, Jason P. (Jason Poonam) 1979. "Classification performance of support vector machines on genomic data utilizing feature space selection techniques." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/87830.
Wu, You. "Feature Selection on High Dimensional Histogram Data to Improve Vehicle Components´ Life Length Prediction." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-428615.
He, Yi. "An Analysis of Airborne Data Collection Methods for Updating Highway Feature Inventory." DigitalCommons@USU, 2016. https://digitalcommons.usu.edu/etd/5016.
Allen, Andrew J. "Combining Machine Learning and Empirical Engineering Methods Towards Improving Oil Production Forecasting." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2223.
Tennety, Chandu. "Machining Feature Recognition Using 2D Data of Extruded Operations in Solid Models." Ohio University / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1181406949.
Sivakumar, Krish. "CAD feature development and abstraction for process planning." Ohio : Ohio University, 1994. http://www.ohiolink.edu/etd/view.cgi?ohiou1180038784.
Song, Wen. "Planetary navigation activity recognition using wearable accelerometer data." Thesis, Kansas State University, 2013. http://hdl.handle.net/2097/15813.
Department of Electrical & Computer Engineering
Steve Warren
Activity recognition can be an important part of human health awareness. Many benefits can be generated from the recognition results, including knowledge of activity intensity as it relates to wellness over time. Various activity-recognition techniques have been presented in the literature, though most address simple activity-data collection and off-line analysis. More sophisticated real-time identification is less often addressed. Therefore, it is promising to consider the combination of current off-line, activity-detection methods with wearable, embedded tools in order to create a real-time wireless human activity recognition system with improved accuracy. Different from previous work on activity recognition, the goal of this effort is to focus on specific activities that an astronaut may encounter during a mission. Planetary navigation field test (PNFT) tasks are designed to meet this need. The approach used by the KSU team is to pre-record data on the ground in normal earth gravity and seek signal features that can be used to identify, and even predict, fatigue associated with these activities. The eventual goal is to then assess/predict the condition of an astronaut in a reduced-gravity environment using these predetermined rules. Several classic machine learning algorithms, including the k-Nearest Neighbor, Naïve Bayes, C4.5 Decision Tree, and Support Vector Machine approaches, were applied to these data to identify recognition algorithms suitable for real-time application. Graphical user interfaces (GUIs) were designed for both MATLAB and LabVIEW environments to facilitate recording and data analysis. Training data for the machine learning algorithms were recorded while subjects performed each activity, and then these identification approaches were applied to new data sets with an identification accuracy of around 86%. Early results indicate that a single three-axis accelerometer is sufficient to identify the occurrence of a given PNFT activity. 
A custom, embedded acceleration monitoring system employing ZigBee transmission is under development for future real-time activity recognition studies. A different GUI has been implemented for this system, which uses an on-line algorithm that will seek to identify activity at a refresh rate of 1 Hz.
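The recognition approach summarized in this abstract (signal features taken from three-axis accelerometer data, then a classic classifier such as k-Nearest Neighbor) can be illustrated with a minimal sketch. It is not the thesis's implementation: the per-axis mean and standard deviation features, the "rest" and "walk" labels, and every sample value below are assumptions made up for illustration.

```python
import math
from collections import Counter

def window_features(samples):
    """Per-axis mean and standard deviation for a window of (x, y, z) samples.

    The feature choice is an assumption; the abstract does not list the
    features actually used.
    """
    n = len(samples)
    feats = []
    for axis in range(3):
        vals = [s[axis] for s in samples]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        feats.extend([mean, math.sqrt(var)])
    return feats

def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training windows closest to the query."""
    dists = sorted(
        (math.dist(f, query), label)
        for f, label in zip(train_feats, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training windows (values are invented, in units of g).
windows = [
    ([(0.0, 0.0, 1.0)] * 4, "rest"),
    ([(0.0, 0.1, 1.0), (0.0, -0.1, 1.0)] * 2, "rest"),
    ([(0.5, 0.0, 1.5), (-0.5, 0.0, 0.5)] * 2, "walk"),
    ([(0.6, 0.2, 1.4), (-0.6, -0.2, 0.6)] * 2, "walk"),
]
train_feats = [window_features(w) for w, _ in windows]
train_labels = [label for _, label in windows]

# A new, unlabeled window to classify.
query = window_features([(0.4, 0.1, 1.3), (-0.4, -0.1, 0.7)] * 2)
prediction = knn_predict(train_feats, train_labels, query, k=3)
print(prediction)
```

In a real-time setting like the 1 Hz refresh the abstract mentions, one such window would be classified per second; the window length here is arbitrary.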
Mortensen, Clifton H. "A Computational Fluid Dynamics Feature Extraction Method Using Subjective Logic." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2208.
Chen, Yan. "Data Quality Assessment Methodology for Improved Prognostics Modeling." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1330024393.
Yang, Yimin. "Exploring Hidden Coherent Feature Groups and Temporal Semantics for Multimedia Big Data Analysis." FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/2254.
Abid, Saad Bin, and Xian Wei. "Development of Software for Feature Model Rendering." Thesis, Jönköping University, JTH, Computer and Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-621.
This Master’s thesis aims to improve the management of artifacts in the context of a joint project between Jönköping University, through the SEMCO project, and an industrial partner, a company that develops software for safety components. The two parties have slightly different interests, but this project can serve both.
Feature modelling is now an efficient method for domain analysis. The purpose of this thesis is to analyse four popular existing feature diagram notations, identify the commonalities among them, and draw conclusions that suggest how to use existing notation systems efficiently in different situations.
The developed software is based on knowledge established in the research analysis. Two notation systems suggested in the research part of the thesis report are implemented in the developed software, “NotationManager”. The development procedures are also described, and the developers' choices are discussed along with comparisons for the relevant situations.
The scope of both the research part and the development is discussed. Future work on the developed solution is also suggested.
Hanley, John P. "A New Evolutionary Algorithm For Mining Noisy, Epistatic, Geospatial Survey Data Associated With Chagas Disease." ScholarWorks @ UVM, 2017. http://scholarworks.uvm.edu/graddis/727.
Zhou, Mu. "Knowledge Discovery and Predictive Modeling from Brain Tumor MRIs." Scholar Commons, 2015. http://scholarcommons.usf.edu/etd/5809.
Pookhao, Naruekamol. "Statistical Methods for Functional Metagenomic Analysis Based on Next-Generation Sequencing Data." Diss., The University of Arizona, 2014. http://hdl.handle.net/10150/320986.
Dill, Evan T. "Integration of 3D and 2D Imaging Data for Assured Navigation in Unknown Environments." Ohio University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1299616166.
Mizaku, Alda. "Biomolecular feature selection of colorectal cancer microarray data using GA-SVM hybrid and noise perturbation to address overfitting." Diss., Online access via UMI:, 2009.
Includes bibliographical references.
Bard, Ari. "Modeling and Predicting Heat Transfer Coefficients for Flow Boiling in Microchannels." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1619091352188123.
Regnier, Lise. "Localization, Characterization and Recognition of Singing Voices." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00687475.
Henriksson, Erik, and Kristopher Werlinder. "Housing Price Prediction over Countrywide Data : A comparison of XGBoost and Random Forest regressor models." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302535.
The aim of this study is to compare and investigate how an XGBoost regressor and a Random Forest regressor perform in predicting house prices. This is done using two datasets. The comparison considers the models' training time, inference time, and the three evaluation metrics R², RMSE and MAPE. The datasets are described in detail, together with background on the regression models. The method includes cleaning the datasets, searching for optimal hyperparameters for the models, and 5-fold cross-validation to achieve good predictions. The result of the study is that the XGBoost regressor performs better on both small and large datasets, and that it is clearly superior on large datasets. While the Random Forest model can achieve results similar to those of the XGBoost model, its training takes roughly 250 times as long and its inference time is approximately 40 times longer. This makes XGBoost particularly advantageous when using large datasets.
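The three evaluation metrics named in this abstract (R², RMSE and MAPE) and the 5-fold split have standard definitions, which the sketch below writes out in plain Python. It is an illustration only: the price values are invented, the contiguous unshuffled folds are an assumption, and none of this code comes from the thesis.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error (undefined if a true value is 0)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def k_fold_indices(n_samples, k=5):
    """Return k (train, test) index lists over contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds

# Invented prices, purely to exercise the functions.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(r2_score(actual, predicted))
print(rmse(actual, predicted))
print(mape(actual, predicted))
print(len(k_fold_indices(10, k=5)))
```

In a comparison like the one described, each model would be fitted on the training indices of every fold and scored on the held-out indices, with the metrics averaged across folds.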
Hu, Renjie. "Random neural networks for dimensionality reduction and regularized supervised learning." Diss., University of Iowa, 2019. https://ir.uiowa.edu/etd/6960.
Ge, Esther. "The query based learning system for lifetime prediction of metallic components." Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/18345/4/Esther_Ting_Ge_Thesis.pdf.
Ge, Esther. "The query based learning system for lifetime prediction of metallic components." Queensland University of Technology, 2008. http://eprints.qut.edu.au/18345/.