A selection of scholarly literature on the topic "Data engineering and data science"
Browse lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Data engineering and data science".
Journal articles on the topic "Data engineering and data science"
Klettke, Meike, and Uta Störl. "Four Generations in Data Engineering for Data Science." Datenbank-Spektrum 22, no. 1 (December 22, 2021): 59–66. http://dx.doi.org/10.1007/s13222-021-00399-3.
Krieger, James. "Data on academic science/engineering updated." Chemical & Engineering News 65, no. 1 (January 5, 1987): 18. http://dx.doi.org/10.1021/cen-v065n001.p018a.
Bertino, Elisa. "Introduction to Data Science and Engineering." Data Science and Engineering 1, no. 1 (February 25, 2016): 1–3. http://dx.doi.org/10.1007/s41019-016-0005-1.
Kulkarni, Nishant. "Olympic Data Analysis using Data Science." International Journal for Research in Applied Science and Engineering Technology 10, no. 12 (December 31, 2022): 855–61. http://dx.doi.org/10.22214/ijraset.2022.48046.
Duever, Thomas A. "Data Science in the Chemical Engineering Curriculum." Processes 7, no. 11 (November 8, 2019): 830. http://dx.doi.org/10.3390/pr7110830.
Ashraf, Chowdhury, Nisarg Joshi, David A. C. Beck, and Jim Pfaendtner. "Data Science in Chemical Engineering: Applications to Molecular Science." Annual Review of Chemical and Biomolecular Engineering 12, no. 1 (June 7, 2021): 15–37. http://dx.doi.org/10.1146/annurev-chembioeng-101220-102232.
Cressie, Noel. "Comment: When Is It Data Science and When Is It Data Engineering?" Journal of the American Statistical Association 115, no. 530 (April 2, 2020): 660–62. http://dx.doi.org/10.1080/01621459.2020.1762619.
Kroll, Joshua A. "Data Science Data Governance [AI Ethics]." IEEE Security & Privacy 16, no. 6 (November 2018): 61–70. http://dx.doi.org/10.1109/msec.2018.2875329.
Hering, Janet G. "From Slide Rule to Big Data: How Data Science is Changing Water Science and Engineering." Journal of Environmental Engineering 145, no. 8 (August 2019): 02519001. http://dx.doi.org/10.1061/(asce)ee.1943-7870.0001578.
Gibert, Karina, Jeffery S. Horsburgh, Ioannis N. Athanasiadis, and Geoff Holmes. "Environmental Data Science." Environmental Modelling & Software 106 (August 2018): 4–12. http://dx.doi.org/10.1016/j.envsoft.2018.04.005.
Dissertations on the topic "Data engineering and data science"
Kanter, Max (James Max). "The data science machine : emulating human intelligence in data science endeavors." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/107031.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 87-88).
Data scientists are responsible for many tasks in the data analysis process including formulating the question, generating features, building a model, and disseminating the results. The Data Science Machine is an automated system that emulates a human data scientist's ability to generate predictive models from raw data. In this thesis, we propose the Deep Feature Synthesis algorithm for automatically generating features for relational datasets. We implement this algorithm and test it on 3 data science competitions that have participation from nearly 1000 data science enthusiasts. In 2 of the 3 competitions we beat a majority of competitors, and in the third, we achieve 94% of the best competitor's score. Finally, we take steps towards incorporating the Data Science Machine into the data science process by implementing and evaluating an interface for users to interact with the Data Science Machine.
by Max Kanter
M. Eng.
Wason, Jasmin Lesley. "Automating data management in science and engineering." Thesis, University of Southampton, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.396143.
Smith, Micah J. (Micah Jacob). "Scaling collaborative open data science." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117819.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 103-107).
Large-scale, collaborative, open data science projects have the potential to address important societal problems using the tools of predictive machine learning. However, no suitable framework exists to develop such projects collaboratively and openly, at scale. In this thesis, I discuss the deficiencies of current approaches and then develop new approaches for this problem through systems, algorithms, and interfaces. A central theme is the restructuring of data science projects into scalable, fundamental units of contribution. I focus on feature engineering, structuring contributions as the creation of independent units of feature function source code. This then facilitates the integration of many submissions by diverse collaborators into a single, unified, machine learning model, where contributions can be rigorously validated and verified to ensure reproducibility and trustworthiness. I validate this concept by designing and implementing a cloud-based collaborative feature engineering platform, FeatureHub, as well as an associated discussion platform for real-time collaboration. The platform is validated through an extensive user study and modeling performance is benchmarked against data science competition results. In the process, I also collect and analyze a novel data set on the feature engineering source code submitted by crowd data scientist workers of varying backgrounds around the world. Within this context, I discuss paths forward for collaborative data science.
by Micah J. Smith.
S.M. in Computer Science
Yang, Ying. "Interactive Data Management and Data Analysis." Thesis, State University of New York at Buffalo, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10288109.
Everyone today has a big data problem. Data is everywhere and in different formats; it may be referred to as data lakes, data streams, or data swamps. To extract knowledge or insights from the data or to support decision-making, we need to go through a process of collecting, cleaning, managing and analyzing the data. In this process, data cleaning and data analysis are two of the most important and time-consuming components.
One common challenge in these two components is a lack of interaction. The data cleaning and data analysis are typically done as a batch process, operating on the whole dataset without any feedback. This leads to long, frustrating delays during which users have no idea if the process is effective. Lacking interaction, human expert effort is needed to make decisions on which algorithms or parameters to use in the systems for these two components.
We should teach computers to talk to humans, not the other way around. This dissertation focuses on building two systems, Mimir and CIA, that help users conduct data cleaning and analysis through interaction. Mimir is a system that allows users to clean big data in a cost- and time-efficient way through interaction, a process I call on-demand ETL. Convergent inference algorithms (CIA) are a family of inference algorithms in probabilistic graphical models (PGM) that enjoy the benefits of both exact and approximate inference algorithms through interaction.
Mimir provides a general language for users to express different data cleaning needs. It acts as a shim layer that wraps around the database, making it possible for the bulk of the ETL process to remain within a classical deterministic system. Mimir also helps users to measure the quality of an analysis result and provides rankings for cleaning tasks to improve the result quality in a cost-efficient manner. CIA focuses on providing user interaction through the process of inference in PGMs. The goal of CIA is to free users from the upfront commitment to either approximate or exact inference, and give users more control over time/accuracy trade-offs to direct decision-making and computation instance allocations. This dissertation describes the Mimir and CIA frameworks to demonstrate that it is feasible to build efficient interactive data management and data analysis systems.
Gertner, Yael. "Private data base access schemes avoiding data distribution." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/42730.
Li, Richard D. (Richard Ding) 1978. "Web clickstream data analysis using a dimensional data warehouse." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86671.
Includes bibliographical references (leaves 83-84).
by Richard D. Li.
M.Eng.
Ramanayaka, Mudiyanselage Asanga. "Data Engineering and Failure Prediction for Hard Drive S.M.A.R.T. Data." Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1594957948648404.
Derksen, Timothy J. (Timothy John). "Processing of outliers and missing data in multivariate manufacturing data." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38800.
Includes bibliographical references (leaf 64).
by Timothy J. Derksen.
M.Eng.
Wang, Yi. "Data Management and Data Processing Support on Array-Based Scientific Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1436157356.
Chiesa, Alessandro. "Proof-carrying data." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/61151.
Page 96 blank. Cataloged from PDF version of thesis.
Includes bibliographical references (p. 87-95).
The security of systems can often be expressed as ensuring that some property is maintained at every step of a distributed computation conducted by untrusted parties. Special cases include integrity of programs running on untrusted platforms, various forms of confidentiality and side-channel resilience, and domain-specific invariants. We propose a new approach, proof-carrying data (PCD), which sidesteps the threat of faults and leakage by reasoning about properties of a computation's output data, regardless of the process that produced it. In PCD, the system designer prescribes the desired properties of a computation's outputs. Corresponding proofs are attached to every message flowing through the system, and are mutually verified by the system's components. Each such proof attests that the message's data and all of its history comply with the prescribed properties. We construct a general protocol compiler that generates, propagates, and verifies such proofs of compliance, while preserving the dynamics and efficiency of the original computation. Our main technical tool is the cryptographic construction of short non-interactive arguments (computationally-sound proofs) for statements whose truth depends on "hearsay evidence": previous arguments about other statements. To this end, we attain a particularly strong proof-of-knowledge property. We realize the above, under standard cryptographic assumptions, in a model where the prover has blackbox access to some simple functionality - essentially, a signature card.
by Alessandro Chiesa.
M.Eng.
Books on the topic "Data engineering and data science"
Cui, Zhen, Jinshan Pan, Shanshan Zhang, Liang Xiao, and Jian Yang, eds. Intelligence Science and Big Data Engineering. Visual Data Engineering. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36189-1.
Lee, Roger, ed. Big Data, Cloud Computing, Data Science & Engineering. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-319-96803-2.
Lee, Roger, ed. Big Data, Cloud Computing, and Data Science Engineering. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-24405-7.
He, Xiaofei, Xinbo Gao, Yanning Zhang, Zhi-Hua Zhou, Zhi-Yong Liu, Baochuan Fu, Fuyuan Hu, and Zhancheng Zhang, eds. Intelligence Science and Big Data Engineering. Image and Video Data Engineering. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23989-7.
King, Tim. Data Network Engineering. Boston, MA: Springer US, 1999.
Cheremisinoff, Paul N. Process engineering data book. Lancaster, PA: Technomic Pub., 1995.
Unit, Engineering Sciences Data. Engineering sciences data: fatigue - fracture mechanics data. London: Engineering Sciences Data Unit, 1985.
Unit, Engineering Sciences Data. Engineering sciences data: wind engineering. London: Engineering Sciences Data Unit, 1985.
Madarshahian, Ramin, and Francois Hemez, eds. Data Science in Engineering, Volume 9. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-76004-5.
Polkowski, Zdzislaw, Sambit Kumar Mishra, and Julian Vasilev. Data Science in Engineering and Management. New York: CRC Press, 2021. http://dx.doi.org/10.1201/9781003216278.
Book chapters on the topic "Data engineering and data science"
Quix, Christoph. "Data Engineering." In Data Science, 85–104. Wiesbaden: Springer Fachmedien Wiesbaden, 2021. http://dx.doi.org/10.1007/978-3-658-33403-1_5.
Papp, Stefan, and Bernhard Ortner. "Data Engineering." In Handbuch Data Science und KI, 112–44. 2nd ed. München: Carl Hanser Verlag GmbH & Co. KG, 2022. http://dx.doi.org/10.3139/9783446472457.004.
Varga, Ervin. "Data Engineering." In Practical Data Science with Python 3, 29–71. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-4859-1_2.
Papp, Stefan, and Bernhard Ortner. "Data Engineering." In The Handbook of Data Science and AI, 101–30. München: Carl Hanser Verlag GmbH & Co. KG, 2022. http://dx.doi.org/10.3139/9781569908877.004.
Soh, Julian, and Priyanshi Singh. "Data Preparation and Data Engineering Basics." In Data Science Solutions on Azure, 65–115. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-6405-8_3.
Soviany, Sorin, and Cristina Soviany. "Feature Engineering." In Principles of Data Science, 79–103. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-43981-1_5.
Mahalle, Parikshit Narendra, Gitanjali Rahul Shinde, Priya Dudhale Pise, and Jyoti Yogesh Deshmukh. "Data Science in Civil Engineering and Mechanical Engineering." In Studies in Big Data, 87–99. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-5160-1_6.
Patgiri, Ripon, and Sabuzima Nayak. "Big Biomedical Data Engineering." In Principles of Data Science, 31–48. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-43981-1_3.
Duboue, Pablo. "Feature Engineering." In Applied Data Science in Tourism, 109–27. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-88389-8_7.
Vuppalapati, Chandrasekar. "Data Engineering and Exploratory Data Analysis Techniques." In International Series in Operations Research & Management Science, 75–158. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-77485-1_2.
Conference papers on the topic "Data engineering and data science"
Oyamada, Masafumi. "Extracting Feature Engineering Knowledge from Data Science Notebooks." In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2019. http://dx.doi.org/10.1109/bigdata47090.2019.9006522.
Glotzer, Sharon C. "Data Science for Assembly Engineering." In KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3447548.3469649.
Menzies, Tim, Ekrem Kocaguneli, Fayola Peters, Burak Turhan, and Leandro L. Minku. "Data science for software engineering." In 2013 35th International Conference on Software Engineering (ICSE). IEEE, 2013. http://dx.doi.org/10.1109/icse.2013.6606752.
Lam, Hoang Thanh, Beat Buesser, Hong Min, Tran Ngoc Minh, Martin Wistuba, Udayan Khurana, Gregory Bramble, Theodoros Salonidis, Dakuo Wang, and Horst Samulowitz. "Automated Data Science for Relational Data." In 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2021. http://dx.doi.org/10.1109/icde51399.2021.00305.
Drummond, David E. "Open sourcing education for Data Engineering and Data Science." In 2016 IEEE Frontiers in Education Conference (FIE). IEEE, 2016. http://dx.doi.org/10.1109/fie.2016.7757517.
Cruz, Lito Perez. "When Data Science Becomes Software Engineering." In 9th International Conference on Knowledge Engineering and Ontology Development. SCITEPRESS - Science and Technology Publications, 2017. http://dx.doi.org/10.5220/0006508502260232.
Chakravaram, Venkamaraju, Vidya Sagar Rao G., Jangirala Srinivas, and Sunitha Ratnakaram. "The Role of Big Data, Data Science and Data Analytics in Financial Engineering." In the 2019 International Conference. New York, New York, USA: ACM Press, 2019. http://dx.doi.org/10.1145/3341620.3341630.
Leung, Carson K., Yubo Chen, Siyuan Shang, and Deyu Deng. "Big Data Science on COVID-19 Data." In 2020 IEEE 14th International Conference on Big Data Science and Engineering (BigDataSE). IEEE, 2020. http://dx.doi.org/10.1109/bigdatase50710.2020.00010.
Haas, Laura. "Leveraging Data and People to Accelerate Data Science." In 2017 IEEE 33rd International Conference on Data Engineering (ICDE). IEEE, 2017. http://dx.doi.org/10.1109/icde.2017.9.
Dubath, Pierre, Roland Walter, and Thierry Courvoisier. "INTEGRAL Science Data Center." In SPIE's 1996 International Symposium on Optical Science, Engineering, and Instrumentation, edited by Brian D. Ramsey and Thomas A. Parnell. SPIE, 1996. http://dx.doi.org/10.1117/12.253992.
Reports of organizations on the topic "Data engineering and data science"
Greenberg, Jane, Samantha Grabus, Florence Hudson, Tim Kraska, Samuel Madden, René Bastón, and Katie Naum. The Northeast Big Data Innovation Hub: "Enabling Seamless Data Sharing in Industry and Academia" Workshop Report. Drexel University, March 2017. http://dx.doi.org/10.17918/d8159v.
Daniels, Matthew, Autumn Toney, Melissa Flagg, and Charles Yang. Machine Intelligence for Scientific Discovery and Engineering Invention. Center for Security and Emerging Technology, May 2021. http://dx.doi.org/10.51593/20200099.
Halford, Alison. Working towards modern, affordable & sustainable energy systems in the context of displacement. Recommendations for researchers and practitioners. Coventry University, January 2020. http://dx.doi.org/10.18552/heed/2020/0001.
Caplin, Andrew. Economic Data Engineering. Cambridge, MA: National Bureau of Economic Research, October 2021. http://dx.doi.org/10.3386/w29378.
DEFENSE LOGISTICS AGENCY ALEXANDRIA VA. Data Quality Engineering Handbook. Fort Belvoir, VA: Defense Technical Information Center, June 1994. http://dx.doi.org/10.21236/ada315573.
Steeves, Brye, and Donald Montoya. NSDS Nuclear Science Data Solutions. Office of Scientific and Technical Information (OSTI), September 2020. http://dx.doi.org/10.2172/1663156.
Feldgoise, Jacob, and Remco Zwetsloot. Estimating the Number of Chinese STEM Students in the United States. Center for Security and Emerging Technology, October 2020. http://dx.doi.org/10.51593/20200023.
Bishop, Bradley Wade. Job analyses of earth science data librarians and data managers. University of Tennessee, Knoxville Libraries, February 2020. http://dx.doi.org/10.7290/mi9a8xvdto.
Bidier, S., U. Khristenko, A. Kodakkal, C. Soriano, and R. Rossi. D7.4 Final report on Stochastic Optimization results. Scipedia, 2022. http://dx.doi.org/10.23967/exaqute.2022.3.02.
Ishkov, Vitaly, N. Sergeyeva, L. Zabarinskaya, M. Nisilevich, E. Kedrov, and T. Krylova. Data on Solar Activity for Science. Balkan, Black sea and Caspian sea Regional Network for Space Weather Studies, July 2019. http://dx.doi.org/10.31401/sungeo.2019.01.01.