Academic literature on the topic 'Big Data, Machine Learning, Data Science, Apache Spark'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Big Data, Machine Learning, Data Science, Apache Spark.'
Journal articles on the topic "Big Data, Machine Learning, Data Science, Apache Spark"
Mutasher, Watheq Ghanim, and Abbas Fadhil Aljuboori. "Real Time Big Data Sentiment Analysis and Classification of Facebook." Webology 19, no. 1 (January 20, 2022): 1112–27. http://dx.doi.org/10.14704/web/v19i1/web19076.
Omar, Hoger Khayrolla, and Alaa Khalil Jumaa. "Distributed big data analysis using spark parallel data processing." Bulletin of Electrical Engineering and Informatics 11, no. 3 (June 1, 2022): 1505–15. http://dx.doi.org/10.11591/eei.v11i3.3187.
Omar, Hoger Khayrolla, and Alaa Khalil Jumaa. "Big Data Analysis Using Apache Spark MLlib and Hadoop HDFS with Scala and Java." Kurdistan Journal of Applied Research 4, no. 1 (May 8, 2019): 7–14. http://dx.doi.org/10.24017/science.2019.1.2.
Wei, Chih-Chiang, and Tzu-Hao Chou. "Typhoon Quantitative Rainfall Prediction from Big Data Analytics by Using the Apache Hadoop Spark Parallel Computing Framework." Atmosphere 11, no. 8 (August 17, 2020): 870. http://dx.doi.org/10.3390/atmos11080870.
Gupta, Madhuri, and Bharat Gupta. "Survey of Breast Cancer Detection Using Machine Learning Techniques in Big Data." Journal of Cases on Information Technology 21, no. 3 (July 2019): 80–92. http://dx.doi.org/10.4018/jcit.2019070106.
Kamburugamuve, Supun, Pulasthi Wickramasinghe, Saliya Ekanayake, and Geoffrey C. Fox. "Anatomy of machine learning algorithm implementations in MPI, Spark, and Flink." International Journal of High Performance Computing Applications 32, no. 1 (July 2, 2017): 61–73. http://dx.doi.org/10.1177/1094342017712976.
Özgüven, Yavuz, Utku Gönener, and Süleyman Eken. "A Dockerized big data architecture for sports analytics." Computer Science and Information Systems, no. 00 (2022): 10. http://dx.doi.org/10.2298/csis220118010o.
Concolato, Claude E., and Li M. Chen. "Data Science: A New Paradigm in the Age of Big-Data Science and Analytics." New Mathematics and Natural Computation 13, no. 02 (July 2017): 119–43. http://dx.doi.org/10.1142/s1793005717400038.
Myung, Rohyoung, and Sukyong Choi. "Machine-Learning Based Memory Prediction Model for Data Parallel Workloads in Apache Spark." Symmetry 13, no. 4 (April 16, 2021): 697. http://dx.doi.org/10.3390/sym13040697.
Hussin, Sahar K., Salah M. Abdelmageid, Adel Alkhalil, Yasser M. Omar, Mahmoud I. Marie, and Rabie A. Ramadan. "Handling Imbalance Classification Virtual Screening Big Data Using Machine Learning Algorithms." Complexity 2021 (January 28, 2021): 1–15. http://dx.doi.org/10.1155/2021/6675279.
Dissertations / Theses on the topic "Big Data, Machine Learning, Data Science, Apache Spark"
Ray, Sujan. "Dimensionality Reduction in Healthcare Data Analysis on Cloud Platform." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin161375080072697.
Murgia, Antonio. "Lightweight Internet Traffic Classification - A Subject Based Solution with Word Embeddings." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/10569/.
Соболь, Віталій Миколайович, and Vitaliy Sobol. "Розподілена комп’ютерна система для прогнозування поширення рослинного покриву з використанням засобів машинного навчання" [Distributed computer system for forecasting the spread of vegetation cover using machine learning tools]. Master's thesis, Тернопільський національний технічний університет імені Івана Пулюя [Ternopil Ivan Puluj National Technical University], 2021. http://elartu.tntu.edu.ua/handle/lib/36653.
The aim of the work is to develop software and implement machine learning algorithms for forecasting the forest cover of a given area, taking into account the diversity and uniqueness of the environment and of the original plantings there. The study analyzes the key concepts, principles, and process sequences used in designing computer systems, writing programs, and working with big data, in particular the terminology involved in implementing forecasting software. This analysis helped identify ways to apply machine learning methods to improve the efficiency of greening in a given area.
List of main notations, symbols, and abbreviations
Introduction
Chapter 1. Analysis of the features of big data processing and classification of the main algorithms
  1.1. Analysis and main challenges of data science
  1.2. Comparison of Hadoop and Spark as the main competitors for working with big data
  1.3. Rationale for choosing Apache Spark as the main working framework
  1.4. Fast forward to regression
  1.5. Vectors and features
  1.6. Training examples
  1.7. Decision trees and forests
  1.8. The forest cover dataset
  1.9. Chapter conclusions
Chapter 2. Description and selection of machine learning methods for big data processing
  2.1. Data preprocessing and data analysis
    2.1.1. Missing values
    2.1.2. Duplicate data
    2.1.3. Noise and outliers
    2.1.4. Data cleaning
    2.1.5. Data normalization methods
    2.1.6. Methods for filling in missing values
  2.2. Selection of base classifiers
    2.2.1. General statement of the classification problem
    2.2.2. Linear classifiers
      2.2.2.1. Fisher's linear discriminant
      2.2.2.2. Single-layer perceptron
      2.2.2.3. Logistic regression
      2.2.2.4. Support vector machine
    2.2.3. k-nearest neighbors method
    2.2.4. Naive Bayes classifier
    2.2.5. Decision trees
  2.3. Using ensembles of classification models as a more effective algorithm
    2.3.1. Bagging
    2.3.2. Boosting
  2.4. Metrics for evaluating classifier quality
    2.4.1. Accuracy
    2.4.2. Precision
    2.4.3. Recall (sensitivity)
    2.4.4. Specificity
    2.4.5. F-measure
    2.4.6. Log-loss (logarithmic loss)
    2.4.7. ROC curve (Receiver Operating Characteristics Curve)
  2.5. Chapter conclusions
Chapter 3. Selection and description of machine learning methods for big data processing
  3.1. Preparing the input data and processing the CSV file
  3.2. A first decision tree
  3.3. Decision tree hyperparameters
  3.4. Tuning decision trees
  3.5. Categorical features revisited
  3.6. Chapter conclusions
Chapter 4. Occupational safety and safety in emergency situations
  4.1. Occupational safety
  4.2. Improving the operational resilience of economic facilities in wartime
  4.3. Chapter conclusions
Conclusions
List of sources used
Appendix A. Conference abstracts
Appendix B. Full program code
"Large-Scale Matrix Completion Using Orthogonal Rank-One Matrix Pursuit, Divide-Factor-Combine, and Apache Spark." Master's thesis, 2014. http://hdl.handle.net/2286/R.I.24857.
Dissertation/Thesis
M.S. Computer Science 2014
Books on the topic "Big Data, Machine Learning, Data Science, Apache Spark"
Apache Spark Quick Start Guide: Quickly Learn the Art of Writing Efficient Big Data Applications with Apache Spark. Packt Publishing, Limited, 2019.
Apache Spark Machine Learning Blueprints. Packt Publishing, Limited, 2016.
Learning Spark: Lightning-Fast Big Data Analysis. O'Reilly Media, 2015.
Karim, Rezaul, Romeo Kienzler, Sridhar Alla, Siamak Amirghodsi, and Meenakshi Rajendran. Apache Spark 2 : Data Processing and Real-Time Analytics: Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark. Packt Publishing, Limited, 2018.
Hands-On Data Science and Python Machine Learning. Packt Publishing - ebooks Account, 2017.
Spark: The Definitive Guide: Big Data Processing Made Simple. O'Reilly Media, 2018.
Quddus, Jillur. Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib. Packt Publishing, 2018.
Gulli, Dr Antonio. A collection of Advanced Data Science and Machine Learning Interview Questions Solved in Python and Spark: Hands-on Big Data and Machine ... Programming Interview Questions). Createspace Independent Publishing Platform, 2015.
Book chapters on the topic "Big Data, Machine Learning, Data Science, Apache Spark"
Mogha, Garima, Khyati Ahlawat, and Amit Prakash Singh. "Performance Analysis of Machine Learning Techniques on Big Data Using Apache Spark." In Data Science and Analytics, 17–26. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8527-7_2.
Abdel Hai, Ameen, and Babak Forouraghi. "On Scalability of Distributed Machine Learning with Big Data on Apache Spark." In Big Data – BigData 2018, 209–19. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-94301-5_16.
Hafez, Manar Mohamed, Mohamed Elemam Shehab, Essam El Fakharany, and Abd El Ftah Abdel Ghfar Hegazy. "Effective Selection of Machine Learning Algorithms for Big Data Analytics Using Apache Spark." In Advances in Intelligent Systems and Computing, 692–704. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-48308-5_66.
Kerestely, Arpad, Alexandra Baicoianu, and Razvan Bocu. "A Research Study on Running Machine Learning Algorithms on Big Data with Spark." In Knowledge Science, Engineering and Management, 307–18. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-82136-4_25.
Cheng, Jane, and Peng Zhao. "Sustainable Big Data Analytics Process Pipeline Using Apache Ecosystem." In Encyclopedia of Data Science and Machine Learning, 1247–59. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-7998-9220-5.ch073.
Chen, Li, and Lala Aicha Coulibaly. "Data Science and Big Data Practice Using Apache Spark and Python." In Advances in Data Mining and Database Management, 67–95. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-4963-6.ch004.
Gupta, Madhuri, and Bharat Gupta. "Survey of Breast Cancer Detection Using Machine Learning Techniques in Big Data." In Research Anthology on Medical Informatics in Breast and Cervical Cancer, 371–85. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-7136-4.ch020.
Brahmane, Anilkumar V., and B. Chaitanya Krishna. "DSAE – Deep Stack Auto Encoder and RCBO – Rider Chaotic Biogeography Optimization Algorithm for Big Data Classification." In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210198.
Rashid, Mamoon, Vishal Goyal, Shabir Ahmad Parah, and Harjeet Singh. "Drug Prediction in Healthcare Using Big Data and Machine Learning." In Advances in Social Networking and Online Communities, 79–92. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-9096-5.ch005.
Dumancas, Gerard G., Ghalib A. Bello, Jeff Hughes, Renita Murimi, Lakshmi Chockalingam Kasi Viswanath, Casey O'Neal Orndorff, Glenda Fe Dumancas, and Jacy D. O'Dell. "Visualization Tools for Big Data Analytics in Quantitative Chemical Analysis." In Advances in Data Mining and Database Management, 873–917. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-3142-5.ch030.
Conference papers on the topic "Big Data, Machine Learning, Data Science, Apache Spark"
Assefi, Mehdi, Ehsun Behravesh, Guangchi Liu, and Ahmad P. Tafti. "Big data machine learning using apache spark MLlib." In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 2017. http://dx.doi.org/10.1109/bigdata.2017.8258338.
Dunner, Celestine, Thomas Parnell, Kubilay Atasu, Manolis Sifalakis, and Haralampos Pozidis. "Understanding and optimizing the performance of distributed machine learning applications on apache spark." In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 2017. http://dx.doi.org/10.1109/bigdata.2017.8257942.
Junaid, Muhammad, Shiraz Ali Wagan, Nawab Muhammad Faseeh Qureshi, Choon Sung Nam, and Dong Ryeol Shin. "Big data Predictive Analytics for Apache Spark using Machine Learning." In 2020 Global Conference on Wireless and Optical Technologies (GCWOT). IEEE, 2020. http://dx.doi.org/10.1109/gcwot49901.2020.9391620.
Chen, Lin, Rui Li, Yige Liu, Ruixuan Zhang, and Diane Myung-kyung Woodbridge. "Machine learning-based product recommendation using Apache Spark." In 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). IEEE, 2017. http://dx.doi.org/10.1109/uic-atc.2017.8397470.
Alomari, Ebtesam, Rashid Mehmood, and Iyad Katib. "Road Traffic Event Detection Using Twitter Data, Machine Learning, and Apache Spark." In 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). IEEE, 2019. http://dx.doi.org/10.1109/smartworld-uic-atc-scalcom-iop-sci.2019.00332.
Albaldawi, Wafaa S., Rafah M. Almuttairi, and Mehdi Ebady Manaa. "Big Data Analysis for Healthcare Application using Minhash and Machine Learning in Apache Spark Framework." In 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA). IEEE, 2022. http://dx.doi.org/10.1109/hora55278.2022.9799934.
Sheshasaayee, Ananthi, and J. V. N. Lakshmi. "An insight into tree based machine learning techniques for big data analytics using Apache Spark." In 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT). IEEE, 2017. http://dx.doi.org/10.1109/icicict1.2017.8342833.
SASSI, Imad, Sara OUAFTOUH, and Samir ANTER. "Adaptation of Classical Machine Learning Algorithms to Big Data Context: Problems and Challenges : Case Study: Hidden Markov Models Under Spark." In 2019 1st International Conference on Smart Systems and Data Science (ICSSD). IEEE, 2019. http://dx.doi.org/10.1109/icssd47982.2019.9002857.