Academic literature on the topic 'Hadoop (Computer program)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Hadoop (Computer program).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Hadoop (Computer program)"

1

Huang, Tzu-Chi, Guo-Hao Huang, and Ming-Fong Tsai. "Improving the Performance of MapReduce for Small-Scale Cloud Processes Using a Dynamic Task Adjustment Mechanism." Mathematics 10, no. 10 (May 19, 2022): 1736. http://dx.doi.org/10.3390/math10101736.

Full text
Abstract:
The MapReduce architecture can reliably distribute massive datasets to cloud worker nodes for processing. When each worker node processes the input data, the Map program generates intermediate data that are used by the Reduce program for integration. However, as the worker nodes process the MapReduce tasks, there are differences in the amount of intermediate data created, due to variation in the operating-system environments and the input data, which results in the phenomenon of laggard nodes and affects the completion time for each small-scale cloud application task. In this paper, we propose a dynamic task adjustment mechanism for an intermediate-data processing cycle prediction algorithm, with the aim of improving the execution performance of small-scale cloud applications. Our mechanism dynamically adjusts the number of Map and Reduce program tasks based on the intermediate-data processing capabilities of each cloud worker node, in order to mitigate the problem of performance degradation caused by the limitations on the Google Cloud platform (Hadoop cluster) due to the phenomenon of laggards. The proposed dynamic task adjustment mechanism was compared with a simulated Hadoop system in a performance analysis, and an improvement of at least 5% in processing efficiency was found for a small-scale cloud application.
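As background for this entry: in a stock Hadoop job, the task counts that such a dynamic adjustment mechanism would tune at runtime are otherwise fixed through the standard MapReduce Job API. The sketch below is not the authors' code; it uses the identity Mapper/Reducer classes and arbitrary values, and only shows where those knobs live.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class TaskCountSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Map-task count is driven by input split size: a smaller maximum
            // split size produces more (smaller) map tasks for the same input.
            conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 64L * 1024 * 1024);

            Job job = Job.getInstance(conf, "task-count-sketch");
            job.setJarByClass(TaskCountSketch.class);
            // Identity mapper/reducer stand in for a real application here.
            job.setMapperClass(Mapper.class);
            job.setReducerClass(Reducer.class);
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);

            // Reduce-task count is set explicitly; a dynamic adjustment mechanism
            // would choose this value from observed per-node processing capability
            // instead of hard-coding it.
            job.setNumReduceTasks(4);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }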
APA, Harvard, Vancouver, ISO, and other styles
2

Ceballos, Oscar, Carlos Alberto Ramírez Restrepo, María Constanza Pabón, Andres M. Castillo, and Oscar Corcho. "SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink." Applied Sciences 11, no. 15 (July 30, 2021): 7033. http://dx.doi.org/10.3390/app11157033.

Full text
Abstract:
Existing SPARQL query engines and triple stores are continuously improved to handle more massive datasets. Several approaches have been developed in this context proposing the storage and querying of RDF data in a distributed fashion, mainly using the MapReduce Programming Model and Hadoop-based ecosystems. New trends in Big Data technologies have also emerged (e.g., Apache Spark, Apache Flink); they use distributed in-memory processing and promise to deliver higher data processing performance. In this paper, we present a formal interpretation of some PACT transformations implemented in the Apache Flink DataSet API. We use this formalization to provide a mapping to translate a SPARQL query to a Flink program. The mapping was implemented in a prototype used to determine the correctness and performance of the solution. The source code of the project is available in Github under the MIT license.
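The core of the translation described here can be illustrated with a small, hypothetical Java sketch against the Apache Flink DataSet API (not the SPARQL2Flink source): RDF triples become tuples, a SPARQL triple pattern becomes a filter on the predicate, and two patterns sharing a variable become a join on the shared position. The prefixes and toy data are invented for illustration.

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple3;

    public class TriplePatternSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // A toy RDF dataset as (subject, predicate, object) tuples.
            DataSet<Tuple3<String, String, String>> triples = env.fromElements(
                    Tuple3.of("ex:alice", "foaf:knows", "ex:bob"),
                    Tuple3.of("ex:bob", "foaf:name", "\"Bob\""),
                    Tuple3.of("ex:alice", "foaf:name", "\"Alice\""));

            // Pattern 1: ?person foaf:knows ?friend  -> filter on the predicate.
            DataSet<Tuple3<String, String, String>> knows =
                    triples.filter(t -> t.f1.equals("foaf:knows"));

            // Pattern 2: ?friend foaf:name ?name     -> a second filter.
            DataSet<Tuple3<String, String, String>> names =
                    triples.filter(t -> t.f1.equals("foaf:name"));

            // Shared variable ?friend: join the object position of pattern 1
            // with the subject position of pattern 2.
            knows.join(names)
                 .where(2)
                 .equalTo(0)
                 .print();
        }
    }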
APA, Harvard, Vancouver, ISO, and other styles
3

Zuo, Zhiqiang, Kai Wang, Aftab Hussain, Ardalan Amiri Sani, Yiyu Zhang, Shenming Lu, Wensheng Dou, et al. "Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan." ACM Transactions on Computer Systems 38, no. 1-2 (July 2021): 1–39. http://dx.doi.org/10.1145/3466820.

Full text
Abstract:
There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence of many sophisticated interprocedural analyses, few of them have been employed to improve checkers for systems code due to their complex implementations and poor scalability. In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely, Graspan-C running on CPUs and Graspan-G on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC, while, if GPUs are available, Graspan-G can be readily used to achieve orders of magnitude speedup by harnessing a GPU’s massive parallelism. We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in multiple languages such as Linux and Apache Hadoop demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.
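For intuition only, the computation that Graspan scales up, namely deriving new edges over a program graph until a transitive closure is reached, can be shown as a deliberately naive in-memory sketch; Graspan's actual edge-pair centric, grammar-guided, disk- and GPU-based design is far more involved.

    import java.util.*;

    // Deliberately simplified, in-memory illustration of the computation Graspan
    // scales out-of-core: reachability (transitive closure) over a graph.
    public class TransitiveClosureSketch {

        // adjacency: node -> set of direct successors
        public static Map<Integer, Set<Integer>> close(Map<Integer, Set<Integer>> adjacency) {
            Map<Integer, Set<Integer>> reach = new HashMap<>();
            for (Integer start : adjacency.keySet()) {
                Set<Integer> seen = new HashSet<>();
                Deque<Integer> work = new ArrayDeque<>(adjacency.get(start));
                while (!work.isEmpty()) {                  // simple traversal from 'start'
                    Integer next = work.pop();
                    if (seen.add(next)) {
                        work.addAll(adjacency.getOrDefault(next, Collections.emptySet()));
                    }
                }
                reach.put(start, seen);                    // all nodes reachable from 'start'
            }
            return reach;
        }

        public static void main(String[] args) {
            Map<Integer, Set<Integer>> g = new HashMap<>();
            g.put(1, new HashSet<>(Arrays.asList(2)));
            g.put(2, new HashSet<>(Arrays.asList(3)));
            g.put(3, new HashSet<>());
            System.out.println(close(g));                  // e.g. {1=[2, 3], 2=[3], 3=[]}
        }
    }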
APA, Harvard, Vancouver, ISO, and other styles
4

Lee, Kyong-Ha, Woo Lam Kang, and Young-Kyoon Suh. "Improving I/O Efficiency in Hadoop-Based Massive Data Analysis Programs." Scientific Programming 2018 (December 2, 2018): 1–9. http://dx.doi.org/10.1155/2018/2682085.

Full text
Abstract:
Apache Hadoop has been a popular parallel processing tool in the era of big data. While practitioners have rewritten many conventional analysis algorithms to make them customized to Hadoop, the issue of inefficient I/O in Hadoop-based programs has been repeatedly reported in the literature. In this article, we address the problem of the I/O inefficiency in Hadoop-based massive data analysis by introducing our efficient modification of Hadoop. We first incorporate a columnar data layout into the conventional Hadoop framework, without any modification of the Hadoop internals. We also provide Hadoop with indexing capability to save a huge amount of I/O while processing not only selection predicates but also star-join queries that are often used in many analysis tasks.
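Plugging a custom on-disk layout into Hadoop "without any modification of the Hadoop internals" is normally done through the pluggable InputFormat extension point and ordinary job configuration. The sketch below illustrates that extension point only; the columnar class and configuration key mentioned in the comments are hypothetical, not the ones from the article.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

    public class LayoutPluginSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "columnar-layout-sketch");
            job.setJarByClass(LayoutPluginSketch.class);

            // Hadoop's pluggable InputFormat is the usual extension point for a
            // custom layout; a columnar reader would be registered here, e.g.
            // job.setInputFormatClass(ColumnarInputFormat.class), where
            // ColumnarInputFormat is a hypothetical user-provided class.
            job.setInputFormatClass(SequenceFileInputFormat.class); // built-in stand-in format

            // Projected columns or selection predicates could be handed to that
            // reader through ordinary configuration properties (key is hypothetical).
            job.getConfiguration().set("columnar.projected.columns", "price,quantity");

            FileInputFormat.addInputPath(job, new Path(args[0]));
        }
    }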
APA, Harvard, Vancouver, ISO, and other styles
5

Dicky, Timothy, Alva Erwin, and Heru Purnomo Ipung. "Developing a Scalable and Accurate Job Recommendation System with Distributed Cluster System using Machine Learning Algorithm." Journal of Applied Information, Communication and Technology 7, no. 2 (March 17, 2021): 71–78. http://dx.doi.org/10.33555/jaict.v7i2.108.

Full text
Abstract:
The purpose of this research is to develop a job recommender system based on the Hadoop MapReduce framework to achieve scalability of the system when it processes big data. Also, a machine learning algorithm is implemented inside the job recommender to produce accurate job recommendations. The project begins by collecting sample data to build an accurate job recommender system with a centralized program architecture. Then a job recommender with a distributed program architecture is implemented using Hadoop MapReduce and deployed to a Hadoop cluster. After the implementation, both systems are tested using a large volume of applicant and job data, and the time required for each program to process the data is recorded and analyzed. Based on the experiments, we conclude that the recommender produces the most accurate results when the cosine similarity measure is used inside the algorithm. Also, the centralized job recommender system is able to process the data faster than the distributed cluster job recommender system. But as the size of the data grows, the centralized system will eventually lack the capacity to process the data, while the distributed cluster job recommender is able to scale with the size of the data.
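The similarity measure the authors settle on, cosine similarity between an applicant's feature vector and a job's feature vector, is short enough to state in full; the following self-contained version is an illustration, not the authors' implementation.

    public class CosineSimilarity {
        // Cosine similarity between two equal-length feature vectors:
        // dot(a, b) / (||a|| * ||b||); 1.0 means the vectors point the same way.
        public static double cosine(double[] a, double[] b) {
            double dot = 0.0, normA = 0.0, normB = 0.0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            if (normA == 0.0 || normB == 0.0) return 0.0;  // avoid division by zero
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }

        public static void main(String[] args) {
            double[] applicant = {1.0, 0.0, 2.0};   // toy skill-weight vector
            double[] jobPosting = {2.0, 0.0, 4.0};
            System.out.println(cosine(applicant, jobPosting));  // prints 1.0 (same direction)
        }
    }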
APA, Harvard, Vancouver, ISO, and other styles
6

Sanchez, David, Oswaldo Solarte, Victor Bucheli, and Hugo Ordonez. "Evaluating The Scalability of Big Data Frameworks." Scalable Computing: Practice and Experience 19, no. 3 (September 14, 2018): 301–7. http://dx.doi.org/10.12694/scpe.v19i3.1402.

Full text
Abstract:
The aim of this paper is to present a method based on isoefficiency for assessing scalability in big data environments. The programs word count and sort were implemented and compared in Hadoop and Spark. The results confirm that isoefficiency presented linear growth as the size of the data sets was increased. It was experimentally confirmed that the evaluated frameworks are scalable, and a model of the form Y(s) = βX(s), where β ≈ 0.47-0.85 < 1, was obtained. The paper discusses how scalability in big data is governed by a constant of scalability (β).
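For readers unfamiliar with the metric: the textbook isoefficiency relation (stated here as background, not as the paper's exact formulation) keeps the parallel efficiency E constant by growing the problem size W with the total overhead T_o, and the paper's scalability constant β then appears as the slope of a fitted linear model:

    E = \frac{W}{W + T_o(W, p)}, \qquad
    W = \frac{E}{1 - E}\, T_o(W, p), \qquad
    Y(s) = \beta\, X(s), \quad \beta \approx 0.47\text{--}0.85 < 1.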
APA, Harvard, Vancouver, ISO, and other styles
7

Tall, Anne M., and Cliff C. Zou. "A Framework for Attribute-Based Access Control in Processing Big Data with Multiple Sensitivities." Applied Sciences 13, no. 2 (January 16, 2023): 1183. http://dx.doi.org/10.3390/app13021183.

Full text
Abstract:
There is an increasing demand for processing large volumes of unstructured data for a wide variety of applications. However, protection measures for these big data sets are still in their infancy, which could lead to significant security and privacy issues. Attribute-based access control (ABAC) provides a dynamic and flexible solution that is effective for mediating access. We analyzed and implemented a prototype application of ABAC to large dataset processing in Amazon Web Services, using open-source versions of Apache Hadoop, Ranger, and Atlas. The Hadoop ecosystem is one of the most popular frameworks for large dataset processing and storage and is adopted by major cloud service providers. We conducted a rigorous analysis of cybersecurity in implementing ABAC policies in Hadoop, including developing a synthetic dataset of information at multiple sensitivity levels that realistically represents healthcare and connected social media data. We then developed Apache Spark programs that extract, connect, and transform data in a manner representative of a realistic use case. Our result is a framework for securing big data. Applying this framework ensures that serious cybersecurity concerns are addressed. We provide details of our analysis and experimentation code in a GitHub repository for further research by the community.
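To give a flavor of the kind of Spark program used in such experiments, the hedged Java sketch below reads a tagged dataset and restricts rows by a sensitivity attribute. The path and column names are placeholders, and in the framework described above enforcement is performed by Ranger/Atlas policies rather than by application code like this.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;

    public class SensitivityFilterSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("abac-sensitivity-sketch")
                    .getOrCreate();

            // Hypothetical input: records tagged with a 'sensitivity' column
            // (e.g. PUBLIC, PHI, PII). In an ABAC deployment the tag lives in
            // Atlas and access is mediated by Ranger; this application-level
            // filter only illustrates the data shape.
            Dataset<Row> records = spark.read().json("s3://example-bucket/records/"); // placeholder path

            Dataset<Row> publicOnly = records.filter(col("sensitivity").equalTo("PUBLIC"));
            publicOnly.select("record_id", "created_at").show(10); // placeholder column names

            spark.stop();
        }
    }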
APA, Harvard, Vancouver, ISO, and other styles
8

Ortega, Pablo G., and David R. Entem. "Coupling Hadron-Hadron Thresholds within a Chiral Quark Model Approach." Symmetry 13, no. 2 (February 6, 2021): 279. http://dx.doi.org/10.3390/sym13020279.

Full text
Abstract:
Heavy hadron spectroscopy was well understood within the naive quark model until the end of the past century. However, in 2003, the X(3872) was discovered, with puzzling properties difficult to understand in the simple naive quark model picture. This state made clear that excited states of heavy mesons should be coupled to two-meson states in order to understand not only the masses but, in some cases, unexpected decay properties. In this work, we will give an overview of a way in which the naive quark model can be complemented with the coupling to two hadron thresholds. This program has been already applied to the heavy meson spectrum with the chiral quark model, and we show some examples where thresholds are of special relevance.
APA, Harvard, Vancouver, ISO, and other styles
9

S. Kalai vani, Y., and P. Ranjana. "Anomaly Detection in Distributed Denial of Service Attack using Map Reduce Improvised counter-based algorithm in Hadoop." International Journal of Engineering & Technology 7, no. 4.36 (December 9, 2018): 390. http://dx.doi.org/10.14419/ijet.v7i4.36.23811.

Full text
Abstract:
A Distributed Denial of Service (DDoS) attack is one of the major threats in the cyber network: it floods computers with User Datagram Protocol (UDP) packets. This type of attack overwhelms the victim with a large volume of traffic so that it is no longer capable of normal communication and eventually crashes completely. A conventional Intrusion Detection System is not suited to handling and inspecting the amount of data involved in such an attack. Hadoop is a framework that stores and processes huge amounts of data. A MapReduce program comprises a Map task that performs filtering and sorting and a Reduce task that performs a summary operation. The proposed work focuses on a detection algorithm based on the MapReduce platform that uses an improvised counter-based (MRICB) algorithm to detect DDoS flooding attacks. The MRICB algorithm is implemented with MapReduce functionality at the stage of verifying the network IPS. The proposed algorithm also addresses UDP flooding attacks using an anomaly-based intrusion detection technique that identifies the kinds of packets, detects when the packet flow at a node exceeds the set threshold, and identifies the source causing the UDP flood attack. It thus preserves normal communication even under a large volume of traffic.
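The counting core of a MapReduce flood detector (a simplified illustration, not the MRICB algorithm itself) amounts to emitting one count per packet record keyed by source IP and flagging sources whose totals exceed a threshold:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Simplified counter-based sketch: the mapper emits (source IP, 1) per UDP
    // packet record and the reducer flags sources whose packet count exceeds a
    // configured threshold. The log-line format and configuration key are
    // hypothetical placeholders.
    public class FloodCountSketch {

        public static class PacketMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text srcIp = new Text();

            @Override
            protected void map(LongWritable key, Text line, Context ctx)
                    throws IOException, InterruptedException {
                // Assumes a whitespace-separated log line whose first field is the source IP.
                String[] fields = line.toString().split("\\s+");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    srcIp.set(fields[0]);
                    ctx.write(srcIp, ONE);
                }
            }
        }

        public static class ThresholdReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text ip, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                int threshold = ctx.getConfiguration().getInt("flood.threshold", 10000);
                int total = 0;
                for (IntWritable c : counts) total += c.get();
                if (total > threshold) {
                    ctx.write(ip, new IntWritable(total));   // suspected flooding source
                }
            }
        }
    }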
APA, Harvard, Vancouver, ISO, and other styles
10

Umam, Aguswan Khotibul. "Perberdayaan Santri Melalui Pendidikan Kecakapan Hidup." Tarbawiyah Jurnal Ilmiah Pendidikan 1, no. 01 (January 7, 2018): 163. http://dx.doi.org/10.32332/tarbawiyah.v1i01.1015.

Full text
Abstract:
Pesantren’s roles include providing life skills and internalizing Islamic principles in real-life contexts. This research focused on the empowerment of students’ life skills: personal skills, thinking skills, social skills, academic skills, and vocational skills at Pondok Pesantren Darul A’mal Metro Lampung. The students were trained in life skills through several trainings: reciting classical Islamic books, computer and IT, fahmil Qur’an, tilwatil Qur’an, public speaking skills, calligraphy, tahkfid nadzom, hadroh, and syaril Qur’an; the students also have the opportunity to join shalawat singing training and pencak silat sports such as Persaudaraan Setia Hati and Pagar Nusa. For these programs to succeed, all stakeholders should cooperate to develop students’ life skills according to the teaching vision and mission of the Pondok Pesantren.
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Hadoop (Computer program)"

1

Yee, Adam J. "Sharing the love : a generic socket API for Hadoop Mapreduce." Scholarly Commons, 2011. https://scholarlycommons.pacific.edu/uop_etds/772.

Full text
Abstract:
Hadoop is a popular software framework written in Java that performs data-intensive distributed computations on a cluster. It includes Hadoop MapReduce and the Hadoop Distributed File System (HDFS). HDFS has known scalability limitations due to its single NameNode, which holds the entire file system namespace in RAM on one computer. Therefore, the NameNode can only store limited amounts of file names depending on the RAM capacity. The solution to furthering scalability is distributing the namespace, similar to how file data is divided into chunks and stored across cluster nodes. Hadoop has an abstract file system API which is extended to integrate HDFS, but has also been extended for integrating the file systems S3, CloudStore, Ceph and PVFS. File systems Ceph and PVFS already distribute the namespace, while others such as Lustre are making the conversion. Google announced in 2009 that they had been implementing a Google File System distributed namespace to achieve greater scalability. The Generic Hadoop API is created from Hadoop's abstract file system API. It speaks a simple communication protocol that can integrate any file system which supports TCP sockets. By providing a file system agnostic API, future work with other file systems might provide ways of surpassing Hadoop's current scalability limitations. Furthermore, the new API eliminates the need for customizing Hadoop's Java implementation, and instead moves the implementation to the file system itself. Thus, developers wishing to integrate their new file system with Hadoop are not responsible for understanding details of Hadoop's internal operation. The API is tested on a homogeneous, four-node cluster with OrangeFS. Initial OrangeFS I/O throughputs compared to HDFS are 67% of HDFS' write throughput and 74% of HDFS' read throughput. But, compared with an alternate method of integrating with OrangeFS (a POSIX kernel interface), write and read throughput are increased by 23% and 7%, respectively.
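Context for this thesis: Hadoop resolves a URI scheme to a file system implementation through configuration properties of the form fs.<scheme>.impl, which is what allows a new file system to be attached without modifying Hadoop's Java code. The scheme, class name, and address below are placeholders, not the thesis's actual API.

    import org.apache.hadoop.conf.Configuration;

    public class CustomFsRegistrationSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Point a (hypothetical) 'genericfs' URI scheme at a user-supplied
            // subclass of org.apache.hadoop.fs.FileSystem. That subclass would
            // translate FileSystem calls into the socket protocol described above;
            // no change to Hadoop internals is needed.
            conf.set("fs.genericfs.impl", "org.example.GenericSocketFileSystem"); // placeholder class
            conf.set("fs.defaultFS", "genericfs://metadata-server:7000/");        // placeholder address

            // A job configured with this Configuration would then read and write
            // paths such as genericfs://metadata-server:7000/user/input through
            // the socket-speaking implementation.
            System.out.println(conf.get("fs.defaultFS"));
        }
    }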
APA, Harvard, Vancouver, ISO, and other styles
2

Cheng, Lu. "Concentric layout, a new scientific data layout for matrix data set in Hadoop file system." Master's thesis, University of Central Florida, 2010. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4545.

Full text
Abstract:
The data generated by scientific simulations, sensors, monitors, or optical telescopes has increased at a dramatic speed. In order to analyze the raw data efficiently in both time and space, a data pre-processing step is needed to achieve better performance in the data analysis phase. Current research shows an increasing trend of adopting the MapReduce framework for large-scale data processing. However, the data access patterns generally applied to scientific data sets are not directly supported by the current MapReduce framework. The gap between the requirements of analytics applications and the properties of the MapReduce framework motivates us to provide support for these data access patterns in the MapReduce framework. In our work, we studied the data access patterns in matrix files and propose a new concentric data layout solution to facilitate matrix data access and analysis in the MapReduce framework. The concentric data layout is a data layout that maintains the dimensional property at the chunk level. Contrary to the continuous data layout adopted in the current Hadoop framework by default, the concentric data layout stores the data from the same sub-matrix in one chunk. This matches well with matrix operations. The concentric data layout preprocesses the data beforehand and optimizes the subsequent run of the MapReduce application. The experiments indicate that the concentric data layout improves the overall performance, reducing the execution time by 38% when the file size is 16 GB; it also relieves the data overhead phenomenon and increases the effective data retrieval rate by 32% on average.
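The layout idea can be made concrete with a small, hypothetical calculation (an illustration of the general principle, not the thesis's code): if an N x N matrix is stored as B x B sub-matrices with one sub-matrix per chunk, the chunk holding element (i, j) and the byte offset within that chunk follow from simple integer arithmetic.

    public class SubMatrixChunkSketch {
        // For an N x N matrix stored as B x B sub-matrices, one sub-matrix per
        // chunk in row-major order of blocks, with each sub-matrix itself laid
        // out row-major inside its chunk.
        public static long chunkIndex(long i, long j, long n, long b) {
            long blocksPerRow = (n + b - 1) / b;           // ceil(n / b)
            return (i / b) * blocksPerRow + (j / b);       // which sub-matrix / chunk
        }

        public static long offsetInChunk(long i, long j, long b, int elemBytes) {
            return ((i % b) * b + (j % b)) * elemBytes;    // byte offset inside the chunk
        }

        public static void main(String[] args) {
            // Element (5, 9) of a 16 x 16 matrix with 4 x 4 sub-matrices:
            System.out.println(chunkIndex(5, 9, 16, 4));   // block row 1, block col 2 -> chunk 6
            System.out.println(offsetInChunk(5, 9, 4, 8)); // (1*4 + 1) * 8 = 40 bytes
        }
    }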
ID: 029051151; System requirements: World Wide Web browser and PDF reader; Mode of access: World Wide Web; Thesis (M.S.)--University of Central Florida, 2010; Includes bibliographical references (p. 56-58).
M.S.
Masters
Department of Electrical Engineering and Computer Science
Engineering
APA, Harvard, Vancouver, ISO, and other styles
3

Lakkimsetti, Praveen Kumar. "A framework for automatic optimization of MapReduce programs based on job parameter configurations." Kansas State University, 2011. http://hdl.handle.net/2097/12011.

Full text
Abstract:
Master of Science
Department of Computing and Information Sciences
Mitchell L. Neilsen
Recently, cost-effective and timely processing of large datasets has been playing an important role in the success of many enterprises and the scientific computing community. Two promising trends ensure that applications will be able to deal with ever-increasing data volumes: first, the emergence of cloud computing, which provides transparent access to a large number of processing, storage and networking resources; and second, the development of the MapReduce programming model, which provides a high-level abstraction for data-intensive computing. MapReduce has been widely used for large-scale data analysis in the Cloud [5]. The system is well recognized for its elastic scalability and fine-grained fault tolerance. However, even to run a single program in a MapReduce framework, a number of tuning parameters have to be set by users or system administrators to increase the efficiency of the program. Users often run into performance problems because they are unaware of how to set these parameters, or because they don't even know that these parameters exist. With MapReduce being a relatively new technology, it is not easy to find qualified administrators [4]. The major objective of this project is to provide a framework that optimizes MapReduce programs that run on large datasets. This is done by executing the MapReduce program on part of the dataset using stored parameter combinations, configuring the program with the most efficient combination, and then executing this modified program over the different datasets. We know that many MapReduce programs are used over and over again in applications like daily weather analysis, log analysis, daily report generation, etc. So, once the parameter combination is set, it can be used on a number of data sets efficiently. This feature can go a long way towards improving the productivity of users who lack the skills to optimize programs themselves due to lack of familiarity with MapReduce or with the data being processed.
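A few of the job-level knobs such a framework would search over are ordinary Hadoop configuration properties; the values below are one arbitrary candidate combination for illustration, not recommendations from this report.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class TuningCandidateSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // One candidate parameter combination (example values only).
            conf.setInt("mapreduce.job.reduces", 8);                     // number of reduce tasks
            conf.setInt("mapreduce.task.io.sort.mb", 256);               // map-side sort buffer (MB)
            conf.setFloat("mapreduce.map.sort.spill.percent", 0.8f);     // spill threshold
            conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10);  // parallel shuffle fetchers
            conf.setBoolean("mapreduce.map.output.compress", true);      // compress map output

            Job job = Job.getInstance(conf, "tuning-candidate");
            // The framework would run the job on a sample of the input, record the
            // elapsed time, and repeat with the next stored parameter combination.
            System.out.println(job.getConfiguration().getInt("mapreduce.job.reduces", -1));
        }
    }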
APA, Harvard, Vancouver, ISO, and other styles
4

Jang, Jiyong. "Scaling Software Security Analysis to Millions of Malicious Programs and Billions of Lines of Code." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/306.

Full text
Abstract:
Software security is a big data problem. The volume of new software artifacts created far outpaces the current capacity of software analysis. This gap has brought an urgent challenge to our security community—scalability. If our techniques cannot cope with an ever increasing volume of software, we will always be one step behind attackers. Thus developing scalable analysis to bridge the gap is essential. In this dissertation, we argue that automatic code reuse detection enables an efficient data reduction of a high volume of incoming malware for downstream analysis and enhances software security by efficiently finding known vulnerabilities across large code bases. In order to demonstrate the benefits of automatic software similarity detection, we discuss two representative problems that are remedied by scalable analysis: malware triage and unpatched code clone detection. First, we tackle the onslaught of malware. Although over one million new malware are reported each day, existing research shows that most malware are not written from scratch; instead, they are automatically generated variants of existing malware. When groups of highly similar variants are clustered together, new malware more easily stands out. Unfortunately, current systems struggle with handling this high volume of malware. We scale clustering using feature hashing and perform semantic analysis using co-clustering. Our evaluation demonstrates that these techniques are an order of magnitude faster than previous systems and automatically discover highly correlated features and malware groups. Furthermore, we design algorithms to infer evolutionary relationships among malware, which helps analysts understand trends over time and make informed decisions about which malware to analyze first. Second, we address the problem of detecting unpatched code clones at scale. When buggy code gets copied from project to project, eventually all projects will need to be patched. We call clones of buggy code that have been fixed in only a subset of projects unpatched code clones. Unfortunately, code copying is usually ad-hoc and is often not tracked, which makes it challenging to identify all unpatched vulnerabilities in code bases at the scale of entire OS distributions. We scale unpatched code clone detection to spot over 15,000 latent security vulnerabilities in 2.1 billion lines of code from the Linux kernel, all Debian and Ubuntu packages, and all C/C++ projects in SourceForge in three hours on a single machine. To the best of our knowledge, this is the largest set of bugs ever reported in a single paper.
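Feature hashing, the data-reduction step used for malware clustering here, maps an arbitrarily large feature set (for example code n-grams) into a fixed-width vector so that samples can be compared cheaply; the generic Java sketch below illustrates the idea and is not the dissertation's implementation.

    import java.util.Arrays;
    import java.util.BitSet;
    import java.util.List;

    public class FeatureHashingSketch {
        // Hash arbitrary string features (e.g. code n-grams) into a fixed-width
        // bit vector so that similarity between samples can be estimated by
        // comparing vectors instead of raw feature sets.
        public static BitSet hashFeatures(List<String> features, int bits) {
            BitSet vector = new BitSet(bits);
            for (String f : features) {
                int bucket = Math.floorMod(f.hashCode(), bits);
                vector.set(bucket);
            }
            return vector;
        }

        // Jaccard similarity between two hashed feature vectors.
        public static double jaccard(BitSet a, BitSet b) {
            BitSet inter = (BitSet) a.clone();
            inter.and(b);
            BitSet union = (BitSet) a.clone();
            union.or(b);
            return union.isEmpty() ? 1.0 : (double) inter.cardinality() / union.cardinality();
        }

        public static void main(String[] args) {
            BitSet x = hashFeatures(Arrays.asList("mov eax", "push ebp", "call sub_401000"), 1024);
            BitSet y = hashFeatures(Arrays.asList("mov eax", "push ebp", "ret"), 1024);
            System.out.println(jaccard(x, y));   // high overlap suggests similar samples
        }
    }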
APA, Harvard, Vancouver, ISO, and other styles
5

Zhong, Peilin. "New Primitives for Tackling Graph Problems and Their Applications in Parallel Computing." Thesis, 2021. https://doi.org/10.7916/d8-pnyz-ck91.

Full text
Abstract:
We study fundamental graph problems under parallel computing models. In particular, we consider two parallel computing models: Parallel Random Access Machine (PRAM) and Massively Parallel Computation (MPC). The PRAM model is a classic model of parallel computation. The efficiency of a PRAM algorithm is measured by its parallel time and the number of processors needed to achieve the parallel time. The MPC model is an abstraction of modern massive parallel computing systems such as MapReduce, Hadoop and Spark. The MPC model captures well coarse-grained computation on large data: data is distributed to processors, each of which has a sublinear (in the input data) amount of local memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. We usually desire fully scalable MPC algorithms, i.e., algorithms that can work for any local memory size. The efficiency of a fully scalable MPC algorithm is measured by its parallel time and the total space usage (the local memory size times the number of machines). Consider an n-vertex, m-edge undirected graph G (either weighted or unweighted) with diameter D (the largest diameter of its connected components). Let N = m + n denote the size of G. We present a series of efficient (randomized) parallel graph algorithms with theoretical guarantees. Several results are listed as follows: 1) Fully scalable MPC algorithms for graph connectivity and spanning forest using O(N) total space and O(log D · log log_{N/n} n) parallel time. 2) Fully scalable MPC algorithms for 2-edge and 2-vertex connectivity using O(N) total space, where the 2-edge connectivity algorithm needs O(log D · log log_{N/n} n) parallel time and the 2-vertex connectivity algorithm needs O(log D · log² log_{N/n} n + log D' · log log_{N/n} n) parallel time. Here D' denotes the bi-diameter of G. 3) PRAM algorithms for graph connectivity and spanning forest using O(N) processors and O(log D · log log_{N/n} n) parallel time. 4) PRAM algorithms for (1 + ε)-approximate shortest path and (1 + ε)-approximate uncapacitated minimum cost flow using O(N) processors and poly(log n) parallel time. These algorithms are built on a series of new graph algorithmic primitives which may be of independent interest.
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Hadoop (Computer program)"

1

White, Tom. Hadoop: The Definitive Guide. 2nd ed. Sebastopol: O'Reilly, 2010.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Vavilapalli, Vinod Kumar, Doug Eadline, Joseph Niemiec, and Jeff Markham, eds. Apache Hadoop YARN: Moving beyond MapReduce and batch processing with Apache Hadoop 2. Upper Saddle River, NJ: Addison-Wesley, 2014.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Wilkinson, Paul, Lars George, Jan Kunigk, and Ian Buss. Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale. Edited by Nicole Tache and Michele Cronin. Beijing: O'Reilly Media, 2019.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Big Data analytics with R and Hadoop: Set up an integrated infrastructure of R and Hadoop to turn your data analytics into Big Data analytics. Birmingham: Packt Publishing, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

International Roxie Users Meeting and Workshop (1st : 1998 : European Organization for Nuclear Research). ROXIE: Routine for the optimization of magnet X-sections, inverse field calculation and coil end design: First International Roxie Users Meeting and Workshop, CERN, 16-18 March 1998, proceedings. Geneva: CERN, European Organization for Nuclear Research, 1999.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Miner, Donald. Enterprise Hadoop. O'Reilly Media, Incorporated, 2016.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Deshpande, Tanmay, and Anurag Shrivastava. Hadoop Blueprints. Packt Publishing - ebooks Account, 2016.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Big Data Processing with Hadoop. IGI Global, 2018.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Revathi, T., M. Blessa Binolin, and K. Muneeswaran. Big Data Processing with Hadoop. IGI Global, 2018.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Menguy, Charles. Hadoop Programming: Pushing the Limit. Wiley & Sons, Limited, John, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Hadoop (Computer program)"

1

Yadav, Ravinder, Aravind Kilaru, Devesh Kumar Srivastava, and Priyanka Dahiya. "Performance Evaluation of Word Count Program Using C#, Java and Hadoop." In Communications in Computer and Information Science, 299–307. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-3433-6_36.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Keats, Jonathon. "The Cloud." In Virtual Words. Oxford University Press, 2010. http://dx.doi.org/10.1093/oso/9780195398540.003.0010.

Full text
Abstract:
“It’s really just complete gibberish,” seethed Larry Ellison when asked about the cloud at a financial analysts’ conference in September 2008. “When is this idiocy going to stop?” By March 2009 the Oracle CEO had answered his own question, in a manner of speaking: in an earnings call to investors, Ellison brazenly peddled Oracle’s own forthcoming software as “cloud-computing ready.” Ellison’s capitulation was inevitable. The cloud is ubiquitous, the catchiest online metaphor since Tim Berners-Lee proposed “a way to link and access information of various kinds” at the European Organization for Nuclear Research (CERN) in 1990 and dubbed his creation the WorldWideWeb. In fact while many specific definitions of cloud computing have been advanced by companies seeking to capitalize on the cloud’s popularity—Dell even attempted to trademark the term, unsuccessfully—the cloud has most broadly come to stand for the web, a metaphor for a metaphor reminding us of how unfathomable our era’s signal invention has become. When Berners-Lee conceived the web his ideas were anything but cloudy. His inspiration was hypertext, developed by the computer pioneer Ted Nelson in the 1960s as a means of explicitly linking wide-ranging information in a nonhierarchical way. Nelson envisioned a “docuverse” which he described as “a unified environment available to everyone providing access to this whole space.” In 1980 Berners-Lee implemented this idea in a rudimentary way with a program called Enquire, which he used to cross-reference the software in CERN’s Proton Synchrotron control room. Over the following decade, machines such as the Proton Synchrotron threatened to swamp CERN with scientific data. Looking forward to the Large Hadron Collider, physicists began voicing concern about how they’d ever process their experiments, let alone productively share results with colleagues. Berners-Lee reckoned that, given wide enough implementation, hypertext might rescue them. He submitted a proposal in March 1989 for an “information mesh” accessible to the several thousand CERN employees. “Vague, but interesting,” his boss replied. Adequately encouraged, Berners-Lee spent the next year and a half struggling to refine his idea, and also to find a suitable name.
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Hadoop (Computer program)"

1

Grazzini, Massimiliano. "HNNLO: a Monte Carlo program to compute Higgs boson production at hadron colliders." In 8th International Symposium on Radiative Corrections. Trieste, Italy: Sissa Medialab, 2008. http://dx.doi.org/10.22323/1.048.0046.

Full text
APA, Harvard, Vancouver, ISO, and other styles