Transformation of Hadoop: A Survey
Author(s):
Yogesh Prabhu, Vidyalankar Institute of Technology, Wadala, Mumbai; Prof. Sachin Deshpande, VIT, Wadala, Mumbai.
Keywords:
Hadoop, MapReduce, YARN, Big Data
Abstract:
In recent years, the rapid growth of data has attracted a lot of attention. Big Data is defined as data that is too big to fit on a single server, too unstructured to fit into a traditional row-and-column database, or too continuously flowing to fit into a static data warehouse. This data offers huge opportunities to uncover new insights. Volume, Velocity and Variety are the three major characteristics used to define Big Data. Hadoop is a widely adopted open-source tool that implements Google's well-known computation model, MapReduce. Hadoop MapReduce is a Java-based batch-processing programming model that can process large data sets in a distributed environment. Hadoop consists of two major components: the Hadoop Distributed File System (HDFS) and a processing unit called YARN. HDFS is a distributed file system that stores large amounts of data on a cluster, and Yet Another Resource Negotiator (YARN) provides distributed processing of data on the cluster. In this paper, we study the journey of the open-source software framework Hadoop. Being an open-source project, Hadoop has evolved tremendously over the years. Each version has improved the capabilities of the platform to help users solve big data challenges.
Other Details:
| Manuscript Id | : | IJSTEV4I8031 |
| Published in | : | Volume : 4, Issue : 8 |
| Publication Date | : | 01/03/2018 |
| Page(s) | : | 97-101 |