Transformation of Hadoop: A Survey

Yogesh Prabhu; Prof. Sachin Deshpande

Open Access

Author(s):

Yogesh Prabhu, Vidyalankar Institute of Technology, Wadala, Mumbai; Prof. Sachin Deshpande, VIT, Wadala, Mumbai.

Keywords:

Hadoop, MapReduce, YARN, Big Data

Abstract:

In recent years, the rapid growth of data has gained a lot of attention. Big Data is defined as data that is too big to fit on a single server, too unstructured to fit into a traditional row-and-column database, or too continuously flowing to fit into a static data warehouse. This data provides huge opportunities to uncover new insights. Volume, Velocity and Variety are the three major characteristics used to define Big Data. Hadoop is a widely adopted open-source tool that implements Google's well-known computation model, MapReduce. It is a Java-based batch-processing programming model that can process large data sets in a distributed environment. Hadoop consists of two major components: the Hadoop Distributed File System (HDFS) and a processing unit called YARN. HDFS is a distributed file system that stores large amounts of data on a cluster, and Yet Another Resource Negotiator (YARN) provides distributed processing of data on the cluster. In this paper, we study the journey of the open-source software framework Hadoop. Being an open-source project, Hadoop has evolved tremendously over the years. Every version has improved the capabilities of the platform to help users solve Big Data challenges.
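
To make the MapReduce model mentioned above concrete, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It is written in plain Python for illustration only and does not use Hadoop or any of its APIs; the function names map_phase, shuffle, reduce_phase and word_count are hypothetical and chosen here, not taken from the paper. On a real cluster, the map and reduce steps would run in parallel across many nodes, with the input read from HDFS.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Two tiny "input splits" stand in for blocks of a large file.
    print(word_count(["big data big cluster", "data on cluster"]))
```

The sketch keeps the three stages as separate functions because that separation is what allows a framework to distribute them: each map call depends only on its own record, and each reduce call depends only on the values of a single key.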