Paper Title
Design And Estimation Of Big Data Analysis Using Mapreduce And Hadoop-A
Abstract
Hadoop is a trendy open-source implementation of the MapReduce programming model. Map-Reduce and
HDFS are the two major mechanisms of Hadoop. To avoid network congestions, a new technique to preprocess intermediate
data between the maps and reduce stages, thus increasing the throughput of Hadoop clusters. These include a serialization
barrier that delays the reduce phase and repetitive merges, disk accesses. To handle large dataset needs to progress the
performance by modifying existing Hadoop system. Describe Hadoop-A, an acceleration framework that optimizes Hadoop
with plugin mechanism implemented for fast data movement, overcoming its existing limitations. A merge algorithm is
introduced to merge data without replication and disk access. In addition, a full pipeline is designed to overlap the shuffle,
merge and reduce phases. Experimental results show that Hadoop-A significantly speeds up data movement in MapReduce.
Keywords— Hadoop, MapReduce, Hadoop Acceleration.