The core components in Hadoop

Ashok Kunkala

Walmart Global Tech

Published May 22, 2018

The three core components of Hadoop are:

HDFS: The Java-based distributed file system that can store all kinds of data without prior organization.

MapReduce: The Java-based programming model and processing engine that splits a job into map and reduce tasks and runs them in parallel across the cluster.

YARN: The resource-management layer that allocates cluster resources (such as memory and CPU) to applications and schedules their work.
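To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python. It does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are illustrative assumptions, and a real job would run these steps in parallel on many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the list of counts for one word into a total."""
    return key, sum(values)

def word_count(lines):
    # In a real cluster each line (or block of lines) would be mapped on a
    # different node; here everything runs in one process for clarity.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical sample input, chosen only for illustration.
counts = word_count([
    "HDFS stores data",
    "YARN schedules work",
    "MapReduce processes data",
])
print(counts)
```

The point of the split is that map and reduce calls are independent of one another, which is what lets a framework distribute them across a cluster.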

HDFS is a fault-tolerant and self-healing distributed filesystem designed to turn a cluster of industry-standard servers into a massively scalable pool of storage. Developed specifically for large-scale data processing workloads where scalability, flexibility, and throughput are critical, HDFS accepts data in any format regardless of schema and optimizes for high-bandwidth streaming.

HDFS is the default big data storage layer for Apache Hadoop. It is the "secret sauce" of the Apache Hadoop components: users can dump huge datasets into HDFS, and the data will sit there until they want to use it for analysis. HDFS splits each file into blocks and creates several replicas of every block, distributed across different nodes in the cluster, for reliable and quick data access. HDFS comprises three important components: NameNode, DataNode, and Secondary NameNode. It operates on a Master-Slave architecture in which the NameNode acts as the master, keeping track of the filesystem metadata and of where each block is stored, while the DataNodes act as slave nodes that hold the actual blocks on the various machines within a Hadoop cluster. The Secondary NameNode periodically checkpoints the NameNode's metadata; it is not a standby replacement for the NameNode.
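The block and replica bookkeeping described above can be sketched with a toy model. This is not the HDFS API: ToyNameNode, its methods, and the DataNode names are hypothetical, the 4-byte block size is shrunk for readability (HDFS commonly defaults to 128 MB blocks and a replication factor of 3), and the round-robin placement is a simplification of the rack-aware policy real HDFS uses.

```python
BLOCK_SIZE = 4    # bytes per block; tiny on purpose, for illustration only
REPLICATION = 3   # number of copies kept of every block

class ToyNameNode:
    """Keeps only metadata: which DataNodes hold which block of which file."""

    def __init__(self, datanodes):
        self.datanodes = list(datanodes)
        self.block_map = {}  # (filename, block_index) -> list of DataNode names

    def place_file(self, filename, data):
        """Split data into blocks and record replica locations for each one."""
        blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        for index in range(len(blocks)):
            # Simplified round-robin placement on distinct nodes
            # (assumes at least REPLICATION DataNodes are available).
            replicas = [
                self.datanodes[(index + r) % len(self.datanodes)]
                for r in range(REPLICATION)
            ]
            self.block_map[(filename, index)] = replicas
        return len(blocks)

    def readable_after_failure(self, filename, failed_node):
        """True if every block of the file still has a replica elsewhere."""
        return all(
            any(node != failed_node for node in nodes)
            for (name, _), nodes in self.block_map.items()
            if name == filename
        )

namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
n_blocks = namenode.place_file("demo.txt", b"hello hadoop!")
print(n_blocks, namenode.block_map)
print(namenode.readable_after_failure("demo.txt", "dn1"))
```

Because each block lives on several DataNodes, losing a single node leaves every block readable, which is the sense in which HDFS is fault-tolerant.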

YARN forms an integral part of Hadoop 2.0. YARN is a great enabler for dynamic resource utilization on the Hadoop framework, as users can run various Hadoop applications without having to worry about increasing workloads.
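As a rough illustration of dynamic resource utilization, the toy scheduler below hands out memory "containers" to applications and reclaims them when released. It is a sketch under stated assumptions, not YARN's actual scheduler: ToyResourceManager, the node names, the application names, and the "most free memory first" rule are all hypothetical, and real YARN also tracks CPU and supports pluggable scheduling policies.

```python
class ToyResourceManager:
    """Tracks free memory per node and grants containers on request."""

    def __init__(self, node_memory_mb):
        self.free = dict(node_memory_mb)  # node name -> free memory in MB

    def allocate(self, app, memory_mb):
        """Grant a container on the node with the most free memory, or None."""
        node = max(self.free, key=self.free.get)
        if self.free[node] < memory_mb:
            return None  # no node can currently fit this request
        self.free[node] -= memory_mb
        return (app, node, memory_mb)

    def release(self, container):
        """Return a finished container's memory to its node."""
        _, node, memory_mb = container
        self.free[node] += memory_mb

rm = ToyResourceManager({"node1": 4096, "node2": 2048})
c1 = rm.allocate("app-a", 3072)   # fits on node1
c2 = rm.allocate("app-b", 2048)   # node2 now has the most free memory
c3 = rm.allocate("app-c", 2048)   # nothing large enough is free yet
rm.release(c1)                    # app-a finishes, node1 frees up
c4 = rm.allocate("app-c", 2048)   # the retry now succeeds
print(c1, c2, c3, c4)
```

The same pool of nodes serves different applications over time, which is the behaviour the paragraph above attributes to YARN.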

Regards,

Ashok Kumar K.
