2-node distributed cluster
- Node 1 → NameNode + ResourceManager + DataNode + NodeManager
- Node 2 → DataNode + NodeManager
- Replication factor = 2 (so blocks replicate across machines)
Assumptions
- You have a single-node cluster already set up on both nodes (via the single-node cluster setup)
- Node1 hostname: node1
- Node2 hostname: node2
- Both users have Hadoop in $HOME/hadoop
- Both machines can SSH to each other
High-Level Change
From:
Node1 → Complete standalone cluster
Node2 → Complete standalone cluster
To:
node1:
NameNode
DataNode
ResourceManager
NodeManager
node2:
DataNode
NodeManager
STEP 0 — Take a config backup so you can restore the single-node setup later (on both nodes)
cp -r $HADOOP_HOME/etc/hadoop $HADOOP_HOME/etc/hadoop_backup
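The backup/restore round-trip can be sketched with a throwaway directory standing in for $HADOOP_HOME/etc/hadoop (paths here are illustrative only):

```shell
# Sketch: back up a config dir, change it, then restore it.
# $dir/hadoop stands in for $HADOOP_HOME/etc/hadoop.
dir=$(mktemp -d)
mkdir -p "$dir/hadoop"
echo "fs.defaultFS=hdfs://localhost:9000" > "$dir/hadoop/core-site.txt"

cp -r "$dir/hadoop" "$dir/hadoop_backup"        # STEP 0: take the backup
echo "changed for distributed mode" > "$dir/hadoop/core-site.txt"

# Later, to go back to the single-node setup:
rm -rf "$dir/hadoop" && cp -r "$dir/hadoop_backup" "$dir/hadoop"
cat "$dir/hadoop/core-site.txt"
```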
STEP 1 — Clean Both Nodes (Important)
On both nodes:
Stop everything:
stop-dfs.sh
stop-yarn.sh
Remove old metadata:
rm -rf $HOME/hadoop-data/dfs/*
We must reformat HDFS because, in the distributed cluster, metadata must come from a single NameNode (node1).
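If you prefer to script the cleanup across both nodes, a dry-run sketch over SSH (hostnames assumed from the setup above; remove the `echo` in front of `ssh` to actually execute):

```shell
# Dry run: print the cleanup command for each node instead of running it.
# Drop the leading `echo` once you have reviewed the commands.
for host in node1 node2; do
  echo ssh "$host" 'stop-yarn.sh; stop-dfs.sh; rm -rf $HOME/hadoop-data/dfs/*'
done
```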
STEP 2 — Decide Master Node
Let:
- node1 → Master
- node2 → Worker
Find hostnames:
hostname
Or use IPs if hostnames don’t resolve.
If needed, map the hostnames in /etc/hosts on both machines (substitute your actual IPs):
192.168.1.10 node1
192.168.1.11 node2
STEP 3 — Configure SSH Between Nodes
From node1, ensure passwordless SSH to node2:
ssh node2
If password prompts → copy key:
ssh-copy-id node2
Test again:
ssh node2
Also test reverse:
ssh node1
This is required because start-dfs.sh uses SSH to start daemons remotely.
STEP 4 — Modify Configuration (On BOTH Nodes)
Now we update configs to make them distributed.
core-site.xml (Both nodes)
Change:
<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:9000</value>
</property>
IMPORTANT:
- Replace localhost
- Use node1 (master hostname or IP)
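Put together, a minimal core-site.xml for both nodes might look like this (port 9000 assumed from the single-node setup):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
  </property>
</configuration>
```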
hdfs-site.xml (Both nodes)
Change replication to 2:
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
Keep name/data directories local (do NOT share directories).
On node1:
dfs.namenode.name.dir → node1 local path
dfs.datanode.data.dir → node1 local path
On node2:
dfs.datanode.data.dir → node2 local path
Node2's namenode directory is not actively used (but it is harmless if present).
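A sketch of hdfs-site.xml for node1, assuming the $HOME/hadoop-data layout used in STEP 1 (replace "youruser" with your actual user; on node2 the dfs.namenode.name.dir property can be dropped):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/youruser/hadoop-data/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/youruser/hadoop-data/dfs/data</value>
  </property>
</configuration>
```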
workers file (VERY IMPORTANT)
Edit:
$HADOOP_HOME/etc/hadoop/workers
On node1, set:
node1
node2
On node2 the workers file isn't read (only the node that runs the start scripts uses it), but keep it the same for consistency.
This tells Hadoop where to start DataNodes.
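Writing the workers file can be scripted; the sketch below uses a temp file as a stand-in for $HADOOP_HOME/etc/hadoop/workers:

```shell
# Sketch: generate the workers file (temp path here; on a real node,
# write to $HADOOP_HOME/etc/hadoop/workers instead).
workers=$(mktemp)
printf '%s\n' node1 node2 > "$workers"
cat "$workers"
```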
yarn-site.xml (Both nodes)
Add ResourceManager host:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node1</value>
</property>
Keep the existing shuffle service setting (yarn.nodemanager.aux-services = mapreduce_shuffle).
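A minimal yarn-site.xml for both nodes might then look like:

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```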
STEP 5 — Format ONLY ON MASTER (node1)
⚠ IMPORTANT: Format only once, and only on node1.
On node1:
hdfs namenode -format
DO NOT format on node2.
STEP 6 — Start Cluster From node1
On node1:
start-dfs.sh
start-yarn.sh
These scripts will SSH into node2 and start DataNode + NodeManager there.
STEP 7 — Verify Processes
On node1:
jps
Should show:
NameNode
DataNode
ResourceManager
NodeManager
On node2:
jps
Should show:
DataNode
NodeManager
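The check above can be automated with a small sketch; sample `jps` output is hard-coded here, but on a real node you would feed it `$(jps)` instead:

```shell
# Sketch: verify that the expected daemons appear in jps output.
check_daemons() {
  local output="$1"; shift
  local d
  for d in "$@"; do
    grep -q "$d" <<<"$output" || { echo "MISSING: $d"; return 1; }
  done
  echo "OK"
}

# Sample jps output (stand-in for: sample="$(jps)")
sample="12345 NameNode
12346 DataNode
12347 ResourceManager
12348 NodeManager
12349 Jps"

check_daemons "$sample" NameNode DataNode ResourceManager NodeManager
```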
STEP 8 — Verify Replication
Upload a file:
hdfs dfs -mkdir /test
hdfs dfs -put somefile.txt /test
Now check block placement:
hdfs fsck /test/somefile.txt -files -blocks -locations
You should see:
Block replica on node1
Block replica on node2
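Rather than eyeballing the full report, you can grep the fsck output for the replication count. The sample line below is illustrative (real fsck output varies by Hadoop version):

```shell
# Sketch: grep fsck output for the replication count.
# `sample` stands in for the real output of:
#   hdfs fsck /test/somefile.txt -files -blocks -locations
sample="/test/somefile.txt 12 bytes, replicated: replication=2, 1 block(s):  OK"
if grep -q "replication=2" <<<"$sample"; then
  echo "block is replicated on 2 DataNodes"
fi
```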
🔥 Final Architecture
node1
-------------------------
NameNode
DataNode
ResourceManager
NodeManager
-------------------------
|
| Replication pipeline
|
-------------------------
node2
DataNode
NodeManager
-------------------------