1. Install Java
$ sudo apt-get -y update && sudo apt-get -y upgrade
$ sudo apt install -y default-jdk
$ java --version
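If the installation succeeded, this prints an OpenJDK 11 banner along these lines (the exact version and build strings depend on your distribution):
openjdk 11.0.x 20xx-xx-xx
OpenJDK Runtime Environment (build 11.0.x+...)
OpenJDK 64-Bit Server VM (build 11.0.x+..., mixed mode, sharing)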
2. Create a dedicated Hadoop user
$ sudo addgroup [group name]
$ sudo adduser --ingroup [group name] [username]
$ sudo adduser [username] sudo # Add the user to the sudo group
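As a concrete example, with a hypothetical group hadoop and user hduser (substitute your own names):
$ sudo addgroup hadoop # 'hadoop' is an example group name
$ sudo adduser --ingroup hadoop hduser # 'hduser' is an example user name
$ sudo adduser hduser sudo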
3. Set up local and HDFS network connections using SSH
$ sudo apt-get install openssh-client openssh-server
$ su - [username]
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
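Before moving on, it is worth confirming that passwordless login to localhost works; if SSH rejects the key, tightening the permissions on authorized_keys usually fixes it:
$ chmod 0600 $HOME/.ssh/authorized_keys # only needed if the login below asks for a password
$ ssh localhost
$ exit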
4. Download the Hadoop tar file from the official releases page
Download link: https://hadoop.apache.org/releases.html
$ cd [download directory]
$ sudo tar xvzf [tar file name]
$ sudo mv [extracted folder] /usr/local/hadoop
$ sudo chown -R [username] /usr/local/hadoop
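As a concrete run-through, assuming Hadoop 3.3.6 (check the releases page for the current version; the URL follows the usual Apache download pattern but may differ for your mirror):
$ cd ~/Downloads
$ wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
$ sudo tar xvzf hadoop-3.3.6.tar.gz
$ sudo mv hadoop-3.3.6 /usr/local/hadoop
$ sudo chown -R [username] /usr/local/hadoop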
5. Edit the configuration files
1. ~/.bashrc
Add the following lines at the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
- Source the file to apply the changes.
$ source ~/.bashrc
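To verify that the new variables took effect, check HADOOP_HOME and confirm the hadoop binary is on the PATH:
$ echo $HADOOP_HOME
$ hadoop version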
2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
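One way to apply this edit non-interactively, assuming the default file location and the OpenJDK 11 path from step 1, is to append the line to the file:
$ echo 'export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64' >> /usr/local/hadoop/etc/hadoop/hadoop-env.sh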
3. /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
4. /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_space/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_space/hdfs/datanode</value>
</property>
</configuration>
5. /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
6. /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
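On Hadoop 3.x, MapReduce jobs submitted to YARN can fail with a classpath error unless the application classpath is also set; the official single-node setup guide adds a property along these lines (verify against the documentation for your version):
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>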
6. Create directories for data node and name node
$ sudo mkdir -p /usr/local/hadoop_space
$ sudo mkdir -p /usr/local/hadoop_space/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_space/hdfs/datanode
$ sudo chown -R [username] /usr/local/hadoop_space
7. Run Hadoop
i. Format the NameNode
$ hdfs namenode -format
ii. Start the HDFS daemons
$ start-dfs.sh
iii. Start YARN
$ start-yarn.sh
iv. Check which components are up
$ jps
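On a healthy single-node setup, jps should list roughly the following daemons (process IDs will differ):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps
You can also open the web UIs: the NameNode at http://localhost:9870 and the ResourceManager at http://localhost:8088 (ports for Hadoop 3.x; 2.x releases served the NameNode UI on 50070).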