This blog post walks through installing Hadoop.

The installation takes at most 2 hours if all goes well.

 

Please follow the steps below:

 

Step-1:


   1. Download a stable release ending with tar.gz (hadoop-1.2.1.tar.gz)

   2. In Linux, create a new folder “/home/hadoop”

   3. Move the downloaded file to the folder “/home/hadoop” using WinSCP or FileZilla.

   4. In PuTTY type: cd /home/hadoop

   5. Type: tar xvf hadoop-1.2.1.tar.gz
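
If you repeat this step on more than one machine, it may help to wrap it in a small shell function. The following is only a sketch: the function name unpack_hadoop is made up for illustration, and it assumes the tarball has already been copied to the machine.

```shell
# Unpack a Hadoop release tarball into a target directory.
# Usage: unpack_hadoop <tarball> <target-dir>
# (unpack_hadoop is a hypothetical helper name, not part of Hadoop.)
unpack_hadoop() {
    tarball="$1"
    target="$2"
    # Create the target directory if it does not exist yet
    mkdir -p "$target" || return 1
    # -x extract, -z gunzip, -f read from file, -C switch to target first
    tar -xzf "$tarball" -C "$target"
}

# Example, using the paths from the steps above:
# unpack_hadoop /home/hadoop/hadoop-1.2.1.tar.gz /home/hadoop
```

After it runs, the folder /home/hadoop/hadoop-1.2.1 should exist.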

 

Step-2:


Downloading and setting up java:

 

   1. Check if Java is present

Type: java -version

 

   2. If Java is not present, install it by following the steps below

   3. Make a directory where we can install Java (/usr/local/java)

   4. Download the 64-bit Linux JDK and JRE archives (ending in tar.gz) from the link below:

 

http://oracle.com/technetwork/java/javase/downloads/index.html

 

   5. Copy the downloaded files to the created folder

   6. Extract and install java:

 

Type: cd /usr/local/java

Type: tar xvzf jdk-*.tar.gz

Type: tar xvzf jre-*.tar.gz

 

   7. Add the path and home directory variables to the end of /etc/profile:

 

JAVA_HOME=/usr/local/java/jdk1.7.0_40

PATH=$PATH:$JAVA_HOME/bin

JRE_HOME=/usr/local/java/jre1.7.0_40

PATH=$PATH:$JRE_HOME/bin

HADOOP_INSTALL=/home/hadoop/hadoop-1.2.1

PATH=$PATH:$HADOOP_INSTALL/bin

export JAVA_HOME

export JRE_HOME

export HADOOP_INSTALL

 

   8. Run the below commands so that Linux can understand where Java is installed:


sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jre1.7.0_40/bin/java" 1

sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.7.0_40/bin/javac" 1

sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/java/jre1.7.0_40/bin/javaws" 1

sudo update-alternatives --set java /usr/local/java/jre1.7.0_40/bin/java

sudo update-alternatives --set javac /usr/local/java/jdk1.7.0_40/bin/javac

sudo update-alternatives --set javaws /usr/local/java/jre1.7.0_40/bin/javaws

 

   9. Test Java by typing: java -version

  10. Check if JAVA_HOME is set by typing: echo $JAVA_HOME
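
The last two checks can also be scripted. This is a minimal sketch, assuming a Java installation is any directory that contains an executable bin/java; the function name check_java_home is hypothetical.

```shell
# Check that a directory looks like a Java installation:
# it must contain an executable bin/java.
# (check_java_home is a hypothetical helper name.)
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $dir"
}

# Example: check_java_home "$JAVA_HOME"
```

If the check fails right after editing /etc/profile, log out and back in (or run: source /etc/profile) so the new variables are loaded.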

 

Hadoop is now installed in standalone mode.

 

Step-3:


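We can check that the installation works by running an example.

Go to the Hadoop installation directory (/home/hadoop/hadoop-1.2.1).

Type: mkdir input

Type: cp conf/*.xml input

Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

Type: ls output/*

If the job succeeded, the output files are listed. Note that Hadoop creates the output directory itself and fails if it already exists.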
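
For repeated test runs, the example can be wrapped in a function. This is only a sketch under a few assumptions: the function name run_grep_example is made up, the input is taken from the XML files in conf, and any old output folder is deleted first because Hadoop refuses to write into an output directory that already exists.

```shell
# Run the bundled grep example in standalone mode.
# Usage: run_grep_example <hadoop-install-dir>
# (run_grep_example is a hypothetical helper name.)
# The body runs in a subshell, so the caller's directory is unchanged.
run_grep_example() (
    cd "$1" || exit 1
    # Remove output from an earlier run; the job fails if it exists
    rm -rf output
    # Use the bundled configuration files as sample input
    mkdir -p input
    cp conf/*.xml input/
    bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
)

# Example: run_grep_example /home/hadoop/hadoop-1.2.1
```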

 

Step-4:


As a next step, change the configuration in the files below:

 

    1. In the Hadoop installation folder, change the conf/core-site.xml file to:

<configuration>

          <property>

                    <name>fs.default.name</name>

                    <value>hdfs://localhost:9000</value>

          </property>

</configuration>

 

    2. Change conf/hdfs-site.xml:

<configuration>

           <property>

                     <name>dfs.replication</name>

                     <value>1</value>

           </property>

</configuration>

 

    3. Change conf/mapred-site.xml:

<configuration>

           <property>

                     <name>mapred.job.tracker</name>

                     <value>localhost:9001</value>

           </property>

</configuration>

 

    4. Edit the conf/hadoop-env.sh file:

 

export JAVA_HOME=/usr/local/java/jdk1.7.0_40
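
The hadoop-env.sh edit can be made safe to repeat with a small helper. This is a sketch, not the only way to do it: the function name set_java_home is hypothetical, and it assumes that an existing line starting with "export JAVA_HOME=" means the value has already been set.

```shell
# Append an export of JAVA_HOME to a hadoop-env.sh file unless
# an uncommented JAVA_HOME export is already there.
# Usage: set_java_home <env-file> <java-home>
# (set_java_home is a hypothetical helper name.)
set_java_home() {
    env_file="$1"
    java_home="$2"
    # Commented lines ("# export JAVA_HOME=...") do not match this pattern
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        echo "JAVA_HOME already set in $env_file"
    else
        echo "export JAVA_HOME=$java_home" >> "$env_file"
    fi
}

# Example:
# set_java_home conf/hadoop-env.sh /usr/local/java/jdk1.7.0_40
```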

 

Step-5:

 

   1. Set up passwordless SSH by running the commands below:

 

Type: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Type: cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

 

   2. Check that SSH no longer asks for a password:

 

Type: ssh localhost  (It should not ask for a password)

 

   3. Format the name node (from the Hadoop installation directory):

 

Type: bin/hadoop namenode -format
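
If ssh localhost still asks for a password, two common causes are a duplicated or missing key entry and permissions on the .ssh folder that are too open. The sketch below handles both; the function name authorize_key is hypothetical, and it assumes the public key file holds a single key on one line.

```shell
# Add a public key to an authorized_keys file without duplicating it,
# and tighten the permissions on the directory and the file.
# Usage: authorize_key <public-key-file> <ssh-dir>
# (authorize_key is a hypothetical helper name.)
authorize_key() {
    pub_key="$1"
    ssh_dir="$2"
    auth="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir"
    touch "$auth"
    # -F fixed string, -x whole-line match, -f read the pattern from a file
    if ! grep -qxF -f "$pub_key" "$auth"; then
        cat "$pub_key" >> "$auth"
    fi
    chmod 600 "$auth"
}

# Example: authorize_key ~/.ssh/id_dsa.pub ~/.ssh
```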

 

Step-6:

 

To start all the Hadoop services:

 

Type: bin/start-all.sh
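
To confirm that the services came up, the jps tool from the JDK lists the running Java processes. In Hadoop 1.x you would expect to see NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below reports any that are missing; the function name missing_daemons is hypothetical.

```shell
# Report which of the five Hadoop 1.x daemons are missing from
# jps-style output read on standard input.
# (missing_daemons is a hypothetical helper name.)
missing_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <class name>"; compare the second column exactly
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            missing="$missing $daemon"
        fi
    done
    # Strip the leading space; prints an empty line if nothing is missing
    echo "${missing# }"
}

# Example: jps | missing_daemons
```

If a daemon is reported as missing, its log file in the logs folder of the Hadoop installation is usually the first place to look.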

 

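Now try the same example which we tried earlier. Hadoop now reads its input from HDFS, so first copy the input files into HDFS:

 

Type: bin/hadoop fs -put conf input

Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

 

It should produce the output in HDFS. To view it:

Type: bin/hadoop fs -cat output/*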

 

To stop all the Hadoop services:

 

Type: bin/stop-all.sh
