Mithun_K
Advisor

This blog post walks through installing Hadoop.

The installation takes at most 2 hours if you are lucky :smile:

Please follow the steps below:

Step-1:


   1. Download a stable release ending with tar.gz (hadoop-1.2.1.tar.gz)

   2. In Linux, create a new folder “/home/hadoop”

   3. Move the downloaded file to the folder “/home/hadoop” using WinSCP or FileZilla.

   4. In PuTTY type: cd /home/hadoop

   5. Type: tar xvf hadoop-1.2.1.tar.gz
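
If you prefer to script Step-1, the following is a minimal sketch. The function name unpack_hadoop is made up for this post (it is not part of Hadoop), and it assumes the tarball has already been copied to the machine.

```shell
#!/bin/sh
# unpack_hadoop: hypothetical helper (not part of Hadoop) that extracts a
# release tarball into a target directory, creating the directory if needed.
unpack_hadoop() {
    tarball=$1    # e.g. /home/hadoop/hadoop-1.2.1.tar.gz
    dest=$2       # e.g. /home/hadoop
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example call with the paths used in this post:
# unpack_hadoop /home/hadoop/hadoop-1.2.1.tar.gz /home/hadoop
```

The check for a missing file only gives a clearer error message than tar would on its own.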

Step-2:


Downloading and setting up java:

 

   1. Check if Java is present

Type: java -version

 

   2. If Java is not present, install it by following the steps below

   3. Make a directory where we can install Java (/usr/local/java)

   4. Download 64-bit Linux Java JDK and JRE ending with tar.gz from the below link:

http://oracle.com/technetwork/java/javase/downloads/index.html

   5. Copy the downloaded files to the created folder

   6. Extract and install java:

Type: cd /usr/local/java

Type: tar xvzf jdk-*.tar.gz

Type: tar xvzf jre-*.tar.gz

   7. Add the path and home-directory variables to the end of /etc/profile:

JAVA_HOME=/usr/local/java/jdk1.7.0_40

PATH=$PATH:$JAVA_HOME/bin

JRE_HOME=/usr/local/java/jre1.7.0_40

PATH=$PATH:$JRE_HOME/bin

HADOOP_INSTALL=/home/hadoop/hadoop-1.2.1

PATH=$PATH:$HADOOP_INSTALL/bin

export JAVA_HOME

export JRE_HOME

export HADOOP_INSTALL

export PATH

   8. Run the below commands so that Linux can understand where Java is installed:


sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jre1.7.0_40/bin/java" 1

sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.7.0_40/bin/javac" 1

sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/java/jre1.7.0_40/bin/javaws" 1

sudo update-alternatives --set java /usr/local/java/jre1.7.0_40/bin/java

sudo update-alternatives --set javac /usr/local/java/jdk1.7.0_40/bin/javac

sudo update-alternatives --set javaws /usr/local/java/jre1.7.0_40/bin/javaws

   9. Test Java by typing: java -version

  10. Check if JAVA_HOME is set by typing: echo $JAVA_HOME
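
The last two checks can also be combined into a small script. This is only a sketch: check_java_env is a made-up helper name, and it assumes that a usable JDK has an executable at $JAVA_HOME/bin/java.

```shell
#!/bin/sh
# check_java_env: hypothetical helper that reports whether JAVA_HOME is set
# and points at a directory containing an executable bin/java.
check_java_env() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no java executable under $JAVA_HOME/bin"
        return 1
    fi
    echo "JAVA_HOME looks usable: $JAVA_HOME"
}

# Example call (log in again first so that /etc/profile is re-read):
# check_java_env
```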

Hadoop is now installed in standalone mode. :smile:

Step-3:


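We can check that the installation works by running one of the bundled examples.

Go to the Hadoop installation directory.

Type: mkdir input

Type: cp conf/*.xml input

Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

Type: ls output/*

The job creates the output directory itself, so it must not exist beforehand. The listing should show the result files, including a _SUCCESS marker if the job completed.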
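
To get a feel for what the example job computes, you can count the same regular-expression matches with plain Unix tools and compare. This is a rough sketch and not how Hadoop does it internally. count_matches is a made-up helper, it assumes a grep that supports the -o and -h options (GNU and BSD grep do), and the exact layout of Hadoop's result file may differ from what it prints.

```shell
#!/bin/sh
# count_matches: hypothetical helper that counts every match of a regular
# expression across the given files and prints "<count> <match>" lines,
# most frequent match first.
count_matches() {
    pattern=$1
    shift
    # -o prints only the matching part, -h drops file names, -E enables
    # extended regular expressions.
    grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example call from the Hadoop installation directory:
# count_matches 'dfs[a-z.]+' input/*
```

If the counts here and in the Hadoop result file disagree, the job probably read a different set of input files.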

Step-4:


Next, switch Hadoop to pseudo-distributed mode by changing the configuration in the files below:

    1. In the Hadoop installation folder, change conf/core-site.xml to:

<configuration>

          <property>

                    <name>fs.default.name</name>

                    <value>hdfs://localhost:9000</value>

          </property>

</configuration>

    2. Change conf/hdfs-site.xml to:

<configuration>

           <property>

                     <name>dfs.replication</name>

                     <value>1</value>

           </property>

</configuration>

    3. Change conf/mapred-site.xml to:

<configuration>

           <property>

                     <name>mapred.job.tracker</name>

                     <value>localhost:9001</value>

           </property>

</configuration>

    4. Edit conf/hadoop-env.sh and set:

export JAVA_HOME=/usr/local/java/jdk1.7.0_40
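
The three XML files in this step all have the same shape, so they can also be generated from the shell instead of edited by hand. The sketch below uses a made-up helper, write_site_xml. Note that it overwrites the target file completely, including any header lines or comments that shipped with it, so keep a backup if you care about those.

```shell
#!/bin/sh
# write_site_xml: hypothetical helper that writes a Hadoop configuration
# file containing exactly one property. It overwrites the target file.
write_site_xml() {
    file=$1
    name=$2
    value=$3
    cat > "$file" <<EOF
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}

# Example calls from the Hadoop installation directory, using the values above:
# write_site_xml conf/core-site.xml   fs.default.name    hdfs://localhost:9000
# write_site_xml conf/hdfs-site.xml   dfs.replication    1
# write_site_xml conf/mapred-site.xml mapred.job.tracker localhost:9001
```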

Step-5:

   1. Set up passwordless SSH by running the commands below:

Type: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Type: cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

   2. Check that SSH no longer asks for a password:

Type: ssh localhost  (it should not ask for a password)

   3. Format the name node (from the Hadoop installation directory):

Type: bin/hadoop namenode -format

Step-6:

To start all the Hadoop services (from the Hadoop installation directory):

Type: bin/start-all.sh
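
One way to confirm that the services came up is the jps tool that ships with the JDK, which lists running Java processes. In Hadoop 1.x, start-all.sh should start NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below uses a made-up helper, missing_daemons, that reads jps output and prints whichever of those names are absent.

```shell
#!/bin/sh
# missing_daemons: hypothetical helper. Reads `jps` output on standard input
# and prints each Hadoop 1.x daemon that is NOT listed, one per line.
missing_daemons() {
    listing=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <name>", so compare the whole second field;
        # a substring match would confuse NameNode with SecondaryNameNode.
        printf '%s\n' "$listing" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}

# Example call after start-all.sh (prints nothing if all five are running):
# jps | missing_daemons
```

If a daemon is reported missing, its log file under the logs folder of the Hadoop installation is the first place to look.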

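Now run the same example again. The job now reads its input from HDFS, so copy the input there first:

Type: bin/hadoop fs -put conf input

Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

Type: bin/hadoop fs -cat output/*

The last command should print the matches found by the job.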

To stop all the Hadoop services:

Type: bin/stop-all.sh