To access Hive from HANA, Hadoop and Hive must be installed first. The first part below covers the installation of Hadoop; the installation of Hive is introduced in the second part.


1. Download Hadoop and move to directory

Download Hadoop from an Apache Hadoop mirror: http://hadoop.apache.org/releases.html#Download

In this case, we choose Hadoop-2.2.0.

Unzip the downloaded Hadoop package and move the Hadoop folder to the directory where you want it installed.

tar -zxvf  hadoop-2.2.0.tar.gz

Switch to your HANA server user:

su hana_user_name

We need to install Hadoop under the HANA user, because the HANA server needs to communicate with Hadoop as the same user.

If you just want to set up Hadoop without accessing it from HANA, you can instead create a dedicated Hadoop account with “addgroup” and “adduser” (the exact commands depend on the system; SUSE and Ubuntu use different ones), as sketched below.
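A minimal sketch of creating such an account, assuming Ubuntu-style commands and the example names hadoop (group) and hduser (user), which you can replace with whatever you prefer:

# create a dedicated group and user for Hadoop (names are only examples)
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser

# give the new user ownership of the Hadoop installation directory
sudo chown -R hduser:hadoop /usr/local/hadoop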

2. Check Java

Before we install Hadoop, we should make sure Java is installed.

Use:

java -version

to check Java, and find the Java path with

whereis java

Then write the following lines in $HOME/.bashrc to add your Java path:

export JAVA_HOME=/java/path/

export PATH=$PATH:/java/path/
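To confirm the new settings are picked up, reload the profile and check (the exact version output depends on your installation):

source $HOME/.bashrc
echo $JAVA_HOME
java -version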


3. Passwordless SSH

Install ssh first if you don’t have it.

Type the following commands in the console to create a key pair and append the public key to the authorized keys:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
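You can then verify that passwordless login works; the first connection may ask you to confirm the host key, but it should not prompt for a password:

ssh localhost
exit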

4. Add path to Hadoop

Write the following script in $HOME/.bashrc if you want to add the Hadoop path permanently.

Open the .bashrc file by

vi $HOME/.bashrc

Add the following lines:

export HADOOP_INSTALL=/hadoop/path/

For the Hadoop path: I put the Hadoop folder under /usr/local, so in my case I use /usr/local/hadoop instead of /hadoop/path/.

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin
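After reloading the profile, the hadoop command should be found on the path:

source $HOME/.bashrc
hadoop version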

5. Hadoop configuration

Find the configuration files core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml and hadoop-env.sh in the Hadoop folder. They are located in $HADOOP_INSTALL/etc/hadoop/. If you cannot find one of the xml files, you may simply copy its template file in the same folder. For example:

cp mapred-site.xml.template mapred-site.xml

Some other tutorials say you can find them under a /conf/ directory; /conf/ was used by older Hadoop versions, but in hadoop-2.2.0 the files are under /etc/hadoop/.
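Listing the directory shows the files to edit (the path assumes the /usr/local/hadoop location used above):

ls /usr/local/hadoop/etc/hadoop/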

Modify the configuration files as follows:

vi core-site.xml

Put the following between the <configuration> tags:

<property>
  <name>fs.default.name</name>
  <value>hdfs://your_computer_name_or_IP:8020</value>  <!-- localhost would also work -->
</property>

vi hdfs-site.xml

Put the following between the <configuration> tags:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/namenode/dir</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/datanode/dir</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
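Note that /namenode/dir and /datanode/dir above are placeholders; the directories you point to must exist and be writable by the Hadoop user. A sketch, assuming directories under /usr/local/hadoop_data (an arbitrary choice):

# create local directories for the namenode and datanode and hand them to the Hadoop user
sudo mkdir -p /usr/local/hadoop_data/namenode /usr/local/hadoop_data/datanode
sudo chown -R hana_user_name /usr/local/hadoop_data

You would then use file:/usr/local/hadoop_data/namenode and file:/usr/local/hadoop_data/datanode as the values.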

vi yarn-site.xml

Put the following between the <configuration> tags:

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>your_computer_name_or_IP</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

vi mapred-site.xml

Put the following between the <configuration> tags:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

For more information about all these properties, please check:

http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/core-default.xml

http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-def...

http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

vi hadoop-env.sh

Add the following two statements at the end of this file:

export HADOOP_COMMON_LIB_NATIVE_DIR=/hadoop/path/lib/native

export HADOOP_OPTS="-Djava.library.path=/hadoop/path/lib"

6. Start Hadoop

The last thing to do before starting Hadoop is to format your namenode, simply by:

hadoop namenode -format

In the end, you can start Hadoop by calling “start-all.sh”; you can find this script in /hadoop/path/sbin.
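If you prefer the non-deprecated scripts (start-all.sh is marked deprecated in Hadoop 2.x), starting HDFS and YARN separately is equivalent; the paths assume the /usr/local/hadoop location used above:

# start the HDFS daemons, then the YARN daemons
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh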

To check that your Hadoop has started, type

jps

You should see NameNode, NodeManager, DataNode, SecondaryNameNode and ResourceManager running.

Alternatively, you can also check whether Hadoop is running by visiting localhost:50070 for Hadoop file system information and localhost:8088 for cluster information.
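If the server has no browser, a quick check from the console works as well; HTTP 200 responses indicate the two web UIs are up:

# check the NameNode and ResourceManager web UIs
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088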


You may find in some tutorials that localhost:50030 contains JobTracker information. However, localhost:50030 does not exist in hadoop-2.2.0, because hadoop-2.2.0 splits the two major functions of the JobTracker, resource management and job life-cycle management, into separate components. Don’t worry about localhost:50030 not working.
