Step 1: Enable multiverse repo and get packages
The first thing we need to do is make sure the multiverse repos are enabled. Using your favorite editor (vi), add these lines to your /etc/apt/sources.list:
deb http://us.archive.ubuntu.com/ubuntu/ lucid multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ lucid multiverse
deb http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse
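If you'd rather not edit the file by hand, the same four lines can be appended from a script. This is only a sketch: add_multiverse is a made-up helper name (not part of apt), and it takes the file to edit as an argument so you can try it on a scratch copy first.

```shell
# Append each multiverse entry to a sources file unless it is already there.
# add_multiverse is a hypothetical helper; $1 is the file to edit.
add_multiverse() {
    list_file="$1"
    for suite in lucid lucid-updates; do
        for kind in deb deb-src; do
            entry="$kind http://us.archive.ubuntu.com/ubuntu/ $suite multiverse"
            # -x matches whole lines only, -F treats the entry as a literal string
            grep -qxF "$entry" "$list_file" 2>/dev/null ||
                printf '%s\n' "$entry" >> "$list_file"
        done
    done
}
```

To apply it for real, call add_multiverse /etc/apt/sources.list from a root shell. Running it twice is harmless, since lines that are already present are skipped.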
With that done, go ahead and update your system and install the Subversion, Java, and Ant packages you'll need for the install.
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install openjdk-6-jre ant subversion
Step 2: Get Hadoop
The next thing we'll do is grab Hadoop. Be sure to get the latest version. For this tutorial we're using 0.20.2:
wget http://mirror.its.uidaho.edu/pub/apache/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
We'll move this to /usr/local, untar it, and then rename it. Use any alternate technique you like here (e.g. symlinks, different directories, etc.); there's no magic in this step.
sudo mv hadoop-0.20.2.tar.gz /usr/local/
cd /usr/local
sudo tar xvzf hadoop-0.20.2.tar.gz
sudo mv hadoop-0.20.2 hadoop
cd hadoop
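If you go the symlink route instead of renaming, something along these lines should work. It's a sketch under the assumption that you keep the versioned directory as-is; link_release is a hypothetical helper name, not anything Hadoop ships.

```shell
# Point a stable name (e.g. /usr/local/hadoop) at a versioned directory.
# link_release is a hypothetical helper; both arguments are paths you choose.
link_release() {
    release_dir="$1"   # e.g. /usr/local/hadoop-0.20.2
    link_path="$2"     # e.g. /usr/local/hadoop
    # -s makes a symlink, -f replaces an existing link,
    # -n stops ln from descending into the directory the old link points to
    ln -sfn "$release_dir" "$link_path"
}
```

As root that would be link_release /usr/local/hadoop-0.20.2 /usr/local/hadoop. The upside is that moving to a newer release later only means re-pointing the link.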
Once you've extracted it and moved into the directory, find the JAVA_HOME line in the environment script and uncomment it, like so:
sudo vi conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk/
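The same edit can be scripted with sed if you don't want to open vi. This is a rough sketch: set_java_home is a hypothetical helper name, and it assumes the file contains an "export JAVA_HOME=..." line, commented out or not. It also assumes the path you pass has no | or & characters in it, since those would confuse sed.

```shell
# Uncomment (or overwrite) the JAVA_HOME line in an env script.
# set_java_home is a hypothetical helper: $1 is the script, $2 the JDK path.
set_java_home() {
    env_file="$1"
    java_home="$2"
    tmp_file=$(mktemp) || return 1
    # Match the line with or without leading "#" and rewrite it in full;
    # writing back with cat keeps the original file's owner and permissions.
    sed "s|^#*[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$tmp_file" &&
        cat "$tmp_file" > "$env_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

From the hadoop directory, as root, that would be set_java_home conf/hadoop-env.sh /usr/lib/jvm/java-6-openjdk/.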
Then build Hadoop with ant:
sudo ant
Finally, when ant is done doing its thing, remove the build directory:
sudo rm -rf /usr/local/hadoop/build
Step 3: Get Hive
From /usr/local, let's go ahead and check out Hive using Subversion and then build it:
sudo svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hive
cd hive
sudo ant package
By default Hive uses a directory called /user/hive/warehouse. You can change that if you like, but for simplicity we'll just go ahead and create it instead:
sudo mkdir -p /user/hive/warehouse
Step 4: Add the ingredients to your PATH
I'm running Hive as root in development, but you can add these PATH statements for whichever user has permissions.
export PATH=$PATH:/usr/local/hive/build/dist/bin/
export PATH=$PATH:/usr/local/hive/build/dist/lib/
export PATH=$PATH:/usr/local/hadoop/bin
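For the PATH change to survive a logout, the export lines need to live in a startup file such as ~/.profile. Here's one way to add them without ending up with duplicates. It's only a sketch: add_to_path is a hypothetical helper name, and ~/.profile is an assumption about which file your login shell reads.

```shell
# Append an "export PATH=..." line to a profile file unless it is already there.
# add_to_path is a hypothetical helper: $1 is the profile file, $2 the directory.
add_to_path() {
    profile="$1"
    dir="$2"
    # \$PATH stays literal so it is expanded at login time, not now
    line="export PATH=\$PATH:$dir"
    grep -qxF "$line" "$profile" 2>/dev/null ||
        printf '%s\n' "$line" >> "$profile"
}
```

For example, add_to_path "$HOME/.profile" /usr/local/hadoop/bin, repeated once for each directory you want on the PATH.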
Once done, log out and log back in (so your path takes hold) and then as root you can launch hive using this command:
hive --service hiveserver
If you get an error about hadoop not being found, make sure you've renamed your hadoop-0.20.2 folder to just hadoop (or used a symlink, or whatever) and that its bin directory is on your PATH.
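A quick way to see whether the shell can actually find the commands is to ask it with command -v. The sketch below wraps that in a small function; check_on_path is a hypothetical helper name, not something Hadoop or Hive provides.

```shell
# Report which of the named commands cannot be found on the current PATH.
# check_on_path is a hypothetical helper; it returns non-zero if any is missing.
check_on_path() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_on_path hadoop hive after logging back in prints one line for each command that's missing and stays silent if both are found.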