Hadoop on Mac OS X
From http://www.infosci.cornell.edu/hadoop/mac.html
This guide helps Cornell students using Mac OS X 10.5 set up a development environment for working with Hadoop [1] and run Hadoop jobs on the Cornell Center for Advanced Computing (CAC) Hadoop cluster. It walks you through compiling and running a simple example Hadoop job. More information is available in the official Hadoop Map-Reduce Tutorial [2].
The overall process of developing a Hadoop job is as follows:
- Install Hadoop on your development machine (personal or lab computer)
- Compile the Hadoop job and package it as a JAR file
- Run the Hadoop job JAR file on your development machine, for testing and debugging
- Run the Hadoop job JAR file on the CAC Hadoop cluster, for production
1. Installing Hadoop
This section shows you how to download Hadoop and prepare it for use on a Mac. Note: Hadoop versions up to and including 0.17.2 require Java 1.5 and fail under Java 1.6. The instructions below take this into account.
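Before going further, it can help to confirm that Java 1.5 is actually present. The following is a minimal sketch, assuming Java 1.5 lives at the JAVA_HOME path used later in this guide; `has_java15` is a hypothetical helper name, not part of Hadoop or Mac OS X.

```shell
#!/bin/sh
# has_java15 is a hypothetical helper. Its default path is the JAVA_HOME
# value used in this guide; pass a different directory as the first argument.
has_java15() {
    java_home="${1:-/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0}"
    # Apple's layout puts the java launcher under Commands/, not bin/.
    [ -x "$java_home/Commands/java" ]
}

if has_java15; then
    echo "Java 1.5 found"
else
    echo "Java 1.5 not found at the expected location"
fi
```

If the check fails, the JAVA_HOME path given in the steps below will not work on that machine as written.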
- Obtain the latest stable Hadoop release. The file is named hadoop-version.tar.gz, where version is the release number, and can be downloaded from the Apache Hadoop website. Extract the downloaded file and place the resulting folder on your Desktop (or another location).
- To make Hadoop run on a Mac, you need to edit two files. First, in your favorite text editor, open the file conf/hadoop-env.sh inside the Hadoop folder you just extracted. Find the following line in the file:
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
and change it to:
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/
Save the file. Second, in your text editor, open the file bin/hadoop inside the Hadoop folder. Search the file for the following line:
JAVA=$JAVA_HOME/bin/java
and change it to:
JAVA=$JAVA_HOME/Commands/java
Save the file and exit the editor. You have now set up Hadoop for development purposes on your computer.
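The two manual edits above can also be scripted. The following is a minimal sketch, assuming both files contain the lines exactly as quoted in this guide; `patch_hadoop_for_mac` is a hypothetical helper name and is not part of Hadoop.

```shell
#!/bin/sh
# patch_hadoop_for_mac is a hypothetical helper, not part of Hadoop.
# Argument: path to the extracted Hadoop folder.
patch_hadoop_for_mac() {
    hadoop_dir="$1"
    java_home="/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/"

    # Keep backups of both files before changing them.
    cp "$hadoop_dir/conf/hadoop-env.sh" "$hadoop_dir/conf/hadoop-env.sh.orig" || return 1
    cp "$hadoop_dir/bin/hadoop" "$hadoop_dir/bin/hadoop.orig" || return 1

    # Edit 1: uncomment the JAVA_HOME line and point it at Apple's Java 1.5.
    sed "s|^# *export JAVA_HOME=.*|export JAVA_HOME=$java_home|" \
        "$hadoop_dir/conf/hadoop-env.sh.orig" > "$hadoop_dir/conf/hadoop-env.sh"

    # Edit 2: Apple keeps the java launcher under Commands/ rather than bin/.
    sed 's|JAVA=\$JAVA_HOME/bin/java|JAVA=$JAVA_HOME/Commands/java|' \
        "$hadoop_dir/bin/hadoop.orig" > "$hadoop_dir/bin/hadoop"
}
```

To use it, extract the archive with `tar -xzf hadoop-version.tar.gz` and then call `patch_hadoop_for_mac` with the path to the resulting folder. The originals remain available as conf/hadoop-env.sh.orig and bin/hadoop.orig if a change needs to be undone.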