Hadoop on Mac OS X

From NeoWiki

Revision as of 02:48, 17 November 2009

From http://www.infosci.cornell.edu/hadoop/mac.html

This guide helps Cornell students running Mac OS X 10.5 set up a development environment for Hadoop [1] and run Hadoop jobs on the Cornell Center for Advanced Computing (CAC) Hadoop cluster. It walks you through compiling and running a simple example Hadoop job. More information is available in the official Hadoop Map-Reduce Tutorial [2].

The overall process of developing a Hadoop job is as follows:

  1. Install Hadoop on your development machine (personal or lab computer)
  2. Compile the Hadoop job and create a JAR file
  3. Run the Hadoop job JAR file on your development machine, for testing and debugging
  4. Run the Hadoop job JAR file on the CAC Hadoop cluster, for production


1. Installing Hadoop

This section shows how to download Hadoop and prepare it for use on a Mac. Note: Hadoop versions up to and including 0.17.2 require Java 1.5 and will fail under Java 1.6. The instructions below take this into account.
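
Before going further, it can help to confirm which Java version a given java command reports. The following is a minimal sketch, not part of the original guide: the function name is_java_15 is hypothetical, and it assumes the usual banner format in which the first line of java -version contains the quoted version string, such as version "1.5.0_16".

```shell
#!/bin/sh
# Sketch: succeed only if a `java -version` banner line reports a 1.5.x JVM.
# is_java_15 is a hypothetical helper name, not something Hadoop provides.
is_java_15() {
    case "$1" in
        *'version "1.5.'*) return 0 ;;   # banner contains: version "1.5.
        *) return 1 ;;                   # any other version, or no match
    esac
}
```

One way to use it, assuming java is on your PATH (java -version writes its banner to standard error, hence the redirect): is_java_15 "$(java -version 2>&1 | head -n 1)" && echo "Java 1.5 found"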

  1. Obtain the latest stable Hadoop release. The file is named hadoop-version.tar.gz and can be obtained here. Extract the downloaded archive and place the resulting folder on your Desktop (or another location).
  2. To make Hadoop run on a Mac, you need to edit two files. First, open conf/hadoop-env.sh, inside the Hadoop folder you just extracted, in a text editor and find the following line:
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun 

and change it to:

export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/ 

Save the file. Second, open bin/hadoop, inside the same Hadoop folder, in your text editor and find the following line:

JAVA=$JAVA_HOME/bin/java 

and change it to:

JAVA=$JAVA_HOME/Commands/java 

Save the file and exit the editor. You have now set up Hadoop for development purposes on your computer.
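
If you prefer the command line to a text editor, the two edits above can be sketched as a small shell function. This is an illustration under stated assumptions rather than an official procedure: the function name patch_hadoop_for_mac is hypothetical, and it assumes both files still contain the original lines shown above. Each sed call keeps a .bak copy of the file it changes.

```shell
#!/bin/sh
# Sketch: apply the two edits from this section with sed.
# patch_hadoop_for_mac is a hypothetical helper; its argument is the path
# to your extracted Hadoop folder.
patch_hadoop_for_mac() {
    hadoop_dir="$1"
    java_home="/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/"

    # Edit 1: replace the commented-out JAVA_HOME line in conf/hadoop-env.sh.
    sed -i.bak "s|^# *export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" \
        "$hadoop_dir/conf/hadoop-env.sh" || return 1

    # Edit 2: make bin/hadoop look for java under Commands instead of bin.
    # Single quotes keep the shell from expanding $JAVA_HOME here.
    sed -i.bak 's|JAVA=\$JAVA_HOME/bin/java|JAVA=$JAVA_HOME/Commands/java|' \
        "$hadoop_dir/bin/hadoop" || return 1
}
```

For example, if the folder is on your Desktop, you might run: patch_hadoop_for_mac "$HOME/Desktop/hadoop-version" (replace hadoop-version with the actual folder name). Afterwards, compare each file with its .bak copy to confirm that only the intended line changed.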