# For Ubuntu 11
sudo apt-add-repository ppa:flexiondotorg/java
sudo apt-get update
sudo apt-get install sun-java6-jdk
# if the installation is inside a VM and behind a proxy
# In addition to configuring proxies, tell sudo to consider the environment with the flag -E
export http_proxy=http://<proxy>:<port>
export https_proxy=http://<proxy>:<port>
sudo -E apt-add-repository ppa:flexiondotorg/java
Check if Java is installed:
java -version
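If the proxy should also apply to apt without exporting variables in every shell, one option (a sketch; the file name 95proxy is an arbitrary choice) is an apt configuration snippet:
# /etc/apt/apt.conf.d/95proxy
Acquire::http::Proxy "http://<proxy>:<port>";
Acquire::https::Proxy "http://<proxy>:<port>";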
sudo addgroup hadoop
sudo adduser --ingroup hadoop ubuntu
We use the default ubuntu:ubuntu user from the Amazon EC2 instance in the following.
su - ubuntu
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
wget https://archive.apache.org/dist/hadoop/core/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz
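It is worth verifying that passwordless SSH to localhost works before starting Hadoop; a minimal check:
chmod 600 $HOME/.ssh/authorized_keys
ssh localhost exit   # the first connection asks to confirm the host key, but it must not ask for a password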
sudo tar xzf hadoop-0.20.205.0.tar.gz
sudo chown -R ubuntu hadoop-0.20.205.0
sudo mkdir -p /home/ubuntu/myhdfs
sudo chown ubuntu:ubuntu /home/ubuntu/myhdfs
In conf/hadoop-env.sh set:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
Add the following between the <configuration> ... </configuration> tags of conf/core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/myhdfs</value>
<!-- If using the default location instead: mkdir -p /tmp/hadoop-username/dfs -->
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
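For reference, the assembled conf/core-site.xml then looks roughly like this (a sketch; the two header lines are the stock ones shipped with Hadoop):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/ubuntu/myhdfs</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>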
conf/mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
conf/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
hadoop-0.20.205.0/bin/hadoop namenode -format
hadoop-0.20.205.0/bin/start-all.sh
Use bin/hadoop fsck / to check whether all data nodes are up.
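Another quick sanity check (a sketch; it assumes the JDK's jps tool is on the PATH) is to list the running Hadoop daemons:
jps
# A working single-node setup shows (PIDs will differ):
# NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker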
Install ssh, Java 6, and Hadoop on the new slave node.
Add a hadoop user if necessary.
Copy the master's key to the slave; on the master node type the following:
ssh-copy-id -i $HOME/.ssh/id_rsa.pub ubuntu@159.xxx.xxx.xxx
The following configuration uses files which are all in hadoop_home/conf.
Download and unzip Hadoop.
Edit hadoop-env.sh as we did for the master node. Copy core-site.xml, hdfs-site.xml and mapred-site.xml from the master to the slave node. In this case there are no custom settings on the new node, but localhost should be replaced with the IP of the master node. Edit the slaves file of the master node, appending the IP address or hostname of the slave node at the end of the file:
localhost
159.xxx.xxx.xxx
Run bin/start-all.sh. After a moment, the node will be initialized and appear in the master's web admin UI.
The node must be listed in the Hadoop_Home/conf/slaves file located on the master node; also make sure that the node is not listed in the exclude file.
bin/hadoop-daemon.sh start datanode and bin/hadoop-daemon.sh start tasktracker will start the data storage and task tracker processes on the new node, therefore adding it to the cluster.
bin/hadoop dfsadmin -refreshNodes must be run on the master server. This forces the master to repopulate the list of valid nodes from the slaves and exclude files.
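Put together, a minimal sketch of those commands (paths relative to the Hadoop installation directory; the slave IP is the placeholder used above):
# On the new slave node
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
# On the master node
bin/hadoop dfsadmin -refreshNodes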
DFS Format
bin/hadoop namenode -format # DFS format command
A failed format looks like this:
Format aborted in /home/hduser/hadooptmp/dfs/name
11/10/25 04:29:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
If the name node is already shut down, go to the dfs directory and manually delete all of its files. After this, run the format command again.
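A minimal sketch of that cleanup, assuming hadoop.tmp.dir points at /home/ubuntu/myhdfs as configured above (adjust the path to your own hadoop.tmp.dir):
bin/stop-all.sh               # make sure no daemons are holding the directory
rm -rf /home/ubuntu/myhdfs/*  # deletes all HDFS data
bin/hadoop namenode -format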
If the name node is stuck in safe mode (org.apache.hadoop.dfs.SafeModeException), use the following command to turn safe mode off:
bin/hadoop dfsadmin -safemode leave
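To confirm that safe mode is really off, the same dfsadmin tool has a get subcommand:
bin/hadoop dfsadmin -safemode get   # prints whether safe mode is ON or OFF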
With many Hadoop tutorials I had the following problems with the Hadoop Eclipse plugin. The possible error messages:
Error:null
Error: Call to localhost/127.0.0.1:54310 failed on local exception: java.io.EOFException
Error: Call to localhost/127.0.0.1:54310 failed on connection exception:
java.net.ConnectException: Connection Refused.
Fix:
Just install Cygwin, then add the Cygwin paths to the Path environment variable: ;c:\cygwin\bin;c:\cygwin\usr\bin. Restart Eclipse and the problem will be fixed.
Permission denied: user=xxx\xxxxx, access=WRITE, inode="":hduser:supergroup:rwxr-xr-
Fix:
Change the permission. This must be done in the DFS (not on the local file system):
hadoop fs -chmod -R ugo+rwx /user
VM image: hadoop-appliance-0.18.0.vmx (http://developer.yahoo.com/hadoop/tutorial/index.html)