/×××××××××××××××××××××××××××××××××××××××××/
Author: xxx0624
HomePage: http://www.cnblogs.com/xxx0624/
/×××××××××××××××××××××××××××××××××××××××××/
Hadoop pseudo-distributed configuration walkthrough.
Versions used:
Hadoop: 1.2.1
HBase: 0.94.25
Nutch: 2.2.1
Java: 1.8.0
SSH: 1.0.1j
Tomcat: 7.0.57
ZooKeeper: 3.4.6
(1) Set up the Java environment: http://www.cnblogs.com/xxx0624/p/4164744.html
(2) Set up OpenSSH: http://www.cnblogs.com/xxx0624/p/4165252.html
(3) Set up Hadoop: http://www.cnblogs.com/xxx0624/p/4166095.html
(4) Set up Tomcat: http://www.cnblogs.com/xxx0624/p/4166840.html
(5) Set up ZooKeeper: http://www.cnblogs.com/xxx0624/p/4168440.html
(6) Set up HBase: http://www.cnblogs.com/xxx0624/p/4170468.html
(7) Set up Ant: http://www.cnblogs.com/xxx0624/p/4172277.html
(8) Set up Nutch: http://www.cnblogs.com/xxx0624/p/4172601.html
(9) Integration: http://www.cnblogs.com/xxx0624/p/4176199.html
Hadoop configuration details:
(1) hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/xxx0624/hadoop/tmp</value>
    </property>
</configuration>
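Here fs.default.name is the NameNode address (port 9000), and hadoop.tmp.dir is the base directory for Hadoop's working files. Hadoop will normally create that directory on first use, but creating it up front ensures it is owned by the right user. A minimal sketch, assuming the /home/xxx0624/hadoop layout used throughout this post:

# Create the directory referenced by hadoop.tmp.dir
mkdir -p /home/xxx0624/hadoop/tmp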
(2) hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/data</value>
    </property>
</configuration>
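dfs.replication is 1 because a pseudo-distributed setup has a single DataNode, so each block exists in one copy. With dfs.name.dir and dfs.data.dir set, HDFS must be formatted once before its first start. A minimal first-run sketch for Hadoop 1.x, assuming the install lives at /home/xxx0624/hadoop:

# Format the NameNode exactly once (re-running this wipes HDFS metadata)
/home/xxx0624/hadoop/bin/hadoop namenode -format
# Start the HDFS daemons
/home/xxx0624/hadoop/bin/start-dfs.sh
# jps should now list NameNode, DataNode and SecondaryNameNode
jps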
(3) hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
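mapred.job.tracker is the JobTracker's RPC address; note that it uses port 9001, separate from the HDFS port 9000 in core-site.xml. A quick way to bring up and check the MapReduce side once HDFS is running (Hadoop 1.x commands and default web-UI ports):

# Start the JobTracker and TaskTracker
/home/xxx0624/hadoop/bin/start-mapred.sh
# Hadoop 1.x web UIs for a quick health check:
#   JobTracker: http://localhost:50030
#   NameNode:   http://localhost:50070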
(4) hadoop/conf/hadoop-env.sh
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000

# Extra Java runtime options. Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
# export HADOOP_CLIENT_OPTS

# Extra ssh options. Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

# Where log files are stored. $HADOOP_HOME/logs by default.
# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.
# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

# host:path where hadoop code should be rsync'd from. Unset by default.
# export HADOOP_MASTER=master:/home/$USER/src/hadoop

# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HADOOP_SLAVE_SLEEP=0.1

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the users that are going to run the hadoop daemons. Otherwise there is
#       the potential for a symlink attack.
# export HADOOP_PID_DIR=/var/hadoop/pids

# A string representing this instance of hadoop. $USER by default.
# export HADOOP_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.
# export HADOOP_NICENESS=10

export JAVA_HOME=/usr/lib/jvm

export HADOOP_HOME=/home/xxx0624/hadoop

export PATH=$PATH:/home/xxx0624/hadoop/bin

export HBASE_CLASSPATH=/home/xxx0624/hadoop/conf
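Only the four export lines at the bottom were added; the rest is the stock file with everything commented out. One caveat: JAVA_HOME must resolve to an actual JDK root. /usr/lib/jvm works only if it is (or links to) a single JDK; on many distributions the JDK sits one level deeper, e.g. /usr/lib/jvm/java-8-openjdk-amd64 (that path is an example, check your system). A quick sanity check after editing:

# Re-read the environment and confirm Hadoop can find Java
source /home/xxx0624/hadoop/conf/hadoop-env.sh
"$JAVA_HOME"/bin/java -version
# Should print the Hadoop build info (1.2.1 here) without JAVA_HOME errors
hadoop version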