/×××××××××××××××××××××××××××××××××××××××××/
Author: xxx0624
HomePage: http://www.cnblogs.com/xxx0624/
/×××××××××××××××××××××××××××××××××××××××××/
Hadoop pseudo-distributed configuration walkthrough.
Versions used:
Hadoop: 1.2.1
HBase: 0.94.25
Nutch: 2.2.1
Java: 1.8.0
SSH: 1.0.1j
Tomcat: 7.0.57
ZooKeeper: 3.4.6
(1) Set up the Java environment: http://www.cnblogs.com/xxx0624/p/4164744.html
(2) Set up OpenSSH: http://www.cnblogs.com/xxx0624/p/4165252.html
(3) Set up Hadoop: http://www.cnblogs.com/xxx0624/p/4166095.html
(4) Set up Tomcat: http://www.cnblogs.com/xxx0624/p/4166840.html
(5) Set up ZooKeeper: http://www.cnblogs.com/xxx0624/p/4168440.html
(6) Set up HBase: http://www.cnblogs.com/xxx0624/p/4170468.html
(7) Set up Ant: http://www.cnblogs.com/xxx0624/p/4172277.html
(8) Set up Nutch: http://www.cnblogs.com/xxx0624/p/4172601.html
(9) Integration: http://www.cnblogs.com/xxx0624/p/4176199.html
Hadoop configuration details:
(1) hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/xxx0624/hadoop/tmp</value>
    </property>
</configuration>
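Here fs.default.name is the NameNode address (port 9000), and hadoop.tmp.dir is the base directory for Hadoop's working files. Hadoop will normally create that directory on first use, but creating it up front ensures it is owned by the right user. A minimal sketch, assuming the /home/xxx0624/hadoop layout used throughout this post:

# Create the directory referenced by hadoop.tmp.dir
mkdir -p /home/xxx0624/hadoop/tmp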
(2) hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/data</value>
    </property>
</configuration>
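dfs.replication is 1 because a pseudo-distributed setup has a single DataNode, so each block exists in one copy. With dfs.name.dir and dfs.data.dir set, HDFS must be formatted once before its first start. A minimal first-run sketch for Hadoop 1.x, assuming the install lives at /home/xxx0624/hadoop:

# Format the NameNode exactly once (re-running this wipes HDFS metadata)
/home/xxx0624/hadoop/bin/hadoop namenode -format
# Start the HDFS daemons
/home/xxx0624/hadoop/bin/start-dfs.sh
# jps should now list NameNode, DataNode and SecondaryNameNode
jps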
(3) hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
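mapred.job.tracker is the JobTracker's RPC address; note that it uses port 9001, separate from the HDFS port 9000 in core-site.xml. A quick way to bring up and check the MapReduce side once HDFS is running (Hadoop 1.x commands and default web-UI ports):

# Start the JobTracker and TaskTracker
/home/xxx0624/hadoop/bin/start-mapred.sh
# Hadoop 1.x web UIs for a quick health check:
#   JobTracker: http://localhost:50030
#   NameNode:   http://localhost:50070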
(4) hadoop/conf/hadoop-env.sh
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000

# Extra Java runtime options. Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
# export HADOOP_CLIENT_OPTS

# Extra ssh options. Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

# Where log files are stored. $HADOOP_HOME/logs by default.
# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.
# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

# host:path where hadoop code should be rsync'd from. Unset by default.
# export HADOOP_MASTER=master:/home/$USER/src/hadoop

# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HADOOP_SLAVE_SLEEP=0.1

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the users that are going to run the hadoop daemons. Otherwise there is
#       the potential for a symlink attack.
# export HADOOP_PID_DIR=/var/hadoop/pids

# A string representing this instance of hadoop. $USER by default.
# export HADOOP_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.
# export HADOOP_NICENESS=10

export JAVA_HOME=/usr/lib/jvm

export HADOOP_HOME=/home/xxx0624/hadoop

export PATH=$PATH:/home/xxx0624/hadoop/bin

export HBASE_CLASSPATH=/home/xxx0624/hadoop/conf
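Only the four export lines at the bottom were added; the rest is the stock file with everything commented out. One caveat: JAVA_HOME must resolve to an actual JDK root. /usr/lib/jvm works only if it is (or links to) a single JDK; on many distributions the JDK sits one level deeper, e.g. /usr/lib/jvm/java-8-openjdk-amd64 (that path is an example, check your system). A quick sanity check after editing:

# Re-read the environment and confirm Hadoop can find Java
source /home/xxx0624/hadoop/conf/hadoop-env.sh
"$JAVA_HOME"/bin/java -version
# Should print the Hadoop build info (1.2.1 here) without JAVA_HOME errors
hadoop version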