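Hadoop Ecosystem: Oozie Deployment in Practice

                                      Author: Yin Zhengjie (尹正杰)

Copyright notice: this is an original work. Reposting is not permitted; violations will be pursued legally.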

 

 

 

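I. Introduction to Oozie

1>. What is Oozie

  The name Oozie means "elephant handler" (mahout). Oozie is an open-source framework built around a workflow engine; it was originally developed at Yahoo! and later contributed to Apache. It schedules and coordinates Hadoop MapReduce jobs and Pig jobs. Oozie runs inside a Java Servlet container. Its main use is running jobs on a schedule, and multiple jobs can be scheduled in the logical order in which they must execute.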

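2>. Oozie's functional modules

1>. Workflow
    Runs the nodes of a flow in sequence, with support for fork (branch into several nodes) and join (merge several nodes back into one).

2>. Coordinator
    Triggers a workflow on a schedule; it behaves like a timer.

3>. Bundle Job
    Groups several Coordinators; it is a container for the jobs bound to it.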
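
To make the layering concrete, here is a minimal sketch that writes a coordinator and a bundle definition to a scratch directory. Everything specific in it is an assumption for illustration: the app names, the start and end dates, the daily frequency, and the ${workflowAppUri} and ${coordAppUri} parameters are hypothetical and would normally come from a job.properties file.

```shell
# Sketch only: generate a coordinator.xml and a bundle.xml for illustration.
# Names, dates and the ${...AppUri} parameters below are hypothetical.
set -eu

COORD_DIR="${COORD_DIR:-$(mktemp -d)}"   # hypothetical local staging directory

# The quoted 'EOF' stops the shell from expanding Oozie's own ${...} expressions.
cat > "$COORD_DIR/coordinator.xml" <<'EOF'
<coordinator-app xmlns="uri:oozie:coordinator:0.4" name="demo-coord"
                 frequency="${coord:days(1)}"
                 start="2018-01-01T00:00Z" end="2018-12-31T00:00Z" timezone="UTC">
    <action>
        <workflow>
            <!-- HDFS directory holding the workflow.xml that gets triggered -->
            <app-path>${workflowAppUri}</app-path>
        </workflow>
    </action>
</coordinator-app>
EOF

cat > "$COORD_DIR/bundle.xml" <<'EOF'
<bundle-app xmlns="uri:oozie:bundle:0.2" name="demo-bundle">
    <!-- each <coordinator> entry points at one coordinator application -->
    <coordinator name="demo-coord">
        <app-path>${coordAppUri}</app-path>
    </coordinator>
</bundle-app>
EOF

echo "wrote $COORD_DIR/coordinator.xml and $COORD_DIR/bundle.xml"
```

The nesting mirrors the list above: the bundle references coordinators, and each coordinator references the workflow it triggers.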

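3>. Common Oozie nodes

1>. Control Flow Nodes
    Control flow nodes are usually defined at the start or end of a workflow, for example start, end, and kill. They also provide the mechanism that selects a workflow's execution path, such as decision, fork, and join.

2>. Action Nodes
    Nodes that carry out a concrete action, for example copying a file or running a shell script.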
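
The node types above can be sketched in one small workflow definition. The script below only writes the file locally; it does not submit anything. The node names (split, make-dir, say-hello, merge, fail), the HDFS path, and the ${nameNode} and ${jobTracker} parameters are all hypothetical placeholders, not values from this deployment.

```shell
# Sketch only: write a minimal workflow.xml that uses start, fork, join,
# kill and end (control flow) plus two action nodes. All names are hypothetical.
set -eu

APP_DIR="${APP_DIR:-$(mktemp -d)}"   # hypothetical local staging directory

# The quoted 'EOF' stops the shell from expanding Oozie's own ${...} parameters.
cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="demo-wf">
    <!-- control flow: entry point -->
    <start to="split"/>

    <!-- control flow: run two branches in parallel -->
    <fork name="split">
        <path start="make-dir"/>
        <path start="say-hello"/>
    </fork>

    <!-- action node: filesystem operation on HDFS -->
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/tmp/demo-wf/out"/>
        </fs>
        <ok to="merge"/>
        <error to="fail"/>
    </action>

    <!-- action node: run a shell command -->
    <action name="say-hello">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <ok to="merge"/>
        <error to="fail"/>
    </action>

    <!-- control flow: wait until both branches have finished -->
    <join name="merge" to="end"/>

    <!-- control flow: abort the workflow with an error message -->
    <kill name="fail">
        <message>Workflow failed at ${wf:lastErrorNode()}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

echo "wrote $APP_DIR/workflow.xml"
```

Every action declares both an ok and an error transition, so a failure in either branch routes to the kill node instead of leaving the join waiting.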

 

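II. Deploying a Hadoop test environment

1>. Download the Hadoop packages

  I uploaded the versions used in this test to Baidu Cloud. Link: https://pan.baidu.com/s/1w5G5ReKdJgDJe6931bA8Lw Password: nal3

2>. Extract the CDH build of Hadoop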

[yinzhengjie@s101 cdh]$ pwd
/home/yinzhengjie/download/cdh
[yinzhengjie@s101 cdh]$ 
[yinzhengjie@s101 cdh]$ ll
total 1298112
-rw-r--r-- 1 yinzhengjie yinzhengjie    3759787 Sep 26  2016 cdh5.3.6-snappy-lib-natirve.tar.gz
-rw-r--r-- 1 yinzhengjie yinzhengjie  293471952 Sep 19  2016 hadoop-2.5.0-cdh5.3.6.tar.gz
-rw-r--r-- 1 yinzhengjie yinzhengjie 1032028646 Sep 19  2016 oozie-4.0.0-cdh5.3.6.tar.gz
[yinzhengjie@s101 cdh]$ 
[yinzhengjie@s101 cdh]$ ll
total 1298116
-rw-r--r--  1 yinzhengjie yinzhengjie    3759787 Sep 26  2016 cdh5.3.6-snappy-lib-natirve.tar.gz
drwxr-xr-x 14 yinzhengjie yinzhengjie       4096 Jul 28  2015 hadoop-2.5.0-cdh5.3.6
-rw-r--r--  1 yinzhengjie yinzhengjie  293471952 Sep 19  2016 hadoop-2.5.0-cdh5.3.6.tar.gz
-rw-r--r--  1 yinzhengjie yinzhengjie 1032028646 Sep 19  2016 oozie-4.0.0-cdh5.3.6.tar.gz
[yinzhengjie@s101 cdh]$ 
[yinzhengjie@s101 cdh]$ cd hadoop-2.5.0-cdh5.3.6/lib/native/
[yinzhengjie@s101 native]$ 
[yinzhengjie@s101 native]$ ll
total 0
[yinzhengjie@s101 native]$ 
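
The two directory listings above differ only by the extracted hadoop-2.5.0-cdh5.3.6 directory; the extraction command itself is not shown. A minimal sketch of that step follows, assuming the tarball was unpacked in place with tar. The extract_into helper and the CDH_DIR variable are my own names for illustration, not part of the original session.

```shell
# Sketch only: unpack a .tar.gz into a target directory.
# extract_into and CDH_DIR are hypothetical names used for illustration.
set -eu

# Extract a gzip-compressed tarball into a directory (created if missing).
extract_into() {
    archive="$1"
    target="$2"
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$target"
    tar -zxf "$archive" -C "$target"
}

# Path taken from the listing above; override CDH_DIR to try it elsewhere.
CDH_DIR="${CDH_DIR:-/home/yinzhengjie/download/cdh}"
if [ -f "$CDH_DIR/hadoop-2.5.0-cdh5.3.6.tar.gz" ]; then
    extract_into "$CDH_DIR/hadoop-2.5.0-cdh5.3.6.tar.gz" "$CDH_DIR"
    # Expected to be empty at this point, as in the listing above.
    ls "$CDH_DIR/hadoop-2.5.0-cdh5.3.6/lib/native"
else
    echo "tarball not found under $CDH_DIR, nothing extracted"
fi
```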

3>. Extract the Snappy native libraries

[yinzhengjie@s101 native]$ tar -zxf /home/yinzhengjie/download/cdh/cdh5.3.6-snappy-lib-natirve.tar.gz -C ./
[yinzhengjie@s101 native]$ 
[yinzhengjie@s101 native]$ ll
total 0
drwxrwxr-x 3 yinzhengjie yinzhengjie 19 Sep 13  2015 lib
[yinzhengjie@s101 native]$ mv lib/native/* ./
[yinzhengjie@s101 native]$ 
[yinzhengjie@s101 native]$ ll
total 15472
drwxrwxr-x 3 yinzhengjie yinzhengjie      19 Sep 13  2015 lib
-rw-rw-r-- 1 yinzhengjie yinzhengjie 1279980 Sep 13  2015 libhadoop.a
-rw-rw-r-- 1 yinzhengjie yinzhengjie 1487052 Sep 13  2015 libhadooppipes.a
lrwxrwxrwx 1 yinzhengjie yinzhengjie      18 Sep 13  2015 libhadoop.so -> libhadoop.so.1.0.0
-rwxrwxr-x 1 yinzhengjie yinzhengjie  747310 Sep 13  2015 libhadoop.so.1.0.0
-rw-rw-r-- 1 yinzhengjie yinzhengjie  582056 Sep 13  2015 libhadooputils.a
-rw-rw-r-- 1 yinzhengjie yinzhengjie  359770 Sep 13  2015 libhdfs.a
lrwxrwxrwx 1 yinzhengjie yinzhengjie      16 Sep 13  2015 libhdfs.so -> libhdfs.so.0.0.0
-rwxrwxr-x 1 yinzhengjie yinzhengjie  228715 Sep 13  2015 libhdfs.so.0.0.0
-rw-rw-r-- 1 yinzhengjie yinzhengjie 7684148 Sep 13  2015 libnativetask.a
lrwxrwxrwx 1 yinzhengjie yinzhengjie      22 Sep 13  2015 libnativetask.so -> libnativetask.so.1.0.0
-rwxrwxr-x 1 yinzhengjie yinzhengjie 3060775 Sep 13  2015 libnativetask.so.1.0.0
-rw-r--r-- 1 yinzhengjie yinzhengjie  233506 Sep 13  2015 libsnappy.a
-rwxr-xr-x 1 yinzhengjie yinzhengjie     961 Sep 13  2015 libsnappy.la
lrwxrwxrwx 1 yinzhengjie yinzhengjie      18 Sep 13  2015 libsnappy.so -> libsnappy.so.1.2.0
lrwxrwxrwx 1 yinzhengjie yinzhengjie      18 Sep 13  2015 libsnappy.so.1 -> libsnappy.so.1.2.0
-rwxr-xr-x 1 yinzhengjie yinzhengjie  147718 Sep 13  2015 libsnappy.so.1.2.0
[yinzhengjie@s101 native]$ rm -rf lib
[yinzhengjie@s101 native]$ 
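
The extract, move, and delete sequence above can likely be collapsed into one tar call. This is a sketch under one assumption: that the archive contains nothing but lib/native/ entries, which is what the listing suggests. The install_native_libs function and the NATIVE_DIR and SNAPPY_TGZ variables are hypothetical names.

```shell
# Sketch only: unpack lib/native/* from the archive straight into the
# Hadoop native directory. Function and variable names are hypothetical.
set -eu

# --strip-components=2 drops the leading "lib/native/" from every entry,
# so the libraries land directly in the target directory.
install_native_libs() {
    archive="$1"
    native_dir="$2"
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$native_dir"
    tar -zxf "$archive" -C "$native_dir" --strip-components=2
}

# Paths taken from the session above; override them to try it elsewhere.
SNAPPY_TGZ="${SNAPPY_TGZ:-/home/yinzhengjie/download/cdh/cdh5.3.6-snappy-lib-natirve.tar.gz}"
NATIVE_DIR="${NATIVE_DIR:-/home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/lib/native}"
if [ -f "$SNAPPY_TGZ" ]; then
    install_native_libs "$SNAPPY_TGZ" "$NATIVE_DIR"
    ls -l "$NATIVE_DIR"
else
    echo "snappy archive not found, nothing installed"
fi
```

Once the Hadoop bin directory is on the PATH, running hadoop checknative -a should report whether the snappy library is picked up; I have not shown its output here because it depends on the host.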

4>. Edit the "mapred-site.xml" configuration file

[yinzhengjie@s101 hadoop-2.5.0-cdh5.3.6]$ pwd
/home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6
[yinzhengjie@s101 hadoop-2.5.0-cdh5.3.6]$ ll
total 20
drwxr-xr-x  2 yinzhengjie yinzhengjie  128 Jul 28  2015 bin
drwxr-xr-x  2 yinzhengjie yinzhengjie 4096 Jul 28  2015 bin-mapreduce1
drwxr-xr-x  3 yinzhengjie yinzhengjie 4096 Jul 28  2015 cloudera
drwxr-xr-x  6 yinzhengjie yinzhengjie  105 Jul 28  2015 etc
drwxr-xr-x  5 yinzhengjie yinzhengjie   40 Jul 28  2015 examples
drwxr-xr-x  3 yinzhengjie yinzhengjie   27 Jul 28  2015 examples-mapreduce1
drwxr-xr-x  2 yinzhengjie yinzhengjie  101 Jul 28  2015 include
drwxr-xr-x  3 yinzhengjie yinzhengjie   19 Jul 28  2015 lib
drwxr-xr-x  2 yinzhengjie yinzhengjie 4096 Jul 28  2015 libexec
drwxr-xr-x  3 yinzhengjie yinzhengjie 4096 Jul 28  2015 sbin
drwxr-xr-x  4 yinzhengjie yinzhengjie   29 Jul 28  2015 share
drwxr-xr-x 17 yinzhengjie yinzhengjie 4096 Jul 28  2015 src
[yinzhengjie@s101 hadoop-2.5.0-cdh5.3.6]$ cd etc/hadoop
[yinzhengjie@s101 hadoop]$ 
[yinzhengjie@s101 hadoop]$ rm -rf *.cmd
[yinzhengjie@s101 hadoop]$ 
[yinzhengjie@s101 hadoop]$ mv mapred-site.xml.template mapred-site.xml
[yinzhengjie@s101 hadoop]$ 
[yinzhengjie@s101 hadoop]$ vi mapred-site.xml 
[yinzhengjie@s101 hadoop]$ 
[yinzhengjie@s101 hadoop]$ more mapred-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property> 
        <name>mapreduce.framework.name</name> 
        <value>yarn</value> 
    </property>


    <!-- Address of the MapReduce JobHistory Server; default port 10020 -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>s101:10020</value>
    </property>

    <!-- Address of the MapReduce JobHistory Server web UI; default port 19888 -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>s101:19888</value>
    </property>
</configuration>
[yinzhengjie@s101 hadoop]$ 
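
After editing by hand, it can help to read the values back instead of eyeballing the XML. The helper below is a rough sketch, not a real XML parser: it assumes each <value> follows its <name> inside the same property, as in the file above. The get_prop function and the SITE variable are hypothetical names.

```shell
# Sketch only: print the <value> that follows a given <name> in a Hadoop
# *-site.xml file. get_prop and SITE are hypothetical names.
set -eu

get_prop() {
    file="$1"
    prop="$2"
    awk -v want="$prop" '
        # remember whether the most recent <name> is the one we want
        /<name>/ {
            n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n)
            hit = (n == want)
        }
        # print the first <value> seen after a matching <name>, then stop
        /<value>/ && hit {
            v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
            print v; exit
        }
    ' "$file"
}

# Path taken from the session above; override SITE to point elsewhere.
SITE="${SITE:-/home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop/mapred-site.xml}"
if [ -f "$SITE" ]; then
    for p in mapreduce.framework.name \
             mapreduce.jobhistory.address \
             mapreduce.jobhistory.webapp.address; do
        echo "$p = $(get_prop "$SITE" "$p")"
    done
fi
```

The two jobhistory addresses only matter once the history server is running; on Hadoop 2.x it is typically started with sbin/mr-jobhistory-daemon.sh start historyserver.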
