hue
英 [hjuː] 
n. 色彩;色度;色调;叫声

 

oozie
['uːzɪ] 
(缅甸的)驯象人,驭象者
Hue是一个开源的Apache Hadoop UI系统。

通过使用Hue我们可以在浏览器端的Web控制台上与Hadoop集群进行交互来分析处理数据。
例如操作HDFS上的数据、运行Hive脚本、管理Oozie任务等等。

是基于Python Web框架Django实现的。

支持任何版本Hadoop

基于文件浏览器(File Browser)访问HDFS
基于web编辑器来开发和运行Hive查询
支持基于Solr进行搜索的应用,并提供可视化的数据视图,报表生成
通过web调试和开发impala交互式查询
spark调试和开发
Pig开发和调试
oozie任务的开发,监控,和工作流协调调度
Hbase数据查询和修改,数据展示
Hive的元数据(metastore)查询
MapReduce任务进度查看,日志追踪
创建和提交MapReduce,Streaming,Java job任务
Sqoop2的开发和调试
Zookeeper的浏览和编辑
数据库(MySQL,PostGres,SQlite,Oracle)的查询和展示

 

调度系统:
作业输入输出有依赖;A执行完成再给信号给B

  

添加hue服务。先添加依赖的服务oozie.

mysql> create database oozie     DEFAULT CHARACTER SET utf8;
mysql> grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';

cdh hue impala

 

 

 cdh hue impala

 

 

 

 cdh hue impala

 

 

 

 

Failed to create Oozie database tables.
直接返回主页启动zoozie即可。
https://blog.csdn.net/qiang0066/article/details/79214441

  

继续添加hue服务

  

cdh hue impala

 

 

 

 

进入hue服务页点击webUI

 cdh hue impala

 

 

 

创建用户

cdh hue impala

 

 

 

点击右侧下拉创建用户

cdh hue impala

 

 

 

点击文件浏览器创建和保存文件

  

 

cdh hue impala

 

 

 

拖拽设计作业链

  cdh hue impala

 

 

 cdh hue impala

 

 

 

hue hive 创建表

  

cdh hue impala

 

 

 

load上传文件到hive表

  

cdh hue impala

 

 

 

 

impala
英 [ɪm'pɑːlə; -'pælə]  美 
n. 黑斑羚(产于非洲中南部)
n. (Impala)人名;(意)因帕拉

  

impala 对应hive : 是基于hdfs的sql执行引擎

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 

cloudara推荐impala的服务器的内存为128G;cloudara manager server的内存是64G.

  

cdh hue impala

 

 cdh hue impala

 

 

 

hive <--> Catalog 数据同步,

 

cdh hue impala

 

 

添加impala服务

 cdh hue impala

 

 启动impala

cdh hue impala

 

 

 

[root@node21 ~]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to node21:21000
Server version: impalad version 2.3.0-cdh5.5.0 RELEASE (build 0c891d79aa38f297d244855a32f1e17280e2129b)
***********************************************************************************
Welcome to the Impala shell. Copyright (c) 2015 Cloudera, Inc. All rights reserved.
(Impala Shell v2.3.0-cdh5.5.0 (0c891d7) built on Mon Nov  9 12:18:12 PST 2015)

The HISTORY command lists all shell commands in chronological order.
**********[node21:21000] > show tables;
Query: show tables
+------+
| name |
+------+
| ooxx |
+------+
*************************************************************************
[node21:21000] > 

[root@node22 ~]# hive
hive> show tables;  ## 已创建的hive表在impala中能够看到。
OK
ooxx
hive> create table oxox(name string);

[node21:21000] > create table xoxo(name string); 
Query: create table xoxo(name string)
##在hive中创建的表不能在impala中及时看到,在impala中创建的表能够在hive中看到

[node21:21000] > select count(*) name from ooxx;
Query: select count(*) name from ooxx
+------+
| name |
+------+
| 7    |
+------+
Fetched 1 row(s) in 10.72s

hive> select count(*) name from ooxx;
Query ID = root_20190908133636_9a96e840-6a8a-4227-959f-1bb01a14a4d1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1567832148877_0002, Tracking URL = http://node20:8088/proxy/application_1567832148877_0002/
Kill Command = /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop/bin/hadoop job  -kill job_1567832148877_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-09-08 13:37:11,874 Stage-1 map = 0%,  reduce = 0%
2019-09-08 13:38:17,502 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 27.07 sec
2019-09-08 13:38:27,805 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 33.58 sec
2019-09-08 13:38:43,354 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 36.66 sec
MapReduce Total cumulative CPU time: 36 seconds 660 msec
Ended Job = job_1567832148877_0002
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 36.66 sec   HDFS Read: 6317 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 36 seconds 660 msec
OK
7
Time taken: 162.49 seconds, Fetched: 1 row(s)
##hive比较慢

  cdh hue impala

 

 cdh hue impala

 

 默认是 -V 如:  [root@node21 ~]# implat-shell -V

 [root@node21 ~]# implala-shell -p  ## 显示详细执行计划

  

 

[root@node21 ~]# impala-shell -q "select * from ooxx"
Starting Impala Shell without Kerberos authentication
Connected to node21:21000
Server version: impalad version 2.3.0-cdh5.5.0 RELEASE (build 0c891d79aa38f297d244855a32f1e17280e2129b)
Query: select * from ooxx
+----------+
| name     |
+----------+
| zhangsan |
| lisi     |
| lisi     |
| zhangsan |
| wangwu   |
| wangsan  |
| wangsan  |
+----------+
Fetched 7 row(s) in 0.95s
[root@node21 ~]# impala-shell -B -q "select * from ooxx"
Starting Impala Shell without Kerberos authentication
Connected to node21:21000
Server version: impalad version 2.3.0-cdh5.5.0 RELEASE (build 0c891d79aa38f297d244855a32f1e17280e2129b)
Query: select * from ooxx
zhangsan
lisi
lisi
zhangsan
wangwu
wangsan
wangsan
Fetched 7 row(s) in 0.62s
[root@node21 ~]# impala-shell -B -q "select * from ooxx"
Starting Impala Shell without Kerberos authentication
Connected to node21:21000
Server version: impalad version 2.3.0-cdh5.5.0 RELEASE (build 0c891d79aa38f297d244855a32f1e17280e2129b)
Query: select * from ooxx
zhangsan
lisi
lisi
zhangsan
wangwu
wangsan
wangsan
Fetched 7 row(s) in 0.62s
[root@node21 ~]# impala-shell -B -q "select * from ooxx" >> ooxx
Starting Impala Shell without Kerberos authentication
Connected to node21:21000
Server version: impalad version 2.3.0-cdh5.5.0 RELEASE (build 0c891d79aa38f297d244855a32f1e17280e2129b)
Query: select * from ooxx
Fetched 7 row(s) in 7.77s
[root@node21 ~]# cat ooxx
zhangsan
lisi
lisi
zhangsan
wangwu
wangsan
wangsan


[root@node21 ~]# cat sql 
select * from ooxx;
select * from xxoo;
select * from ooxx;
[root@node21 ~]# impala-shell -f sql  ## 第二句错误了,第三句不在执行。
[root@node21 ~]# impala-shell -c -f sql   ## 第二句错误了,第三句仍然执行。

  cdh hue impala

 

 

[node21:21000] > show tables;
Query: show tables
+------+
| name |
+------+
| ooxx |
| xoxo |
+------+
Fetched 2 row(s) in 0.29s
[node21:21000] > invalidate metadata;
Query: invalidate metadata

Fetched 0 row(s) in 5.23s
[node21:21000] > show tables;
Query: show tables
+------+
| name |
+------+
| ooxx |
| oxox |
| xoxo |
+------+
[node21:21000] > set explain_level=0;
EXPLAIN_LEVEL set to 0
[node21:21000] > explain select count(*) from ooxx;
[node21:21000] > set explain_level=4;
EXPLAIN_LEVEL set to 4
[node21:21000] > explain select count(*) from ooxx;
[node21:21000] >  select count(*) name from ooxx;
Query: select count(*) name from ooxx
+------+
| name |
+------+
| 7    |
+------+
Fetched 1 row(s) in 30.80s
[node21:21000] > profile;

 cdh hue impala

 

 cdh hue impala

 

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

oozie

cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

 cdh hue impala

 

  

   cd /opt/cloudera-manager/cm-5.5.0/etc/init.d
    ./cloudera-scm-server start
    ./cloudera-scm-agent start  ## 三台都启动

  cdh hue impala

 

 

cdh hue impala

 

 

上图是我的页面cdh5.5.0 centos7版本。报错。

下图是老师的oozie报错。他安装Ext.js   Ext.2.2.zip 到 /var/lib/oozie/下。

  

 

cdh hue impala

 

并且修改了如下配置,再保存更改,重启。(老师操作,)可以看到如下页面。

  

 cdh hue impala

 

 cdh hue impala

 

 

例子演示:oozie提交job ,用上文创建的root用户登录

  

cdh hue impala

 

 

<workflow-app xmlns="uri:oozie:workflow:0.3" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>echo</exec>
            <argument>hi shell in oozie</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

 cdh hue impala

 

 

 

[root@node20 shell]# pwd
/root/shell
[root@node20 shell]# cat job.properties 
nameNode=hdfs://node20:8020
jobTracker=node20:8032
queueName=default
examplesRoot=examples

oozie.wf.application.path=${nameNode}/user/root/shell

  

[root@node20 shell]# oozie job --oozie http://node20:11000/oozie/  -config job.properties -run

cdh hue impala

 

 

  cdh hue impala

 

 cdh hue impala

 

 

看到上图的输出结果。
如下图,比较oozie workflow.xml 与hue配置workflow图操作的方便性

  

cdh hue impala

 

 

  

DAG :有向无环图

  

相关文章: