Using Impala with HBase and Hive

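Using complex types in Impala (Hive):
    If a table created in Hive contains complex types (array, struct, map) and its storage format is text (stored as textfile) or the default, Impala cannot query that table.
Workaround:
    Create a second table with the same columns, change stored as textfile to stored as parquet, then insert the source table's data (insert into tablename2 select * from tablename1). The new table can then be queried from Impala.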
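
The workaround above can be sketched as follows. This is a minimal sketch, not a definitive recipe: tablename1 and tablename2 are the placeholder names used above, and the columns (id, tags) are hypothetical, chosen only for illustration. The first two statements are meant to be run in Hive, since Impala itself generally cannot write complex-typed columns; the last one is run in Impala so that it picks up the table created in Hive.

```sql
-- Run in Hive.
-- Hypothetical schema; use the same columns as the source table tablename1.
CREATE TABLE tablename2 (
  id   STRING,
  tags ARRAY<STRING>
)
STORED AS PARQUET;

-- Copy the data from the text-format source table into the Parquet table.
INSERT INTO TABLE tablename2 SELECT * FROM tablename1;

-- Run in Impala: reload the metadata for the table created through Hive.
INVALIDATE METADATA tablename2;
```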

Querying:
    Unlike Hive, Impala does not use explode for complex types such as array, map, and struct. Instead it uses the following form:
select order_id,rooms.room_id, days.day_id,days.price from test2,test2.rooms,test2.rooms.days;
In effect, each complex-typed column is treated as a child table and joined to its parent in the query.
Table structure:
test2 (
   order_id string,
   rooms array<struct<
         room_id:string,
         days:array<struct<day_id:string,price:int>>
         >
   >
)
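
The same join-style syntax also applies to arrays of scalar values and to maps. As a minimal sketch, assuming a hypothetical Parquet table demo with columns id, tags, and props (none of these names come from the original), Impala exposes array elements through the item and pos pseudocolumns and map entries through key and value:

```sql
-- Hypothetical table, for illustration only:
--   demo (id STRING, tags ARRAY<STRING>, props MAP<STRING,STRING>)

-- Array of scalars: one output row per element.
-- item is the element value, pos is its index within the array.
SELECT d.id, t.pos, t.item
FROM demo d, d.tags t;

-- Map: one output row per entry.
SELECT d.id, p.key, p.value
FROM demo d, d.props p;
```

Because the comma notation behaves like an inner join, rows whose array or map is empty do not appear in the result. Writing LEFT OUTER JOIN in place of the comma keeps them.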

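Integrating Impala with HBase:
To integrate Impala with HBase, the HBase RowKey and columns must be mapped to the columns of an Impala table. Impala stores its metadata in the Hive Metastore, so, as with Hive, the integration with HBase is done through an external (EXTERNAL) table.

Creating the table in HBase: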

...
tname = TableName.valueOf("students");
HTableDescriptor tDescriptor = new HTableDescriptor(tname);
HColumnDescriptor family = new HColumnDescriptor("core");
tDescriptor.addFamily(family);
admin.createTable(tDescriptor);
// Add columns:
...
HTable htable = (HTable) connection.getTable(tname);
// Disable auto-flush so that puts are buffered on the client
htable.setAutoFlush(false);
for (int i = 1; i < 50; i++) {
    Put put = new Put(Bytes.toBytes("lisi" + format.format(i)));
    // Skip the write-ahead log (faster, but unflushed writes are lost if a region server fails)
    put.setWriteToWAL(false);

    put.addColumn(Bytes.toBytes("core"), Bytes.toBytes("math"), Bytes.toBytes(format.format(i)));
    put.addColumn(Bytes.toBytes("core"), Bytes.toBytes("english"), Bytes.toBytes(format.format(Math.random() * i)));
    put.addColumn(Bytes.toBytes("core"), Bytes.toBytes("chinese"), Bytes.toBytes(format.format(Math.random() * i)));
    htable.put(put);
    // Flush the client-side buffer every 2000 rows
    if (i % 2000 == 0) {
        htable.flushCommits();
    }
}
// Flush whatever is still buffered; with fewer than 2000 rows the check above never fires
htable.flushCommits();

(Partial code.)
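
To connect the HBase table above to the external-table mapping described earlier, the DDL might look like the sketch below. It assumes the students table and its core column family exactly as created in the Java code. The Hive-side column name name for the RowKey is a hypothetical choice, and all values are mapped as STRING because the Java code writes them with Bytes.toBytes on formatted strings. The CREATE statement is run in the Hive shell, since Impala does not accept the STORED BY clause.

```sql
-- Run in Hive: map the HBase RowKey and the core:* columns to table columns.
CREATE EXTERNAL TABLE students (
  name    STRING,   -- hypothetical column name for the HBase RowKey
  math    STRING,
  english STRING,
  chinese STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,core:math,core:english,core:chinese"
)
TBLPROPERTIES ("hbase.table.name" = "students");

-- Run in Impala: load the new table's metadata, then query it.
INVALIDATE METADATA students;
SELECT name, math, english, chinese FROM students LIMIT 10;
```

The entries in hbase.columns.mapping are positional: the first corresponds to the first table column, and :key denotes the RowKey.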