HBase Table Design: Practice Exercises
I. Table structure operations
1. Create a table scores with two column families, grade and course
[hadoop@weekend01 ~]$ hbase shell
hbase(main):006:0> create 'scores','grade','course'
0 row(s) in 0.4120 seconds
=> Hbase::Table - scores
2. List all tables in HBase
hbase(main):007:0> list
TABLE
hbase_student
my_data
new_scores
scores
student
user
6 row(s) in 0.0150 seconds
=> ["hbase_student", "my_data", "new_scores", "scores", "student", "user"]
3. View the structure of the scores table
hbase(main):008:0> desc 'scores'
Table scores is ENABLED
scores
COLUMN FAMILIES DESCRIPTION
{NAME => 'course', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'grade', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
2 row(s) in 0.0550 seconds
4. Check whether the scores table exists
hbase(main):009:0> exists 'scores'
Table scores does exist
0 row(s) in 0.0180 seconds
5. Check whether the scores table is enabled
hbase(main):010:0> is_enabled 'scores'
true
0 row(s) in 0.0160 seconds
6. Check whether the scores table is disabled
hbase(main):011:0> is_disabled 'scores'
false
0 row(s) in 0.0210 seconds
7. Disable the scores table
hbase(main):012:0> disable 'scores'
0 row(s) in 1.3380 seconds
8. Re-enable the scores table
hbase(main):013:0> enable 'scores'
0 row(s) in 0.5680 seconds
9. Add a school column family
hbase(main):002:0> alter 'scores',{NAME=>'school',VERSIONS=>3}
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.3390 seconds
10. Delete the school column family
hbase(main):005:0> alter 'scores',{NAME=>'school',METHOD=>'delete'}
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.1970 seconds
11. Drop the scores table
Procedure: disable the table first, then drop it
hbase(main):004:0> disable 'scores' # first take the scores table offline
0 row(s) in 1.3100 seconds
hbase(main):005:0> drop 'scores' # then drop the table
0 row(s) in 0.2250 seconds
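The disable-then-drop rule above can be pictured with a small state model. This is only a sketch in plain Python, not the HBase client API: the `Catalog` class and its methods are hypothetical names made up for illustration, and the error text is invented rather than copied from HBase.

```python
# Toy model of the table lifecycle; NOT the HBase client API.
class Catalog:
    def __init__(self):
        self.enabled = {}  # table name -> True if the table is enabled

    def create(self, name):
        self.enabled[name] = True  # a new table starts out enabled

    def disable(self, name):
        self.enabled[name] = False

    def drop(self, name):
        # Dropping is only allowed once the table has been disabled.
        if self.enabled[name]:
            raise RuntimeError(f"table {name} is enabled; disable it first")
        del self.enabled[name]


catalog = Catalog()
catalog.create("scores")
try:
    catalog.drop("scores")   # rejected: the table is still enabled
except RuntimeError as err:
    print(err)
catalog.disable("scores")
catalog.drop("scores")       # succeeds once the table is disabled
print("scores" in catalog.enabled)
```

The model only captures the ordering constraint; it says nothing about what disabling does to regions on a real cluster.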
II. DML operations (data manipulation)
1. Add data to scores
The data for the scores table is as follows:
Tom grade: 5
Tom course:math 97
Tom course:art 87
Tom course:english 80
Jim grade: 4
Jim course:chinese 89
Jim course:english 80
hbase(main):006:0> put 'scores','Tom','grade:','5'
0 row(s) in 0.1810 seconds
hbase(main):007:0> put 'scores','Tom','course:math','97'
0 row(s) in 0.0200 seconds
hbase(main):009:0> put 'scores','Tom','course:art','87'
0 row(s) in 0.0150 seconds
hbase(main):010:0> put 'scores','Tom','course:english','80'
0 row(s) in 0.0070 seconds
hbase(main):012:0> put 'scores','Jim','grade:','5' # the data listing above gives 4; 5 is what was entered here
0 row(s) in 0.0080 seconds
hbase(main):013:0> put 'scores','Jim','course:chinese','89'
0 row(s) in 0.0110 seconds
hbase(main):014:0> put 'scores','Jim','course:english','80'
0 row(s) in 0.0070 seconds
2. View data
a. Full scan of the scores table
hbase(main):015:0> scan 'scores'
ROW COLUMN+CELL
Jim column=course:chinese, timestamp=1464871444954, value=89
Jim column=course:english, timestamp=1464871467549, value=80
Jim column=grade:, timestamp=1464871414324, value=5
Tom column=course:art, timestamp=1464871299895, value=87
Tom column=course:english, timestamp=1464871325997, value=80
Tom column=course:math, timestamp=1464871198158, value=97
Tom column=grade:, timestamp=1464871151963, value=5
2 row(s) in 0.1160 seconds
b. View the row for Jim in the scores table (for example, get 'scores','Jim')
c. View a single cell
hbase(main):016:0> get 'scores','Tom',{COLUMN=>'course:math'}
COLUMN CELL
course:math timestamp=1464871198158, value=97
1 row(s) in 0.0900 seconds
d. View the data in the course column family
hbase(main):017:0> scan 'scores',{COLUMN=>'course'}
ROW COLUMN+CELL
Jim column=course:chinese, timestamp=1464871444954, value=89
Jim column=course:english, timestamp=1464871467549, value=80
Tom column=course:art, timestamp=1464871299895, value=87
Tom column=course:english, timestamp=1464871325997, value=80
Tom column=course:math, timestamp=1464871198158, value=97
2 row(s) in 0.0170 seconds
e. View the data in the course:english column of the scores table
hbase(main):019:0> scan 'scores',{COLUMN=>'course:english'}
ROW COLUMN+CELL
Jim column=course:english, timestamp=1464871467549, value=80
Tom column=course:english, timestamp=1464871325997, value=80
2 row(s) in 0.0260 seconds
f. Count the rows in the scores table
hbase(main):020:0> count 'scores'
2 row(s) in 0.0890 seconds
=> 2
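The scan and count results above report rows, not cells, which is why a scan listing seven cells still ends with "2 row(s)". A minimal sketch of that addressing scheme, written as a toy in-memory model in plain Python (this is not the HBase client API; the `scan` and `count` helpers are hypothetical stand-ins for the shell commands, and timestamps are left out):

```python
# Toy in-memory model of the scores table; NOT the HBase client API.
# Each cell is addressed by (row key, "family:qualifier").
scores = {
    "Tom": {"grade:": "5", "course:math": "97",
            "course:art": "87", "course:english": "80"},
    "Jim": {"grade:": "5", "course:chinese": "89", "course:english": "80"},
}


def scan(table, column=None):
    """Return (row, column, value) triples sorted by row key, then column.

    `column` may be a whole family ("course") or one column ("course:english").
    """
    result = []
    for row in sorted(table):
        for col in sorted(table[row]):
            family = col.split(":", 1)[0]
            if column is None or column == col or column == family:
                result.append((row, col, table[row][col]))
    return result


def count(table):
    """Number of rows, regardless of how many cells each row holds."""
    return len(table)


for cell in scan(scores, "course:english"):
    print(cell)
print(count(scores))
```

Sorting by row key and then by column reproduces the order seen in the scan output above, with Jim listed before Tom.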
3. Modify data
a. Change Tom's math value in the scores table to 100
hbase(main):021:0> put 'scores','Tom','course:math','100'
0 row(s) in 0.0140 seconds
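b. Set VERSIONS=>3 on the course column family of the scores table
Note: the command as entered misspells the key as VERSION. The shell treats it as an unknown argument and ignores it (see the warning in the output), so the family keeps VERSIONS => 1; the intended form is alter 'scores',{NAME=>'course',VERSIONS=>3}. The 1.8.7 in the warning is probably the Ruby VERSION constant of the JRuby-based shell.
hbase(main):022:0> alter 'scores',{NAME=>'course',VERSION=>3}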
Unknown argument ignored for column family course: 1.8.7
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.2190 seconds
Then set Tom's math value in the scores table to 100 once more
hbase(main):023:0> put 'scores','Tom','course:math','100'
0 row(s) in 0.0490 seconds
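Then read course:math from the table with VERSIONS=>3. Only the newest value comes back, because the column family still keeps a single version.
hbase(main):025:0> get 'scores','Tom',{COLUMN=>'course:math',VERSIONS=>3}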
COLUMN CELL
course:math timestamp=1464872264511, value=100
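Why the get returns a single value even when three versions are requested can be sketched with a toy model of per-cell version retention. It assumes the column family still keeps one version, which is what the "Unknown argument ignored" warning suggests. This is plain Python, not the HBase client API; `VersionedCell` is a hypothetical name, the timestamps 1, 2, 3 are made up, and the model drops old versions immediately, which simplifies what HBase does internally.

```python
# Toy model of per-cell version retention; NOT the HBase client API.
class VersionedCell:
    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Simplification: versions beyond the limit are dropped right away.
        del self.versions[self.max_versions:]

    def get(self, versions=1):
        # Asking for more versions than are retained returns what exists.
        return self.versions[:versions]


# Family left at one version: the three puts leave one visible value.
one = VersionedCell(max_versions=1)
# Family actually set to three versions: all three puts stay readable.
three = VersionedCell(max_versions=3)
for ts, value in [(1, "97"), (2, "100"), (3, "100")]:  # hypothetical timestamps
    one.put(ts, value)
    three.put(ts, value)

print(one.get(versions=3))
print(three.get(versions=3))
```

In this model the number of versions requested in the get is only an upper bound; the column family's own limit decides how many older values are available at all.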
4. Delete data
a. Delete Jim's course:chinese value from the scores table
hbase(main):027:0> delete 'scores','Jim','course:chinese'
0 row(s) in 0.0310 seconds
b. Delete Tom's course:english value from the scores table
hbase(main):029:0> delete 'scores','Tom','course:english'
0 row(s) in 0.0080 seconds
c. Delete Tom's entire row from the scores table
hbase(main):032:0> deleteall 'scores','Tom'
0 row(s) in 0.4160 seconds
d. Clear all contents of the scores table
hbase(main):031:0> truncate 'scores'
Truncating 'scores' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 1.5000 seconds
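The three delete commands above differ only in scope: one cell, one row, or the whole table. A rough sketch of that difference on a toy in-memory table follows. It is plain Python, not the HBase client API; the function names simply mirror the shell commands, and the model removes data immediately, whereas HBase records deletes as tombstone markers that are cleaned up later at major compaction.

```python
# Toy model of delete scopes; NOT the HBase client API.
def delete(table, row, column):
    """Remove one cell; a row with no cells left disappears."""
    cells = table.get(row)
    if cells is None:
        return
    cells.pop(column, None)
    if not cells:
        del table[row]


def deleteall(table, row):
    """Remove every cell of one row."""
    table.pop(row, None)


def truncate(table):
    """Remove all rows; the table itself (its schema) stays."""
    table.clear()


table = {
    "Tom": {"grade:": "5", "course:math": "100",
            "course:art": "87", "course:english": "80"},
    "Jim": {"grade:": "5", "course:chinese": "89", "course:english": "80"},
}
delete(table, "Jim", "course:chinese")
delete(table, "Tom", "course:english")
deleteall(table, "Tom")
print(table)   # only Jim's remaining cells are left
truncate(table)
print(table)
```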
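Troubleshooting notes: when I first started with HBase, the documentation did not look that hard, but actually working through it turned up plenty of trouble. At the start I kept getting the error below, had no idea where to look, and spent a long time searching Baidu.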
hbase(main):001:0> list
TABLE
ERROR: Can't get master address from ZooKeeper; znode data == null
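This error had me changing a lot of files, and it still did not work, so in the end I reconfigured everything from scratch. A classmate who hit the same problem said it was caused by the myid file not matching the id specified in the ZooKeeper configuration file. I checked, though, and they matched: hostnames, IP addresses, and id numbers all corresponded, ZooKeeper started normally, and its processes were fine. The other problem was that the HBase processes did not start properly. After starting HBase on the master node, only the master node had an HMaster process; weekend05 and weekend06 had HRegionServer processes, but weekend07 did not. I could not find a fix. Redoing all the steps from scratch got it working, but I still do not understand what went wrong.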
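The myid check mentioned above can be sketched as a small script that compares the contents of a node's myid file with the server.N entry for that host in the ZooKeeper configuration. This is only a sketch under assumptions: the id-to-host assignments and the 2888:3888 ports in the sample text are hypothetical, not taken from the actual cluster, and the helper names are made up. Since the ids matched in the case described above, a passing check would not by itself explain the error.

```python
import re


def expected_myid(zoo_cfg_text, hostname):
    """Return N from the line server.N=<hostname>:..., or None if absent."""
    for line in zoo_cfg_text.splitlines():
        match = re.match(r"\s*server\.(\d+)\s*=\s*([^:\s]+)", line)
        if match and match.group(2) == hostname:
            return int(match.group(1))
    return None


def myid_matches(zoo_cfg_text, hostname, myid_text):
    """True if the myid file content equals the id configured for hostname."""
    expected = expected_myid(zoo_cfg_text, hostname)
    return expected is not None and myid_text.strip() == str(expected)


# Hypothetical configuration text; the real assignments are not in the source.
sample_cfg = """
dataDir=/tmp/zookeeper
clientPort=2181
server.1=weekend05:2888:3888
server.2=weekend06:2888:3888
server.3=weekend07:2888:3888
"""

print(myid_matches(sample_cfg, "weekend07", "3\n"))
print(myid_matches(sample_cfg, "weekend07", "1\n"))
```

On a real node, the two text arguments would come from reading the ZooKeeper configuration file and the myid file in its data directory.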