- Using HBase shell commands
Querying help with the help command:
hbase(main):026:0> help
HBase Shell, version 1.3.1, r930b9a55528fe45d8edce7af42fef2d35e77677a, Thu Apr 6 19:36:54 PDT 2017
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, locate_region, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, trace, unassign, wal_roll, zk_dump
Group name: replication
Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot
Group name: configuration
Commands: update_all_config, update_config
Group name: quotas
Commands: list_quotas, set_quota
Group name: security
Commands: grant, list_security_capabilities, revoke, user_permission
Group name: procedures
Commands: abort_procedure, list_procedures
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
hbase(main):327:0>
Use help with a command name to see that command's usage:
hbase(main):327:0>
hbase(main):327:0> help 'update_config'
Reload a subset of configuration on server 'servername' where servername is
host, port plus startcode. For example: host187.example.com,60020,1289493121758
See http://hbase.apache.org/book.html?dyn_config for more details. Here is how
you would run the command in the hbase shell:
hbase> update_config 'servername'
hbase(main):328:0>
Typing help with no arguments lists the command groups.
- DDL usage:
- Connecting to HBase
[[email protected] ~]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.3.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.3.1, r930b9a55528fe45d8edce7af42fef2d35e77677a, Thu Apr 6 19:36:54 PDT 2017
hbase(main):001:0>
- status: query the server status
hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
hbase(main):002:0>
- version: show the HBase version
hbase(main):002:0>
hbase(main):003:0* version
1.3.1, r930b9a55528fe45d8edce7af42fef2d35e77677a, Thu Apr 6 19:36:54 PDT 2017
# The HBase version in use is 1.3.1
hbase(main):004:0>
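- table_help: help for table-reference commands
table_help takes no arguments; it prints help for commands invoked on a table reference (for example t = get_table 'demo'). For help on a single command such as create or get, use help 'get' instead. In the transcripts below a bare command name follows table_help, so the shell evaluates that command first with no arguments, which is why each one prints an ERROR line followed by that command's usage text. The misspelling table_hlep additionally raises NoMethodError.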
hbase(main):019:0*
hbase(main):020:0* table_help get
ERROR: wrong number of arguments (0 for 2)
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase(main):034:0*
hbase(main):035:0* table_hlep desc
ERROR: wrong number of arguments (0 for 1)
Describe the named table. For example:
hbase> describe 't1'
hbase> describe 'ns1:t1'
Alternatively, you can use the abbreviated 'desc' for the same thing.
hbase> desc 't1'
hbase> desc 'ns1:t1'
NoMethodError: undefined method `table_hlep' for #<Object:0x569cc9e9>
hbase(main):036:0>
hbase(main):043:0>
hbase(main):043:0> table_hlep drop_all
ERROR: wrong number of arguments (0 for 1)
Drop all of the tables matching the given regex:
hbase> drop_all 't.*'
hbase> drop_all 'ns:t.*'
hbase> drop_all 'ns:.*'
NoMethodError: undefined method `table_hlep' for #<Object:0x569cc9e9>
hbase(main):044:0> table_hlep enable_all
ERROR: wrong number of arguments (0 for 1)
Enable all of the tables matching the given regex:
hbase> enable_all 't.*'
hbase> enable_all 'ns:t.*'
hbase> enable_all 'ns:.*'
NoMethodError: undefined method `table_hlep' for #<Object:0x569cc9e9>
hbase(main):045:0>
- list: list all tables
hbase(main):004:0>
hbase(main):005:0* list
TABLE
0 row(s) in 0.0930 seconds
=> []
hbase(main):006:0>
hbase(main):084:0> t=get_table 'demo'
0 row(s) in 0.0030 seconds
=> Hbase::Table - demo
hbase(main):085:0> t.put 'r1','f1','v1'
0 row(s) in 0.0600 seconds
hbase(main):086:0> t.scan
ROW COLUMN+CELL
r1 column=f1:, timestamp=1534173820636, value=v1
1 row(s) in 0.0310 seconds
hbase(main):087:0>
- describe: show table details
Usage: describe 'table_name'
hbase(main):010:0> describe 'user'
Table user is ENABLED
user
COLUMN FAMILIES DESCRIPTION
{NAME => 'data', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VE
RSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS =>
'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VE
RSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS =>
'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
2 row(s) in 0.2090 seconds
hbase(main):011:0>
- alter: modify the table schema
First disable the user table (not required in newer versions)
disable 'user'
Add two column families, f1 and f2
alter 'user', NAME => 'f1'
alter 'user', NAME => 'f2'
Enable the table
enable 'user'
Example:
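hbase(main):013:0> disable 'user1' # disable the table
0 row(s) in 2.3150 seconds
hbase(main):014:0> alter 'user1',NAME=>'f1',NAME=>'f2' # modify the table; both pairs land in one Ruby hash where the second NAME overwrites the first, so only f2 is likely added (use {NAME=>'f1'},{NAME=>'f2'} to add both)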
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9720 seconds
hbase(main):015:0>
hbase(main):015:0> enable 'user1' # enable the table
0 row(s) in 1.3040 seconds
hbase(main):016:0>
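The alter command above passes NAME twice in the same argument list. A minimal sketch in plain Ruby of why that probably does not add two column families: the shell is JRuby, and a Ruby hash keeps only the last value for a repeated key. The NAME constant is defined by hand here so the snippet runs outside the HBase shell; this is an illustration, not the shell's own code.

```ruby
# In the HBase shell NAME is a predefined constant; it is defined here
# by hand (assumption) so the sketch runs in plain Ruby.
NAME = 'NAME'

# Mirrors: alter 'user1', NAME => 'f1', NAME => 'f2'
# Both pairs end up in one hash, and the second NAME overwrites the first.
merged = { NAME => 'f1', NAME => 'f2' }

# Mirrors: alter 'user1', {NAME => 'f1'}, {NAME => 'f2'}
# Two separate hashes, one per column family.
separate = [{ NAME => 'f1' }, { NAME => 'f2' }]

puts merged.size      # number of column family specs in the single hash
puts separate.size    # number of column family specs when passed separately
```

Passing one hash per column family, as in the second form, avoids the collision.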
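Delete a column family (disable 'user' first; not required in newer versions):
alter 'user', NAME => 'f1', METHOD => 'delete'
or
alter 'user', 'delete' => 'f1'
Add column family f1 and delete column family f2 in one command
alter 'user', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
Example: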
hbase(main):016:0> alter 'user1',NAME=>'f1',METHOD=>'delete' # delete column family f1
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2330 seconds
hbase(main):017:0>
hbase(main):018:0> alter 'user1',{NAME=>'f1'},{NAME=>'f2',METHOD=>'delete'} # add column family f1 and delete column family f2
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.1810 seconds
hbase(main):019:0>
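Example:
Create a table
create 'demo',{NAME=>'f1',COMPRESSION=>'SNAPPY',REPLICATION_SCOPE=>'1'}
Add column families f2 and f3
alter 'demo',{NAME=>'f2'},{NAME=>'f3'}
Change the number of versions kept by column family f3
alter 'demo',{NAME=>'f3',VERSIONS=>5}
Delete column family f3
alter 'demo',NAME=>'f3',METHOD=>'delete'
Delete column family f2
alter 'demo','delete'=>'f2'
Change column family attributes and table-scope attributes, and delete a column family, in one command: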
alter 'demo',{NAME=>'f1',VERSIONS=>3},{MAX_FILESIZE=>'134217728'},{METHOD=>'delete',NAME=>'f2'}
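- alter_async: alter a table asynchronously
Alter table demo: add column family f4 and delete column family f3. The command returns without waiting for all regions to receive the schema change.
alter_async 'demo',{NAME=>'f4',VERSIONS=>'3'},{'delete'=>'f3'}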
- alter_status: check the status of an alter operation
hbase(main):068:0> alter_status 'demo'
1/1 regions updated.
Done.
hbase(main):069:0>
- disable / disable_all: disable tables
Once a table is disabled, its data can no longer be read or written until the table is enabled again.
hbase(main):069:0> disable 'demo'
0 row(s) in 2.5690 seconds
hbase(main):070:0>
disable_all: disables every table whose name matches the given regular expression.
hbase(main):069:0> list
TABLE
BASE
TestTable
demo
demo01
demo02
demo03
test
testtable
8 row(s) in 0.0180 seconds
=> ["BASE", "TestTable", "demo", "demo01", "demo02", "demo03", "test", "testtable"]
hbase(main):070:0>
For example, disable all tables whose names start with demo.
hbase(main):070:0>
hbase(main):070:0> disable_all 'demo.*' # the dot matches any character; the star means zero or more repetitions
demo
demo01
demo02
demo03
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
hbase(main):022:0>
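To see which names a pattern such as demo.* selects before running disable_all, the match can be sketched in plain Ruby. The table names come from the list output above; anchoring the pattern to the whole name is an assumption about how the shell applies it.

```ruby
# Table names taken from the list output above.
tables = %w[BASE TestTable demo demo01 demo02 demo03 test testtable]

# \A and \z anchor the pattern to the whole name (assumption: the shell
# requires the regex to match the full table name).
pattern = /\Ademo.*\z/

matched = tables.select { |name| name.match?(pattern) }
puts matched

# 'demo' itself is selected because .* also matches zero characters.
```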
- enable / enable_all: enable tables
enable brings a disabled table back online. A table disabled as shown here cannot be read or written until it is enabled:
hbase(main):069:0> disable 'demo'
0 row(s) in 2.5690 seconds
hbase(main):070:0>
enable_all: enables every disabled table whose name matches the given regular expression.
hbase(main):069:0> list
TABLE
BASE
TestTable
demo
demo01
demo02
demo03
test
testtable
8 row(s) in 0.0180 seconds
=> ["BASE", "TestTable", "demo", "demo01", "demo02", "demo03", "test", "testtable"]
hbase(main):070:0>
For example, enable all disabled tables whose names start with demo.
hbase(main):072:0> enable_all 'demo.*' # the dot matches any character; the star means zero or more repetitions
demo
demo01
demo02
demo03
Enable the above 3 tables (y/n)?
y
3 tables successfully enabled
hbase(main):029:0>
- is_disabled: check whether a table is disabled
Check whether the table is disabled:
hbase(main):032:0> is_disabled 'test'
false
0 row(s) in 0.0280 seconds
hbase(main):033:0>
- is_enabled: check whether a table is enabled
Check whether the table is enabled:
hbase(main):030:0> is_enabled 'test'
true
0 row(s) in 0.0190 seconds
hbase(main):031:0>
- exists: check whether a table exists
Check whether a table exists:
hbase(main):025:0>
hbase(main):026:0* list
TABLE
test
user
2 row(s) in 0.0140 seconds
=> ["test", "user"]
hbase(main):027:0> exists 'test'
Table test does exist
0 row(s) in 0.0160 seconds
hbase(main):028:0> exists 'user'
Table user does exist
0 row(s) in 0.0180 seconds
hbase(main):029:0>
- drop / drop_all: delete tables
Usage: drop 'table_name'
A table must be disabled before it can be dropped.
hbase(main):020:0> disable 'user1'
0 row(s) in 2.2670 seconds
hbase(main):021:0> drop 'user1'
0 row(s) in 1.2990 seconds
hbase(main):022:0> list
TABLE
user
1 row(s) in 0.0090 seconds
=> ["user"]
hbase(main):023:0>
drop_all: drops every table whose name matches the given regular expression; the tables must be disabled first.
hbase(main):042:0*
hbase(main):043:0* disable_all 'demo.*'
demo01
demo02
demo03
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
hbase(main):044:0> is_enabled 'demo01'
false
0 row(s) in 0.0170 seconds
hbase(main):045:0>
hbase(main):050:0*
hbase(main):051:0* drop_all 'demo.*'
demo01
demo02
demo03
Drop the above 3 tables (y/n)?
y
3 tables successfully dropped
hbase(main):052:0>
- locate_region: find the region that holds a given row key of a table
hbase(main):054:0* scan 'test',LIMIT=>1
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533839984134, value=0
1 row(s) in 0.0390 seconds
hbase(main):055:0> locate_region 'test','row-0001'
HOST REGION
y3.worker.com:60020 {ENCODED => 4d0b6d5cea4be9acf13916616478aeaf, NAME => 'test,,1533670653099.
4d0b6d5cea4be9acf13916616478aeaf.', STARTKEY => '', ENDKEY => ''}
1 row(s) in 0.0050 seconds
hbase(main):056:0>
- show_filters: list the available filters
hbase(main):062:0*
hbase(main):063:0* table_help show_filters
ColumnPrefixFilter
TimestampsFilter
PageFilter
MultipleColumnPrefixFilter
FamilyFilter
ColumnPaginationFilter
SingleColumnValueFilter
RowFilter
QualifierFilter
ColumnRangeFilter
ValueFilter
PrefixFilter
SingleColumnValueExcludeFilter
ColumnCountGetFilter
InclusiveStopFilter
DependentColumnFilter
FirstKeyOnlyFilter
KeyOnlyFilter
- namespace: create, list, alter, and drop namespaces
- create_namespace: create a namespace
Usage: create_namespace 'name'
hbase(main):032:0> create_namespace 'userdb' # create the namespace
0 row(s) in 0.9700 seconds
- list_namespace: list namespaces
hbase(main):033:0> list_namespace # list namespaces
NAMESPACE
default
hbase
userdb
3 row(s) in 0.0520 seconds
hbase(main):034:0>
hbase(main):079:0> list_namespace 'userdb'
NAMESPACE
userdb
1 row(s) in 0.0050 seconds
hbase(main):080:0>
- describe_namespace: show a namespace's description
hbase(main):082:0>
hbase(main):082:0> describe_namespace 'userdb'
DESCRIPTION
{NAME => 'userdb'}
1 row(s) in 0.0130 seconds
hbase(main):083:0>
- alter_namespace: alter a namespace
alter_namespace 'ns1', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
alter_namespace 'ns1', {METHOD => 'unset', NAME=>'PROPERTY_NAME'}
- list_namespace_tables: list all tables in a namespace
hbase(main):089:0> list_namespace_tables 'default'
TABLE
BASE
TestTable
test
testtable
4 row(s) in 0.0190 seconds
hbase(main):090:0>
- drop_namespace: delete a namespace
hbase(main):090:0> list_namespace
NAMESPACE
default
hbase
ns1
testNamespace
userdb
5 row(s) in 0.0310 seconds
hbase(main):091:0> drop_namespace 'ns1'
0 row(s) in 0.0620 seconds
hbase(main):092:0> list_namespace
NAMESPACE
default
hbase
testNamespace
userdb
4 row(s) in 0.0060 seconds
hbase(main):093:0>
Example:
Operations on a table in the userdb namespace:
hbase(main):034:0>
hbase(main):035:0* create 'userdb:group','info','data' # create the table
0 row(s) in 1.2640 seconds
=> Hbase::Table - userdb:group
hbase(main):036:0> put 'userdb:group','rk001','info:name','lisi' # insert data
0 row(s) in 0.0240 seconds
hbase(main):037:0> get 'userdb:group','rk001' # query one row
COLUMN CELL
info:name timestamp=1506437812150, value=lisi
1 row(s) in 0.0090 seconds
hbase(main):038:0> scan 'userdb:group' # query all data
ROW COLUMN+CELL
rk001 column=info:name, timestamp=1506437812150, value=lisi
1 row(s) in 0.0270 seconds
hbase(main):039:0> drop 'userdb:group'
ERROR: Table userdb:group is enabled. Disable it first.
Here is some help for this command:
Drop the named table. Table must first be disabled:
hbase> drop 't1'
hbase> drop 'ns1:t1'
hbase(main):040:0> disable 'userdb:group' # disable the table
0 row(s) in 2.3340 seconds
hbase(main):041:0> drop 'userdb:group' # drop the table
0 row(s) in 1.2680 seconds
hbase(main):042:0> drop_namespace 'userdb' # drop the namespace
0 row(s) in 0.8960 seconds
hbase(main):043:0> list_namespace
NAMESPACE
default
hbase
2 row(s) in 0.0240 seconds
hbase(main):044:0>
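The transcript addresses tables as 'namespace:table', while names without a colon (such as 'user') show up under the default namespace in list_namespace_tables. A small Ruby sketch of that naming convention follows; split_table_name is a hypothetical helper written for illustration, not an HBase shell function.

```ruby
# Hypothetical helper: split a table name into [namespace, table].
# Names without a colon are treated as belonging to 'default'.
def split_table_name(full_name)
  namespace, separator, table = full_name.partition(':')
  separator.empty? ? ['default', full_name] : [namespace, table]
end

puts split_table_name('userdb:group').join(' / ')
puts split_table_name('user').join(' / ')
```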
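- create: create a table
Usage: create 'table_name', 'column_family_1', 'column_family_2', ...
Create the user table with two column families, info and data
create 'user', 'info', 'data'
Or create it with a column family that keeps up to 3 versions
create 'user', {NAME => 'info', VERSIONS => '3'}
Example: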
hbase(main):004:0>
hbase(main):004:0>
hbase(main):005:0* list
TABLE
0 row(s) in 0.0930 seconds
=> []
hbase(main):006:0> create 'user','info','data'
0 row(s) in 2.5370 seconds
=> Hbase::Table - user
hbase(main):007:0> list
TABLE
user
1 row(s) in 0.0110 seconds
=> ["user"]
hbase(main):008:0>
Or:
hbase(main):008:0> create 'user1',{NAME=>'info'},{NAME=>'data'}
0 row(s) in 1.2600 seconds
=> Hbase::Table - user1
hbase(main):009:0> list
TABLE
user
user1
2 row(s) in 0.0110 seconds
=> ["user", "user1"]
hbase(main):010:0>
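- put: insert data into a table
Insert into the user table: row key rk0001, column qualifier name in column family info, value zhangsan
put 'user', 'rk0001', 'info:name', 'zhangsan'
Insert into the user table: row key rk0001, column qualifier gender in column family info, value female
put 'user', 'rk0001', 'info:gender', 'female'
Insert into the user table: row key rk0001, column qualifier age in column family info, value 20
put 'user', 'rk0001', 'info:age', 20
Insert into the user table: row key rk0001, column qualifier pic in column family data, value picture
put 'user', 'rk0001', 'data:pic', 'picture'
Example: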
hbase(main):033:0> put 'user','rk001','info:name','zhangsan'
0 row(s) in 0.2250 seconds
hbase(main):034:0> put 'user','rk001','info:sex','male'
0 row(s) in 0.0150 seconds
hbase(main):035:0> put 'user','rk001','info:age','20'
0 row(s) in 0.0090 seconds
hbase(main):036:0> put 'user','rk001','data:pic','1111'
0 row(s) in 0.0130 seconds
hbase(main):037:0>
append: append a value to the end of a cell's current value (the cell is created if it does not exist yet):
hbase(main):107:0> append 'demo','row-002','info:age','23'
0 row(s) in 0.0140 seconds
hbase(main):108:0> scan 'demo'
ROW COLUMN+CELL
row-001 column=info:name, timestamp=1534193888317, value=tom
row-002 column=info:age, timestamp=1534194043895, value=23
2 row(s) in 0.0150 seconds
hbase(main):109:0>
- get: query data
get usage:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
Get all data for row key rk0001 in the user table
get 'user', 'rk0001'
Example:
hbase(main):037:0> get 'user','rk001'
COLUMN CELL
data:pic timestamp=1506430730921, value=1111
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.1140 seconds
hbase(main):038:0>
hbase(main):039:0*
Get all data in the info column family for row key rk0001 in the user table
get 'user', 'rk0001', 'info'
Example:
hbase(main):038:0>
hbase(main):039:0* get 'user','rk001','info'
COLUMN CELL
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.0290 seconds
hbase(main):040:0>
Get the name and age column qualifiers of the info column family for row key rk0001 in the user table
get 'user', 'rk0001', 'info:name', 'info:age'
Example:
hbase(main):040:0> get 'user','rk001','info:name','info:age','info:sex'
COLUMN CELL
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.0710 seconds
hbase(main):041:0>
Get the info and data column families for row key rk0001 in the user table
get 'user', 'rk0001', 'info', 'data'
get 'user', 'rk0001', {COLUMN => ['info', 'data']}
get 'user', 'rk0001', {COLUMN => ['info:name', 'data:pic']}
Example:
hbase(main):042:0> get 'user','rk001',{COLUMNS=>'info'}
COLUMN CELL
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.0300 seconds
hbase(main):043:0> get 'user','rk001',{COLUMNS=>['info','data']}
COLUMN CELL
data:pic timestamp=1506430730921, value=1111
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.0220 seconds
hbase(main):044:0>
Get the most recent versions (up to 2 or 5 in the commands below) of the info column family for row key rk0001 in the user table, optionally limited to a time range
get 'user', 'rk0001', {COLUMN => 'info', VERSIONS => 2}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5}
get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5, TIMERANGE => [1392368783980, 1392380169184]}
Example:
hbase(main):045:0> get 'user','rk001',{COLUMN=>'info:name',VERSION=>5,TIMERANGE=>[1506430666242,1506430708460]}
COLUMN CELL
info:name timestamp=1506430666242, value=zhangsan
1 row(s) in 0.0140 seconds
hbase(main):046:0>
Get the cells of row key rk0001 in the user table whose value is zhangsan
get 'user', 'rk0001', {FILTER => "ValueFilter(=, 'binary:zhangsan')"}
Get the cells of row key rk0001 in the user table whose column qualifier contains the letter a
get 'user', 'rk0001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
Example:
hbase(main):046:0> get 'user', 'rk001', {FILTER => "(QualifierFilter(=,'substring:a'))"}
COLUMN CELL
info:age timestamp=1506430708460, value=20
info:name timestamp=1506430666242, value=zhangsan
1 row(s) in 0.0650 seconds
hbase(main):047:0>
- scan: query all data in a table
scan usage:
Some examples:
hbase> scan 'hbase:meta'
hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
hbase> scan 't1', {REVERSED => true}
hbase> scan 't1', {ALL_METRICS => true}
hbase> scan 't1', {METRICS => ['RPC_RETRIES', 'ROWS_FILTERED']}
hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => "
(QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
hbase> scan 't1', {FILTER =>
org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
For setting the Operation Attributes
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Scan all data in a table
Scan every row of the user table
scan 'user'
Example:
hbase(main):051:0> scan 'user'
ROW COLUMN+CELL
rk001 column=data:pic, timestamp=1506430730921, value=1111
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.0570 seconds
hbase(main):052:0>
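Scan the info column family of the user table
scan 'user', {COLUMNS => 'info'}
Scan the info column family of the user table, returning up to the 5 most recent versions of each cell (RAW => true also returns delete markers and deleted cells that have not yet been cleaned up)
scan 'user', {COLUMNS => 'info', RAW => true, VERSIONS => 5}
Scan the info and data column families of the user table
scan 'user', {COLUMNS => ['info', 'data']}
scan 'user', {COLUMNS => ['info:name', 'data:pic']}
Example: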
hbase(main):052:0> scan 'user',{COLUMNS=>'info',VERSION=>5}
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.0190 seconds
hbase(main):053:0> scan 'user',{COLUMNS=>['info:age','data:pic']}
ROW COLUMN+CELL
rk001 column=data:pic, timestamp=1506430730921, value=1111
rk001 column=info:age, timestamp=1506430708460, value=20
1 row(s) in 0.0210 seconds
hbase(main):054:0>
- QualifierFilter:
QualifierFilter filters on the column name (qualifier), not on the cell value
scan 'qy',{FILTER=>"PrefixFilter('t') AND QualifierFilter(>=,'binary:b')"}
Scan the info and data column families of the user table for column qualifiers containing the letter a
scan 'user', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=,'substring:a'))"}
Example:
hbase(main):054:0> scan 'user',{COLUMNS=>['info','data'],FILTER=>"(QualifierFilter(=,'substring:a'))"}
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506430708460, value=20
1 row(s) in 0.0290 seconds
hbase(main):055:0>
- Row key range:
Scan the info column family of the user table for row keys in the range [rk0001, rk0003)
scan 'user', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
Example:
hbase(main):055:0> scan 'user',{COLUMNS=>'info',STARTROW=>'rk001',ENDROW=>'rk003'}
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.0210 seconds
hbase(main):056:0>
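The range notation [rk0001, rk0003) means the start row is included and the end row is excluded. A minimal Ruby sketch of that selection over a few made-up row keys, using byte-wise string comparison:

```ruby
# Made-up row keys for illustration, in sorted order.
row_keys = %w[rk001 rk002 rk003 rk004]

start_row = 'rk001' # included
end_row   = 'rk003' # excluded

# Half-open interval: start_row <= key < end_row.
in_range = row_keys.select { |key| key >= start_row && key < end_row }
puts in_range
```

To include rk003 in the result, the end row would have to be a key that sorts just after it.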
- PrefixFilter: filter by row key prefix
Scan the user table for row keys starting with rk
scan 'user',{FILTER=>"PrefixFilter('rk')"}
Example:
hbase(main):057:0> scan 'user',{FILTER=>"PrefixFilter('rk')"}
ROW COLUMN+CELL
rk001 column=data:pic, timestamp=1506430730921, value=1111
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.0330 seconds
hbase(main):058:0>
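- TimestampsFilter: timestamp filter
TimestampsFilter keeps cells whose timestamp equals one of the listed values; use TIMERANGE to select a range of timestamps
scan 'qy',{FILTER=>"TimestampsFilter(1448069941270,1548069941230)" }
scan 'qy',{FILTER=>"(QualifierFilter(>=,'binary:b')) AND (TimestampsFilter(1348069941270,1548069941270))" }
Scan the user table for cells within a given timestamp range
scan 'user', {TIMERANGE => [1392368783980, 1392380169184]}
Example: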
hbase(main):058:0>
hbase(main):058:0> scan 'user',{TIMERANGE=>[1506430691490,1506430730921]}
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.0190 seconds
hbase(main):059:0>
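In the transcript above, data:pic (timestamp 1506430730921) is missing from the result even though its timestamp equals the upper bound, which is consistent with TIMERANGE being a half-open interval [min, max). A Ruby sketch using the timestamps from the transcripts; the hash layout is only for illustration:

```ruby
# Cell timestamps (milliseconds since the epoch) from the transcripts above.
cells = {
  'info:sex' => 1506430691490,
  'info:age' => 1506430708460,
  'data:pic' => 1506430730921
}

min_ts = 1506430691490 # inclusive lower bound
max_ts = 1506430730921 # exclusive upper bound

selected = cells.select { |_column, ts| ts >= min_ts && ts < max_ts }.keys
puts selected

# Timestamps are in milliseconds; divide by 1000 to get a readable time.
lower_bound_time = Time.at(min_ts / 1000).utc
puts lower_bound_time
```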
Scan the user table for row keys starting with rk, returning only one row
hbase(main):059:0>
hbase(main):059:0> scan 'user',{ FILTER=>"PrefixFilter('rk')",LIMIT=>1 }
- SingleColumnValueFilter: filter on a column's value
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes
scan 't1', { COLUMNS => 'family:qualifier', FILTER =>
  SingleColumnValueFilter.new(
    Bytes.toBytes('family'),
    Bytes.toBytes('qualifier'),
    CompareFilter::CompareOp.valueOf('EQUAL'),
    Bytes.toBytes("value"))
}
scan 'LOG20180108',{STARTROW=>'T0',ENDROW=>'TZ',LIMIT=>10,COLUMNS => 'INFO:AreaCode', FILTER => org.apache.hadoop.hbase.filter.SingleColumnValueFilter.new(org.apache.hadoop.hbase.util.Bytes.toBytes('INFO'),org.apache.hadoop.hbase.util.Bytes.toBytes('AreaCode'),org.apache.hadoop.hbase.filter.CompareFilter::CompareOp.valueOf('EQUAL'),org.apache.hadoop.hbase.util.Bytes.toBytes("6101"))}
- ColumnPaginationFilter
scan 'qy',{FILTER=>org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(2,0)}
- Multi-condition queries
# Using scan
hbase(main)>scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => '052072986'}
hbase(main)>scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
hbase(main)>scan 't1', { COLUMNS =>[ 'c1','c2'], LIMIT => 2,STARTROW => '052072986'}
hbase(main)>scan 't1', {FILTER => "(PrefixFilter ('00036')) AND (SingleColumnValueFilter ('family','qualifier',>=,'binary:20141208'))"}
hbase(main)>scan 't1', {FILTER => "(PrefixFilter ('00036')) AND (SingleColumnValueFilter ('family','qualifier',>=,'binary:20141208')) AND (SingleColumnValueFilter ('family','qualifier',<,'binary:20150101'))"}
- Paged table scans
# Syntax: scan <table>, {COLUMNS => [ <family:column>,.... ], LIMIT => num}
# STARTROW, TIMERANGE, FILTER, and other advanced options can also be added
# For example, there are two ways to scan the first 5 rows of table t1:
hbase(main)> scan 't1',{LIMIT=>5}
hbase(main)> scan 't1', { COLUMNS => 'family', FILTER => org.apache.hadoop.hbase.filter.PageFilter.new(5)}
Filter queries:
scan 'LOG20180108',{STARTROW=>'T0',ENDROW=>'TZ',LIMIT=>10,COLUMNS => 'INFO:AreaCode', FILTER =>}
scan 'LOG20180108',{ FILTER=>"PrefixFilter('T') AND QualifierFilter(=,'substring:a')" }
scan 'LOG20180108',{COLUMNS =>'INFO',LIMIT=>1,FILTER=>"PrefixFilter('T') AND QualifierFilter(>=,'binary:AreaCode') AND ValueFilter(=,'substring:610113')"}
scan 'BASE',{FILTER=>"(QualifierFilter(=,'binary:FaceClusterID'))"}
scan 'LOG20180108',{COLUMNS =>'INFO',LIMIT=>1,FILTER=>"PrefixFilter('T') AND QualifierFilter(>=,'binary:AreaCode') AND ValueFilter(=,'binary:610113')"}
scan 'LOG20180108', {LIMIT=>1,FILTER=>"ValueFilter(=,'substring:610113')"}
scan 'LOG20180227',{FILTER=>"PrefixFilter('U1020004') AND QualifierFilter(=,'binary:ManufacturerCode') AND ValueFilter(=,'binary:335539067')"}
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes
scan 'LOG20180108', { COLUMNS => 'INFO',LIMIT=>1, FILTER =>SingleColumnValueFilter.new(Bytes.toBytes('INFO'),Bytes.toBytes('AreaCode'),CompareFilter::CompareOp.valueOf('EQUAL'),Bytes.toBytes('610113')) }
echo "scan 'LOG20180108',{COLUMNS => 'INFO',LIMIT=>1,FILTER=>\"(PrefixFilter('T')) AND (SingleColumnValueFilter('INFO','AreaCode',=,'binary:610103'))\"}" | hbase shell >>/home/hive/logterminal-min.csv
scan 'STG20180703',{STARTROW=>'WEBWXK266WUT_29064004',ENDROW=>'WEBWXMZQUCUK_29064024',FILTER=>"(SingleColumnValueFilter('INFO','Time',>=,'binary:2018-07-03T10:20:36Z')) AND (SingleColumnValueFilter('INFO','Time',<=,'binary:2018-07-03T10:20:56Z'))"}
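Comparators:
(1) Binary comparator: for example 'binary:abc', compares against 'abc' in lexicographic (byte) order
(2) Binary prefix comparator: for example 'binaryprefix:abc', compares only the first 3 bytes against 'abc' in lexicographic order
(3) Regular expression comparator: for example 'regexstring:ab*yz', matches values against the regular expression ab*yz (an a, then zero or more b, then yz). This comparator can only be used with the = and != operators.
(4) Substring comparator: for example 'substring:abc123', matches values that contain abc123 (case-insensitive). This comparator can also only be used with = and !=. RegexStringComparator and SubstringComparator support only EQUAL/NOT_EQUAL because their compareTo methods return 0 on a match and 1 otherwise.
Comparison operators: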
1. LESS (<)
2. LESS_OR_EQUAL (<=)
3. EQUAL (=)
4. NOT_EQUAL (!=)
5. GREATER_OR_EQUAL (>=)
6. GREATER (>)
7. NO_OP (no operation)
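The four comparators can be approximated in plain Ruby to get a feel for what each one matches. These helper methods are hypothetical illustrations, not HBase APIs, and the unanchored regex match is an assumption about how the regex comparator searches the value.

```ruby
# 'binary:...': three-way byte-wise comparison (-1, 0, or 1), so all
# comparison operators are meaningful.
def binary_compare(value, operand)
  value <=> operand
end

# 'binaryprefix:...' with =: only the leading bytes are compared.
def binary_prefix_equal?(value, prefix)
  value.start_with?(prefix)
end

# 'substring:...' with =: case-insensitive containment.
def substring_equal?(value, needle)
  value.downcase.include?(needle.downcase)
end

# 'regexstring:...' with =: the pattern may match anywhere in the value
# (assumption: unanchored search).
def regex_equal?(value, source)
  !Regexp.new(source).match(value).nil?
end

puts binary_compare('abd', 'abc')
puts binary_prefix_equal?('abcdef', 'abc')
puts substring_equal?('xxABC123yy', 'abc123')
puts regex_equal?('abbbyz', 'ab*yz')
```

The last three helpers only answer yes or no, which is why those comparators are limited to = and !=.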
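- RowFilter: filter by row key
RowFilter supports greater than, greater than or equal, less than, less than or equal, equal, and not equal.
Query the row whose key is row-00099:
BinaryComparator (binary comparator):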
hbase(main):022:0>
hbase(main):021:0* scan 'test',{FILTER=>"RowFilter(=,'binary:row-00099')"}
ROW COLUMN+CELL
row-00099 column=info:name, timestamp=1533670912855, value=99
1 row(s) in 0.0100 seconds
hbase(main):022:0>
BinaryPrefixComparator (binary prefix comparator):
hbase(main):072:0>
hbase(main):072:0> scan 'test',{FILTER=>"RowFilter(=,'binaryprefix:row')",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533839984134, value=0
row-0001 column=info:age, timestamp=1533828626975, value=23
row-0001 column=info:birthday, timestamp=1533828651988, value=2018-08-10
row-0001 column=info:gender, timestamp=1533828664954, value=1
row-0001 column=info:name, timestamp=1533839984134, value=1
row-0001 column=info:price, timestamp=1533828549983, value=23435345
2 row(s) in 0.0170 seconds
hbase(main):073:0>
SubstringComparator (substring comparator):
hbase(main):022:0> scan 'test',{FILTER=>"RowFilter(=,'substring:row-00099')"}
ROW COLUMN+CELL
row-00099 column=info:name, timestamp=1533670912855, value=99
1 row(s) in 0.0080 seconds
hbase(main):023:0>
RegexStringComparator (regular expression comparator):
hbase(main):071:0>
hbase(main):071:0> scan 'test',{FILTER=>"RowFilter(=,'regexstring:row-*')",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533839984134, value=0
row-0001 column=info:age, timestamp=1533828626975, value=23
row-0001 column=info:birthday, timestamp=1533828651988, value=2018-08-10
row-0001 column=info:gender, timestamp=1533828664954, value=1
row-0001 column=info:name, timestamp=1533839984134, value=1
row-0001 column=info:price, timestamp=1533828549983, value=23435345
2 row(s) in 0.0290 seconds
hbase(main):072:0>
Note: substring can only be used with equal and not equal. BitComparator, RegexStringComparator, and SubstringComparator support only EQUAL/NOT_EQUAL, because their compareTo methods return 0 on a match and 1 otherwise.
- FamilyFilter: filter by column family
Scan table test for cells whose column family name contains in.
hbase(main):031:0>
hbase(main):031:0> scan 'test',FILTER=>"FamilyFilter(=,'substring:in')"
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
r
hbase(main):034:0>
hbase(main):034:0> scan 'test',FILTER=>"FamilyFilter(>=,'binary:IN')"
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
- QualifierFilter: filter by column qualifier
Scan table test for cells whose column qualifier matches name.
hbase(main):035:0>
hbase(main):035:0> scan 'test',FILTER=>"QualifierFilter(>=,'binary:NAME')"
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
hbase(main):036:0>
hbase(main):037:0* scan 'test',FILTER=>"QualifierFilter(=,'substring:NAME')"
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
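Both scans above return the lowercase qualifier name even though the filter argument is uppercase NAME. A short Ruby sketch of why: the binary comparison with >= is byte-wise, and lowercase letters sort after uppercase ones, while the substring comparison ignores case. The variable names are illustrative only.

```ruby
qualifier = 'name'

# Byte-wise comparison: 'n' (0x6E) sorts after 'N' (0x4E),
# so 'name' >= 'NAME' holds.
binary_ge = qualifier >= 'NAME'

# Case-insensitive containment, as with 'substring:NAME'.
substring_match = qualifier.downcase.include?('NAME'.downcase)

# An exact binary equality test would not match.
binary_eq = qualifier == 'NAME'

puts binary_ge
puts substring_match
puts binary_eq
```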
- ValueFilter: filter by value
Scan table test for cells whose value is 89.
hbase(main):006:0* scan 'test',{FILTER=>"ValueFilter(=,'binary:89')"}
ROW COLUMN+CELL
row-00089 column=info:name, timestamp=1533670912855, value=89
1 row(s) in 0.0190 seconds
hbase(main):007:0>
hbase(main):040:0> scan 'test',FILTER=>"ValueFilter(=,'substring:90')"
ROW COLUMN+CELL
row-00090 column=info:name, timestamp=1533670912855, value=90
1 row(s) in 0.0060 seconds
hbase(main):041:0>
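The command below scans table test, filtering on value and matching cells whose value contains 111. It is run non-interactively by piping the command to hbase shell: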
[[email protected] ~]# echo "scan 'test', FILTER=>\"ValueFilter(=,'substring:111')\"" | hbase shell
18/08/09 06:25:46 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0-cdh5.14.2, rUnknown, Tue Mar 27 13:32:17 PDT 2018
scan 'test', FILTER=>"ValueFilter(=,'substring:111')"
ROW COLUMN+CELL
0 row(s) in 0.1680 seconds
[[email protected] ~]#
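- SingleColumnValueFilter: filter by the value of a single column
Usage:
scan 'table name',FILTER=>"(SingleColumnValueFilter('column family','column name',comparison operator,'binary|substring:value'))"
Which comparison operators are available depends on the comparator: substring supports only equal and not-equal; binary supports all comparison operators.
Query rows whose name column value is greater than or equal to 89 and less than or equal to 99 (values are compared as byte strings).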
hbase(main):001:0>
hbase(main):001:0> scan 'test',{FILTER=>"(SingleColumnValueFilter('info','name',>=,'binary:89') AND SingleColumnValueFilter('info','name',<=,'binary:99'))"}
ROW COLUMN+CELL
row-00089 column=info:name, timestamp=1533670912855, value=89
row-0009 column=info:name, timestamp=1533670912855, value=9
row-00090 column=info:name, timestamp=1533670912855, value=90
row-00091 column=info:name, timestamp=1533670912855, value=91
row-00092 column=info:name, timestamp=1533670912855, value=92
row-00093 column=info:name, timestamp=1533670912855, value=93
row-00094 column=info:name, timestamp=1533670912855, value=94
row-00095 column=info:name, timestamp=1533670912855, value=95
row-00096 column=info:name, timestamp=1533670912855, value=96
row-00097 column=info:name, timestamp=1533670912855, value=97
row-00098 column=info:name, timestamp=1533670912855, value=98
row-00099 column=info:name, timestamp=1533670912855, value=99
12 row(s) in 0.2310 seconds
hbase(main):002:0>
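The result above contains 12 rows, including row-0009 with value 9, although 9 is numerically outside 89..99. This is what one would expect if the binary comparator orders values byte by byte rather than numerically. A minimal Python sketch of that assumption (`in_binary_range` is a hypothetical helper):

```python
def in_binary_range(value: bytes, low: bytes, high: bytes) -> bool:
    """Byte-wise (lexicographic) range check, modelling >= low AND <= high."""
    return low <= value <= high


values = [b"8", b"9", b"89", b"90", b"99", b"100"]
print([v.decode() for v in values if in_binary_range(v, b"89", b"99")])
# ['9', '89', '90', '99']
```

If numeric ordering is needed, one common approach is to store values zero-padded to a fixed width (for example 009, 089, 099) so that byte order and numeric order agree.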
Substring comparison: only equal and not-equal are supported; other operators raise an error.
hbase(main):004:0> scan 'test',{FILTER=>"(SingleColumnValueFilter('info','name',=,'substring:89') OR SingleColumnValueFilter('info','name',=,'substring:99'))"}
ROW COLUMN+CELL
row-00089 column=info:name, timestamp=1533670912855, value=89
row-00099 column=info:name, timestamp=1533670912855, value=99
2 row(s) in 0.0100 seconds
hbase(main):005:0>
Comparison using a regular expression:
hbase(main):001:0> scan 'LOG20180814',{FILTER=>"SingleColumnValueFilter('INFO','Time',=,'regexstring:2018-08-14')",LIMIT=>1}
ROW COLUMN+CELL
A1021990_-_18125338_216659252_314 column=INFO:AccessSystemID, timestamp=1534219642000, value=2289318359
0999456
A1021990_-_18125338_216659252_314 column=INFO:Address, timestamp=1534219642000, value=\xE7\x8F\xA0\xE6\xB5\xB7\xE5\xB8\x82\xE5\x90\x
0999456 89\xE5\xA4\xA7\xE6\x96\xB0\xE5\x8C\xBA\xE7\x99\xBD\xE8\x8E\xB2\xE8\xB7\xAF100\xE5\x8F\xB7
A1021990_-_18125338_216659252_314 column=INFO:AreaCode, timestamp=1534219642000, value=440402
0999456
A1021990_-_18125338_216659252_314 column=INFO:AuthAccount, timestamp=1534219642000, value=-
0999456
A1021990_-_18125338_216659252_314 column=INFO:AuthType, timestamp=1534219642000, value=1021990
0999456
A1021990_-_18125338_216659252_314 column=INFO:CertCode, timestamp=1534219642000, value=-
- PrefixFilter: filter by row key prefix
Query table test for rows whose row key starts with row, returning the first 2.
hbase(main):007:0> scan 'test',{FILTER=>"PrefixFilter('row')",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
2 row(s) in 0.0170 seconds
hbase(main):008:0>
- PageFilter: pagination filter
Usage: scan 'table name',{FILTER=>"PageFilter(maximum number of rows to return)"}
hbase(main):009:0> scan 'test',{FILTER=>"PageFilter(2)"}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
2 row(s) in 0.0150 seconds
hbase(main):010:0>
Query by row key prefix, returning 4 rows.
hbase(main):010:0> scan 'test',{FILTER=>"PrefixFilter('row') AND PageFilter(4)"}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
4 row(s) in 0.0080 seconds
hbase(main):011:0>
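- KeyOnlyFilter: key-only filter
Returns only the key part of each cell (row key, column, timestamp) and drops the value. The argument defaults to false. With false, the value is returned as a zero-length byte array; with true, the value is replaced by a 4-byte array holding the length of the original value.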
hbase(main):014:0> scan 'test',{FILTER=>"KeyOnlyFilter(false)",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=
row-0001 column=info:name, timestamp=1533670912855, value=
2 row(s) in 0.0130 seconds
hbase(main):015:0>
hbase(main):015:0> scan 'test',{FILTER=>"KeyOnlyFilter(true)",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=\x00\x00\x00\x01
row-0001 column=info:name, timestamp=1533670912855, value=\x00\x00\x00\x01
2 row(s) in 0.0090 seconds
hbase(main):016:0>
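The transcript above shows value=\x00\x00\x00\x01 for KeyOnlyFilter(true), while the original values of these rows are the one-byte strings 0 and 1. That is consistent with the value being replaced by a 4-byte big-endian integer holding the original value length. A small Python model of this reading (`key_only_value` is a hypothetical name, not HBase code):

```python
import struct


def key_only_value(original_value: bytes, len_as_val: bool) -> bytes:
    """Model of the value returned under KeyOnlyFilter (an assumption)."""
    if len_as_val:
        # 4-byte big-endian integer holding the original value length
        return struct.pack(">i", len(original_value))
    # KeyOnlyFilter(false): the value is dropped entirely
    return b""


print(key_only_value(b"0", False))  # b''
print(key_only_value(b"0", True))   # b'\x00\x00\x00\x01'
```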
- FirstKeyOnlyFilter: first-key-only filter
Returns only the first column (the first KeyValue) of each row.
hbase(main):017:0> scan 'test',{FILTER=>"FirstKeyOnlyFilter()"}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
row-00012 column=info:name, timestamp=1533670912855, value=12
row-00013 column=info:name, timestamp=1533670912855, value=13
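- InclusiveStopFilter: filter with an inclusive stop row
Sets the row at which the scan stops; the stop row itself is included in the result.
Scan table test and stop after the row with row key row-0002.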
hbase(main):029:0> scan 'test',{FILTER=>"InclusiveStopFilter('row-0002')"}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
row-00012 column=info:name, timestamp=1533670912855, value=12
row-00013 column=info:name, timestamp=1533670912855, value=13
row-00014 column=info:name, timestamp=1533670912855, value=14
row-00015 column=info:name, timestamp=1533670912855, value=15
row-00016 column=info:name, timestamp=1533670912855, value=16
row-00017 column=info:name, timestamp=1533670912855, value=17
row-00018 column=info:name, timestamp=1533670912855, value=18
row-00019 column=info:name, timestamp=1533670912855, value=19
row-0002 column=info:name, timestamp=1533670912855, value=2
13 row(s) in 0.0120 seconds
hbase(main):030:0>
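The scan above returns row-00010 through row-00019 before row-0002 because row keys are ordered byte by byte, not numerically. The following Python sketch models an inclusive stop row over sorted keys; `scan_inclusive_stop` is a hypothetical helper and the key list is made up for illustration.

```python
def scan_inclusive_stop(row_keys, stop_row):
    """Return row keys in byte order up to and including stop_row."""
    result = []
    for key in sorted(row_keys):
        if key > stop_row:
            break
        result.append(key)
    return result


keys = [b"row-0002", b"row-0000", b"row-00010", b"row-0001", b"row-00019", b"row-0003"]
print(scan_inclusive_stop(keys, b"row-0002"))
# [b'row-0000', b'row-0001', b'row-00010', b'row-00019', b'row-0002']
```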
- TimestampsFilter: timestamp filter
Returns cells whose timestamp equals one of the specified timestamps.
hbase(main):033:0>
hbase(main):033:0> scan 'test',{FILTER=>"TimestampsFilter(1533670912855,1533670912859)"}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
row-00010 column=info:name, timestamp=1533670912855, value=10
row-00011 column=info:name, timestamp=1533670912855, value=11
row-00012 column=info:name, timestamp=1533670912855, value=12
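Note that the arguments are a list of exact timestamps, not a range: every cell shown above has timestamp 1533670912855, the first argument. A minimal Python sketch of this matching rule, with a hypothetical `timestamps_filter` helper and cells taken from the transcripts in this section:

```python
def timestamps_filter(cells, timestamps):
    """Keep only cells whose timestamp is exactly one of the given values."""
    wanted = set(timestamps)
    return [cell for cell in cells if cell["timestamp"] in wanted]


cells = [
    {"row": "row-0000", "column": "info:name", "timestamp": 1533670912855},
    {"row": "row-0001", "column": "info:name", "timestamp": 1533670912855},
    {"row": "row-0001", "column": "info:price", "timestamp": 1533828549983},
]
kept = timestamps_filter(cells, [1533670912855, 1533670912859])
print([(c["row"], c["column"]) for c in kept])
# [('row-0000', 'info:name'), ('row-0001', 'info:name')]
```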
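- ColumnCountGetFilter: column count filter
Returns only the first N columns of a row; once more than the specified number of columns has been read, the scan stops.
Scan table test with a column limit of 2.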
hbase(main):034:0> scan 'test',{FILTER=>"ColumnCountGetFilter(2)",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
2 row(s) in 0.0110 seconds
hbase(main):035:0>
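- ColumnPaginationFilter: column pagination filter
ColumnPaginationFilter(int limit, int offset): limit is the maximum number of columns returned per row; offset is the number of columns skipped at the start of each row.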
hbase(main):049:0>
hbase(main):048:0> get 'test','row-0001'
COLUMN CELL
info:age timestamp=1533828626975, value=23
info:birthday timestamp=1533828651988, value=2018-08-10
info:gender timestamp=1533828664954, value=1
info:name timestamp=1533670912855, value=1
info:price timestamp=1533828549983, value=23435345
5 row(s) in 0.0070 seconds
hbase(main):049:0>
Skip the first column of each row and return at most 2 columns.
hbase(main):049:0> scan 'test',{FILTER=>"ColumnPaginationFilter(2,1)"}
ROW COLUMN+CELL
row-0001 column=info:birthday, timestamp=1533828651988, value=2018-08-10
row-0001 column=info:gender, timestamp=1533828664954, value=1
1 row(s) in 0.0070 seconds
hbase(main):050:0>
Skip the first column of each row and return at most 4 columns.
hbase(main):051:0> scan 'test',{FILTER=>"ColumnPaginationFilter(4,1)"}
ROW COLUMN+CELL
row-0001 column=info:birthday, timestamp=1533828651988, value=2018-08-10
row-0001 column=info:gender, timestamp=1533828664954, value=1
row-0001 column=info:name, timestamp=1533670912855, value=1
row-0001 column=info:price, timestamp=1533828549983, value=23435345
1 row(s) in 0.0060 seconds
hbase(main):052:0>
Skip the first 3 columns of each row and return at most 4 columns.
hbase(main):053:0> scan 'test',{FILTER=>"ColumnPaginationFilter(4,3)"}
ROW COLUMN+CELL
row-0001 column=info:name, timestamp=1533670912855, value=1
row-0001 column=info:price, timestamp=1533828549983, value=23435345
1 row(s) in 0.0050 seconds
hbase(main):054:0>
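The three results above behave like a slice over the sorted columns of row-0001. The following Python sketch reproduces them; `column_page` is a hypothetical helper, and the column list is the output of get 'test','row-0001' shown earlier.

```python
def column_page(columns, limit, offset):
    """Skip `offset` columns of a row, then return at most `limit` columns."""
    return sorted(columns)[offset:offset + limit]


row_0001 = ["info:age", "info:birthday", "info:gender", "info:name", "info:price"]
print(column_page(row_0001, 2, 1))  # ['info:birthday', 'info:gender']
print(column_page(row_0001, 4, 1))  # ['info:birthday', 'info:gender', 'info:name', 'info:price']
print(column_page(row_0001, 4, 3))  # ['info:name', 'info:price']
```

Rows that have only one column produce an empty slice once offset is 1 or more, which would explain why only row-0001 appears in the scans above.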
- ColumnPrefixFilter: column prefix filter
Returns cells whose column name starts with the given prefix.
Query table test for columns whose name starts with na.
hbase(main):065:0> scan 'test', {FILTER=>"ColumnPrefixFilter('na')",LIMIT=>2}
ROW COLUMN+CELL
row-0000 column=info:name, timestamp=1533670912855, value=0
row-0001 column=info:name, timestamp=1533670912855, value=1
2 row(s) in 0.0110 seconds
hbase(main):066:0>
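- SkipFilter: skip filter
SkipFilter wraps another filter and checks its result for every cell of a row; if the wrapped filter rejects any cell, the entire row is filtered out.
- MultipleColumnPrefixFilter: filter by multiple column name prefixes
Filters by a set of column name prefixes. Query table test for columns whose name starts with a or b.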
hbase(main):015:0>
hbase(main):016:0* scan 'test',{STARTROW=>'row-0001',STOPROW=>'row-0003',FILTER=>"MultipleColumnPrefixFilter('a','b')"}
ROW COLUMN+CELL
row-0001 column=info:age, timestamp=1533828626975, value=23
row-0001 column=info:birthday, timestamp=1533828651988, value=2018-08-10
row-00010 column=info:age, timestamp=1533840214424, value=23
2 row(s) in 0.0090 seconds
hbase(main):017:0>
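The output above contains info:age and info:birthday but not info:gender, info:name or info:price, which suggests that the arguments are treated as a set of alternative prefixes rather than as a range from a to b. A Python sketch of that reading (`multiple_column_prefix` is a hypothetical helper):

```python
def multiple_column_prefix(qualifiers, prefixes):
    """Keep column names that start with any of the given prefixes."""
    return [q for q in qualifiers if q.startswith(tuple(prefixes))]


qualifiers = ["age", "birthday", "gender", "name", "price"]
print(multiple_column_prefix(qualifiers, ["a", "b"]))  # ['age', 'birthday']
print(multiple_column_prefix(qualifiers, ["na"]))      # ['name']
```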
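- Combining multiple filters
FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ONE);
// MUST_PASS_ALL: a result is returned only when all filters pass; this is the default
// MUST_PASS_ONE: a result is returned when at least one filter passes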
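The two operators correspond to logical AND and OR over the individual filters, which is presumably what the AND and OR keywords in the shell filter strings map to. A minimal Python sketch with hypothetical predicate functions standing in for real filters:

```python
def must_pass_all(filters, cell):
    """MUST_PASS_ALL: every filter has to accept the cell (logical AND)."""
    return all(f(cell) for f in filters)


def must_pass_one(filters, cell):
    """MUST_PASS_ONE: at least one filter has to accept the cell (logical OR)."""
    return any(f(cell) for f in filters)


def row_prefix_t(cell):
    return cell["row"].startswith("T")


def value_contains_610113(cell):
    return "610113" in cell["value"]


filters = [row_prefix_t, value_contains_610113]
hit = {"row": "T00082202B9A0", "value": "610113"}
partial = {"row": "T00082202B9A0", "value": "999"}

print(must_pass_all(filters, hit))      # True
print(must_pass_all(filters, partial))  # False
print(must_pass_one(filters, partial))  # True
```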
1. Query table BASE for rows whose row key prefix is R111_110101198203070775, whose column name is greater than or equal to RealCode, and whose timestamp equals 1531997189000 or 1531997189005.
hbase(main):014:0>
hbase(main):014:0> scan 'BASE', {FILTER => "PrefixFilter('R111_110101198203070775') AND QualifierFilter (>=, 'binary:RealCode') AND TimestampsFilter ( 1531997189000, 1531997189005)"}
ROW COLUMN+CELL
R111_110101198203070775 column=INFO:RealCode, timestamp=1531997189000, value=110101198203070775
R111_110101198203070775 column=INFO:RealType, timestamp=1531997189000, value=111
R111_110101198203070775 column=INFO:Sex, timestamp=1531997189000, value=1
R111_110101198203070775 column=INFO:SystemType, timestamp=1531997189000, value=999
R111_110101198203070775 column=INFO:Time, timestamp=1531997189000, value=2018-07-19T10:46:29Z
R111_110101198203070775 column=INFO:WriterTime, timestamp=1531997189000, value=2018-07-20T03:02:34Z
1 row(s) in 0.0690 seconds
hbase(main):015:0>
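2. Query table BASE for rows whose row key prefix is R111_110101198203070775 and whose AreaCode column value is greater than or equal to 440401 and less than or equal to 440402.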
hbase(main):007:0> scan 'BASE',{FILTER=>"PrefixFilter('R111_110101198203070775') AND SingleColumnValueFilter('INFO','AreaCode',>=,'binary:440401') AND SingleColumnValueFilter('INFO','AreaCode',<=,'binary:440402')"}
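3. Query table BASE for cells whose row key prefix is T, whose column name is greater than or equal to AreaCode, and whose value contains 610113.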
hbase(main):013:0> scan 'BASE',{COLUMNS =>'INFO',LIMIT=>1,FILTER=>"PrefixFilter('T') AND QualifierFilter(>=,'binary:AreaCode') AND ValueFilter(=,'substring:610113')"}
ROW COLUMN+CELL
T00082202B9A0 column=INFO:AreaCode, timestamp=1531358962000, value=610113
1 row(s) in 0.3430 seconds
hbase(main):014:0>
4. Query table BASE for cells whose row key prefix is T and whose value contains 610113.
hbase(main):027:0*
hbase(main):027:0* scan 'BASE',{COLUMNS =>'INFO',LIMIT=>1,FILTER=>"PrefixFilter('T') AND ValueFilter(=,'substring:610113')"}
ROW COLUMN+CELL
T00082202B9A0 column=INFO:AreaCode, timestamp=1531358962000, value=610113
1 row(s) in 0.0910 seconds
hbase(main):028:0>
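5. Query table STG20180703 from start row WEBWXK266WUT_29064004 to end row WEBWXMZQUCUK_29064024, keeping rows whose Time column value is greater than or equal to 2018-07-03T10:20:36Z and less than or equal to 2018-07-03T10:20:56Z.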
scan 'STG20180703',{STARTROW=>'WEBWXK266WUT_29064004',ENDROW=>'WEBWXMZQUCUK_29064024',FILTER=>"(SingleColumnValueFilter('INFO','Time',>=,'binary:2018-07-03T10:20:36Z')) AND (SingleColumnValueFilter('INFO','Time',<=,'binary:2018-07-03T10:20:56Z'))"}
6. Write the result of an HBase query to a file
echo "scan 'LOG20180108',{COLUMNS => 'INFO',LIMIT=>1,FILTER=>\"(PrefixFilter('T')) AND (SingleColumnValueFilter('INFO','AreaCode',=,'binary:610103'))\"}" | hbase shell >>/home/hive/logterminal-min.csv
- count: count the rows in a table
Count the number of rows in a table:
count 'user'
Example:
hbase(main):014:0> scan 'user'
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506432714125, value=20
rk001 column=info:name, timestamp=1506432700769, value=lisi
rk002 column=info:age, timestamp=1506432761021, value=20
rk002 column=info:name, timestamp=1506432771239, value=lisi
2 row(s) in 0.0360 seconds
hbase(main):015:0> count 'user'
2 row(s) in 0.0110 seconds
=> 2
hbase(main):016:0>
- incr: increment a counter
Add an 'info:age' column to row key 'rk003' and increment it as a counter:
hbase(main):016:0> incr 'user','rk003','info:age'
COUNTER VALUE = 1
0 row(s) in 0.0280 seconds
hbase(main):017:0> get 'user','rk003','info:age'
COLUMN CELL
info:age timestamp=1506432893409, value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0450 seconds
hbase(main):018:0> incr 'user','rk003','info:age'
COUNTER VALUE = 2
0 row(s) in 0.0400 seconds
hbase(main):019:0> get 'user','rk003','info:age'
COLUMN CELL
info:age timestamp=1506432938203, value=\x00\x00\x00\x00\x00\x00\x00\x02
1 row(s) in 0.0180 seconds
# Get the current row count of the table
hbase(main):020:0> count 'user'
3 row(s) in 0.0110 seconds
=> 3
hbase(main):021:0>
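The get output above shows the counter as \x00\x00\x00\x00\x00\x00\x00\x01 and then \x00\x00\x00\x00\x00\x00\x00\x02, which matches an 8-byte big-endian integer. A short Python sketch of that encoding (`encode_counter` and `decode_counter` are hypothetical helper names):

```python
import struct


def encode_counter(n: int) -> bytes:
    """Encode a counter as an 8-byte big-endian signed integer."""
    return struct.pack(">q", n)


def decode_counter(raw: bytes) -> int:
    """Decode the 8-byte value shown by get back into a number."""
    return struct.unpack(">q", raw)[0]


print(encode_counter(2))                                   # b'\x00\x00\x00\x00\x00\x00\x00\x02'
print(decode_counter(b"\x00\x00\x00\x00\x00\x00\x00\x02"))  # 2
```

Inside the shell, the get_counter command listed in the dml help group should print the decoded number directly.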
- delete/deleteall: delete table data
Usage: delete 'table name', 'row key', 'column family:column name'
Delete the info:name cell of the row with row key rk0001 in table user:
delete 'user', 'rk0001', 'info:name'
Delete the info:name cell of the row with row key rk0001 in table user at timestamp 1392383705316:
delete 'user', 'rk0001', 'info:name', 1392383705316
Example:
hbase(main):049:0> delete 'user','rk001','info:name'
0 row(s) in 0.0420 seconds
hbase(main):050:0> get 'user','rk001'
COLUMN CELL
data:pic timestamp=1506430730921, value=1111
info:age timestamp=1506430708460, value=20
info:sex timestamp=1506430691490, value=male
1 row(s) in 0.0310 seconds
hbase(main):051:0>
deleteall deletes an entire row:
Delete the entire row with row key rk001 in table user:
deleteall 'user','rk001'
Example:
hbase(main):002:0>
hbase(main):002:0> scan 'user'
ROW COLUMN+CELL
rk001 column=data:pic, timestamp=1506430730921, value=1111
rk001 column=info:age, timestamp=1506430708460, value=20
rk001 column=info:name, timestamp=1506432276790, value=lisi
rk001 column=info:sex, timestamp=1506430691490, value=male
1 row(s) in 0.4290 seconds
hbase(main):003:0>
hbase(main):004:0>
hbase(main):005:0* deleteall 'user','rk001'
0 row(s) in 0.0610 seconds
hbase(main):006:0> scan 'user'
ROW COLUMN+CELL
0 row(s) in 0.0160 seconds
hbase(main):007:0>
- truncate/truncate_preserve: empty an entire table
Empty the user table:
truncate 'user'
Example:
hbase(main):021:0>
hbase(main):022:0* scan 'user'
ROW COLUMN+CELL
rk001 column=info:age, timestamp=1506432714125, value=20
rk001 column=info:name, timestamp=1506432700769, value=lisi
rk002 column=info:age, timestamp=1506432761021, value=20
rk002 column=info:name, timestamp=1506432771239, value=lisi
rk003 column=info:age, timestamp=1506432938203, value=\x00\x00\x00\x00\x00\x00
\x00\x02
3 row(s) in 0.0770 seconds
hbase(main):023:0> truncate 'user'
Truncating 'user' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.0010 seconds
# HBase implements truncate by disabling the table, dropping it, and then recreating it.
hbase(main):024:0> scan 'user'
ROW COLUMN+CELL
0 row(s) in 0.1380 seconds
hbase(main):025:0>
hbase(main):178:0>
hbase(main):178:0> truncate_preserve 'demo'
Truncating 'demo' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.2780 seconds
hbase(main):179:0>
- Run commands from a file
Save the commands in a text file (sample_commands.txt in this example):
create 'test', 'cf'
list 'test'
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
put 'test', 'row4', 'cf:d', 'value4'
scan 'test'
get 'test', 'row1'
disable 'test'
enable 'test'
Then pass the file to hbase shell:
$ ./hbase shell ./sample_commands.txt
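- Import/Export: import and export data
Import data:
./hbase org.apache.hadoop.hbase.mapreduce.Driver import <table name> <data file location>
The data file location can be prefixed with file:/// to use the local file system; otherwise the path is resolved on HDFS.
Export data:
./hbase org.apache.hadoop.hbase.mapreduce.Driver export <table name> <data file location>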
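1 Interfaces provided by HBase itself
1) Import
./hbase org.apache.hadoop.hbase.mapreduce.Driver import <table name> <data file location>
The data file location can be a local directory or a path on the distributed file system HDFS.
For a local directory, add the file:/// prefix; a path without a scheme is resolved against the default file system, which is normally HDFS on a cluster.
For an HDFS path, it is safest to give the full HDFS URI explicitly, for example hdfs://mymaster:9000/path
2) Export
./hbase org.apache.hadoop.hbase.mapreduce.Driver export <table name> <data file location>
As above, the data file location can be a local directory or an HDFS path.
The Driver class also provides other programs, such as copying data between tables and importing TSV files; run it without arguments to list them.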
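2 Tools provided by Hadoop (hadoop tool interface)
To use this tool, first configure hadoop-env.sh; otherwise execution fails with errors about missing jars or class files.
1) Configuration
Change to the Hadoop installation directory,
open hadoop-env.sh (vim hadoop-env.sh) and add the following line:
export HADOOP_CLASSPATH="<path to the HBase jars>:<path to the ZooKeeper jars>:<HBase conf directory>"
2) Usage
Import:
hadoop jar /../hbase/hbase-*.jar import mytest /export/mybakup
Export:
hadoop jar /../hbase/hbase-*.jar export mytest /import/mybackup
When the file:// prefix is not specified, the data is exported directly to an HDFS directory.
Also note:
For export, when the target is a path on HDFS, that path must not already exist; otherwise an error is reported.
For import, the table must already exist in HBase; otherwise an error is reported.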
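The prefix rule described in this section (file:/// for local paths, HDFS otherwise) can be sketched in Python as follows. `resolve_filesystem` is a hypothetical helper for illustration only, and the default of "hdfs" is an assumption about the cluster configuration; the real resolution is done by Hadoop.

```python
from urllib.parse import urlparse


def resolve_filesystem(location: str, default_fs: str = "hdfs") -> str:
    """Return the file system a data file location would be read from or written to."""
    scheme = urlparse(location).scheme
    # A location without a scheme falls back to the assumed default file system.
    return scheme if scheme else default_fs


print(resolve_filesystem("file:///home/hbase/backup"))    # file
print(resolve_filesystem("hdfs://mymaster:9000/path"))    # hdfs
print(resolve_filesystem("/import/mybackup"))             # hdfs
```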
- snapshot: back up metadata
hbase(main):242:0> help 'snapshots'
Command: clone_snapshot
Create a new table by cloning the snapshot content.
There're no copies of data involved.
And writing on the newly created table will not influence the snapshot data.
Examples:
hbase> clone_snapshot 'snapshotName', 'tableName'
hbase> clone_snapshot 'snapshotName', 'namespace:tableName'
Command: delete_all_snapshot
Delete all of the snapshots matching the given regex. Examples:
hbase> delete_all_snapshot 's.*'
Command: delete_snapshot
Delete a specified snapshot. Examples:
hbase> delete_snapshot 'snapshotName',
Command: list_snapshots
List all snapshots taken (by printing the names and relative information).
Optional regular expression parameter could be used to filter the output
by snapshot name.
Examples:
hbase> list_snapshots
hbase> list_snapshots 'abc.*'
Command: restore_snapshot
Restore a specified snapshot.
The restore will replace the content of the original table,
bringing back the content to the snapshot state.
The table must be disabled.
Examples:
hbase> restore_snapshot 'snapshotName'
Command: snapshot
Take a snapshot of specified table. Examples:
hbase> snapshot 'sourceTable', 'snapshotName'
hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}
hbase(main):243:0>
- snapshot: back up data
Usage: snapshot [table name] [snapshot name]
Example:
snapshot 'kylin_metadata', 'kylin_metadata_backup20171026'
hbase(main):266:0* list
TABLE
BASE
TestTable
demo
test
testtable
5 row(s) in 0.0150 seconds
=> ["BASE", "TestTable", "demo", "test", "testtable"]
hbase(main):267:0> snapshot 'test','testbak' # take the backup
0 row(s) in 0.6590 seconds
- list_snapshots: list backups
Usage: list_snapshots [snapshot name or regular expression]
hbase(main):268:0> list_snapshots # list the backups
SNAPSHOT TABLE + CREATION TIME
testbak test (Tue Aug 14 05:27:01 +0800 2018)
1 row(s) in 0.0140 seconds
=> ["testbak"]
hbase(main):269:0>
hbase(main):269:0> list_snapshots 'testbak'
SNAPSHOT TABLE + CREATION TIME
testbak test (Tue Aug 14 05:27:01 +0800 2018)
1 row(s) in 0.0350 seconds
=> ["testbak"]
hbase(main):270:0>
- clone_snapshot: create a new table from a backup
Usage: clone_snapshot [snapshot name] [new table name]
hbase(main):281:0*
hbase(main):282:0* list
TABLE
BASE
TestTable
demo
test
test2
testtable
6 row(s) in 0.0160 seconds
=> ["BASE", "TestTable", "demo", "test", "test2", "testtable"]
hbase(main):283:0>
hbase(main):284:0* clone_snapshot 'testbak','test5'
0 row(s) in 0.8200 seconds
hbase(main):285:0> list
TABLE
BASE
TestTable
demo
test
test2
test5
testtable
7 row(s) in 0.0050 seconds
=> ["BASE", "TestTable", "demo", "test", "test2", "test5", "testtable"]
hbase(main):286:0>
- restore_snapshot: restore a backup
Usage: restore_snapshot [snapshot name]
hbase(main):298:0> list
TABLE
BASE
TestTable
demo
testtable
4 row(s) in 0.0140 seconds
=> ["BASE", "TestTable", "demo", "testtable"]
hbase(main):299:0>
hbase(main):299:0>
hbase(main):300:0*
hbase(main):301:0* restore_snapshot 'testbak'
0 row(s) in 0.8610 seconds
hbase(main):302:0> list
TABLE
BASE
TestTable
demo
test
testtable
5 row(s) in 0.0140 seconds
=> ["BASE", "TestTable", "demo", "test", "testtable"]
hbase(main):303:0>
- delete_snapshot: delete a backup
Usage: delete_snapshot [snapshot name]
hbase(main):307:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
testbak test (Tue Aug 14 05:27:01 +0800 2018)
testbak2 test (Tue Aug 14 05:40:12 +0800 2018)
testbak3 test (Tue Aug 14 05:40:28 +0800 2018)
3 row(s) in 0.0610 seconds
=> ["testbak", "testbak2", "testbak3"]
hbase(main):308:0>
hbase(main):309:0*
hbase(main):310:0*
hbase(main):311:0* delete_snapshot 'testbak'
0 row(s) in 0.0360 seconds
hbase(main):312:0>
hbase(main):313:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
testbak2 test (Tue Aug 14 05:40:12 +0800 2018)
testbak3 test (Tue Aug 14 05:40:28 +0800 2018)
2 row(s) in 0.0550 seconds
=> ["testbak2", "testbak3"]
hbase(main):314:0>
- delete_all_snapshot: delete all matching backups
Usage: delete_all_snapshot [snapshot name regular expression]
hbase(main):313:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
testbak2 test (Tue Aug 14 05:40:12 +0800 2018)
testbak3 test (Tue Aug 14 05:40:28 +0800 2018)
2 row(s) in 0.0550 seconds
=> ["testbak2", "testbak3"]
hbase(main):314:0>
hbase(main):315:0> delete_all_snapshot 'test.*' # a regular expression can delete several backups at once
SNAPSHOT TABLE + CREATION TIME
testbak2 test (Tue Aug 14 05:40:12 +0800 2018)
testbak3 test (Tue Aug 14 05:40:28 +0800 2018)
Delete the above 2 snapshots (y/n)?
y
0 row(s) in 0.0670 seconds
2 snapshots successfully deleted.
hbase(main):316:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
0 row(s) in 0.0070 seconds
=> []
hbase(main):317:0>
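Before running delete_all_snapshot it can help to preview which names a regular expression would select. The Python sketch below assumes the pattern has to match the whole snapshot name; that is an assumption about the shell's behaviour, and `matching_snapshots` is a hypothetical helper. The names come from the examples in this section.

```python
import re


def matching_snapshots(names, pattern):
    """Return the snapshot names that the regular expression matches in full."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


names = ["testbak", "testbak2", "testbak3", "kylin_metadata_backup20171026"]
print(matching_snapshots(names, "test.*"))
# ['testbak', 'testbak2', 'testbak3']
print(matching_snapshots(names, "testbak[23]"))
# ['testbak2', 'testbak3']
```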