4.6. Shell Tricks

4.6.1. Table variables

HBase 0.95 adds shell commands that provide a jruby-style object-oriented references for tables. Previously all of the shell commands that act upon a table have a procedural style that always took the name of the table as an argument. HBase 0.95 introduces the ability to assign a table to a jruby variable. The table reference can be used to perform data read write operations such as puts, scans, and gets well as admin functionality such as disabling, dropping, describing tables.

For example, previously you would always specify a table name:

hbase(main):000:0> create ‘t’, ‘f’
0 row(s) in 1.0970 seconds
hbase(main):001:0> put 't', 'rold', 'f', 'v'
0 row(s) in 0.0080 seconds

hbase(main):002:0> scan 't' 
ROW                                COLUMN+CELL                                                                                      
 rold                              column=f:, timestamp=1378473207660, value=v                                                      
1 row(s) in 0.0130 seconds

hbase(main):003:0> describe 't'
DESCRIPTION                                                                           ENABLED                                       
 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true                                          
 SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2                                               
 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false                                               
 ', BLOCKCACHE => 'true'}                                                                                 
1 row(s) in 1.4430 seconds

hbase(main):004:0> disable 't'
0 row(s) in 14.8700 seconds

hbase(main):005:0> drop 't'
0 row(s) in 23.1670 seconds

hbase(main):006:0> 
	  

Now you can assign the table to a variable and use the results in jruby shell code.

hbase(main):007 > t = create 't', 'f'
0 row(s) in 1.0970 seconds

=> Hbase::Table - t
hbase(main):008 > t.put 'r', 'f', 'v'
0 row(s) in 0.0640 seconds
hbase(main):009 > t.scan
ROW                           COLUMN+CELL                                                                        
 r                            column=f:, timestamp=1331865816290, value=v                                        
1 row(s) in 0.0110 seconds
hbase(main):010:0> t.describe
DESCRIPTION                                                                           ENABLED                                       
 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true                                          
 SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2                                               
 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false                                               
 ', BLOCKCACHE => 'true'}                                                                                 
1 row(s) in 0.0210 seconds
hbase(main):038:0> t.disable
0 row(s) in 6.2350 seconds
hbase(main):039:0> t.drop
0 row(s) in 0.2340 seconds
	  

If the table has already been created, you can assign a Table to a variable by using the get_table method:

hbase(main):011 > create 't','f'
0 row(s) in 1.2500 seconds

=> Hbase::Table - t
hbase(main):012:0> tab = get_table 't'
0 row(s) in 0.0010 seconds

=> Hbase::Table - t
hbase(main):013:0> tab.put ‘r1’ ,’f’, ‘v’ 
0 row(s) in 0.0100 seconds
hbase(main):014:0> tab.scan
ROW                                COLUMN+CELL                                                                                      
 r1                                column=f:, timestamp=1378473876949, value=v                                                      
1 row(s) in 0.0240 seconds
hbase(main):015:0> 
	  

The list functionality has also been extended so that it returns a list of table names as strings. You can then use jruby to script table operations based on these names. The list_snapshots command also acts similarly.

hbase(main):016 > tables = list(‘t.*’)
TABLE                                                                                                                               
t                                                                                                                                   
1 row(s) in 0.1040 seconds

=> #<#<Class:0x7677ce29>:0x21d377a4>
hbase(main):017:0> tables.map { |t| disable t ; drop  t}
0 row(s) in 2.2510 seconds

=> [nil]
hbase(main):018:0> 
            

4.6.2. irbrc

Create an .irbrc file for yourself in your home directory. Add customizations. A useful one is command history so commands are save across Shell invocations:

$ more .irbrc
require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 100
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"

See the ruby documentation of .irbrc to learn about other possible configurations.

4.6.3. LOG data to timestamp

To convert the date '08/08/16 20:56:29' from an hbase log into a timestamp, do:

hbase(main):021:0> import java.text.SimpleDateFormat
hbase(main):022:0> import java.text.ParsePosition
hbase(main):023:0> SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime() => 1218920189000

To go the other direction:

hbase(main):021:0> import java.util.Date
hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UTC 2008"

To output in a format that is exactly like that of the HBase log format will take a little messing with SimpleDateFormat.

4.6.4. Debug

4.6.4.1. Shell debug switch

You can set a debug switch in the shell to see more output -- e.g. more of the stack trace on exception -- when you run a command:

hbase> debug <RETURN>

4.6.4.2. DEBUG log level

To enable DEBUG level logging in the shell, launch it with the -d option.

$ ./bin/hbase shell -d

4.6.5. Commands

4.6.5.1. count

Count command returns the number of rows in a table. It's quite fast when configured with the right CACHE

hbase> count '<tablename>', CACHE => 1000

The above count fetches 1000 rows at a time. Set CACHE lower if your rows are big. Default is to fetch one row at a time.

comments powered by Disqus