A good general introduction to the strengths and weaknesses of data modelling on the various non-RDBMS datastores is Ian Varley's Master's thesis, No Relation: The Mixed Blessings of Non-Relational Databases. It is recommended reading. Also see Section 9.7.7.6, “KeyValue” for how HBase stores data internally, and Section 6.11, “Schema Design Case Studies”.
HBase schemas can be created or updated with the HBase shell (see Chapter 4, The Apache HBase Shell) or by using HBaseAdmin in the Java API.
Tables must be disabled when making ColumnFamily modifications, for example:
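    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    Configuration config = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(config);
    String table = "myTable";

    // Take the table offline before changing its ColumnFamilies
    admin.disableTable(table);

    HColumnDescriptor cf1 = ...;
    admin.addColumn(table, cf1);      // adding new ColumnFamily

    HColumnDescriptor cf2 = ...;
    admin.modifyColumn(table, cf2);   // modifying existing ColumnFamily

    // Bring the table back online
    admin.enableTable(table);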
See Section 2.4.4, “Client configuration and dependencies connecting to an HBase cluster” for more information about configuring client connections.
Note: online schema changes are supported in the 0.92.x codebase, but the 0.90.x codebase requires the table to be disabled.
When changes are made to either Tables or ColumnFamilies (e.g., region size, block size), these changes take effect the next time there is a major compaction and the StoreFiles get re-written.
See Section 9.7.7, “Store” for more information on StoreFiles.
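The point about major compaction can be illustrated with a toy model. The following is only a sketch and does not use any HBase classes: ToyStore, ToyStoreFile, and their methods are hypothetical names made up for this illustration. It assumes that a StoreFile keeps the setting it was written with, and that only a rewrite picks up a newer setting.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types are HBase classes. */
final class SchemaChangeSketch {

    /** An immutable file that remembers the block size it was written with. */
    static final class ToyStoreFile {
        final int blockSize;

        ToyStoreFile(int blockSize) {
            this.blockSize = blockSize;
        }
    }

    /** One ColumnFamily store: a configured block size plus its files. */
    static final class ToyStore {
        private int configuredBlockSize;
        private final List<ToyStoreFile> files = new ArrayList<>();

        ToyStore(int blockSize) {
            this.configuredBlockSize = blockSize;
        }

        /** Schema change: updates the setting, touches no existing file. */
        void setBlockSize(int blockSize) {
            this.configuredBlockSize = blockSize;
        }

        /** Writes one new file with the current setting (a simplification). */
        void flush() {
            files.add(new ToyStoreFile(configuredBlockSize));
        }

        /** Rewrites all existing files into one, using the current setting. */
        void majorCompact() {
            if (files.isEmpty()) {
                return;
            }
            files.clear();
            files.add(new ToyStoreFile(configuredBlockSize));
        }

        /** Block size of each file currently in the store, in write order. */
        List<Integer> fileBlockSizes() {
            List<Integer> sizes = new ArrayList<>();
            for (ToyStoreFile f : files) {
                sizes.add(f.blockSize);
            }
            return sizes;
        }
    }
}
```

In this model, a store created with a block size of 65536 and flushed twice still reports two files at 65536 after setBlockSize(16384) is called. Only after majorCompact() does it report a single file at 16384. On a real cluster the rewrite is performed by HBase itself, and the details differ from this simplification.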