Although at a conceptual level tables may be viewed as a sparse set of rows, they are physically stored by column family. A new column qualifier (column_family:column_qualifier) can be added to an existing column family at any time.
Table 5.2. ColumnFamily anchor
Row Key | Time Stamp | Column Family anchor |
---|---|---|
"com.cnn.www" | t9 | anchor:cnnsi.com = "CNN" |
"com.cnn.www" | t8 | anchor:my.look.ca = "CNN.com" |
Table 5.3. ColumnFamily contents
Row Key | Time Stamp | Column Family contents |
---|---|---|
"com.cnn.www" | t6 | contents:html = "<html>..." |
"com.cnn.www" | t5 | contents:html = "<html>..." |
"com.cnn.www" | t3 | contents:html = "<html>..." |
The empty cells shown in the conceptual view are not stored at all. Thus a request for the value of the contents:html column at time stamp t8 would return no value. Similarly, a request for an anchor:my.look.ca value at time stamp t9 would return no value. However, if no timestamp is supplied, the most recent value for a particular column is returned. Given multiple versions, the most recent is also the first one found, since timestamps are stored in descending order. Thus a request for the values of all columns in the row com.cnn.www with no timestamp specified would return the value of contents:html from timestamp t6, the value of anchor:cnnsi.com from timestamp t9, and the value of anchor:my.look.ca from timestamp t8.
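The lookup behavior described above can be illustrated with a small toy model. This is a sketch only and does not use the HBase client API: the names stores, get, and get_row are hypothetical, the symbolic timestamps t3 through t9 are represented as the integers 3 through 9, and a timestamped request is assumed to match only a cell stored at exactly that timestamp, as in the examples above.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, str]                      # (timestamp, value)
Store = Dict[Tuple[str, str], List[Cell]]   # (row key, qualifier) -> cells

# Hypothetical in-memory stand-in for the physical view: one store per
# column family, cells kept newest first. Empty cells are simply absent.
stores: Dict[str, Store] = {
    "anchor": {
        ("com.cnn.www", "cnnsi.com"): [(9, "CNN")],
        ("com.cnn.www", "my.look.ca"): [(8, "CNN.com")],
    },
    "contents": {
        ("com.cnn.www", "html"): [
            (6, "<html>..."),
            (5, "<html>..."),
            (3, "<html>..."),
        ],
    },
}


def get(row: str, column: str, timestamp: Optional[int] = None) -> Optional[Cell]:
    """Return one cell for 'family:qualifier', or None if nothing is stored."""
    family, qualifier = column.split(":", 1)
    cells = stores.get(family, {}).get((row, qualifier), [])
    if timestamp is None:
        # Descending timestamp order: the first cell found is the newest.
        return cells[0] if cells else None
    for cell in cells:
        if cell[0] == timestamp:
            return cell
    return None  # no cell was stored at that timestamp


def get_row(row: str) -> Dict[str, Cell]:
    """Return the most recent cell of every column stored for the row."""
    result: Dict[str, Cell] = {}
    for family, store in stores.items():
        for (stored_row, qualifier), cells in store.items():
            if stored_row == row and cells:
                result[f"{family}:{qualifier}"] = cells[0]
    return result


print(get("com.cnn.www", "contents:html", 8))      # None
print(get("com.cnn.www", "anchor:my.look.ca", 9))  # None
print(get("com.cnn.www", "contents:html"))         # (6, '<html>...')
print(get_row("com.cnn.www"))
```

In this model, get_row returns exactly three entries for com.cnn.www: contents:html from t6, anchor:cnnsi.com from t9, and anchor:my.look.ca from t8.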
For more information about the internals of how Apache HBase stores data, see Section 9.7, “Regions”.