See Section 2.6.2, “Recommended Configurations”.
For larger systems, managing compactions and splits may be something you want to consider.
See hfile.block.cache.size. A memory setting for the RegionServer process.
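This value is set in hbase-site.xml. A minimal sketch, assuming you want to devote a larger share of the RegionServer heap to the block cache (the value is illustrative, not a recommendation):

<property>
  <name>hfile.block.cache.size</name>
  <!-- illustrative: fraction of the maximum heap given to the block cache -->
  <value>0.4</value>
</property>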
HBASE-9857 adds a new option to prefetch HFile contents when opening the blockcache, if a column family or RegionServer property is set. This option is available for HBase 0.98.3 and later. The purpose is to warm the blockcache as rapidly as possible after the cache is opened, using in-memory table data, and not counting the prefetching as cache misses. This is great for fast reads, but is not a good idea if the data to be preloaded will not fit into the blockcache. The option is useful for tuning the IO impact of prefetching against the time before all data blocks are in the cache.
To enable prefetching on a given column family, you can use HBase Shell or use the API.
Example 14.1. Enable Prefetch Using HBase Shell
hbase> create 'MyTable', { NAME => 'myCF', PREFETCH_BLOCKS_ON_OPEN => 'true' }
Example 14.2. Enable Prefetch Using the API
// ...
HTableDescriptor tableDesc = new HTableDescriptor("myTable");
HColumnDescriptor cfDesc = new HColumnDescriptor("myCF");
cfDesc.setPrefetchBlocksOnOpen(true);
tableDesc.addFamily(cfDesc);
// ...
See the API documentation for CacheConfig.
See ???. This memory setting is often adjusted for the RegionServer process depending on needs.
See ???. This memory setting is often adjusted for the RegionServer process depending on needs.
See hbase.hstore.blockingStoreFiles. If there is blocking in the RegionServer logs, increasing this can help.
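If the RegionServer logs show updates being blocked on too many store files, a hedged hbase-site.xml sketch such as the following raises the threshold (the value is illustrative only):

<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <!-- illustrative: allow more store files per store before updates are blocked -->
  <value>20</value>
</property>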
See hbase.hregion.memstore.block.multiplier. If there is enough RAM, increasing this can help.
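This is also an hbase-site.xml setting. A hedged sketch, assuming the RegionServer has RAM to spare for larger memstores (the value is illustrative only):

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <!-- illustrative: block updates when a memstore reaches this multiple of the flush size -->
  <value>4</value>
</property>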
Have HBase write the checksum into the datablock and save having to do a separate checksum seek whenever you read. See hbase.regionserver.checksum.verify, hbase.hstore.bytes.per.checksum and hbase.hstore.checksum.algorithm. For more information, see the release note on HBASE-5074, support checksums in HBase block cache.
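These are hbase-site.xml settings; recent releases may already default to sensible values, so treat the sketch below as illustrative rather than a recommendation:

<property>
  <name>hbase.regionserver.checksum.verify</name>
  <!-- illustrative: verify checksums stored in HFile blocks instead of relying on the separate HDFS checksum file -->
  <value>true</value>
</property>
<property>
  <name>hbase.hstore.bytes.per.checksum</name>
  <!-- illustrative: number of bytes covered by each checksum chunk -->
  <value>16384</value>
</property>
<property>
  <name>hbase.hstore.checksum.algorithm</name>
  <!-- illustrative algorithm choice; check your release for supported values -->
  <value>CRC32C</value>
</property>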
HBASE-11355 introduces several callQueue tuning mechanisms which can increase performance. See the JIRA for some benchmarking information.
To increase the number of callqueues, set hbase.ipc.server.num.callqueue to a value greater than 1.
To split the callqueue into separate read and write queues, set hbase.ipc.server.callqueue.read.ratio to a value between 0 and 1. This factor weights the queues toward writes (if below .5) or reads (if above .5). Another way to say this is that the factor determines what percentage of the split queues are used for reads. The following examples illustrate some of the possibilities, and a sample hbase-site.xml sketch follows them. Note that you always have at least one write queue, no matter what setting you use.
The default value of 0 does not split the queue.
A value of .3 uses 30% of the queues for reading and 70% for writing. Given a value of 10 for hbase.ipc.server.num.callqueue, 3 queues would be used for reads and 7 for writes.
A value of .5 uses the same number of read queues and write queues. Given a value of 10 for hbase.ipc.server.num.callqueue, 5 queues would be used for reads and 5 for writes.
A value of .6 uses 60% of the queues for reading and 40% for writing. Given a value of 10 for hbase.ipc.server.num.callqueue, 6 queues would be used for reads and 4 for writes.
A value of 1.0 uses one queue to process write requests, and all other queues process read requests. A value higher than 1.0 has the same effect as a value of 1.0. Given a value of 10 for hbase.ipc.server.num.callqueue, 9 queues would be used for reads and 1 for writes.
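For illustration, a hedged hbase-site.xml sketch for a 10-queue, read-weighted setup like the .6 example above (values are examples, not recommendations):

<property>
  <name>hbase.ipc.server.num.callqueue</name>
  <!-- illustrative: more than one queue, so the callqueue can be split -->
  <value>10</value>
</property>
<property>
  <name>hbase.ipc.server.callqueue.read.ratio</name>
  <!-- illustrative: weight the split queues toward reads -->
  <value>0.6</value>
</property>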
You can also split the read queues so that separate queues are used for short reads (from Get operations) and long reads (from Scan operations) by setting the hbase.ipc.server.callqueue.scan.ratio option. This option is a factor between 0 and 1 which determines the ratio of read queues used for Gets and Scans. More queues are used for Gets if the value is below .5, and more are used for Scans if the value is above .5. No matter what setting you use, at least one read queue is used for Get operations. The following examples illustrate some of the possibilities, and a sample hbase-site.xml sketch follows them.
A value of 0 does not split the read queue.
A value of .3 uses 70% of the read queues for Gets and 30% for Scans. Given a value of 20 for hbase.ipc.server.num.callqueue and a value of .5 for hbase.ipc.server.callqueue.read.ratio, 10 queues would be used for reads; of those 10, 7 would be used for Gets and 3 for Scans.
A value of .5 uses half the read queues for Gets and half for Scans. Given a value of 20 for hbase.ipc.server.num.callqueue and a value of .5 for hbase.ipc.server.callqueue.read.ratio, 10 queues would be used for reads; of those 10, 5 would be used for Gets and 5 for Scans.
A value of .6 uses 40% of the read queues for Gets and 60% for Scans. Given a value of 20 for hbase.ipc.server.num.callqueue and a value of .5 for hbase.ipc.server.callqueue.read.ratio, 10 queues would be used for reads; of those 10, 4 would be used for Gets and 6 for Scans.
A value of 1.0 uses all but one of the read queues for Scans. Given a value of 20 for hbase.ipc.server.num.callqueue and a value of .5 for hbase.ipc.server.callqueue.read.ratio, 10 queues would be used for reads; of those 10, 1 would be used for Gets and 9 for Scans.
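A hedged hbase-site.xml sketch matching the 20-queue scenario used in these examples, with the .3 scan split (values are illustrative):

<property>
  <name>hbase.ipc.server.num.callqueue</name>
  <!-- illustrative: total number of call queues -->
  <value>20</value>
</property>
<property>
  <name>hbase.ipc.server.callqueue.read.ratio</name>
  <!-- illustrative: half the queues handle reads, half handle writes -->
  <value>0.5</value>
</property>
<property>
  <name>hbase.ipc.server.callqueue.scan.ratio</name>
  <!-- illustrative: of the read queues, 30% handle Scans and the rest handle Gets -->
  <value>0.3</value>
</property>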
You can use the new option hbase.ipc.server.callqueue.handler.factor to programmatically tune the number of queues (a sample setting is sketched after the notes below):
A value of 0 uses a single shared queue between all the handlers.
A value of 1 uses a separate queue for each handler.
A value between 0 and 1 tunes the number of queues against the number of handlers. For instance, a value of .5 shares one queue between each two handlers.
Having more queues, such as in a situation where you have one queue per handler, reduces contention when adding a task to a queue or selecting it from a queue. The trade-off is that if you have some queues with long-running tasks, a handler may end up waiting to execute from that queue rather than processing another queue which has waiting tasks.
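A hedged hbase-site.xml sketch; the factor is illustrative and should be weighed against the contention trade-off described above:

<property>
  <name>hbase.ipc.server.callqueue.handler.factor</name>
  <!-- illustrative: roughly one queue for every two handlers -->
  <value>0.5</value>
</property>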
For these values to take effect on a given RegionServer, the RegionServer must be restarted. These parameters are intended for testing purposes and should be used carefully.