public class HybridHashTableContainer extends Object implements MapJoinTableContainer, MapJoinTableContainerDirectAccess
| Modifier and Type | Class and Description |
|---|---|
| static class | HybridHashTableContainer.HashPartition: Encapsulates a triplet of closely related objects: hashmap (either in memory or on disk), small table container, big table container. |
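To make the HashPartition triplet more concrete, here is a minimal sketch of a class with the same three slots. It is not the Hive class: `PartitionTripletSketch` and all of its members are hypothetical, and the roles given to the small table and big table containers (holding rows that arrive while the hash map is on disk) are an assumption, not something this page states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the triplet described above; not the Hive class. */
final class PartitionTripletSketch {
    // Slot 1: the hash map. Held here while in memory, null once written out.
    private Map<String, String> inMemoryMap = new HashMap<>();
    // Where the hash map went after leaving memory (assumed to be a file path).
    private String spillLocation;
    // Slot 2 (assumed role): small-table rows that arrive while the map is on disk.
    private final List<String[]> smallTableRows = new ArrayList<>();
    // Slot 3 (assumed role): big-table keys that cannot be probed while the map is on disk.
    private final List<String> bigTableRows = new ArrayList<>();

    boolean isOnDisk() {
        return inMemoryMap == null;
    }

    /** Releases the in-memory map and records where it was written. */
    void markSpilled(String location) {
        inMemoryMap = null;
        spillLocation = location;
    }

    /** Routes a small-table row to the map or, if spilled, to the side container. */
    void addSmallTableRow(String key, String value) {
        if (isOnDisk()) {
            smallTableRows.add(new String[] {key, value});
        } else {
            inMemoryMap.put(key, value);
        }
    }

    /** Probes the map, or defers the big-table key when the map is on disk. */
    String probeOrDefer(String key) {
        if (isOnDisk()) {
            bigTableRows.add(key);
            return null;
        }
        return inMemoryMap.get(key);
    }

    int deferredSmallRows() { return smallTableRows.size(); }
    int deferredBigRows() { return bigTableRows.size(); }
    String spillLocation() { return spillLocation; }
}
```

Keeping the three slots in one object means a partition's deferred rows stay next to the hash map they belong to, which is one plausible reading of "closely related".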

Nested classes/interfaces inherited from interface MapJoinTableContainer: MapJoinTableContainer.ReusableGetAdaptor

| Constructor and Description |
|---|
| HybridHashTableContainer(org.apache.hadoop.conf.Configuration hconf, long keyCount, long memoryAvailable, long estimatedTableSize, HybridHashTableConf nwayConf) |

| Modifier and Type | Method and Description |
|---|---|
| static int | calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize, HybridHashTableConf nwayConf): Calculate how many partitions are needed. |
| void | clear(): Clears the contents of the table. |
| MapJoinTableContainer.ReusableGetAdaptor | createGetter(MapJoinKey keyTypeFromLoader): Creates reusable get adaptor that can be used to retrieve rows from the table based on either vectorized or non-vectorized input rows to MapJoinOperator. |
| void | dumpMetrics() |
| void | dumpStats() |
| MapJoinKey | getAnyKey() |
| HybridHashTableContainer.HashPartition[] | getHashPartitions() |
| LazyBinaryStructObjectInspector | getInternalValueOi() |
| long | getMemoryThreshold() |
| int | getNumPartitions() |
| boolean[] | getSortableSortOrders() |
| long | getTableRowSize() |
| int | getToSpillPartitionId(): Gets the partition Id into which to spill the big table row. |
| int | getTotalInMemRowCount() |
| MapJoinBytesTableContainer.KeyValueHelper | getWriteHelper() |
| boolean | hasSpill(): Checks if the container has spilled any data onto disk. |
| boolean | isHashMapSpilledOnCreation(int partitionId): Check if the hash table of a specified partition has been "spilled" to disk when it was created. |
| boolean | isOnDisk(int partitionId): Check if the hash table of a specified partition is on disk (or "spilled" on creation). |
| void | put(org.apache.hadoop.io.Writable currentKey, org.apache.hadoop.io.Writable currentValue) |
| MapJoinKey | putRow(MapJoinObjectSerDeContext keyContext, org.apache.hadoop.io.Writable currentKey, MapJoinObjectSerDeContext valueContext, org.apache.hadoop.io.Writable currentValue): Adds row from input to the table. |
| long | refreshMemoryUsed(): Get the current memory usage by recalculating it. |
| void | seal(): Indicates to the container that the puts have ended; table is now r/o. |
| void | setSpill(boolean isSpilled) |
| void | setTotalInMemRowCount(int totalInMemRowCount) |
| long | spillPartition(int partitionId): Move the hashtable of a specified partition from memory into local file system. |

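The spill-related entries in the method summary (hasSpill, isOnDisk, spillPartition, getTotalInMemRowCount) suggest a container that moves whole partitions out of memory under pressure. The toy model below sketches that general idea under stated assumptions; it is not Hive's implementation. `SpillWorkflowSketch` is a hypothetical class, row counts stand in for bytes, "disk" is only a counter, the policy of spilling the largest in-memory partition is an assumption, and so is the reading of the `long` returned by spillPartition as the amount freed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: row counts stand in for bytes, and "disk" is a counter. */
final class SpillWorkflowSketch {
    private final List<Map<String, String>> partitions = new ArrayList<>();
    private final boolean[] onDisk;
    private final long[] rowsOnDisk;
    private final int memoryThreshold; // maximum rows allowed in memory
    private int inMemRowCount;

    SpillWorkflowSketch(int numPartitions, int memoryThreshold) {
        this.onDisk = new boolean[numPartitions];
        this.rowsOnDisk = new long[numPartitions];
        this.memoryThreshold = memoryThreshold;
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new HashMap<>());
        }
    }

    void put(String key, String value) {
        // Route the row to a partition by key hash.
        int id = Math.floorMod(key.hashCode(), onDisk.length);
        if (onDisk[id]) {
            rowsOnDisk[id]++; // a spilled partition keeps accepting rows on "disk"
            return;
        }
        if (partitions.get(id).put(key, value) == null) {
            inMemRowCount++;
        }
        // Assumed policy: when over budget, evict the largest in-memory partition.
        if (inMemRowCount > memoryThreshold) {
            spillPartition(largestInMemoryPartition());
        }
    }

    /** Moves one partition out of memory and returns the number of rows freed. */
    long spillPartition(int id) {
        long freed = partitions.get(id).size();
        rowsOnDisk[id] += freed;
        partitions.get(id).clear();
        onDisk[id] = true;
        inMemRowCount -= freed;
        return freed;
    }

    private int largestInMemoryPartition() {
        int best = -1;
        for (int i = 0; i < onDisk.length; i++) {
            if (!onDisk[i]
                    && (best < 0 || partitions.get(i).size() > partitions.get(best).size())) {
                best = i;
            }
        }
        return best;
    }

    boolean isOnDisk(int id) { return onDisk[id]; }

    boolean hasSpill() {
        for (boolean spilled : onDisk) {
            if (spilled) return true;
        }
        return false;
    }

    int inMemRowCount() { return inMemRowCount; }

    long rowsOnDisk() {
        long total = 0;
        for (long r : rowsOnDisk) total += r;
        return total;
    }
}
```

With a generous threshold no partition ever leaves memory; with a tight one, every row is still accounted for, either in memory or on disk.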
public HybridHashTableContainer(org.apache.hadoop.conf.Configuration hconf,
long keyCount,
long memoryAvailable,
long estimatedTableSize,
HybridHashTableConf nwayConf)
throws SerDeException,
IOException
Throws: SerDeException, IOException
public MapJoinBytesTableContainer.KeyValueHelper getWriteHelper()
public HybridHashTableContainer.HashPartition[] getHashPartitions()
public long getMemoryThreshold()
public long refreshMemoryUsed()
public LazyBinaryStructObjectInspector getInternalValueOi()
public boolean[] getSortableSortOrders()
public MapJoinKey putRow(MapJoinObjectSerDeContext keyContext, org.apache.hadoop.io.Writable currentKey, MapJoinObjectSerDeContext valueContext, org.apache.hadoop.io.Writable currentValue) throws SerDeException, HiveException, IOException
Adds row from input to the table.
Specified by: putRow in interface MapJoinTableContainer
Throws: SerDeException, HiveException, IOException
public boolean isOnDisk(int partitionId)
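Check if the hash table of a specified partition is on disk (or "spilled" on creation).
Parameters: partitionId - partition number
public boolean isHashMapSpilledOnCreation(int partitionId)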
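Check if the hash table of a specified partition has been "spilled" to disk when it was created.
Parameters: partitionId - hashMap ID
public long spillPartition(int partitionId)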
throws IOException
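Move the hashtable of a specified partition from memory into local file system.
Parameters: partitionId - the hashtable to be moved
Throws: IOException
public static int calcNumPartitions(long memoryThreshold,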
long dataSize,
int minNumParts,
int minWbSize,
HybridHashTableConf nwayConf)
throws IOException
Calculate how many partitions are needed.
Parameters:
memoryThreshold - memory threshold for the given table
dataSize - total data size for the table
minNumParts - minimum required number of partitions
minWbSize - minimum required write buffer size
nwayConf - the n-way join configuration
Throws: IOException
public int getNumPartitions()
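The parameters of calcNumPartitions hint at how a partition count can be derived from a memory budget. The sketch below shows one simple strategy, assuming the goal is for a single partition's share of the data to fit under the threshold: start at the minimum count and double until it does. This is an illustration under that assumption, not the method's documented algorithm. `PartitionCountSketch` and `estimatePartitions` are hypothetical names, and the sketch ignores minWbSize and nwayConf.

```java
/** Hypothetical helper; not part of HybridHashTableContainer. */
final class PartitionCountSketch {
    // Upper bound that keeps the doubling loop from overflowing an int.
    private static final int MAX_PARTITIONS = 1 << 30;

    /**
     * Doubles the partition count, starting from minNumParts, until one
     * partition's share of dataSize is no larger than memoryThreshold.
     */
    static int estimatePartitions(long memoryThreshold, long dataSize, int minNumParts) {
        if (memoryThreshold <= 0 || minNumParts <= 0) {
            throw new IllegalArgumentException("threshold and minimum must be positive");
        }
        int numPartitions = minNumParts;
        while (dataSize / numPartitions > memoryThreshold
                && numPartitions <= MAX_PARTITIONS / 2) {
            numPartitions *= 2;
        }
        return numPartitions;
    }
}
```

For example, `estimatePartitions(100, 1000, 4)` returns 16: shares of 250 and 125 exceed the threshold of 100, while a share of 62 fits. When the data already fits, the minimum count is returned unchanged.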
public int getTotalInMemRowCount()
public void setTotalInMemRowCount(int totalInMemRowCount)
public long getTableRowSize()
public boolean hasSpill()
Checks if the container has spilled any data onto disk.
Specified by: hasSpill in interface MapJoinTableContainer
public void setSpill(boolean isSpilled)
public int getToSpillPartitionId()
public void clear()
Clears the contents of the table.
Specified by: clear in interface MapJoinTableContainer
public MapJoinKey getAnyKey()
Specified by: getAnyKey in interface MapJoinTableContainer
public MapJoinTableContainer.ReusableGetAdaptor createGetter(MapJoinKey keyTypeFromLoader)
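Creates reusable get adaptor that can be used to retrieve rows from the table based on either vectorized or non-vectorized input rows to MapJoinOperator.
Specified by: createGetter in interface MapJoinTableContainer
Parameters: keyTypeFromLoader - Last key from hash table loader, to determine key type used when loading hashtable (if it can vary).
public void seal()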
Indicates to the container that the puts have ended; table is now r/o.
Specified by: seal in interface MapJoinTableContainer
public void put(org.apache.hadoop.io.Writable currentKey,
org.apache.hadoop.io.Writable currentValue)
throws SerDeException,
IOException
Specified by: put in interface MapJoinTableContainerDirectAccess
Throws: SerDeException, IOException
public void dumpMetrics()
Specified by: dumpMetrics in interface MapJoinTableContainer
public void dumpStats()
Copyright © 2017 The Apache Software Foundation. All rights reserved.