org.apache.hadoop.mapreduce.lib.partition
Class KeyFieldBasedPartitioner<K2,V2>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Partitioner<K2,V2>
      extended by org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner<K2,V2>
All Implemented Interfaces:
Configurable

@InterfaceAudience.Public
@InterfaceStability.Stable
public class KeyFieldBasedPartitioner<K2,V2>
extends Partitioner<K2,V2>
implements Configurable

Defines a way to partition keys based on certain key fields (also see KeyFieldBasedComparator). The key specification supported is of the form -k pos1[,pos2], where each pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
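The defaulting rules above can be illustrated with a small standalone parser. This is only a sketch: KeySpecSketch, parsePos and parse are hypothetical names that are not part of Hadoop, option letters in [opts] are accepted but ignored, and representing an omitted pos2 as field 0 ("to the end of the key") is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Hadoop class) that reads "-k pos1[,pos2]" specs. */
final class KeySpecSketch {
    // f[.c][opts]: field number, optional character position, optional option letters.
    private static final Pattern POS = Pattern.compile("(\\d+)(?:\\.(\\d+))?([a-zA-Z]*)");

    /** Parses one position; defaultChar is used when ".c" is omitted. */
    static int[] parsePos(String pos, int defaultChar) {
        Matcher m = POS.matcher(pos.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad position: " + pos);
        }
        int field = Integer.parseInt(m.group(1));
        int ch = (m.group(2) == null) ? defaultChar : Integer.parseInt(m.group(2));
        return new int[] {field, ch};
    }

    /** Returns {startField, startChar, endField, endChar}. */
    static int[] parse(String keySpec) {
        String body = keySpec.trim();
        if (body.startsWith("-k")) {
            body = body.substring(2).trim();
        }
        String[] parts = body.split(",", 2);
        // pos1 defaults to character 1 (the beginning of the field).
        int[] start = parsePos(parts[0], 1);
        // pos2 defaults to character 0 (the end of the field).
        // Assumption: an omitted pos2 is reported as field 0, meaning "end of key".
        int[] end = (parts.length == 2) ? parsePos(parts[1], 0) : new int[] {0, 0};
        return new int[] {start[0], start[1], end[0], end[1]};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("-k2.3,4")));
        System.out.println(Arrays.toString(parse("-k1,1")));
    }
}
```

In this sketch, a spec such as -k1,1 selects the whole first field, since pos1 starts at character 1 and pos2 ends at the field's last character.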


Field Summary
static String PARTITIONER_OPTIONS
           
 
Constructor Summary
KeyFieldBasedPartitioner()
           
 
Method Summary
 Configuration getConf()
          Return the configuration used by this object.
 String getKeyFieldPartitionerOption(JobContext job)
          Get the KeyFieldBasedPartitioner options
protected  int getPartition(int hash, int numReduceTasks)
           
 int getPartition(K2 key, V2 value, int numReduceTasks)
          Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.
protected  int hashCode(byte[] b, int start, int end, int currentHash)
           
 void setConf(Configuration conf)
          Set the configuration to be used by this object.
 void setKeyFieldPartitionerOptions(Job job, String keySpec)
          Set the KeyFieldBasedPartitioner options used for Partitioner
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARTITIONER_OPTIONS

public static String PARTITIONER_OPTIONS
Constructor Detail

KeyFieldBasedPartitioner

public KeyFieldBasedPartitioner()
Method Detail

setConf

public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.

Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.

Specified by:
getConf in interface Configurable

getPartition

public int getPartition(K2 key,
                        V2 value,
                        int numReduceTasks)
Description copied from class: Partitioner
Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.

Typically a hash function on all or a subset of the key.

Specified by:
getPartition in class Partitioner<K2,V2>
Parameters:
key - the key to be partitioned.
value - the entry value.
numReduceTasks - the total number of partitions.
Returns:
the partition number for the key.
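To make "a hash function on a subset of the key" concrete, the following standalone sketch hashes only one field of a key and maps the result to a partition. It does not use the Hadoop classes: FieldHashSketch and partitionForField are hypothetical names, the tab separator is an assumption, and the multiply-by-31 byte hash is one plausible choice rather than a statement of what this class does internally.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration (not a Hadoop class) of partitioning on one key field. */
final class FieldHashSketch {

    /** Folds bytes b[start..end] (inclusive) into a running hash. The 31 multiplier is an assumption. */
    static int hashCode(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    /** Picks a partition from the 1-based field of a tab-separated key. */
    static int partitionForField(String key, int field, int numReduceTasks) {
        // Assumption: fields are separated by tabs; -1 keeps trailing empty fields.
        String[] fields = key.split("\t", -1);
        if (field < 1 || field > fields.length) {
            return 0; // Assumption: keys without the requested field go to partition 0.
        }
        byte[] bytes = fields[field - 1].getBytes(StandardCharsets.UTF_8);
        int hash = hashCode(bytes, 0, bytes.length - 1, 0);
        // Clear the sign bit so the remainder is never negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Both keys share field 1, so they land in the same partition.
        System.out.println(partitionForField("user1\tclickA", 1, 4));
        System.out.println(partitionForField("user1\tclickB", 1, 4));
    }
}
```

Because only the selected field feeds the hash, records that agree on that field reach the same reduce task even when the rest of the key differs.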

hashCode

protected int hashCode(byte[] b,
                       int start,
                       int end,
                       int currentHash)

getPartition

protected int getPartition(int hash,
                           int numReduceTasks)
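This page does not describe how the hash is turned into a partition number. A common approach, shown here purely as an assumption about what such a method might do, is to clear the sign bit before taking the remainder. PartitionIndexSketch is a hypothetical name, not part of Hadoop.

```java
/** Hypothetical illustration (not a Hadoop class) of mapping a hash to a partition index. */
final class PartitionIndexSketch {

    /** Maps any int hash, including negative values, into [0, numReduceTasks). */
    static int getPartition(int hash, int numReduceTasks) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so the remainder is non-negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Math.abs is not a safe substitute: Math.abs(Integer.MIN_VALUE) is still negative.
        System.out.println(Math.abs(Integer.MIN_VALUE) % 3);
        System.out.println(getPartition(Integer.MIN_VALUE, 3));
    }
}
```

The mask is preferred over Math.abs in this sketch because Integer.MIN_VALUE has no positive counterpart, which would otherwise yield an invalid negative partition number.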

setKeyFieldPartitionerOptions

public void setKeyFieldPartitionerOptions(Job job,
                                          String keySpec)
Set the KeyFieldBasedPartitioner options used for Partitioner

Parameters:
keySpec - the key specification of the form -k pos1[,pos2], where each pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).

getKeyFieldPartitionerOption

public String getKeyFieldPartitionerOption(JobContext job)
Get the KeyFieldBasedPartitioner options



Copyright © 2009 The Apache Software Foundation