org.apache.hadoop.mapreduce.lib.input
Class FileSplit

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputSplit
      extended by org.apache.hadoop.mapreduce.lib.input.FileSplit
All Implemented Interfaces:
Writable

public class FileSplit
extends InputSplit
implements Writable

A section of an input file. Returned by InputFormat.getSplits(JobContext) and passed to InputFormat.createRecordReader(InputSplit,TaskAttemptContext).


Constructor Summary
FileSplit(Path file, long start, long length, String[] hosts)
          Constructs a split with host information.
 
Method Summary
 long getLength()
          The number of bytes in the file to process.
 String[] getLocations()
          Get the list of nodes by name where the data for the split would be local.
 Path getPath()
          The file containing this split's data.
 long getStart()
          The position of the first byte in the file to process.
 void readFields(DataInput in)
          Deserialize the fields of this object from in.
 String toString()
           
 void write(DataOutput out)
          Serialize the fields of this object to out.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

FileSplit

public FileSplit(Path file,
                 long start,
                 long length,
                 String[] hosts)
Constructs a split with host information.

Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
hosts - the list of hosts containing the block, possibly null
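
As a rough illustration of how the start and length arguments relate to a file, the sketch below cuts a file of a given size into consecutive byte ranges. It uses only the JDK and does not call Hadoop. The class SplitRangesSketch, the nested Range type, and the fixed splitSize parameter are hypothetical names introduced for this example; the way an actual InputFormat.getSplits(JobContext) implementation chooses split boundaries may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: models the (start, length) pair that a FileSplit carries. */
final class SplitRangesSketch {

    /** Hypothetical value type; not part of Hadoop. */
    static final class Range {
        final long start;   // position of the first byte to process
        final long length;  // number of bytes to process

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts [0, fileLength) into consecutive ranges of at most splitSize bytes. */
    static List<Range> ranges(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<Range> out = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            // The last range is shorter when fileLength is not a multiple of splitSize.
            long length = Math.min(splitSize, fileLength - start);
            out.add(new Range(start, length));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Range r : ranges(250L, 100L)) {
            System.out.println("start=" + r.start + " length=" + r.length);
        }
    }
}
```

Each Range here corresponds to the start and length values that would be passed to the FileSplit constructor, alongside the file path and an optional host list.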

Method Detail

getPath

public Path getPath()
The file containing this split's data.


getStart

public long getStart()
The position of the first byte in the file to process.


getLength

public long getLength()
The number of bytes in the file to process.

Specified by:
getLength in class InputSplit
Returns:
the number of bytes in the split
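
Taken together, getStart() and getLength() describe the byte range from start (inclusive) to start + length (exclusive). The sketch below, which uses only the JDK, extracts that range from an in-memory byte array. The class SplitReadSketch and the method readRange are hypothetical; a real record reader would typically read from a file stream and may also need to handle records that straddle the range boundary.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: shows which bytes a (start, length) pair selects. */
final class SplitReadSketch {

    /** Returns a copy of the bytes in [start, start + length) of data. */
    static byte[] readRange(byte[] data, long start, long length) {
        // Validate without computing start + length, which could overflow.
        if (start < 0 || length < 0 || start > data.length || length > data.length - start) {
            throw new IllegalArgumentException("range lies outside the data");
        }
        int from = (int) start;
        int to = (int) (start + length);
        return Arrays.copyOfRange(data, from, to);
    }

    public static void main(String[] args) {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] part = readRange(data, 3L, 4L);
        // Prints 3456: four bytes beginning at offset 3.
        System.out.println(new String(part, StandardCharsets.US_ASCII));
    }
}
```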

toString

public String toString()
Overrides:
toString in class Object

write

public void write(DataOutput out)
           throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.

Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.

For efficiency, implementations should attempt to re-use storage in the existing object where possible.

Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException
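
The write and readFields contract can be sketched as a round trip through java.io streams. The class below is a hypothetical stand-in, not Hadoop's FileSplit: it stores the path as a String and encodes it with writeUTF, which is an assumption made to keep the example JDK-only, so the byte layout shown should not be read as the real wire format. The roundTrip helper is likewise introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in that mirrors the write/readFields pattern. */
final class SplitRoundTripSketch {
    String file;   // stand-in for the Path held by a real split
    long start;
    long length;

    /** No-arg constructor so an empty instance can be filled by readFields. */
    SplitRoundTripSketch() { }

    SplitRoundTripSketch(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    /** Serializes the fields to out (encoding is an assumption of this sketch). */
    void write(DataOutput out) throws IOException {
        out.writeUTF(file);
        out.writeLong(start);
        out.writeLong(length);
    }

    /** Overwrites the fields of this existing object, in the order they were written. */
    void readFields(DataInput in) throws IOException {
        file = in.readUTF();
        start = in.readLong();
        length = in.readLong();
    }

    /** Writes original to a byte buffer and reads it back into a fresh instance. */
    static SplitRoundTripSketch roundTrip(SplitRoundTripSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            SplitRoundTripSketch copy = new SplitRoundTripSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SplitRoundTripSketch copy = roundTrip(new SplitRoundTripSketch("/data/input.txt", 128L, 64L));
        System.out.println(copy.file + " " + copy.start + " " + copy.length);
    }
}
```

Because readFields assigns into the fields of an existing instance, the same object can be reused across many reads, which is the storage re-use the description above recommends.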

getLocations

public String[] getLocations()
                      throws IOException
Description copied from class: InputSplit
Get the list of nodes by name where the data for the split would be local. The locations do not need to be serialized.

Specified by:
getLocations in class InputSplit
Returns:
a new array of the node names.
Throws:
IOException


Copyright © 2009 The Apache Software Foundation