public class TextInputFormat extends FileInputFormat<LongWritable,Text> implements JobConfigurable
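An InputFormat for plain text files. Files are broken into lines.
Either a linefeed or a carriage return signals the end of a line. Keys are
the byte position of the line in the file, and values are the text of the line.
Nested classes inherited from class FileInputFormat: FileInputFormat.Counter
Fields inherited from class FileInputFormat: LOG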
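As a rough illustration of the key/value pairs described above, the following plain-Java sketch walks a byte array and reports each line together with the byte offset at which it starts. It does not use Hadoop; the class name `LineOffsetSketch` and the method `records` are hypothetical, and treating the pair `\r\n` as a single terminator is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
class LineOffsetSketch {

    // Returns one "offset<TAB>line" string per line in data.
    // A line ends at '\n', at '\r', or at the pair "\r\n" (assumption of this sketch).
    static List<String> records(byte[] data) {
        List<String> out = new ArrayList<>();
        int start = 0; // byte offset where the current line begins
        int i = 0;
        while (i < data.length) {
            byte b = data[i];
            if (b == '\n' || b == '\r') {
                out.add(start + "\t"
                        + new String(data, start, i - start, StandardCharsets.UTF_8));
                // Consume the '\n' of a "\r\n" pair so it is not seen as an empty line.
                if (b == '\r' && i + 1 < data.length && data[i + 1] == '\n') {
                    i++;
                }
                start = i + 1;
            }
            i++;
        }
        // Emit a final line that has no terminator.
        if (start < data.length) {
            out.add(start + "\t"
                    + new String(data, start, data.length - start, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8);
        for (String record : records(data)) {
            System.out.println(record);
        }
    }
}
```

In a real job the offset would arrive as a LongWritable key and the line as a Text value; the strings here only stand in for that pair.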
Constructor and Description |
---|
TextInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | configure(JobConf conf): Initializes a new instance from a JobConf. |
RecordReader<LongWritable,Text> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
protected boolean | isSplitable(FileSystem fs, Path file): Is the given filename splitable? |
Methods inherited from class FileInputFormat: addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
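The effect of isSplitable on how a file is divided can be sketched without Hadoop. The example below is a simplified model, not the logic of FileInputFormat.getSplits: the class `SplitSketch`, the method `splits`, and the fixed `splitSize` parameter are all hypothetical. It shows only the idea that a splitable file is cut into consecutive byte ranges, while a non-splitable file is kept as one range so that a single Mapper sees the whole file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
class SplitSketch {

    // Returns {start, length} pairs that together cover a file of fileLength bytes.
    // When splitable is false the whole file becomes a single range.
    static List<long[]> splits(long fileLength, long splitSize, boolean splitable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> out = new ArrayList<>();
        if (fileLength <= 0) {
            return out; // nothing to cover in this simplified model
        }
        if (!splitable) {
            out.add(new long[] {0, fileLength});
            return out;
        }
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            out.add(new long[] {start, length});
        }
        return out;
    }

    public static void main(String[] args) {
        for (boolean splitable : new boolean[] {true, false}) {
            System.out.println("splitable=" + splitable);
            for (long[] s : splits(250, 100, splitable)) {
                System.out.println("  start=" + s[0] + " length=" + s[1]);
            }
        }
    }
}
```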
public void configure(JobConf conf)
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
conf - the configuration

protected boolean isSplitable(FileSystem fs, Path file)
Description copied from class: FileInputFormat
Is the given filename splitable? FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
Overrides: isSplitable in class FileInputFormat<LongWritable,Text>
Parameters:
fs - the file system that the file is on
file - the file name to check

public RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split, so that it presents a record-oriented view to the individual task.
Specified by: getRecordReader in interface InputFormat<LongWritable,Text>
Specified by: getRecordReader in class FileInputFormat<LongWritable,Text>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
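What "respecting record boundaries" can mean for line records is sketched below in plain Java. This is one common convention, stated here as an assumption and not as a copy of Hadoop's reader: a reader whose split does not start at byte 0 discards everything up to and including the first line terminator (that line belongs to the previous split), and every reader keeps going past the end of its split until it has finished the last line it started. The names `RecordBoundarySketch`, `readSplit`, and `nextLineStart` are hypothetical, and only `\n` is treated as a terminator to keep the sketch short.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
class RecordBoundarySketch {

    // Index just past the next '\n' at or after from, or data.length if there is none.
    static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    // Returns "offset<TAB>line" for every line owned by the split [start, start + length).
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // The line containing (or starting at) 'start' is read by the previous split.
            pos = nextLineStart(data, pos);
        }
        List<String> out = new ArrayList<>();
        // A line that begins at or before 'end' is read in full, even past 'end'.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            boolean terminated = next > pos && data[next - 1] == '\n';
            int lineEnd = terminated ? next - 1 : next;
            out.add(pos + "\t" + new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        int splitSize = 6;
        for (int start = 0; start < data.length; start += splitSize) {
            int length = Math.min(splitSize, data.length - start);
            System.out.println("split start=" + start + " length=" + length
                    + " -> " + readSplit(data, start, length));
        }
    }
}
```

With these two rules each line is emitted by exactly one split, whatever the split size, which is the record-oriented view the description asks for.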
Copyright © 2009 The Apache Software Foundation