public class DBInputFormat<T extends DBWritable> extends Object implements InputFormat<LongWritable,T>, JobConfigurable
DBInputFormat emits LongWritables containing the record number as key and DBWritables as value. The SQL query and input class can be set using one of the two setInput methods.
Modifier and Type | Class and Description |
---|---|
protected static class | DBInputFormat.DBInputSplit: An InputSplit that spans a set of rows. |
protected class | DBInputFormat.DBRecordReader: A RecordReader that reads records from a SQL table. |
static class | DBInputFormat.NullDBWritable: A class that does nothing, implementing DBWritable. |
Constructor and Description |
---|
DBInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | configure(JobConf job): Initializes a new instance from a JobConf. |
protected String | getCountQuery(): Returns the query for getting the total number of rows; subclasses can override this for custom behaviour. |
RecordReader<LongWritable,T> | getRecordReader(InputSplit split, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
InputSplit[] | getSplits(JobConf job, int chunks): Logically split the set of input files for the job. |
static void | setInput(JobConf job, Class<? extends DBWritable> inputClass, String inputQuery, String inputCountQuery): Initializes the map-part of the job with the appropriate input settings. |
static void | setInput(JobConf job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String orderBy, String... fieldNames): Initializes the map-part of the job with the appropriate input settings. |
public void configure(JobConf job)
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
job - the configuration
public RecordReader<LongWritable,T> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by: getRecordReader in interface InputFormat<LongWritable,T extends DBWritable>
Parameters:
split - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
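As a rough illustration of the "record number as key" contract, the sketch below pairs each row of a split with a numeric key. It uses plain Java collections rather than Hadoop types, and the class and method names (RecordNumberSketch, keyByRecordNumber) are hypothetical. It also assumes that the record number is the split's starting row plus the position within the split, which this page does not state explicitly.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: shows one way record-number keys could be assigned.
final class RecordNumberSketch {

    /**
     * Pairs every row with a key equal to splitStart plus the row's position in the split.
     * The "splitStart + position" numbering is an assumption made for this sketch.
     */
    static <V> List<Map.Entry<Long, V>> keyByRecordNumber(long splitStart, List<V> rows) {
        List<Map.Entry<Long, V>> keyed = new ArrayList<>();
        long position = 0;
        for (V row : rows) {
            keyed.add(new SimpleEntry<>(Long.valueOf(splitStart + position), row));
            position++;
        }
        return keyed;
    }
}
```

In a real job the value type would be the DBWritable implementation passed to setInput, and the key would be a LongWritable.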
public InputSplit[] getSplits(JobConf job, int chunks) throws IOException
Logically split the set of input files for the job.
Each InputSplit is then assigned to an individual Mapper for processing.
Note: The split is a logical split of the inputs and the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple.
Specified by: getSplits in interface InputFormat<LongWritable,T extends DBWritable>
Parameters:
job - job configuration.
chunks - the desired number of splits, a hint.
Returns: an array of InputSplits for the job.
Throws: IOException
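For a database source, a logical split is naturally a range of rows rather than a byte range in a file. The following is a minimal sketch of how a total row count might be divided into the requested number of chunks. The class and method names are hypothetical, and the "last range absorbs the remainder" rule is an assumption of this sketch, not a documented guarantee.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: divides a row count into logical row ranges.
final class RowRangeSketch {

    /** Returns {start, end} pairs (end exclusive) that together cover rows [0, rowCount). */
    static List<long[]> rowRanges(long rowCount, int chunks) {
        if (rowCount < 0 || chunks < 1) {
            throw new IllegalArgumentException("rowCount must be >= 0 and chunks >= 1");
        }
        long chunkSize = rowCount / chunks;
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            long start = i * chunkSize;
            // Assumption: the last range absorbs the remainder left by integer division.
            long end = (i + 1 == chunks) ? rowCount : start + chunkSize;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] range : rowRanges(10, 3)) {
            System.out.println(range[0] + " to " + range[1]);
        }
    }
}
```

With 10 rows and 3 chunks this yields the ranges 0 to 3, 3 to 6, and 6 to 10. No data is moved when the ranges are computed, which matches the note that splits are logical.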
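protected String getCountQuery()
Returns the query for getting the total number of rows; subclasses can override this for custom behaviour.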
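Because getCountQuery() is protected, a subclass can replace the count query with its own. The sketch below shows that override pattern with a hypothetical stand-in class (TableSource) instead of DBInputFormat itself. The default "SELECT COUNT(*) ... WHERE ..." form is an assumption about what a reasonable default looks like, not a statement of the library's exact output.

```java
// Hypothetical stand-in for a class with an overridable count query; this is not Hadoop code.
class TableSource {
    private final String tableName;
    private final String conditions; // may be null or empty

    TableSource(String tableName, String conditions) {
        this.tableName = tableName;
        this.conditions = conditions;
    }

    String tableName() {
        return tableName;
    }

    /** Assumed default: count every row matching the optional conditions. */
    protected String getCountQuery() {
        StringBuilder query = new StringBuilder("SELECT COUNT(*) FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE ").append(conditions);
        }
        return query.toString();
    }
}

// Custom behaviour: count only rows whose f1 column is non-null.
class NonNullF1Source extends TableSource {
    NonNullF1Source(String tableName) {
        super(tableName, null);
    }

    @Override
    protected String getCountQuery() {
        return "SELECT COUNT(f1) FROM " + tableName();
    }
}
```

Whatever query is returned should yield a single numeric row count, since that count drives how the input is divided into splits.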
public static void setInput(JobConf job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String orderBy, String... fieldNames)
Initializes the map-part of the job with the appropriate input settings.
Parameters:
job - The job
inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
tableName - The table to read data from
conditions - The condition with which to select data, e.g. '(updated > 20070101 AND length > 0)'
orderBy - the fieldNames in the orderBy clause.
fieldNames - The field names in the table
See Also: setInput(JobConf, Class, String, String)
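To make the roles of tableName, conditions, orderBy, and fieldNames concrete, here is a minimal sketch of how those four arguments could be combined into a SELECT statement. The class and method names are hypothetical, and the exact SQL that DBInputFormat generates may differ (for example in quoting, aliasing, or paging clauses).

```java
// Hypothetical helper, not part of Hadoop: assembles a SELECT from table-style input settings.
final class SelectQuerySketch {

    static String selectQuery(String tableName, String conditions, String orderBy,
                              String... fieldNames) {
        StringBuilder query = new StringBuilder("SELECT ");
        // Assumption of this sketch: with no field names given, select every column.
        query.append(fieldNames.length == 0 ? "*" : String.join(", ", fieldNames));
        query.append(" FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE (").append(conditions).append(")");
        }
        if (orderBy != null && !orderBy.isEmpty()) {
            query.append(" ORDER BY ").append(orderBy);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectQuery("Mytable", "updated > 20070101 AND length > 0",
                "f1", "f1", "f2", "f3"));
    }
}
```

The sketch concatenates its arguments directly into SQL, so it is only suitable for trusted, developer-supplied strings such as the job configuration values described above.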
public static void setInput(JobConf job, Class<? extends DBWritable> inputClass, String inputQuery, String inputCountQuery)
Initializes the map-part of the job with the appropriate input settings.
Parameters:
job - The job
inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
inputQuery - the input query to select fields. Example: "SELECT f1, f2, f3 FROM Mytable ORDER BY f1"
inputCountQuery - the input query that returns the number of records in the table. Example: "SELECT COUNT(f1) FROM Mytable"
See Also: setInput(JobConf, Class, String, String, String, String...)
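One way a row range can be applied to a user-supplied inputQuery is to append a paging clause to it. The sketch below does this with LIMIT and OFFSET. The helper name is hypothetical, and the LIMIT/OFFSET syntax is an assumption that holds for databases such as MySQL and PostgreSQL but not for every SQL dialect.

```java
// Hypothetical helper, not part of Hadoop: restricts an input query to one range of rows.
final class PagedQuerySketch {

    /** Appends a LIMIT/OFFSET clause selecting 'length' rows starting at row 'start'. */
    static String pagedQuery(String inputQuery, long start, long length) {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        // Assumption: the target database accepts "LIMIT n OFFSET m" at the end of a SELECT.
        return inputQuery + " LIMIT " + length + " OFFSET " + start;
    }

    public static void main(String[] args) {
        System.out.println(pagedQuery("SELECT f1, f2, f3 FROM Mytable ORDER BY f1", 6, 4));
    }
}
```

Paging of this kind is only repeatable when the query has a stable ordering, which is a likely reason the example inputQuery includes an ORDER BY clause.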
Copyright © 2009 The Apache Software Foundation