Interface | Description |
---|---|
ClusterStory | ClusterStory represents all configurations of a MapReduce cluster, including nodes, network topology, and slot configurations. |
DeepCompare | Classes that implement this interface can deep-compare (for equality only, not order) with another instance. |
HistoryEvent | |
InputDemuxer | InputDemuxer demultiplexes the input files into individual input streams. |
JobHistoryParser | JobHistoryParser defines the interface of a Job History file parser. |
JobStory | JobStory represents the runtime information available for a completed Map-Reduce job. |
JobStoryProducer | JobStoryProducer produces a sequence of JobStory instances. |
Outputter<T> | Interface to output a sequence of objects of type T. |
Class | Description |
---|---|
AbstractClusterStory | AbstractClusterStory provides a partial implementation of ClusterStory by parsing the topology tree. |
CDFPiecewiseLinearRandomGenerator | |
CDFRandomGenerator | An instance of this class generates random values that conform to the embedded LoggedDiscreteCDF. |
ClusterTopologyReader | Reads a JSON-encoded cluster topology and produces the parsed LoggedNetworkTopology object. |
DefaultInputDemuxer | DefaultInputDemuxer acts as a pass-through demuxer. |
DefaultOutputter<T> | The default Outputter that outputs to a plain file. |
DeskewedJobTraceReader | |
Hadoop20JHParser | JobHistoryParser to parse job histories for Hadoop 0.20 (META=1). |
HadoopLogsAnalyzer | Deprecated |
JhCounter | |
JhCounterGroup | |
JhCounters | |
Job20LineHistoryEventEmitter | |
JobBuilder | JobBuilder builds one job. |
JobConfigurationParser | JobConfigurationParser parses the job configuration XML file and extracts configuration properties. |
JobFinishedEvent | Event to record the successful completion of a job. |
JobHistoryParserFactory | JobHistoryParserFactory is a singleton class that attempts to determine the version of a job history file and return a proper parser. |
JobHistoryUtils | |
JobInfoChangeEvent | Event to record changes in the submit and launch time of a job. |
JobInitedEvent | Event to record the initialization of a job. |
JobPriorityChangeEvent | Event to record the change of priority of a job. |
JobStatusChangedEvent | Event to record the change of status for a job. |
JobSubmittedEvent | Event to record the submission of a job. |
JobTraceReader | Reads JSON-encoded job traces and produces LoggedJob instances. |
JobUnsuccessfulCompletionEvent | Event to record the Failed or Killed completion of a job. |
JsonObjectMapperWriter<T> | Simple wrapper around JsonGenerator to write objects in JSON format. |
LoggedDiscreteCDF | A LoggedDiscreteCDF is a discrete approximation of a cumulative distribution function, with this class set up to meet the requirements of the Jackson JSON parser/generator. |
LoggedJob | A LoggedJob is a representation of a Hadoop job, with the details of this class set up to meet the requirements of the Jackson JSON parser/generator. |
LoggedLocation | A LoggedLocation is a representation of a point in a hierarchical network, represented as a series of membership names, broadest first. |
LoggedNetworkTopology | A LoggedNetworkTopology represents a tree that in turn represents a hierarchy of hosts. |
LoggedSingleRelativeRanking | A LoggedSingleRelativeRanking represents an X-Y coordinate of a single point in a discrete CDF. |
LoggedTask | A LoggedTask represents a Hadoop task that is part of a Hadoop job. |
LoggedTaskAttempt | A LoggedTaskAttempt represents an attempt to run a Hadoop task in a Hadoop job. |
MachineNode | MachineNode represents the configuration of a cluster node. |
MachineNode.Builder | Builder for a MachineNode object. |
MapAttempt20LineHistoryEventEmitter | |
MapAttemptFinishedEvent | Event to record the successful completion of a map attempt. |
MapTaskAttemptInfo | MapTaskAttemptInfo represents the information with regard to a map task attempt. |
Node | Node represents a node in the cluster topology. |
ParsedJob | This is a wrapper class around LoggedJob. |
ParsedTask | This is a wrapper class around LoggedTask. |
ParsedTaskAttempt | This is a wrapper class around LoggedTaskAttempt. |
Pre21JobHistoryConstants | |
RackNode | RackNode represents a rack node in the cluster topology. |
RandomSeedGenerator | The purpose of this class is to generate new random seeds from a master seed. |
ReduceAttempt20LineHistoryEventEmitter | |
ReduceAttemptFinishedEvent | Event to record the successful completion of a reduce attempt. |
ReduceTaskAttemptInfo | ReduceTaskAttemptInfo represents the information with regard to a reduce task attempt. |
ResourceUsageMetrics | Captures the resource usage metrics. |
RewindableInputStream | A simple wrapper class to make any input stream "rewindable". |
Task20LineHistoryEventEmitter | |
TaskAttempt20LineEventEmitter | |
TaskAttemptFinishedEvent | Event to record successful task completion. |
TaskAttemptInfo | TaskAttemptInfo is a collection of statistics about a particular task attempt, gleaned from the job history of the job. |
TaskAttemptStartedEvent | Event to record the start of a task attempt. |
TaskAttemptUnsuccessfulCompletionEvent | Event to record the unsuccessful (Killed/Failed) completion of task attempts. |
TaskFailedEvent | Event to record the failure of a task. |
TaskFinishedEvent | Event to record the successful completion of a task. |
TaskInfo | |
TaskStartedEvent | Event to record the start of a task. |
TaskUpdatedEvent | Event to record updates to a task. |
TopologyBuilder | Builds the cluster topology. |
TraceBuilder | The main driver of the Rumen parser. |
TreePath | This describes a path from a node to the root. |
ZombieCluster | ZombieCluster rebuilds the cluster topology using the information obtained from job history logs. |
ZombieJob | |
ZombieJobProducer | Produces JobStory instances from a job trace. |
Enum | Description |
---|---|
EventType | |
JobConfPropertyNames | |
JobHistoryParserFactory.VersionDetector | |
LoggedJob.JobPriority | |
LoggedJob.JobType | |
Pre21JobHistoryConstants.Values | This enum contains some of the values commonly used by history log events. |
Exception | Description |
---|---|
DeepInequalityException | We use this exception class in the unit tests, and we do a deep comparison when we run the unit tests. |
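DeepInequalityException pairs with the DeepCompare interface listed above: the exception reports the first point at which two object trees differ. The following is a minimal sketch (not part of the original examples) comparing two parsed LoggedJob instances; the "<root>" label for the starting TreePath is an arbitrary choice.
// a sketch: comparing two LoggedJob object trees for deep equality
LoggedJob job1 = .. // assume two parsed LoggedJob instances here
LoggedJob job2 = ..
try {
  job1.deepCompare(job2, new TreePath(null, "<root>"));
  // reached only if the two jobs are deep-equal
} catch (DeepInequalityException e) {
  // e describes the first point of inequality found
}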
JobConfigurationParser
// An example to parse and filter out the job name
String conf_filename = .. // assume the job configuration filename here

// construct a list of interesting properties
List<String> interestedProperties = new ArrayList<String>();
interestedProperties.add("mapreduce.job.name");

JobConfigurationParser jcp =
  new JobConfigurationParser(interestedProperties);

InputStream in = new FileInputStream(conf_filename);
Properties parsedProperties = jcp.parse(in);
Some of the commonly used interesting properties are enumerated in JobConfPropertyNames. Note that a single JobConfigurationParser instance can be used to parse multiple job configuration files, as sketched below.
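A minimal sketch reusing the jcp parser from the example above; the configuration filenames here are hypothetical.
// a sketch: parsing several job configuration files with one parser
// (the filenames below are hypothetical)
for (String f : new String[] {"job1_conf.xml", "job2_conf.xml"}) {
  InputStream in = new FileInputStream(f);
  try {
    Properties props = jcp.parse(in);
    // ... use the extracted properties of each job
  } finally {
    in.close();
  }
}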
JobHistoryParser
JobHistoryParser defines the interface of a Job History file parser. A concrete parser can be obtained via JobHistoryParserFactory. Note that RewindableInputStream wraps an InputStream to make the input stream rewindable.
// An example to parse a current job history file, i.e. a job history
// file for which the version is known
String filename = .. // assume the job history filename here
InputStream in = new FileInputStream(filename);
HistoryEvent event = null;
JobHistoryParser parser = new CurrentJHParser(in);
event = parser.nextEvent();
// process all the events
while (event != null) {
// ... process the event
event = parser.nextEvent();
}
// close the parser and the underlying stream
parser.close();
JobHistoryParserFactory
provides a
JobHistoryParserFactory.getParser(org.apache.hadoop.tools.rumen.RewindableInputStream)
API to get a parser for parsing the job history file. Note that this
API can be used when the job history version is not known beforehand.
// An example to parse a job history file for which the version is not
// known, i.e. using JobHistoryParserFactory.getParser()
String filename = .. // assume the job history filename here
InputStream in = new FileInputStream(filename);
RewindableInputStream ris = new RewindableInputStream(in);
// JobHistoryParserFactory will check and return a parser that can
// parse the file
JobHistoryParser parser = JobHistoryParserFactory.getParser(ris);
// now use the parser to parse the events
HistoryEvent event = parser.nextEvent();
while (event != null) {
// ... process the event
event = parser.nextEvent();
}
parser.close();
Note:
Create one instance to parse a job history log and close it after use.
TopologyBuilder
TopologyBuilder builds the cluster topology from job history events of type org.apache.hadoop.mapreduce.jobhistory.HistoryEvent. These events can be passed to TopologyBuilder using org.apache.hadoop.tools.rumen.TopologyBuilder#process(org.apache.hadoop.mapreduce.jobhistory.HistoryEvent). A cluster topology can be represented using LoggedNetworkTopology. Once all the job history events are processed, the cluster topology can be obtained using TopologyBuilder.build().
// Building topology for a job history file represented using
// 'filename' and the corresponding configuration file represented
// using 'conf_filename'
String filename = .. // assume the job history filename here
String conf_filename = .. // assume the job configuration filename here
InputStream jobConfInputStream = new FileInputStream(conf_filename);
InputStream jobHistoryInputStream = new FileInputStream(filename);
TopologyBuilder tb = new TopologyBuilder();
// construct a list of interesting properties
List<String> interestingProperties = new ArrayList<String>();
// add the interesting properties here
interestingProperties.add("mapreduce.job.name");
JobConfigurationParser jcp =
new JobConfigurationParser(interestingProperties);
// parse the configuration file
tb.process(jcp.parse(jobConfInputStream));
// read the job history file and pass it to the
// TopologyBuilder.
JobHistoryParser parser = new CurrentJHParser(jobHistoryInputStream);
HistoryEvent e;
// read and process all the job history events
while ((e = parser.nextEvent()) != null) {
tb.process(e);
}
LoggedNetworkTopology topology = tb.build();
JobBuilder
JobBuilder builds one job from job history events. TraceBuilder provides the TraceBuilder.extractJobID(String) API for extracting the job id from job history or job configuration filenames, which can be used for instantiating JobBuilder. JobBuilder generates a LoggedJob object via JobBuilder.build(). See LoggedJob for more details.
// An example to summarize a current job history file 'filename'
// and the corresponding configuration file 'conf_filename'
String filename = .. // assume the job history filename here
String conf_filename = .. // assume the job configuration filename here
InputStream jobConfInputStream = new FileInputStream(conf_filename);
InputStream jobHistoryInputStream = new FileInputStream(filename);
String jobID = TraceBuilder.extractJobID(filename);
JobBuilder jb = new JobBuilder(jobID);
// construct a list of interesting properties
List<String> interestingProperties = new ArrayList<String>();
// add the interesting properties here
interestingProperties.add("mapreduce.job.name");
JobConfigurationParser jcp =
new JobConfigurationParser(interestingProperties);
// parse the configuration file
jb.process(jcp.parse(jobConfInputStream));
// parse the job history file
JobHistoryParser parser = new CurrentJHParser(jobHistoryInputStream);
try {
HistoryEvent e;
// read and process all the job history events
while ((e = parser.nextEvent()) != null) {
jb.process(e);
}
} finally {
parser.close();
}
LoggedJob job = jb.build();
Note:
The order in which the job configuration file and the job history file are parsed is not important. Create one JobBuilder instance per job to parse both the history file and the job configuration.
DefaultOutputter
DefaultOutputter implements Outputter and writes JSON objects in text format to the output file. DefaultOutputter can be initialized with the output filename.
// An example to summarize a current job history file represented by
// 'filename' and the configuration filename represented using
// 'conf_filename'. Also output the job summary to 'out.json' along
// with the cluster topology to 'topology.json'.
String filename = .. // assume the job history filename here
String conf_filename = .. // assume the job configuration filename here
Configuration conf = new Configuration();
DefaultOutputter outputter = new DefaultOutputter();
outputter.init(new Path("out.json"), conf);
InputStream jobConfInputStream = new FileInputStream(conf_filename);
InputStream jobHistoryInputStream = new FileInputStream(filename);
// extract the job-id from the filename
String jobID = TraceBuilder.extractJobID(filename);
JobBuilder jb = new JobBuilder(jobID);
TopologyBuilder tb = new TopologyBuilder();
// construct a list of interesting properties
List<String> interestingProperties = new ArrayList<String>();
// add the interesting properties here
interestingProperties.add("mapreduce.job.name");
JobConfigurationParser jcp =
new JobConfigurationParser(interestingProperties);
// parse the configuration file
tb.process(jcp.parse(jobConfInputStream));
// read the job history file and pass it to the
// TopologyBuilder.
JobHistoryParser parser = new CurrentJHParser(jobHistoryInputStream);
HistoryEvent e;
while ((e = parser.nextEvent()) != null) {
jb.process(e);
tb.process(e);
}
LoggedJob j = jb.build();
// serialize the job summary in json (text) format
outputter.output(j);
// close
outputter.close();
outputter.init(new Path("topology.json"), conf);
// get the cluster topology using TopologyBuilder
LoggedNetworkTopology topology = tb.build();
// serialize the cluster topology in json (text) format
outputter.output(topology);
// close
outputter.close();
JobTraceReader
JobTraceReader reads LoggedJob instances serialized using DefaultOutputter. LoggedJob provides various APIs for extracting job details. The following are the most commonly used ones:

  * LoggedJob.getMapTasks() : Get the map tasks
  * LoggedJob.getReduceTasks() : Get the reduce tasks
  * LoggedJob.getOtherTasks() : Get the setup/cleanup tasks
  * LoggedJob.getOutcome() : Get the job's outcome
  * LoggedJob.getSubmitTime() : Get the job's submit time
  * LoggedJob.getFinishTime() : Get the job's finish time
// An example to read job summaries from a trace file 'out.json'.
Configuration conf = new Configuration();
JobTraceReader reader = new JobTraceReader(new Path("out.json"), conf);
LoggedJob job = reader.getNext();
while (job != null) {
// .... process job level information
for (LoggedTask task : job.getMapTasks()) {
// process all the map tasks in the job
for (LoggedTaskAttempt attempt : task.getAttempts()) {
// process all the map task attempts in the job
}
}
// get the next job
job = reader.getNext();
}
reader.close();
ClusterTopologyReader
ClusterTopologyReader reads the LoggedNetworkTopology serialized using DefaultOutputter. ClusterTopologyReader can be initialized using the serialized topology filename. ClusterTopologyReader.get() can be used to get the LoggedNetworkTopology.
// An example to read the cluster topology from a topology output file
// 'topology.json'
Configuration conf = new Configuration();
ClusterTopologyReader reader =
  new ClusterTopologyReader(new Path("topology.json"), conf);
LoggedNetworkTopology topology = reader.get();
for (LoggedNetworkTopology t : topology.getChildren()) {
  // process the cluster topology
}
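The serialized trace and topology can also be replayed as JobStory objects using ZombieCluster and ZombieJobProducer from the class table above. A hedged sketch, assuming the ZombieCluster(Path, MachineNode, Configuration) and ZombieJobProducer(Path, ZombieCluster, Configuration) constructors; the default MachineNode name, level, and slot counts below are illustrative values.
// a sketch: producing JobStory objects from the trace 'out.json'
// and the topology 'topology.json'
Configuration conf = new Configuration();
// default node used for hosts missing from the topology
MachineNode defaultNode = new MachineNode.Builder("default", 2)
    .setMapSlots(2).setReduceSlots(1).build();
ZombieCluster cluster =
  new ZombieCluster(new Path("topology.json"), defaultNode, conf);
JobStoryProducer producer =
  new ZombieJobProducer(new Path("out.json"), cluster, conf);
JobStory job;
while ((job = producer.getNextJob()) != null) {
  // ... use job.getName(), job.getNumberMaps(), etc.
}
producer.close();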
Copyright © 2009 The Apache Software Foundation