org.apache.hadoop.mapred
Interface TaskUmbilicalProtocol

All Superinterfaces:
VersionedProtocol
All Known Implementing Classes:
TaskTracker

public interface TaskUmbilicalProtocol
extends VersionedProtocol

Protocol that a task child process uses to contact its parent process. The parent is a daemon that polls the central master for a new map or reduce task and runs it as a child process. All communication between child and parent goes through this protocol.


Field Summary
static long versionID
          Changed the version to 2, since we have a new method getMapOutputs. Changed version to 3 to have progress() return a boolean. Changed the version to 4, since we have replaced TaskUmbilicalProtocol.progress(String, float, String, org.apache.hadoop.mapred.TaskStatus.Phase, Counters) with statusUpdate(String, TaskStatus). Version 5 changed counters representation for HADOOP-2248. Version 6 changes the TaskStatus representation for HADOOP-2208. Version 7 changes the done api (via HADOOP-3140).
 
Method Summary
 boolean canCommit(TaskAttemptID taskid, org.apache.hadoop.mapred.JvmContext jvmContext)
          Poll to find out whether the task can go ahead with its commit.
 void commitPending(TaskAttemptID taskId, TaskStatus taskStatus, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report that the task is complete, but its commit is pending.
 void done(TaskAttemptID taskid, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report that the task is successfully completed.
 void fatalError(TaskAttemptID taskId, String message, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report that the task encountered a fatal error.
 void fsError(TaskAttemptID taskId, String message, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report that the task encountered a local filesystem error.
 MapTaskCompletionEventsUpdate getMapCompletionEvents(JobID jobId, int fromIndex, int maxLocs, TaskAttemptID id, org.apache.hadoop.mapred.JvmContext jvmContext)
          Called by a reduce task to get the map output locations for finished maps.
 JvmTask getTask(org.apache.hadoop.mapred.JvmContext context)
          Called when a child task process starts, to get its task.
 boolean ping(TaskAttemptID taskid, org.apache.hadoop.mapred.JvmContext jvmContext)
          Periodically called by child to check if parent is still alive.
 void reportDiagnosticInfo(TaskAttemptID taskid, String trace, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report error messages back to parent.
 void reportNextRecordRange(TaskAttemptID taskid, org.apache.hadoop.mapred.SortedRanges.Range range, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report the range of records that the task is going to process next.
 void shuffleError(TaskAttemptID taskId, String message, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report that a reduce-task couldn't shuffle map-outputs.
 boolean statusUpdate(TaskAttemptID taskId, TaskStatus taskStatus, org.apache.hadoop.mapred.JvmContext jvmContext)
          Report child's progress to parent.
 void updatePrivateDistributedCacheSizes(JobID jobId, long[] sizes)
          The job initializer needs to report the sizes of the archive objects and directories in the private distributed cache.
 
Methods inherited from interface org.apache.hadoop.ipc.VersionedProtocol
getProtocolVersion
 

Field Detail

versionID

static final long versionID
Version history:
Version 2: added the new method getMapOutputs.
Version 3: progress() returns a boolean.
Version 4: replaced TaskUmbilicalProtocol.progress(String, float, String, org.apache.hadoop.mapred.TaskStatus.Phase, Counters) with statusUpdate(String, TaskStatus).
Version 5: changed the counters representation for HADOOP-2248.
Version 6: changed the TaskStatus representation for HADOOP-2208.
Version 7: changed the done API (via HADOOP-3140). It now expects whether or not the task's output needs to be promoted.
Version 8: changed {job|tip|task} ids to use their corresponding objects rather than strings.
Version 9: changed the counter representation for HADOOP-1915.
Version 10: changed the TaskStatus format and added reportNextRecordRange for HADOOP-153.
Version 11: added RPCs for task commit as part of HADOOP-3150.
Version 12: getMapCompletionEvents() now also indicates whether the events are stale. Hence the return type is a class that encapsulates the events and whether to reset the events index.
Version 13: changed the getTask method signature for HADOOP-249.
Version 14: changed the getTask method signature for HADOOP-4232.
Version 15: added FAILED_UNCLEAN and KILLED_UNCLEAN states for HADOOP-4759.
Version 16: added numRequiredSlots to TaskStatus for MAPREDUCE-516.
Version 17: changed the signature of getTask() for HADOOP-5488.
Version 18: added fatalError for the child to communicate fatal errors to the TT.
Version 19: added jvmContext to most method signatures for MAPREDUCE-2429.
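
As a rough illustration of why the version number matters, the sketch below compares the version a child was built against with the version a parent reports. The types VersionedEndpointSketch and VersionCheckSketch are hypothetical stand-ins, not Hadoop classes, and the constant 19 is an assumption taken from the last entry in the history above. The method shape is assumed to mirror the inherited getProtocolVersion, with its checked exception omitted for brevity.

```java
// Hypothetical stand-in for a versioned RPC endpoint; not the Hadoop interface.
// The checked exception of the real call is omitted to keep the sketch short.
interface VersionedEndpointSketch {
    long getProtocolVersion(String protocol, long clientVersion);
}

final class VersionCheckSketch {
    // Assumed value: the last version listed in the history above.
    static final long CLIENT_VERSION = 19L;

    // True when the parent reports the same protocol version the child was built with.
    static boolean versionsMatch(VersionedEndpointSketch parent, String protocolName) {
        return parent.getProtocolVersion(protocolName, CLIENT_VERSION) == CLIENT_VERSION;
    }

    public static void main(String[] args) {
        VersionedEndpointSketch sameVersion = (protocol, clientVersion) -> 19L;
        VersionedEndpointSketch olderParent = (protocol, clientVersion) -> 18L;
        System.out.println(versionsMatch(sameVersion, "TaskUmbilicalProtocol"));
        System.out.println(versionsMatch(olderParent, "TaskUmbilicalProtocol"));
    }
}
```

A child that sees a mismatch would normally refuse to proceed, since the method signatures differ between versions.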

See Also:
Constant Field Values
Method Detail

getTask

JvmTask getTask(org.apache.hadoop.mapred.JvmContext context)
                throws IOException
Called when a child task process starts, to get its task.

Parameters:
context - the JvmContext of the JVM w.r.t the TaskTracker that launched it
Returns:
Task object
Throws:
IOException

statusUpdate

boolean statusUpdate(TaskAttemptID taskId,
                     TaskStatus taskStatus,
                     org.apache.hadoop.mapred.JvmContext jvmContext)
                     throws IOException,
                            InterruptedException
Report child's progress to parent.

Parameters:
taskId - task-id of the child
taskStatus - status of the child
jvmContext - the JvmContext of the JVM running the task
Returns:
True if the task is known
Throws:
IOException
InterruptedException

reportDiagnosticInfo

void reportDiagnosticInfo(TaskAttemptID taskid,
                          String trace,
                          org.apache.hadoop.mapred.JvmContext jvmContext)
                          throws IOException
Report error messages back to parent. Calls should be sparing, since all such messages are held in the job tracker.

Parameters:
taskid - the id of the task involved
trace - the text to report
jvmContext - the JvmContext of the JVM running the task
Throws:
IOException

reportNextRecordRange

void reportNextRecordRange(TaskAttemptID taskid,
                           org.apache.hadoop.mapred.SortedRanges.Range range,
                           org.apache.hadoop.mapred.JvmContext jvmContext)
                           throws IOException
Report the range of records that the task is going to process next.

Parameters:
taskid - the id of the task involved
range - the range of record sequence numbers
jvmContext - the JvmContext of the JVM running the task
Throws:
IOException

ping

boolean ping(TaskAttemptID taskid,
             org.apache.hadoop.mapred.JvmContext jvmContext)
             throws IOException
Periodically called by child to check if parent is still alive.

Parameters:
taskid - the id of the task involved
jvmContext - the JvmContext of the JVM running the task
Returns:
True if the task is known
Throws:
IOException
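
Both statusUpdate and ping return whether the parent still knows the task, so a child can treat either call as a heartbeat. The following is a minimal sketch of that idea, not the Hadoop implementation. ParentLinkSketch and HeartbeatSketch are hypothetical names, the task id is simplified to a String, the status to a float, and the JvmContext argument and checked exceptions are left out.

```java
// Hypothetical, simplified view of the two heartbeat-style calls.
interface ParentLinkSketch {
    boolean statusUpdate(String taskId, float progress);
    boolean ping(String taskId);
}

final class HeartbeatSketch {
    // Sends one heartbeat: a full status update when progress changed,
    // otherwise a cheaper ping. Returns whether the parent still knows the task.
    static boolean heartbeat(ParentLinkSketch parent, String taskId,
                             boolean progressChanged, float progress) {
        return progressChanged
                ? parent.statusUpdate(taskId, progress)
                : parent.ping(taskId);
    }

    public static void main(String[] args) {
        ParentLinkSketch parent = new ParentLinkSketch() {
            public boolean statusUpdate(String taskId, float progress) { return true; }
            public boolean ping(String taskId) { return true; }
        };
        // A false result would mean the parent no longer tracks this task,
        // at which point a child would typically stop working.
        System.out.println(heartbeat(parent, "attempt_0", true, 0.25f));
    }
}
```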

done

void done(TaskAttemptID taskid,
          org.apache.hadoop.mapred.JvmContext jvmContext)
          throws IOException
Report that the task is successfully completed. Failure is assumed if the task process exits without calling this.

Parameters:
taskid - task's id
jvmContext - the JvmContext of the JVM running the task
Throws:
IOException

commitPending

void commitPending(TaskAttemptID taskId,
                   TaskStatus taskStatus,
                   org.apache.hadoop.mapred.JvmContext jvmContext)
                   throws IOException,
                          InterruptedException
Report that the task is complete, but its commit is pending.

Parameters:
taskId - task's id
taskStatus - status of the child
jvmContext - the JvmContext of the JVM running the task
Throws:
IOException
InterruptedException

canCommit

boolean canCommit(TaskAttemptID taskid,
                  org.apache.hadoop.mapred.JvmContext jvmContext)
                  throws IOException
Poll to find out whether the task can go ahead with its commit.

Parameters:
taskid - the id of the task involved
jvmContext - the JvmContext of the JVM running the task
Returns:
true if the task can go ahead with its commit, false otherwise
Throws:
IOException
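
Taken together, commitPending, canCommit and done suggest a handshake: announce that a commit is pending, poll until the parent allows it, commit, then report completion. The sketch below shows one way a child might sequence those calls. It is an assumption about usage, not the Hadoop implementation. CommitLinkSketch, CommitSketch, the String task id and the maxPolls bound are all hypothetical, and the JvmContext argument and checked exceptions are omitted.

```java
// Hypothetical, simplified view of the commit-related calls.
interface CommitLinkSketch {
    void commitPending(String taskId);
    boolean canCommit(String taskId);
    void done(String taskId);
}

final class CommitSketch {
    // Announces a pending commit, polls canCommit up to maxPolls times,
    // and on approval runs the commit action and reports done.
    // Returns true if the task committed, false if approval never came.
    static boolean commitWhenAllowed(CommitLinkSketch parent, String taskId,
                                     Runnable commitAction, int maxPolls) {
        parent.commitPending(taskId);
        for (int i = 0; i < maxPolls; i++) {
            if (parent.canCommit(taskId)) {
                commitAction.run();
                parent.done(taskId);
                return true;
            }
            // A real child would wait between polls; omitted in this sketch.
        }
        return false;
    }

    public static void main(String[] args) {
        CommitLinkSketch parent = new CommitLinkSketch() {
            public void commitPending(String taskId) { }
            public boolean canCommit(String taskId) { return true; }
            public void done(String taskId) { }
        };
        System.out.println(commitWhenAllowed(parent, "attempt_0", () -> { }, 3));
    }
}
```

Bounding the number of polls is a choice made for the sketch so that it always terminates.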

shuffleError

void shuffleError(TaskAttemptID taskId,
                  String message,
                  org.apache.hadoop.mapred.JvmContext jvmContext)
                  throws IOException
Report that a reduce-task couldn't shuffle map-outputs.

Throws:
IOException

fsError

void fsError(TaskAttemptID taskId,
             String message,
             org.apache.hadoop.mapred.JvmContext jvmContext)
             throws IOException
Report that the task encountered a local filesystem error.

Throws:
IOException

fatalError

void fatalError(TaskAttemptID taskId,
                String message,
                org.apache.hadoop.mapred.JvmContext jvmContext)
                throws IOException
Report that the task encountered a fatal error.

Throws:
IOException

getMapCompletionEvents

MapTaskCompletionEventsUpdate getMapCompletionEvents(JobID jobId,
                                                     int fromIndex,
                                                     int maxLocs,
                                                     TaskAttemptID id,
                                                     org.apache.hadoop.mapred.JvmContext jvmContext)
                                                     throws IOException
Called by a reduce task to get the map output locations for finished maps. Returns an update built around the map task completion events. The update also indicates whether the copy of the events held at the task tracker has changed, in which case the child process needs to reset its events index.

Parameters:
jobId - the reducer job id
fromIndex - the index starting from which the locations should be fetched
maxLocs - the max number of locations to fetch
id - The attempt id of the task that is trying to communicate
Returns:
A MapTaskCompletionEventsUpdate
Throws:
IOException
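
The fromIndex parameter and the reset indication imply that a reduce task keeps a cursor into the event list and rewinds it when told its copy is stale. The sketch below models that bookkeeping under stated assumptions. EventsUpdateSketch is a hypothetical stand-in for MapTaskCompletionEventsUpdate, events are simplified to strings, and the field names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the returned update: a batch of events plus a reset flag.
final class EventsUpdateSketch {
    final List<String> events;
    final boolean reset;

    EventsUpdateSketch(List<String> events, boolean reset) {
        this.events = events;
        this.reset = reset;
    }
}

final class EventCursorSketch {
    private int fromIndex = 0;                 // next index to request
    private final List<String> known = new ArrayList<>();

    // Applies one update: on reset, forget everything and rewind to index 0,
    // then record the new events and advance the cursor past them.
    void apply(EventsUpdateSketch update) {
        if (update.reset) {
            fromIndex = 0;
            known.clear();
        }
        known.addAll(update.events);
        fromIndex += update.events.size();
    }

    int nextFromIndex() { return fromIndex; }

    List<String> known() { return known; }

    public static void main(String[] args) {
        EventCursorSketch cursor = new EventCursorSketch();
        cursor.apply(new EventsUpdateSketch(Arrays.asList("map_0", "map_1"), false));
        cursor.apply(new EventsUpdateSketch(Arrays.asList("map_0_retry"), true));
        System.out.println(cursor.nextFromIndex() + " " + cursor.known());
    }
}
```

The value of nextFromIndex() is what the child would pass as fromIndex on its next call.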

updatePrivateDistributedCacheSizes

void updatePrivateDistributedCacheSizes(JobID jobId,
                                        long[] sizes)
                                        throws IOException
The job initializer needs to report the sizes of the archive objects and directories in the private distributed cache.

Parameters:
jobId - the job to update
sizes - the array of sizes that were computed
Throws:
IOException


Copyright © 2009 The Apache Software Foundation