The Java EE 7 Tutorial
55.4 Using the Job Specification Language
The Job Specification Language (JSL) enables you to define the steps in a job and their execution order using an XML file. The following example shows how to define a simple job that contains one chunk step and one task step:
<job id="loganalysis" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0"> <properties> <property name="input_file" value="input1.txt"/> <property name="output_file" value="output2.txt"/> </properties> <step id="logprocessor" next="cleanup"> <chunk checkpoint-policy="item" item-count="10"> <reader ref="com.example.pkg.LogItemReader"></reader> <processor ref="com.example.pkg.LogItemProcessor"></processor> <writer ref="com.example.pkg.LogItemWriter"></writer> </chunk> </step> <step id="cleanup"> <batchlet ref="com.example.pkg.CleanUp"></batchlet> <end on="COMPLETED"/> </step> </job>
This example defines the loganalysis
batch job, which consists of the logprocessor
chunk step and the cleanup
task step. The logprocessor
step transitions to the cleanup
step, which terminates the job when completed.
The job
element defines two properties, input_file
and output_file
. Specifying properties in this manner enables you to run a batch job with different configuration parameters without having to recompile its Java batch artifacts. The batch artifacts can access these properties using the context objects from the batch runtime.
The logprocessor
step is a chunk step that specifies batch artifacts for the reader (LogItemReader
), the processor (LogItemProcessor
), and the writer (LogItemWriter
). This step creates a checkpoint for every ten items processed.
The cleanup
step is a task step that specifies the CleanUp
class as its batch artifact. The job terminates when this step completes.
The following sections describe the elements of the Job Specification Language (JSL) in more detail and show the most common attributes and child elements.
55.4.1 The job Element
The job
element is always the top-level element in a job definition file. Its main attributes are id
and restartable
. The job
element can contain one properties
element and zero or more of each of the following elements: listener
, step
, flow
, and split
. For example:
<job id="jobname" restartable="true"> <listeners> <listener ref="com.example.pkg.ListenerBatchArtifact"/> </listeners> <properties> <property name="propertyName1" value="propertyValue1"/> <property name="propertyName2" value="propertyValue2"/> </properties> <step ...> ... </step> <step ...> ... </step> <decision ...> ... </decision> <flow ...> ... </flow> <split ...> ... </split> </job>
The listener
element specifies a batch artifact whose methods are invoked before and after the execution of the job. The batch artifact is an implementation of the javax.batch.api.listener.JobListener
interface. See The Listener Batch Artifacts for an example of a job listener implementation.
The first step
, flow
, or split
element inside the job
element executes first.
55.4.2 The step Element
The step
element can be a child of the job
and flow
elements. Its main attributes are id
and next
. The step
element can contain the following elements.
-
One
chunk
element for chunk-oriented steps or onebatchlet
element for task-oriented steps. -
One
properties
element (optional).This element specifies a set of properties that batch artifacts can access using batch context objects.
-
One
listener
element (optional); onelisteners
element if more than one listener is specified.This element specifies listener artifacts that intercept various phases of step execution.
For chunk steps, the batch artifacts for these listeners can be implementations of the following interfaces:
StepListener
,ItemReadListener
,ItemProcessListener
,ItemWriteListener
,ChunkListener
,RetryReadListener
,RetryProcessListener
,RetryWriteListener
,SkipReadListener
,SkipProcessListener
, andSkipWriteListener
.For task steps, the batch artifact for these listeners must be an implementation of the
StepListener
interface.See The Listener Batch Artifacts for an example of an item processor listener implementation.
-
One
partition
element (optional).This element is used in partitioned steps which execute in more than one thread.
-
One
end
element if this is the last step in a job.This element sets the batch status to
COMPLETED
. -
One
stop
element (optional) to stop a job at this step.This element sets the batch status to
STOPPED
. -
One
fail
element (optional) to terminate a job at this step.This element sets the batch status to
FAILED
. -
One or more
next
elements if thenext
attribute is not specified.This element is associated with an exit status and refers to another step, a flow, a split, or a decision element.
The following is an example of a chunk step:
<step id="stepA" next="stepB"> <properties> ... </properties> <listeners> <listener ref="MyItemReadListenerImpl"/> ... </listeners> <chunk ...> ... </chunk> <partition> ... </partition> <end on="COMPLETED" exit-status="MY_COMPLETED_EXIT_STATUS"/> <stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step0"/> <fail on="MY_ERROR_EXIT_STATUS" exit-status="MY_ERROR_EXIT_STATUS"/> </step>
The following is an example of a task step:
<step id="stepB" next="stepC"> <batchlet ...> ... </batchlet> <properties> ... </properties> <listener ref="MyStepListenerImpl"/> </step>
55.4.2.1 The chunk Element
The chunk
element is a child of the step
element for chunk-oriented steps. The attributes of this element are listed in Table 55-2.
Table 55-2 Attributes of the chunk Element
Attribute Name | Description | Default Value |
---|---|---|
|
Specifies how to commit the results of processing each chunk:
The checkpoint is updated when the results of a chunk are committed. Every chunk is processed in a global Java EE transaction. If the processing of one item in the chunk fails, the transaction is rolled back and no processed items from this chunk are stored. |
|
|
Specifies the number of items to process before committing the chunk and taking a checkpoint. |
10 |
|
Specifies the number of seconds before committing the chunk and taking a checkpoint when If |
0 (no limit) |
|
Specifies if processed items are buffered until it is time to take a checkpoint. If true, a single call to the item writer is made with a list of the buffered items before committing the chunk and taking a checkpoint. |
true |
|
Specifies the number of skippable exceptions to skip in this step during chunk processing. Skippable exception classes are specified with the |
No limit |
|
Specifies the number of attempts to execute this step if retryable exceptions occur. Retryable exception classes are specified with the |
No limit |
The chunk
element can contain the following elements.
-
One
reader
element.This element specifies a batch artifact that implements the
ItemReader
interface. -
One
processor
element.This element specifies a batch artifact that implements the
ItemProcessor
interface. -
One
writer
element.This element specifies a batch artifact that implements the
ItemWriter
interface. -
One
checkpoint-algorithm
element (optional).This element specifies a batch artifact that implements the
CheckpointAlgorithm
interface and provides a custom checkpoint policy. -
One
skippable-exception-classes
element (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing should skip. The
skip-limit
attribute from thechunk
element specifies the maximum number of skipped exceptions. -
One
retryable-exception-classes
element (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing will retry. The
retry-limit
attribute from thechunk
element specifies the maximum number of attempts. -
One
no-rollback-exception-classes
element (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that should not cause the batch runtime to roll back the current chunk, but to retry the current operation without a rollback instead.
For exception types not specified in this element, the current chunk is rolled back by default when an exception occurs.
The following is an example of a chunk-oriented step:
<step id="stepC" next="stepD"> <chunk checkpoint-policy="item" item-count="5" time-limit="180" buffer-items="true" skip-limit="10" retry-limit="3"> <reader ref="pkg.MyItemReaderImpl"></reader> <processor ref="pkg.MyItemProcessorImpl"></processor> <writer ref="pkg.MyItemWriterImpl"></writer> <skippable-exception-classes> <include class="pkg.MyItemException"/> <exclude class="pkg.MyItemSeriousSubException"/> </skippable-exception-classes> <retryable-exception-classes> <include class="pkg.MyResourceTempUnavailable"/> </retryable-exception-classes> </chunk> </step>
This example defines a chunk step and specifies its reader, processor, and writer artifacts. The step updates a checkpoint and commits each chunk after processing five items. It skips all MyItemException
exceptions and all its subtypes, except for MyItemSeriousSubException
, up to a maximum of ten skipped exceptions. The step retries a chunk when a MyResourceTempUnavailable
exception occurs, up to a maximum of three attempts.
55.4.2.2 The batchlet Element
The batchlet
element is a child of the step
element for task-oriented steps. This element only has the ref
attribute, which specifies a batch artifact that implements the Batchlet
interface. The batch
element can contain a properties
element.
The following is an example of a task-oriented step:
<step id="stepD" next="stepE"> <batchlet ref="pkg.MyBatchletImpl"> <properties> <property name="pname" value="pvalue"/> </properties> </batchlet> </step>
This example defines a batch step and specifies its batch artifact.
55.4.2.3 The partition Element
The partition
element is a child of the step
element. It indicates that a step is partitioned. Most partitioned steps are chunk steps where the processing of each item does not depend on the results of processing previous items. You specify the number of partitions in a step and provide each partition with specific information on which items to process, such as the following.
-
A range of items. For example, partition 1 processes items 1 through 500, and partition 2 processes items 501 through 1000.
-
An input source. For example, partition 1 processes the items in
input1.txt
and partition 2 processes the items ininput2.txt
.
When the number of partitions, the number of items, and the input sources for a partitioned step are known at development or deployment time, you can use partition properties in the job definition file to specify partition-specific information and access these properties from the step batch artifacts. The runtime creates as many instances of the step batch artifacts (reader, processor, and writer) as partitions, and each artifact instance receives the properties specific to its partition.
In most cases, the number of partitions, the number of items, or the input sources for a partitioned step can only be determined at runtime. Instead of specifying partition-specific properties statically in the job definition file, you provide a batch artifact that can access your data sources at runtime and determine how many partitions are needed and what range of items each partition should process. This batch artifact is an implementation of the PartitionMapper
interface. The batch runtime invokes this artifact and then uses the information it provides to instantiate the step batch artifacts (reader, writer, and processor) for each partition and to pass them partition-specific data as parameters.
The rest of this section describes the partition
element in detail and shows two examples of job definition files: one that uses partition properties to specify a range of items for each partition, and one that relies on a PartitionMapper
implementation to determine partition-specific information.
See The Phone Billing Chunk Step in The phonebilling Example Application for a complete example of a partitioned chunk step.
The partition
element can contain the following elements.
-
One
plan
element, if themapper
element is not specified.This element defines the number of partitions, the number of threads, and the properties for each partition in the job definition file. The
plan
element is useful when this information is known at development or deployment time. -
One
mapper
element, if theplan
element is not specified.This element specifies a batch artifact that provides the number of partitions, the number of threads, and the properties for each partition. The batch artifact is an implementation of the
PartitionMapper
interface. You use this option when the information required for each partition is only known at runtime. -
One
reducer
element (optional).This element specifies a batch artifact that receives control when a partitioned step begins, ends, or rolls back. The batch artifact enables you to merge results from different partitions and perform other related operations. The batch artifact is an implementation of the
PartitionReducer
interface. -
One
collector
element (optional).This element specifies a batch artifact that sends intermediary results from each partition to a partition analyzer. The batch artifact sends the intermediary results after each checkpoint for chunk steps and at the end of the step for task steps. The batch artifact is an implementation of the
PartitionCollector
interface. -
One
analyzer
element (optional).This element specifies a batch artifact that analyzes the intermediary results from the partition collector instances. The batch artifact is an implementation of the
PartitionAnalyzer
interface.
The following is an example of a partitioned step using the plan
element:
<step id="stepE" next="stepF"> <chunk> <reader ...></reader> <processor ...></processor> <writer ...></writer> </chunk> <partition> <plan partitions="2" threads="2"> <properties partition="0"> <property name="firstItem" value="0"/> <property name="lastItem" value="500"/> </properties> <properties partition="1"> <property name="firstItem" value="501"/> <property name="lastItem" value="999"/> </properties> </plan> </partition> <reducer ref="MyPartitionReducerImpl"/> <collector ref="MyPartitionCollectorImpl"/> <analyzer ref="MyPartitionAnalyzerImpl"/> </step>
In this example, the plan
element specifies the properties for each partition in the job definition file.
The following example uses a mapper
element instead of a plan
element. The PartitionMapper
implementation dynamically provides the same information as the plan
element provides in the job definition file:
<step id="stepE" next="stepF"> <chunk> <reader ...></reader> <processor ...></processor> <writer ...></writer> </chunk> <partition> <mapper ref="MyPartitionMapperImpl"/> <reducer ref="MyPartitionReducerImpl"/> <collector ref="MyPartitionCollectorImpl"/> <analyzer ref="MyPartitionAnalyzerImpl"/> </partition> </step>
Refer to The phonebilling Example Application for an example implementation of the PartitionMapper
interface.
55.4.3 The flow Element
The flow
element can be a child of the job
, flow
, and split
elements. Its attributes are id
and next
. Flows can transition to flows, steps, splits, and decision elements. The flow
element can contain the following elements:
-
One or more
step
elements -
One or more
flow
elements (optional) -
One or more
split
elements (optional) -
One or more
decision
elements (optional)
The last step
in a flow is the one with no next
attribute or next
element. Steps and other elements in a flow cannot transition to elements outside the flow.
The following is an example of the flow
element:
<flow id="flowA" next="stepE"> <step id="flowAstepA" next="flowAstepB">...</step> <step id="flowAstepB" next="flowAflowC">...</step> <flow id="flowAflowC" next="flowAsplitD">...</flow> <split id="flowAsplitD" next="flowAstepE">...</split> <step id="flowAstepE">...</step> </flow>
This example flow contains three steps, one flow, and one split. The last step does not have the next
attribute. The flow transitions to stepE
when its last step completes.
55.4.4 The split Element
The split
element can be a child of the job
and flow
elements. Its attributes are id
and next
. Splits can transition to splits, steps, flows, and decision elements. The split
element can only contain one or more flow
elements that can only transition to other flow
elements in the split.
The following is an example of a split with three flows that execute concurrently:
<split id="splitA" next="stepB"> <flow id="splitAflowA">...</flow> <flow id="splitAflowB">...</flow> <flow id="splitAflowC">...</flow> </split>
55.4.5 The decision Element
The decision
element can be a child of the job
and flow
elements. Its attributes are id
and next
. Steps, flows, and splits can transition to a decision
element. This element specifies a batch artifact that decides the next step, flow, or split to execute based on information from the execution of the previous step, flow, or split. The batch artifact implements the Decider
interface. The decision
element can contain the following elements.
-
One or more
end
elements (optional).This element sets the batch status to
COMPLETED
. -
One or more
stop
elements (optional).This element sets the batch status to
STOPPED
. -
One or more
fail
elements (optional).This element sets the batch status to
FAILED
. -
One or more
next
elements (optional). -
One
properties
element (optional).
The following is an example of the decider
element:
<decision id="decisionA" ref="MyDeciderImpl"> <fail on="FAILED" exit-status="FAILED_AT_DECIDER"/> <end on="COMPLETED" exit-status="COMPLETED_AT_DECIDER"/> <stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step2"/> </decision>