java.lang.Object
  org.apache.pig.EvalFunc<DataBag>
    org.apache.pig.builtin.COV
public class COV
extends EvalFunc<DataBag>
implements Algebraic
Computes the covariance between sets of data. The returned value is a bag
that contains one tuple for each pairwise combination of the input data sets.
Each tuple holds the two schema names and the covariance between the
corresponding two data sets.
A = load 'input.xml' using PigStorage(':');
B = group A all;
D = foreach B generate group,COV(A.$0,A.$1,A.$2);
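To make the shape of the result concrete, here is a minimal plain-Java sketch that does not use the Pig API. It assumes population covariance (dividing by n) and unordered pairs (i < j); neither is stated on this page. The class name `PairwiseCovSketch`, its methods, and the names `var0`, `var1`, `var2` are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PairwiseCovSketch {

    /** Covariance of two equal-length columns; dividing by n is an assumption. */
    static double covariance(double[] x, double[] y) {
        if (x.length == 0 || x.length != y.length) {
            throw new IllegalArgumentException("columns must be non-empty and of equal length");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Average product of deviations from the two means.
        double acc = 0.0;
        for (int i = 0; i < x.length; i++) {
            acc += (x[i] - meanX) * (y[i] - meanY);
        }
        return acc / x.length;
    }

    /** One row per unordered pair of columns: {nameI, nameJ, covariance}. */
    static List<Object[]> pairwise(String[] names, double[][] columns) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < columns.length; i++) {
            for (int j = i + 1; j < columns.length; j++) {
                rows.add(new Object[] {names[i], names[j], covariance(columns[i], columns[j])});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] names = {"var0", "var1", "var2"}; // hypothetical schema names
        double[][] columns = {{1, 2, 3, 4}, {2, 4, 6, 8}, {4, 3, 2, 1}};
        for (Object[] row : pairwise(names, columns)) {
            System.out.println(row[0] + ", " + row[1] + ", " + row[2]);
        }
    }
}
```

With three input columns, as in the Pig script above, the sketch yields three rows, one per pair of columns.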
| Nested Class Summary | |
|---|---|
| static class | COV.Final |
| static class | COV.Initial |
| static class | COV.Intermed |
| Nested classes/interfaces inherited from class org.apache.pig.EvalFunc |
|---|
| EvalFunc.SchemaType |
| Field Summary | |
|---|---|
| protected Vector<String> | schemaName |
| Fields inherited from class org.apache.pig.EvalFunc |
|---|
| log, pigLogger, reporter, returnType |
| Constructor Summary | |
|---|---|
| COV() | |
| COV(String... schemaName) | |
| Method Summary | | |
|---|---|---|
| protected static Tuple | combine(DataBag values) | Combines the results computed on different data chunks. |
| protected static Tuple | computeAll(DataBag first, DataBag second) | Computes sum(XY), sum(X), and sum(Y) from the given data sets. |
| DataBag | exec(Tuple input) | Computes the covariance between data sets. |
| String | getFinal() | Get the final function. |
| String | getInitial() | Get the initial function. |
| String | getIntermed() | Get the intermediate function. |
| Schema | outputSchema(Schema input) | Report the schema of the output of this UDF. |
| String | toString() | Returns the constructor arguments as a string. |
| Methods inherited from class org.apache.pig.EvalFunc |
|---|
| finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
protected Vector<String> schemaName
| Constructor Detail |
|---|
public COV()
public COV(String... schemaName)
| Method Detail |
|---|
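public DataBag exec(Tuple input)
throws IOException
Computes the covariance between data sets.
Specified by: exec in class EvalFunc<DataBag>
Parameters: input - input tuple which contains data sets.
Throws: IOException

public String toString()
Returns the constructor arguments as a string.
Overrides: toString in class Object

public String getInitial()
Description copied from interface: Algebraic
Get the initial function.
Specified by: getInitial in interface Algebraic

public String getIntermed()
Description copied from interface: Algebraic
Get the intermediate function.
Specified by: getIntermed in interface Algebraic

public String getFinal()
Description copied from interface: Algebraic
Get the final function.
Specified by: getFinal in interface Algebraic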

protected static Tuple combine(DataBag values)
throws IOException
Parameters: values - DataBag containing partial results computed on different data chunks
Throws: IOException
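
As an illustration of what combining partial results can look like, the sketch below adds up per-chunk sums field by field. It is plain Java and not the Pig implementation. The partial-result layout `{sumXY, sumX, sumY, count}` is an assumption made for this example, and `CombineSketch` is a hypothetical name.

```java
import java.util.Arrays;

final class CombineSketch {

    /**
     * Merges partial results from several data chunks.
     * Each partial is assumed to be {sumXY, sumX, sumY, count};
     * sums and counts are additive, so merging is field-wise addition.
     */
    static double[] combine(double[][] partials) {
        double[] total = new double[4];
        for (double[] partial : partials) {
            if (partial.length != total.length) {
                throw new IllegalArgumentException("each partial must have 4 fields");
            }
            for (int k = 0; k < total.length; k++) {
                total[k] += partial[k];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Chunk 1: x = {1, 2}, y = {2, 4}; chunk 2: x = {3, 4}, y = {6, 8}.
        double[][] partials = {{10, 3, 6, 2}, {50, 7, 14, 2}};
        System.out.println(Arrays.toString(combine(partials)));
    }
}
```

Because every field is a plain sum, the merge gives the same totals regardless of how the data was split into chunks, which is what allows the computation to run in Initial, Intermed, and Final stages.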

protected static Tuple computeAll(DataBag first, DataBag second)
throws IOException
Parameters:
first - DataBag containing first data set
second - DataBag containing second data set
Throws: IOException

public Schema outputSchema(Schema input)
Description copied from class: EvalFunc
The default implementation interprets the OutputSchema annotation,
if one is present. Otherwise, it returns null (no known output schema).
Overrides: outputSchema in class EvalFunc<DataBag>
Parameters: input - Schema of the input
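
The three sums that computeAll is described as producing are enough to obtain a covariance once the element count is known. The following plain-Java sketch shows one way to do this, assuming the population formula cov = (n * sum(XY) - sum(X) * sum(Y)) / n^2. This page does not state which normalisation COV uses, and the names `CovFromSumsSketch`, `computeAll`, and `covariance` here are illustrative stand-ins that operate on arrays, not on DataBag.

```java
final class CovFromSumsSketch {

    /** Returns {sum(XY), sum(X), sum(Y)} for two equal-length data sets. */
    static double[] computeAll(double[] first, double[] second) {
        if (first.length != second.length) {
            throw new IllegalArgumentException("data sets must have the same length");
        }
        double sumXY = 0.0;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < first.length; i++) {
            sumXY += first[i] * second[i];
            sumX += first[i];
            sumY += second[i];
        }
        return new double[] {sumXY, sumX, sumY};
    }

    /** Population covariance from the sums (assumed formula, see lead-in). */
    static double covariance(double[] sums, long count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        double n = (double) count;
        return (n * sums[0] - sums[1] * sums[2]) / (n * n);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        double[] sums = computeAll(x, y);
        System.out.println(covariance(sums, x.length));
    }
}
```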