public class LimitPushdownOptimizer extends Object implements Transform
Operator.acceptLimitPushdown()
If RS is only for limiting rows, RSHash counts row with same key separately.
But if RS is for GBY, RSHash should forward all the rows with the same key.
Legend : A(a) --> key A, value a, row A(a)
If each RS in mapper tasks is forwarded rows like this
MAP1(RS) : 40(a)-10(b)-30(c)-10(d)-70(e)-80(f)
MAP2(RS) : 90(g)-80(h)-60(i)-40(j)-30(k)-20(l)
MAP3(RS) : 40(m)-50(n)-30(o)-30(p)-60(q)-70(r)
OBY or GBY makes result like this,
REDUCER : 10(b,d)-20(l)-30(c,k,o,p)-40(a,j,m)-50(n)-60(i,q)-70(e,r)-80(f,h)-90(g)
LIMIT 3 for GBY: 10(b,d)-20(l)-30(c,k,o,p)
LIMIT 3 for OBY: 10(b,d)-20(l)
with the optimization, the amount of shuffling can be reduced, making identical result
For GBY,
MAP1 : 40(a)-10(b)-30(c)-10(d)
MAP2 : 40(j)-30(k)-20(l)
MAP3 : 40(m)-50(n)-30(o)-30(p)
REDUCER : 10(b,d)-20(l)-30(c,k,o,p)-40(a,j,m)-50(n)
LIMIT 3 : 10(b,d)-20(l)-30(c,k,o,p)
For OBY,
MAP1 : 10(b)-30(c)-10(d)
MAP2 : 40(j)-30(k)-20(l)
MAP3 : 40(m)-50(n)-30(o)
REDUCER : 10(b,d)-20(l)-30(c,k,o)-40(j,m)-50(n)
LIMIT 3 : 10(b,d)-20(l)| Constructor and Description |
|---|
LimitPushdownOptimizer() |
public ParseContext transform(ParseContext pctx) throws SemanticException
Transformtransform in interface Transformpctx - input parse contextSemanticExceptionCopyright © 2017 The Apache Software Foundation. All rights reserved.