Introduction

Hive is used both for interactive queries and as part of scripts. The Hive variable substitution mechanism was designed to avoid some of the code that was getting baked into the scripting languages on top of Hive. For example:

$ a=b
$ hive -e " describe $a "

Invocations like this are becoming commonplace. This is frustrating because Hive becomes closely coupled with the scripting language. The Hive startup time of a couple of seconds is also non-trivial when thousands of manipulations are spread across multiple hive -e invocations.

Hive Variables combine the set capability you know and love with some limited yet powerful (evil laugh) substitution ability. For example:

$ bin/hive -hiveconf a=b -e 'set a; set hiveconf:a; \
create table if not exists b (col int); describe ${hiveconf:a}'

Results in:

Hive history file=/tmp/edward/hive_job_log_edward_201011240906_1463048967.txt
a=b
hiveconf:a=b
OK
Time taken: 5.913 seconds
OK
col int
Time taken: 0.754 seconds
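The two examples above differ in who performs the substitution: with double quotes the shell expands $a before Hive starts, while with single quotes the ${hiveconf:a} reference reaches Hive untouched. A minimal sketch, runnable without Hive, of the text each quoting style would hand to hive -e (the names shell_expanded and hive_expanded are illustrative only, not part of Hive):

```shell
a=b

# Double quotes: the shell replaces $a, so Hive would only ever see "describe b".
shell_expanded="describe $a"

# Single quotes: the shell leaves the text alone, so Hive would receive the
# ${hiveconf:a} reference and resolve it itself (given -hiveconf a=b).
hive_expanded='describe ${hiveconf:a}'

printf '%s\n' "$shell_expanded"
printf '%s\n' "$hive_expanded"
```

Keeping the reference in single quotes is what lets the query text stay independent of the calling script.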

Using variables

There are three namespaces for variables: hiveconf, system, and env. hiveconf variables are set as normal:

set x=myvalue

However, they are retrieved using:

${hiveconf:x}
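Variables in the env namespace are read with the same syntax. A hedged sketch, assuming a hypothetical environment variable TARGET_TABLE and that a table named b already exists; when hive is not on the PATH it only prints the command it would have run:

```shell
# TARGET_TABLE is a made-up name for this example, not a Hive setting.
export TARGET_TABLE=b

# Single quotes keep ${env:TARGET_TABLE} away from the shell so Hive resolves it.
query='set env:TARGET_TABLE; describe ${env:TARGET_TABLE}'

if command -v hive >/dev/null 2>&1; then
  hive -e "$query"
else
  printf 'would run: hive -e %s\n' "$query"
fi
```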

Annotated examples of usage from the test case ql/src/test/queries/clientpositive/set_processor_namespaces.q:

set zzz=5;
-- sets zzz=5
set zzz;

set system:xxx=5;
set system:xxx;
-- sets a system property xxx to 5

set system:yyy=${system:xxx};
set system:yyy;
-- sets yyy to the value of xxx

set go=${hiveconf:zzz};
set go;
-- sets go based on the value of zzz

set hive.variable.substitute=false;
set raw=${hiveconf:zzz};
set raw;
-- disables substitution, so raw is set to the literal text

set hive.variable.substitute=true;

EXPLAIN SELECT * FROM src where key=${hiveconf:zzz};
SELECT * FROM src where key=${hiveconf:zzz};
-- uses a variable in a query

set a=1;
set b=a;
set c=${hiveconf:${hiveconf:b}};
set c;
-- uses nested variables

set jar=../lib/derby.jar;

add file ${hiveconf:jar};
list file;
delete file ${hiveconf:jar};
list file;

Disabling

Variable substitution is on by default. If this causes an issue with an existing script, disable it:

set hive.variable.substitute=false;