Apache Hive v2.0 Released
Categories: BigData
Apache Hive 2.0 was released in Feb 2016. The most significant changes are:
- Hive metadata can be stored in HBase rather than a relational database (alpha)
- long-lived worker processes, i.e. somewhat like Impala (beta)
- built-in support for HPL/SQL - a “procedural SQL” language mostly compatible with Oracle PL/SQL, Teradata BTEQ etc. See the HPL/SQL site
- performance improvements, particularly when using Spark as the back-end execution engine
- web-based admin interface for HiveServer2 daemon
Hive 2.0 is not included in Hortonworks HDP2.4 (released 2016-03-01); perhaps it will be in the next version. It isn’t in the current Cloudera release either.
In related news, Kafka 0.9 has a new Java API and initial support for “native streaming”.
Update 2016-07-03: I’ve added an article on Hive to this site.
Update 2016-10-22: Hive 2.1 is already available and looks impressive: an average 25x speedup for queries from a standard test suite (achieved by internally using the new LLAP framework)! It is included as a “technical preview” in Hortonworks HDP2.5, i.e. it is not yet considered production-ready.