Apache Hive v2.0 Released

Categories: BigData

Apache Hive 2.0 was released in February 2016. The most significant changes are:

  • Hive metadata can be stored in HBase rather than a relational database (alpha)
  • long-lived worker processes (LLAP) - i.e. somewhat like Impala (beta)
  • built-in support for HPL/SQL - a “procedural SQL” language mostly compatible with Oracle PL/SQL, Teradata BTEQ etc. See the HPL/SQL site for details
  • performance improvements, particularly when using Spark as the back-end execution engine
  • web-based admin interface for the HiveServer2 daemon

Hive 2.0 is not included in Hortonworks HDP2.4 (released 2016-03-01) - maybe it will be in the next version. It isn’t in the current Cloudera release either.

In related news, Kafka 0.9 has a new Java API and initial support for “native streaming”.

Update 2016-07-03: I’ve added an article on Hive to this site.

Update 2016-10-22: Hive 2.1 is already available and looks impressive - an average 25x speedup for queries from a standard test-suite (by internally using the new LLAP framework)! It is included as a “technical preview” in Hortonworks HDP2.5 - i.e. not yet considered production-ready.