Hive/Tez tasks with OutOfMemoryError

Categories: BigData

Overview

I recently wrote about a simple Hive query that caused a Hive/Tez Application Master process to consume excessive memory and be killed by YARN. Just a few days later I encountered another Hive command that also caused a crash in Hive/Tez - this time in the task process.

The command itself was just a copy of the contents of one table into another table. The tables were large, but that should not be a problem. The end result was a Tez worker process terminating with OutOfMemoryError, i.e. the heap needed to grow beyond the -Xmx limit specified when the JVM was started. The Application Master retried the task a few times with the same result, and then terminated the whole job.

This failure was very repeatable: the job would run for about 6 minutes, then crash before generating any output files in HDFS. The output looked like this:

>>> Status: Failed
>>>  Vertex failed, vertexName=Map 1, vertexId=vertex_12345_0123_1_00, diagnostics=[Task failed, taskId=task_12345_0123_1_00_000282,
>>>    diagnostics=[TaskAttempt 0 failed, info=[Error:   Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
>>>        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159)

>>>  Caused by: java.lang.OutOfMemoryError: Java heap space
>>>        at org.apache.hadoop.hive.ql.io.orc.DynamicByteArray.grow(DynamicByteArray.java:64)
>>>        at org.apache.hadoop.hive.ql.io.orc.DynamicByteArray.readAll(DynamicByteArray.java:142)

Eventually I set the option hive.tez.container.size to a large value (32GB), which let the process run long enough to produce a more useful error message:

Number of partitions exceeds limit of 2000
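
The container size can be raised for a single Hive session rather than cluster-wide. A minimal sketch, assuming the value is given in megabytes (so 32GB is written as 32768) and that the cluster's YARN limits allow a container that large:

```sql
-- Raise the Tez container size for this session only; the value is in MB.
SET hive.tez.container.size=32768;

-- Print the effective value to confirm the setting took effect.
SET hive.tez.container.size;
```

This is only a diagnostic step: it gives the task enough headroom to reach the real error, it does not address the cause.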

The problem was that both the source table and the target table were partitioned, and my HiveQL statement accidentally passed the wrong column from the source table as the target table's partition column. Unfortunately the column passed had high cardinality - its value was actually different for each record in the table.
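
To illustrate the kind of mistake involved, here is a sketch of a dynamic-partition insert. The table and column names (events_src, events_dst, event_id, payload, event_ts, event_date) are hypothetical and not from the real job, and it assumes dynamic partitioning was in use. With a dynamic-partition insert, Hive takes the partition value by position from the last column of the SELECT list, so a wrong trailing column silently becomes the partition key:

```sql
-- Hypothetical target: events_dst (event_id STRING, payload STRING)
--                      PARTITIONED BY (event_date STRING)
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Mistake: event_ts is (nearly) unique per row, yet it is the last SELECT
-- column, so every row ends up in its own partition.
INSERT INTO TABLE events_dst PARTITION (event_date)
SELECT event_id, payload, event_ts
FROM events_src;

-- Intended: the low-cardinality date column is passed as the partition value.
INSERT INTO TABLE events_dst PARTITION (event_date)
SELECT event_id, payload, event_date
FROM events_src;
```

Both statements are valid HiveQL, which is why nothing flags the first one before the tasks start running.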

It appears that as a Tez worker scans its input data, it allocates an in-memory buffer for each output Hive partition it will write records into. This is sensible, but when the number of partitions is extremely large, these buffers can together exceed the heap and trigger OutOfMemoryError.
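
Hive has a setting aimed at this buffering pattern, which I did not use in the run described here. As a sketch, assuming a Hive version that supports it (0.13 or later), hive.optimize.sort.dynamic.partition sorts rows by the dynamic partition column so that each reducer keeps only one record writer open at a time:

```sql
-- Sort rows by the dynamic partition column before writing, so a reducer
-- holds one open writer at a time instead of one per partition value.
SET hive.optimize.sort.dynamic.partition=true;
```

This reduces memory pressure for legitimately large partition counts. It does not fix a statement that passes the wrong partition column; at best the job fails on the partition limit instead of running out of heap.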

In short: if you get OutOfMemoryError while running a simple INSERT INTO command in Hive, check the partitioning!
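
One cheap way to check is to count how many partitions the statement would create before running it. A sketch, where events_src and event_ts are hypothetical stand-ins for the source table and for whichever column is being passed as the partition value:

```sql
-- How many distinct partition values would the insert produce?
SELECT COUNT(DISTINCT event_ts) AS partitions_to_create
FROM events_src;

-- Print the configured dynamic-partition limits for comparison.
SET hive.exec.max.dynamic.partitions;
SET hive.exec.max.dynamic.partitions.pernode;
```

If the count is anywhere near the row count of the table, the wrong column is almost certainly being used as the partition key.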