Sunday, 3 March 2013

Hortonworks Data Platform 2.0 (Alpha)

Over the last few days I have been testing Hortonworks Data Platform 2.0 (Alpha). Previously I mainly used Cloudera distributions, but because of this bug in CDH 4.1.3 I wanted to try alternatives, and I chose HDP.


Note: In practice this bug makes RCFILE pointless with hive-0.9.0, because Hive does not apply column pruning at all. It now seems that the problem is in hive-0.9.0 itself.

Unfortunately there is a bug in HDP 2.0 as well, although it is less serious. When Ambari is used for automated installation, it can fail with an "Oozie test Fails" message or, if Oozie is not selected, with a "Hive/HCatalog test Fails" message, and the deployment log shows the following error:

 "\"Sun Mar 03 21:38:03 +0100 2013 /Stage[2]/Hdp2-hive::Hive::Service_check/Exec[/tmp/hiveSmoke.sh]/returns (notice): FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(MetaException(message:Could not connect to meta store using any of the URIs provided))\"",

I searched for that message and found this thread, which mentions that a similar error can be caused by setting the MYSQL host instead of leaving it blank.

I ran many installations to test this, and it is true: if you specify the MYSQL host, even if you specify it correctly, the installation always fails. The workaround is easy, though: just leave the MYSQL host field empty.
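When debugging a "Could not connect to meta store using any of the URIs provided" failure, it can help to check whether the metastore endpoints are reachable at all. Below is a minimal sketch in Python, assuming the URIs have the thrift://host:port form used by the hive.metastore.uris property. The function names, the example host name and the fallback port 9083 are my own assumptions for illustration and are not part of Ambari or HDP.

```python
import socket
from urllib.parse import urlparse

# Assumed fallback port when a URI does not specify one (hypothetical default).
DEFAULT_PORT = 9083


def parse_metastore_uris(value):
    """Split a comma-separated list of thrift:// URIs into (host, port) pairs."""
    endpoints = []
    for raw in value.split(","):
        raw = raw.strip()
        if not raw:
            continue  # skip empty entries such as a trailing comma
        parsed = urlparse(raw)
        if parsed.scheme != "thrift" or not parsed.hostname:
            raise ValueError(f"not a thrift:// URI: {raw!r}")
        endpoints.append((parsed.hostname, parsed.port or DEFAULT_PORT))
    return endpoints


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical value; replace with the one from your own hive-site.xml.
    uris = "thrift://metastore.example.com:9083"
    for host, port in parse_metastore_uris(uris):
        state = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

This only tests that a TCP connection can be opened. It does not verify that the metastore can actually talk to its MYSQL database, so a "reachable" result narrows the problem down but does not rule it out.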

Note: I really like the Hortonworks approach to installation, configuration file handling and operation compared to Cloudera's, but I also miss some features, such as decommissioning nodes and changing node roles (datanode, tasktracker).