Monday, March 18, 2019

Standalone Spark and Spark Thrift Server

Install

su - spark # or spark service user

cd /var/tmp/share
curl --retry 3 -C - -O https://archive.apache.org/dist/spark/spark-2.3.3/spark-2.3.3-bin-hadoop2.7.tgz

mkdir -p /usr/local/apache-spark
tar -xf /var/tmp/share/spark-2.3.3-bin-hadoop2.7.tgz -C /usr/local/apache-spark/

Setup / Start

su - spark # or spark service user

alternatives --display java | grep currently
 link currently points to /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/bin/java

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre
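
Instead of copying the versioned JDK path by hand, JAVA_HOME can probably be derived from whatever java currently resolves to. A minimal sketch, assuming GNU readlink -f is available and that the resolved binary sits at <home>/bin/java as in the path above; the function name java_home_from is made up for illustration.

```shell
# Hypothetical helper: resolve symlinks on a java binary and print the
# directory two levels up (assumes the layout <JAVA_HOME>/bin/java).
java_home_from() {
    java_bin=$(readlink -f "$1") || return 1
    dirname "$(dirname "$java_bin")"
}

# Only export when java is actually on PATH for this user.
if command -v java >/dev/null 2>&1; then
    export JAVA_HOME="$(java_home_from "$(command -v java)")"
fi
```

With the alternatives link shown above this should yield the same .../jre directory, and it keeps working after a JDK package update changes the versioned path.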

/usr/local/apache-spark/spark-2.3.3-bin-hadoop2.7/sbin/start-thriftserver.sh --hiveconf hive.metastore.warehouse.dir=/user/hive/warehouse --hiveconf hive.server2.thrift.port=10000 --executor-memory 2g
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /usr/local/apache-spark/spark-2.3.3-bin-hadoop2.7/logs/spark-atscale-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-spark.standalone.localdomain.out

TEST:

/usr/local/apache-spark/spark-2.3.3-bin-hadoop2.7/bin/beeline -u "jdbc:hive2://localhost:10000/" -e "CREATE DATABASE IF NOT EXISTS default;SHOW DATABASES;"
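
start-thriftserver.sh returns before the server is listening, so running beeline immediately afterwards can fail with a refused connection. A rough sketch of a wait loop, assuming bash (it relies on bash's /dev/tcp redirection); the helper name wait_for_port and the default of 30 tries are illustrative choices, not part of Spark.

```shell
# Hypothetical helper: poll host:port once per second until a TCP connect
# succeeds or the number of tries is used up. Requires bash for /dev/tcp.
wait_for_port() {
    host=$1; port=$2; tries=${3:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # The subshell opens the socket and closes it again when it exits.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage with the port configured above:
# wait_for_port localhost 10000 60 && /usr/local/apache-spark/spark-2.3.3-bin-hadoop2.7/bin/beeline -u "jdbc:hive2://localhost:10000/" -e "SHOW DATABASES;"
```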

NOTE:

To use an existing Derby metastore location, add this option to the start command:
--hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:/usr/local/apache-spark/metastore_db
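
Put together with the start command from the Setup / Start step, the option might be wired in as below. This is only a sketch: the wrapper name start_thrift_with_derby is hypothetical, and SPARK_HOME is assumed to default to the install path used in this post.

```shell
# Assumed default: the install directory used earlier in this post.
SPARK_HOME=${SPARK_HOME:-/usr/local/apache-spark/spark-2.3.3-bin-hadoop2.7}

# Hypothetical wrapper: start the Thrift server against the Derby metastore
# directory passed as $1, keeping the other options from the start command.
start_thrift_with_derby() {
    metastore_dir=$1
    "$SPARK_HOME/sbin/start-thriftserver.sh" \
        --hiveconf hive.metastore.warehouse.dir=/user/hive/warehouse \
        --hiveconf hive.server2.thrift.port=10000 \
        --hiveconf "javax.jdo.option.ConnectionURL=jdbc:derby:$metastore_dir" \
        --executor-memory 2g
}

# Usage:
# start_thrift_with_derby /usr/local/apache-spark/metastore_db
```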

Tuesday, March 5, 2019

Ambari HDP > HDFS NameNode fails to start

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 408, in <module>
    NameNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 141, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 173, in namenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 277, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-node1.sethdesktop.localdomain.out
Error occurred during initialization of VM
GC triggered before VM initialization completed. Try increasing NewSize, current value 192K.

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 408, in <module>
    NameNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 141, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 121, in namenode
    format_namenode()
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 341, in format_namenode
    logoutput=True
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'hdfs --config /usr/hdp/2.6.5.0-292/hadoop/conf namenode -format -nonInteractive' returned 1. Error occurred during initialization of VM
GC triggered before VM initialization completed. Try increasing NewSize, current value 192K.


Ambari 2.7.3 and HDP 2.6.5

As a workaround, add export HADOOP_CLIENT_OPTS="-XX:NewSize=1024k" at the top of the hadoop-env template.
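
For reference, the line from the workaround above, followed by an optional sanity check. The check is my own addition and assumes java is on PATH on the NameNode host; it only confirms that a JVM can start with these options and does not touch HDFS.

```shell
# Line added at the top of the hadoop-env template, as in the workaround above.
export HADOOP_CLIENT_OPTS="-XX:NewSize=1024k"

# Optional check (assumption: java is on PATH). The variable is left unquoted
# on purpose so that multiple options would be split into separate arguments.
if command -v java >/dev/null 2>&1; then
    java $HADOOP_CLIENT_OPTS -version || echo "JVM did not start with: $HADOOP_CLIENT_OPTS"
fi
```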