Thursday, January 26, 2017

HDP 2.5.3: Beeline intermittently fails to connect through Knox

Looking at hiveserver2.log, errors like the following show up:

ERROR [HiveServer2-HttpHandler-Pool: Thread-xxxx]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(209)) - Error: 
org.apache.hive.service.auth.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:407) 
...
at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.reflect.UndeclaredThrowableException 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742) 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:404) 
... 23 more 
Caused by: org.apache.hive.service.auth.HttpAuthenticationException: Authorization header received from the client is empty. 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.getAuthHeader(ThriftHttpServlet.java:548) 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.access$100(ThriftHttpServlet.java:74) 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:449) 
at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:412) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)


Knox in HDP 2.5.0 shipped httpclient-4.5.2.jar, which could cause this same ERROR, but that was supposed to be fixed in 2.5.3!

[root@node7 ~]# lsof -nPp `cat /var/run/knox/gateway.pid` | grep httpclient
java    6637 knox  mem    REG              252,2   736658   1713656 /usr/hdp/2.5.3.0-37/knox/ext/ranger-knox-plugin-impl/httpclient-4.5.2.jar
java    6637 knox  mem    REG              252,2   732765   1713763 /usr/hdp/2.5.3.0-37/knox/dep/httpclient-4.5.1.jar
java    6637 knox  mem    REG              252,2   150895   1713764 /usr/hdp/2.5.3.0-37/knox/dep/httpclient-cache-4.3.6.jar
java    6637 knox  179r   REG              252,2   150895   1713764 /usr/hdp/2.5.3.0-37/knox/dep/httpclient-cache-4.3.6.jar
java    6637 knox  192r   REG              252,2   732765   1713763 /usr/hdp/2.5.3.0-37/knox/dep/httpclient-4.5.1.jar
java    6637 knox  255r   REG              252,2   736658   1713656 /usr/hdp/2.5.3.0-37/knox/ext/ranger-knox-plugin-impl/httpclient-4.5.2.jar

Ranger!!!
Either disabling the Ranger Knox plugin or replacing this jar with the 4.5.1 version has produced no errors so far.
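If you go the jar-replacement route, a minimal sketch (paths taken from the lsof output above; keep a backup and restart Knox afterwards so the new classpath is picked up):

cd /usr/hdp/2.5.3.0-37/knox
mv ext/ranger-knox-plugin-impl/httpclient-4.5.2.jar /root/httpclient-4.5.2.jar.bak    # back it up outside the ext dir
cp -p dep/httpclient-4.5.1.jar ext/ranger-knox-plugin-impl/
chown knox:knox ext/ranger-knox-plugin-impl/httpclient-4.5.1.jar
# then restart Knox (e.g. from Ambari)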

Wednesday, January 25, 2017

Trying the Knox Demo LDAP on HDP 2.5.3 (plus a Beeline connection test)

1) Check whether the Knox demo LDAP is running. If not, start it from Ambari with "Start Demo LDAP".
[knox@node7 ~]$ ps auxwww | grep ldap
knox     13913 16.8  1.5 7471672 247728 ?      Sl   04:11   0:12 /usr/jdk64/jdk1.8.0_77/bin/java -jar /usr/hdp/current/knox-server/bin/ldap.jar /usr/hdp/current/knox-server/conf
...

To start it with curl (via the Ambari API):
curl 'http://sandbox.hortonworks.com:8080/api/v1/clusters/Sandbox/requests' --data '{"RequestInfo":{"context":"Start Demo LDAP","command":"STARTDEMOLDAP"},"Requests/resource_filters":[{"service_name":"KNOX","component_name":"KNOX_GATEWAY","hosts":"sandbox.hortonworks.com"}]}'

2) Check the port so that ldapsearch can be used
[knox@node7 ~]$ lsof -p 13913 | grep LISTEN
java    13913 knox  288u  IPv6          587185915      0t0       TCP *:33389 (LISTEN)

3) Likewise, check the user names and passwords
[knox@node7 ~]$ grep -E '^uid|^userPassword' /etc/knox/conf/users.ldif
uid: guest
userPassword:guest-password
uid: admin
userPassword:admin-password
uid: sam
userPassword:sam-password
uid: tom
userPassword:tom-password

4) Connection test with ldapsearch
[knox@node7 ~]$ ldapsearch -x -h `hostname -f`:33389 -D 'uid=admin,ou=people,dc=hadoop,dc=apache,dc=org' -w admin-password -s sub '(objectclass=person)' uid
# extended LDIF
#
# LDAPv3
# base <> (default) with scope subtree
# filter: (objectclass=person)
# requesting: uid
#

# admin, people, hadoop.apache.org
dn: uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
uid: admin

# guest, people, hadoop.apache.org
dn: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
uid: guest
...

If this fails, check /etc/openldap/ldap.conf (there may be an odd setting in it).
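For example, a quick way to spot non-default settings (TLS_REQCERT, URI, BASE, etc.) that might interfere:

grep -vE '^#|^$' /etc/openldap/ldap.conf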

5) Check whether Beeline can connect as an LDAP user
beeline --verbose
!connect "jdbc:hive2://node7.localdomain:8443/;ssl=true;sslTrustStore=/usr/hdp/current/knox-server/data/security/keystores/gateway.jks;trustStorePassword=hadoop;transportMode=http;httpPath=gateway/default/hive"
Then enter the user name admin and the password admin-password.
Or:
beeline --verbose -u "jdbc:hive2://node7.localdomain:8443/;ssl=true;sslTrustStore=/tmp/myNewTrustStore.jks;trustStorePassword=changeit;transportMode=http;httpPath=gateway/default/hive" -n admin -p admin-password -e 'SELECT from_unixtime(unix_timestamp());'
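The second command assumes a client-side truststore at /tmp/myNewTrustStore.jks. One way to build it is to export Knox's self-signed certificate from gateway.jks and import it into a new truststore; the sketch below assumes the default 'gateway-identity' alias and 'hadoop' as the keystore (master secret) password, matching the trustStorePassword used in the first command:

keytool -exportcert -alias gateway-identity -keystore /usr/hdp/current/knox-server/data/security/keystores/gateway.jks -storepass hadoop -rfc -file /tmp/gateway.crt
keytool -importcert -noprompt -alias knox-gateway -keystore /tmp/myNewTrustStore.jks -storepass changeit -file /tmp/gateway.crt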


Reference 1) HiveServer2 (HTTP + Kerberos)
kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-c6@LAB.HORTONWORKS.NET
beeline --verbose
!connect "jdbc:hive2://node7.localdomain:10001/;transportMode=http;httpPath=cliservice;principal=hive/_HOST@LAB.HORTONWORKS.NET"

Reference 2) ZooKeeper discovery
beeline --verbose
!connect "jdbc:hive2://node7.localdomain:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@LAB.HORTONWORKS.NET;transportMode=http;httpPath=cliservice"

Reference 3) For reference, the LDAP-related part of the Knox topology:
                <provider>
                    <role>authentication</role>
                    <name>ShiroProvider</name>
                    <enabled>true</enabled>
                    <param>
                        <name>sessionTimeout</name>
                        <value>30</value>
                    </param>
                    <param>
                        <name>main.ldapRealm</name>
                        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
                    </param>
                    <param>
                        <name>main.ldapRealm.userDnTemplate</name>
                        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
                    </param>
                    <param>
                        <name>main.ldapRealm.contextFactory.url</name>
                        <value>ldap://{{knox_host_name}}:33389</value>
                    </param>
                    <param>
                        <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
                        <value>simple</value>
                    </param>
                    <param>
                        <name>urls./**</name>
                        <value>authcBasic</value>
                    </param>
                </provider>
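To confirm this LDAP configuration end to end without Beeline, a quick smoke test through the gateway (a sketch, assuming the default topology also exposes WebHDFS):

curl -iku admin:admin-password "https://`hostname -f`:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS"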

Tuesday, January 24, 2017

HDP 2.4.2: Running the Spark example HBaseTest on a Kerberized cluster

# Not needed: ln -sf /etc/hbase/conf/hbase-site.xml /etc/spark/conf/hbase-site.xml
# Does not work: export SPARK_CLASSPATH="`hbase classpath`"

export SPARK_CLASSPATH="$SPARK_CLASSPATH:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar:/etc/hbase/conf/hbase-site.xml"

# Does the ambarismoketest table need to be created first?
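If you do want to pre-create it, a minimal sketch with the hbase shell (the column family name 'family1' is arbitrary; kinit as the smoke user first on a Kerberized cluster):

kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-bne_c1
echo "create 'ambarismoketest', 'family1'" | hbase shell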

spark-submit --verbose --class org.apache.spark.examples.HBaseTest --master yarn-cluster --jars /usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar --files /etc/hbase/conf/hbase-site.xml --keytab /etc/security/keytabs/smokeuser.headless.keytab --principal ambari-qa-bne_c1@HO-UBU02 --conf "spark.driver.extraJavaOptions=-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true" /usr/hdp/current/spark-client/lib/spark-examples-*.jar ambarismoketest

# Also works without --keytab and --principal
kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-bne_c1
spark-submit --verbose --class org.apache.spark.examples.HBaseTest --master yarn-cluster --jars /usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar --files /etc/hbase/conf/hbase-site.xml /usr/hdp/current/spark-client/lib/spark-examples-*.jar ambarismoketest

Note: to get DEBUG output into the YARN application log, /etc/spark/conf/log4j.properties must be changed:
log4j.rootCategory=DEBUG, console (https://issues.apache.org/jira/browse/SPARK-12279) 

Wednesday, January 11, 2017

Trying the Oozie examples/samples on the HDP 2.5 Sandbox

https://oozie.apache.org/docs/4.3.0/DG_Examples.html

1) Check whether the Oozie examples are installed
yum list installed | grep oozie
hue-oozie.x86_64                        2.6.1.2.5.0.0-1245.el6       @HDP-2.5
oozie_2_5_0_0_1245.noarch               4.2.0.2.5.0.0-1245.el6       @HDP-2.5
oozie_2_5_0_0_1245-client.noarch        4.2.0.2.5.0.0-1245.el6       @HDP-2.5

rpm -Vv oozie_2_5_0_0_1245-client | grep example
.........  d /usr/hdp/2.5.0.0-1245/oozie/doc/oozie-examples.tar.gz

1.2) If they have been excluded as shown above, edit /etc/yum.conf and reinstall
sed -i.bak 's/^tsflags=nodocs/#tsflags=nodocs/g' /etc/yum.conf

yum reinstall oozie_*-client
# Verify: tar tvf /usr/hdp/current/oozie-client/doc/oozie-examples.tar.gz

1.3) Restore yum.conf
mv /etc/yum.conf.bak /etc/yum.conf

1.4) Push the examples to HDFS
su - <test user>    # e.g. oozie
tar xvf /usr/hdp/current/oozie-client/doc/oozie-examples.tar.gz
hdfs dfs -put examples examples
hdfs dfs -chmod -R 777 examples/*-data

2) Try running the Hive action
# Create dummy data
[ -s ~/examples/input-data/table/int.txt ] || for i in {1..10}; do echo $i >> ~/examples/input-data/table/int.txt; done
hdfs dfs -put -f ~/examples/input-data/table/int.txt examples/input-data/table/int.txt

# Run it
cd ~/examples/apps/hive

ls -l
total 20
-rw-r--r-- 1 oozie hadoop 1000 Aug 26 03:50 job.properties
-rw-r--r-- 1 oozie hadoop  110 Aug 26 03:50 README
-rw-r--r-- 1 oozie hadoop  966 Aug 26 03:50 script.q
-rw-r--r-- 1 oozie hadoop 2003 Aug 26 03:50 workflow.xml
-rw-r--r-- 1 oozie hadoop 2398 Aug 26 03:50 workflow.xml.security

cat job.properties
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050 # check yarn.resourcemanager.address
queueName=test
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive

oozie job -config ./job.properties -run
Job ID : 0000000-170111083341216-oozie-oozi-W

oozie job -info 0000000-170111083341216-oozie-oozi-W -verbose

# Check
hdfs dfs -ls -R examples/output-data
drwxrwxrwx   - oozie hdfs          0 2017-01-11 09:34 examples/output-data/hive
-rwxrwxrwx   1 oozie hdfs         21 2017-01-11 09:34 examples/output-data/hive/000000_0
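To look at the actual output (what it contains depends on script.q; here it is derived from the dummy integers loaded earlier):

hdfs dfs -cat examples/output-data/hive/000000_0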


3) Try running the Hive2 action
# Create the dummy data beforehand (same as above)

cd ~/examples/apps/hive2

ls -l
total 24
-rw-r--r-- 1 oozie hadoop 1046 Aug 26 03:50 job.properties
-rw-r--r-- 1 oozie hadoop 1087 Aug 26 03:50 job.properties.security
-rw-r--r-- 1 oozie hadoop  681 Aug 26 03:50 README
-rw-r--r-- 1 oozie hadoop  966 Aug 26 03:50 script.q
-rw-r--r-- 1 oozie hadoop 2073 Aug 26 03:50 workflow.xml
-rw-r--r-- 1 oozie hadoop 2481 Aug 26 03:50 workflow.xml.security

cat job.properties
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050    # If yarn HA, 8032
queueName=test
jdbcURL=jdbc:hive2://sandbox.hortonworks.com:10000/default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive2

oozie job -config ./job.properties -run
job: 0000004-170111083341216-oozie-oozi-W

oozie job -info 0000004-170111083341216-oozie-oozi-W -verbose

3.1) To change tez.queue.name, add argument tags to workflow.xml

            <param>OUTPUT=/user/${wf:user()}/${examplesRoot}/output-data/hive2</param>
            <argument>-hiveconf</argument>
            <argument>tez.queue.name=${queueName}</argument>
        </hive2>
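After editing the local workflow.xml, push it back to HDFS before rerunning the job, for example:

hdfs dfs -put -f ~/examples/apps/hive2/workflow.xml examples/apps/hive2/workflow.xml
oozie job -config ./job.properties -run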


Addendum (HDP 2.6.0 Sandbox):

If Kerberos is enabled and HiveServer2 runs in HTTP mode, the Hive2 action needs job.properties.security changed as follows:
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8032 # appears to be 8032 on 2.6
queueName=default
jdbcURL=jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
#jdbcURL=jdbc:hive2://sandbox.hortonworks.com:10001/default;transportMode=http;httpPath=cliservice
#jdbcURL=jdbc:hive2://sandbox.hortonworks.com:10000/default
jdbcPrincipal=hive/_HOST@EXAMPLE.COM
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive2/workflow.xml.security

The config file must have a .properties or .xml extension, so:
mv ./job.properties.security ./job.security.properties
oozie job -config ./job.security.properties -run -verbose


The Hive action needs workflow.xml.security changed:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
    <credentials>
      <credential name='hcatauth' type='hcat'>
         <property>
            <name>hcat.metastore.uri</name>
<value>thrift://sandbox.hortonworks.com:9083</value>
         </property>
         <property>
             <name>hcat.metastore.principal</name>
             <value>hive/_HOST@EXAMPLE.COM</value>
         </property>

hdfs dfs -put -f workflow.xml.security examples/apps/hive/workflow.xml

To change the YARN container executor's banned.users from Ambari

NOTE: This appears to be broken in HDP 2.5.3 and earlier (or Ambari 2.4.2 and earlier?); the list has to be %-separated rather than comma-separated. (No Jira for this?)

1) Re-check the required properties
yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false

2) Edit the Ambari template
vim /var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2
# For example, change it as follows (I removed hdfs)
banned.users=yarn,mapred,bin

3) Restart the Ambari Server
ambari-server restart

Then, the next time the NodeManagers are restarted, /etc/hadoop/conf/container-executor.cfg should be updated.
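A quick way to verify on a node after the NodeManager restart:

grep banned.users /etc/hadoop/conf/container-executor.cfg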



Tuesday, January 10, 2017

Trying a Grafana upgrade

The Grafana bundled with Ambari/HDP is 2.6. Let's try replacing it with 4.0.

wget https://grafanarel.s3.amazonaws.com/builds/grafana-4.0.2-1481203731.x86_64.rpm
yum localinstall grafana-4.0.2-1481203731.x86_64.rpm
for d in `ls -d /usr/share/grafana/*`; do _d="`basename $d`"; mv /usr/lib/ambari-metrics-grafana/$_d /usr/lib/ambari-metrics-grafana/${_d}.bak; ln -s /usr/share/grafana/${_d} /usr/lib/ambari-metrics-grafana/${_d}; done
mv /usr/lib/ambari-metrics-grafana/bin/grafana-server /usr/lib/ambari-metrics-grafana/bin/grafana-server.bak && cp /usr/sbin/grafana-server /usr/lib/ambari-metrics-grafana/bin/grafana-server

Possibly needed? https://grafana.net/plugins/praj-ams-datasource
grafana-cli plugins install praj-ams-datasource

Note: I have only just done this, so it is not yet clear whether it actually works properly.
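One quick sanity check (assuming this Grafana build supports the -v flag) is to confirm the swapped-in binary reports 4.0.x, then restart Ambari Metrics Grafana from Ambari:

/usr/lib/ambari-metrics-grafana/bin/grafana-server -v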

Wednesday, January 4, 2017

Installing HUE on the HDP Sandbox

1) Check what is already installed
[root@sandbox ~]# yum list installed | grep -i hue
Failed to set locale, defaulting to C
hue.x86_64                              2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-beeswax.x86_64                      2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-common.x86_64                       2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-hcatalog.x86_64                     2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-oozie.x86_64                        2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-pig.x86_64                          2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-sandbox.noarch                      1.2.1-88                      @sandbox
hue-server.x86_64                       2.6.1.2.4.0.0-169.el6         @HDP-2.4
hue-tutorials.noarch                    1.2.1-88                      @sandbox

2) Uninstall
[root@sandbox ~]# yum remove hue hue-common
...
[root@sandbox ~]# yum list installed | grep -i hue
Failed to set locale, defaulting to C
[root@sandbox ~]#

3) Adjust the repo file as needed (the example uses 2.5.3.0)
[root@sandbox ~]# sed -i.bak 's@^baseurl=.*$@baseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.3.0@' /etc/yum.repos.d/HDP.repo

4) Install the HDP 2.5.3 HUE
[root@sandbox ~]# yum clean all
[root@sandbox ~]# yum install hue

5) Edit /etc/hue/conf/hue.ini
Change localhost to sandbox.hortonworks.com.

Summary:

# First, check the repo files (remove any unnecessary ones):
grep -E 'HDP-[23]' /etc/yum.repos.d/*

_os_type="centos6"
_hdp_ver="`hdp-select versions | tail -n 1 | cut -d '-' -f 1`"
_repo="`grep -lE 'HDP-[23]' /etc/yum.repos.d/*`"
_user="root"

service hue stop
yum remove hue hue-common -y
sed -i.bak "s/^baseurl=.*$/baseurl=http:\/\/public-repo-1.hortonworks.com\/HDP\/${_os_type}\/2.x\/updates\/${_hdp_ver}/" ${_repo}
#yum clean all
yum install hue -y
#mv -f ${_repo}.bak ${_repo}
sed -i.bak "s/localhost/`hostname -f`/g" /etc/hue/conf/hue.ini
sed -i "s/server_user=hue$/server_user=${_user}/" /etc/hue/conf/hue.ini
sed -i.bak "s/^USER=hue$/USER=${_user}/" /etc/init.d/hue
chown -R ${_user}:hadoop /usr/lib/hue
service hue start # or restart




Appendix: switching HUE's database to the MySQL instance used by Hive


[root@sandbox ~]# mysql -u root -h localhost -phadoop

mysql> create user hue identified by 'hue';
Query OK, 0 rows affected (0.23 sec)

mysql> grant all privileges on *.* to 'hue'@'localhost' with grant option;
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on hue.* to 'hue'@'localhost' identified by 'hue'
    -> ;
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> ^DBye

[root@sandbox ~]# mysql -u hue -phue

mysql> create database hue;
Query OK, 1 row affected (0.12 sec)

mysql> ^DBye


[root@sandbox ~]# service hue stop

[root@sandbox ~]# /usr/lib/hue/build/env/bin/hue dumpdata > /tmp/hue_db_dump.json

[root@sandbox ~]# vim /etc/hue/conf/hue.ini
    engine=mysql
    host=localhost
    ## host=sandbox.hortonworks.com
    port=3306
    user=hue
    password=hue
    name=hue

su - hue
cd /usr/lib/hue
source build/env/bin/activate
hue syncdb --noinput
hue migrate
hue loaddata /tmp/hue_db_dump.json
deactivate


NOTE: If you hit a Python error like "libmysqlclient_r.so.16: cannot open shared object file: No such file or directory":
(env)[root@sandbox hue]# pip uninstall mysql-python
(env)[root@sandbox hue]# yum install mysql-devel
(env)[root@sandbox hue]# pip install mysql-python

NOTE2: IntegrityError: (1062, "Duplicate entry 'oozie-bundle' for key 'app_label'")
mysqldump -uhue -phue hue > /tmp/hue.sql
(env)-bash-4.1$ mysql -uhue -phue -e 'SET GLOBAL foreign_key_checks=0;DELETE FROM hue.django_content_type;'
Then immediately set foreign_key_checks back to 1 and retry the load.
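A sketch of that last step plus the retry (assuming the dump file created by the earlier dumpdata step):

(env)-bash-4.1$ mysql -uhue -phue -e 'SET GLOBAL foreign_key_checks=1;'
(env)-bash-4.1$ hue loaddata /tmp/hue_db_dump.json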