Wednesday, August 30, 2017

When an Ambari upgrade fails with "schema does not exist" (PostgreSQL)

The error in ambari-server.log:

INFO [main] DBAccessorImpl:841 - Executing query: ALTER SCHEMA ambarischema OWNER TO "ambari";
ERROR [main] DBAccessorImpl:847 - Error executing query: ALTER SCHEMA ambarischema OWNER TO "ambari";
org.postgresql.util.PSQLException: ERROR: schema "ambarischema" does not exist
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:559)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:395)
        at org.apache.ambari.server.orm.DBAccessorImpl.executeQuery(DBAccessorImpl.java:844)
        at org.apache.ambari.server.orm.DBAccessorImpl.executeQuery(DBAccessorImpl.java:836)
        at org.apache.ambari.server.upgrade.AbstractUpgradeCatalog.changePostgresSearchPath(AbstractUpgradeCatalog.java:361)
        at org.apache.ambari.server.upgrade.AbstractUpgradeCatalog.upgradeSchema(AbstractUpgradeCatalog.java:922)
        at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.executeUpgrade(SchemaUpgradeHelper.java:207)
        at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.main(SchemaUpgradeHelper.java:425)

Workarounds

There are roughly three options.
A) Change the schema owner
Start psql as the Ambari DB user (or as postgres) and check information_schema:
select catalog_name, schema_name, schema_owner from information_schema.schemata;
select table_catalog, table_schema from information_schema.tables where table_name = 'hostcomponentdesiredstate';
Note: Ambari's DB consistency checker uses a query similar to the second one to verify that the schema is correct.
Then re-login as the current schema owner (or as the postgres user) and change the owner of the schema found above (the schema name below is an example):
\c ambari
ALTER SCHEMA ambari_tests OWNER TO "ambari";

B) Export and re-import the database
pg_dump -Uambari ambari > ambari.sql
Confirm that the following lines in ambari.sql are correct:
CREATE SCHEMA ambarischema;
ALTER SCHEMA ambarischema OWNER TO ambari;
SET search_path = ambarischema, pg_catalog;

sudo -u postgres psql -c "ALTER DATABASE ambari RENAME TO ambari_bakup"
sudo -u postgres psql -c "CREATE DATABASE ambari"
sudo -u postgres psql ambari < ambari.sql

Note: the ALTER statements in the dump have the schema name hard-coded, so they need to be reviewed if you change the schema.
Example: ALTER TABLE ambarischema.metainfo OWNER TO ambari;
Also, Ambari will not start unless this schema is in the DB user's search_path.
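If the search_path does need fixing, one way to check and set it on the role itself (a sketch; the role and schema names are examples, adjust to what the queries above returned):

# list per-role/per-database settings; a customised search_path shows up here
sudo -u postgres psql -c "\drds"
# persist the schema in the ambari role's default search_path
sudo -u postgres psql -c 'ALTER ROLE ambari SET search_path TO ambarischema, public;'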

C) If you just want to get the upgrade through temporarily, for example in a test environment, removing "server.jdbc.postgres.schema" from ambari.properties means the ALTER SCHEMA statement is not executed.
However, the DB consistency checker will then report an error, because the schema differs from what "select table_schema from information_schema.tables where table_name = 'hostcomponentdesiredstate'" returns.
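A sketch of that change (the ambari.properties path below is the usual default; take a backup first, then re-run the upgrade):

cp -p /etc/ambari-server/conf/ambari.properties /etc/ambari-server/conf/ambari.properties.bak
sed -i '/^server.jdbc.postgres.schema/d' /etc/ambari-server/conf/ambari.properties
ambari-server upgrade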

Building a test environment on HDP 2.4.2 where distcp works with multi-homing + Kerberos + SSL

Create two clusters on the VM

sudo -i
wget https://raw.githubusercontent.com/hajimeo/samples/master/bash/start_hdp.sh -O ./start_hdp.sh
chmod u+x ./start_hdp.sh
# after preparing the response files
./start_hdp.sh -a -r node1_HDP2420_ambari2503.resp     # -a for automate or -i for interactive
./start_hdp.sh -a -r node6_HDP2420_ambari2503.resp

NOTE: contents of the response files
root@ho-ubu04:~# cat node1_HDP2420_ambari2503.resp
r_AMBARI_BLUEPRINT="Y"
r_AMBARI_BLUEPRINT_CLUSTERCONFIG_PATH=""
r_AMBARI_BLUEPRINT_HOSTMAPPING_PATH=""
r_AMBARI_HOST="node1.localdomain"
r_AMBARI_REPO_FILE="http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.0.3/ambari.repo"
r_AMBARI_VER="2.5.0.3"
r_APTGET_UPGRADE="N"
r_CLUSTER_NAME="ubu04c1"
r_CONTAINER_OS="centos"
r_CONTAINER_OS_VER="6.8"
r_DEFAULT_PASSWORD="hadoop"
r_DOCKERFILE_URL="https://raw.githubusercontent.com/hajimeo/samples/master/docker/DockerFile"
r_DOCKER_HOST_IP="172.17.0.1"
r_DOCKER_KEEP_RUNNING="Y"
r_DOCKER_NETWORK_ADDR="172.17.140."
r_DOCKER_NETWORK_MASK="/16"
r_DOCKER_PRIVATE_HOSTNAME="dockerhost1"
r_DOMAIN_SUFFIX=".localdomain"
r_HDP_LOCAL_REPO="N"
r_HDP_REPO_URL="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/"
r_HDP_REPO_VER="2.4.2.0"
r_HDP_STACK_VERSION="2.4"
r_NODE_START_NUM="1"
r_NTP_SERVER="ntp.ubuntu.com"
r_NUM_NODES="4"
r_PROXY="Y"
r_PROXY_PORT="28080"
r_REPO_OS_VER="6"

root@ho-ubu04:~# diff node1_HDP2420_ambari2503.resp node6_HDP2420_ambari2503.resp
4c4
< r_AMBARI_HOST="node1.localdomain"
---
> r_AMBARI_HOST="node6.localdomain"
8c8
< r_CLUSTER_NAME="ubu04c1"
---
> r_CLUSTER_NAME="ubu04c6"
23c23
< r_NODE_START_NUM="1"
---
> r_NODE_START_NUM="6"

Add a second NIC to the Docker containers

curl -O https://raw.githubusercontent.com/jpetazzo/pipework/master/pipework
chmod u+x pipework
mv pipework /usr/sbin/

NOTE: Normally it is better to avoid assigning the same IP addresses, but here the same ones are used on purpose.
pipework br1 node1 192.168.100.1/24
pipework br1 node2 192.168.100.2/24
pipework br1 node3 192.168.100.3/24
pipework br1 node4 192.168.100.4/24
pipework br6 node6 192.168.100.1/24
pipework br6 node7 192.168.100.2/24
pipework br6 node8 192.168.100.3/24
pipework br6 node9 192.168.100.4/24
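
To confirm the second NIC is in place, a quick check from the Docker host (pipework adds the interface as eth1 by default; adjust the container and interface names if yours differ):

docker exec -it node1 ip addr show eth1
# containers on the same bridge should reach each other on the new subnet
docker exec -it node6 ping -c 1 192.168.100.2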

Set up an MIT KDC on the VM

curl -O https://raw.githubusercontent.com/hajimeo/samples/master/bash/setup_security.sh
. ./setup_security.sh
f_kdc_install_on_host

After that, enable Kerberos from each Ambari via the Kerberos Wizard.

Set up /etc/hosts

root@ho-ubu04:~# cat hosts_a
172.17.140.6 node6.localdomain node6.localdomain. node6
172.17.140.7 node7.localdomain node7.localdomain. node7
172.17.140.8 node8.localdomain node8.localdomain. node8
172.17.140.9 node9.localdomain node9.localdomain. node9
192.168.100.1 node1.localdomain node1.localdomain. node1
192.168.100.2 node2.localdomain node2.localdomain. node2
192.168.100.3 node3.localdomain node3.localdomain. node3
192.168.100.4 node4.localdomain node4.localdomain. node4

root@ho-ubu04:~# for i in {1..4}; do scp ./hosts_a node$i.localdomain:/etc/hosts; done

root@ho-ubu04:~# cat hosts_b
172.17.140.1 node1.localdomain node1.localdomain. node1
172.17.140.2 node2.localdomain node2.localdomain. node2
172.17.140.3 node3.localdomain node3.localdomain. node3
172.17.140.4 node4.localdomain node4.localdomain. node4
192.168.100.1 node6.localdomain node6.localdomain. node6
192.168.100.2 node7.localdomain node7.localdomain. node7
192.168.100.3 node8.localdomain node8.localdomain. node8
192.168.100.4 node9.localdomain node9.localdomain. node9

root@ho-ubu04:~# for i in {6..9}; do scp ./hosts_b node$i.localdomain:/etc/hosts; done

Change configs from Ambari (CLI)

Change the XXXX-bind-host properties to 0.0.0.0
# cluster name
root@ho-ubu04:~# grep r_CLUSTER_NAME *.resp
node1_HDP2420_ambari2503.resp:r_CLUSTER_NAME="ubu04c1"
node6_HDP2420_ambari2503.resp:r_CLUSTER_NAME="ubu04c6"

# on each Ambari Node (node1 and node6)
_CLS="ubu04c1"    # and ubu04c6 from node6
for _p in http https rpc serverrpc; do
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS hdfs-site dfs.namenode.${_p}-bind-host 0.0.0.0
done

Configure HTTP authentication (SPNEGO)
_CLS="ubu04c1"    # and ubu04c6 from node6
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.security.token.service.use_ip false
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.authentication.signature.secret.file /etc/security/http_secret
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.authentication.type kerberos
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.authentication.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.filter.initializers org.apache.hadoop.security.AuthenticationFilterInitializer
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.authentication.kerberos.principal HTTP/_HOST@EXAMPLE.COM
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS core-site hadoop.http.authentication.cookie.domain localdomain

# on all nodes
dd if=/dev/urandom of=/etc/security/http_secret bs=1024 count=1 && chown hdfs:hadoop /etc/security/http_secret && chmod 440 /etc/security/http_secret
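
After restarting HDFS, one quick way to confirm SPNEGO works is curl's negotiate support against a web UI (a sanity check only; the hostname/port are examples, and switch to https://...:50470 once SSL is enabled):

kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-ubu04c1
# expect 401 without a ticket, 200 with a valid ticket and --negotiate
curl --negotiate -u : -s -o /dev/null -w '%{http_code}\n' "http://node1.localdomain:50070/jmx"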

Also change auth_to_local:
RULE:[1:$1@$0](ambari-qa-ubu04c6@EXAMPLE.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-ubu04c6@EXAMPLE.COM)s/.*/hdfs/
RULE:[1:$1@$0](ambari-qa-ubu04c1@EXAMPLE.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-ubu04c1@EXAMPLE.COM)s/.*/hdfs/
RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
RULE:[2:$1@$0](dn@EXAMPLE.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@EXAMPLE.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@EXAMPLE.COM)s/.*/mapred/
RULE:[2:$1@$0](nm@EXAMPLE.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@EXAMPLE.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@EXAMPLE.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@EXAMPLE.COM)s/.*/yarn/
DEFAULT
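
The combined rules can be checked without restarting anything by feeding a principal to HadoopKerberosName, which prints the mapped short name (the principals below are examples; the node must already have the updated core-site deployed):

hadoop org.apache.hadoop.security.HadoopKerberosName hdfs-ubu04c6@EXAMPLE.COM
hadoop org.apache.hadoop.security.HadoopKerberosName nn/node1.localdomain@EXAMPLE.COM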

Errors encountered during testing

Without the "RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//" rule above, you get "Usernames not matched: name=hdfs != expected=hdfs-ubu04c1".

"Requested user hdfs is not whitelisted and has id 504,which is below the minimum allowed 1000"
Fix: lower YARN => Advanced yarn-env => "Minimum user ID for submitting job" (or use a user whose uid is at or above the minimum).

Workaround for MAPREDUCE-6565
su - hdfs
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-ubu04c1
hdfs dfs -get /hdp/apps/2.4.2.0-258/mapreduce/mapreduce.tar.gz
tar xzvf mapreduce.tar.gz
# add hadoop.security.token.service.use_ip
vi ./hadoop/etc/hadoop/core-site.xml
mv mapreduce.tar.gz mapreduce.tar.gz.orig
tar czvf mapreduce.tar.gz hadoop
# upload as the 'hdfs' user
hdfs dfs -put -f ./mapreduce.tar.gz /hdp/apps/2.4.2.0-258/mapreduce/mapreduce.tar.gz
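
The property added to ./hadoop/etc/hadoop/core-site.xml in the step above is the same one set cluster-wide earlier; a minimal snippet of what goes into the file (the rest of the file is left as-is):

<property>
  <name>hadoop.security.token.service.use_ip</name>
  <value>false</value>
</property>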

Finally, confirm that ":$PWD/mr-framework/hadoop/etc/hadoop/" (without an asterisk *) is included in mapreduce.application.classpath in mapred-site.

Additional notes:

To use WebHDFS/SWebHDFS with DistCp when the remote cluster uses NameNode HA, the following configs are required:
dfs.namenode.http-address.<REMOTE_NAMESERVICE>.nn1=remote_namenode1:50070
dfs.namenode.http-address.<REMOTE_NAMESERVICE>.nn2=remote_namenode2:50070
dfs.namenode.https-address.<REMOTE_NAMESERVICE>.nn1=remote_namenode1:50470
dfs.namenode.https-address.<REMOTE_NAMESERVICE>.nn2=remote_namenode2:50470
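
With those in place, a DistCp over swebhdfs between the two secure clusters looks roughly like this (the nameservice names and paths are examples; run it with a valid Kerberos ticket):

hadoop distcp swebhdfs://<REMOTE_NAMESERVICE>/tmp/distcp_src hdfs://<LOCAL_NAMESERVICE>/tmp/distcp_dst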



Monday, August 28, 2017

Cloning (copying) an HDP test environment with an Ambari Blueprint

Ambari = 2.5.1.0
HDP = 2.6.1.0

Export the blueprint:
curl -o hdp261_hive_llap_bp.json -u admin:admin 'http://node1.localdomain:8080/api/v1/clusters/c1?format=blueprint'

Add a blueprint name:
  "Blueprints" : {
    "blueprint_name": "multinode-hdp",
    "stack_name" : "HDP",
    "stack_version" : "2.6"
  }
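
start_hdp.sh submits this for you, but doing it by hand is just two REST calls (a sketch: the Ambari host and cluster name match the interview below, and cluster_template.json is an assumed host-mapping / cluster-creation template you would prepare yourself):

curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d @hdp261_hive_llap_bp.json 'http://node6.localdomain:8080/api/v1/blueprints/multinode-hdp'
curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d @cluster_template.json 'http://node6.localdomain:8080/api/v1/clusters/c6'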

Download the script:
curl -O https://raw.githubusercontent.com/hajimeo/samples/master/bash/start_hdp.sh

Run it (the .resp file is optional). Pay attention to the answers where a non-default value is entered (originally highlighted in bold).
bash ./start_hdp.sh -i [-r start_hdp.resp]
INFO : Loading start_hdp.resp...
Would you like to review your responses? [Y]:
INFO : Starting Interview mode...
INFO : You can stop this interview anytime by pressing 'Ctrl+c' (except while typing secret/password).

Run apt-get upgrade before setting up? [N]:

Keep running containers when you start this script with another response file? [N]:
NTP Server [ntp.ubuntu.com]:
First 24 bits (xxx.xxx.xxx.) of docker container IP Address [172.17.100.]: 172.17.120.
Network Mask (/16 or /24) for docker containers [/16]:
IP address for docker0 interface [172.17.0.1]:
Domain Suffix for docker containers [.localdomain]:
Container OS type (small letters) [centos]:
Container OS version (use 7.3.1611 or higher for Centos 7) [6.8]:
DockerFile URL or path (use DockerFile7 for Centos 7) [https://raw.githubusercontent.com/hajimeo/samples/master/docker/DockerFile]:
Hostname for docker host in docker private network? [dockerhost1]:
How many nodes (docker containers) creating? [4]:
Node starting number (hostname will be sequential from this number) [1]: 6
Node hostname prefix [node]:
DNS Server (Note: Remote DNS requires password less ssh) [localhost]:
Avoid installing Ambari? (to create just containers) [N]:
Ambari server hostname [node6.localdomain]:
Ambari version (used to build repo URL) [2.5.1.0]:
If you have set up a Local Repo, please change below
Ambari repo file URL or path [http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.5.1.0/ambari.repo]:
Ambari JDK URL (optional):
Ambari JCE URL (optional):
Stack Version [2.6]:
HDP Version for repository [2.6.1.0]:
Would you like to set up a local repo for HDP? (may take long time to downlaod) [N]:
HDP Repo URL [http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.1.0/]:
Would you like to use Ambari Blueprint? [Y]:
Cluster name [c6]:
Default password [hadoop]:
Host mapping json path (optional):
Cluster config json path (optional): ./hdp261_hive_llap_bp.json
Would you like to set up a proxy server for yum on this server? [Y]:
Proxy port [28080]:
INFO : Interview completed.
Would you like to save your response? [Y]:
INFO : Saved start_hdp.resp
Would you like to start setting up this host? [Y]:
Would you like to stop all running containers now? [Y]:
INFO : Stopping the followings
...

Monday, August 21, 2017

Debugging Hive (CLI) with JDB

[root@node4 ~]# su - admin
[admin@node4 ~]$ hive --debug
WARNING: Use "yarn jar" to launch YARN applications.
Listening for transport dt_socket at address: 8000

From another terminal:

[admin@node4 ~]$ . /etc/hadoop/conf/hadoop-env.sh
[admin@node4 ~]$ $JAVA_HOME/bin/jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=8000
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
>
VM Started: No frames on the current call stack

main[1] catch org.apache.hadoop.hive.ql.parse.SemanticException
Deferring all org.apache.hadoop.hive.ql.parse.SemanticException.
It will be set after the class is loaded.
main[1] run    <<< at this point, run a query that triggers the error from another terminal
> Set deferred all org.apache.hadoop.hive.ql.parse.SemanticException

Exception occurred: org.apache.hadoop.hive.ql.parse.SemanticException (to be caught at: org.apache.hadoop.hive.ql.Driver.compile(), line=524 bci=1,065)"thread=main", org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.transform(), line=116 bci=151

main[1] where
  [1] org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.transform (SimpleFetchOptimizer.java:116)
  [2] org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize (Optimizer.java:205)
  [3] org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal (SemanticAnalyzer.java:10,198)
  [4] org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal (CalcitePlanner.java:211)
  [5] org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze (BaseSemanticAnalyzer.java:227)
  [6] org.apache.hadoop.hive.ql.Driver.compile (Driver.java:459)
  [7] org.apache.hadoop.hive.ql.Driver.compile (Driver.java:316)
  [8] org.apache.hadoop.hive.ql.Driver.compileInternal (Driver.java:1,189)
  [9] org.apache.hadoop.hive.ql.Driver.runInternal (Driver.java:1,237)
  [10] org.apache.hadoop.hive.ql.Driver.run (Driver.java:1,126)
  [11] org.apache.hadoop.hive.ql.Driver.run (Driver.java:1,116)
  [12] org.apache.hadoop.hive.cli.CliDriver.processLocalCmd (CliDriver.java:216)
  [13] org.apache.hadoop.hive.cli.CliDriver.processCmd (CliDriver.java:168)
  [14] org.apache.hadoop.hive.cli.CliDriver.processLine (CliDriver.java:379)
  [15] org.apache.hadoop.hive.cli.CliDriver.executeDriver (CliDriver.java:739)
  [16] org.apache.hadoop.hive.cli.CliDriver.run (CliDriver.java:684)
  [17] org.apache.hadoop.hive.cli.CliDriver.main (CliDriver.java:624)
  [18] sun.reflect.NativeMethodAccessorImpl.invoke0 (native method)
  [19] sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
  [20] sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
  [21] java.lang.reflect.Method.invoke (Method.java:498)
  [22] org.apache.hadoop.util.RunJar.run (RunJar.java:221)
  [23] org.apache.hadoop.util.RunJar.main (RunJar.java:136)
main[1] locals
Method arguments:
pctx = instance of org.apache.hadoop.hive.ql.parse.ParseContext(id=6092)
Local variables:
topOps = instance of java.util.LinkedHashMap(id=6093)
alias = "sample_07_sym"
topOp = instance of org.apache.hadoop.hive.ql.exec.TableScanOperator(id=6095)
e = instance of java.io.FileNotFoundException(id=6096)

Trying out Hive's SymlinkTextInputFormat

https://issues.apache.org/jira/browse/HIVE-1272

First, create some dummy tables:

curl -O https://raw.githubusercontent.com/hajimeo/samples/master/bash/hive_dummies.sh
bash -x ./hive_dummies.sh

"sample_07"をベースにSymlinkTextテーブルを作ってみます。

CREATE TABLE `sample_07_sym`(
  `code` string,
  `description` string,
  `total_emp` int,
  `salary` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

[admin@node1 ~]$ cat link.txt
/apps/hive/warehouse/*/sample_07/*
[admin@node1 ~]$ hdfs dfs -put -f link.txt /apps/hive/warehouse/dummies.db/sample_07_sym/

hive> set hive.fetch.task.conversion.threshold;
hive.fetch.task.conversion.threshold=1073741824
hive> select * from sample_07_sym limit 1;
OK
00-0000 All Occupations 134354250       40690
Time taken: 0.048 seconds, Fetched: 1 row(s)

HDFS HA performance-related properties


hdfs-site
dfs.namenode.audit.log.async=true

# DataNode block reports and heartbeats (if no separate lifeline), also ZKFC periodic health checks. Not for client application.
#dfs.namenode.servicerpc-address=$NN_HOSTNAME:8040
# or if HA,
dfs.namenode.servicerpc-address.$NN_SERVICENAME.nn1=$NN1_HOSTNAME:8040
dfs.namenode.servicerpc-address.$NN_SERVICENAME.nn2=$NN2_HOSTNAME:8040
# Above change requires stopping ZKFC and "sudo -u hdfs hdfs zkfc -formatZK"

# 20 * log2(Cluster Size) but lower than 200. NOTE: log2(1000) = 9.965...
dfs.namenode.handler.count=200
dfs.namenode.service.handler.count=40 # default is 10

# DataNode Lifeline Protocol https://issues.apache.org/jira/browse/HDFS-9239 hadoop 2.8.0
dfs.namenode.lifeline.rpc-address.$NN_SERVICENAME.nn1=$NN1_HOSTNAME:8050
dfs.namenode.lifeline.rpc-address.$NN_SERVICENAME.nn2=$NN2_HOSTNAME:8050

# RPC Congestion Control only for NameNode (now client) port 8020 https://issues.apache.org/jira/browse/HADOOP-10597 hadoop 2.8.0
ipc.8020.backoff.enable=true

# RPC FairCallQueue https://issues.apache.org/jira/browse/HADOOP-10282 hadoop 2.6.0
ipc.8020.callqueue.impl=org.apache.hadoop.ipc.FairCallQueue

# (optional) Enable RPC Caller Context to track the “bad” jobs https://issues.apache.org/jira/browse/HDFS-9184
hadoop.caller.context.enabled=true


# Delay block deletions for this many seconds after NameNode startup (default 0)
dfs.namenode.startup.delay.block.deletion.sec


hadoop-env
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -Xms1G -Xmx1G -XX:NewSize=128M -XX:MaxNewSize=128M -XX:PermSize=128M -XX:MaxPermSize=256M -verbose:gc -Xloggc:/Users/chris/hadoop-deploy-trunk/hadoop-3.0.0-SNAPSHOT/logs/gc.log-`date +'%Y%m%d%H%M'` -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:ErrorFile=/Users/chris/hadoop-deploy-trunk/hadoop-3.0.0-SNAPSHOT/logs/hs_err_pid%p.log -XX:+HeapDumpOnOutOfMemoryError $HADOOP_NAMENODE_OPTS"




Wednesday, August 9, 2017

Switching HDFS to SSL with an internal Root CA on HDP 2.6.1 (and Knox as well)

This assumes the cluster was created with start_hdp.sh.
For any other cluster you will need to adjust the root@node${i}${_domain_suffix} part inside the functions.

curl -O https://raw.githubusercontent.com/hajimeo/samples/master/bash/setup_security.sh
source ./setup_security.sh && f_loadResp "your_response_file.resp"
f_hadoop_ssl_setup


After adding Knox from Ambari, replace Knox's gateway.jks:

[root@node3 ~]# cd /usr/hdp/current/knox-server/data/security/keystores/
[root@node3 keystores]# mkdir backup
[root@node3 keystores]# mv __gateway-credentials.jceks gateway.jks backup/

[root@node3 keystores]# cp /etc/hadoop/security/server-keystore.jks ./gateway.jks
[root@node3 keystores]# keytool -changealias -keystore ./gateway.jks -alias node3 -destalias gateway-identity -storepass XXXXXX
[root@node3 keystores]# /usr/hdp/current/knox-server/bin/knoxcli.sh create-alias gateway-identity-passphrase --value XXXXXX
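
Before restarting Knox, listing the keystore is a quick way to confirm the alias is now gateway-identity (storepass redacted as in the commands above):

keytool -list -v -keystore ./gateway.jks -storepass XXXXXX | grep -E 'Alias name|Owner|Issuer'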

Restart Knox from Ambari, and also start the Demo LDAP.

Test 1: connect normally (without -k)
[root@node3 keystores]# curl -s -u admin:admin-password "https://`hostname -f`:8443/gateway/default/webhdfs/v1?op=LISTSTATUS"
[root@node3 keystores]# echo $?
60

Test 2: with the internal Root CA certificate (PEM format) => OK
[root@node3 keystores]# curl --cacert /etc/hadoop/security/rootCA.pem -u admin:admin-password "https://`hostname -f`:8443/gateway/default/webhdfs/v1?op=LISTSTATUS"
{"FileStatuses":{"FileStatus":[{"acces...

Test 3: what about the Knox server's own certificate?
[root@node3 keystores]# keytool -exportcert -rfc -file ./knox.crt -keystore ./gateway.jks -alias gateway-identity -storepass XXXXXX
[root@node3 keystores]# curl -s --cacert ./knox.crt -u admin:admin-password "https://`hostname -f`:8443/gateway/default/webhdfs/v1?op=LISTSTATUS"
[root@node3 keystores]# echo $?
60

Test 4: is the Root CA needed in the keystore? (It is not)
[root@node3 keystores]# cp -p gateway.jks gateway.jks.bak
[root@node3 keystores]# keytool -delete -alias rootca -keystore ./gateway.jks -storepass XXXXXX

Restart Knox

[root@node3 keystores]# curl --cacert /etc/hadoop/security/rootCA.pem -u admin:admin-password "https://`hostname -f`:8443/gateway/default/webhdfs/v1?op=LISTSTATUS"
{"FileStatuses":{"FileStatus":[{"acces...


References:
https://community.hortonworks.com/articles/14900/demystify-knox-ldap-ssl-ca-cert-integration-1.html
https://community.hortonworks.com/articles/56939/replace-knox-self-signed-certificate-with-ca-certi.html

Tuesday, August 8, 2017

Metrics are not displayed in Ambari 2.5.x

The error in the Ambari Server log:

ERROR [ambari-metrics-retrieval-service-thread-6437] MetricsRetrievalService:421 - Unable to retrieve metrics from https://node2.localdomain:50070/jmx. Subsequent failures will be suppressed from the log for 20 minutes. 

AbstractProviderModule.java

String[] getPortProperties(Service.Type service, String componentName, String hostName, Map<String, Object> properties, boolean httpsEnabled) {
  componentName = httpsEnabled ? componentName + "-HTTPS" : componentName;
  if(componentName.startsWith("NAMENODE") && properties.containsKey("dfs.internal.nameservices")) {
    componentName += "-HA";
    return getNamenodeHaProperty(properties, serviceDesiredProperties.get(service).get(componentName), hostName);
  }
  return serviceDesiredProperties.get(service).get(componentName);
}


Map<String, String[]> initPropMap = new HashMap<String, String[]>();
initPropMap.put("NAMENODE", new String[]{"dfs.http.address", "dfs.namenode.http-address"});
initPropMap.put("NAMENODE-HTTPS", new String[]{"dfs.namenode.https-address", "dfs.https.port"});
initPropMap.put("NAMENODE-HA", new String[]{"dfs.namenode.http-address.%s.%s"});
initPropMap.put("NAMENODE-HTTPS-HA", new String[]{"dfs.namenode.https-address.%s.%s"});
initPropMap.put("DATANODE", new String[]{"dfs.datanode.http.address"});
initPropMap.put("DATANODE-HTTPS", new String[]{"dfs.datanode.https.address"});
initPropMap.put("JOURNALNODE-HTTPS", new String[]{"dfs.journalnode.https-address"});
initPropMap.put("JOURNALNODE", new String[]{"dfs.journalnode.http-address"});
serviceDesiredProperties.put(Service.Type.HDFS, initPropMap);
The cause: SSL/HTTPS had been enabled but dfs.namenode.https-address.<nameservice>.nnX had been forgotten...
Incidentally, Ambari decides whether the cluster is HA based on the following property:
if (configProperties.containsKey("dfs.internal.nameservices")) {
  componentName += "-HA";
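
The fix is simply to add the per-nameservice HTTPS addresses; using the same configs.sh approach as earlier it looks roughly like this (the cluster name, nameservice, hostnames and port are examples):

_CLS="c1"
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS hdfs-site dfs.namenode.https-address.mycluster.nn1 node1.localdomain:50470
/var/lib/ambari-server/resources/scripts/configs.sh set localhost $_CLS hdfs-site dfs.namenode.https-address.mycluster.nn2 node2.localdomain:50470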