Wednesday, September 27, 2017

Using Oracle XE with Ambari on HDP Sandbox (2.6.1)

https://community.hortonworks.com/content/supportkb/49135/how-to-install-oracle-express-xe-on-sandbox.html
https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_ambari_reference_guide/content/_using_ambari_with_oracle.html
http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/sqldev/r31/datapump_OBE/datapump.html


1. Installing Oracle XE

On the Docker host (Ubuntu), check that at least 2GB of swap is available; if not, add it:
dd if=/dev/zero of=/var/swap.file count=2560 bs=1M
chmod go= /var/swap.file
mkswap /var/swap.file
grep -qw swap /etc/fstab || echo "/var/swap.file swap swap defaults 0 0" >> /etc/fstab
swapon /var/swap.file
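To confirm the swap is now active and at least 2GB:
swapon -s
free -m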

Create a container from any CentOS-based image (the Sandbox one is fine):
docker run --name oracle --hostname "node100.localdomain" --network=hdp --ip=172.17.130.100  --privileged -d hdp/base:6.8  /usr/sbin/sshd -D

Inside the container, download the Oracle XE Linux 64-bit RPM zip file from the Oracle website and unzip it somewhere convenient.
Confirm that /dev/shm is at least 2GB.
If it is not:
mount -t tmpfs shmfs -o size=2g /dev/shm
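The current size can be checked with:
df -h /dev/shm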

Install the Oracle RPM:
cd ./Disk1
rpm -ivh oracle-xe-11.2.0-*.0.x86_64.rpm
/etc/init.d/oracle-xe configure

Verify:

su - oracle
. /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh
sqlplus / as sysdba


2. Configuring Oracle for Ambari

CREATE USER ambari IDENTIFIED BY bigdata default tablespace USERS temporary tablespace TEMP;
GRANT unlimited tablespace to ambari;
GRANT create session to ambari;
GRANT create TABLE to ambari;
GRANT create SEQUENCE to ambari;

To create the schema from scratch:

sqlplus ambari/bigdata < /var/lib/ambari-server/resources/Ambari-DDL-Oracle-CREATE.sql

To import an existing database, first find the Data Pump directory:

SELECT directory_name, directory_path FROM dba_directories WHERE directory_name='DATA_PUMP_DIR';
/u01/app/oracle/admin/XE/dpdump/

mv ambari.dmp /u01/app/oracle/admin/XE/dpdump/

If port 1521 is not mapped yet:
. ./start_hdp.sh
f_port_forward 1521 sandbox.hortonworks.com 1521
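If you don't have that script handy, socat can do a one-off forward to the same effect (an assumption that socat is installed on the Docker host):
socat TCP-LISTEN:1521,fork,reuseaddr TCP:sandbox.hortonworks.com:1521 &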

Start Oracle SQL Developer on your PC/Mac.
Create a New Connection (as the SYSTEM user).
In View => DBA, press the add-connection button and pick the connection created above.
Right-click Data Pump and choose the Data Pump Import Wizard:
Step 1: type the .dmp file name under File Names (Type of Import: Tables or Schemas).
Step 2: select all tables.
Step 3: in Re-Map Schemas set the destination schema to AMBARI; in Re-Map Tablespaces set the destination to USERS.
Step 4: set Action On Table if Table Exists to Replace.
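For reference, those wizard steps map onto a command-line impdp invocation roughly like this (a sketch, run as the SYSTEM user like the wizard; SRC_SCHEMA and SRC_TBS are placeholders for whatever schema/tablespace the dump was exported from):

impdp system directory=DATA_PUMP_DIR dumpfile=ambari.dmp logfile=IMPORT.LOG remap_schema=SRC_SCHEMA:AMBARI remap_tablespace=SRC_TBS:USERS table_exists_action=replace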

tail -f /u01/app/oracle/admin/XE/dpdump/IMPORT.LOG

If it worked, SQL Developer should show many tables under Other Users => AMBARI => Tables (Filtered).

After creating or importing the schema, run ambari-server setup.

But before that:
ln -s /u01/app/oracle/product/11.2.0/xe/jdbc/lib/ojdbc6.jar /usr/share/java/ojdbc6.jar
echo 'server.jdbc.driver.path=/usr/share/java/ojdbc6.jar' >> /etc/ambari-server/conf/ambari.properties
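Alternatively, Ambari can register the JDBC driver through its own setup option:
ambari-server setup --jdbc-db=oracle --jdbc-driver=/usr/share/java/ojdbc6.jar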

If you imported, you will also need to reset the admin user's password:
[root@sandbox ~]# su - oracle
-bash-4.1$ . /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh
-bash-4.1$ sqlplus ambari/bigdata

SQL*Plus: Release 11.2.0.2.0 Production on Wed Sep 27 06:23:08 2017

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production

SQL> UPDATE users SET user_password='538916f8943ec225d97a9a86a2c6ec0818c1cd400e09e03b660fdaaec4af29ddbb6f2b1033b81b00' WHERE user_name='admin' and user_type='LOCAL';

1 row updated.

Start the Ambari setup:
ambari-server stop
ambari-server setup
Using python  /usr/bin/python
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)?
Adjusting ambari-server permissions and ownership...
Checking firewall status...
WARNING: iptables is running. Confirm the necessary Ambari ports are accessible. Refer to the Ambari documentation for more details on ports.
OK to continue [y/n] (y)?
Checking JDK...
Do you want to change Oracle JDK [y/n] (n)?
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (2): 2
Hostname (localhost):
Port (1521):
Select Oracle identifier type:
1 - Service Name
2 - SID
(1): 2
SID (XE):
Username (ambari):
Enter Database Password (bigdata):
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-Oracle-CREATE.sql'
Proceed with configuring remote database connection properties [y/n] (y)?
Extracting system views...
............
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

ambari-server start

Note: if the container was stopped, to start Oracle again:
[root@sandbox ~]# rm -rf /var/tmp/.oracle
[root@sandbox ~]# mount -t tmpfs shmfs -o size=2g /dev/shm
[root@sandbox ~]# su - oracle
-bash-4.1$ . /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh
-bash-4.1$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue Oct 3 12:30:23 2017

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2233344 bytes
Variable Size             729811968 bytes
Database Buffers          331350016 bytes
Redo Buffers                5541888 bytes
Database mounted.
Database opened.
SQL> Disconnected from Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production
-bash-4.1$ lsnrctl start


Note 2: with HDF, an odd error appears:
30 Jan 2018 01:23:53,156  WARN [Stack Version Loading Thread] RepoVdfCallable:142 - Could not load version definition for HDP-2.6 identified by http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/updates/2.6.4.0/HDP-2.6.4.0-91.xml. null
javax.xml.bind.UnmarshalException
 - with linked exception:
[org.xml.sax.SAXParseException; lineNumber: 54; columnNumber: 15; cvc-complex-type.2.4.d: Invalid content was found starting with element 'tags'. No child element is expected at this point.]
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.handleStreamException(UnmarshallerImpl.java:431)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:368)
        at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:338)
        at org.apache.ambari.server.state.repository.VersionDefinitionXml.load(VersionDefinitionXml.java:442)
...
30 Jan 2018 01:23:53,238 ERROR [main] AmbariServer:1073 - Failed to run the Ambari Server
org.apache.ambari.server.AmbariException: An error occured during updating current repository versions with stack repositories.
        at org.apache.ambari.server.stack.UpdateActiveRepoVersionOnStartup.process(UpdateActiveRepoVersionOnStartup.java:99)
        at org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.invoke(AmbariJpaLocalTxnInterceptor.java:128)
        at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:1061)
Caused by: java.lang.NullPointerException
        at org.apache.ambari.server.stack.UpdateActiveRepoVersionOnStartup.updateRepoVersion(UpdateActiveRepoVersionOnStartup.java:106)
        at org.apache.ambari.server.stack.UpdateActiveRepoVersionOnStartup.process(UpdateActiveRepoVersionOnStartup.java:92)
        ... 2 more

Workaround:
ambari-server install-mpack --mpack=http://public-repo-1.hortonworks.com/HDF/centos6/3.x/updates/3.0.2.0/tars/hdf_ambari_mp/hdf-ambari-mpack-3.0.2.0-76.tar.gz --verbose

Monday, September 25, 2017

Changing Docker port forwarding (port bindings) after container creation

I used to think bound port numbers could not be added or changed after a container is created, but apparently they can.

1. First, get the container ID.
root@ubuntu:~# docker ps | grep sandbox-hdp
5a9ac4445e23        sandbox-hdp         "/usr/sbin/sshd -D"   13 days ago         Up 14 minutes       0.0.0.0:1000->1000/tcp, 0.0...

2. Stop the Docker service.
root@ubuntu:~# service docker stop

3. Edit hostconfig.json and config.v2.json.
The example below adds port 6182.
root@ubuntu:~# vim /var/lib/docker/containers/5a9ac4445e23fa04dc567f3f72d6bf21a349a4fb1fd08d7442477cd283450a6a/hostconfig.json
:%!python -m json.tool
        ...
        ],
        "6080/tcp": [
            {
                "HostIp": "",
                "HostPort": "6080"
            }
        ],
        "6182/tcp": [
            {
                "HostIp": "",
                "HostPort": "6182"
            }
        ],
        "61310/tcp": [
            { ...

root@ubuntu:~# vim /var/lib/docker/containers/5a9ac4445e23fa04dc567f3f72d6bf21a349a4fb1fd08d7442477cd283450a6a/config.v2.json
:%!python -m json.tool
            ...
            "61310/tcp": {},
            "6182/tcp": {},
            "6188/tcp": {},
            ...
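If you'd rather not hand-edit the JSON in vim, jq can apply the same changes (a sketch; assumes jq is installed and Docker is still stopped):

cd /var/lib/docker/containers/5a9ac4445e23fa04dc567f3f72d6bf21a349a4fb1fd08d7442477cd283450a6a
# hostconfig.json: add the host-side binding
jq '.PortBindings["6182/tcp"] = [{"HostIp": "", "HostPort": "6182"}]' hostconfig.json > hostconfig.json.tmp && mv hostconfig.json.tmp hostconfig.json
# config.v2.json: expose the port on the container side
jq '.Config.ExposedPorts["6182/tcp"] = {}' config.v2.json > config.v2.json.tmp && mv config.v2.json.tmp config.v2.json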

4. Start the Docker service, then start the container.
root@ubuntu:~# service docker start
root@ubuntu:~# docker start sandbox-hdp

Verify:
root@ubuntu:~# docker ps | grep sandbox-hdp
5a9ac4445e23        sandbox-hdp         "/usr/sbin/sshd -D"   13 days ago         Up 19 minutes       0.0.0.0:10..., 0.0.0.0:6182->6182/tcp, 0.0.0.0:...   sandbox-hdp

Friday, September 22, 2017

Setting up Kerberos + SPNEGO and SSL on the HDP Sandbox (Docker edition)

Install Ubuntu 16.04 and set up Docker

https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/

Install the Sandbox Docker edition

Download the Docker edition from https://hortonworks.com/downloads/#sandbox (nearly 11GB!).
After downloading, run a command like the one below (the asterisks are actually digits):

docker load < /path/to/HDP_2_6_1_docker_image_*_*_*_*_*_*.tar

After a while, the image load completes.

Port forwarding is not required, but it lets you reach the Sandbox directly (without going through a proxy), so setting it up is recommended.
In that case, https://raw.githubusercontent.com/hortonworks/data-tutorials/master/tutorials/hdp/sandbox-port-forwarding-guide/assets/start-sandbox-hdp.sh is a good option.
Download and run that script and the Sandbox container starts (check with docker ps).
Log in to Ambari and confirm that HDP is actually running.

* If you cannot log in to Ambari, you may need to reset the admin user's password with the /usr/sbin/ambari-admin-password-reset command.
* If services that use MySQL, such as Hive, fail to start, check whether mysqld is running.
* If the HDFS DataNode won't start, try "chown -R hdfs:hadoop /hadoop/hdfs". For ZooKeeper and the like, check the permissions on /hadoop/zookeeper as well.
* If yum fails with "PYCURL ERROR 22" or similar, try "yum clean all". (See also Note 2.)

Alternatively, I have turned the above into a script.

Setting up Kerberos on the HDP Sandbox Docker container

This is where the real work starts.
Incidentally, if a container starts misbehaving, deleting it and recreating it is sometimes faster than troubleshooting (docker rm -f sandbox-hdp).

curl -O https://raw.githubusercontent.com/hajimeo/samples/master/bash/setup_security.sh
source ./setup_security.sh
# Run the function that installs an MIT KDC on the Ubuntu host (first time only)
f_kdc_install_on_host
# Run the function that kerberizes the cluster via Ambari
f_ambari_kerberos_setup "EXAMPLE.COM" "172.17.0.1" "" "sandbox.hortonworks.com" "sandbox.hortonworks.com"

* The arguments to the last function are: 1) realm name, 2) KDC server hostname or IP address, 3) password (defaults to hadoop if blank), 4) hostname running Ambari Server, 5) Sandbox hostname.
The Sandbox hostname must resolve correctly (check /etc/hosts).
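A quick sanity check after kerberizing (the keytab and principal names assume the default Sandbox cluster name, as also seen in the hadoop.auth cookie later):

su - hdfs
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-sandbox@EXAMPLE.COM
klist
hdfs dfs -ls /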

Setting up SPNEGO on the HDP Sandbox Docker container

After running the steps above other than f_ambari_kerberos_setup, run:

f_hadoop_spnego_setup "EXAMPLE.COM" "hortonworks.com" "sandbox.hortonworks.com" "8080" "sandbox.hortonworks.com"

The arguments are: 1) realm, 2) domain name used for the auth cookie, 3) hostname running Ambari Server, 4) its port number, 5) Sandbox hostname.
Restart the required services from Ambari.

Verify!

# Without --negotiate, this should fail
[hdfs@sandbox ~]$ curl -I "http://`hostname -f`:50070/jmx"
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Fri, 22 Sep 2017 01:16:35 GMT
Pragma: no-cache
Date: Fri, 22 Sep 2017 01:16:35 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Domain=hortonworks.com; HttpOnly
Content-Length: 1396
Server: Jetty(6.1.26.hwx)

# After kinit, with --negotiate, this should succeed
[hdfs@sandbox ~]$ curl -I --negotiate -u: "http://`hostname -f`:50070/jmx"
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Fri, 22 Sep 2017 01:17:30 GMT
Pragma: no-cache
Date: Fri, 22 Sep 2017 01:17:30 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Domain=hortonworks.com; HttpOnly
Content-Length: 1396
Server: Jetty(6.1.26.hwx)

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Fri, 22 Sep 2017 01:17:30 GMT
Date: Fri, 22 Sep 2017 01:17:30 GMT
Pragma: no-cache
Expires: Fri, 22 Sep 2017 01:17:30 GMT
Date: Fri, 22 Sep 2017 01:17:30 GMT
Pragma: no-cache
Content-Type: application/json; charset=utf8
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs-sandbox@EXAMPLE.COM&t=kerberos&e=1506079050370&s=UMYdk7fS8byXBRACypD6kHJQtn4="; Path=/; Domain=hortonworks.com; HttpOnly
Content-Length: 105102
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Server: Jetty(6.1.26.hwx)

Setting up SSL (HTTPS) for HDFS, YARN, and MR2 on the HDP Sandbox Docker container

Same procedure as above, except the final command becomes:

f_hadoop_ssl_setup "" "" "sandbox.hortonworks.com" "8080" "sandbox.hortonworks.com"


Restart the required services from Ambari.

Verify!

[hdfs@sandbox ~]$ curl -kI --negotiate -u: "https://`hostname -f`:50470/jmx"
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Fri, 22 Sep 2017 01:33:49 GMT
Pragma: no-cache
Date: Fri, 22 Sep 2017 01:33:49 GMT
Pragma: no-cache
Content-Type: text/html; charset=iso-8859-1
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Domain=hortonworks.com; Secure; HttpOnly
Content-Length: 1396
Server: Jetty(6.1.26.hwx)

HTTP/1.1 200 OK
...
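A raw TLS handshake check also works (a sketch against the NameNode HTTPS port):

echo | openssl s_client -connect `hostname -f`:50470 2>/dev/null | openssl x509 -noout -subject -dates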

Notes

Note 1: incidentally, typing "help f_xxxxxxx" prints a short help text.

root@ho-ubu03:~# help f_hadoop_ssl_setup
Setup SSL for hadoop https://community.hortonworks.com/articles/92305/how-to-transfer-file-using
-secure-webhdfs-in-distc.html

Parameters:
    local _dname_extra="$1";
    local _password="$2";
    local _ambari_host="${3-$r_AMBARI_HOST}";
    local _ambari_port="${4-8080}";
    local _how_many="${5-$r_NUM_NODES}";
    local _start_from="${6-$r_NODE_START_NUM}";
    local _domain_suffix="${7-$r_DOMAIN_SUFFIX}";
    local _work_dir="${8-./}";

Show source code? [N]:


Note 2: as of today, the repository that appears to back the Sandbox tutorials is inaccessible.
Ambari fails when it tries to install the Kerberos-related packages:

http://dev2.hortonworks.com.s3.amazonaws.com/repo/dev/master/utils/repodata/repomd.xml: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 403 Forbidden"

For now, I just move /etc/yum.repos.d/sandbox.repo out of the way.

Friday, September 15, 2017

What is Hadoop's Split Metadata?

YarnRuntimeException: java.io.IOException: Split metadata size exceeded 10000000 exception

[root@sandbox ~]# strings /hadoop/yarn/local/usercache/admin/appcache/application_1505439442865_0004/filecache/10/job.splitmetainfo
META-SPL
sandbox.hortonworks.com
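Incidentally, the 10000000 in the exception is the default value of mapreduce.job.split.metainfo.maxsize; raising it, or setting it to -1 to disable the check, works around the error. For example (the jar and class names are placeholders; passing -D this way assumes the job uses ToolRunner):

# Or set the property permanently in mapred-site.xml
hadoop jar my-job.jar com.example.MyJob -Dmapreduce.job.split.metainfo.maxsize=-1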



org.apache.hadoop.mapreduce.split.JobSplit class:

/**
 * This class groups the fundamental classes associated with
 * reading/writing splits. The split information is divided into
 * two parts based on the consumer of the information. The two
 * parts are the split meta information, and the raw split 
 * information. The first part is consumed by the JobTracker to
 * create the tasks' locality data structures. The second part is
 * used by the maps at runtime to know what to do!
 * These pieces of information are written to two separate files.
 * The metainformation file is slurped by the JobTracker during 
 * job initialization. A map task gets the meta information during
 * the launch and it reads the raw split bytes directly from the 
 * file.
 */


JobSplitWriter class:

  private static void writeJobSplitMetaInfo(FileSystem fs, Path filename, 
      FsPermission p, int splitMetaInfoVersion, 
      JobSplit.SplitMetaInfo[] allSplitMetaInfo) throws IOException {

  public static <T extends InputSplit> void createSplitFiles(Path jobSubmitDir, 
      Configuration conf, FileSystem fs, T[] splits) 
  throws IOException, InterruptedException {
    FSDataOutputStream out = createFile(fs, 
        JobSubmissionFiles.getJobSplitFile(jobSubmitDir), conf);
    SplitMetaInfo[] info = writeNewSplits(conf, splits, out);
    out.close();
    writeJobSplitMetaInfo(fs,JobSubmissionFiles.getJobSplitMetaFile(jobSubmitDir), 
        new FsPermission(JobSubmissionFiles.JOB_FILE_PERMISSION), splitVersion,
        info);
  }

  private static <T extends InputSplit> 
  SplitMetaInfo[] writeNewSplits(Configuration conf, 
      T[] array, FSDataOutputStream out)
  throws IOException, InterruptedException {

    SplitMetaInfo[] info = new SplitMetaInfo[array.length];
    if (array.length != 0) {
        ... (snip) ...
        info[i++] = 
          new JobSplit.SplitMetaInfo(
              locations, offset,
              split.getLength());
        offset += currCount - prevCount;
      }
    }
    return info;
  }


org.apache.hadoop.mapreduce.split.JobSplit class:
    public SplitMetaInfo(String[] locations, long startOffset, 
        long inputDataLength) {
      this.locations = locations;
      this.startOffset = startOffset;
      this.inputDataLength = inputDataLength;
    }

Thursday, September 14, 2017

Finding files changed after an HDP install (hotfixes, etc.)

[root@sandbox ~]# yum install yum-utils
Loaded plugins: fastestmirror, ovl, priorities
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: centos.mirror.serversaustralia.com.au
 * epel: mirror.overthewire.com.au
 * extras: mirror.overthewire.com.au
 * updates: mirror.ventraip.net.au
Package yum-utils-1.1.30-40.el6.noarch already installed and latest version
Nothing to do
[root@sandbox ~]# yumdb search from_repo "HDP-2.6" | grep -vP '^\s|^$|^Loaded' > yum_installed_from_HDP26.out
[root@sandbox ~]# head yum_installed_from_HDP26.out
atlas-metadata_2_6_0_3_8-0.8.0.2.6.0.3-8.noarch
atlas-metadata_2_6_0_3_8-falcon-plugin-0.8.0.2.6.0.3-8.noarch
atlas-metadata_2_6_0_3_8-hive-plugin-0.8.0.2.6.0.3-8.noarch
atlas-metadata_2_6_0_3_8-sqoop-plugin-0.8.0.2.6.0.3-8.noarch
atlas-metadata_2_6_0_3_8-storm-plugin-0.8.0.2.6.0.3-8.noarch
bigtop-jsvc-1.0.15-8.x86_64
bigtop-tomcat-6.0.44-1.noarch
datafu_2_6_0_3_8-1.3.0.2.6.0.3-8.noarch
falcon_2_6_0_3_8-0.10.0.2.6.0.3-8.noarch
flume_2_6_0_3_8-1.5.2.2.6.0.3-8.noarch
[root@sandbox ~]# for p in `cat yum_installed_from_HDP26.out`
> do
> rpm -V $p | grep -P '^..5|^missing' | grep \.jar$
> done

The grep \.jar$ is only needed if you want to limit the search to jar files.

If you would rather not install yum-utils, the following probably suffices:
rpm -qa | grep '\.2\.6\.0\.3-8' | sort > rpm_installed_from_HDP26.out
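The same rpm -V loop then works against that list too:

for p in `cat rpm_installed_from_HDP26.out`
do
rpm -V $p | grep -P '^..5|^missing'
done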

Thursday, September 7, 2017

Memo: loading another project as a library in the IntelliJ IDE

Example: adding hadoop-common-project/hadoop-auth.
(The hadoop project needs to be checked out beforehand.)

File => Project Structure
Modules
Press the + button. There are two; use the first one (Add (^N) at the top left).
Import Module
Select hadoop/hadoop-common-project/hadoop-auth/src
Create module from existing sources => Next => Next .... => Finish
At this point src/main/java and src/test/java should appear.


Alternatively (though this did not work for me):
File => Project Structure
Global Libraries
Press the + button. There are two; use the first one, or Add (^N).
Select hadoop/hadoop-common-project/hadoop-auth/src
OK

Afterwards, select the main module and, in its Dependencies tab, add the module you just imported.


Wednesday, September 6, 2017

HDP 2.5/2.6 Ranger Solr Plugin (unofficial)

References:
https://hadoop-and-hdp.blogspot.in/2017/09/ambari-240hdp-search-solrhdp-250.html
https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html (the Kerberos part may be unnecessary, or outdated?)
https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5.0+Installation#ApacheRanger0.5.0Installation-EnablingRangerSolrPlugin

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_security/content/solr_service.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_solr-search-installation/content/ch_hdp-search-install-ambari.html
http://public-repo-1.hortonworks.com/HDP-SOLR-2.6.1-100/repos/centos6/HDP-SOLR-2.6-100-centos6.tar.gz
http://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.9.tar.gz


Installing the plugin (when not using the mpack)

yum install ranger_*-solr-plugin


Configuring the plugin (needed even with mpack 2.2.5)

# Only if an older Ranger is auditing to a DB
cp /usr/share/java/mysql-connector-java.jar /usr/hdp/`hdp-select versions | tail -1`/ranger-solr-plugin/lib

# The config file path is used repeatedly, so store it in _f
_f="/usr/hdp/`hdp-select versions | tail -1`/ranger-solr-plugin/install.properties"

# Check the current values (verify later with the same command)
grep -E '^(SQL_CONNECTOR_JAR|COMPONENT_INSTALL_DIR_NAME|POLICY_MGR_URL|REPOSITORY_NAME)' $_f
COMPONENT_INSTALL_DIR_NAME=/opt/solr/server
POLICY_MGR_URL=
SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
REPOSITORY_NAME=

# Back up, just in case
cp -p $_f $_f.bak

# Set the variables to use
_RANGER_HOST="node13.localdomain"
_ZOOKEEPER_HOST="node13.localdomain"
_CLUSTER_NAME="ubu02c11"

# Change the settings (some values below may need adjusting for your environment)
sed -i -e 's@^COMPONENT_INSTALL_DIR_NAME=.*@COMPONENT_INSTALL_DIR_NAME=/opt/lucidworks-hdpsearch/solr/server@' $_f
sed -i -e 's@^POLICY_MGR_URL=.*@POLICY_MGR_URL=http://'$_RANGER_HOST':6080@' $_f
sed -i -e 's@^REPOSITORY_NAME=.*@REPOSITORY_NAME='$_CLUSTER_NAME'_solr@' $_f
#sed -i -e 's@^SQL_CONNECTOR_JAR=.*@SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar@' $_f

# As "root" user
. /etc/hadoop/conf/hadoop-env.sh
/usr/hdp/`hdp-select versions | tail -1`/ranger-solr-plugin/enable-solr-plugin.sh

# Increase the timeout, just in case
sed -i 's/(sleep 5)/(sleep 30)/g' /opt/lucidworks-hdpsearch/solr/bin/solr

service solr restart
# With the mpack, restart from Ambari instead

# Verify
ls -l /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/lib/*ranger*
lrwxrwxrwx 1 root root 93 Sep  6 09:18 /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/lib/ranger-plugin-classloader-0.6.0.2.5.0.0-1245.jar -> /usr/hdp/2.5.0.0-1245/ranger-solr-plugin/lib/ranger-plugin-classloader-0.6.0.2.5.0.0-1245.jar
lrwxrwxrwx 1 root root 68 Sep  6 09:18 /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/lib/ranger-solr-plugin-impl -> /usr/hdp/2.5.0.0-1245/ranger-solr-plugin/lib/ranger-solr-plugin-impl
lrwxrwxrwx 1 root root 91 Sep  6 09:18 /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/lib/ranger-solr-plugin-shim-0.6.0.2.5.0.0-1245.jar -> /usr/hdp/2.5.0.0-1245/ranger-solr-plugin/lib/ranger-solr-plugin-shim-0.6.0.2.5.0.0-1245.jar


# Point authorization at Ranger (some values below may need adjusting for your environment)
sudo -u solr /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost ${_ZOOKEEPER_HOST}:2181 -cmd put /solr/security.json '{"authentication":{"class": "org.apache.solr.security.KerberosPlugin"},"authorization":{"class": "org.apache.ranger.authorization.solr.authorizer.RangerSolrAuthorizer"}}'
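# Verify what was stored (a sketch; zkcli's "get" should print the JSON back)
sudo -u solr /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost ${_ZOOKEEPER_HOST}:2181 -cmd get /solr/security.json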


# With the mpack, restarting from Ambari overwrites this, so the files below need to be modified
/var/lib/ambari-server/resources/mpacks/solr-ambari-mpack-*/common-services/SOLR/*/package/scripts/setup_solr_kerberos_auth.py
/var/lib/ambari-agent/cache/common-services/SOLR/*/package/scripts/setup_solr_kerberos_auth.py

# Also, from mpack 2.2.9 this should be changeable from Advanced solr-security, but the property is named security_json rather than solr_security_json, and it logs "Solr Security Json was found, it will not be overridden"
#sudo -u solr /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost ${_ZOOKEEPER_HOST}:2181 -cmd clear /solr/security.json

# Add this Ranger property from Ambari under Custom ranger-admin-site
ranger.plugins.solr.serviceuser=solr

# proxyuser settings in core-site
hadoop.proxyuser.solr.hosts=*
hadoop.proxyuser.solr.groups=*

Add the Solr service from the Ranger UI

Service Name: ubu04c11_solr      # clustername_service
Username: amb_ranger_admin     # exists in x_portal_user
Password: r************d
Solr Url: http://$_SOLR_FQDN:8983/solr

#policy.download.auth.users = solr

Note:"/solr"が必要かどうかは不明。HWXとApacheで記述が異なる


To start Solr without Ambari:

sudo -u solr /opt/lucidworks-hdpsearch/solr/bin/solr start -h node11.localdomain -cloud -z node13.localdomain:2181/solr -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://node12.localdomain:8020/solr -Dsolr.hdfs.confdir=/usr/hdp/current/hadoop-client/conf -Dsolr.hdfs.security.kerberos.enabled=true -Dsolr.hdfs.security.kerberos.keytabfile=/etc/security/keytabs/solr.service.keytab -Dsolr.hdfs.security.kerberos.principal=solr/node11.localdomain@EXAMPLE.COM -p 8983 -m 512m >> /var/log/service_solr/solr-service.log 2>&1

# Check the status
/opt/lucidworks-hdpsearch/solr/bin/solr status

Found 1 Solr nodes:

Solr process 2216 running on port 8983
INFO  - 2017-09-06 09:37:14.907; org.apache.solr.util.SolrCLI; Set HttpClientConfigurer from: org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer
INFO  - 2017-09-06 09:37:15.116; org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer; Setting up SPNego auth with config: /etc/solr/conf/solr_server_jaas.conf
{
  "solr_home":"/etc/solr/data_dir",
  "version":"5.5.2 8e5d40b22a3968df065dfc078ef81cbb031f0e4a - sarowe - 2016-06-21 11:44:11",
  "startTime":"2017-09-06T09:36:49.46Z",
  "uptime":"0 days, 0 hours, 0 minutes, 26 seconds",
  "memory":"84.6 MB (%17.3) of 490.7 MB",
  "cloud":{
    "ZooKeeper":"node13.localdomain:2181/solr",
    "liveNodes":"1",

    "collections":"1"}}


# Confirm the Ranger Solr plugin can download its policies
su - solr
kinit -kt /etc/security/keytabs/solr.service.keytab solr/sandbox.hortonworks.com
curl -v -u amb_ranger_admin "http://sandbox.hortonworks.com:6080/service/plugins/secure/policies/download/Sandbox_solr"
curl -v --negotiate -u: "http://sandbox.hortonworks.com:6080/service/plugins/secure/policies/download/Sandbox_solr"


Common errors

xasecure.policymgr.clientssl.truststore cannot be read by the plugin's service user

1. collection1_shard1_replica1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Index dir 'hdfs://node12.localdomain:8020/solr/collection1/core_node1/data/index/' of core 'collection1_shard1_replica1' is already locked. The most likely cause is another Solr server (or another solr core in this server) also configured to use this directory; other possible causes may be specific to lockType: hdfs

https://issues.apache.org/jira/browse/SOLR-8335

hdfs dfs -ls -R /solr/ | grep write.lock
-rw-r--r--   3 solr hdfs          0 2017-08-31 08:26 hdfs://node12.localdomain:8020/solr/collection1/core_node1/data/index/write.lock
-rw-r--r--   3 solr hdfs          0 2017-08-31 08:26 hdfs://node12.localdomain:8020/solr/collection1/core_node2/data/index/write.lock

2. /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost node13.localdomain:2181 -cmd makepath /solr
Exception in thread "main" org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /solr
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
        at org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:501)
        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:498)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:455)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:442)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:398)
        at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:258)

3. service solr start
/opt/solr not found! Please check the SOLR_INSTALL_DIR setting in your /etc/init.d/solr script.
/etc/default/solr.in.sh not found! Please check the SOLR_ENV setting in your /etc/init.d/solr script.

ln -s /opt/lucidworks-hdpsearch/solr /opt/solr
ln -s /opt/lucidworks-hdpsearch/solr/bin/solr.in.sh /etc/default/solr.in.sh

4. org.apache.ranger.authorization.hadoop.utils.RangerCredentialProvider (RangerCredentialProvider.java:72) - Unable to get the Credential Provider from the Configuration
java.lang.IllegalArgumentException: The value of property hadoop.security.credential.provider.path must not be null
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1010)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:991)
        at org.apache.ranger.authorization.hadoop.utils.RangerCredentialProvider.getCredentialProviders(RangerCredentialProvider.java:68)
...

/usr/hdp/2.6.0.3-8/ranger-solr-plugin/install/conf.templates/enable/ranger-policymgr-ssl.xml
/opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/classes/ranger-policymgr-ssl.xml

ln -s /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/WEB-INF/classes/ranger-policymgr-ssl.xml /etc/solr/conf/ranger-policymgr-ssl.xml



5. SLF4J: Failed toString() invocation on an object of type [org.apache.solr.servlet.HttpSolrCall$2]
java.lang.NullPointerException
        at org.apache.solr.servlet.HttpSolrCall$2.toString(HttpSolrCall.java:1001)
        at org.slf4j.helpers.MessageFormatter.safeObjectAppend(MessageFormatter.java:305)
        at org.slf4j.helpers.MessageFormatter.deeplyAppendParameter(MessageFormatter.java:277)
        at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:231)
        at org.slf4j.helpers.MessageFormatter.format(MessageFormatter.java:152)
        at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:345)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:431)
...

https://issues.apache.org/jira/browse/SOLR-10630
https://issues.apache.org/jira/browse/RANGER-1446

Example process:

solr      6218  8.5  3.3 5852532 547840 ?      Sl   10:04   0:20 /usr/jdk64/jdk1.8.0_77/bin/java -server -Xms512m -Xmx512m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:MaxDirectMemorySize=20g -XX:+UseLargePages -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/solr/solr_gc.log -DzkClientTimeout=15000 -DzkHost=node13.localdomain:2181/solr -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Dhost=node11.localdomain -Duser.timezone=UTC -Djetty.home=/opt/lucidworks-hdpsearch/solr/server -Dsolr.solr.home=/etc/solr/data_dir -Dsolr.install.dir=/opt/lucidworks-hdpsearch/solr -Dlog4j.configuration=file:/etc/solr/conf/log4j.properties -Xss256k -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://node12.localdomain:8020/solr -Dsolr.hdfs.confdir=/usr/hdp/current/hadoop-client/conf -Dsolr.hdfs.security.kerberos.enabled=true -Dsolr.hdfs.security.kerberos.keytabfile=/etc/security/keytabs/solr.service.keytab -Dsolr.hdfs.security.kerberos.principal=solr/node11.localdomain@EXAMPLE.COM -Dsolr.authentication.httpclient.configurer=org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer -Djava.security.auth.login.config=/etc/solr/conf/solr_server_jaas.conf -Dsolr.kerberos.cookie.domain=node11.localdomain -Dsolr.kerberos.cookie.portaware=true -Dsolr.kerberos.principal=HTTP/node11.localdomain@EXAMPLE.COM -Dsolr.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab -XX:OnOutOfMemoryError=/opt/lucidworks-hdpsearch/solr/bin/oom_solr.sh 8983 /var/log/solr -jar start.jar --module=http

Coexisting with Ambari Infra

grep 'ps aux' /opt/lucidworks-hdpsearch/solr/bin/solr | grep 'start'
" | grep -v ambari-infra"を追加する

To configure auth_to_local, add the name rules to SOLR_AUTHENTICATION_OPTS (e.g. in solr.in.sh):

SOLR_AUTHENTICATION_OPTS=" -DauthenticationPlugin=org.apache.solr.security.KerberosPlugin -Djava.security.auth.login.config=${SOLR_JAAS_FILE} -Dsolr.kerberos.principal=${SOLR_KERB_PRINCIPAL} -Dsolr.kerberos.keytab=${SOLR_KERB_KEYTAB} -Dsolr.kerberos.cookie.domain=${SOLR_HOST} -Dhost=${SOLR_HOST} -Dsolr.kerberos.name.rules=RULE:[1:\$1@\$0](.*EXAMPLE.COM)s/@.*//LDEFAULT"

Mpack issues

With mpack 2.2.8, things do not seem to work when Ranger uses SSL.
Worse, after installing the mpack, security.json cannot be changed (Ambari rewrites it back to its default).
Plugin install location: /usr/hdp/2.6.0.3-8/ranger-solr-plugin
Unhelpfully, install.properties has to be configured by hand.

https://issues.apache.org/jira/browse/RANGER-1446
https://issues.apache.org/jira/browse/RANGER-1658
Apparently fixed in HDP 2.6.3.

Trying to install the mpack on the Sandbox gives:
ambari-server install-mpack --mpack=/tmp/solr-service-mpack-2.2.9.tar.gz --verbose
...
INFO: Loading properties from /etc/ambari-server/conf/ambari.properties
source:/var/lib/ambari-server/resources/mpacks/solr-ambari-mpack-2.2.9/common-services/SOLR/5.5.4
link_name:/var/lib/ambari-server/resources/common-services/SOLR/5.5.4
Traceback (most recent call last):
  File "/usr/sbin/ambari-server.py", line 941, in <module>
   ... (snip)...
    sudo.symlink(src_path, dest_link)
  File "/usr/lib/python2.6/site-packages/resource_management/core/sudo.py", line 125, in symlink
    os.symlink(source, link_name)
OSError: [Errno 17] File exists

Or:
ERROR: Management pack solr-ambari-mpack-2.2.N already installed!
ERROR: Exiting with exit code -1.
REASON: Management pack solr-ambari-mpack-2.2.N already installed!

find /var/lib/ambari-server -type l -ls | grep SOLR
#rm -rf /var/lib/ambari-server/resources/mpacks/solr-ambari-mpack*
#rm /var/lib/ambari-server/resources/common-services/SOLR/5.5.2.2.*
# This alone is probably enough
rm /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SOLR
#rm -rf /var/lib/ambari-server/resources/extensions/SOLR/*

NOTE: ambari-server backup can take a backup!

Other scratch notes: the Solr JAAS config lives at /etc/solr/conf/solr_server_jaas.conf, and the following log4j setting quiets the Ranger authorizer:

log4j.logger.org.apache.ranger.authorization.solr.authorizer.RangerSolrAuthorizer=WARN

Monday, September 4, 2017

Using the Hive2 Hive CLI on HDP 2.5.0 and later

For some reason, running the Hive CLI under Hive2 gives:
sudo -u hive /usr/hdp/current/hive-server2-hive2/bin/hive
Please use beeline (or another JDBC client) to access data with Hive 2.

As of today this change is not in Apache Hive2 master, so it appears to be HDP-specific.

/usr/hdp/current/hive-server2-hive2/bin/hive.distro
 82 if [ "$SERVICE" = "" ] ; then
 83   if [ "$HELP" = "_help" ] ; then
 84     SERVICE="help"
 85   else
 86     echo "Please use beeline (or another JDBC client) to access data with Hive 2."
 87     SERVICE="cli"    # add this line
 88     #exit 1
 89   fi
 90 fi

Example change:
_f=/usr/hdp/current/hive-server2-hive2/bin/hive.distro
_n=`awk "/^[[:blank:]]+echo \"Please use beeline \(or another JDBC client\) to access data with Hive 2.\"/{ print NR; exit }" $_f`
[ -n "$_n" ] && sed -i "$_n,$(( $_n + 1 )) s/exit 1/SERVICE=\"cli\"/" $_f


But then, when trying to INSERT anything:

Vertex killed, vertexName=Reducer 2, vertexId=vertex_1504233014419_0003_1_01, diagnostics=[Vertex received Kill in NEW state., Vertex vertex_1504233014419_0003_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1504233014419_0003_1_00, diagnostics=[Vertex vertex_1504233014419_0003_1_00 [Map 1] killed/failed due to:INIT_FAILURE, Fail to create InputInitializerManager, org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
        at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:70)
        at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:151)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4031)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:204)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2855)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2802)
        at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2784)
        at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1925)
        at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:203)
        at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2290)
        at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2276)
        at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
        ... 25 more
Caused by: java.lang.IllegalArgumentException: No running LLAP daemons! Please check LLAP service status and zookeeper configuration
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
        at org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:47)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:121)
        ... 30 more

In that case, the following setting appears to disable LLAP:
set hive.execution.mode=container;
TODO: hive.llap.execution.mode=none

sudo -u hive /usr/hdp/current/hive-server2-hive2/bin/hive -hiveconf hive.execution.mode=container

Friday, September 1, 2017

Installing HDP Search (Solr) on HDP 2.5.0 with Ambari 2.4.0

Management Packs appear to have been introduced in Ambari 2.4.

https://issues.apache.org/jira/browse/AMBARI-14854
https://cwiki.apache.org/confluence/display/AMBARI/Management+Packs

Installation steps for HDP 2.6.1 (Ambari 2.5.x):
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_solr-search-installation/content/ch_hdp-search-install-ambari.html
Installation steps for HDP 2.5.6:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_solr-search-installation/bk_solr-search-installation.pdf

Now, trying to use solr-service-mpack-2.2.8.tar.gz with a different Ambari version:

[root@node11 tmp]# ambari-server install-mpack --mpack=/tmp/solr-service-mpack-2.2.8.tar.gz
Using python  /usr/bin/python
Installing management pack
ERROR: Prerequisite failure! Current Ambari Version = 2.4.0.1, Min Ambari Version = 2.5.0.0
ERROR: Exiting with exit code -1.
REASON: Prerequisites for management pack solr-ambari-mpack-2.2.8 failed!


So you need to find an older version:
s3cmd ls s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/
2017-04-04 10:32     21688   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.8.tar.gz
2017-04-04 10:33       836   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.8.tar.gz.asc
2017-04-04 10:32       115   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.8.tar.gz.md5
2017-05-02 20:25     22518   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.9.tar.gz
2017-05-02 20:51       836   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.9.tar.gz.asc
2017-05-02 20:51       115   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-2.2.9.tar.gz.md5
2016-08-30 13:26     13290   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-5.5.2.2.5.tar.gz
2016-08-30 13:26       836   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-5.5.2.2.5.tar.gz.asc
2016-08-30 13:26       123   s3://public-repo-1.hortonworks.com/HDP-SOLR/hdp-solr-ambari-mp/solr-service-mpack-5.5.2.2.5.tar.gz.md5

After downloading the matching older file above (solr-service-mpack-5.5.2.2.5.tar.gz in this case), the steps are basically the same as in the PDF.
You need to add the following to /var/lib/ambari-server/resources/stacks/HDP/2.5/repos/repoinfo.xml (perhaps not needed for 2.6?):
  <repo>
    <baseurl>http://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/centos6/</baseurl>
    <repoid>HDP-SOLR-2.5-100</repoid>
    <reponame>HDP-SOLR</reponame>
  </repo>

Then, after restarting Ambari Server, you can add Solr from the Add Service Wizard.

Incidentally, the repos cover the following OSes:
s3cmd ls s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/centos6/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/centos7/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/debian6/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/debian7/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/suse11sp3/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/ubuntu12/
                       DIR   s3://public-repo-1.hortonworks.com/HDP-SOLR-2.5-100/repos/ubuntu14/