1 Introduction
Cloudera repackages and enhances the native Apache Hadoop components, and it simplifies their deployment.
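2 Background
2.1 Software to deploy
1) Oracle JDK
2) Cloudera Manager Server and Agent packages
3) Supporting database software
4) CDH and managed service software
2.2 Installation methods and deployment steps
2.2.1 Installation methods
A) Cloudera Manager installer (easy)
B) Installation from a yum repository (moderate)
C) Installation from source (hard)
Note: this tutorial uses method B.
2.2.2 Deployment steps
1) Install the JDK
2) Install and configure the database
3) Install the Cloudera Manager Server
4) Install the Cloudera Manager Agents
5) Install CDH and the managed service software
6) Create, start, and configure CDH and the managed services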
2.3 Files installed by the Cloudera Manager Server package
    rpm -ql cloudera-manager-server
Output:
    /etc/cloudera-scm-server
    /etc/cloudera-scm-server/db.properties
    /etc/cloudera-scm-server/log4j.properties
    /etc/default/cloudera-scm-server
    /etc/rc.d/init.d/cloudera-scm-server
    /opt/cloudera/csd
    /opt/cloudera/parcel-repo
    /usr/sbin/cmf-server
    /var/log/cloudera-scm-server
    /var/run/cloudera-scm-server
What the files and directories are for:
1) The second to fourth entries (db.properties, log4j.properties, and /etc/default/cloudera-scm-server) are the Cloudera Manager Server configuration files.
2) /opt/cloudera/parcel-repo is where downloaded installation packages (parcels) are stored.
3 Hands-on
3.1 Environment
3.1.1 Operating system
OS = CentOS 6.6 x86_64
Note: use a minimal installation; otherwise the Sqoop service may fail to start.
3.1.2 Hosts
Cloudera Manager:
ip address=10.168.0.120
hostname=cdm-m.cmdschool.org
Cloudera Host1:
ip address=10.168.0.121
hostname=cdm-h1.cmdschool.org
Cloudera Host2:
ip address=10.168.0.122
hostname=cdm-h2.cmdschool.org
Cloudera Host3:
ip address=10.168.0.123
hostname=cdm-h3.cmdschool.org
Cloudera Host4:
ip address=10.168.0.124
hostname=cdm-h4.cmdschool.org
3.2 Runtime environment configuration
On Cloudera Manager and Cloudera Host[1-4]
3.2.1 Disable SELinux
    getenforce
If the output is:
    Enforcing
then run:
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
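The sed command above only matches a host that is currently set to enforcing. A minimal sketch of an idempotent variant follows; the helper name set_selinux_mode is hypothetical, and it assumes the file uses the standard SELINUX= key.

```shell
# set_selinux_mode FILE MODE  (hypothetical helper, not part of SELinux)
# Rewrite the SELINUX= line of FILE to MODE, whatever its current value is.
# FILE would normally be /etc/selinux/config.
set_selinux_mode() (
    file=$1
    mode=$2
    tmp=$(mktemp) || exit 1
    # Only the SELINUX= key matches; SELINUXTYPE= is left untouched.
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

Run as root, `set_selinux_mode /etc/selinux/config disabled` also covers a host that was left in permissive mode, and running it twice changes nothing.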
3.2.2 Configure hosts
Edit /etc/hosts with vim and add:
    10.168.0.120 cdm-m.cmdschool.org
    10.168.0.121 cdm-h1.cmdschool.org
    10.168.0.122 cdm-h2.cmdschool.org
    10.168.0.123 cdm-h3.cmdschool.org
    10.168.0.124 cdm-h4.cmdschool.org
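Because the same five entries go onto every host, the edit can be scripted so that rerunning it never duplicates a line. This is a sketch under that assumption; add_host_entry and add_cluster_hosts are hypothetical helper names.

```shell
# add_host_entry FILE IP NAME  (hypothetical helper)
# Append "IP NAME" to FILE unless a non-comment line already lists NAME.
add_host_entry() (
    file=$1 ip=$2 name=$3
    if awk -v n="$name" '
        !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file"; then
        exit 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
)

# add_cluster_hosts FILE: add the five hosts used in this tutorial to FILE.
add_cluster_hosts() {
    while read -r ip name; do
        add_host_entry "$1" "$ip" "$name" || return 1
    done <<'EOF'
10.168.0.120 cdm-m.cmdschool.org
10.168.0.121 cdm-h1.cmdschool.org
10.168.0.122 cdm-h2.cmdschool.org
10.168.0.123 cdm-h3.cmdschool.org
10.168.0.124 cdm-h4.cmdschool.org
EOF
}
```

On a real host the call would be `add_cluster_hosts /etc/hosts`, run as root.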
3.2.3 Check the hostname
    hostname
Note: if the hostname does not match the entries above, the services may fail to start.
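The comparison between the hostname and the hosts file can also be done by a script instead of by eye. A small sketch, with lookup_host as a hypothetical helper name:

```shell
# lookup_host FILE NAME  (hypothetical helper)
# Print the first IP that FILE maps NAME to; exit non-zero if NAME is absent.
lookup_host() {
    awk -v n="$2" '
        !/^[ \t]*#/ {
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; found = 1; exit }
        }
        END { exit !found }
    ' "$1"
}
```

For example, `lookup_host /etc/hosts "$(hostname)" || echo "hostname is missing from /etc/hosts"` flags a host whose name was never added.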
3.2.4 Configure sudo (single-user mode only, optional)
    visudo
Add the following group entry:
    %cloudera-scm ALL=(ALL) NOPASSWD: ALL
Confirm that this line is present:
    Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
Edit /etc/pam.d/su with vim and make sure it contains this line:
    session required pam_limits.so
3.2.5 Stop the firewall and disable it at boot
    /etc/init.d/iptables stop
    chkconfig iptables off
3.2.6 Tune swappiness (vm.swappiness)
1) Check the current swappiness
    cat /proc/sys/vm/swappiness
The command prints the current value.
2) Lower swappiness temporarily
    sysctl vm.swappiness=0
3) Lower swappiness permanently
Edit /etc/sysctl.conf with vim:
    kernel.shmall = 4294967296
    vm.swappiness = 0
Then apply the change:
    sysctl -p
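The permanent change amounts to "replace the key if it is there, append it if it is not". A sketch of that rule as a function, assuming the usual "key = value" layout of sysctl.conf; set_sysctl is a hypothetical helper name.

```shell
# set_sysctl FILE KEY VALUE  (hypothetical helper)
# Replace an existing "KEY = ..." line in FILE, or append one if it is missing.
set_sysctl() (
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || exit 1
    awk -v k="$key" -v v="$value" '
        {
            # Take the text left of "=" and strip blanks before comparing.
            name = $0
            sub(/=.*/, "", name)
            gsub(/[ \t]/, "", name)
            if (name == k && index($0, "=") > 0) {
                print k " = " v
                done = 1
                next
            }
            print
        }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

As root, `set_sysctl /etc/sysctl.conf vm.swappiness 0 && sysctl -p` would then apply the value. Commented-out lines are left alone because their key starts with "#".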
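3.2.7 Work around the transparent huge pages issue
1) Check the transparent huge pages defrag setting
    cat /sys/kernel/mm/transparent_hugepage/defrag
If the output is:
    [always] madvise never
then defragmentation is active (the value in brackets is the current setting) and should be turned off.
2) Turn it off temporarily
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
Confirm that the setting took effect:
    cat /sys/kernel/mm/transparent_hugepage/defrag
The output should be:
    always madvise [never]
3) Apply the setting at boot
Edit /etc/rc.local with vim and add:
    echo never > /sys/kernel/mm/transparent_hugepage/defrag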
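The active choice is the word in square brackets, which is easy to extract when the check has to run on several hosts. A sketch, with thp_active as a hypothetical helper name:

```shell
# thp_active FILE  (hypothetical helper)
# Print the bracketed (active) choice from a transparent huge pages control file.
thp_active() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$1"
}
```

A guarded version of the rc.local line could then read: `[ "$(thp_active /sys/kernel/mm/transparent_hugepage/defrag)" = never ] || echo never > /sys/kernel/mm/transparent_hugepage/defrag`.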
3.3 Installing from the yum repository
3.3.1 Common yum repository setup
On Cloudera Manager and Cloudera Host[1-4]
1) Configure the yum repository
Download the default repository file:
    wget -P /etc/yum.repos.d/ https://archive.cloudera.com/cm5/RedHat/6/x86_64/cm/cloudera-manager.repo
Pin the repository to a specific version:
Edit /etc/yum.repos.d/cloudera-manager.repo with vim and change this parameter:
    baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.6.0/
2) Install the basic tools
    yum install -y vim wget openssh-clients
3) Install the JDK
    yum install -y oracle-j2sdk1.7
4) Install Python
    yum install -y python
5) Install ntpd
    yum install -y ntp
3.3.2 yum packages on the Cloudera Manager host
On Cloudera Manager
1) Install the Cloudera Manager packages
    yum install -y cloudera-manager-daemons cloudera-manager-server
2) Install MySQL
    yum install -y mysql-server mysql-devel mysql
3.3.3 yum packages on the Cloudera Manager Agent hosts
On Cloudera Host[1-4]
Install the Cloudera Manager Agent packages:
    yum install -y cloudera-manager-agent cloudera-manager-daemons
3.4 Environment configuration that depends on the yum packages
3.4.1 Configure the JDK environment variables
On Cloudera Manager and Cloudera Host[1-4]
1) Edit /etc/profile with vim and append:
    export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
2) Load the Java environment variables
    source /etc/profile
3) Test the JDK configuration
    java -version
3.4.2 Check permissions (single-user mode only, optional)
On Cloudera Manager and Cloudera Host[1-4]
Check that the cloudera-scm user has full permissions on the following directories.
Check the directory itself:
    ls -ld /opt/cloudera/
Output:
    drwxr-xr-x. 4 cloudera-scm cloudera-scm 4096 May 23 13:51 /opt/cloudera/
Check the subdirectories:
    ls -lR /opt/cloudera/
Output:
    /opt/cloudera/:
    total 8
    drwxr-xr-x. 2 cloudera-scm cloudera-scm 4096 Feb 12 11:28 csd
    drwxr-xr-x. 2 cloudera-scm cloudera-scm 4096 Feb 12 11:28 parcel-repo

    /opt/cloudera/csd:
    total 0

    /opt/cloudera/parcel-repo:
    total 0
Likewise, check the server and agent directories:
    ls -ld /var/log/cloudera-scm-server/
    ls -lR /var/log/cloudera-scm-server/
    ls -ld /var/lib/cloudera-scm-agent/
    ls -lR /var/lib/cloudera-scm-agent/
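Reading recursive ls listings on five hosts is error-prone, so the ownership check can be turned into a command that prints only the offending paths. A sketch using find; not_owned_by is a hypothetical helper name.

```shell
# not_owned_by USER DIR...  (hypothetical helper)
# Print every path under the given directories that USER does not own.
# No output means the ownership is as expected.
not_owned_by() {
    owner=$1
    shift
    find "$@" ! -user "$owner" -print
}
```

For the directories above the call would be `not_owned_by cloudera-scm /opt/cloudera /var/log/cloudera-scm-server /var/lib/cloudera-scm-agent`, using whichever of them exist on the host. This checks ownership only, not the permission bits.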
3.4.3 Check the process and file limits
On Cloudera Manager and Cloudera Host[1-4]
    cat /etc/security/limits.d/cloudera-scm.conf
Output:
    #
    # (c) Copyright 2014 Cloudera, Inc.
    #
    cloudera-scm soft nofile 32768
    cloudera-scm soft nproc 65536
    cloudera-scm hard nofile 1048576
    cloudera-scm hard nproc unlimited
    cloudera-scm hard memlock unlimited
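To compare one configured limit across hosts, a single value can be pulled out of the file. A sketch assuming the four-column "domain type item value" layout shown above; limit_for is a hypothetical helper name.

```shell
# limit_for FILE USER TYPE ITEM  (hypothetical helper)
# Print the value configured for USER/TYPE/ITEM in a limits file.
# If the same entry appears more than once, the last one is printed.
limit_for() {
    awk -v u="$2" -v t="$3" -v i="$4" '
        !/^[ \t]*#/ && $1 == u && $2 == t && $3 == i { value = $4 }
        END { if (value == "") exit 1; print value }
    ' "$1"
}
```

For example, `limit_for /etc/security/limits.d/cloudera-scm.conf cloudera-scm soft nofile` prints the soft open-file limit from that file, and the command exits non-zero when the entry is missing.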
3.4.4 Cloudera Manager host configuration
On Cloudera Manager
1) Synchronize the clock once
    ntpdate 0.centos.pool.ntp.org
2) Start ntpd and enable it at boot
    /etc/init.d/ntpd start
    chkconfig ntpd on
3.4.5 Cloudera Manager Agent host configuration
On Cloudera Host[1-4]
1) Synchronize the clock once
    ntpdate 10.168.0.120
2) Edit /etc/ntp.conf with vim
Comment out the public time servers and add the internal time server:
    #server 0.centos.pool.ntp.org iburst
    #server 1.centos.pool.ntp.org iburst
    #server 2.centos.pool.ntp.org iburst
    #server 3.centos.pool.ntp.org iburst
    server 10.168.0.120 iburst
3) Start ntpd and enable it at boot
    /etc/init.d/ntpd start
    chkconfig ntpd on
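The ntp.conf edit is the same on all four Agent hosts, so it lends itself to a small script. The sketch below comments out every active server line and adds the internal one; use_internal_ntp is a hypothetical helper name, and it assumes server directives start at the beginning of the line.

```shell
# use_internal_ntp FILE SERVER  (hypothetical helper)
# Comment out every active "server" line in FILE except SERVER,
# and append "server SERVER iburst" if it is not already present.
use_internal_ntp() (
    file=$1 server=$2
    tmp=$(mktemp) || exit 1
    awk -v s="$server" '
        $1 == "server" && $2 == s { seen = 1; print; next }   # already set
        $1 == "server"            { print "#" $0; next }      # disable others
        { print }
        END { if (!seen) print "server " s " iburst" }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

On an Agent host the call would be `use_internal_ntp /etc/ntp.conf 10.168.0.120`, followed by the ntpd start in step 3.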
3.4.7 Install the MySQL JDBC driver
On Cloudera Manager and Cloudera Host[1-4]
    wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz
    tar zxvf mysql-connector-java-5.1.39.tar.gz
    mkdir -p /usr/share/java/
    cp mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar /usr/share/java/mysql-connector-java.jar
3.4.8 Configure public key authentication
On Cloudera Manager
Generate the key pair:
    ssh-keygen -t rsa
Note: press Enter at every prompt.
Copy the public key to the Manager itself and to every Agent host:
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.120
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.121
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.122
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.123
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.124
Test the logins from the Manager:
    ssh 10.168.0.120
    ssh 10.168.0.121
    ssh 10.168.0.122
    ssh 10.168.0.123
    ssh 10.168.0.124
Note: the setup works if every login succeeds without a password prompt.
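The five manual logins can be replaced by a loop that never stops to ask for a password. A sketch: check_passwordless is a hypothetical helper name, and the SSH variable is an assumption added here so that the client can be swapped out (for example in a dry run).

```shell
# check_passwordless HOST...  (hypothetical helper)
# Try a non-interactive root login on each host and report the result.
# SSH may name a replacement client; it defaults to ssh.
check_passwordless() (
    rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true </dev/null; then
            echo "${host}: ok"
        else
            echo "${host}: FAILED"
            rc=1
        fi
    done
    exit "$rc"
)
```

From the Manager: `check_passwordless 10.168.0.120 10.168.0.121 10.168.0.122 10.168.0.123 10.168.0.124`. In batch mode ssh also refuses hosts whose key is not yet in known_hosts, so a host that was never contacted before is reported as FAILED as well.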
3.5 Installing and configuring Cloudera Manager
On Cloudera Manager
3.5.1 Adjust the MySQL parameters
1) Stop the database
    /etc/init.d/mysqld stop
2) Back up the ib_logfile files
    mkdir /var/lib/backup
    cd /var/lib/mysql/
    mv ib_logfile* /var/lib/backup/
3) Edit /etc/my.cnf with vim
Add the following parameters:
    [mysqld]
    transaction-isolation = READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    # symbolic-links = 0

    key_buffer = 16M
    key_buffer_size = 32M
    max_allowed_packet = 32M
    thread_stack = 256K
    thread_cache_size = 64
    query_cache_limit = 8M
    query_cache_size = 64M
    query_cache_type = 1

    max_connections = 550
    #expire_logs_days = 10
    #max_binlog_size = 100M

    #log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
    #and chown the specified folder to the mysql user.
    log_bin=/var/lib/mysql/mysql_binary_log

    # For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
    binlog_format = mixed

    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M

    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit = 2
    innodb_log_buffer_size = 64M
    innodb_buffer_pool_size = 4G
    innodb_thread_concurrency = 8
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 512M

    [mysqld_safe]
    log-error=/var/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid

    sql_mode=STRICT_ALL_TABLES
3.5.2 Start MySQL and enable it at boot
    /etc/init.d/mysqld start
    chkconfig mysqld on
3.5.3 Initialize the database
    mysql_secure_installation
The wizard proceeds as follows:
    [...]
    Enter current password for root (enter for none):
    OK, successfully used password, moving on...
    [...]
    Set root password? [Y/n] y
    New password:
    Re-enter new password:
    Remove anonymous users? [Y/n] y
    [...]
    Disallow root login remotely? [Y/n] n
    [...]
    Remove test database and access to it [Y/n] y
    [...]
    Reload privilege tables now? [Y/n] y
    All done!
3.5.4 Prepare the scm database
1) Method 1
Database configuration:
    mysql -uroot -p
    create database scm default character set utf8;
    grant all privileges on *.* to scm@'cdm-m.cmdschool.org' identified by 'scm';
    flush privileges;
Edit /etc/cloudera-scm-server/db.properties with vim and set these parameters:
    com.cloudera.cmf.db.type=mysql
    com.cloudera.cmf.db.host=cdm-m.cmdschool.org
    com.cloudera.cmf.db.name=scm
    com.cloudera.cmf.db.user=scm
    com.cloudera.cmf.db.password=scm
2) Method 2 (officially recommended)
Grant privileges to a temporary user named temp:
    mysql -uroot -p
    grant all privileges on *.* to 'temp'@'%' identified by 'temp' with grant option;
    flush privileges;
Generate the configuration file:
    /usr/share/cmf/schema/scm_prepare_database.sh mysql -h cdm-m.cmdschool.org -utemp -ptemp --scm-host cdm-m.cmdschool.org scm scm scm
Output:
    JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    Verifying that we can write to /etc/cloudera-scm-server
    Creating SCM configuration file in /etc/cloudera-scm-server
    Executing: /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
    [main] DbCommandExecutor INFO Successfully connected to database.
    All done, your SCM database is configured correctly!
Check the generated file:
    cat /etc/cloudera-scm-server/db.properties
Output:
    # Auto-generated by scm_prepare_database.sh on Tue May 24 19:08:19 CST 2016
    #
    # For information describing how to configure the Cloudera Manager Server
    # to connect to databases, see the "Cloudera Manager Installation Guide."
    #
    com.cloudera.cmf.db.type=mysql
    com.cloudera.cmf.db.host=cdm-m.cmdschool.org
    com.cloudera.cmf.db.name=scm
    com.cloudera.cmf.db.user=scm
    com.cloudera.cmf.db.password=scm
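Whichever method produced db.properties, its values can be read back in a script, for instance to test the database login with exactly the credentials the server will use. A sketch assuming plain key=value lines as in the output above; prop_get is a hypothetical helper name.

```shell
# prop_get FILE KEY  (hypothetical helper)
# Print the value of KEY from a Java-style properties file,
# or exit non-zero if KEY is not set.
prop_get() {
    awk -v k="$2=" '
        # index() compares literally, so the dots in KEY are not wildcards.
        index($0, k) == 1 { value = substr($0, length(k) + 1); found = 1 }
        END { if (!found) exit 1; print value }
    ' "$1"
}
```

With `f=/etc/cloudera-scm-server/db.properties`, a login test could look like `mysql -h "$(prop_get "$f" com.cloudera.cmf.db.host)" -u "$(prop_get "$f" com.cloudera.cmf.db.user)" -p"$(prop_get "$f" com.cloudera.cmf.db.password)" -e 'select 1' "$(prop_get "$f" com.cloudera.cmf.db.name)"`.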
Check the database privileges:
    mysql -uroot -p
    show grants for scm@'cdm-m.cmdschool.org';
Output:
    +----------------------------------------------------------------------------------------------------------------------+
    | Grants for scm@cdm-m.cmdschool.org                                                                                   |
    +----------------------------------------------------------------------------------------------------------------------+
    | GRANT USAGE ON *.* TO 'scm'@'cdm-m.cmdschool.org' IDENTIFIED BY PASSWORD '*45E6E3C68BDF1AC7EBB5C5A3BCBD5E9437B293BE' |
    | GRANT ALL PRIVILEGES ON `scm`.* TO 'scm'@'cdm-m.cmdschool.org'                                                       |
    +----------------------------------------------------------------------------------------------------------------------+
    2 rows in set (0.00 sec)
Remove the temporary database user:
    drop user 'temp'@'%';
    flush privileges;
3.5.5 Create the additional databases (optional)
1) List of additional databases
Role | Database | User | Password |
Activity Monitor | amon | amon | amon_password |
Reports Manager | rman | rman | rman_password |
Hive Metastore Server | metastore | hive | hive_password |
Sentry Server | sentry | sentry | sentry_password |
Cloudera Navigator Audit Server | nav | nav | nav_password |
Cloudera Navigator Metadata Server | navms | navms | navms_password |
2) Create the databases and set the account passwords
    mysql -uroot -p
    create database amon default character set utf8;
    grant all privileges on amon.* to 'amon'@'%' identified by 'amon_password';

    create database rman default character set utf8;
    grant all privileges on rman.* to 'rman'@'%' identified by 'rman_password';

    create database metastore default character set utf8;
    grant all privileges on metastore.* to 'hive'@'%' identified by 'hive_password';

    create database sentry default character set utf8;
    grant all privileges on sentry.* to 'sentry'@'%' identified by 'sentry_password';

    create database nav default character set utf8;
    grant all privileges on nav.* to 'nav'@'%' identified by 'nav_password';

    create database navms default character set utf8;
    grant all privileges on navms.* to 'navms'@'%' identified by 'navms_password';
    flush privileges;
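The statements above follow one pattern per row of the table, so they can be generated from the table instead of typed. A sketch: emit_db_sql is a hypothetical helper name, and it assumes names and passwords contain no spaces or quotes.

```shell
# emit_db_sql  (hypothetical helper)
# Read "database user password" triples on stdin and print the matching
# create database / grant statements, followed by flush privileges.
emit_db_sql() {
    while read -r db user pass; do
        [ -n "$db" ] || continue    # skip blank lines
        printf 'create database %s default character set utf8;\n' "$db"
        printf "grant all privileges on %s.* to '%s'@'%%' identified by '%s';\n" \
            "$db" "$user" "$pass"
    done
    printf 'flush privileges;\n'
}

# Example (not run here): pipe the generated SQL into the mysql client.
#   emit_db_sql <<'EOF' | mysql -uroot -p
#   amon amon amon_password
#   rman rman rman_password
#   metastore hive hive_password
#   EOF
```

Printing the SQL first and piping it to mysql only after a review keeps the step easy to audit.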
3.5.6 Configure the Oozie database (optional)
1) Database privileges
    mysql -uroot -p
    create database oozie default character set utf8;
    grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
    grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
    flush privileges;
2) Create the symbolic link that Oozie needs for the JDBC driver
    cd /opt/cloudera/parcels/CDH/lib/oozie/lib/
    ln -s /usr/share/java/mysql-connector-java.jar mysql-connector-java.jar
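A plain ln -s fails when the link already exists or when the target directory is missing. The sketch below guards both cases; link_jdbc_driver is a hypothetical helper name, and it is an assumption here that /opt/cloudera/parcels/CDH only exists once the CDH parcel has been activated on the host.

```shell
# link_jdbc_driver JAR DIR  (hypothetical helper)
# Create or refresh DIR/mysql-connector-java.jar as a symlink to JAR.
link_jdbc_driver() {
    jar=$1
    dir=$2
    [ -f "$jar" ] || { echo "missing driver: $jar" >&2; return 1; }
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    # -f replaces an existing link, so rerunning the step does not fail.
    ln -sf "$jar" "$dir/mysql-connector-java.jar"
}
```

The equivalent of the step above is `link_jdbc_driver /usr/share/java/mysql-connector-java.jar /opt/cloudera/parcels/CDH/lib/oozie/lib`.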
3.5.7 Start the service and enable it at boot
    /etc/init.d/cloudera-scm-server start
    chkconfig cloudera-scm-server on
3.5.8 Troubleshooting
Watch the server's startup output for errors:
    tail -f /var/log/cloudera-scm-server/cloudera-scm-server.out
3.6 Installing the Cloudera Manager Agent
On Cloudera Host[1-4]
3.6.1 Create the parcel directory
    mkdir -p /opt/cloudera/parcels
    chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
3.6.2 Set the management server and the parcel directory
Edit /etc/cloudera-scm-agent/config.ini with vim and make sure these parameters are set and uncommented:
    server_host=cdm-m.cmdschool.org
    server_port=7182
    parcel_dir=/opt/cloudera/parcels
3.6.3 Set the user for single-user mode (single-user mode only; not configured in this tutorial)
Edit /etc/default/cloudera-scm-agent with vim and uncomment this line:
    USER="cloudera-scm"
3.6.4 Start the service and enable it at boot
    /etc/init.d/cloudera-scm-agent start
    chkconfig cloudera-scm-agent on
3.6.5 Troubleshooting
Watch the agent's startup output for errors:
    tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.out
3.7 Logging in
In the Cloudera Manager web UI:
http://10.168.0.120:7180/
The configuration steps in the web UI are not covered in this article.
Permanent link to this article: http://www.linuxidc.com/Linux/2016-06/132597.htm