
Installing Greenplum from Source on CentOS 7


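Cluster layout:

One master host and one segment (slave) host.

System environment:

Operating system: CentOS 7, 64-bit, release 7.4.1708 (see /etc/redhat-release)

CPU: AMD FX-8300, 8 cores

Memory: 8 GB

Disk: 120 GB

GNOME: 3.22.2

Versions installed:

GPDB: V5.4.1

GPORCA: V2.53.11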

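Prerequisite: disable the firewall (on every host, master and segment alike!)

Run the following commands as root. They disable both the default firewall (firewalld) and iptables, which may also be installed, so two firewall programs in total:

Stop the default firewall

# systemctl stop firewalld

Mask the default firewall so that it does not start after a reboot

# systemctl mask firewalld

Stop iptables

# systemctl stop iptables

Disable iptables

# systemctl disable iptables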

Installation procedure

I) Create a dedicated account, gpdba, and add it to the root group.

Run all of the steps below as gpdba. If a step fails, rerun it as root.
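
The article does not show the commands for this step. A minimal sketch, assuming the standard useradd and usermod tools, might look like the following. The run helper and the DRY_RUN switch are illustrative additions, not part of the original procedure; with the default DRY_RUN=1 the script only prints what it would do.

```shell
# Sketch: create the gpdba account and add it to the root group.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_gp_account() {
    local user="$1"
    # Skip creation if the account already exists.
    if id "$user" >/dev/null 2>&1; then
        echo "account $user already exists"
    else
        run useradd -m "$user"
    fi
    # -a -G appends the supplementary group without dropping existing ones.
    run usermod -a -G root "$user"
}

create_gp_account gpdba
```

After the account exists, set its password as root with passwd gpdba. The same account is needed on every host.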

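II) Set the host names on every server (master and segment hosts)

1) Edit the hosts file with vi /etc/hosts:

127.0.0.1 localhost localhost.localdomain

192.168.58.102 Master shsm002

192.168.58.104 Slave1 shsm004

Changes to /etc/hosts take effect as soon as the file is saved; running source /etc/profile afterwards is harmless but has no effect on name resolution.

2) Edit the network file with vi /etc/sysconfig/network:

NETWORKING=yes
HOSTNAME=<host name of this machine>

On CentOS 7 the persistent host name is stored in /etc/hostname, so it may also need to be set with hostnamectl set-hostname <host name>.

3) If the host name differs from the device name, use the following format in /etc/hosts:

127.0.0.1 localhost localhost.localdomain

<IP address> <host name> <device name>

Finally, use ping to verify that the hosts can reach each other by name.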
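
Before pinging, it can help to confirm that every name used later in this tutorial really appears in the hosts file. The sketch below is one way to do that; the function name missing_hosts is hypothetical, and the check only reads the file, it does not test connectivity.

```shell
# Sketch: report which of the given names have no entry in a hosts-format file.
# Prints each missing name on its own line; returns 1 if anything is missing.
missing_hosts() {
    local hosts_file="$1"
    shift
    local status=0 name
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the IP address.
        if ! awk -v n="$name" '
            { sub(/#.*/, ""); for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file"; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Example: check the four names used in this tutorial against /etc/hosts.
missing_hosts /etc/hosts Master Slave1 shsm002 shsm004 \
    || echo "add the names listed above to /etc/hosts"
```

Run it on every host; the "Unknown host" error described in the troubleshooting section is the kind of problem this catches early.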

III) Modify system files (master and segment hosts)

1) Kernel parameters

Edit /etc/sysctl.conf with vi and add the following:

kernel.shmmax = 5000000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 250 512000 100 2048

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.ip_forward = 0

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 1025 65535

net.core.netdev_max_backlog = 10000

net.core.rmem_max = 2097152

net.core.wmem_max = 2097152

vm.overcommit_memory = 2

Run sysctl -p to apply the new values.
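
Note that kernel.shmall is counted in memory pages while kernel.shmmax is counted in bytes, so the shmall value above (4000000000 pages, roughly 16 TB with 4 KB pages) is far larger than the 8 GB of RAM in this machine. A rule of thumb found in later Greenplum documentation, stated here as an assumption rather than something from this article, is to size both at half of physical memory. The sketch below computes values under that assumption; the function name shm_settings is hypothetical.

```shell
# Sketch: derive shared-memory limits from physical memory (half of RAM).
# shmall is counted in pages, shmmax in bytes.
shm_settings() {
    local phys_pages="$1" page_size="$2"
    local shmall=$(( phys_pages / 2 ))
    local shmmax=$(( shmall * page_size ))
    echo "kernel.shmall = $shmall"
    echo "kernel.shmmax = $shmmax"
}

# getconf reports the page count and page size of the current machine.
shm_settings "$(getconf _PHYS_PAGES)" "$(getconf PAGE_SIZE)"
```

The two printed lines can replace the corresponding entries in /etc/sysctl.conf if you prefer memory-relative values.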

2) Resource limits

vi /etc/security/limits.conf

Add the following:

* soft nofile 65536

* hard nofile 65536

* soft nproc 131072

* hard nproc 131072

3) Disable SELinux

Edit /etc/selinux/config with vi and set SELINUX to disabled. After the change the file reads as follows:

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

# targeted - Targeted processes are protected,

# mls - Multi Level Security protection.

SELINUXTYPE=targeted
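
Because this edit has to be repeated on every host, it may be convenient to script it. The following is a sketch only; the function name disable_selinux_in is hypothetical, and it keeps a .bak copy of the file it changes.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file, keeping a .bak copy.
disable_selinux_in() {
    local cfg="$1"
    # Rewrite only the active SELINUX= line; comments and SELINUXTYPE= are left alone.
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
    # Show the resulting setting as confirmation.
    grep '^SELINUX=' "$cfg"
}

# Assumed usage, as root on each host:
#   disable_selinux_in /etc/selinux/config
#   setenforce 0    # switch to permissive mode until the next reboot
```

The disabled setting only takes full effect after a reboot; setenforce 0 relaxes enforcement in the meantime.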

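IV) Install the dependencies of the GPORCA optimizer (master and segment hosts)

1) Install cmake (3.10.2)

Download:
$ wget http://www.cmake.org/files/v3.10/cmake-3.10.2.tar.gz
Extract:
$ tar xzf cmake-3.10.2.tar.gz

Change into the extracted directory:
$ cd cmake-3.10.2
About the configure command:
To list all configuration options, run:

$ ./configure --help
Run configure (installing into /usr/cmake):

$ ./configure --prefix=/usr/cmake

Build:
$ make
Install:
# make install
Finally, verify the installation:
$ /usr/cmake/bin/cmake -version

The output should show the version number, for example:
cmake version 3.10.2

Edit /etc/profile to add cmake to the PATH environment variable by appending the following:

### CMAKE 3.10 ###
export PATH=/usr/cmake/bin:$PATH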

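2) Install gp-xerces

As gpdba, extract the source archive, change into the extracted directory, and run the commands below.
mkdir build
cd build
../configure --prefix=/usr/local  ## install under /usr/local
(Note: if an error occurs, run the make commands below as root.)
make
make install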

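3) Install re2c (1.0.3)

Download the version you need from http://re2c.org/install/install.html
re2c is installed because configuring ninja requires it.
$ ./configure --prefix=/usr/local
(Note: if the current user is not in the root group, run the make commands below as root.)
$ make
$ make install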

4) Install Ninja

The source can be cloned with git from https://github.com/ninja-build/ninja.git
After downloading, change into the ninja directory and run:
./configure.py --bootstrap
The build produces a single binary named ninja. As root, copy it to /usr/bin, which is already on the PATH.
Installation is not necessary because the only required file is the resulting ninja binary. However, to enable features like Bash completion and Emacs and Vim editing modes, some files in misc/ must be copied to appropriate locations.

Important: install all of the dependencies on the master host first, then copy the installation packages or archives to the other hosts with scp and install them there one by one.

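V) Install GPORCA

Download: https://github.com/greenplum-db/gporca

Install GPORCA version 2.53.11, the version that GPDB 5.4.1 depends on.
As gpdba, extract the source archive, change into the extracted directory, and run the commands below.
cmake -GNinja -H. -Bbuild
ninja install -C build

The ORCA version that GPDB depends on is recorded in the file /gpdb-5.4.1/depends/conanfile_orca.txt:
[requires]
orca/v2.53.11@gpdb/stable

After the installation, change into the /gporca/build directory and run ctest to check the build.
If the output ends with something like:
100% tests passed, 0 tests failed out of 119

Total Test time (real) = 195.48 sec
then the build succeeded.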

[Removing an old version of GPORCA]
In the source directory, run:
rm -rf build/*
rm -rf /usr/local/include/naucrates
rm -rf /usr/local/include/gpdbcost
rm -rf /usr/local/include/gpopt
rm -rf /usr/local/include/gpos
rm -rf /usr/local/lib/libnaucrates.so*
rm -rf /usr/local/lib/libgpdbcost.so*
rm -rf /usr/local/lib/libgpopt.so*
rm -rf /usr/local/lib/libgpos.so*

VI) Install GPDB (version 5.4.1)

1) Install the dependencies as root

sudo yum install -y epel-release

sudo yum install -y apr-devel bison bzip2-devel cmake3 flex gcc gcc-c++ krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel perl-ExtUtils-Embed python-devel python-paramiko python-pip python-psutil python-setuptools readline-devel xerces-c-devel zlib-devel

# Install lockfile with pip because the yum package `python-pip` is too old (0.8).
sudo pip install lockfile conan

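2) Download the source code, extract it, then build and install.

As gpdba, change into the extracted source directory and run the following command (the path after --prefix, /usr/gpdb, is the installation directory):
./configure --with-perl --with-python --with-libxml --with-gssapi --prefix=/usr/gpdb
If ORCA is not installed, use this instead: ./configure --with-perl --with-python --with-libxml --with-gssapi --disable-orca --prefix=/usr/gpdb

Then build:
make -j8

Finally, install:
make -j8 install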

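3) Distribute the installation

First, set up passwordless ssh between the servers.

Create the directory /usr/gpdb-conf and, inside it, the host list file hostlist with the following content:

Master

Slave1

Then, also in gpdb-conf, create the file seg_hosts with the following content:

Slave1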
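
Both files are plain lists with one host name per line. As an illustration, they could be generated as follows; the function name write_host_files is hypothetical, and the host names are the ones defined in /etc/hosts earlier.

```shell
# Sketch: write the two host list files used by gpssh-exkeys, gpscp and gpssh.
write_host_files() {
    local conf_dir="$1"
    mkdir -p "$conf_dir"
    # hostlist: every host in the cluster, master included.
    printf '%s\n' Master Slave1 > "$conf_dir/hostlist"
    # seg_hosts: segment hosts only.
    printf '%s\n' Slave1 > "$conf_dir/seg_hosts"
}

# Assumed usage for this tutorial (needs write access to /usr):
#   write_host_files /usr/gpdb-conf
```

With more segment hosts, add one name per line to both printf calls.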

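Load the greenplum_path settings

source /usr/gpdb/greenplum_path.sh

Exchange ssh keys with gpssh-exkeys

gpssh-exkeys -f /usr/gpdb-conf/hostlist

Finally, pack the installed directory into a tar archive

gtar -cvf /home/gpdba/gpdb-install-binary-5.4.1.tar /usr/gpdb

Copy it to the other hosts with gpscp (or ssh in first and then use scp)

gpscp -f /usr/gpdb-conf/seg_hosts /home/gpdba/gpdb-install-binary-5.4.1.tar =:/usr

Connect to the master and segment hosts with gpssh and extract the tar file, keeping the installation path the same as on the master.

gpssh -f /usr/gpdb-conf/hostlist

Once the master is connected to the segment hosts, every command should produce n sets of output, one per host; anything else indicates a problem.

Extract the archive on the segment hosts. Because tar strips the leading / when it creates the archive, the members are stored as usr/gpdb/..., so extract relative to /:

gtar -xvf /usr/gpdb-install-binary-5.4.1.tar -C /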

Create the database working directories

mkdir -p /home/gpdba/gpdata

cd /home/gpdba/gpdata

mkdir gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster
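
The directory names presumably stand for primary (p) and mirror (m) segment data plus the master data. A small sketch that creates them idempotently is shown below; the function name make_gp_dirs is hypothetical.

```shell
# Sketch: create the Greenplum data directories under a base directory.
# mkdir -p makes the call safe to repeat.
make_gp_dirs() {
    local base="$1"
    local d
    for d in gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster; do
        mkdir -p "$base/$d"
    done
}

# Assumed usage on a single host:
#   make_gp_dirs /home/gpdba/gpdata
# Or on every host at once through gpssh (assumes bash on the remote side):
#   gpssh -f /usr/gpdb-conf/hostlist -e 'mkdir -p /home/gpdba/gpdata/{gpdatap1,gpdatap2,gpdatam1,gpdatam2,gpmaster}'
```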

4) Initialize the database (on the master host)

Set up the environment variables in .bash_profile

vi .bash_profile

Edit it to read as follows:

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

## Greenplum Database
source /usr/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpdba/gpdata/gpmaster/gpseg-1
export PGPORT=2346
export PGDATABASE=testDB

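After saving, reload it:

. ~/.bash_profile

Configure the database startup parameters

Copy the file /usr/gpdb/docs/cli_help/gpconfigs/gpinitsystem_config to the /usr/gpdb-conf directory and edit it, keeping the following content and adjusting the host name, directories, and port to match your environment: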

# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=40000

#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=mdw

#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/master

#### Port number for the master instance.
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers
#### are calculated.
#MIRROR_PORT_BASE=50000

#### Base number by which primary file replication port
#### numbers are calculated.
#REPLICATION_PORT_BASE=41000

#### Base number by which mirror file replication port
#### numbers are calculated.
#MIRROR_REPLICATION_PORT_BASE=51000

#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
#declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)

################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database

#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
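
The listing above still carries the template defaults (mdw, /data1/primary, port 5432). Judging from the directories, port, and database name that appear elsewhere in this article, the active settings for this two-host cluster would look roughly like the sketch below. These values are inferred, not quoted from the author's file; in particular, enabling MACHINE_LIST_FILE and pointing it at seg_hosts is an assumption.

```shell
# Values inferred from the rest of this article; adjust them to your own cluster.
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=40000
# Two primary segments per segment host, one per directory.
declare -a DATA_DIRECTORY=(/home/gpdba/gpdata/gpdatap1 /home/gpdba/gpdata/gpdatap2)
MASTER_HOSTNAME=Master
MASTER_DIRECTORY=/home/gpdba/gpdata/gpmaster
# Must match PGPORT in .bash_profile.
MASTER_PORT=2346
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
# Matches PGDATABASE in .bash_profile.
DATABASE_NAME=testDB
# Assumed: list of segment hosts, so that no -h option is needed on the command line.
MACHINE_LIST_FILE=/usr/gpdb-conf/seg_hosts
```

With MASTER_DIRECTORY and SEG_PREFIX set this way, the master data directory becomes /home/gpdba/gpdata/gpmaster/gpseg-1, which is the value of MASTER_DATA_DIRECTORY exported earlier.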

Finally, run the following command to start the initialization:

gpinitsystem -c /usr/gpdb-conf/gpinitsystem_config -a

 

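Important: if initialization fails and you need to run it again, reset the environment first:

Find and stop any postgres process listening on the configured port.

Delete the partially created database files (possibly on every host), for example the /home/gpdba/gpdata/gpmaster/gpseg-1 directory.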
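
The cleanup can be sketched as a small helper. It assumes the directory layout created earlier and that everything gpinitsystem leaves behind is named gpseg* (as in the log excerpts later in this article); the function name reset_gp_data is hypothetical.

```shell
# Sketch: clear out half-initialized segment and master data under a base directory.
# Only entries named gpseg* are removed; the directory layout itself is kept.
reset_gp_data() {
    local base="$1"
    local d
    for d in gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster; do
        [ -d "$base/$d" ] || continue
        find "$base/$d" -mindepth 1 -maxdepth 1 -name 'gpseg*' -exec rm -rf {} +
    done
}

# Assumed usage on each host, after stopping any leftover postgres process:
#   lsof -i:2346                      # find the process holding the master port
#   reset_gp_data /home/gpdba/gpdata
```

Double-check the base path before running it, since the removal is not reversible.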

VII) Troubleshooting

Error:
[gpdba@shsm002 ~]$ gpssh-exkeys -f /usr/gpdb-conf/hostlist
Error: unable to import module: version conflict: '/usr/lib64/python2.7/site-packages/psutil/_psutil_linux.so' C extension module was built for another version of psutil (different than 2.2.1)
Solution: reinstall psutil with sudo pip install psutil==2.2.1

Error:
20180129:23:40:43:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2345 on Master host Script Exiting!
Solution: stop the process that is using port 2345.
First find the process:
$ lsof -i:2345

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

postgres 10738 gpadmin 3u IPv4 264510 0t0 TCP *:postgres (LISTEN)
postgres 10738 gpadmin 4u IPv6 264511 0t0 TCP *:postgres (LISTEN)
Then kill it:
$ kill -9 10738

Error:
20180207:00:14:09:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Building the Master instance database, please wait…
20180207:00:14:17:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Starting the Master in admin mode
20180207:00:14:23:gpinitsystem:shsm002:gpdba-[FATAL]:-Unknown host shsm004 Script Exiting!
20180207:00:14:23:005166 gpinitsystem:shsm002:gpdba-[WARN]:-Script has left Greenplum Database in an incomplete state
Cause: the name in the hosts file does not match the machine's hostname (the name shown after the @ in the shell prompt), and shsm004 is not defined in hosts. Adding it fixes the problem.
Solution: edit the hosts file so that each line has the form: IP address, host name, domain name (alias). Put the hostname value shsm004 in the last field and save. The host should then respond to ping.

 

Error:
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[INFO]:-Checking Master host
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[WARN]:-Have lock file /tmp/.s.PGSQL.2346.lock but no process running on port 2346
20180207:00:05:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2346 on Master host Script Exiting!
Solution: delete the stale lock file with rm /tmp/.s.PGSQL.2346.lock

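Error:
[gpdba@shsm002 ~]$ /bin/bash /home/gpdba/gpAdminLogs/backout_gpinitsystem_gpdba_20180207_225128
[FATAL]:-Not on original master host Master, backout script exiting!
Solution: do not use this script to clean up the intermediate data; instead, delete the unfinished database files under the gpdata directory directly.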

 

Error:
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Start Function PROCESS_QE
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Processing segment Slave1
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
cat: /home/gpdba/gpdata/gpdatap1/gpseg0.initdb: No such file or directory
cat: /home/gpdba/gpdata/gpdatap2/gpseg1.initdb: No such file or directory
Solution: edit /usr/gpdb/greenplum_path.sh and add the directory that contains libgpopt.so.3 to the LD_LIBRARY_PATH environment variable, then reload the file with source (until the machine is rebooted, you may need to do this manually each time you open a terminal). The modified file reads as follows:

GPHOME=/usr/gpdb

# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
#setup PYTHONHOME
if [ -x $GPHOME/ext/python/bin/python ]; then
    PYTHONHOME="$GPHOME/ext/python"
fi
PYTHONPATH=$GPHOME/lib/python
PATH=$GPHOME/bin:$PYTHONHOME/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:/usr/local/lib:${LD_LIBRARY_PATH-}
export LD_LIBRARY_PATH
OPENSSL_CONF=$GPHOME/etc/openssl.cnf
export GPHOME
export PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF
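
To confirm that the change works, the dynamic linker can be asked which libraries it fails to resolve for the postgres binary. The sketch below wraps ldd for that purpose; the function names parse_missing_libs and missing_libs are hypothetical.

```shell
# Sketch: list shared libraries that the dynamic linker cannot resolve.
# parse_missing_libs reads ldd-style output on stdin and prints the names
# that are marked "not found".
parse_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

missing_libs() {
    ldd "$1" 2>/dev/null | parse_missing_libs
}

# Assumed usage after reloading the environment, on the master and on each segment host:
#   source /usr/gpdb/greenplum_path.sh
#   missing_libs /usr/gpdb/bin/postgres    # no output means every library was found
```

The failure in the log above occurred while processing the segment on Slave1, so the edited greenplum_path.sh and the GPORCA libraries in /usr/local/lib need to be present on the segment hosts as well.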

Error:
20180208:01:57:59:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function CREATE_DATABASE
psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1513)
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-End Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_EXIT
20180208:01:58:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Failed to complete create database testDB Script Exiting!
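Solution: stop and disable the firewall (every firewall program).
Run:
# systemctl stop firewalld
# systemctl mask firewalld
# systemctl stop iptables
# systemctl disable iptables
Another possible cause, for reference: shared_buffers is set too high. For how to size shared_buffers from your memory and the number of segments, see the official documentation. As a rule of thumb, subtract about 2 GB for other use and statement_mem * number of segments from total memory, then divide the remainder by the number of segments. This situation usually arises when shared_buffers was set explicitly during installation; the default is normally 125MB.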
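
The rule of thumb above is simple arithmetic, and a sketch of it follows. It reproduces only the calculation described in this article, not an official sizing formula; the function name shared_buffers_mb and the example inputs are illustrative.

```shell
# Sketch of the rule of thumb above, all values in MB:
# (total memory - 2048 reserved for other use - statement_mem * segments) / segments
shared_buffers_mb() {
    local total_mb="$1" segments="$2" statement_mem_mb="$3"
    echo $(( (total_mb - 2048 - statement_mem_mb * segments) / segments ))
}

# Example: 8 GB host, 2 primary segments, statement_mem of 125 MB.
shared_buffers_mb 8192 2 125    # prints 2947
```

Treat the result as a rough upper bound per segment; a value at or below it should be safe under this rule.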

Permanent link to this article: http://www.linuxidc.com/Linux/2018-02/150872.htm

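Copyright notice: original article on this site, published by 星锅 on 2022-01-22; 12,498 characters in total.
Reposting: unless otherwise noted, articles on this site are released under the CC-4.0 license; please credit the source when reposting.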