Greenplum 5.21.1 Cluster Installation and Deployment in Detail

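Put simply, GPDB (Greenplum Database) is distributed database software that manages and processes very large data sets spread across multiple hosts. A GPDB database instance actually consists of several independent PostgreSQL instances running on different physical hosts; they work together and appear to the user as a single database. The Master is the entry point to the GPDB system: it handles client connections and SQL commands and coordinates the work of the other instances (Segments), while the Segments store and process the user data.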

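Environment:
Operating system: CentOS Linux release 7.6.1810 (Core), 64-bit
1 Master (the primary node in the architecture diagram), 1 Standby (the secondary node in the architecture diagram) and 2 Segment hosts, 4 servers in total.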

Part 1: Operations as the root user on the Master host

1. Edit /etc/hosts and add the following entries (note: same configuration on all 4 servers)

vim /etc/hosts

192.168.18.130 gp-master
192.168.18.131 gp-standby
192.168.18.132 gp-node1
192.168.18.133 gp-node2
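
Because the same four entries have to be added on every server, it can help to make the step repeatable. The following is a minimal sketch, assuming the IP addresses and host names listed above; add_host_entry and HOSTS_FILE are hypothetical names used only for illustration, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
# Append "IP hostname" to a hosts-style file only when the hostname is not
# already present, so the step can be re-run without creating duplicates.
add_host_entry() {
    file=$1 ip=$2 name=$3
    # -F: fixed string, -w: whole-word match, -q: no output
    if ! grep -qwF -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Demo against a scratch file; for real use, run as root with /etc/hosts.
HOSTS_FILE=$(mktemp)
add_host_entry "$HOSTS_FILE" 192.168.18.130 gp-master
add_host_entry "$HOSTS_FILE" 192.168.18.131 gp-standby
add_host_entry "$HOSTS_FILE" 192.168.18.132 gp-node1
add_host_entry "$HOSTS_FILE" 192.168.18.133 gp-node2
add_host_entry "$HOSTS_FILE" 192.168.18.130 gp-master   # repeated call changes nothing
cat "$HOSTS_FILE"
```

Running the same script on all 4 servers keeps the files identical, which is what the later gpssh steps rely on.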

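2. Disable SELinux on every server and open the firewall between the 4 servers; in a test environment you can simply turn the firewall off first. (Note: same configuration on all 4 servers)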

Stop and disable firewalld

systemctl stop firewalld
systemctl disable firewalld

Permanently disable SELinux

vim /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected. 
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 

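Note: check the current SELinux mode with getenforce. To change the mode from the command line without making it permanent, use setenforce 0 (0 switches to permissive mode, 1 switches to enforcing mode).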
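
If you prefer not to edit the file by hand on four servers, the SELINUX= line can be rewritten with sed. This is only a sketch: set_selinux_mode and SELINUX_CFG are hypothetical names, and the demo works on a scratch copy rather than the real SELinux configuration file. A change to disabled in that file takes effect only after a reboot.

```shell
# set_selinux_mode FILE MODE
# Rewrite only the line that starts with "SELINUX="; comment lines and the
# SELINUXTYPE= line do not match the pattern and are left untouched.
set_selinux_mode() {
    file=$1 mode=$2
    tmp=$(mktemp)
    # Write through cat so the target file keeps its owner and permissions.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demo on a scratch file that mimics the relevant lines.
SELINUX_CFG=$(mktemp)
printf '%s\n' '# SELINUX= can take one of these three values:' \
              'SELINUX=enforcing' \
              'SELINUXTYPE=targeted' > "$SELINUX_CFG"
set_selinux_mode "$SELINUX_CFG" disabled
cat "$SELINUX_CFG"
```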

3. Operating system parameters

vim /etc/sysctl.conf (note: same configuration on all 4 servers)

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
kernel.msgmni = 2048
net.ipv4.ip_local_port_range = 1025 65535
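
Values written to /etc/sysctl.conf are only loaded at boot or when you run sysctl -p, so it is worth reading them back. The sketch below, which assumes the "key = value" layout shown above, looks a key up in a sysctl-style file; sysctl_conf_get and SYSCTL_CONF are hypothetical names, and the demo uses a scratch file with a few of the lines from this article.

```shell
# sysctl_conf_get FILE KEY
# Print the value configured for KEY in a sysctl-style file (last match wins).
# Returns a non-zero status when the key is not configured.
sysctl_conf_get() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                     # skip comment lines
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim blanks around the key
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim blanks around the value
                val = v
                found = 1
            }
        }
        END { if (found) print val; else exit 1 }
    ' "$1"
}

# Demo on a scratch file.
SYSCTL_CONF=$(mktemp)
printf '%s\n' '# Greenplum settings' \
              'kernel.shmmni = 4096' \
              'kernel.sem = 250 512000 100 2048' \
              'vm.overcommit_memory = 2' > "$SYSCTL_CONF"
sysctl_conf_get "$SYSCTL_CONF" kernel.sem
sysctl_conf_get "$SYSCTL_CONF" vm.overcommit_memory

# On a real host, compare with the live kernel value after "sysctl -p":
#   sysctl -n vm.overcommit_memory
```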

vim /etc/security/limits.conf (note: same configuration on all 4 servers)

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Set the disk read-ahead value and switch the I/O scheduler to deadline (note: same configuration on all 4 servers)

blockdev --setra 65536 /dev/sda
echo deadline > /sys/block/sda/queue/scheduler

Note: replace sda with the actual disk device on your system.
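
Both commands above change runtime state only, so after a reboot it is worth checking that they are still in effect. The kernel lists the available schedulers in /sys/block/<device>/queue/scheduler and marks the active one with square brackets. A minimal sketch for extracting it, where active_scheduler is a hypothetical helper name:

```shell
# active_scheduler "LINE"
# Print the scheduler name enclosed in [brackets] in a scheduler listing.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Demo with a sample listing.
active_scheduler "noop [deadline] cfq"

# On a real host (sda is the device used in this article):
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
#   blockdev --getra /dev/sda     # reads back the read-ahead value
```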

Download the software from https://network.pivotal.io/products/pivotal-gpdb. The package used here is greenplum-db-5.21.1-rhel7-x86_64.rpm

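Install the GP binaries on the Master host, that is, the server named gp-master. (Note: installing on the Master is enough at this point; the remaining servers are installed in bulk later.)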

rpm -ivh greenplum-db-5.21.1-rhel7-x86_64.rpm

Note: default installation directory: /usr/local

Add the gpadmin user on the Master

adduser gpadmin
echo gpadmin | passwd --stdin gpadmin

Note: the password is set so that gpssh-exkeys -f hostfile_allhosts can use it later.

Grant the gpadmin user sudo privileges on the Master

[root@gp-master ~]# visudo  
gpadmin    ALL=(ALL)       ALL
gpadmin    ALL=(ALL)       NOPASSWD:ALL

Give the gpadmin user ownership of the Greenplum directory on the Master host

chown -R gpadmin.gpadmin /usr/local/greenplum-db*

Part 2: Operations as the gpadmin user on the Master host

Prepare the host files used for the bulk installation and for the later cluster initialization: hostfile_allhosts, hostfile_segments and hostfile_mshosts. Store them in /home/gpadmin.

su - gpadmin

vim hostfile_allhosts

gp-master
gp-standby
gp-node1
gp-node2

vim hostfile_segments

gp-node1
gp-node2

vim hostfile_mshosts

gp-master
gp-standby
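
The three files overlap: hostfile_allhosts is simply the master hosts followed by the segment hosts. One way to keep them consistent is to write the two smaller files and derive the third. This is a sketch only; GP_HOME_DIR is a hypothetical variable that points at a scratch directory here and would be /home/gpadmin on the Master.

```shell
# Scratch directory standing in for /home/gpadmin.
GP_HOME_DIR=$(mktemp -d)

# Master and standby hosts.
cat > "$GP_HOME_DIR/hostfile_mshosts" <<'EOF'
gp-master
gp-standby
EOF

# Segment hosts.
cat > "$GP_HOME_DIR/hostfile_segments" <<'EOF'
gp-node1
gp-node2
EOF

# All hosts = master hosts followed by segment hosts.
cat "$GP_HOME_DIR/hostfile_mshosts" "$GP_HOME_DIR/hostfile_segments" \
    > "$GP_HOME_DIR/hostfile_allhosts"

cat "$GP_HOME_DIR/hostfile_allhosts"
```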

Set up passwordless SSH login between all hosts

gpssh-exkeys -f hostfile_allhosts

Note: you will be prompted for the gpadmin user's password, which here is gpadmin.

Set ownership of the directory that Greenplum will be installed into

gpssh -f hostfile_allhosts
=> sudo chown gpadmin.gpadmin /usr/local
=> exit

Create the metadata storage directory on the Master/Standby hosts and assign ownership

gpssh -f hostfile_mshosts
=>sudo mkdir -p /data/greenplum_data/gpmaster
=>sudo chown -R gpadmin.gpadmin /data
=>exit

Create the data storage directories on the Segment hosts and assign ownership

gpssh -f hostfile_segments
=>sudo mkdir -p /data/greenplum_data/{primary,mirror}
=>sudo chown -R gpadmin.gpadmin /data
=>exit

Bulk-install the software (GP)

cd /home/gpadmin/
source /usr/local/greenplum-db/greenplum_path.sh
gpseginstall -f hostfile_allhosts -u gpadmin -p gpadmin

Set up NTP synchronization

Install the NTP server with yum; skip this step if it is already installed

sudo yum install ntp -y

If you see the following error, the next step shows how to resolve it

There was a problem importing one of the Python modules 
required to run yum. The error leading to this problem was: 
  
  No module named yum 
  
Please install a package which provides this module, or 
verify that the module is installed correctly. 
  
It's possible that the above module doesn't match the 
current version of Python, which is: 
2.7.13 (r266:84292, Jan 22 2014, 09:37:14)  
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] 
  
If you cannot solve this problem yourself, please go to  
the yum faq at: 
  http://yum.baseurl.org/wiki/Faq

Solution:

unset PYTHONHOME
unset PYTHONPATH
unset LD_LIBRARY_PATH

After the yum installation finishes, restore the environment so that GP works normally again

source /usr/local/greenplum-db/greenplum_path.sh

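Note on the cause: once GP is installed and greenplum_path.sh has been sourced, the environment on the master node gains the PYTHONHOME, PYTHONPATH and LD_LIBRARY_PATH variables and the original PATH is modified, so yum ends up running against the Python bundled with Greenplum, which does not provide the yum module.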

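Additional note: the LD_LIBRARY_PATH environment variable specifies extra directories, beyond the default ones, to search for shared (dynamically linked) libraries.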
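
Instead of unsetting the variables in the login shell and sourcing greenplum_path.sh again afterwards, a single command can be run with those variables removed. The sketch below uses env -u for that; run_without_gp_env is a hypothetical helper name, and the PYTHONPATH value in the demo is just a placeholder.

```shell
# run_without_gp_env COMMAND [ARGS...]
# Run one command with the Greenplum-specific variables removed from its
# environment; the calling shell keeps its own settings.
run_without_gp_env() {
    env -u PYTHONHOME -u PYTHONPATH -u LD_LIBRARY_PATH "$@"
}

# Demo: the child process does not see PYTHONPATH, the current shell still does.
export PYTHONPATH=/usr/local/greenplum-db/lib/python
run_without_gp_env sh -c 'echo "child sees:  [${PYTHONPATH:-}]"'
echo "parent sees: [$PYTHONPATH]"

# Possible use on the Master (not executed here):
#   run_without_gp_env sudo yum install ntp -y
```

This leaves the gpadmin session ready for the GP utilities without a second source step.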


On each Segment host, edit /etc/ntp.conf. Point the first server entry at the Master host and the second server entry at the Standby host, as shown below:

sudo vim /etc/ntp.conf

server gp-master prefer
server gp-standby

On the Standby host, edit /etc/ntp.conf. Point the first server entry at the Master host and the second at your data center's time server.

sudo vim /etc/ntp.conf

server gp-master prefer

On the Master host, use the NTP daemon to synchronize the system clocks of all Segment hosts. For example, with gpssh:

gpssh -f hostfile_allhosts -v -e 'ntpd'

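Output like the following indicates success (this sample uses a host file named all_hosts and hosts named mdw, smdw, sdw1 and sdw2):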

[root@gp-master gpadmin]# gpssh -f all_hosts -v -e 'ntpd'
[WARN] Reference default values as $MASTER_DATA_DIRECTORY/gpssh.conf could not be found
Using delaybeforesend 0.05 and prompt_validation_timeout 1.0

[Reset ...]
[INFO] login mdw
[INFO] login smdw
[INFO] login sdw1
[INFO] login sdw2
[mdw] ntpd
[smdw] ntpd
[sdw1] ntpd
[sdw2] ntpd
[INFO] completed successfully

[Cleanup...]

Configure the Greenplum initialization file

cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config   /home/gpadmin/gpinitsystem_config
chmod 775 gpinitsystem_config

The relevant settings are as follows:

[gpadmin@gp-master ~]$ cat gpinitsystem_config 
# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers 
#### are calculated.
PORT_BASE=40000

#### File system location(s) where primary segment data directories 
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in 
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/greenplum_data/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=gp-master

#### File system location where the master data directory 
#### will be created.
MASTER_DIRECTORY=/data/greenplum_data/gpmaster

#### Port number for the master instance. 
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UTF-8

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers 
#### are calculated.
MIRROR_PORT_BASE=43000

#### Base number by which primary file replication port 
#### numbers are calculated.
REPLICATION_PORT_BASE=34000

#### Base number by which mirror file replication port 
#### numbers are calculated. 
MIRROR_REPLICATION_PORT_BASE=44000

#### File system location(s) where mirror segment data directories 
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the 
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/greenplum_data/mirror)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
DATABASE_NAME=testDB

#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
MACHINE_LIST_FILE=/home/gpadmin/hostfile_segments
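
As the comments in the file explain, the number of entries in DATA_DIRECTORY determines how many primary segments are created per host. A quick way to estimate the total before running gpinitsystem is to multiply that number by the number of segment hosts. This is a rough sketch that assumes one line per host in the host file (no extra interface addresses); count_primaries and SEG_HOSTFILE are hypothetical names.

```shell
# count_primaries HOSTFILE DIRS_PER_HOST
# Total primaries = number of segment hosts x data directories per host.
count_primaries() {
    hostfile=$1 dirs_per_host=$2
    hosts=$(grep -c . "$hostfile")    # count non-empty lines
    echo $((hosts * dirs_per_host))
}

# Demo with the two segment hosts and the single primary directory used here.
SEG_HOSTFILE=$(mktemp)
printf '%s\n' gp-node1 gp-node2 > "$SEG_HOSTFILE"
count_primaries "$SEG_HOSTFILE" 1
```

With mirroring enabled, the same number of mirror segments is created, because MIRROR_DATA_DIRECTORY must list as many locations as DATA_DIRECTORY.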

Run the initialization utility to initialize the database

source /usr/local/greenplum-db/greenplum_path.sh 
gpinitsystem -c gpinitsystem_config

Initialization log:

20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------

At this point the cluster has only 1 master and 2 segments and no standby, so the next step is to add the standby to the cluster.

Run on the Master server

gpinitstandby -s gp-standby

Output:

[gpadmin@mdw ~]$ gpinitstandby -s smdw
20160827:16:59:24:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for filespace directory /data/master/gpseg-1 on smdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname               = mdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port                   = 5432
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname       = smdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port           = 5432
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /data/master/gpseg-1
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog         = On
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:- Filespace locations
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_system -> /data/master/gpseg-1
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on smdw are consistent.
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
20160827:16:59:37:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating filespace flat files...
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Filespace flat file updated successfully.
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: smdw  in directory: /data/master/gpseg-1
20160827:16:59:40:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on gp-standby

Check the running processes:

[gpadmin@gp-master ~]$ ps -ef | grep postgres
gpadmin  10975     1  0 00:57 ?        00:00:00 /usr/local/greenplum-db-5.21.1/bin/postgres -D /data/greenplum_data/gpmaster/gpseg-1 -p 5432 --gp_dbid=1 --gp_num_contents_in_cluster=2 --silent-mode=true -i -M master --gp_contentid=-1 -x 0 -E
gpadmin  10976 10975  0 00:57 ?        00:00:00 postgres:  5432, master logger process   
gpadmin  10979 10975  0 00:57 ?        00:00:00 postgres:  5432, stats collector process   
gpadmin  10980 10975  0 00:57 ?        00:00:01 postgres:  5432, writer process   
gpadmin  10981 10975  0 00:57 ?        00:00:00 postgres:  5432, checkpointer process   
gpadmin  10982 10975  0 00:57 ?        00:00:00 postgres:  5432, seqserver process   
gpadmin  10983 10975  0 00:57 ?        00:00:00 postgres:  5432, ftsprobe process   
gpadmin  10984 10975  0 00:57 ?        00:00:00 postgres:  5432, sweeper process   
gpadmin  10985 10975  0 00:57 ?        00:00:05 postgres:  5432, stats sender process   
gpadmin  10986 10975  0 00:57 ?        00:00:01 postgres:  5432, wal writer process   
gpadmin  11279 10975  0 00:59 ?        00:00:00 postgres:  5432, wal sender process gpadmin 192.168.18.131(53573) streaming 0/C05A028
gpadmin  16800 16608  0 04:15 pts/0    00:00:00 grep --color=auto postgres

Set the environment variables for the gpadmin user. This is needed on both the Master and the Standby.

vim /home/gpadmin/.bashrc

[gpadmin@gp-master ~]$ cat .bashrc 
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

source /usr/local/greenplum-db/greenplum_path.sh 

export MASTER_DATA_DIRECTORY=/data/greenplum_data/gpmaster/gpseg-1
export PGPORT=5432
export PGDATABASE=testDB
[gpadmin@gp-master ~]$ scp .bashrc gp-standby:`pwd`
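
A misspelled variable name in .bashrc fails silently, so a small check after logging in again can save some debugging. The following sketch reports which of the expected variables are empty; check_gp_env is a hypothetical helper name. GPHOME is set by greenplum_path.sh, and PGPORT and PGDATABASE are the names that psql and other libpq clients read.

```shell
# check_gp_env
# Return 0 when all expected variables are non-empty; otherwise list the
# missing ones on stderr and return 1.
check_gp_env() {
    missing=0
    for var in GPHOME MASTER_DATA_DIRECTORY PGPORT PGDATABASE; do
        eval "value=\${$var:-}"       # indirect lookup of the variable named in $var
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo in a subshell with the values used in this article.
(
    GPHOME=/usr/local/greenplum-db
    MASTER_DATA_DIRECTORY=/data/greenplum_data/gpmaster/gpseg-1
    PGPORT=5432
    PGDATABASE=testDB
    check_gp_env && echo "environment looks complete"
)
```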

Start and stop the database to verify that it starts up and shuts down normally, using the following commands

gpstart
gpstop 

Greenplum is now deployed. Below are a few simple tests.

Log in to the database: psql -d postgres

Create a table, insert a row and query it

postgres=# create table student (no int primary key,student_name varchar(40),age int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "student_pkey" for table "student"
CREATE TABLE
postgres=# insert into student values(1,'yayun',18);
INSERT 0 1
postgres=# select * from student;
 no | student_name | age 
----+--------------+-----
  1 | yayun        |  18
(1 row)

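Copyright notice: original article from this site, published by 星锅 on 2022-01-22.
Reposting: unless otherwise noted, articles on this site are released under the CC-4.0 license; please credit the source when reposting.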