Installing and Configuring a Kafka Cluster on CentOS 7.0


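1. Introduction

Kafka is a high-throughput, distributed publish-subscribe messaging system. It can replace a traditional message queue for decoupling data-processing stages and buffering unprocessed messages, while offering higher throughput along with partitioning, multiple replicas, and redundancy. For these reasons it is widely used in large-scale message processing applications. Kafka has clients for Java and many other languages, and it works with other big-data tools such as Hadoop, Storm, and Spark.

This tutorial covers installing and using Kafka on CentOS 7, including functional verification and a simple cluster configuration.

2. Environment preparation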

#	hostname	ip	software	notes
1	apollo.dt.com	192.168.56.181	kafka zookeeper	Kafka: broker.id=181
2	artemis.dt.com	192.168.56.182	kafka zookeeper	Kafka: broker.id=182
3	uranus.dt.com	192.168.56.183	kafka zookeeper	Kafka: broker.id=183
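
The three hostnames in the table have to resolve on every node, because the broker configuration later refers to the machines by name. A minimal sketch, assuming no DNS is available and /etc/hosts is used instead; the helper name `hosts_entries` and the short aliases (apollo, artemis, uranus) are assumptions, not part of the original setup:

```shell
#!/usr/bin/env bash
# Print /etc/hosts lines for the three nodes listed in the table above.
# The short aliases in the third column are an assumption.
hosts_entries() {
  cat <<'EOF'
192.168.56.181 apollo.dt.com apollo
192.168.56.182 artemis.dt.com artemis
192.168.56.183 uranus.dt.com uranus
EOF
}

# Review the output first, then append it on each node, for example:
#   hosts_entries | sudo tee -a /etc/hosts
hosts_entries
```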

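3. Install the JDK

Kafka stores its configuration metadata in ZooKeeper, and both Kafka and ZooKeeper require a Java runtime. For installing the JDK on CentOS 7.0, see: Installing JDK 1.8 on CentOS 7

4. Install and configure the ZooKeeper cluster

For setting up the ZooKeeper cluster, see: Installing and Configuring a ZooKeeper Cluster on CentOS 7

5. Download Kafka

5.1. Download Kafka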

 [dtadmin@apollo~]$ sudo wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/0.10.2.1/kafka_2.11-0.10.2.1.tgz
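
Apache mirrors usually carry only current releases, so the mirror URL above may stop working for an old version such as 0.10.2.1. The sketch below builds the matching URL on archive.apache.org, where older releases are normally kept. The helper name `kafka_archive_url` is hypothetical, and the availability of this exact file on the archive is an assumption worth checking before relying on it:

```shell
#!/usr/bin/env bash
# Build the download URL for a Kafka release on archive.apache.org.
# Arguments: Scala version, Kafka version (defaults match this tutorial).
kafka_archive_url() {
  local scala_version="${1:-2.11}"
  local kafka_version="${2:-0.10.2.1}"
  printf 'https://archive.apache.org/dist/kafka/%s/kafka_%s-%s.tgz\n' \
    "$kafka_version" "$scala_version" "$kafka_version"
}

# Print the URL; to download, pass it to wget:
#   wget "$(kafka_archive_url)"
kafka_archive_url
```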

6. Install and configure the Kafka cluster

6.1. Install Kafka

# Extract Kafka
[root@apollo dtadmin]# sudo tar -zxvf kafka_2.11-0.10.2.1.tgz
# Move it to /opt
[root@apollo dtadmin]# sudo mv kafka_2.11-0.10.2.1 /opt/kafka

6.2. Configure Kafka environment variables

[root@apollo dtadmin]# vim /etc/profile
# Add the following lines:
KAFKA_HOME=/opt/kafka
PATH=$PATH:$KAFKA_HOME/bin
export PATH KAFKA_HOME
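
Changes to /etc/profile only take effect in new login shells, or after running `source /etc/profile` in the current one. As an alternative to editing /etc/profile directly, the same variables can live in a separate snippet. The sketch below is one way to do that; the function name `write_kafka_profile` is hypothetical, and it writes to a temporary file here so it can be tried safely:

```shell
#!/usr/bin/env bash
# Write the Kafka environment variables to a profile snippet.
# The target path is an argument so the function can be tried on a scratch file.
write_kafka_profile() {
  local target="$1"
  cat > "$target" <<'EOF'
export KAFKA_HOME=/opt/kafka
export PATH="$PATH:$KAFKA_HOME/bin"
EOF
}

# Try it on a temporary file and load it into the current shell.
profile_file="$(mktemp)"
write_kafka_profile "$profile_file"
. "$profile_file"
echo "KAFKA_HOME=$KAFKA_HOME"
```

On a real node the target would typically be something like /etc/profile.d/kafka.sh (an assumed file name), which CentOS sources from /etc/profile at login.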

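6.3. Configure Kafka

# Create the directory for Kafka's log data (the message storage that log.dirs points to)
[root@apollo dtadmin]# cd /opt/kafka
[root@apollo kafka]# mkdir -p log/kafka
# Edit the configuration file /opt/kafka/config/server.properties
[root@apollo kafka]# vim /opt/kafka/config/server.properties
# Change the following settings: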
broker.id=181
delete.topic.enable=true
listeners = PLAINTEXT://apollo.dt.com:9092
log.dirs=/opt/kafka/log/kafka
zookeeper.connect=192.168.56.181:2181,192.168.56.182:2181,192.168.56.183:2181

Sample configuration file:

 [root@apollo dtadmin]# vim /opt/kafka/config/server.properties
#######
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
 
# see kafka.server.KafkaConfig for additional details and defaults
 
############################# Server Basics #############################
 
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=181
port=9092
# Switch to enable topic deletion or not, default value is false
delete.topic.enable=true
 
############################# Socket Server Settings #############################
 
# The address the socket server listens on. It will get the value returned from 
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
listeners = PLAINTEXT://apollo.dt.com:9092
#listeners=PLAINTEXT://:9092
 
# Hostname and port the broker will advertise to producers and consumers. If not set, 
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
 
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
 
# The number of threads handling network requests
num.network.threads=3
 
# The number of threads doing disk I/O
num.io.threads=8
 
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
 
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
 
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
 
 
############################# Log Basics #############################
 
# A comma separated list of directories under which to store log files
log.dirs=/opt/kafka/log/kafka
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
 
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
 
############################# Log Flush Policy #############################
 
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
 
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
 
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
 
############################# Log Retention Policy #############################
 
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
 
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168
 
# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
 
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
 
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
 
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=192.168.56.181:2181,192.168.56.182:2181,192.168.56.183:2181
 
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

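Note: configure the other two servers the same way, changing broker.id (182 and 183) and the hostname in listeners to match each machine.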
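
Only two settings differ between the three brokers. The sketch below prints those host-specific lines for the remaining nodes. It assumes, following the table in section 2, that broker.id is the last octet of the IP address; the function name `broker_overrides` is hypothetical:

```shell
#!/usr/bin/env bash
# Print the host-specific lines of server.properties for one node.
# Arguments: fully qualified hostname, IP address.
# Assumption: broker.id is the last octet of the IP, as in the table in section 2.
broker_overrides() {
  local host="$1" ip="$2"
  printf 'broker.id=%s\n' "${ip##*.}"
  printf 'listeners=PLAINTEXT://%s:9092\n' "$host"
}

broker_overrides artemis.dt.com 192.168.56.182
broker_overrides uranus.dt.com 192.168.56.183
```

The other settings shown above (delete.topic.enable, log.dirs, zookeeper.connect) stay identical on all three machines.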

7. Start the ZooKeeper cluster

7.1. Start ZooKeeper on every machine

 [root@apollo dtadmin]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@apollo dtadmin]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: leader
 
[root@artemis dtadmin]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@artemis dtadmin]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: follower
 
 
[root@uranus dtadmin]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@uranus dtadmin]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: follower
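
A healthy three-node ensemble reports exactly one leader and two followers, as in the output above. A small sketch for pulling the role out of the status output, for instance when checking all nodes from a script; the helper name `zk_mode` is hypothetical:

```shell
#!/usr/bin/env bash
# Extract the role from `zkServer.sh status` output read on stdin.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Sample input copied from the output shown above.
printf '%s\n' \
  'ZooKeeper JMX enabled by default' \
  'Using config: /opt/zookeeper/bin/../conf/zoo.cfg' \
  'Mode: leader' | zk_mode
```

On a node this could be used as `zkServer.sh status 2>&1 | zk_mode`.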

7.2. Start the Kafka cluster

 [root@apollo dtadmin]# kafka-server-start.sh /opt/kafka/config/server.properties 
 
[root@artemis dtadmin]# kafka-server-start.sh /opt/kafka/config/server.properties
 
[root@uranus dtadmin]# kafka-server-start.sh /opt/kafka/config/server.properties
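
Started this way, each broker runs in the foreground and holds the terminal. kafka-server-start.sh also accepts a -daemon flag to run in the background, and once all three brokers are up a replicated topic makes a simple functional check. The sketch below only prints the commands, it does not run them; the topic name smoke-test and the function name `smoke_test_commands` are placeholders:

```shell
#!/usr/bin/env bash
# ZooKeeper connection string taken from server.properties above.
ZK_CONNECT="192.168.56.181:2181,192.168.56.182:2181,192.168.56.183:2181"

# Print the commands for a basic smoke test; nothing is executed here.
# The topic name "smoke-test" is a placeholder.
smoke_test_commands() {
  local topic="${1:-smoke-test}"
  echo "kafka-server-start.sh -daemon /opt/kafka/config/server.properties"
  echo "kafka-topics.sh --create --zookeeper $ZK_CONNECT --replication-factor 3 --partitions 1 --topic $topic"
  echo "kafka-topics.sh --describe --zookeeper $ZK_CONNECT --topic $topic"
}

smoke_test_commands
```

The first command is run on every node; the two kafka-topics.sh commands are run once from any node. If the describe output lists three replicas for the partition, all brokers have joined the cluster.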

Permanent link to this article: http://www.linuxidc.com/Linux/2017-06/144951.htm

