Installing, deploying, configuring, and data-testing Hive in a Hadoop 2.3, HBase 0.98, Hive 0.13 stack

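Introduction:

Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides a simple SQL-like query facility, translating SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve: simple MapReduce statistics can be implemented quickly with SQL-like statements, without writing a dedicated MapReduce application, which makes it well suited to statistical analysis in a data warehouse.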

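1, Use cases

Hive is built on top of Hadoop, a static batch-processing system. Hadoop jobs usually have high latency and incur considerable overhead at submission and scheduling time. As a result, Hive cannot deliver low-latency queries on large data sets; for example, a Hive query on a data set of a few hundred MB typically takes minutes. Hive is therefore a poor fit for applications that need low latency, such as online transaction processing (OLTP).

Hive query execution strictly follows the Hadoop MapReduce job execution model: Hive translates the user's HiveQL statements into MapReduce jobs and submits them to the Hadoop cluster, Hadoop monitors the job execution, and the results are returned to the user. Hive was not designed for online transaction processing and does not provide real-time queries or row-level updates. It works best for batch jobs over large data sets, such as web log analysis.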

2, Download and install
For the prerequisite Hadoop installation, see the detailed Hadoop 2.3.0 installation walkthrough on CentOS 6.4: http://www.linuxidc.com/Linux/2014-08/105915.htm

Download:

wget http://mirror.bit.edu.cn/apache/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz

Extract and install:

tar zxvf apache-hive-0.13.1-bin.tar.gz  -C /home/hadoop/src/

PS: Hive only needs to be installed on one node. In this example it is installed on the virtual machine that hosts the Hadoop name node, so the two share one VM.
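The tarball normally unpacks into a directory named after the archive (apache-hive-0.13.1-bin), while the later steps use /home/hadoop/src/hive-0.13.1. A minimal sketch of the rename step, assuming that layout; the helper name install_dir_for is illustrative and not part of Hive:

```shell
# Map a Hive binary tarball name to the short directory name used in this article.
# Assumption: the tarball is named apache-hive-<version>-bin.tar.gz.
install_dir_for() {
  tarball=$(basename "$1")
  version=${tarball#apache-hive-}
  version=${version%-bin.tar.gz}
  echo "hive-${version}"
}

SRC=/home/hadoop/src
TARBALL=apache-hive-0.13.1-bin.tar.gz
TARGET="$SRC/$(install_dir_for "$TARBALL")"

# Rename only when the unpacked directory exists and the target does not.
if [ -d "$SRC/${TARBALL%.tar.gz}" ] && [ ! -e "$TARGET" ]; then
  mv "$SRC/${TARBALL%.tar.gz}" "$TARGET"
fi
echo "$TARGET"
```

A symlink would work equally well if the original directory name should be kept.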

3, Configure the Hive environment variables

vim hive-env.sh

export HIVE_HOME=/home/hadoop/src/hive-0.13.1

export PATH=$PATH:$HIVE_HOME/bin

4, Configure the Hadoop and HBase parameters

vim hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory

HADOOP_HOME=/home/hadoop/src/hadoop-2.3.0/

 

# Hive Configuration Directory can be controlled by:

export HIVE_CONF_DIR=/home/hadoop/src/hive-0.13.1/conf

 

# Folder containing extra libraries required for hive compilation/execution can be controlled by:

export HIVE_AUX_JARS_PATH=/home/hadoop/src/hive-0.13.1/lib
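A typo in any of these paths tends to surface only later as a confusing startup failure, so it can help to check them up front. The following is a small sketch, not part of Hive; the function name missing_dirs is made up for illustration, and the paths are the ones set above:

```shell
# Print every argument that is not an existing directory, one per line.
missing_dirs() {
  for d in "$@"; do
    [ -d "$d" ] || echo "$d"
  done
}

# Values copied from the hive-env.sh settings in this section.
HADOOP_HOME=/home/hadoop/src/hadoop-2.3.0/
HIVE_CONF_DIR=/home/hadoop/src/hive-0.13.1/conf
HIVE_AUX_JARS_PATH=/home/hadoop/src/hive-0.13.1/lib

missing=$(missing_dirs "$HADOOP_HOME" "$HIVE_CONF_DIR" "$HIVE_AUX_JARS_PATH")
if [ -n "$missing" ]; then
  echo "missing directories:"
  echo "$missing"
else
  echo "all configured directories exist"
fi
```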

 

5, Verify the installation:

Start the Hive command-line mode; if the hive prompt appears, the installation succeeded.

[hadoop@name01 lib]$ hive --service cli

15/01/09 00:20:32 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead

 

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

Create a table by running a create statement; an OK response means the statement succeeded and confirms that Hive is installed correctly.

hive> create table test(key string);

OK

Time taken: 8.749 seconds

hive>

 

6, Verify usability

Start the Hive metastore service in the background

[hadoop@name01 root]$ hive --service metastore &

Check the Hive process running in the background

[hadoop@name01 root]$ ps -eaf|grep hive

hadoop    4025  2460  1 22:52 pts/0    00:00:19 /usr/lib/jvm/jdk1.7.0_60/bin/java -Xmx256m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/hadoop/src/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/hadoop/src/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/hadoop/src/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/hadoop/src/hive-0.13.1/lib/hive-service-0.13.1.jar org.apache.hadoop.hive.metastore.HiveMetaStore

hadoop    4575  4547  0 23:14 pts/1    00:00:00 grep hive

[hadoop@name01 root]$
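The listing above also contains the grep process itself. As a rough sketch, the metastore PID can be picked out by matching the main class name shown in the output and skipping any grep line; the function name metastore_pids is hypothetical:

```shell
# Read `ps -eaf` style lines on stdin and print the PID (second column) of every
# process whose command line names the HiveMetaStore main class, skipping grep.
metastore_pids() {
  awk '/org\.apache\.hadoop\.hive\.metastore\.HiveMetaStore/ && !/grep/ {print $2}'
}

# Demo on two lines shaped like the listing above (the java command line is shortened).
metastore_pids <<'EOF'
hadoop    4025  2460  1 22:52 pts/0    00:00:19 /usr/lib/jvm/jdk1.7.0_60/bin/java org.apache.hadoop.util.RunJar org.apache.hadoop.hive.metastore.HiveMetaStore
hadoop    4575  4547  0 23:14 pts/1    00:00:00 grep hive
EOF
```

On a live node the real listing can be piped in with: ps -eaf | metastore_pids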

 

6.1 In the hive shell, create a table with two fields, using ',' as the field delimiter:

hive> create table test(key string);

OK

Time taken: 8.749 seconds

hive> create table tim_test(id int,name string) row format delimited fields terminated by ',';

OK

Time taken: 0.145 seconds

hive>

 

6.2 Prepare the txt file to be loaded into the table, with the following contents:

[hadoop@name01 hive-0.13.1]$ more tim_hive_test.txt

123,xinhua

456,dingxilu

789,fanyulu

903,fahuazhengroad

[hadoop@name01 hive-0.13.1]$
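Because the table expects an integer id and a string name separated by a comma, a malformed line would be loaded with NULL fields rather than rejected. The sketch below writes the same four rows and counts lines that do not match that layout; DATA_FILE and bad_rows are illustrative names, and the default path is an assumption:

```shell
# Assumption: the file is created in the current directory; override DATA_FILE as needed.
DATA_FILE=${DATA_FILE:-./tim_hive_test.txt}

# Write the four sample rows shown above.
cat > "$DATA_FILE" <<'EOF'
123,xinhua
456,dingxilu
789,fanyulu
903,fahuazhengroad
EOF

# Print the number of lines that do not have exactly two comma-separated
# fields with an all-digit first field (0 means the file matches the table).
bad_rows() {
  awk -F',' 'NF != 2 || $1 !~ /^[0-9]+$/ {bad++} END {print bad + 0}' "$1"
}

bad_rows "$DATA_FILE"
```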

 

6.4 Open another xshell session and start the Hive metastore on the server:

[hadoop@name01 root]$ hive --service metastore

Starting Hive Metastore Server

 

6.5 Open another xshell session, start the Hive client, and load the data:

[hadoop@name01 hive-0.13.1]$ hive

 

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

hive> load data local inpath '/home/hadoop/src/hive-0.13.1/tim_hive_test.txt' into table tim_test;

Copying data from file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Copying file: file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Loading data to table default.tim_test

[Warning] could not update stats.

OK

Time taken: 7.208 seconds

hive>

6.6 Verify that the load succeeded: tim_test appears in the dfs listing

hive> dfs -ls /home/hadoop/hive/warehouse;

Found 2 items

drwxr-xr-x  - hadoop supergroup          0 2015-01-12 01:47 /home/hadoop/hive/warehouse/hive_hbase_mapping_table_1

drwxr-xr-x  - hadoop supergroup          0 2015-01-12 02:11 /home/hadoop/hive/warehouse/tim_test

hive>
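The dfs listing only confirms that the table directory exists. A stricter check is to count the rows through Hive itself; with the four-line sample file, a count of 4 would indicate that every row was loaded. This is a sketch that assumes the hive command is on PATH and uses its -e option to run a single statement; row_count_query is a made-up helper:

```shell
# Build the HiveQL used for the check; the argument is the table name.
row_count_query() {
  printf 'select count(*) from %s;' "$1"
}

# Run the query when the hive CLI is available, otherwise just show it.
if command -v hive >/dev/null 2>&1; then
  hive -e "$(row_count_query tim_test)"
else
  echo "hive not on PATH; would run: $(row_count_query tim_test)"
fi
```

Note that the count runs as a MapReduce job, so it takes noticeably longer than the dfs listing.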

7, Errors encountered during installation and deployment:
Error 1:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalInternalException: Error creating transactional connection factory

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

The MySQL JDBC driver jar is missing; copying it into Hive's lib directory fixes the problem.
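A quick way to confirm the fix before restarting the metastore is to look for the driver jar in the lib directory. This sketch assumes the jar follows the common mysql-connector-java-<version>.jar naming, which the article does not state; has_mysql_driver and HIVE_LIB are illustrative names:

```shell
# Succeed when a MySQL Connector/J jar is present in the given directory.
# Assumption: the jar is named mysql-connector-java-<version>.jar.
has_mysql_driver() {
  ls "$1"/mysql-connector-java-*.jar >/dev/null 2>&1
}

HIVE_LIB=${HIVE_LIB:-/home/hadoop/src/hive-0.13.1/lib}
if has_mysql_driver "$HIVE_LIB"; then
  echo "MySQL JDBC driver found in $HIVE_LIB"
else
  echo "MySQL JDBC driver missing; copy the connector jar into $HIVE_LIB"
fi
```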

Error 2:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://192.168.52.130:3306/hive_remote?createDatabaseIfNotExist=true, username = root. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------

java.sql.SQLException: null,  message from server: "Host '192.168.52.128' is not allowed to connect to this MySQL server"

 

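First attempt: add the hadoop user to the mysql group (this did not help; the telnet test that follows still reports the same error):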

[root@data02 mysql]# gpasswd -a hadoop mysql

Adding user hadoop to group mysql

[root@data02 mysql]#

^C[hadoop@name01 conf]$ telnet 192.168.52.130 3306

Trying 192.168.52.130...

Connected to 192.168.52.130.

Escape character is '^]'.

G

Host '192.168.52.128' is not allowed to connect to this MySQL serverConnection closed by foreign host.

 

[hadoop@name01 conf]$

Fix: change the MySQL account

mysql> update user set user = 'hadoop' where user = 'root' and host='%';

Query OK, 1 row affected (0.04 sec)

Rows matched: 1  Changed: 1  Warnings: 0

mysql> flush privileges;

Query OK, 0 rows affected (0.09 sec)

mysql>
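The article's fix renames the existing root@'%' account. An alternative, shown here only as a sketch, is to leave root alone and create a dedicated metastore account that is allowed to connect from the Hive host. The account name hive and the password are placeholders, not values from the article; the host and database come from the error message and the JDBC URL above:

```shell
# Print SQL that creates a metastore account restricted to one client host.
# Arguments: user, client host, database, password (all caller-supplied).
metastore_grant_sql() {
  user=$1; host=$2; db=$3; pass=$4
  cat <<EOF
CREATE USER '${user}'@'${host}' IDENTIFIED BY '${pass}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'${host}';
EOF
}

# 192.168.52.128 is the Hive host named in the error; hive_remote is the metastore database.
metastore_grant_sql hive 192.168.52.128 hive_remote 'change-me'
```

The printed statements would be run on the MySQL server as an administrative user, and the javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword properties in hive-site.xml would then need to match the new account.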

Error 3:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOException: Exception thrown calling table.exists() for hive_remote.`SEQUENCE_TABLE`

        at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)

        at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)

        at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)

……

NestedThrowablesStackTrace:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

Fix: on the remote MySQL server, change the database character set from utf8mb4 to utf8

mysql> alter database hive_remote /*!40100 DEFAULT CHARACTER SET utf8 */;

Query OK, 1 row affected (0.03 sec)

mysql>
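The 767-byte limit in the error is counted in bytes, not characters, which is why the character set matters: utf8mb4 reserves up to 4 bytes per character and utf8 up to 3. The arithmetic below uses a 255-character key column as an illustrative example; the actual column definitions are not shown in the stack trace:

```shell
MAX_KEY_BYTES=767   # limit quoted in the MySQL error message

# Bytes needed for a key of N characters at B bytes per character.
key_bytes() { echo $(( $1 * $2 )); }

# Longest key, in characters, that fits in the limit at B bytes per character.
max_chars() { echo $(( MAX_KEY_BYTES / $1 )); }

for spec in "utf8mb4 4" "utf8 3"; do
  set -- $spec
  echo "$1: a 255-character key needs $(key_bytes 255 "$2") bytes; at most $(max_chars "$2") characters fit in $MAX_KEY_BYTES bytes"
done
```

With utf8mb4 a 255-character key needs 1020 bytes and exceeds the limit, while with utf8 it needs 765 bytes and fits.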

Then set up the Hive client on data01

scp -r hive-0.13.1/ data01:/home/hadoop/src/

Error 4:

Start the metastore again and check the log output:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

The command appears to hang at this point, so check the log:

[hadoop@name01 hadoop]$ tail -f hive.log

2015-01-09 03:46:27,692 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore
2015-01-09 03:46:27,892 WARN  [main]: metastore.ObjectStore (ObjectStore.java:checkSchema(6295)) - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.0
2015-01-09 03:46:30,574 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore
2015-01-09 03:46:30,582 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore
2015-01-09 03:46:31,168 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty
2015-01-09 03:46:31,473 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server
2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...
2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200
2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000
2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true

The log shows the metastore listening on port 9083. Add the following to hive-site.xml so that clients connect to it:

<property>

  <name>hive.metastore.uris</name>

      <value>thrift://192.168.52.128:9083</value>

      </property>
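The value of hive.metastore.uris is a thrift://host:port URI, and the port has to match the one reported in the metastore log (9083 here). As a small sketch, the URI can be split into its parts with plain shell parameter expansion; the function names are made up for illustration:

```shell
# Split a thrift://host:port URI into host and port.
metastore_host() { hp=${1#thrift://}; echo "${hp%:*}"; }
metastore_port() { hp=${1#thrift://}; echo "${hp##*:}"; }

URI=thrift://192.168.52.128:9083   # value from the hive-site.xml property above
echo "host=$(metastore_host "$URI") port=$(metastore_port "$URI")"
```

From the client node, the same telnet check used earlier for MySQL applies here as well: telnet 192.168.52.128 9083.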

Error 5:

2015-01-09 04:01:43,053 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore
2015-01-09 04:01:43,540 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore
2015-01-09 04:01:43,546 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore
2015-01-09 04:01:43,684 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty
2015-01-09 04:01:44,041 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server
2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...
2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200
2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000
2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true
2015-01-09 04:24:13,917 INFO  [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5073)) - Shutting down hive metastore.

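Resolution:

After a long search I could not find the cause behind "No user is added in admin role, since config is empty". The message itself is logged at INFO level, and the only later entry is the shutdown at 04:24:13. If you have run into this, please leave a comment so we can compare notes.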

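Copyright notice: original article from this site, published by 星锅 on 2022-01-20; 9698 characters in total.
Reposting: unless otherwise stated, articles on this site are released under the CC-4.0 license; please credit the source when reposting.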