Hive 2.x is now stable enough for regular use. I have covered installing Hive 0.x and Hive 1.x before; today let's look at how to install and use Hive 2.x.
Environment:
CentOS 7.1
Hadoop 2.7.3
JDK 8
Hive 2.1.0
1. Download the latest stable Hive release, and make sure your Hadoop cluster is already running normally.
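2. Extract the archive to the target directory
Then go into the conf directory and strip the .template suffix from every file that has one. The only exception is hive-default.xml: after removing the suffix, it must also be renamed to hive-site.xml.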
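The renaming in step 2 can be scripted. This is only a minimal sketch: strip_template_suffix is a hypothetical helper name, and it copies the files instead of renaming them so that the shipped templates stay around for reference (that choice is my assumption, the step itself only asks for the suffix to be removed).

```shell
# Hypothetical helper: produce suffix-free copies of every *.template file
# in the given conf directory.
strip_template_suffix() {
    conf_dir="$1"
    for f in "$conf_dir"/*.template; do
        [ -e "$f" ] || continue                 # glob matched nothing
        target="${f%.template}"                 # drop the .template suffix
        # hive-default.xml is the one exception: it becomes hive-site.xml
        if [ "$(basename "$target")" = "hive-default.xml" ]; then
            target="$conf_dir/hive-site.xml"
        fi
        cp "$f" "$target"                       # assumption: copy, keep the template
    done
}
```

It would be called as strip_template_suffix "$HIVE_HOME/conf" once the archive has been extracted.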
3. Configure Hive logging
Open conf/hive-log4j2.properties (vi conf/hive-log4j2.properties) and set the following two parameters:
property.hive.log.dir = /home/search/hive/logs
property.hive.log.file = hive.log
4. Configure MySQL as the metastore database
For installing MySQL and granting privileges, see my earlier post on that topic.
Open hive-site.xml (vi hive-site.xml) and set the following parameters:
javax.jdo.option.ConnectionURL=jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&characterEncoding=utf-8
javax.jdo.option.ConnectionUserName=root
javax.jdo.option.ConnectionPassword=pwd
javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
hive.metastore.warehouse.dir=hdfs://192.168.10.38:8020//user/hive/warehouse
In addition, every value that contains the ${system:java.io.tmpdir} variable should be replaced with an absolute path. You can create a tmp directory under the Hive root directory and point all of them there.
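Replacing the ${system:java.io.tmpdir} occurrences one by one is tedious, so here is a rough sketch that does it with sed. replace_tmpdir is a hypothetical helper; it assumes the target path contains no '#' or '&' characters (they would confuse the sed expression) and it leaves a .bak copy of the original file next to it.

```shell
# Hypothetical helper: replace every ${system:java.io.tmpdir} in a file
# with an absolute directory, creating that directory if needed.
replace_tmpdir() {
    file="$1"
    tmp_dir="$2"
    mkdir -p "$tmp_dir"
    # [$] and [.] force literal matches; -i.bak keeps a backup of the original
    sed -i.bak 's#[$]{system:java[.]io[.]tmpdir}#'"$tmp_dir"'#g' "$file"
}
```

With the layout used in this post the call would be replace_tmpdir "$HIVE_HOME/conf/hive-site.xml" "$HIVE_HOME/tmp". Only the java.io.tmpdir variable is touched; anything else in the value stays as it was.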
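Finally, do not forget to copy the MySQL JDBC driver jar into the hive/lib directory.
A note on the JDBC URL above: the characterEncoding parameter sets the connection encoding to utf-8. Also, because hive-site.xml is an XML file, the & character must be escaped there, i.e. written as &amp;:
jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=utf-8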
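To make the escaping rule concrete, the sketch below prints a property element in the layout hive-site.xml uses and escapes the characters XML treats specially. xml_property is a hypothetical helper written for this post, not something that ships with Hive.

```shell
# Hypothetical helper: print one <property> element for hive-site.xml,
# escaping &, < and > in the value (& must be handled first).
xml_property() {
    name="$1"
    value="$2"
    escaped=$(printf '%s' "$value" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$name" "$escaped"
}
```

Calling xml_property javax.jdo.option.ConnectionURL with the raw JDBC URL as the second argument prints the element with &amp; in place of &, ready to paste between the configuration tags.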
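Also, by default Hive loads the HBase libraries at startup, and it fails to start if HBase is not installed. Download HBase and set HBASE_HOME accordingly; all the environment variables are listed at the end of this post.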
5. Starting with Hive 2.x, the metastore schema must be initialized first:
$HIVE_HOME/bin/schematool -initSchema -dbType mysql
Note: if you skip this step and run hive directly, it fails with the following error:
Caused by: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3364)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)
On success it prints the following:
[search@es1 ~]$ $HIVE_HOME/bin/schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/search/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/search/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://192.168.10.40:3306/hive?createDatabaseIfNotExist=true&characterEncoding=utf-8
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed
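When the setup is scripted, it helps to avoid running -initSchema against a metastore that is already initialized. The sketch below relies on schematool's -info option and on the assumption that it exits non-zero when no schema is present; I have not verified that exit code on every version, so treat it as a starting point. init_schema_if_needed and the SCHEMATOOL override variable are hypothetical names.

```shell
# Hypothetical helper: initialize the metastore schema only if
# "schematool -info" cannot find an existing one.
init_schema_if_needed() {
    db_type="$1"
    # SCHEMATOOL is an illustrative override; by default use the tool under HIVE_HOME
    tool="${SCHEMATOOL:-$HIVE_HOME/bin/schematool}"
    if "$tool" -info -dbType "$db_type" >/dev/null 2>&1; then
        echo "metastore schema already initialized, skipping -initSchema"
    else
        "$tool" -initSchema -dbType "$db_type"
    fi
}
```

For the MySQL metastore in this post the call is init_schema_if_needed mysql.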
6. Test that the cluster works
Create a file named a on the local disk with the following content:
1,a
2,b
3,c
4,a
5,a
2,a
4,2
1,a
1,a
The create_sql script looks like this:
-- drop the table if it already exists
drop table if exists info;
-- create the table
CREATE TABLE info(id string, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
-- load the data
load data local inpath '/home/search/test_hive/a' into table info;
Finally, run the script; if it finishes without errors, the test passed:
hive -f create_sql
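Since Hive 2.x, running jobs on MapReduce is no longer recommended; Tez or Spark is the preferred execution engine, although MR is still supported.
Run the following statement to test: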
hive -e "select count(*) from info"
If it runs successfully, Hive and Hadoop are integrated correctly.
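Beyond "it ran", you can also check that the count Hive returns matches the number of lines in the local file. The following is a small sketch of that comparison; expected_rows and count_matches are hypothetical helper names, and it assumes the table holds exactly one load of the file (which is the case after create_sql, since it drops the table first).

```shell
# Hypothetical helpers: compare a row count reported by Hive with the
# number of lines in the local source file.
expected_rows() {
    # grep -c '' counts every line, including a last line without a newline
    grep -c '' "$1"
}

count_matches() {
    actual="$1"
    file="$2"
    [ "$actual" -eq "$(expected_rows "$file")" ]
}
```

A possible use: count_matches "$(hive -S -e 'select count(*) from info')" /home/search/test_hive/a && echo "row count OK". The -S flag asks the Hive CLI for silent output so that only the result is captured.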
I will cover Hive on Tez integration in my next post.
7. The environment variables I use are as follows:
#JDK
export JAVA_HOME=/home/search/jdk1.8.0_102/
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
#Maven
export MAVEN_HOME=/home/search/apache-maven-3.3.9
export CLASSPATH=$CLASSPATH:$MAVEN_HOME/lib
export PATH=$PATH:$MAVEN_HOME/bin
#Ant
export ANT_HOME=/home/search/ant
export CLASSPATH=$CLASSPATH:$ANT_HOME/lib
export PATH=$PATH:$ANT_HOME/bin
#Hadoop
export HADOOP_HOME=/home/search/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export CLASSPATH=.:$CLASSPATH:$HADOOP_COMMON_HOME:$HADOOP_COMMON_HOME/lib:$HADOOP_MAPRED_HOME:$HADOOP_HDFS_HOME:$HADOOP_HDFS_HOME
#Hbase
export HBASE_HOME=/home/search/hbase
export CLASSPATH=$CLASSPATH:$HBASE_HOME/lib
export PATH=$HBASE_HOME/bin:$PATH
#Pig
export PIG_HOME=/home/search/pig
export PIG_CLASSPATH=$PIG_HOME/lib:$HADOOP_HOME/etc/hadoop
export PATH=/ROOT/server/bigdata/pig/bin:$PATH
#Zookeeper
export ZOOKEEPER_HOME=/home/search/zookeeper
export CLASSPATH=.:$ZOOKEEPER_HOME/lib
export PATH=$PATH:$ZOOKEEPER_HOME/bin
#Hive
export HIVE_HOME=/home/search/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/conf
#JStorm
export JSTORM_HOME=/home/search/jstorm-2.1.1
export CLASSPATH=$CLASSPATH:$JSTORM_HOME/lib
export PATH=$PATH:$JSTORM_HOME/bin:$PATH
#Scala
export SCALA_HOME=/home/search/scala
export CLASSPATH=.:$SCALA_HOME/lib
export PATH=$PATH:$SCALA_HOME/bin
#Spark
export SPARK_HOME=/ROOT/server/spark
export PATH=$PATH:$SPARK_HOME/bin
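Since a missing HBASE_HOME (or any other home directory) only shows up later as a startup failure, a quick preflight check can save some time. This is a minimal sketch; check_env is a hypothetical helper that only verifies each named variable is set and points at an existing directory, nothing more.

```shell
# Hypothetical helper: verify that each named environment variable is set
# and refers to an existing directory. Returns 1 if any check fails.
check_env() {
    missing=0
    for name in "$@"; do
        # indirect lookup of the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ] || [ ! -d "$value" ]; then
            echo "not set or not a directory: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For this installation a reasonable call is check_env JAVA_HOME HADOOP_HOME HBASE_HOME HIVE_HOME, run after sourcing the profile that contains the exports above.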