
Installing a Hive Environment on macOS M1

Published: at 00:00

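Preface

During software development it is sometimes convenient to have a local Hive environment for testing and development. This document records in detail how I installed Hive 3.1.3 on my own machine, for later reference and for sharing with others. Note that this installation procedure is not suitable for high-availability production environments.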

Environment Preparation

Download & Deployment

$ wget -P ~/Downloads https://dlcdn.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
$ sudo tar xzvf ~/Downloads/apache-hive-3.1.3-bin.tar.gz -C /opt/

$ sudo chown -R yhz:admin /opt/apache-hive-3.1.3-bin
$ vim ~/.zshrc

export HIVE_HOME=/opt/apache-hive-3.1.3-bin
export PATH=$HIVE_HOME/bin:$PATH

$ source ~/.zshrc
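
A quick way to confirm that the new variables took effect in the current shell is a small check like the one below. This is only a sketch: the function name `check_hive_env` is hypothetical and not part of Hive, and it assumes the layout used above, with the launcher at `$HIVE_HOME/bin/hive`.

```shell
# Hypothetical helper (not part of Hive): verifies the variables added to ~/.zshrc.
check_hive_env() {
  # HIVE_HOME must be set and non-empty.
  if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set" >&2
    return 1
  fi
  # The hive launcher must exist and be executable.
  if [ ! -x "$HIVE_HOME/bin/hive" ]; then
    echo "no executable found at $HIVE_HOME/bin/hive" >&2
    return 1
  fi
  # $HIVE_HOME/bin must be one of the PATH entries.
  case ":$PATH:" in
    *":$HIVE_HOME/bin:"*) ;;
    *) echo "$HIVE_HOME/bin is not on PATH" >&2; return 1 ;;
  esac
  echo "ok: HIVE_HOME=$HIVE_HOME"
}
```

Running `check_hive_env` after `source ~/.zshrc` should print the `ok:` line; any other message points at the setting that still needs fixing.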
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /user/hive/warehouse
$ hdfs dfs -mkdir -p /tmp/hive
$ hdfs dfs -chmod 777 /tmp/hive
-- Create the metastore database, user, and password in MySQL yourself beforehand
-- Host: 127.0.0.1
-- Port: 3306
-- Username: hive
-- Password: 123456
CREATE DATABASE `metastore`;
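
The comments above leave creating the MySQL user to the reader. One possible way to do it is sketched below, assuming MySQL 8.x and an administrative account such as root. The helper name `metastore_sql` is hypothetical, and the `'%'` host wildcard is an assumption that is only reasonable for a local development machine.

```shell
# Hypothetical helper: prints the SQL that creates the metastore database and user.
# Assumptions: MySQL 8.x syntax; '%' allows the user to connect from any host,
# which is acceptable only for a local development setup.
metastore_sql() {
  db=$1
  user=$2
  pass=$3
  printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db"
  printf "CREATE USER IF NOT EXISTS '%s'@'%%' IDENTIFIED BY '%s';\n" "$user" "$pass"
  printf "GRANT ALL PRIVILEGES ON \`%s\`.* TO '%s'@'%%';\n" "$db" "$user"
}

# Example usage (commented out because it needs a running MySQL server):
# metastore_sql metastore hive 123456 | mysql -h 127.0.0.1 -P 3306 -u root -p
```

Printing the statements instead of executing them directly makes it easy to review them before piping them into the `mysql` client.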
Then create $HIVE_HOME/conf/hive-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/hive</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/hive</value>
  </property>

  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>

  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://127.0.0.1:9083</value>
  </property>

  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
</configuration>
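
Before initializing the schema, it can help to confirm which values Hive will actually read from the file. The following is a minimal sketch, assuming the file is saved as `$HIVE_HOME/conf/hive-site.xml` and that each `<name>` and `<value>` element sits on a single line, as in the listing above. `hive_site_get` is a hypothetical helper, not a Hive command, and it is not a general XML parser.

```shell
# Hypothetical helper: prints the <value> that follows a given <name> in hive-site.xml.
# Assumption: <name>...</name> and <value>...</value> each fit on one line.
hive_site_get() {
  awk -v key="$2" '
    /<name>/ {
      # Remember the most recently seen property name.
      n = $0
      sub(/.*<name>/, "", n)
      sub(/<\/name>.*/, "", n)
    }
    /<value>/ && n == key {
      # Print the value belonging to the requested property and stop.
      v = $0
      sub(/.*<value>/, "", v)
      sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Example usage (commented out because it needs the real configuration file):
# hive_site_get "$HIVE_HOME/conf/hive-site.xml" javax.jdo.option.ConnectionURL
```

If the helper prints nothing for a key, the property is missing from the file or is formatted differently from the listing.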
$ cd $HIVE_HOME/lib
$ wget https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.0.33/mysql-connector-j-8.0.33.jar
$ schematool -dbType mysql -initSchema
$ nohup hive --service metastore 2>&1 &
$ hive

Output like the screenshot below indicates a successful installation: [Pasted image 20231013164440.png]
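
Because the metastore is started in the background, the `hive` client can fail if it is launched before the Thrift port from `hive.metastore.uris` (9083 here) is accepting connections. A small polling helper is one way to avoid that race. This is a sketch under assumptions: `wait_for_port` is a hypothetical name, and it relies on the `nc` utility that ships with macOS.

```shell
# Hypothetical helper: waits until a TCP port accepts connections.
# Makes at most `timeout` attempts, roughly one per second; returns 0 once
# the port is reachable and 1 if it never becomes reachable.
wait_for_port() {
  host=$1
  port=$2
  timeout=${3:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # nc -z only probes the port; -w 1 limits each probe to one second.
    if nc -z -w 1 "$host" "$port" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example usage (commented out because it needs a running metastore):
# nohup hive --service metastore 2>&1 &
# wait_for_port 127.0.0.1 9083 60 && hive
```

The same helper could be pointed at port 10000 before connecting with beeline, since HiveServer2 also needs some time to start.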

$ vim $HADOOP_HOME/etc/hadoop/core-site.xml

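HiveServer2 runs queries on behalf of the connecting user, so the account that starts it must be allowed to act as a proxy user in Hadoop. Assuming the local username is yhz, add the following: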

<property>
  <name>hadoop.proxyuser.yhz.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.yhz.groups</name>
  <value>*</value>
</property>

Restart Hadoop

$ nohup hive --service hiveserver2 2>&1 &
$ beeline -u jdbc:hive2://localhost:10000 --color=true
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 3.1.3)
Driver: Hive JDBC (version 3.1.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.3 by Apache Hive
0: jdbc:hive2://localhost:10000> show databases;
INFO  : Compiling command(queryId=yhz_20231016145831_02c7d352-e791-4744-bfcf-97fff8aa888c): show databases
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=yhz_20231016145831_02c7d352-e791-4744-bfcf-97fff8aa888c); Time taken: 0.755 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=yhz_20231016145831_02c7d352-e791-4744-bfcf-97fff8aa888c): show databases
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=yhz_20231016145831_02c7d352-e791-4744-bfcf-97fff8aa888c); Time taken: 0.146 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (1.315 seconds)
0: jdbc:hive2://localhost:10000> !quit
Closing: 0: jdbc:hive2://localhost:10000