```shell
hadoop@Master:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/bigdata/hive-2.3.5/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/bigdata/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/usr/local/bigdata/hive-2.3.5/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
dbtest
default
Time taken: 3.539 seconds, Fetched: 2 row(s)
hive>
```
Tip: to show the current database in the prompt:

```sql
hive> set hive.cli.print.current.db=true;
```

To print column headers in query results:

```sql
hive> set hive.cli.print.header=true;
```
These settings only apply to the current session. To make them permanent, create a `.hiverc` file in the current user's home directory with the following content:

```sql
set hive.cli.print.current.db=true;
set hive.cli.print.header=true;
```
```shell
hadoop@Master:~$ beeline -u jdbc:hive2://Master:10000 -n hadoop -p hadoop
Connecting to jdbc:hive2://Master:10000
19/06/25 01:50:12 INFO jdbc.Utils: Supplied authorities: Master:10000
19/06/25 01:50:12 INFO jdbc.Utils: Resolved authority: Master:10000
19/06/25 01:50:13 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://Master:10000
Connected to: Apache Hive (version 2.3.5)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://Master:10000>
```
Parameter notes:
-u: the JDBC connection URL
-n: the login user (an OS user)
-p: the user's password
A common error:

```shell
errorMessage:Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoop is not allowed to impersonate hadoop), serverProtocolVersion:null)
```
The fix is to configure the hadoop proxy user in Hadoop's core-site.xml:

```xml
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>hadoop</value>
  <description>Allow the superuser hadoop to impersonate members of the hadoop group</description>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>Master,127.0.0.1,localhost</value>
  <description>Hosts from which the superuser hadoop may connect when impersonating users</description>
</property>
```
```sql
0: jdbc:hive2://Master:10000> select * from t_a;
+-----------+----------+--+
| t_a.name  | t_a.num  |
+-----------+----------+--+
| a         | 1        |
| b         | 2        |
| c         | 3        |
| d         | 4        |
+-----------+----------+--+
4 rows selected (0.523 seconds)
0: jdbc:hive2://Master:10000> select * from t_b;
+-----------+----------+--+
| t_b.name  | t_b.age  |
+-----------+----------+--+
| b         | 16       |
| c         | 17       |
| d         | 18       |
| e         | 19       |
+-----------+----------+--+
4 rows selected (0.482 seconds)
```
4.3 Inner join
Specify the join condition:

```sql
select a.*, b.*
from t_a a
join t_b b on a.name = b.name;
```
Example:

```sql
0: jdbc:hive2://Master:10000> select a.*,b.*
0: jdbc:hive2://Master:10000> from
0: jdbc:hive2://Master:10000> t_a a join t_b b on a.name=b.name;
....
+---------+--------+---------+--------+--+
| a.name  | a.num  | b.name  | b.age  |
+---------+--------+---------+--------+--+
| b       | 2      | b       | 16     |
| c       | 3      | c       | 17     |
| d       | 4      | d       | 18     |
+---------+--------+---------+--------+--+
```
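The inner join keeps only rows whose `name` appears in both tables. As a minimal sketch of the same semantics in Python (table contents taken from the example output above):

```python
# (name, num) rows of t_a and (name, age) rows of t_b, as shown above
t_a = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
t_b = [("b", 16), ("c", 17), ("d", 18), ("e", 19)]

# Inner join on name: emit a combined row only when both sides match.
inner = [(an, num, bn, age)
         for (an, num) in t_a
         for (bn, age) in t_b
         if an == bn]
print(inner)  # rows for b, c, d only; "a" and "e" have no partner
```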
4.4 Left outer join (left join)

```sql
select a.*, b.*
from t_a a
left outer join t_b b on a.name = b.name;
```
Example:

```sql
0: jdbc:hive2://Master:10000> select a.*,b.*
0: jdbc:hive2://Master:10000> from
0: jdbc:hive2://Master:10000> t_a a left outer join t_b b on a.name=b.name;
...
+---------+--------+---------+--------+--+
| a.name  | a.num  | b.name  | b.age  |
+---------+--------+---------+--------+--+
| a       | 1      | NULL    | NULL   |
| b       | 2      | b       | 16     |
| c       | 3      | c       | 17     |
| d       | 4      | d       | 18     |
+---------+--------+---------+--------+--+
```
4.5 Right outer join (right join)

```sql
select a.*, b.*
from t_a a
right outer join t_b b on a.name = b.name;
```
Example:

```sql
0: jdbc:hive2://Master:10000> select a.*,b.*
0: jdbc:hive2://Master:10000> from
0: jdbc:hive2://Master:10000> t_a a right outer join t_b b on a.name=b.name;
....
+---------+--------+---------+--------+--+
| a.name  | a.num  | b.name  | b.age  |
+---------+--------+---------+--------+--+
| b       | 2      | b       | 16     |
| c       | 3      | c       | 17     |
| d       | 4      | d       | 18     |
| NULL    | NULL   | e       | 19     |
+---------+--------+---------+--------+--+
```
4.6 Full outer join

```sql
select a.*, b.*
from t_a a
full outer join t_b b on a.name = b.name;
```
Example:

```sql
0: jdbc:hive2://Master:10000> select a.*,b.*
0: jdbc:hive2://Master:10000> from
0: jdbc:hive2://Master:10000> t_a a full outer join t_b b on a.name=b.name;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
+---------+--------+---------+--------+--+
| a.name  | a.num  | b.name  | b.age  |
+---------+--------+---------+--------+--+
| a       | 1      | NULL    | NULL   |
| b       | 2      | b       | 16     |
| c       | 3      | c       | 17     |
| d       | 4      | d       | 18     |
| NULL    | NULL   | e       | 19     |
+---------+--------+---------+--------+--+
```
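The outer joins differ only in which unmatched rows they keep. A small Python sketch of full outer join semantics over the same assumed t_a/t_b contents, with `None` standing in for Hive's NULL:

```python
# name -> num and name -> age, matching the example tables above
t_a = {"a": 1, "b": 2, "c": 3, "d": 4}
t_b = {"b": 16, "c": 17, "d": 18, "e": 19}

# Full outer join: keep every name from either side;
# a missing side becomes None (printed as NULL by Hive).
full = {name: (t_a.get(name), t_b.get(name))
        for name in sorted(set(t_a) | set(t_b))}
print(full)  # "a" lacks an age, "e" lacks a num
```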
4.7 Left semi join
Returns the rows of table a whose join key also exists in table b.

```sql
select a.*
from t_a a
left semi join t_b b on a.name = b.name;
```
Example:

```sql
0: jdbc:hive2://Master:10000> select a.*
0: jdbc:hive2://Master:10000> from
0: jdbc:hive2://Master:10000> t_a a left semi join t_b b on a.name=b.name;
.....
+---------+--------+--+
| a.name  | a.num  |
+---------+--------+--+
| b       | 2      |
| c       | 3      |
| d       | 4      |
+---------+--------+--+
```
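A left semi join behaves like an IN/EXISTS filter: only columns from a are returned, and each row of a appears at most once no matter how many rows of b match. A Python sketch under the same assumed data:

```python
# Rows of t_a and the set of join keys present in t_b
t_a = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
t_b_names = {"b", "c", "d", "e"}

# Left semi join: keep rows of t_a whose key exists in t_b;
# no columns from t_b are produced.
semi = [(name, num) for (name, num) in t_a if name in t_b_names]
print(semi)
```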
```sql
0: jdbc:hive2://Master:10000> select url,max(ip)
0: jdbc:hive2://Master:10000> from t_pv
0: jdbc:hive2://Master:10000> group by url;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
+------------------------------+----------------+--+
| url                          | _c1            |
+------------------------------+----------------+--+
| http://www.xxx.cn/excersize  | 192.168.33.46  |
| http://www.xxx.cn/job        | 192.168.33.55  |
| http://www.xxx.cn/pay        | 192.168.34.44  |
| http://www.xxx.cn/register   | 192.168.133.3  |
| http://www.xxx.cn/stu        | 192.168.33.44  |
| http://www.xxx.cn/teach      | 192.168.44.3   |
+------------------------------+----------------+--+
```
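The `group by url` query keeps one row per url with the largest ip. The grouping logic can be sketched in Python (the sample rows here are hypothetical; note that `max` compares the IPs lexicographically as strings, just as Hive does for a string column):

```python
# (url, ip) rows; a hypothetical subset of t_pv
rows = [
    ("http://www.xxx.cn/stu", "192.168.33.44"),
    ("http://www.xxx.cn/stu", "192.168.33.41"),
    ("http://www.xxx.cn/job", "192.168.33.55"),
]

# group by url, then max(ip) per group (string comparison)
groups = {}
for url, ip in rows:
    groups[url] = max(groups.get(url, ip), ip)
print(groups)
```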
```sql
select id, name,
       family_members["father"] as father,
       family_members["sister"] as sister,
       age
from t_family;
```
Find each person's relation types:

```sql
select id, name, map_keys(family_members) as relations, age
from t_family;
```
Find the names of each person's relatives:

```sql
select id, name, map_values(family_members) as relations, age
from t_family;
```
Count each person's relatives:

```sql
select id, name, size(family_members) as relations, age
from t_family;
```
Find everyone who has a brother, and who that brother is:

```sql
-- Solution 1: single query
select id, name, age, family_members['brother']
from t_family
where array_contains(map_keys(family_members), 'brother');

-- Solution 2: subquery
select id, name, age, family_members['brother']
from (select id, name, age,
             map_keys(family_members) as relations,
             family_members
      from t_family) tmp
where array_contains(relations, 'brother');
```
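Hive's map functions correspond directly to dict operations. A Python sketch of what map_keys, map_values, size, and array_contains compute, using a hypothetical `family_members` value:

```python
# Hypothetical family_members map for one row of t_family
family_members = {"father": "wang", "sister": "li", "brother": "zhao"}

keys = sorted(family_members)                # like map_keys(family_members)
values = [family_members[k] for k in keys]   # like map_values(family_members)
n = len(family_members)                      # like size(family_members)
has_brother = "brother" in keys              # like array_contains(keys, 'brother')
print(keys, values, n, has_brother)
```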
```sql
load data local inpath '/root/udftest.data' into table t_user_info;
```
Requirement: generate the following new table from the table above:

```sql
t_user: uid, uname, age, birthday, address
```
Approach: define a custom function parse_user_info() that takes one raw line of the above data and returns the requested, already-split field.
The requirement can then be met with the following SQL:

```sql
create table t_user as
select parse_user_info(info, 0) as uid,
       parse_user_info(info, 1) as uname,
       parse_user_info(info, 2) as age,
       parse_user_info(info, 3) as birthday_date,
       parse_user_info(info, 4) as birthday_time,
       parse_user_info(info, 5) as address
from t_user_info;
```
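The format of udftest.data is not shown here, so purely as an illustration: assuming each `info` line is comma-separated with fields in the order the query expects (an assumption, not the actual format), the split-and-index logic the UDF would implement looks like this in Python (a real Hive UDF would be written in Java):

```python
def parse_user_info(info, idx):
    """Hypothetical sketch of the UDF's logic: split the raw line
    and return the idx-th field; format is an assumption."""
    fields = info.split(",")
    return fields[idx] if idx < len(fields) else None

# Made-up sample line in the assumed format
line = "1,zhangsan,28,1990-06-18,10:55:30,beijing"
print(parse_user_info(line, 1))  # uname field -> "zhangsan"
```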