Hiv 语法
2021/8/2 23:08:47
本文主要是介绍Hiv 语法,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!
HIV语法
-
1、创建数据库
create database hive_db;显示数据库中表
show tables;
切换数据库
use hive_db;
显示数据库中
show databases;
显示某表结构,两种方式 1:表结构 2:表结构和分区数据位置等
desc 表名;
show create table 表名 -
2、创建表脚本
2.1 内部表-直接建表
create table IF NOT EXISTS orders
(
order_id string comment "订单编号",
user_id string comment "用户ID",
order_number string comment "下单顺序",
order_dow string comment "下单日期周一到周日",
order_hour_of_day string comment "下单时间",
days_since_prior_order string comment "距离上一次购物时间"
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE;create table IF NOT EXISTS order_product( order_id string comment "订单编号", product_id string comment "物品ID", add_to_cart_order string comment "加入购物车时间", reordered string comment "是否复购" ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
2.1 查询建表
#场景:建立一个临时表,或者一个中间表
create table movies_tem as select * from movies limit 100;
2.2 like 建表
CREATE TABLE IF NOT EXISTS default.weblog_20150923
LIKE default.weblog ; -
3、导入数据
3.1 本地导入
load data local inpath ‘/user/root/custom.csv’ overwrite into table 表明
#用overwrite 加载本地数据到hive数据仓库
#local:加上local指本地的数据路径,也就是在linux系统下的文件路径
#不加local:指文件在hdfs下的路径,文件上传到hdfs后的路径 -
4、collect_list和collect_set
它们都是将分组中的某列转为一个数组返回,不同的是collect_list不去重而collect_set去重。
https://www.cnblogs.com/cc11001100/p/9043946.html -
5、split分割和数组转行
select split("I Love you", " ")
select explode(split("I Love you", " ")); -
6、row_number()、rank()、dense_rank()三个函数区别
-
7、时间窗口
row_number() over()
sum() over()从最早的时间距你当前的时间
select *, sum(result) over (partition by user_name order by create_time) as result_sum
from user_match_temp从你当前时间到当前时间的前三条数据,不包括本条数据
select *, avg(result) over (partition by user_name order by create_time rows between 3 preceding and current row) as recenty_wins
from user_match_temp -
8、case when 条件表达式 then 表达式为true返回值 else 表达式为false返回值 end
select uid,iid,score,case when score<=1 then '0-1' when score>1 and score<=3 then '1-3' when score>3 then '3-5' else '-1' end as
score_rank from movies limit 30; -
9、if(条件表达式,表达式为true返回值,表达式为false返回值)
-
10、自定义函数(UDF\UDAF\UDTF)
-
11、concat 拼接
select concat("{",'aaa',"}"); -
12、regexp_replace #正在表达式
select regexp_replace("'course')}", "^\W+|\W+$","") -
13、 针对json格式进行解析处理并转换成多行处理
a:shangdong,b:beijing,c:shanghai 1,2,3 [{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9"},{"source":"yam","monthSales":54900,"userCount":12900,"score":"4.9"}] a:tianjing,b:beijing,c:shanghai 3,4,5 [{"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9"},{"source":"yam","monthSales":54900,"userCount":12900,"score":"4.9"}]
#去掉开头和结尾的[{、}] select regexp_replace(sale_info,'\\[\\{|\\}\\]','') from explode_test
"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9"},{"source":"yam","monthSales":54900,"userCount":12900,"score":"4.9"
"source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9"},{"source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9"},{"source":"yam","monthSales":54900,"userCount":12900,"score":"4.9"
# 拆分并转成多行 select explode(split(regexp_replace(sale_info,'\\[\\{|\\}\\]',''),"\\},\\{")) from explode_test ~~~显示结果 "source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9" "source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9" "source":"yam","monthSales":54900,"userCount":12900,"score":"4.9" "source":"7fresh","monthSales":4900,"userCount":1900,"score":"9.9" "source":"jdmart","monthSales":7900,"userCount":2900,"score":"5.9" "source":"yam","monthSales":54900,"userCount":12900,"score":"4.9"
通过get_json_object函数转换成json并获取json中属性
select
get_json_object(concat("{",t.infos,"}"),"$.source"),
get_json_object(concat("{",t.infos,"}"),"$.monthSales"),
get_json_object(concat("{",t.infos,"}"),"$.score")
from
(
select explode(split(regexp_replace(sale_info,'\[\{|\}\]',''),"\},\{")) as infos from explode_test
)t
7fresh 4900 9.9 jdmart 7900 5.9 yam 54900 4.9 7fresh 4900 9.9 jdmart 7900 5.9 yam 54900 4.9
扩展,如果显示area,如何处理,可以通过udtf即lateral view进行实现
select
area,
get_json_object(concat("{",infos,"}"),"$.source"),
get_json_object(concat("{",infos,"}"),"$.monthSales"),
get_json_object(concat("{",infos,"}"),"$.score")
from explode_test lateral view explode(split(regexp_replace(sale_info,'\[\{|\}\]',''),"\},\{")) g as infos
针对hive一些配置说明 1、当select * from xxx,不能显示列名的时候,可以到hive-site.xml里面添加 <property> <name>hive.cli.print.current.db</name> <value>true</value> </property> <property> <name>hive.cli.print.header</name> <value>true</value> </property> 或者进入hive之后 – set hive.cli.print.current.db=true; – set hive.cli.print.header=true;
这篇关于Hiv 语法的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!
- 2024-12-23线下车企门店如何实现线上线下融合?
- 2024-12-23鸿蒙Next ArkTS编程规范总结
- 2024-12-23物流团队冬至高效运转,哪款办公软件可助力风险评估?
- 2024-12-23优化库存,提升效率:医药企业如何借助看板软件实现仓库智能化
- 2024-12-23项目管理零负担!轻量化看板工具如何助力团队协作
- 2024-12-23电商活动复盘,为何是团队成长的核心环节?
- 2024-12-23鸿蒙Next ArkTS高性能编程实战
- 2024-12-23数据驱动:电商复盘从基础到进阶!
- 2024-12-23从数据到客户:跨境电商如何通过销售跟踪工具提升营销精准度?
- 2024-12-23汽车4S店运营效率提升的核心工具