MySQL基础 开窗函数
2022/4/16 2:15:12
本文主要是介绍MySQL基础 开窗函数,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!
目录- mysql语法
- 数据准备
- 1.聚合函数(分组函数)
- 1.聚合统计逻辑
- 2.函数使用
- 2.开窗函数
- 1.语法
- 2.聚合函数:多行数据 按照一定规则 进行聚合 为一行
- 3.内置窗口函数
- 4.内置窗口函数
- 1.取值 串行
- 1.串行
- 2.取值
- 2.排序
- 分组
- 排序
- 1.取值 串行
mysql语法
数据准备
create table emp ( empno numeric(4) not null, ename varchar(10), job varchar(9), mgr numeric(4), hiredate datetime, sal numeric(7, 2), comm numeric(7, 2), deptno numeric(2) ); insert into emp values (7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, null, 20); insert into emp values (7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 30); insert into emp values (7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 30); insert into emp values (7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, null, 20); insert into emp values (7654, 'MARTIN', 'SALESMAN', 7698, '1981-09-28', 1250, 1400, 30); insert into emp values (7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, null, 30); insert into emp values (7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, null, 10); insert into emp values (7788, 'SCOTT', 'ANALYST', 7566, '1982-12-09', 3000, null, 20); insert into emp values (7839, 'KING', 'PRESIDENT', null, '1981-11-17', 5000, null, 10); insert into emp values (7844, 'TURNER', 'SALESMAN', 7698, '1981-09-08', 1500, 0, 30); insert into emp values (7876, 'ADAMS', 'CLERK', 7788, '1983-01-12', 1100, null, 20); insert into emp values (7900, 'JAMES', 'CLERK', 7698, '1981-12-03', 950, null, 30); insert into emp values (7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, null, 20); insert into emp values (7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, null, 10);
1.聚合函数(分组函数)
1.聚合统计逻辑
聚合统计: group by => 分组 xianyu,<1,a,xc,asd> lxy,<as,zxf,zxf,qwr,ags> 聚合函数 => 指标 xianyu,4 lxy,5
2.函数使用
group by =》 分组 聚合函数 =》 指标统计 sun avg max min count 需求: 统计每个部门有多少个人? 查什么? 维度:部门 指标:人数 select deptno, count(1) as cnt from emp group by deptno; 解释: count(1) 【1.代表 先放置一个假数,然后再查询】 【2.理解为按照第几个字段进行查数】 select select + 函数 => 可以校验函数是否存在
2.开窗函数
1.语法
窗口函数: 窗口 + 函数 窗口:函数运行时 计算数据集的范围 函数:运行时的函数 1.聚合函数 sun avg max min count 2.内置窗口函数 语法结构: 函数 over([partition by xxx,...] [order by xxx,...]) over() 是以谁进行开窗【table or 数据集】 partition by:以谁进行分组 【group by column】 order by:以谁进行排序【column】
2.聚合函数:多行数据 按照一定规则 进行聚合 为一行
sum avg max... 理论上:聚合后的行数 <= 聚合前的行数 【主要是看维度选取 group by 里面的字段】 需求: 既要显示 聚合前的数据 又要显示 聚合后的数据 ? id name sal dt sal_all 1 zs 1000 2022-4 1000 2 ls 2000 2022-4 2000 3 wu 3000 2022-4 3000 4 zs 1000 2022-5 2000 5 ls 2000 2022-5 4000 6 wu 3000 2022-5 6000
数据:
服务器 每天的启动 次数
linux01,2022-04-15,1
linux01,2022-04-16,5
linux01,2022-04-17,7
linux01,2022-04-18,2
linux01,2022-04-19,3
linux01,2022-04-20,10
linux01,2022-04-21,4
统计累计问题: 创建表 create table window01( name varchar(50), dt varchar(20), cnt int ); 插入数据 insert into window01 values("linux01","2022-04-15",1); insert into window01 values("linux01","2022-04-16",5); insert into window01 values("linux01","2022-04-17",7); insert into window01 values("linux01","2022-04-18",2); insert into window01 values("linux01","2022-04-19",3); insert into window01 values("linux01","2022-04-20",10); insert into window01 values("linux01","2022-04-21",4); insert into window01 values("linux02","2022-04-18",20); insert into window01 values("linux02","2022-04-19",30); insert into window01 values("linux02","2022-04-20",10); insert into window01 values("linux02","2022-04-21",40); 需求: 每个服务器 每天 累积启动次数 select name, dt, cnt, sum(cnt) over(partition by name order by dt) as cut_all from window01; +---------+------------+------+---------+ | name | dt | cnt | cut_all | +---------+------------+------+---------+ | linux01 | 2022-04-15 | 1 | 1 | | linux01 | 2022-04-16 | 5 | 6 | | linux01 | 2022-04-17 | 7 | 13 | | linux01 | 2022-04-18 | 2 | 15 | | linux01 | 2022-04-19 | 3 | 18 | | linux01 | 2022-04-20 | 10 | 28 | | linux01 | 2022-04-21 | 4 | 32 | | linux02 | 2022-04-18 | 20 | 20 | | linux02 | 2022-04-19 | 30 | 50 | | linux02 | 2022-04-20 | 10 | 60 | | linux02 | 2022-04-21 | 40 | 100 | +---------+------------+------+---------+ 1 9 10 11 str 【字典序】 1 10 11 9 * 从1开始,1,2,3,4,5,6,7,8,9 * 从10开始,1,10…19,2,3,4,5,6,7,8,9 * 从20开始,1,10…19,2,20…29,3,4,5,6,7,8,9 * 以此类推,所有的10位数,都插入到与他们十位数位置上相等的个位数后面。
3.内置窗口函数
窗口大小 xxx between xxx and xxx 参数 (ROWS | RANGE) BETWEEN (UNBOUNDED | [num]) PRECEDING AND ([num] PRECEDING | CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING) (ROWS | RANGE) BETWEEN CURRENT ROW AND (CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING) (ROWS | RANGE) BETWEEN [num] FOLLOWING AND (UNBOUNDED | [num]) FOLLOWING select name, dt, cnt, sum(cnt) over(partition by name order by dt) as cut_all, -- 无边界 sum(cnt) over(partition by name order by dt rows between unbounded preceding and current row) as cut_all2, -- 前三行 + 当前行 sum(cnt) over(partition by name order by dt rows between 3 preceding and current row) as cut_all3, -- 前三行 + 当前行 + 下一行 sum(cnt) over(partition by name order by dt rows between 3 preceding and 1 following) as cut_all4, -- 上面无边界 + 下面无边界 sum(cnt) over(partition by name order by dt rows between unbounded preceding and UNBOUNDED FOLLOWING) as cut_all5 from window01; select name, dt, cnt, -- 常规分组排序求加和 sum(cnt) over(partition by name order by dt) as cut_all, -- 整张表对时间排序,然后加和,作用到整张表,理解为18号并列有两条数据 sum(cnt) over(order by dt) as cut_all2, -- 对整张表进行加和 sum(cnt) over() as cut_all3, -- 直接按照名字进分组 sum(cnt) over(partition by name) as cut_all4 from window01 order by dt; 1.partition by 不加 => 作用整张表 数仓顺序 ods不动 union all + group by select ifnull case when join group by grouping sets 【维度组合】
4.内置窗口函数
1.取值 串行
1.串行
LAG 【窗口内 向上 第n行的值 当前行向上取一行】 LAG(column [, N[, default]]) column => 列名 n => 取几行 default => 取不到给默认值 LEAD 【窗口内 向下 第n行的值 当前行向下取一行】 select name, dt, cnt, sum(cnt) over(partition by name order by dt) as cut_all, lead(dt,1,"9999-99-99") over(partition by name order by dt) as lead_alias, lead(dt,1,"9999-99-99") over(partition by name order by dt) as lag_alias from window01;
2.取值
FIRST_VALUE() : 取分组内排序后 截止到当前行 第一个值 LAST_VALUE():取分组内排序后 截止到当前行 最后一个值 select name, dt, cnt, first_value(cnt) over(partition by name order by dt) as f_value, last_value(cnt) over(partition by name order by dt) as l_value from window01;
2.排序
分组
ntile 需求: 把数据按照某个字段进行排序,把数据分成n份ntile(n) 如果不能平均分配 优先分配到编号小的里面 select name, dt, cnt, sum(cnt) over(partition by name order by dt) as cut_all, -- 平均分成n份,不能平均分,优先把多余的放到最小的里面 ntile(2) over(partition by name order by dt) as n2, ntile(3) over(partition by name order by dt) as n3 from window01 order by dt;
排序
rank : 从1开始,按照排序 相同会重复,名次会留下空位 生成组内的记录编号 row_number: 从1开始,按照排序 生成组内的记录编号 dense_rank:从1开始,按照排序 相同会重复,名次不会留下空位 生成组内的记录编号 select name, dt, cnt, sum(cnt) over(partition by name order by dt) as cut_all, rank() over(partition by name order by cnt desc) as rk, row_number() over(partition by name order by cnt desc) as rw, dense_rank() over(partition by name order by cnt desc) as d_rk from window01;
这篇关于MySQL基础 开窗函数的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!
- 2024-11-16MySQL资料:新手入门教程
- 2024-11-16MySQL资料:新手入门教程
- 2024-11-15MySQL教程:初学者必备的MySQL数据库入门指南
- 2024-11-15MySQL教程:初学者必看的MySQL入门指南
- 2024-11-04部署MySQL集群项目实战:新手入门教程
- 2024-11-04如何部署MySQL集群资料:新手入门指南
- 2024-11-02MySQL集群项目实战:新手入门指南
- 2024-11-02初学者指南:部署MySQL集群资料
- 2024-11-01部署MySQL集群教程:新手入门指南
- 2024-11-01如何部署MySQL集群:新手入门教程