Oracle 11g学习笔记–分析函数

共计 5490 个字符，预计需要花费 14 分钟才能阅读完成。

Oracle 11g 学习笔记–分析函数

示例表：
Oracle 11g 学习笔记 -- 分析函数

评级函数

rank()/dense_rank()
返回数据项在分组中的排名，前者在排名相等的情况下，会留下空位，后者不会

 select emp_id, sum(amount),
rank() over (order by sum(amount) desc [{nulls last]|nulls first}]) as rank,
dense_rank() over (order by sum(amount) desc) as dense_rank
from all_sales
group by emp_id;

Oracle 11g 学习笔记 -- 分析函数

该句中的 desc 的含义就是降序，可换为 asc 进行升序排名；
[{nulls last]|nulls first}]指明了，将空结果排列在第一名，还是最排在最后一名；默认的情况下，系统会视空数据为最低，再根据升序或者降序选择显示位置；

cume_dist 和 percent_rank 函数
cume_dist 可以计算某个特定值相对于一组值中的位置；
percent_rank 可计算某个值相对于一组值的百分比排名；

 select 
    prd_type_id, sum(amount),
    cume_dist() over (order by sum(amount) desc) as cume_dist,
    percent_rank() over (order by sum(amount) desc) as percent_rank 
    from all_sales
    where year = 2003
    group by prd_type_id
    order by sum(amount) desc;

Oracle 11g 学习笔记 -- 分析函数
从中我们便容易便可以看出两个函数的作用了
ntile（buket）函数
可以计算 n 分片的值,buket 指定了分片的片数，记录将被分组为 buket 个片。

 select prd_type_id, sum(amount),
    ntile(3) over(order by sum(amount) desc) as ntile
from all_sales
where year = 2003
and amount is not null
group by prd_type_id
order by sum(amount) desc;

Oracle 11g 学习笔记 -- 分析函数
可以看出片规定了最低的名次，多出的名次均是第一；

row_number
row_number()从 1 开始，为每一条分组记录返回一个数字。

 select 
    prd_type_id, sum(amount),
    dense_rank() over (order by sum(amount) desc nulls last) as dense_rank,
    row_number() over (order by sum(amount) desc nulls last) as row_number
from all_sales
where year = 2003
group by prd_type_id 
order by sum(amount) desc;

Oracle 11g 学习笔记 -- 分析函数
通过该图初步判定（猜测）row_number()的作用，和 dense_rank 作用一样；只不过在排名相等的情况下，不可能出现并排；

反百分点函数

反百分点函数有两个：percentile_disc(x)和 percentile_cont(x)
他们的作用于 cume_dist()和 percent_rank()相反。percentile_disc(x)在每一个分组中检查累积分布的数值，直到大于或者等于 x 的值。
percentile_cont(x)在每一个分组中检查百分比排名的值，直到周到找打大于或者等于 x 的值。

 select 
    percentile_cont(0.6) within group (order by sum(amount) desc) as percentile_count,
    percentile_disc(0.5) within group (order by sum(amount) desc) as percentile_disc
    from all_sales
    where year = 2003 and amount is not null
    group by prd_type_id;

? percentile_cont()什么意思，没看懂。

窗口函数

窗口函数可以计算一定记录范围内，一定值域内，或者一段时间内的累计和以及移动平均值。窗口可以与这些函数结合使用：sum(), avg(), max(), min(), count(), variance(), stddev(), first_value(), last_value()。

那么窗口函数到底用来干什么的呢？

计算累计和

 select 
    month, sum(amount) as m_amount, 
    sum(sum(amount)) over (order by month rows between unbounded preceding and current row) as cumulative_amount
 
    from all_sales
    where year = 2003
    group by month
    order by month;

Oracle 11g 学习笔记 -- 分析函数

计算移动平均值

 select
    month as month, sum(amount) as month_amount, 
    sum(sum(amount)) over (order by month rows between unbounded preceding and current row)  as sum,
    avg(sum(amount)) over (order by month rows between unbounded preceding and current row)  as avg
    from all_sales
    where year = 2003 and month between 6
 and 12
    group by month
    order by month;

Oracle 11g 学习笔记 -- 分析函数

从上图可以看出所谓的移动平均值就是将当前行到窗口起点所有项的平均值，其实计算的项是根据窗口大小具体而定的；

计算中心平均值

 select 
    month, sum(amount) as month_amount,
    sum(sum(amount)) over (order by month rows between 1 preceding and 1 following) as moving_sum,
    avg(sum(amount)) over (order by month rows between 1 preceding and 1 following) as moving_average
    from all_sales
    where year = 2003
    group by month
    order by month;

Oracle 11g 学习笔记 -- 分析函数
从图可以看出，所谓的中心平均值，就是讲当月和上月的平均值

frist_value 和 last_value
这个函数的作用是获取窗口的第一行和最后一行数据
使用方法和上面的类似，读者自测；

补充：不知道从以上的列子中你是否看出了 over 里面语句的作用没？
其实它的作用就是规定了一个窗口，而前面的函数只是对这个窗口的函数进行操作；窗口规定：
between 1 preceding and 1 following：就是从前一行到后一行
between unbounded preceding and current row：（未绑定，默认从开始处，）从最开始处到当前行

报表函数

报表函数可用与执行跨越分组和组内分区的计算。

总计报表

 select 
    month, prd_type_id,
    sum(sum(amount)) over (partition by month)
as total_month_amount,
    sum(sum(amount)) over (partition by prd_type_id) as total_type_amount
    from all_sales
    where year = 2003 and month <= 3
    group by month, prd_type_id
    order by month, prd_type_id;

Oracle 11g 学习笔记 -- 分析函数

表达式分解：
sum(amount)计算一个销量的总和，外的 sum()计算总计
over(pratition by month)让外部的 sum(）计算每一个月的总计
使用 patio_to_report 函数
该函数用来计算某个值在一组值的总和中所占的比率；

 select month, 
sum(amount) as prd_type_amount,
ratio_to_report(sum(amount)) over (partition by month) as prd_type_ratio from all_sales
where year = 2003 and month <= 3
group by month, prd_type_id
order by month;

Oracle 11g 学习笔记 -- 分析函数
月总计

从两张图可以看出，计算的是当行的数据在总计中所占的比率

延迟与领先函数

lag 和 lead 函数可获取距当前记录指定距离处的那条记录中的数据。

 select month, sum(amount) as month_amount,
lag(sum(amount), 2) over (order by month) as previous_month_amount,
lead(sum(amount), 2) over (order by month) as next_month_amount
from all_sales
where year = 2003
group by month
order by month;

Oracle 11g 学习笔记 -- 分析函数

首函数与末函数

first 和 last 函数可获取一个排序分组中的第一个值和最后一个值.rirst 和 last 可以与系列函数一起使用：min(), max(), sum(), avg(), stddev(), varlance().

 select 
    max(month) keep (dense_rank first order by sum(amount)) as highest_sales_month,
    min(month) keep (dense_rank last order by sum(amount))
    from all_sales
    where year = 2003
    group by month 
    order by month;
-------------------------------------
select 
    month,  sum(amount)
from all_sales
    group by month
    order by sum(amount);

Oracle 11g 学习笔记 -- 分析函数

线性回归函数

线性回归函数可以用普通最小平方回归曲线拟合一组数值对，线性回归函数可用于聚合，窗口或报表函数；

函数	说明
regr_avgx(y,x)	先去除 x 或 y 为空值的 x 和 y 数值对, 然后返回 x 的平均值
regr_avgy(y,x)	先去除 x 或 y 为空值的 x 和 y 数值对, 然后返回 y 的平均值
regr_count(y,x)	返回可用于拟合回归曲线的非空数值对个数
regr_intercept(y,x)	返回回归曲线在 y 轴方向的截距
regr_r2(y,x)	返回回归曲线的决定系数，或相关系数（R-squared）
regr_slope(y,x)	返回回归曲线的斜率
regr_sxx(y,x)	返回 reg_count(y,x)*var_pop(x)
regr_sxy(y,x)	返回 reg_count(y,x)*covar_pop(y,x)
regr_syy(y,x)	返回 reg_count(y,x)*var_pop(y)

 select 
    prd_type_id,
    regr_avgx(&&y,&&x) as avgx,
    regr_avgy(&&y,&&x) as avgy，regr_count(&&y,&&x) as count，regr_intercept(&&y,&&x) as intercept，regr_r2(&&y,&&x) as r2，regr_slope(&&y,&&x) as solpe，regr_sxx(&&y,&&x) as sxx，regr_sxy(&&y,&&x) as sxy，regr_syy(&&y,&&x) as syy
from all_sales
where year = 2003
group by prd_type_id;

参数：y=amount , x=month
Oracle 11g 学习笔记 -- 分析函数

这方面实在是没有了解过，有点难以入脑；

假想评级及分布函数

假想评级与分布函数可以计算一条新的记录在表中的排名和百分比，而不用将其插入表中。下面这些函数可以与假想计算结合使用：rank(), dense_rank(), percent_rank()和 cume_dist()。

 select 
    prd_type_id, sum(amount),
    rank() over (order by sum(amount) desc) as rank, 
    percent_rank() over (order by sum(amount) desc) as rank
    from all_sales
    where year = 2003 and amount is not null
    group by prd_type_id
    order by sum(amount);
---------------------------------------
select 
    rank(&&amount) within group (order by sum(amount) desc) as rank,
    percent_rank(&&amount) within group (order by sum(amount) desc) as percent_rank
    from all_sales
    where year = 2003 and amount is not null
    group by prd_type_id
    order by prd_type_id;

Oracle 11g 学习笔记 -- 分析函数

参数：amount=500000
Oracle 11g 学习笔记 -- 分析函数

通过这种方式可以估计出排名，这就是假想评级

本文永久更新链接地址：http://www.linuxidc.com/Linux/2016-10/136113.htm

Oracle 11g学习笔记–分析函数

Oracle 11g 学习笔记–分析函数

评级函数

反百分点函数

报表函数

延迟与领先函数

首函数与末函数

线性回归函数

假想评级及分布函数

申请腾讯混元的API Key并且使用LobeChat调用混元AI

Docker部署搭建一个开源强大的图书管理系统

基于Docker快速搭建一个开源的IT人员在线工具箱-it-tools

让每个人都可以轻松使用Git-腾讯自研Git客户端

使用Docker部署开源的WPS-Office

HTTP协议详解

Bash 的路径补全功能

腾讯云服务器官网首页txyfwq（腾讯云服务器首字母）

Linux top 命令快速入门

fold 命令入门学习

	select emp_id, sum(amount),
	rank() over (order by sum(amount) desc [{nulls last]\|nulls first}]) as rank,
	dense_rank() over (order by sum(amount) desc) as dense_rank
	from all_sales
	group by emp_id;

	select
	prd_type_id, sum(amount),
	cume_dist() over (order by sum(amount) desc) as cume_dist,
	percent_rank() over (order by sum(amount) desc) as percent_rank
	from all_sales
	where year = 2003
	group by prd_type_id
	order by sum(amount) desc;

	select prd_type_id, sum(amount),
	ntile(3) over(order by sum(amount) desc) as ntile
	from all_sales
	where year = 2003
	and amount is not null
	group by prd_type_id
	order by sum(amount) desc;

	select
	percentile_cont(0.6) within group (order by sum(amount) desc) as percentile_count,
	percentile_disc(0.5) within group (order by sum(amount) desc) as percentile_disc
	from all_sales
	where year = 2003 and amount is not null
	group by prd_type_id;

	select
	month, sum(amount) as m_amount,
	sum(sum(amount)) over (order by month rows between unbounded preceding and current row) as cumulative_amount

	from all_sales
	where year = 2003
	group by month
	order by month;

	select
	month as month, sum(amount) as month_amount,
	sum(sum(amount)) over (order by month rows between unbounded preceding and current row) as sum,
	avg(sum(amount)) over (order by month rows between unbounded preceding and current row) as avg
	from all_sales
	where year = 2003 and month between 6
	and 12
	group by month
	order by month;

	select
	month, sum(amount) as month_amount,
	sum(sum(amount)) over (order by month rows between 1 preceding and 1 following) as moving_sum,
	avg(sum(amount)) over (order by month rows between 1 preceding and 1 following) as moving_average
	from all_sales
	where year = 2003
	group by month
	order by month;

	select
	month, prd_type_id,
	sum(sum(amount)) over (partition by month)
	as total_month_amount,
	sum(sum(amount)) over (partition by prd_type_id) as total_type_amount
	from all_sales
	where year = 2003 and month <= 3
	group by month, prd_type_id
	order by month, prd_type_id;

	select month,
	sum(amount) as prd_type_amount,
	ratio_to_report(sum(amount)) over (partition by month) as prd_type_ratio from all_sales
	where year = 2003 and month <= 3
	group by month, prd_type_id
	order by month;

	select month, sum(amount) as month_amount,
	lag(sum(amount), 2) over (order by month) as previous_month_amount,
	lead(sum(amount), 2) over (order by month) as next_month_amount
	from all_sales
	where year = 2003
	group by month
	order by month;

	select
	max(month) keep (dense_rank first order by sum(amount)) as highest_sales_month,
	min(month) keep (dense_rank last order by sum(amount))
	from all_sales
	where year = 2003
	group by month
	order by month;
	-------------------------------------
	select
	month, sum(amount)
	from all_sales
	group by month
	order by sum(amount);

	select
	prd_type_id,
	regr_avgx(&&y,&&x) as avgx,
	regr_avgy(&&y,&&x) as avgy，regr_count(&&y,&&x) as count，regr_intercept(&&y,&&x) as intercept，regr_r2(&&y,&&x) as r2，regr_slope(&&y,&&x) as solpe，regr_sxx(&&y,&&x) as sxx，regr_sxy(&&y,&&x) as sxy，regr_syy(&&y,&&x) as syy
	from all_sales
	where year = 2003
	group by prd_type_id;

	select
	prd_type_id, sum(amount),
	rank() over (order by sum(amount) desc) as rank,
	percent_rank() over (order by sum(amount) desc) as rank
	from all_sales
	where year = 2003 and amount is not null
	group by prd_type_id
	order by sum(amount);
	---------------------------------------
	select
	rank(&&amount) within group (order by sum(amount) desc) as rank,
	percent_rank(&&amount) within group (order by sum(amount) desc) as percent_rank
	from all_sales
	where year = 2003 and amount is not null
	group by prd_type_id
	order by prd_type_id;