SQL编程了
版主: Softfist
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
SQL编程了
这看起来是个经典问题,用其他语言很容易编,但是我是在SQL里面碰到这个问题,也想挑战一下SQL编程。
有一个table,叫Account吧,它有两个columns:
AccountID int primary key,
AccountSize int
比如
AccountID, AccountSize
1, 120
2, 157
3, 196
4, 529
....
给定一个batch_size比如50000吧,把AccountID按顺序赋予BatchOrder,要求每填满一个batch之后(停在第一个多于batch_size的地方,另一个问题是停在最后一个少于batch_size的地方,这个可能简单一点),就赋予一个新的连续的BatchOrder。
最后输出是这样:
AccountID, AccountSize, BatchOrder
1, 120, 1
2, 157, 1
....
87, 324, 2
88, 426, 2
....
有一个table,叫Account吧,它有两个columns:
AccountID int primary key,
AccountSize int
比如
AccountID, AccountSize
1, 120
2, 157
3, 196
4, 529
....
给定一个batch_size比如50000吧,把AccountID按顺序赋予BatchOrder,要求每填满一个batch之后(停在第一个多于batch_size的地方,另一个问题是停在最后一个少于batch_size的地方,这个可能简单一点),就赋予一个新的连续的BatchOrder。
最后输出是这样:
AccountID, AccountSize, BatchOrder
1, 120, 1
2, 157, 1
....
87, 324, 2
88, 426, 2
....
-
- 论坛支柱
CanGuanGong 的博客 - 帖子互动: 387
- 帖子: 8860
- 注册时间: 2022年 7月 23日 18:05
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
select AccountID, AccountSize, count(AccountSize) from Account group by AccountID, AccountSize order by AccountID, count(AccountSize)
应该可以了吧。
应该可以了吧。
逼将有三种:老逼将,小逼将,装逼将
消灭买办是唯一出路
所谓女性解放,就是人类走向灭亡的开端
哈,狗子急了
消灭买办是唯一出路
所谓女性解放,就是人类走向灭亡的开端
哈,狗子急了
Re: SQL编程了
用row_number() / batch size 取整即可
TheMatrix 写了: 2022年 7月 29日 15:44 这看起来是个经典问题,用其他语言很容易编,但是我是在SQL里面碰到这个问题,也想挑战一下SQL编程。
有一个table,叫Account吧,它有两个columns:
AccountID int primary key,
AccountSize int
比如
AccountID, AccountSize
1, 120
2, 157
3, 196
4, 529
....
给定一个batch_size比如50000吧,把AccountID按顺序赋予BatchOrder,要求每填满一个batch之后(停在第一个多于batch_size的地方,另一个问题是停在最后一个少于batch_size的地方,这个可能简单一点),就赋予一个新的连续的BatchOrder。
最后输出是这样:
AccountID, AccountSize, BatchOrder
1, 120, 1
2, 157, 1
....
87, 324, 2
88, 426, 2
....
Re: SQL编程了
那就sum(account size) over (order by account id, account size) as running_total,然后running_total/batch size取整
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
我就是这么做的。这是近似做法。实际问题中这就差不多了。但是要想挑战编程的话,这个是不正确的。pda 写了: 2022年 7月 29日 18:34 那就sum(account size) over (order by account id, account size) as running_total,然后running_total/batch size取整
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
这个应该是必须要用recursive CTE的。我下面这个SQL还不对,原因是analytic function在recursive CTE中不能正常工作。所以这个问题目前还是无解。
with
account_batch as (
select AccountID, AccountSize
, 1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then 1 else 0 end BatchOrder
from Account
union all
select AccountID, AccountSize
, iteration+1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then iteration+1 else 0 end BatchOrder
from account_batch ab
where BatchOrder = 0
)
select AccountID, AccountSize, BatchOrder
from account_batch
where BatchOrder != 0
order by AccountID
with
account_batch as (
select AccountID, AccountSize
, 1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then 1 else 0 end BatchOrder
from Account
union all
select AccountID, AccountSize
, iteration+1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then iteration+1 else 0 end BatchOrder
from account_batch ab
where BatchOrder = 0
)
select AccountID, AccountSize, BatchOrder
from account_batch
where BatchOrder != 0
order by AccountID
Re: SQL编程了
用awk应该不需要超过十行吧
TheMatrix 写了: 2022年 7月 30日 09:55 这个应该是必须要用recursive CTE的。我下面这个SQL还不对,原因是analytic function在recursive CTE中不能正常工作。所以这个问题目前还是无解。
⚪︎ 辟邪剑法七折大酬宾~~
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
我发现编程就是在限定工具条件下解决问题,当然要已知算法,未知算法的是另外一个问题。
SQL属于工具条件比较简陋,所以用SQL编是一个挑战。
这个问题用函数式编程也不是特别的容易,因为函数式编程的限定条件也比较多。
这个问题本质上就是procedural编程,不用procedural就要发明各种特殊工具。
比如函数式编程里的reduce,还有SQL中的analytic function。
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
代码: 全选
with acct as (
select 1 as AccountId, 34 as AccountSize
union
select 2 as AccountId, 6 as AccountSize
union
select 3 as AccountId, 7 as AccountSize
union
select 4 as AccountId, 60 as Accountsize
union
select 5 as AccountId, 4 as AccountSize
union
select 6 as AccountId, 23 as AccountSize
union
select 7 as AccountId, 33 as AccountSize
union
select 8 as AccountId, 10 as AccountSize
union
select 9 as AccountId, 14 as AccountSize
union
select 10 as AccountId, 30 as AccountSize
union
select 11 as AccountId, 8 as AccountSize
union
select 12 as AccountId, 17 as AccountSize
),
acct_join as (
select a.AccountId as IdStart,b.AccountId as IdEnd
from acct a, acct b
where a.AccountId<=b.AccountId
),
acct_sum as (
select IdStart, IdEnd,(
select Sum(AccountSize)
from acct where Account Id>=IdStart and AccountId<=IdEnd
) as SumSize
from acct_join
),
acct_sum_no_larger as (
select *
from acct_sum a
where not exists (
select *
from acct_sumb
where a.IdStart=b.IdStart and b.SumSize<a.SumSize and b.SumSize>=50
)
),
acct_sum_no_smaller as (
select *
from acct_sum_no_larger a
where not exists (
select *
from acct_sum_no_larger b
where a.IdStart=b.Idstart and b.SumSize>a.SumSize
)
),
acct_batch as (
select IdStart, IdEnd, 1 as BatchNum
from acct_sum_no_smaller
where IdStart=1
union all
select b.IdStart, b.IdEnd, BatchNum+1 as BatchNum
from acct batch a join acct_sum_no_smaller b on a.IdEnd+1=b.IdStart
)
select * from acct_batch
Re: SQL编程了
recursive cte是正解
declare @batch_size int = 10;
with act as
(
select AccountId, AccountSize, convert(int, row_number() over (order by AccountId, AccountSize)) as rn, 0 as run_total
from account
)
,
cte as
(
select 0 as AccountId, 0 as AccountSize, 0 as rn, 0 as run_total, 0 as batch
union all
select act.AccountId, act.AccountSize, act.rn,
case when cte.run_total + act.AccountSize > @batch_size then act.AccountSize else cte.run_total + act.AccountSize end as run_total,
case when cte.run_total + act.AccountSize > @batch_size then batch + 1 else batch end as batch
from act join cte on cte.rn = act.rn - 1
)
select * from cte
where
accountid > 0
AccountId AccountSize
1 2
2 4
1 3
3 6
5 8
4 2
7 10
9 7
AccountId AccountSize rn run_total batch
1 2 1 2 0
1 3 2 5 0
2 4 3 9 0
3 6 4 6 1
4 2 5 8 1
5 8 6 8 2
7 10 7 10 3
9 7 8 7 4
declare @batch_size int = 10;
with act as
(
select AccountId, AccountSize, convert(int, row_number() over (order by AccountId, AccountSize)) as rn, 0 as run_total
from account
)
,
cte as
(
select 0 as AccountId, 0 as AccountSize, 0 as rn, 0 as run_total, 0 as batch
union all
select act.AccountId, act.AccountSize, act.rn,
case when cte.run_total + act.AccountSize > @batch_size then act.AccountSize else cte.run_total + act.AccountSize end as run_total,
case when cte.run_total + act.AccountSize > @batch_size then batch + 1 else batch end as batch
from act join cte on cte.rn = act.rn - 1
)
select * from cte
where
accountid > 0
AccountId AccountSize
1 2
2 4
1 3
3 6
5 8
4 2
7 10
9 7
AccountId AccountSize rn run_total batch
1 2 1 2 0
1 3 2 5 0
2 4 3 9 0
3 6 4 6 1
4 2 5 8 1
5 8 6 8 2
7 10 7 10 3
9 7 8 7 4
TheMatrix 写了: 2022年 7月 30日 09:55 这个应该是必须要用recursive CTE的。我下面这个SQL还不对,原因是analytic function在recursive CTE中不能正常工作。所以这个问题目前还是无解。
with
account_batch as (
select AccountID, AccountSize
, 1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then 1 else 0 end BatchOrder
from Account
union all
select AccountID, AccountSize
, iteration+1 iteration
, case when sum(AccountSize) over (order by AccountID) < 50000
then iteration+1 else 0 end BatchOrder
from account_batch ab
where BatchOrder = 0
)
select AccountID, AccountSize, BatchOrder
from account_batch
where BatchOrder != 0
order by AccountID
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
漂亮。连analytic function都没用。纯classical join。当然,recursive CTE还是必须得用的。
webbew 写了: 2022年 7月 30日 11:52代码: 全选
with acct as ( select 1 as AccountId, 34 as AccountSize union select 2 as AccountId, 6 as AccountSize union select 3 as AccountId, 7 as AccountSize union select 4 as AccountId, 60 as Accountsize union select 5 as AccountId, 4 as AccountSize union select 6 as AccountId, 23 as AccountSize union select 7 as AccountId, 33 as AccountSize union select 8 as AccountId, 10 as AccountSize union select 9 as AccountId, 14 as AccountSize union select 10 as AccountId, 30 as AccountSize union select 11 as AccountId, 8 as AccountSize union select 12 as AccountId, 17 as AccountSize ), acct_join as ( select a.AccountId as IdStart,b.AccountId as IdEnd from acct a, acct b where a.AccountId<=b.AccountId ), acct_sum as ( select IdStart, IdEnd,( select Sum(AccountSize) from acct where Account Id>=IdStart and AccountId<=IdEnd ) as SumSize from acct_join ), acct_sum_no_larger as ( select * from acct_sum a where not exists ( select * from acct_sumb where a.IdStart=b.IdStart and b.SumSize<a.SumSize and b.SumSize>=50 ) ), acct_sum_no_smaller as ( select * from acct_sum_no_larger a where not exists ( select * from acct_sum_no_larger b where a.IdStart=b.Idstart and b.SumSize>a.SumSize ) ), acct_batch as ( select IdStart, IdEnd, 1 as BatchNum from acct_sum_no_smaller where IdStart=1 union all select b.IdStart, b.IdEnd, BatchNum+1 as BatchNum from acct batch a join acct_sum_no_smaller b on a.IdEnd+1=b.IdStart ) select * from acct_batch
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
-
- 论坛支柱
2024年度优秀版主
TheMatrix 的博客 - 帖子互动: 264
- 帖子: 13231
- 注册时间: 2022年 7月 26日 00:35
Re: SQL编程了
webbew和pda的两种解法是本问题的两个经典解法。
webbew的解法纯用join,把解空间全部列出来,逐步变换缩小。这是SQL表处理的精髓。
pda的解法用SQL recursive CTE实现table逐行处理,也就是实现了函数式编程中的reduce,完成了SQL的图灵完备。
webbew的解法纯用join,把解空间全部列出来,逐步变换缩小。这是SQL表处理的精髓。
pda的解法用SQL recursive CTE实现table逐行处理,也就是实现了函数式编程中的reduce,完成了SQL的图灵完备。