Hey, this is clever, this one's really clever (lottery ticket hypothesis)

Moderator: huangchong

huangchong (净坛使者), original poster
Forum veteran
Outstanding Moderator, 2023-24
Post interactions: 4076
Posts: 60751
Registered: July 22, 2022, 01:22

#1 Hey, this is clever, this one's really clever (lottery ticket hypothesis)

Post by huangchong (净坛使者), original poster »

Last edited by huangchong on March 30, 2025, 09:03.
Reason: none given
ɓuoɥɔɓuɐnɥ(poɓᴉuɯO pǝʇɹǝʌuI)
Frozen
Post interactions: 127
Posts: 1352
Registered: September 27, 2024, 23:57

#2 Re: Hey, this is clever, this one's really clever (lottery ticket hypothesis)

Post by ɓuoɥɔɓuɐnɥ(poɓᴉuɯO pǝʇɹǝʌuI) »

"lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.

It's a bit like DeepSeek's expert models.
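
For anyone who wants to poke at the quoted idea, here is a toy sketch of one train, prune, reset round. Everything in it is an assumption made for illustration: the linear model, the synthetic data, and the helper names `train`, `magnitude_mask`, and `find_ticket` are hypothetical and come from neither the paper nor any library. It trains a dense model, keeps the largest-magnitude weights, resets the survivors to their initial values, and retrains only that subnetwork.

```python
import random


def train(weights, mask, data, lr=0.1, steps=500):
    """Gradient descent on mean squared error for a linear model.

    Weights whose mask entry is 0 are held at zero throughout.
    """
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / len(data)
        w = [(wi - lr * gi) * mi for wi, gi, mi in zip(w, grad, mask)]
    return w


def magnitude_mask(trained, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude trained weights."""
    n_keep = max(1, round(len(trained) * keep_fraction))
    ranked = sorted(range(len(trained)), key=lambda j: abs(trained[j]), reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if j in kept else 0 for j in range(len(trained))]


def find_ticket(init, data, keep_fraction):
    """Train dense, prune by magnitude, then reset survivors to their initial values."""
    dense = train(init, [1] * len(init), data)
    mask = magnitude_mask(dense, keep_fraction)
    ticket_init = [wi * mi for wi, mi in zip(init, mask)]
    return mask, ticket_init


# Synthetic regression problem (assumed for the demo): only 3 of 8 inputs matter.
rng = random.Random(0)
true_w = [2.0, 0.0, 0.0, -3.0, 0.0, 0.0, 0.0, 1.5]
data = []
for _ in range(64):
    x = [rng.uniform(-1, 1) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

init = [rng.gauss(0, 0.1) for _ in true_w]
mask, ticket_init = find_ticket(init, data, keep_fraction=3 / 8)

# Retrain only the subnetwork, starting from its original initial weights.
sparse = train(ticket_init, mask, data)
print("mask:", mask)
print("sparse weights:", [round(wi, 3) for wi in sparse])
```

This only shows the mechanics. The hypothesis itself is a claim about real networks, where the test is whether the retrained subnetwork matches the dense network's accuracy in a similar number of iterations.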
¡qooq ƃᴉq ɐ ǝɹɐ no⅄

#3 Re: Hey, this is clever, this one's really clever (lottery ticket hypothesis)

Post by huangchong (净坛使者), original poster »

ɓuoɥɔɓuɐnɥ wrote (March 30, 2025, 13:03): "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.

It's a bit like DeepSeek's expert models.
good point