New AI algorithm paper: what does everyone think?

StMichael · Post by **StMichael (OP)** » May 1, 2024, 18:14

What do you all think, is this a new breakthrough?

https://arxiv.org/html/2404.19756v1

Abstract

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes (“neurons”), KANs have learnable activation functions on edges (“weights”). KANs have no linear weights at all – every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful “collaborators” helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today’s deep learning models which rely heavily on MLPs.
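To make the contrast in the abstract concrete, here is a toy sketch of one MLP layer next to one KAN-style layer. It is only an illustration under simplifying assumptions: the edge functions are fixed piecewise-linear curves rather than trained splines, and the names `piecewise_linear`, `mlp_layer` and `kan_layer` are made up for this example, not taken from the paper's code.

```python
def piecewise_linear(values, lo, hi):
    """Return a 1-D function that linearly interpolates `values`
    on a uniform grid over [lo, hi] (inputs are clamped to that range)."""
    n = len(values) - 1
    def phi(x):
        t = min(max((x - lo) / (hi - lo), 0.0), 1.0) * n
        j = min(int(t), n - 1)
        return values[j] + (t - j) * (values[j + 1] - values[j])
    return phi


def mlp_layer(x, weights, biases):
    """MLP: a scalar weight on every edge, one fixed activation (ReLU) on every node."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def kan_layer(x, edge_fns):
    """KAN-style: a univariate function on every edge, plain summation on every node."""
    return [sum(phi(xi) for phi, xi in zip(row, x)) for row in edge_fns]


# Assumption for the demo: an edge function that samples x**2 at 5 grid points on [-1, 1].
square_like = piecewise_linear([1.0, 0.25, 0.0, 0.25, 1.0], -1.0, 1.0)

x = [0.5, -1.0]
print(mlp_layer(x, weights=[[1.0, 1.0]], biases=[0.0]))
print(kan_layer(x, edge_fns=[[square_like, square_like]]))
```

In a real KAN the grid values (or spline coefficients) on each edge would be the trainable parameters; here they are hard-coded just to show where the nonlinearity lives.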

StMichael · Post by **StMichael (OP)** » May 1, 2024, 18:16

The first author, Ziming Liu (刘子鸣), graduated from the physics department at Peking University.

FoxMe · Post by **FoxMe** » May 2, 2024, 13:43

I forgot that the US still has to rely on decades-old Soviet antiques. America should pass a law banning the use of Soviet scientific results.

The Kolmogorov–Arnold representation theorem (or superposition theorem) states that every multivariate continuous function can be represented as a superposition of continuous functions of one variable and the two-argument operation of addition.
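Written out, the usual statement of the theorem for a continuous $f:[0,1]^n \to \mathbb{R}$ looks like this. The symbols $\Phi_q$ (outer functions) and $\phi_{q,p}$ (inner functions) are the conventional names, all of them continuous functions of a single variable:

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

So the only genuinely multivariate operation involved is addition; everything else is a function of one variable.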

TheMatrix · Post by **TheMatrix** » May 2, 2024, 14:00

StMichael wrote: ↑May 1, 2024, 18:14 What do you all think, is this a new breakthrough?

https://arxiv.org/html/2404.19756v1

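I read it. I think it's quite good, and the paper is very clearly written.

Fitting curves with splines feels very efficient. I think it might reduce how much back propagation is needed. If there is a difference between human intelligence and deep learning intelligence, I think it lies in back propagation: humans don't do that much of it.

I looked into splines a bit, and what matters most is the control points. Once a few points are fixed, the shape of the curve is roughly fixed too. The number of control points seems to correspond to the grid in the paper. So in the paper, making the grid finer seems to help even more than making the network deeper or wider.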
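As a rough illustration of the control point and grid relationship, here is a minimal pure-Python B-spline sketch. It assumes a uniform knot vector extended by `k` knots on each side, under which a degree-`k` spline on `grid` intervals has `grid + k` coefficients. The helper names are made up for this example and are not the paper's implementation.

```python
def bspline_basis(i, k, knots, x):
    """Cox-de Boor recursion: value at x of the i-th B-spline basis function of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0 if left_den == 0 else (
        (x - knots[i]) / left_den * bspline_basis(i, k - 1, knots, x))
    right = 0.0 if right_den == 0 else (
        (knots[i + k + 1] - x) / right_den * bspline_basis(i + 1, k - 1, knots, x))
    return left + right


def make_knots(lo, hi, grid, k):
    """Uniform knots: `grid` intervals on [lo, hi], extended by k knots on each side."""
    h = (hi - lo) / grid
    return [lo + (j - k) * h for j in range(grid + 2 * k + 1)]


def spline(coeffs, k, knots, x):
    """Evaluate sum_i coeffs[i] * B_i(x); the coefficients act as the control points."""
    return sum(c * bspline_basis(i, k, knots, x) for i, c in enumerate(coeffs))


k = 3  # cubic splines
for grid in (3, 10):
    knots = make_knots(0.0, 1.0, grid, k)
    n_coeffs = len(knots) - k - 1          # equals grid + k
    coeffs = [float(i % 2) for i in range(n_coeffs)]  # arbitrary demo values
    print(grid, n_coeffs, spline(coeffs, k, knots, 0.5))
```

Refining the grid adds coefficients to each curve without adding any nodes or layers, which is one way to read the observation above.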

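Caravel · Post by **Caravel** » May 2, 2024, 15:26

TheMatrix wrote: ↑May 2, 2024, 14:00 I read it. I think it's quite good, and the paper is very clearly written.

Fitting curves with splines feels very efficient. I think it might reduce how much back propagation is needed. If there is a difference between human intelligence and deep learning intelligence, I think it lies in back propagation: humans don't do that much of it.

I looked into splines a bit, and what matters most is the control points. Once a few points are fixed, the shape of the curve is roughly fixed too. The number of control points seems to correspond to the grid in the paper. So in the paper, making the grid finer seems to help even more than making the network deeper or wider.

It's not surprising that this works better for function fitting, since ReLU's nonlinearity is fairly weak. Whether it can be as general-purpose as an MLP is doubtful, though.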

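hci · Post by **hci** » May 2, 2024, 15:45

These are all minor tweaks, not a breakthrough. Nothing fundamental has really changed.

Let me give you an example of what a breakthrough would be: active learning, meaning learning on the job and applying it right away. Deep learning can't do that. There are far too many other things it can't do, so all the talk about AGI is nonsense; it's still much too far off.

StMichael wrote: ↑May 1, 2024, 18:14 What do you all think, is this a new breakthrough?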

https://arxiv.org/html/2404.19756v1

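Caravel · Post by **Caravel** » May 2, 2024, 16:30

hci wrote: ↑May 2, 2024, 15:45 These are all minor tweaks, not a breakthrough. Nothing fundamental has really changed.

Let me give you an example of what a breakthrough would be: active learning, meaning learning on the job and applying it right away. Deep learning can't do that. There are far too many other things it can't do, so all the talk about AGI is nonsense; it's still much too far off.

Active learning isn't hard to do: just retrain some of the weights in the background. People need time for active learning too. Besides, in-context learning also counts as active learning, especially now that context windows are so long.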

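hci · Post by **hci** » May 2, 2024, 16:33

Go ahead and try retraining, then? LOL.

The context window may be long, but there is no learning at all; the weights don't change in the slightest.

Today's AI has only nature, no nurture. I'm the only one in the world saying this.

Caravel wrote: ↑May 2, 2024, 16:30 Active learning isn't hard to do: just retrain some of the weights in the background. People need time for active learning too. Besides, in-context learning also counts as active learning, especially now that context windows are so long.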

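Caravel · Post by **Caravel** » May 2, 2024, 16:38

hci wrote: ↑May 2, 2024, 16:33 Go ahead and try retraining, then? LOL.

The context window may be long, but there is no learning at all; the weights don't change in the slightest.

Today's AI has only nature, no nurture.

The weights don't change, but it will still honor the facts in the input, won't it?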

hci · Post by **hci** » May 2, 2024, 16:41

But that isn't learning.

Caravel wrote: ↑May 2, 2024, 16:38 The weights don't change, but it will still honor the facts in the input, won't it?

tootsie · Post by **tootsie** » May 2, 2024, 16:46

There aren't many people left who can learn on the job and apply it right away either. LOL

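hci · Post by **hci** » May 2, 2024, 16:54

Even children can do this. In fact, whatever the subject, people mostly either pick it up after being taught once or twice, or never learn it at all.

You people haven't studied psychology. There is too much you don't understand, and you take human abilities to be far simpler than they are.

As my slides say, today's AI is just behaviorism.

Behaviorism was driven out of psychology precisely because its explanatory power was too weak and there was too much it couldn't account for.

tootsie wrote: ↑May 2, 2024, 16:46 There aren't many people left who can learn on the job and apply it right away either. LOL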

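TheMatrix · Post by **TheMatrix** » May 2, 2024, 17:03

hci wrote: ↑May 2, 2024, 15:45 These are all minor tweaks, not a breakthrough. Nothing fundamental has really changed.

Let me give you an example of what a breakthrough would be: active learning, meaning learning on the job and applying it right away. Deep learning can't do that. There are far too many other things it can't do, so all the talk about AGI is nonsense; it's still much too far off.

I looked up active learning: having humans label the hardest examples. It's a nice idea. That's the main idea, right?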

hci · Post by **hci** » May 2, 2024, 17:04

What main idea?

TheMatrix wrote: ↑May 2, 2024, 17:03 I looked up active learning: having humans label the hardest examples. It's a nice idea. That's the main idea, right?

TheMatrix · Post by **TheMatrix** » May 2, 2024, 17:05

hci wrote: ↑May 2, 2024, 17:04 What main idea?

The main idea of active learning: having humans label the hardest examples.
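For what it's worth, that reading of active learning (the machine learning term, not learning on the job) can be sketched as uncertainty sampling: rank unlabeled examples by how unsure the model is and send the top few to a human. The probabilities below are invented for illustration, and `pick_for_labeling` is a hypothetical helper, not any library's API.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher means less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def pick_for_labeling(pool_probs, budget):
    """Return the ids of the `budget` unlabeled examples the model is least sure about."""
    ranked = sorted(pool_probs, key=lambda ex_id: entropy(pool_probs[ex_id]), reverse=True)
    return ranked[:budget]


# Hypothetical model outputs on four unlabeled examples (two classes each).
pool_probs = {
    "a": [0.99, 0.01],
    "b": [0.55, 0.45],
    "c": [0.80, 0.20],
    "d": [0.50, 0.50],
}

to_label = pick_for_labeling(pool_probs, budget=2)
print(to_label)  # these go to the human annotator; the model is then retrained
```

Entropy is just one possible "hardness" score; margin or least-confidence scores would slot into the same loop.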

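ccmath · Post by **ccmath** » May 2, 2024, 17:13

Any paper that has to drag in a dead person's name is basically borrowing prestige to dress itself up.

Plenty of people want to bask in Kolmogorov's glory, and it's mostly useless. Take the even more intimidating-sounding Kolmogorov complexity, for example.

hci wrote: ↑May 2, 2024, 15:45 These are all minor tweaks, not a breakthrough. Nothing fundamental has really changed.

Let me give you an example of what a breakthrough would be: active learning, meaning learning on the job and applying it right away. Deep learning can't do that. There are far too many other things it can't do, so all the talk about AGI is nonsense; it's still much too far off.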

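forecasting · Post by **forecasting** » May 6, 2024, 12:04

ccmath wrote: ↑May 2, 2024, 17:13 Any paper that has to drag in a dead person's name is basically borrowing prestige to dress itself up.

Plenty of people want to bask in Kolmogorov's glory, and it's mostly useless. Take the even more intimidating-sounding Kolmogorov complexity, for example.

There's nothing intimidating about Kolmogorov complexity. KC is so general that it can encompass almost anything, so when no more specific theory is available, people reach for KC as an explanation. KC is in fact uncomputable, yet it covers everything, so it ends up looking as if it explains things.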
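A small sketch of what "uncomputable" means in practice: no program computes KC exactly, but the output length of any compressor gives a computable upper bound (up to an additive constant for the decompressor). Using `zlib` as the compressor is just a convenient assumption here; it is a crude stand-in, not a definition of KC.

```python
import os
import zlib


def compressed_len(data: bytes) -> int:
    """Byte length of the zlib-compressed data: a rough upper-bound proxy for K(data)."""
    return len(zlib.compress(data, 9))


regular = b"ab" * 5000        # 10,000 bytes with an obvious short description
random_ = os.urandom(10000)   # 10,000 bytes that almost surely have no short description

print(len(regular), compressed_len(regular))
print(len(random_), compressed_len(random_))
```

The bound can only ever be tightened by a better compressor; there is no way to certify that a given string has no shorter description, which is where the uncomputability shows up.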

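ccmath · Post by **ccmath** » May 6, 2024, 12:09

A quantity that is uncomputable and covers everything: doesn't that start to feel a bit like mysticism?

Researchers often pull this kind of feint and then say that what they have is a low-order approximation.

forecasting wrote: ↑May 6, 2024, 12:04 There's nothing intimidating about Kolmogorov complexity. KC is so general that it can encompass almost anything, so when no more specific theory is available, people reach for KC as an explanation. KC is in fact uncomputable, yet it covers everything, so it ends up looking as if it explains things.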

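forecasting · Post by **forecasting** » May 6, 2024, 18:38

ccmath wrote: ↑May 6, 2024, 12:09 A quantity that is uncomputable and covers everything: doesn't that start to feel a bit like mysticism?

Researchers often pull this kind of feint and then say that what they have is a low-order approximation.

That Ilya is exactly like that. I asked whether there was a concrete implementation; there wasn't, it was just talk.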
