[Submitted on 3 Mar 2025]
Abstract: Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign overfitting, double descent, and the success of overparametrization. We argue that these phenomena are not distinct to neural networks, or particularly mysterious.
10 Comments
cgdl
Agreed, but PAC-Bayes and other descendants of VC theory are probably not the best explanation. The notion of algorithmic stability provides a (much) more compelling explanation. See [1] (particularly Sections 11 and 12).
[1] https://arxiv.org/abs/2203.10036
TechDebtDevin
Anyone who wants to demystify ML should read The StatQuest Illustrated Guide to Machine Learning [0] by Josh Starmer.
To this day I haven't found a teacher who can express complex ideas as clearly and concisely as Starmer does. It's written in an almost children's-book-like format that is very easy to read and understand. He has also just published a book on neural networks that is just as good. Highly recommended even if you are already an expert, as it will give you great ways to teach and communicate complex ideas in ML.
[0]: https://www.goodreads.com/book/show/75622146-the-statquest-i…
getnormality
> rather than restricting the hypothesis space to avoid overfitting, embrace a flexible hypothesis space, with a soft preference for simpler solutions that are consistent with the data. This principle can be encoded in many model classes, and thus deep learning is not as mysterious or different from other model classes as it might seem.
How does deep learning do this? The last time I was deeply involved in machine learning, we used a penalized-likelihood approach. To find a good model for the data, you would optimize a cost function over model space, where the cost function was the sum of two terms: one quantifying the difference between model predictions and data, and the other quantifying the model's complexity. This framework encodes exactly a "soft preference for simpler solutions that are consistent with the data," but is that how deep learning works? I had the impression that the way complexity is penalized in deep learning was less straightforward than that.
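For concreteness, the penalized-likelihood setup I mean looks like this; a minimal PyTorch sketch, where the architecture and the penalty strength lam are illustrative choices, not anything from the paper:

    import torch
    import torch.nn as nn

    # Toy data: y = 3x + noise
    x = torch.linspace(-1, 1, 64).unsqueeze(1)
    y = 3 * x + 0.1 * torch.randn_like(x)

    model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    lam = 1e-3  # complexity penalty strength (illustrative)

    for _ in range(200):
        opt.zero_grad()
        data_fit = ((model(x) - y) ** 2).mean()  # misfit between predictions and data
        complexity = sum((p ** 2).sum() for p in model.parameters())  # L2 complexity term
        loss = data_fit + lam * complexity  # the two-term penalized objective
        loss.backward()
        opt.step()

(As I understand it, deep learning does use exactly this term whenever people enable weight decay, e.g. the weight_decay argument of the standard optimizers; the less straightforward part is that much of the preference for simple solutions is implicit in the architecture and the optimizer rather than written into the cost function.)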
inciampati
An interesting example in which "deep" networks are necessary is discussed in this fascinating and popular recent paper on RNNs [1]. Even though the minGRU and minLSTM models it proposes don't explicitly model ordered state dependencies, they can learn them as long as they are deep enough (depth >= 3):
> Instead of explicitly modelling dependencies on previous states to capture long-range dependencies, these kinds of recurrent models can learn them by stacking multiple layers.
[1] https://arxiv.org/abs/2410.01201
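For intuition, the minGRU recurrence is tiny: the gate and the candidate state depend only on the current input, never on the previous hidden state. A rough numpy sketch of the recurrence and the stacking, where the dimensions and random initialization are illustrative, and the sequential loop stands in for the parallel scan the paper actually uses:

    import numpy as np

    def sigmoid(a):
        return 1 / (1 + np.exp(-a))

    def min_gru_layer(xs, Wz, Wh):
        """One minGRU layer: z_t and h~_t depend only on x_t, not on h_{t-1}."""
        h = np.zeros(Wh.shape[0])
        hs = []
        for x in xs:
            z = sigmoid(Wz @ x)              # update gate, from the input alone
            h_tilde = Wh @ x                 # candidate state, from the input alone
            h = (1 - z) * h + z * h_tilde    # blend old state with candidate
            hs.append(h)
        return np.array(hs)

    # Stacking: each layer's hidden sequence is the next layer's input.
    rng = np.random.default_rng(0)
    d = 8
    xs = rng.normal(size=(16, d))  # toy sequence of length 16
    for _ in range(3):             # "deep >= 3"
        Wz, Wh = rng.normal(size=(2, d, d)) * 0.5
        xs = min_gru_layer(xs, Wz, Wh)

Within a single layer, h_t is just an input-driven moving average, so one layer alone can't condition on what happened earlier; with stacking, later layers gate on summaries that earlier layers computed across time, which is where the cross-timestep dependencies come from.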
YesBox
I wish I had the time to try this:
1.) Grab many GBs of text (books, etc).
2.) For each word, for each of the next $N words, increment a count for that (word pair, distance) combination.
3.) For each word, store the most frequent following word at each distance up to $N. [a]
4.) Create a prediction algorithm that determines the next word (or set of words) to output from any user input. Basically this would look up the word-pair/distance counts and pick the most probable next word(s).
How close would this be to GPT-2? (A rough sketch of steps 2-4 follows after the footnote.)
[a] You could go one step further and store multiple words for each distance, ordered by frequency
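Here's a rough sketch of steps 2 through 4, assuming whitespace tokenization and greedy decoding; all names and the toy corpus are illustrative:

    from collections import Counter, defaultdict

    N = 4  # how many positions ahead to record

    def build_table(text):
        """Steps 2-3: count (word, distance) -> following-word frequencies."""
        words = text.lower().split()
        table = defaultdict(Counter)
        for i, w in enumerate(words):
            for d in range(1, N + 1):
                if i + d < len(words):
                    table[(w, d)][words[i + d]] += 1
        return table

    def predict(table, prompt, length=10):
        """Step 4: score candidates by summing pair/distance counts."""
        out = prompt.lower().split()
        for _ in range(length):
            scores = Counter()
            for d in range(1, N + 1):
                if d <= len(out):
                    scores.update(table[(out[-d], d)])  # votes from the word d positions back
            if not scores:
                break
            out.append(scores.most_common(1)[0][0])
        return " ".join(out)

    table = build_table("the cat sat on the mat the cat ate the rat")
    print(predict(table, "the cat"))

As for GPT-2: this is essentially a positional n-gram counting model. It can reproduce local phrasing from the corpus, but with no learned representations it can't generalize to word combinations it never saw, which is a large part of what GPT-2's learned embeddings and attention buy you.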
EncomLab
The implication that any software is "mysterious" is problematic; there is no "woo" here. The exact state of the machine running the software can be determined at every cycle: the exact instruction and the data it executed with can be precisely determined, as can the next instruction. The entire mythos of any software being a "black box" is just so much advertising jargon, perpetuated by tech bros who want to believe they are part of some self-styled, Mr. Robot priestly class.
rottc0dd
If anyone wants to get into machine learning, one of the best resources I have found is Stanford's "Probability for Computer Scientists" (https://www.youtube.com/watch?v=2MuDZIAzBMY&list=PLoROMvodv4…).
It digs into the theoretical underpinnings of probability theory and ML, IMO better than any other course I have seen. (Yeah, Andrew Ng is legendary, but his course demands some mathematical familiarity with linear algebra.)
And of course, for deep learning, 3b1b is great for a visual introduction (https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQ…).
buffalobuffalo
When I was first getting into Deep Learning, learning the proof of the universal approximation theorem helped a lot. Once you understand why neural networks are able to approximate functions, it makes everything built on top of them much easier to understand.
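One way to see the theorem in action without any training: fix a wide layer of random tanh features and fit only the output layer by least squares. A minimal numpy sketch; the width, scales, and target function are arbitrary choices, and this random-features shortcut is just an illustration, not how deep networks are actually trained:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(3 * x)  # continuous target function to approximate

    width = 500                          # one wide hidden layer
    W = rng.normal(size=(1, width)) * 3  # random, fixed hidden weights
    b = rng.normal(size=width) * 3
    H = np.tanh(x @ W + b)               # hidden activations, shape (200, width)

    # Solve for the output weights alone by least squares.
    v, *_ = np.linalg.lstsq(H, y, rcond=None)
    print("max approximation error:", np.abs(H @ v - y).max())

The theorem guarantees that a wide enough single hidden layer can approximate any continuous function on a compact set; it says nothing about how training finds that approximation, which is a separate story.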
talles
Correct me if I'm wrong, but an artificial neuron is just good old linear regression followed by an activation function to make it nonlinear. Make a network out of them and cool stuff happens.
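In code, a single neuron really is just that; a tiny numpy sketch (weights and activation chosen arbitrarily):

    import numpy as np

    def neuron(x, w, b):
        """Affine map (the linear-regression part) followed by a nonlinearity."""
        return np.tanh(w @ x + b)

    x = np.array([0.5, -1.0, 2.0])  # inputs
    w = np.array([0.1, 0.4, -0.2])  # weights
    print(neuron(x, w, b=0.3))

Strictly speaking, it's the same affine form as linear regression, but the weights are learned by backpropagation through the whole network rather than fit in closed form.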