
More compute -> more precision is just one field's definition of scalable... Saying that DNNs can't get better just by adding GPUs is like complaining that an apple isn't very orange.

To generalize notions of scaling, you need to look at the economics of consumed resources and generated utility, and you haven't begun to make the argument that data acquisition and PhD student time hasn't created ROI, or that ROI on those activities hasn't grown over time.

Data acquisition and labeling is getting cheaper all the time for many applications. Plus, new architectures give ways to do transfer learning or encode domain bias that let you specialize a model with less new data. There is substantial progress and already good returns on these types of scalability which (unlike returns on more GPUs) influence ML economics.



OK, the definition of scalable is crucial here and it causes a lot of trouble (this is also a response to several other posts, so forgive me if I don't address your points exactly).

Let me try once again: an algorithm is scalable if it can process bigger instances by adding more compute power.

E.g. I take a small perceptron and train it on a Pentium 100, then take a perceptron with 10x the parameters on a Core i7 and get better output by some monotonic function of the increase in instance size (it is typically a sublinear function, but that is fine as long as it is not logarithmic).
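To make that concrete, here is a minimal sketch (illustrative code, not anyone's actual setup) of scaling in that classical sense: the exact same training loop handles a 10x bigger perceptron on a 10x bigger instance, and the only thing that changes is the size arguments.

    import numpy as np

    def train_perceptron(n_features, n_samples, epochs=10, seed=0):
        rng = np.random.default_rng(seed)
        # Synthetic linearly separable task whose size scales with the arguments.
        true_w = rng.normal(size=n_features)
        X = rng.normal(size=(n_samples, n_features))
        y = np.sign(X @ true_w)

        w = np.zeros(n_features)
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (xi @ w) <= 0:   # classic perceptron update on mistakes
                    w += yi * xi
        return np.mean(np.sign(X @ w) == y)   # training accuracy

    # "Pentium 100"-sized instance vs. a 10x bigger one on a bigger machine:
    print(train_perceptron(n_features=100, n_samples=1_000))
    print(train_perceptron(n_features=1_000, n_samples=10_000))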

DL does not have that property. It requires modifying the algorithm, modifying the task at hand, and so on. And it is not a matter of tiny tweaks; it requires quite a bit of tweaking. If you need a scientific paper to build a bigger instance of your algorithm, that algorithm is not scalable.

What many people here are talking about is whether an instance of the algorithm can be created (by great human effort) in a very specific domain to saturate a given large compute resource. And yes, in that sense deep learning can show some success in very limited domains: domains where there happens to be a boatload of data, particularly labeled data.

But you see, there is a subtle difference here, similar in some sense to the difference between Amdahl's law and Gustafson's law (though not literally).
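For readers who don't have the two laws in mind, here is the textbook contrast the analogy leans on (standard formulas, nothing specific to deep learning): Amdahl's law assumes a fixed problem size, so the speedup from more processors saturates; Gustafson's law lets the problem grow with the machine, so the speedup keeps growing.

    def amdahl_speedup(p, n):
        # p = parallelizable fraction, n = processors; fixed problem size.
        return 1.0 / ((1.0 - p) + p / n)

    def gustafson_speedup(p, n):
        # p = parallel fraction of the scaled-up workload; problem grows with n.
        return (1.0 - p) + p * n

    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 1), round(gustafson_speedup(0.95, n), 1))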

The way many people (including investors) understand deep learning is this: you build model A, show it a bunch of pictures, and it understands something from them. Then you buy 10x more GPUs, build model B that is 10x bigger, show it those same pictures, and it understands 10x more from them. Look, I and many people here understand this is totally naive. But believe me, I have talked to many people with big $ who have exactly that level of understanding.


I appreciate the engagement in making this argument more concrete. I understand that you are talking about returns on compute power.

However, your last paragraph about how investors view deep learning does not describe anyone in the community of academics, practitioners and investors that I know. People understand that the limiting inputs to improved performance are data, followed closely by PhD labor. Compute power is relevant mainly because it shortens the feedback loop on that PhD labor, making it more efficient.

Folks investing in AI believe the returns are worth it due to the potential to scale deployment, not (primarily) training. They may be wrong, but this is a straw man definition of scalability that doesn't contribute to that thesis.


You’re arguing around the point here.

Almost all research domains live on a log curve; a little bit gets you a lot to start with, but eventually you exhaust the easy solutions and a lot of work gets you very little improvement.
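Purely illustrative numbers for that shape: if a field's performance grows roughly like log(effort), each doubling of effort buys the same absolute gain, so the return per unit of effort keeps shrinking.

    import math

    for effort in (1, 2, 4, 8, 16, 32):
        performance = math.log2(effort)   # toy "log curve" for a research field
        print(f"effort={effort:2d}  performance={performance:.1f}  return/effort={performance/effort:.2f}")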

You’re arguing we haven’t reached the plateau at the top yet, but you’ve offered no meaningful evidence that this is the case.

There are real world indicators that we are reaching diminishing returns for investment in compute and research now.

The ‘winter’ becomes a thing when those bets don’t work out and it becomes apparent to investors that they were based on nothing more concrete than opinions like yours.

Are we there yet? Not sure, myself; I think we can get some more wins from machine-generated architectures... but I can’t see any indication that the ‘winter’ isn’t coming sooner or later.

Investment is massively outstripping returns right now... we’ll just have to see if that calms down gradually, or pops suddenly.

History does not have a good story to tell about responsible investors behaving in a reasonable manner and avoiding crashes.



