
It sounds like you went through a similar process as the computer vision community did over the last couple of decades.

First, people wrote classifiers by hand, but that turned out to be tedious, unreliable, and something that had to be redone for each object you wanted to classify. Then they tried detecting objects by running local feature detectors and training a machine learning model to classify objects based on those features. This worked much better, but still made mistakes. Convolutional neural networks were already being used to classify small images of digits, but people were skeptical they would scale to larger images.

That was until AlexNet came along in 2012. Since then the performance of convolutional networks has improved every year. Now they can classify images with performance comparable to humans.
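To make the contrast concrete, here's a minimal sketch of the older pipeline described above: hand-chosen local features feeding a classical classifier. Everything in it (HOG features, the small digits dataset, a linear SVM) is just an illustrative assumption on my part; the point is only that the feature extraction step is designed by hand, whereas a CNN learns it from the data.

    # Classical pipeline: hand-designed local features + a learned classifier.
    # A CNN replaces the hog() step with learned convolutional filters.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC
    from skimage.feature import hog

    digits = load_digits()                      # small 8x8 digit images
    features = np.array([
        hog(img, orientations=9,
            pixels_per_cell=(4, 4),             # hand-chosen feature parameters
            cells_per_block=(2, 2))
        for img in digits.images
    ])

    X_train, X_test, y_train, y_test = train_test_split(
        features, digits.target, test_size=0.2, random_state=0)

    clf = LinearSVC().fit(X_train, y_train)     # classical ML model on top
    print("test accuracy:", clf.score(X_test, y_test))

Swap in a different feature detector or classifier and you have to re-tune everything by hand, which is exactly the pain point that end-to-end learned features removed.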



Am I wrong to see this as a bit scandalous for computer vision as a field before 2012? (It kind of seems like maybe a decade of research at the Berkeley CS department will be tossed out?)


Not entirely. In a lot of the fields where DNNs (or other ML techniques) have shown dramatic improvements of late, there are several reasons why the field didn't show improvement in the past.

A big reason is the tremendous increase in computing power available to researchers at low cost. Most of these improvements depend on compute-expensive training over lots of examples. In the past, the time to train a model or evaluate a situation would have been prohibitively long.
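As a rough illustration of how much the compute budget matters (every number below is an assumed order-of-magnitude guess, not a figure from this thread), training cost scales roughly as examples x cost per example x passes over the data, so the same job that takes days on a 2012-era GPU would have taken years on an older CPU:

    # Back-of-the-envelope training-time estimate (illustrative numbers only).
    examples       = 1.2e6       # assume an ImageNet-sized dataset
    flops_per_pass = 3 * 2e9     # assume ~2 GFLOPs forward, ~3x for fwd+bwd
    epochs         = 90

    total_flops = examples * flops_per_pass * epochs

    for name, sustained_flops in [("mid-2000s CPU", 1e10),   # ~10 GFLOP/s, assumed
                                  ("2012-era GPU", 1e12)]:   # ~1 TFLOP/s, assumed
        days = total_flops / sustained_flops / 86400
        print(f"{name}: ~{days:.0f} days of training")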

Another big reason is that datasets in a lot of these areas were fairly small, and the newer techniques tend to need a lot of data to train.

Another reason is that most previous researchers were focused on feature engineering, whereas modern techniques move the feature engineering into the ML system itself. That is a real conceptual change.

I don't really see it as "scandalous", in the sense that you couldn't reasonably expect people to have realized in advance that manual feature engineering wasn't the fastest way to get human-like results, that computers were going to get fast enough for these ML tasks, that it would become possible to train deep networks, or that having good shared datasets the whole community can train and evaluate against would be so valuable.


Yes, the speed of these GPUs is insane. Teraflops @ 200 Watts.


But not a petaflop @ 200W, which should be possible for deep learning. GPUs have driven a lot of progress in deep learning; I'll be excited to see what DPUs will do for it, assuming of course they have floating point (otherwise creative algorithm design will be a problem).


Commercial deep learning ASICs will definitely happen. The Nvidia stuff is in a way getting there, but going from 3,500 ops/tick to 35,000 ops/tick at the same power consumption will most likely require more than a merely incremental improvement in the hardware.

It would have to be:

- less general

- a smaller process node

- possibly more than one chip on a board tightly coupled

- specialized data types

- very tight coupling between memory and computation (so maybe memory on the chip)

- a slightly higher clock speed, say twice as high

GPUs are much too general, but if all the factors above can be realized, a factor of 10 in a PCIe add-in card should be possible.
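As a back-of-the-envelope check on that factor of 10 (the individual contributions below are my own illustrative guesses, not numbers from this thread), the listed changes only have to multiply out to about 10x:

    # Illustrative speedup budget for a deep-learning ASIC vs. a GPU.
    # Each factor is an assumed rough contribution, not measured data.
    factors = {
        "drop generality (dedicated MAC arrays)": 2.0,
        "smaller process node":                   1.5,
        "specialized low-precision data types":   2.0,
        "on-chip memory / tight coupling":        1.5,
        "modest clock bump":                      1.2,
    }

    total = 1.0
    for name, f in factors.items():
        total *= f
        print(f"{name:42s} x{f:.1f}  (cumulative ~{total:.1f}x)")

None of the individual factors has to be heroic; the gain comes from stacking several modest ones.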


It's not just computer vision. Many traditional methods in speech recognition and synthesis, translation, and game AI are being replaced by the same algorithms.



