This is less useless than you think. Captioning video could allow for video to become searchable as easily as text is now searchable. This could lead to far better search results for video and a leap forward in the way people produce and consume video content.
You don't need amazing transcription to search a video. A video about X probably repeats X multiple times, and you only really need to detect it properly once.
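The point above can be sketched concretely: if a key term shows up several times in a transcript, one clean occurrence is enough to index the video under it, and fuzzy matching can tolerate the garbled repeats. This is a minimal illustration using Python's `difflib`; the function name, the example transcript, and the 0.8 cutoff are my own choices, not anything YouTube actually does.

```python
from difflib import get_close_matches

def video_matches_query(transcript_words, query, cutoff=0.8):
    """Return True if the query term appears at least once in the
    transcript, tolerating minor transcription errors via fuzzy matching."""
    return bool(get_close_matches(query.lower(),
                                  [w.lower() for w in transcript_words],
                                  n=1, cutoff=cutoff))

# The hypothetical transcript garbles "backpropagation" twice but gets it
# right once; a single clean occurrence is enough to surface the video.
transcript = "back propagation is key backpropogation then backpropagation".split()
print(video_matches_query(transcript, "backpropagation"))  # True
```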
As for the users, sure, the transcription may not be perfect, but I'm sure that if you were deaf and had no other way of watching a video, you would be just fine with the current quality of the transcription.
Often you need exactly that. Because it's the unique words the machine will get wrong. If you look for machine learning tutorials/presentations that mention a certain algorithm, the name of it must be correctly transcribed. At the moment, it appears to me that 95%+ of words work but exactly the ones that define a video often don't. But then again getting those right is hard, there's not much training data to base it on.
They mean useless in the end result. Of course having perfect captions could potentially allow indexable videos, but the fact is that the captions suck. They're so bad, in fact, that it's a common meme in Youtube comments for people to say "Go to timestamp and turn on subtitles" so everyone can laugh at whatever garbled interpretation the speech recognition made.
Have you used/tried them recently? The improvement relative to 5 years ago is major.
At least in English, they are now good enough that I can read without listening to the audio and understand almost everything said. (There are still a few mistakes here and there but they often don’t matter.)
Yes, I've had to turn them off permanently. I often felt I could follow a video better without subtitles than with them.
I tried to help a couple of channels with subtitling, and the starting point was just sooo far from the finished product. I would guess I left 10% of the auto-generated captions intact. Maybe it would have been 5% five years ago; when things are this bad, a 100% improvement is hard to notice.
It is super cool how easy it is to edit and improve the subtitles for any channel that allows it.
I'd say the current Youtube autocaptioning system is at an advanced non-native level (or a drunk native one :)), and it would take years of intensive studying or living in an English-speaking country to reach it.
The vast majority of English learners are not able to caption most Youtube videos as well as the current AI can.
You underestimate the amount of time required to learn another language and the expertise of a native speaker. (Have you tried learning another language to the level where you can watch TV in it?)
Almost all native speakers are basically grandmasters of their mother tongue. The training time for a 15-year-old native speaker could be approx. 10 hours * 365 days * 15 years = 54,750 hours, more than the time many professional pianists have spent on practice.
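The back-of-the-envelope figure works out; as a quick sanity check (the inputs are the comment's own rough assumptions, not measured data):

```python
# Rough language-exposure estimate from the comment above:
# 10 waking hours of language use per day, 365 days a year, for 15 years.
hours_per_day = 10
days_per_year = 365
years = 15
total_hours = hours_per_day * days_per_year * years
print(total_hours)  # 54750
```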
Not true. The problem with Google's captioning and translation is that, unlike a weak speaker, it makes critical mistakes that completely misunderstand the point.
A weak speaker may more often use a cognate, an idiom borrowed from their native tongue, or a similarly wrong word. The translation app instead produces completely illegible word salad.
I was talking exclusively about auto-captioning, which has >95% accuracy for reasonably clear audio. Automatic translation still has a long way to go, I agree.
To be honest, as the other child comment said, I too have noticed they have gotten way better in the last 5 years. Also, the words it isn't 100% sure about are rendered in a slightly more transparent gray than the other words, which kind of helps.