
Anyone who is interested should also read https://twitter.com/aakashg0/status/1641976869460275201 for a rather different take.

It particularly interested me that Twitter under Musk is trying NOT to discuss Ukraine, and PENALIZES people who attempt to interact with those outside of their general political circle. I can give arguments for why they should do both, but I think both are ultimately bad ideas.



I was not at all impressed with the analysis in that thread. It makes a bunch of assumptions that don't feel very thorough to me, but announces them as if they are unimpeachable facts.

Biggest example is this one:

"9. Making up words or misspelling hurts - Words that are identified as “unknown language” are given 0.01, which is a huge penalty."

The code in the screenshot for that looks like this:

    // Boost (demotion) if the tweet language is not one of user's
    // understandable languages, nor interface language.
    optional double unknownLanguageBoost = 0.01

That doesn't match the description of "Making up words or misspelling hurts" at all!


Yeah on the surface this looks more like "This user doesn't know German we should make German tweets much less likely to appear on their timeline" kind of thing. Just a complete misinterpretation, presented as fact without any supporting evidence.


I agree that seems presumptive. But of course twitter conveniently didn’t share the source code for how that factor is calculated. Turns out it’s hard to evaluate a big complex distributed system by looking at just one slice and no data.


Interesting, though: that comment directly says that if I tweet in English and my interface is in German then I'll have a serious problem. Which I wouldn't think is so rare. How many people have English as their second language and use it online? It feels like this deboost would hit a lot of people. Or actually, I have my interface in English and I often tweet in German. So that would hit me from both sides.


"One of the user's understandable language" presumably refers to the list of languages in the browser's Accept-Language header. If you are tweeting in English, you'd likely have English in that header.


It may or may not use the Accept-Language header as well, but this is a user setting accessible via the Twitter web UI: you specify your primary interface language and can specify any number of additional languages that you understand.

There's also a place in Twitter's settings to show languages that Twitter has inferred that you know (e.g. following a German-language account and interacting with it almost certainly means that you know some German).
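
For anyone unfamiliar with the header mentioned above, here is a minimal sketch of how a server could turn an Accept-Language value into an ordered list of primary language codes. Whether Twitter reads this header at all is not established anywhere in this thread; the function name `parse_accept_language` is hypothetical and the parsing is simplified (it keeps only the primary subtag and ignores wildcards).

```python
def parse_accept_language(header: str) -> list[str]:
    """Return primary language subtags, highest preference first.

    Simplified illustration only: real headers can carry more syntax
    than this handles.
    """
    entries = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        quality = 1.0  # a missing q-value means full preference
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed weights as "not acceptable"
        primary = tag.strip().split("-")[0].lower()
        if primary and primary != "*" and quality > 0:
            # Sort by descending quality, then by position in the header.
            entries.append((-quality, position, primary))

    ordered = []
    for _, _, primary in sorted(entries):
        if primary not in ordered:
            ordered.append(primary)
    return ordered
```

A browser sending `de-DE,de;q=0.9,en;q=0.8` would yield German first and English second, so under this sketch an English tweet would still count as understandable for that reader.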


Ironically, sounds like a flaw of using Twitter as a medium. I'm sure he has more to say, or at least a more polite way to say it, but there's a character limit.


The systemic racism people are going to go wild about this


I agree with you (although I think one should be upset about this, and you seemed to imply that you disagree with the "systemic racism people").

This would directly penalize endangered languages and dialects, which is a tragic loss for linguistic diversity. I think a lot of people hoped the WWW would connect these small and dwindling communities, and even help revitalize them, but policies like this sound actively harmful.

If we're looking at it through the lens of systemic racism, it seems pretty straightforward as well – anyone who speaks a non-standard variant or dialect of their language will have less power to share their thoughts and ideas. They will also see fewer other people doing so, creating a feedback loop discouraging the use of non-standard language.


> anyone who speaks a non-standard variant or dialect of their language will have less power to share their thoughts and ideas

If a recipient doesn't speak the language, the speaker doesn't have much power in the first place. This is noise reduction.


It’s systematic Monolingualism. Which even though this is something that has a big negative impact on me and my particular life, I am willing to first attribute to ignorance before discrimination. It’s just easier to make a product that follows a one country one language pattern. And the numbers of multilingual people are in the minority.


> I am willing to first attribute to ignorance before discrimination

I think ignorance is a fair answer for an individual like you and me.

However, when you are a giant organisation, the organisation can be evil without anyone individually being so. In an organisation, ignorance and evil are really hard to distinguish, and easy to conflate on purpose.

It is the job of the leadership to not be ignorant, or at least to hire someone who isn't.

That's why AI ethics teams were the correct move.


Most of the world is multilingual:

https://www.linguisticsociety.org/resource/multilingualism

Of course in the US this is not the case, and Twitter is a US based company that creates software used globally.


Headline can be: twitter shadowbans cultural appropriation - that should trigger everyone.


input youtube thumbnail of cat in the hat enraged "DR SUESS CANCELLED?! TWITTER WON'T COMMENT!" ragebait youtuber.


From your link (thanks!):

> 9. Making up words or misspelling hurts

> Words that are identified as “unknown language” are given 0.01, which is a huge penalty.

Does that mean if I tweet about coding and use identifiers like "setUserName", which is not an English word, the tweet gets a huge penalty? If so, that's disappointing.


That jumped out at me as a possible misreading of the code. Is it detecting the language of the whole tweet, or just a word as the author claims?

Demoting a tweet that's entirely unidentifiable as any human language seems fair enough.


Man if someone asked me to build a system to merely identify whether a unicode string is human language or not I would flatly refuse. There are thousands of spoken languages, many of them with no standard written form, some that are transcribed into multiple different writing systems, some with no writing tradition at all and with only ad-hoc transliteration unique to each user and use.

Even being 90% confident would be a massive undertaking, and "speakers of this language may/may not use the internet" feels like high stakes for getting it wrong.

It seems a little niche but I'm sure a few times a year some far out town gets connected and suddenly there are speakers of a previously unknown-to-the-internet language newly online.


Note that the metric here is "is the tweet in one of the languages spoken by the user". This hypothetically allows more nuanced implementations than you contemplate.

For example, they could have a language "unrecognized" and assume everyone speaks it.

I broadly find this useful: I see tweets in other languages when they're retweeted by people I follow, and about half the time I machine translate them. But I don't want my whole feed to be that.
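
The "unrecognized" idea above can be sketched in a few lines. This is only an illustration of the suggestion, not Twitter's code: the function name, the parameters, and the use of "und" (the ISO 639-2 code for an undetermined language) as the sentinel are all assumptions. Only the 0.01 default comes from the quoted snippet.

```python
UNRECOGNIZED = "und"  # assumed sentinel for "language could not be identified"


def language_multiplier(tweet_lang, understandable, interface_lang,
                        unknown_language_boost=0.01):
    """Return the ranking multiplier for one tweet shown to one reader.

    Hypothetical sketch: tweets whose language could not be identified are
    treated as readable by everyone, so they are never demoted.
    """
    if tweet_lang is None:
        tweet_lang = UNRECOGNIZED
    if tweet_lang == UNRECOGNIZED:
        return 1.0  # everyone is assumed to "speak" the unrecognized language
    if tweet_lang == interface_lang or tweet_lang in understandable:
        return 1.0  # reader understands the tweet's language
    return unknown_language_boost  # demote, as the code comment describes
```

Under this design a tweet in an undetected minority language or a conlang would rank normally, while a confidently detected German tweet would still be demoted for a reader who lists only English.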


Well if someone asked me to do that, I would suggest that it’d be based off their recent tweet history and not just one tweet. And I would make my case in the meeting.

Second, it’s already been done, so my next suggestion would be to look at what all the computational linguistics majors have been up to.
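
A minimal sketch of the "recent history, not just one tweet" suggestion, assuming some upstream detector has already labelled each recent tweet with a language code (or `None` when it could not tell). The function name and the 20% threshold are made up for illustration.

```python
from collections import Counter


def languages_from_history(detected_langs, min_share=0.2):
    """Infer the languages an account uses from per-tweet language labels.

    detected_langs: labels for recent tweets, with None for undetected ones.
    A language counts if it makes up at least min_share of the labelled tweets.
    """
    labels = [lang for lang in detected_langs if lang]
    if not labels:
        return set()
    counts = Counter(labels)
    return {lang for lang, n in counts.items() if n / len(labels) >= min_share}
```

With a history of six English tweets, three German ones, and a single French one, this returns English and German; one stray misdetection does not change the result, which is the point of aggregating.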


I think it would be pretty easy with the language models we have available these days.

And there’s always the option of having unsupported languages or inferring it from user settings or user location.

From a product perspective you will need this feature though if you want worldwide coverage, because very few people are polyglots and most people don’t speak English as a first language.


The actual code comment doesn’t mention “words” but rather if the “tweet language” isn’t in one of the user’s “understandable” languages. As such, I assume your example is perfectly fine (would be extremely surprising if it wasn’t).

Whether "the user" means the reader or the author, I don’t know; I assume the reader, as that would make the most sense.

https://twitter.com/aakashg0/status/1641976943141699584?s=61...


Yeah that struck me too. I can see the reasons why you'd want it but the collateral damage on that must be huge.

For example, do they check for the common but nonstandard transliteration systems Arabic speakers use? There have to be similar systems in other languages that don't use the Roman or Cyrillic alphabets too, right?

Or for that matter, what about languages Twitter simply isn't aware of? There are thousands with native speakers after all; does this make it basically impossible for them to organically use Twitter together?


I speak a language that is highly colloquialized, and in its casual written format includes a rather inscrutable system of abbreviation (one of the features of this system is to basically omit nearly all vowels). I had always figured that this language would be impossible for machines to translate, but I just tested it and both google translate and ChatGPT can accurately identify the language and translate the slang into English (Google didn’t pick up some of the subtleties between similar dialects, but still provided a correct translation). So I’m somewhat optimistic that they could be potentially managing these problems quite well.


What's the language?


Indonesian. ChatGPT could quite reliably tell the difference between Indonesian and Malaysian. Google translate seemed to have a bias towards thinking it was Malaysian. But if I tried Indonesian mixed with Javanese slang (which is a common way of talking), they would both just say it was Indonesian. I only tested a few phrases though, so maybe it breaks down at some point.


Why do we assume that Twitter, a global communications company which has had offices in Dubai, might not consider the nonstandard transliteration requirements of Arabic, the 4th largest language in the world, which would mean that they would now be only showing content that is explicitly non-Arabic to Arabic speakers?

We aren’t giving them enough credit here IMO!


Also RIP the Conlang community on Twitter...


I think the correct reading of the code is that if English is your only language, non English tweets would be weighted 0.01x on your timeline.
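
To make that reading concrete, here is a rough sketch of what a 0.01x multiplier would do to a timeline ordering. Everything except the 0.01 value is invented for illustration: the `Candidate` type, the scores, and the `rank` function are hypothetical, and the released code may combine this factor with others in ways not shown in the screenshot.

```python
from dataclasses import dataclass

UNKNOWN_LANGUAGE_BOOST = 0.01  # default value from the quoted snippet


@dataclass
class Candidate:
    text: str
    lang: str
    score: float  # made-up relevance score before the language adjustment


def rank(candidates, understandable, interface_lang):
    """Order candidates for one reader, demoting languages they do not know."""
    known = set(understandable) | {interface_lang}

    def adjusted(candidate):
        if candidate.lang in known:
            return candidate.score
        return candidate.score * UNKNOWN_LANGUAGE_BOOST

    return sorted(candidates, key=adjusted, reverse=True)


timeline = [
    Candidate("Guten Morgen", "de", 50.0),
    Candidate("Good morning", "en", 1.0),
]
english_only = rank(timeline, understandable=set(), interface_lang="en")
bilingual = rank(timeline, understandable={"de"}, interface_lang="en")
```

For the English-only reader the German tweet drops from 50.0 to 0.5 and lands below the English one; for the reader who lists German as understandable, the order is unchanged.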


> It particularly interested me that Twitter under Musk is trying NOT to discuss Ukraine

What makes you say this? I’m aware tweets about Ukraine are down ranked. This seems analogous to subreddits (such as world news, geopolitics, etc) stickying a live thread for the war. This is to prevent it from taking over the subreddit and it happens to slow down discussion.

Seems like the Ukraine war topic is similarly tuned on Twitter. Ukraine can still trend; it just takes more activity than other topics.


This is exactly what "trying not to discuss Ukraine" is. Without outright banning it.

When subreddits create a mandatory megathread, it is always with the purpose of kneecapping that specific topic. Putting a large topic into a single thread makes sharing information about the topic extremely difficult.

I'd like to add that there's no reason to believe that such topics that get buried into megathreads would have taken over a subreddit otherwise. The news cycle turns. For example, worldnews today would look no different if the Ukraine war topic had been allowed to breathe. At most it would have caused the minor inconvenience of going to page two for the fraction of people not interested at all in the Ukraine war at that time.


OP here. Unfortunately this thread is mostly misinformation. There were a bunch of viral threads from the growth hacker / influencer crowd, including this one, within hours of the code release with a very superficial understanding of the code (and how recsys work in general). That's partly what motivated me to write this article.

See here for a rebuttal of the main tweet in that thread (near the bottom of the article). https://solomonmg.github.io/post/twitter-the-algorithm/


If this is for their Crisis Misinformation Policy, why is there only one specific callout, and why is it directed specifically at Ukraine? It seems like a generous assumption on your part that it's a nothing burger. The takeaway we should go with is that we now know that internally they are willing to programmatically segment out Ukraine-related topics. The question this new knowledge should lead to is why there is a policy of segmenting this (rather than immediately jumping to 'nothing burger' or, as you put it in the above post, 'misinformation').


This is an example of "a lie can get half way around the world before the truth gets its pants on".

I'm no fan of Musk these days, but there is plainly no evidence in the repo that Ukraine is being suppressed - the linked code is very obviously a model dispatch from an initial classification system, and it makes perfect sense Ukraine would need a call out there since a major war most likely would normally run afoul of profanity, violence, and calls to violence filters without actually violating them.


> It particularly interested me that Twitter under Musk is trying NOT to discuss Ukraine, and PENALIZES people who attempt to interact with those outside of their general political circle.

While Musk's Twitter explicitly censors references to Russia's genocide of Ukraine, Musk himself feigns ignorance and false indignation accusing the "western press" of insisting "on pushing such a lopsided view of the conflict".

https://twitter.com/VsimPohuy/status/1645699649003569152?t=v...


The two ideas aren’t necessarily in conflict with one another. In fact, they make sense to go hand-in-hand. Take for instance affirmative action or reparations. They’re instances of trying to correct for something that’s the opposite of how the group desires it, by doing the opposite of what supporters claim is being done to the marginalized group.


The "marginalized group" in this case are the ones that invaded a country and are killing innocent civilians for natural resources.


That's a great way to eliminate misinformation, I don't really see the problem.


> It particularly interested me that Twitter under Musk is trying NOT to discuss Ukraine

Because of stuff like the NAFO trolls I can understand why that discussion is not brought to the top anymore.

Anyway, whoever is interested in the war going down there can most definitely have access to both sides' views, I personally find Twitter one of the few media/online outlets that still makes that possible (and props to them for that).


It’s unclear if it penalizes discussion of Ukraine equally though.

There have been many stories that have come to light in the last few months. Merkel and Macron admitting the Minsk agreements were used to buy time for CIA and British to arm rebels since 2014 was big story. Large amounts of money the US has supplied Ukraine and lack of oversight to where this is going (the total US aid now surpasses Russia’s entire military budget per year). But this same poster (aakashg0) claims these stories have been suppressed, even though they would be counter to dominant narrative in western media.

I think algorithmic moderation on a particular topic is hard; you still need someone in there boosting the stories you want people to read and downplaying the stories you don’t.


I mean, the fact that the Ukraine re-armed itself after Russia invaded their territory isn’t news, is it? I think it was reported on pretty substantially. And a good thing too since they were invaded a second time, this time with a strike towards their capital. I sort of assumed that was obvious public knowledge and don’t understand why people are making it into a “story.”


>Merkel and Macron admitting the Minsk agreements were used to buy time for CIA and British to arm rebels since 2014 was big story.

You don't mean "rebels" (who were/are in fact simply Russia proxies), you mean the other side, Ukraine.

Not sure what percentage of all arms delivered to Ukraine has come after February 2022, but it must be well north of 90%. So apparently all these plans to secretly and slowly arm Ukraine amounted to nothing, and it was the invasion itself that triggered the flow of arms. Some dastardly plan.

BTW always love the not-at-all-loaded use of "admitted".

>Large amounts of money the US has supplied Ukraine and lack of oversight to where this is going

These are the kind of fact-free notions that seem to start their life in the TuckerCarlsonVerse and spread outward from there. Just saying there's "lack of oversight" does not make it so, particularly when there's plenty of oversight by the US gov't. In fact, oversight is one reason why the buildup of weapons supplies (in terms of quantity and weapons systems types) has been so gradual. The US wanted to make sure Ukraine's army knows what to do with the stuff and won't leave it on the field of battle so that it ends up arming the enemy, like the Afghans did.

>(the total US aid now surpasses Russia’s entire military budget per year).

That's completely meaningless. Russia's military budget pays for many times more personnel and weapons systems than the equivalent number as part of the US military budget, because the purchasing power of, say, $1 million is vastly different when it's spent by the Pentagon in the US, paying US prices and employing Americans, or by the Russian MOD, paying Russian prices and employing Russians.


>>(the total US aid now surpasses Russia’s entire military budget per year).

> That's completely meaningless. Russia's military budget pays for many times more personnel and weapons systems than the equivalent number as part of the US military budget, because the purchasing power of, say, $1 million is vastly different when it's spent by the Pentagon in the US, paying US prices and employing Americans, or by the Russian MOD, paying Russian prices and employing Russians.

It also seems misleading in the sense that, for example, we have a ton of Abrams tanks that the US military didn’t want, but that Congress has over time decided to buy. So if we send them to Ukraine, how should that be accounted for financially? I guess the cost of a gently used Abrams is pretty high but we already bought it and the value to us is pretty low.


> Merkel and Macron admitting the Minsk agreements were used to buy time for CIA and British to arm rebels since 2014 was big story.

Wtf are you talking about? What rebels in Ukraine did the CIA and the British arm? Or is this one of those “it’s complicated” comments?


Tell us more, who are the "rebels" in this story and what arms did Merkel send?

(Is this what news in the PRC feels like?)


The rebels are right-wing paramilitary groups. And Germany didn’t send any weapons during 2014-2022, but she said in a Der Spiegel interview from Oct 2022 that during the Minsk negotiations, it became clear that the US’ objective was to buy time to secretly arm Ukraine (which is newsworthy because this would imply a violation of the Minsk agreement).


So in a year the US has sent more than Russia's yearly defence budget, yet Minsk (which one, even?) was needed to secretly (what was secret?) arm the Ukraine over 8 years? Who are the "right-wing paramilitary groups" and if they are Ukraine, since this is who you are alleging is being armed, why are they rebels if they are government-aligned?


This is all very easy stuff for you to verify for yourself and wasn’t the original point of my top comment (which was that these stories are hard to suppress without manual effort— although apparently many Americans are unaware).

But to be clear, the US was funding Ukrainian rebel groups (right-wing paramilitary organizations) 2014-2022 but through clandestine means. This is much more difficult to do without the support of congress because the support has to be indirect — the funding has to be off-the-books — because this was a violation of the Minsk agreement.

Since 2022, the floodgates have opened and the US is now openly sending money and weapons systems, now totaling over $100B since the Russian invasion. The Russian Defence budget is estimated to be $70-80B per year.


Which side in the conflict are you talking about? It is very unlikely that the US was funding the Russian-backed rebel groups. It is likely that they were supporting the Ukraine-supported paramilitary groups, who are not rebels because they were government supported and supporting.


I think maybe you misunderstood; nobody in this thread is claiming the US has backed Russian rebel groups.


You literally did. The rebel paramilitaries are the Russian-backed separatists.

The Ukrainian-backed paramilitaries aren't rebels because they support the government.


The paramilitaries I was referring to are these: https://en.wikipedia.org/wiki/Category:Paramilitary_forces_o...

Many of these are right-wing militias that have been fighting in the Donbas since 2014. Some have been absorbed into the UA national guard.

I have never mentioned pro-Russia paramilitary groups; I think you and the other commenter got confused because I didn’t explicitly say Ukrainian paramilitary groups. (Thought it was obvious from context which side I was referring to; obviously the CIA would not be supporting pro-Russian militias.)


> The paramilitaries I was referring to are these

None of those are “rebels”, which is the term you used.

> Some have been absorbed into the UA national guard.

The National Guard of Ukraine is one of the “paramilitaries” on the list you linked, so, yes, it is fair to say it has been absorbed into…itself.

> I have never mentioned pro-Russia paramilitary groups

You said “rebels”. The only rebels in Ukraine are those forces consisting of Ukrainians (as non-Ukrainians would be invaders, rather than rebels) fighting against the government of Ukraine, i.e., Russian-backed and Russian-aligned armed groups.


He also said 2014; maybe you need to reread that comment once more. Those were times when Ukraine was not clearly separated from Russia; for example, most of the big boys in the government were openly pro-Russian. Ukraine has been a rebel since 1991 and the process is yet to be finished.

Boy, Ukraine has never had a proper border with Russia; I mean, most of that border was nothing more than demarcation sticks.


The problem is you kept saying "rebel groups". "Rebel" means that they are opposed to the government. Like the separatists.

Plus, you didn't answer any of the multiple times you were asked to clarify.


Consider my answer [1]

Also, I have vouched for the top comment of this tree because I consider its point valid.

[1] https://news.ycombinator.com/item?id=35537017


The gigantic key factor in all this that you're leaving out is that Ukraine is defending itself against a full-on invasion by a hostile neighbor.


The "CIA and the British" clandestine funding of "right-wing paramilitary Ukrainian rebel organizations" with diplomatic support from "Merkel and Macron via Minsk" is "all very easy stuff" for me to verify?

"many Americans are unaware" of this story because it's nonsense that puts together various tidbits to spin a yarn long enough to envelop the world.


> But to be clear, the US was funding Ukrainian rebel groups (right-wing paramilitary organizations) 2014-2022 but through clandestine means.

You're still being UN-clear. And this far down the thread it's becoming harder and harder to believe that it is unintentional.

"Ukrainian right-wing paramilitary pro-Russian 'rebel' organizations", or "Ukrainian right-wing paramilitary pro-Ukraine 'rebel' organizations"?



