Unicode character “ꙮ” (U+A66E) is being updated (twitter.com/jonty)
372 points by SerCe on Sept 19, 2022 | 244 comments


When my kids were young, I accidentally flubbed the pronunciation of "Santa Claus" once and said something that sounded a lot like "Centiclops", which I decided to roll with. Centiclops is a lot like a cyclops with one eye, except that, as a reading of the roots clearly indicates, this is a creature with 100 eyes.

Today I learned that Centiclops effectively has a Unicode character. As Centiclops' representative in the world of the non-imaginary, we accept that a Unicode character with a hundred eyes is not practical and we accept the representation with just a few eyes, but generally agree that upgrading from 7 to 10 is a nice improvement, as 7 does not evenly divide into 100 but 10 does. This is important, because... reasons.


From "The House of Asterion" by Jorge Luis Borges:

"It is true that I never leave my house, but it is also true that its doors (whose numbers are infinite) (footnote: The original says fourteen, but there is ample reason to infer that, as used by Asterion, this numeral stands for infinite.) are open day and night to men and to animals as well."

https://klasrum.weebly.com/uploads/9/0/9/1/9091667/the_house...


That reminds me of the Nahuatl word centzon, which is used to mean either 400, or an innumerable/infinite number. The Aztecs used a base-20 number system, so 400 = 20*20.

https://www.mythicalcreaturescatalogue.com/post/2016/06/10/t...


Greek mythology actually did have a "centiclops" -- Argus Panoptes ("all eyes"), who had a hundred eyes all over his body. Hera assigned him to watch over Io, a nymph who had been turned into a cow, so that Zeus wouldn't come and shag her in secret. Argus was slain by Hermes (a Zeus loyalist); to mourn and honor him, Hera had his eyes transferred to the peacock's tail.

The real Greek for a hundred-eyed being would be something like "hekatonoptes", but Argus wasn't called that as far as I know.


There should be a combining “eye” character so that you can have as many or few eyes as you like.

Though to be honest, that Unicode character looks more like a bunch of cells forming a tissue to me than eyes.


But then what are eyes but a bunch of cells?


B ∉ A


The nice thing about ꙮ having ten eyes is that you can now combine ten of them with U+200D ZERO WIDTH JOINER [1] to make a centiclops grapheme, as long as your font has a glyph for that particular ligature. (Readers without centiclops-compatible fonts will simply see ten separate ꙮ glyphs, an acceptable fallback for legacy systems.)

[1]: https://en.wikipedia.org/wiki/Zero-width_joiner
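
For the curious, a minimal Python sketch of building such a sequence (whether it renders as one ligature depends entirely on that hypothetical centiclops-capable font):

    MULTIOCULAR_O = "\uA66E"  # ꙮ
    ZWJ = "\u200D"            # U+200D ZERO WIDTH JOINER

    # Ten multiocular Os joined the way emoji ZWJ sequences are built.
    centiclops = ZWJ.join([MULTIOCULAR_O] * 10)
    print(centiclops)       # one glyph with a ligature-aware font, else ten
    print(len(centiclops))  # 19 code points: 10 letters + 9 joiners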


Or perhaps this character is an accurate representation of a Dekaclops.


It'd be “dekaops”, because the -cl- is part of “cycle”+“ops” (one round eye, with the “one” dropped because it's inferred). So “cycle” out, “deka” in.


I thought ops was from opsis and referred to a “circle-like face”?

Like oinops, referring to the “wine-like face” (i.e. dark red) of a sea passage in the evening.

May be wrong, though, I don’t have my tools handy to check now.


My client finds your proposal offensive and an appropriation of his culture, and also that Dekaclops guy is mean and smells bad and hasn't returned the lawnmower my client lent him even though my client has clearly referred to the need to mow his lawn several times now so he totally doesn't deserve a Unicode character.


“Santa Clause” would translate to “holy clause”. There might be such a thing but I think you meant Santa Claus :)


My fingers love adding the e's on the end of any worde that can conceivably take them. Also have that problem with any word that can take an "ly" even if I don't meanly it.

Fixed, thanks.


That's supercalifragilisticexpialidociously strangee.


Maybe just a big fan of the Tim Allen movie?


[grunts]


I thought "santa" meant "saint"?


> I thought "santa" meant "saint"?

Well, santa is a Spanish word meaning "holy" and saint is a cognate French word meaning the same thing. They descend from Latin sanctus; compare sanctify.

When the prayer goes "holy Mary, mother of god", "holy Mary" is an exact equivalent of "santa María".


Might as well mention “Sancta Marīa” in Latin, for example from the Christian Hail Mary[1], a recorded Latin version[2], written Latin next to English and Spanish[3] and of course translated into thousands of languages[4] although unfortunately mostly written using /A-Z/i; I am an atheist interested in languages.

[1] https://en.m.wikipedia.org/wiki/Hail_Mary

[2] https://glaemscrafu.jrrvf.com/english/avemaria.html

[3] https://hymnary.org/text/hail_mary_full_of_grace_the_lord_is...

[4] http://www.marysrosaries.com/Rosary_prayers_in_different_lan...


In my mind, the Latin form of Mary is Mariam, because that's what my Latin teacher taught me. (He also commented that, unlike Greek names, Hebrew names never inflected in Latin, so that it would be "Mariam" regardless of what case the name should appear in.)

But it makes sense that Church Latin would be different.


“Santa” means “female saint” in Italian and Spanish. Perhaps the English “santa” came from another language but I always found the name “Santa Claus” just horrible.


The name Santa Claus evolved from Nick's Dutch nickname, Sinter Klaas, a shortened form of Sint Nikolaas (Dutch for Saint Nicholas)

https://www.history.com/.amp/topics/christmas/santa-claus#si...


It’s actually Sinterklaas (without a space) and we still call him that :) We also ended up re-importing the American Santa Claus, so these days we have two festive holidays in December.


The first mention of this version of Saint Nicholas's name has the form "St. A Claus" and appeared in the New-York Gazette of 20 Dec 1773.[1] The same issue also first reported some incident regarding tea in Boston harbour. Nice coincidence.

[1] Source: https://boston1775.blogspot.com/2016/12/st-claus-was-celebra...


The Tim Allen movie series in Spanish is titled "Santa Cláusula".


Saint is more or less the same as holy, just used as a title. It comes from Old French saint, seinte "holy, pious, devout," from Latin sanctus "holy, consecrated"


It does; the character originates from Saint Nicholas (or Odin, depending who you ask)


I thought it was a misspelling of Satan, but maybe that's because I'm Jewish.


> Centiclops is a lot like a cyclops with one eye, except th[at] as a reading of the roots clearly indicates, this is a creature with 100 eyes.

Not in any normal sense of "roots". Cent is a Latin root meaning 100. ops is a Greek form meaning eye. The -i- indicates that the word is being formed in Latin, and the -cl- is entirely spurious. The original Greek word divides as cycl-ops, not cy-clops.


A bit like the heli-copter | helico-pter thing.


Wow. I never noticed that before. Spiral wings.


Impressively polylingual, even multiglot.


In any case, there is already an ancient, general, and perfectly serviceable epithet Panoptes.


Cent is easy to grasp if you speak a Romance language.


But it doesn't combine with ops. You'd need to talk about a hecatops or a hecatontops. And even more than it can't combine with ops, it can't combine with clops because there is no such root.


It does. Combining Latin and Greek roots is done fairly frequently.

https://en.wikipedia.org/wiki/Hybrid_word mentions automobile, chloroform, hexadecimal, micro-instruction, petroleum, television and a few others.


Citing wikipedia (“wiki” is a Hawaiian-derived root, and “-pedia” a Greek-derived suffix) is particularly appropriate here.


The thing that will annoy you most of all is that I was fully aware of all of this at the time... and I did it anyways.

Truly the purity of English will never recover from the trauma that I have inflicted on it.


> But it doesn't combine with ops.

Sure, it does, in English, which stole prefixes, suffixes, and roots from Latin, Greek, and many other languages, and has no problem using them together, without special concern about where it got them from.


By the same reasoning, the 7-eyed O has now been used more than once, so it deserves a glyph! So the right way to do this is to introduce a new character for the correct glyph, and also leave the current one (perhaps changing the title). Otherwise these tweets won't make sense when read by someone who has updated to Unicode 15.0.


Honestly it probably deserves the Pluto treatment: decertification as a character. One historical use in the 1400s doesn't merit a character and never did.


Unicode's mission is to make every document "roundtrip-able". Even if a character is only used once, it should be possible to save a plaintext version of the containing document without losing any information. Roughly, I should be able to put a transcription of that one translation from the 1400s on Wikisource without using images.

You may disagree with me, and that's fine, but it doesn't change Unicode's mission. Besides, there's room for 1,112,064 codepoints[a], and only 149,146 are in use. It's predicted we'll never use it up, so what harm is there in one codepoint no one will ever need?

[a]: U+10'FFFF max; it used to be U+7FFF'FFFF, but UTF-16 and surrogates ruined that
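
The count is easy to reproduce (a one-line Python sketch):

    # 17 planes of 65,536 code points each, minus 2,048 UTF-16 surrogates:
    print(0x110000 - 2048)  # 1112064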


If that was once its mission, it was clearly abandoned long ago. They rejected Klingon characters on the grounds that it has low usage for communication, and that many of the people who do communicate in Klingon use a latinized form.

ꙮ seems to just be a fancy way of writing О. I haven't seen anything that says it has a different meaning. The arguments for excluding Klingon seem to apply even more so to ꙮ.


If you look through the old mailing list postings, the oft-left-implicit problem with Klingon (as well as Tengwar, Everson’s [EDIT: misspelling] pet project) is that it may get people into legal trouble (even though in a reasonable world it shouldn’t be able to). So in the unofficial CSUR / UCSUR they remain.

A weird solitary character from the 1400s isn’t subject to that, and even if it’s a mistake it’s probably not worth breaking compatibility at this point (I think the last such break with code points genuinely changing meanings was to repair a mistaken CJK unification some time in the 00s, and the Consortium may even have tied its own hands in that regard with the ever-more-strict stability policies).

Similarly, for example, old ISO keyboard symbols (the ⌫ for erase backwards, but also a ton of virtually unused ones) were thrown in indiscriminately at the beginning of the project when attempting to cover every existing encoding, but when the ISO decided to extend the repertoire they were told to kindly provide examples of running-text (not iconic) usage in a non-member-body-controlled publication. (Crickets. The ISO keyboard input model itself only vaguely corresponds to how input methods for QWERTY-adjacent keyboards work in existing systems—as an attempt at rationalization, it seems to mostly be a failed one.)


[EDIT: Removed a section about the now-fixed typo]

> I think the last such break with code points genuinely changing meanings was to repair a mistaken CJK unification some time in the 00s, and the Consortium may even have tied its own hands in that regard with the ever-more-strict stability policies[.]

Not exactly: the last break happened between Unicode 1.1 and 2.0, and the new CJK Unified Ideographs Extension A block still contains unified characters. The main reason for the break was that both Hangul and CJK(V) ideographs required tons of additional code points and it became clear that the 16-bit code space was dangerously insufficient; by 1.1 there was only a single big block of unassigned code points from U+A000 to U+E7FF (18,432 total), and there were 4,516 and 6,582 new Hangul and CJK(V) ideographs in 2.0 (11,098 total).


Unless it's legitimately someone's native tongue, conlangs shouldn't be in unicode. If there are kids out there that are native Klingon speakers, then you can make the argument it should be included.


I think it makes way more sense to put a conlang in Unicode than it does a peculiar stylistic flourish only ever applied once to a single letter in a single document. If that belongs in Unicode, why not every bit of marginalia ever doodled and every uniquely adorned drop cap / initial letter?


There is a smattering of missionary-made alphabets that have way less usage than some conlangs. Why are they legitimate but conlangs aren't?


so all you need is one crazy parent? shouldn't be too hard to find


https://www.dailymail.co.uk/news/article-1229808/Linguist-re... (2009):

“A linguist has revealed he talked only in Klingon to his son for the first three years of his life to find out if he could learn to speak the 'language'.

[…]

Now 13, Speers' son does not speak Klingon at all.”


Okay, let's take a look at the context where the multiocular o was used: https://en.wikipedia.org/wiki/Multiocular_O

I see that near it, there is an ef (Ф) with a very tall stem.

Why should that not be included as a standard unicode character? Surely it is used more often than the multiocular o.

You may say "it's a decorative flourish", which is of course true, but so is the multiocular o. Should we allow every conceivable decorative flourish into unicode? What is the standard for where flourishes become distinct characters?


Today, I wrote a document by hand containing a new symbol that only looks like genitalia if you squint really hard. Where do I apply to have it included in unicode so that it can be digitized properly?


Rule-lawyering wise-asses try to mess with many policies. It's rarely a sensible indictment of a policy, nor is it very effective. Anyone dealing with such people just ignores them.


What's the criterion that includes the document in the tweet, but excludes the document referenced by the GP?



I don't see anything on the inclusion of symbols that are not icons, such as U+A66E, or the symbol proposed by bityard.


Can you reuse 𓂸 or 𓂺?


And for years we've just been using eggplants!


Lol I never knew those existed. Apparently they're Egyptian hieroglyphs, but it really makes me wonder about their meanings now :)

I also wonder how these didn't become insanely popular overnight, like the famous eggplant.


( ノ ≧ ∇ ≦ ) ノ ミ ┻ ━ ┻

〜 ( ꒪ ꒳ ꒪ ) 〜

( ༎ຶ ෴ ༎ຶ )


Reuse rectangle or rectangle? (Seems to not render for me, win 10, chrome)


Given that “𓂸” (U+130B8) is already in unicode (and related 𓂹, 𓂺), pretty sure the only problem is you made it up, not that it looks like genitalia


For as inclusive as that mission is, it seems weird to me how limited unicode is in certain areas. For instance, people use the peach emoji since there isn't one for a butt, the eggplant since there's no penis, etc.

This doesn't contradict the stated goal exactly, but it seems against the spirit of it at least.


One could argue that emoji should have never been added to Unicode in the first place. Peaches and butts are images, pictures, illustrations, whatever - but they are not characters. There's no writing system which has a colored drawing of a peach as a character.


Yes there is - a widely used character set (when Unicode talks about "writing systems" it explicitly includes all the computer character sets used in practice pre-Unicode) used by Japanese 'featurephones' had emoji characters, so in order to include that character set in Unicode, Unicode had to add emoji.


They're sort of neither. The peach emoji will render differently on iOS, Android, Windows. And I'm sure emoji-replacement packs are possible on Windows and Android (even though it's also guaranteed to be a virus).

So a peach emoji is not the same thing as the iOS peach-emoji-image. Similar to how changing my font doesn't change the actual characters.

I don't think including emojis was a great idea, but now that it's happened and people everywhere use them, emoji have become characters. I agree with your point, but it's already happened and so now there's not really any going back.


Yes there is. We're using it right now. Even linguists are studying the use of emoji today.


But that doesn't change the fact that most people use them and like them, and there is not much technical disruption. They just chose practicality over purity.


Not only that - people use them in textual communication the way letters traditionally are used. There is probably a much better argument for emoji than for a lot of other things in unicode (but it is a slippery slope)


Most people (me included) like funny cat videos and send funny cat videos. Shall we include some in Unicode?

I mean, this ship has long sailed, but that was a mistake nevertheless. Not everything has to be a unicode character.


If there were specific funny cat videos that were culturally relevant, maybe.

For example I could see there being an emoji for keyboard cat.


That wouldn't be practical. It would make fonts too big, and videos aren't a thing that goes inline in text.

However, I could totally see some kind of open source GIF library of a few hundred meme videos and pictures, to standardize the "Reply with a GIF" thing in some P2P chat ecosystem, and maybe it could have a new URL scheme for referring to OpenMemes images.


You can have entire sentences constructed out of emoji.


I tried to reply with just a unicode penis but that got flagged immediately, so I'll be more substantial and leave out the actual penis. It appears in Egyptian hieroglyphs, so actually there is a penis included in unicode.


That's true, good call. I feel like there should be one without the context of Egyptian hieroglyphs, though I'm not exactly sure how that kind of thing works in unicode.


I thought peach was a vulva. What's the emoji for a vulva then?

Don't tell me Presidents of the United States' song "Peaches" was about butts!?

https://en.wikipedia.org/wiki/Peaches_(The_Presidents_of_the...


I'm not sure if there is one. Or maybe it's used for both, depending on context/community?

The song was definitely a vaginal reference as far as I ever knew.


> For instance, people use peach emoji since there isn't one for butt, eggplant since there's no penis, etc.

Personally I think there should be, actually. There's all these other body parts but these are left out. Emoji is almost becoming a language and the good thing is that everyone can understand them, regardless of language. For example I could imagine these could be very useful in an international medical setting. Or for sexting, obviously, we can pretend that's not a thing but that's a bit too Victorian for me.

Of course they're not appropriate in some settings, but the same is true of many words.


I REALLY don't like that emojis are beholden to companies. For example, when the emoji for a gun was changed from a pistol to a squirtgun on many platforms, it changed the meaning of its use by a lot. You could argue that it is a good thing, but I see it as a pretty bad direction to go into.


Unicode doesn't have a character for every illuminated initial, nor should it. I'm not clear on why this character should be considered any differently.


http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3194.pdf

It was introduced with other "ocular O"s which are seemingly more commonly used than this one.

It's not quite an illuminated initial.


Wow, this is probably the most actually useful and interesting comment in this whole discussion, thanks! For anyone interested, the most relevant quotes from the document are in particular:

"This document requests the addition of a number of Cyrillic characters to be added to the UCS. It also requests clarification in the Unicode Standard of four existing characters. This is a large proposal. While all of the characters are either Cyrillic characters (plus a couple which are used with the Cyrillic script), they are used by different communities. Some are used for non-Slavic minority languages and others are used for early Slavic philology and linguistics, while others are used in more recent ecclesiastical contexts. We considered the possibility of dividing the proposal into several proposals, but since this proposal involves changes to glyphs in the main Cyrillic block, adds a character to the main Cyrillic block, adds 16 characters to the Cyrillic Supplement block, adds 10 characters to the new Cyrillic Extended-A block currently under ballot, creates two entirely new Cyrillic blocks with 55 and 26 characters respectively, as well as adding two characters to the Supplementary Punctuation block, it seemed best for reviewers to keep everything together in one document.

(...)

MONOCULAR O Ꙩꙩ, BINOCULAR O Ꙫꙫ, DOUBLE MONOCULAR O Ꙭꙭ, and MULTIOCULAR O ꙮ are used in words which are based on the root for ‘eye’. The first is used when the wordform is singular, as ꙩкꙩ; the second and third are used in the root for ‘eye’ when the wordform is dual, as ꙫчи, ꙭчи; and the last in the epithet ‘many-eyed’ as in серафими многоꙮчитїй ‘many-eyed seraphim’. It has no upper-case form. See Figures 34, 41, 42, 55."
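
A quick Python sketch (using the standard unicodedata module) lists the whole family the proposal describes:

    import unicodedata
    for ch in "\uA668\uA66A\uA66C\uA66E":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # U+A668 CYRILLIC CAPITAL LETTER MONOCULAR O
    # U+A66A CYRILLIC CAPITAL LETTER BINOCULAR O
    # U+A66C CYRILLIC CAPITAL LETTER DOUBLE MONOCULAR O
    # U+A66E CYRILLIC LETTER MULTIOCULAR O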


Literally everyone in choir: we have boooobs!


Because it's already been added to unicode. Now it's not a question of whether or not to add, rather to remove, and unicode almost by definition does not remove.


Unicode does have deprecated code points though. Not that I necessarily think making this character deprecated makes sense.


Meanwhile one still can't roundtrip regular Japanese without some kind of funky out-of-band signalling. By itself this kind of thing is harmless, but it speaks to poor prioritization from Unicode.


This is incorrect. I think you defined round-trip as something else, but some character set A providing round-trip compatibility with another set B means that B can be converted to A and back to B without loss. And it is one of Unicode's explicit goals to provide round-trip compatibility with major encodings, including Japanese ones.

Han unification only means that when you convert Japanese encodings (B) to Unicode (A), it is not distinguishable from non-Japanese encodings converted to Unicode. This means that the Unicode text doesn't always follow domestic conventions without out-of-band signaling or IVD or so. But if you know that the text was converted from a particular encoding, you can perfectly recover the original text encoded in that encoding.
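
Concretely, a minimal Python sketch of such a round trip (the sample phrase is arbitrary):

    # Shift JIS -> Unicode -> Shift JIS, byte-for-byte lossless.
    original = "多眼のセラフィム".encode("shift_jis")
    text = original.decode("shift_jis")          # into Unicode
    assert text.encode("shift_jis") == original  # and back, without loss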


By that logic any 8-bit encoding is round-trip compatible with all encodings, since however bad the mojibake is, if you know what the original encoding was then you can always just convert back to that.


Yes, but only under a very wrecked hypothetical definition of "conversion".


To be fair, they wanted to keep everything representable with 16 bits, and that wasn't going to happen without Han unification. The mess when everything still had to move to a 32-bit representation has been far-reaching: many programming languages went from exposing code points atomically as "char" to some half-encoded nonsense value that just happens to also be a valid standalone value in UTF-16 most of the time, and a source of bugs when you least expect it.


Why can’t it round-trip Japanese?


"Han Unification" - in Unicode many Japanese characters are represented as Chinese characters that look different (and subjectively ugly). The Unicode consortium's answer is that you're supposed to use a different font or something when displaying Japanese, which is pretty unsatisfying (e.g. if you want to have a block of text that contains both Japanese and Chinese, you can't represent that as just a Unicode string, it has to be some kind of rope of segments with their own fonts, at which point frankly you might as well just go back to bytes-with-encoding which at least breaks very clearly and visibly if you get it wrong).


You can use the deprecated language tag control codes to distinguish unified code points. It is unlikely to be well supported but it is there.
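
For reference, a Python sketch of that deprecated mechanism: U+E0001 LANGUAGE TAG followed by tag characters at U+E0020..U+E007E, each being the ASCII character shifted up by 0xE0000:

    def lang_tagged(text, bcp47):
        # Prefix an invisible (and deprecated) Unicode language tag.
        tag = "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in bcp47)
        return tag + text

    tagged = lang_tagged("骨", "ja")  # hints Japanese glyphs for U+9AA8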


The thing is, this is just a decorative way to write “o”. It’s not a specific letter by any definition.

I can’t speak of other letters that were added in the same batch in 2007. Some of them seem meaningful, I dunno, I don’t speak Old Church Slavonic (although I am told it sounds like Croatian, which I understand a little)

http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3194.pdf


> so what harm is there in one codepoint no one will ever need?

Font bloat (do you want a font with 1 million characters in it? I don’t. Do you want to have to install 1000 fonts of 1000 characters each to be sure to cover the whole Unicode table? I don't).

Lots of issues for everyday programmers (how do you handle weird unicode characters in your validation code?), potentially leading to security issues (bypassing validation rules with close-but-different characters, phishing…)


Cataloging every doodle ever drawn inline with text by anyone at any time in history would exhaust any finite set of code points.


> Unicode's mission is to make every document "roundtrip-able".

Only for characters from existing coded character sets.


That isn't Unicode's mission though. To get new characters added you have to show that people use it or would use it if it were available.


The artist Prince changed his stage name to an unpronounceable symbol for a few years. It appears in more than one document. Should it be added to Unicode?


Probably. Maybe you can propose it.


By the original reasoning, shouldn’t every fancy illuminated character from medieval manuscripts get its own codepoint?


why isn't the artist formerly known as Prince in unicode?


Isn’t there an entire Unicode block for the symbols on the Phaistos disc? Yes: https://en.wikipedia.org/wiki/Phaistos_Disc_(Unicode_block) . I suppose those occur in quite a few documents about the disc, even though the disc itself is the only known document written in those symbols.


At the moment this character is used in many documents and databases - including comments in this thread, the article mentioned there, etc.

There could have been a good case not to include it back in 2007, but once it has been included, excluding it would break stuff.


And updating it rather than adding a new, correct one might make the current uses confusing?

Speaking of which, do we have any similar hexagonal symbol?


> One historical use in the 1400s doesn't merit a character and never did

One known and surviving use. It is possible that it exists in other places, since the vast majority of the planet's written work has not been digitized. It may also have appeared in documents that have not survived.

Just because it's not important to you does not mean it is not important.

The fact that it survived for 600 years makes it interesting and worth saving. It is infinitely unlikely that anything you do, write, or say will last that long.


Sure it's possible, but there should be a higher bar than "it's possible it's used more than once" for meriting inclusion in the standard keyboard of billions of devices worldwide.


The thing is, looking at the page, there are many other characters that were not added - the large red С-looking characters, for example. But for some "bizarre" reason, those were not included in Unicode...

Of course, the simple answer is that Unicode actually includes any character that someone cares enough to ask to be added, with rare exceptions.


> It is infinitely unlikely that anything you do, write, or say will last that long

Ouch


There are characters in unicode with 0 known usages, where we don't even know where they came from. E.g. 彁


While the origin of 彁 will never be certain, there is a good chance that it came from a misinterpretation of 彊 [1]. Why is this not an accepted theory, though? Because it is still possible that 彁 did appear in some reference source used for the standardization, and neither that source nor a source where 彊 looks like 彁 has been found.

[1] http://www.asahi.com/special/kotoba/archive2015/moji/2011082...


idk. when the word Planet was redefined such that Pluto was no longer a planet, it kind of ruined the word Planet. It suddenly wasn’t nearly as useful a word as it used to be (even though now it has a precise meaning). For most people that use the word, it won’t matter (and is actually rather exciting) that they keep discovering new planets in our solar system.

If they’d treat the word characters the same way, it would only serve to confuse and do no favors to the remaining glyphs.


This is temporary though, soon people will look at you funny if you say that Pluto is a planet - and/or they might not even have heard of it (though of course that is still worth learning about in a History of Science context).

We do NOT keep discovering new planets, rather minor planets (I agree that the term is confusing); more than a million of them have been discovered in the Solar System now, like 9007 James Bond.


It could go either way, it is not always that the scientific meaning wins out, especially not when even scientists don’t find the new definition useful.

When I think of a planet, I think of a world that has active geology that isn’t a moon (I know excluding moons is arbitrary, and perhaps I shouldn’t do that; but hey, that’s language for you). I honestly don’t care about the orbit, and I bet that when most people think about planets they aren’t thinking about the orbit either, let alone whether the planet has cleared the orbit or not. I doubt that will change.


> When I think of a planet, I think of a world that has active geology

Wouldn't that definition rule out gas giants?


Not just that, but whether Mars is still geologically active remains an open question. If you admit planets on the basis that they have a history of geological activity, then Ceres is a planet too.

I don’t think anybody considers geological activity as particularly useful for classifying things as ‘planet’ or ‘not planet’.


Why shouldn’t Ceres be a planet? If Pluto gets to be a planet then Ceres is definitely a planet.

But there is still active geology on Mars. There is still moisture, winds and ice-caps that are shaping the environment. I consider that to be geologically active.

EDIT: And there are actual experts who consider active geology (or something similar) to be what defines a planet, including Anton Petrov (https://www.youtube.com/watch?v=8-2HxrgqUnM)


Okay, but then you have to go and figure out which other asteroid and kuiper belt objects are planets.

The 'dwarf planet' distinction helps solve this! There are planets - distinctive in that they have cleared their orbits - and there are dwarf planets, which can be part of belt systems. This is a useful distinction.


Sure it is, but the distinction between terrestrial planets and gas giants is also useful; that doesn’t mean the latter aren’t planets.

I think it is fine that there are more planets than we can meaningfully count. Loads of things in our language act like that. E.g. a bug can be any number of things, and you know what a bug is just by talking about it. If some insect society then comes up with a meaningful definition of bugs which excludes spiders, that definition isn’t really doing the average user of that word any favor.


Yeah, probably, strictly speaking... But I’m not a planetary scientist. I’m merely a user of language, and I don’t need to be rigorous in my definitions. And to me the weather patterns on Jupiter are an interesting enough feature to count as geology (even though they are probably not strictly geology).


This thread on HN won't make sense in the future if the Unicode body replaces ꙮ

Make a new character!


why not make an additional eye a diacritic mark so you can just add an arbitrary number of eyes


Uff.

I'm not sure we have space for another glyph in Unicode. Looks pretty packed in here...


UTF-8 is still more than 80% empty, and could potentially be extended...


Theoretically, UTF-8 can encode up to 31 bits (U+7FFF'FFFF)[0], but for compatibility with UTF-16's surrogates, it's officially capped to 21 bits with the max being U+10'FFFF[1]. That decision was made November 2003, so there's two decades of software written with hard caps of U+10'FFFF.

[0]: https://www.rfc-editor.org/rfc/rfc2279

[1]: https://www.rfc-editor.org/rfc/rfc3629#section-3
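
You can watch the cap being enforced in practice (Python sketch; CPython rejects anything past U+10FFFF):

    print("\U0010FFFF".encode("utf-8"))  # b'\xf4\x8f\xbf\xbf', the 4-byte ceiling
    try:
        chr(0x110000)                    # one past U+10FFFF
    except ValueError as e:
        print(e)                         # chr() arg not in range(0x110000)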


My thought as well


Unicode's basic rule is that character definitions never ever change, even when enumerated erroneously.


Yes, but this is a change either way, because that codepoint's definition referred to that character. Either the reference or the description of the appearance has to change.


   ꙮ ꙮ 
  ꙮ ꙮ ꙮ 
   ꙮ ꙮ 
ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ

Make a new character. Updating the existing character ruins the meaning of all previous usages.

It's like trying to change an API. Don't disrespect your existing users. Make a new version.

(ꙮ ͜ʖꙮ)

Think of all the ASCII art this botches. That has to have some historical importance to the Unicode standards body.

(⌐ꙮ_ꙮ)

For scholarly digital (unprinted) documents where the correct character rendering matters, erroneous past usages can be trivially found with grep and a date search, and easily corrected. The domain experts will familiarize themselves with this issue and fix the problem. Don't take a shotgun to it!

This message wꙮn't have the ꙮriginally intended meaning if the characters are updated from underneath.

ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ ꙮ



So the text at that point literally talks about ‘many-eyed seraphims’. The eyes symbol is a pure gag—it seems to be spliced in place of the letter ‘о’ in the word ‘eye’ just a little down the line. (However, Old Slavonic is a tough read due to no spaces, so I'm not sure about that word. But at least it's not the Glagolitic script, which was just ridiculous and actually had multi-circle letters.)


As far as I can tell it swallows up the preceding го.

It seems to be preceded by other jocular glyphs among scribes. A regular o received a central dot when writing the word "eye": ꙩ

Words containing "two" or "both" had an o replaced by two conjoined os: ꚙ

It is only natural to carry on the in-joke for the dual and plural "eyes" like this: ꙭ

Our scribe simply got a little excited, following the pattern to its logical end point in the term "many-eyed"


It's curious that the red ink blobs behind the "eyes" aren't included in the unicode glyph either...


Looks more like a diagram in the middle of the text. It's unique. It should not be a character


It's used in place of the letter "o", so not purely a diagram but it feels like the role of a font to me, not a dedicated character.


I don't understand why this character needs to exist given that, at least according to the author, it has only been seen once in the wild, and it's semantically identical to another more widely used character.

I'm glad I'm not responsible for unicode. Clearly I have the wrong mindset for it.


Perhaps it's relevant to look at how it was introduced - as a "package deal" with many, many characters from medieval cyrillic literature, as described in this proposal https://www.unicode.org/L2/L2007/07003r-n3194r-cyrillic.pdf

It certainly made sense to include this package in Unicode, and the vast majority of those characters certainly should be in this proposal. You do have to draw the line somewhere, and obviously those close to the line will be debatable, no matter where you choose to draw it, like this particular symbol - but once you've decided that you will include the one-eyed O (small and capital) and the two-eyed O (small and capital), then putting in the many-eyed O as well to complete the set doesn't seem so far-fetched.


Surprisingly many characters in Unicode are recorded only a few times, if not once, before their assignment. Chinese characters for example have a lot of them, because it was relatively frequent to coin a new character for newborns before the modern era, and some of them have survived through literature but otherwise seen no use (e.g. 𡸫 U+21E2B only appears once in the Records of the Three Kingdoms 三國志). But they have still received code points because they are considered essential for digitization of historical works, and the multiocular O is no different.


I didn't realize that digitization of all historical works was the goal of unicode. There's plenty of space for everything. And only a few fonts out there aim for complete coverage, like Noto.

I just don't have the personal fortitude to attempt something so grandiose. Seems like a fool's errand.

Also, keep in mind there's not just one multiocular O. There's a bunch with varying numbers of eyes.


Not every goal needs to be something you can accomplish in a day or a year or even one lifetime.

There are not quite 8000 spoken languages on Earth at the moment, and a lot of them are from cultures that never invented writing. SIL has sent a missionary to most of them to learn the language, invent a writing system for it, teach it to them, and translate the New Testament into it. Most of those are fairly standard alphabets using characters from the Latin scripts, plus perhaps a few new characters or new combinations of character and diacritical. The task is large, but finite.


Imagine you’re a historian from the future studying some old document, and you spot a weird character that you’ve never seen before. Wouldn’t it be useful to be able to search for that character to see if it shows up in any other document? A simple OCR scan will bring up all the information you could ever need for that one weird symbol.


It's been seen once in the in-print wild.

There's no way to know how many since-written documents will break if a whole codepoint is dropped.


I agree with your mindset. It’s time for a unicode replacement.


I’m not sure how I feel about this. I’m not an expert by any means.

But something just doesn’t feel right when you’ve got unicode with a character with one known use from forever ago.

Doesn’t this open up the flood gates to just a ridiculous amount of work or else biased gatekeeping?

How much work would it be to implement your own font of the entire unicode set? Or is that not actually a thing and fonts implement as-desired subsets?


There are quite a few such characters in Unicode because academic articles about things like cuneiform need to be digitized too. And because the historical record is so sparse, we often have vanishingly few, or only one example of a character, and perhaps no way to know if it was a misprint or a real character.

Actually this character seems like a scribe's joke, no different from the illustrated characters at the beginning of medieval paragraphs (all of which are represented in Unicode as A, B or whatever). But the point still holds.

It even holds for modern languages -- consider the ghost characters needed for round trip compatibility: https://weekly-geekly.imtqy.com/articles/418717/index.html

(actually cuneiform is a poor example; perhaps Linear A would have been a better example)


Why don't those digitized articles just use images? They can have any variant of any glyph they want to document.


It's not just the articles, it's digitization of the texts themselves and email conversations. Using characters offers the opportunity to do computational textual analysis (this allows you to do substitutions first, by replacing this character with 'o' -- much harder on a bunch of tiny images).

Plus there's no shortage of space in the Unicode address space.


> How much work would it be to implement your own font of the entire unicode set? Or is that not actually a thing and fonts implement as-desired subsets?

You can't, and you are not expected to do so. You are limited by the OpenType limit (65,535 glyphs), various shaping rules that possibly increase the number of required glyphs, and the lack of any one local or historical typographic convention. Your best bet is either to recruit a large number of experts (e.g. Google Noto fonts) or to significantly sacrifice quality (e.g. GNU Unifont).
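
To see where that ceiling comes from, a Python sketch using the third-party fontTools library (the font path is hypothetical): the glyph count is stored in the font's 'maxp' table as an unsigned 16-bit integer, hence 65,535.

    from fontTools.ttLib import TTFont

    font = TTFont("SomeBigFont.ttf")  # hypothetical font file
    print(font["maxp"].numGlyphs)     # a uint16, so at most 65535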


A single OpenType font file is limited to 65,535 glyphs. Nothing stops your font from being implemented as a series of .otf files (besides what people think of as a "font" when it comes to usage on computers).

But yes, time constraints are the limiting factor. I don't think anyone is going to dedicate their entire life to making a single font.


While you are right that one logical font can consist of multiple font files (or possibly an OpenType collection), this constraint does affect most typical fonts, and wide-coverage CJK fonts in particular already hit this limit. Fonts supporting only one of Chinese, Japanese and Korean don't need that many glyphs, and probably even two of them will be okay, but fonts with all three sets of glyphs won't. It is therefore common to provide three versions of such fonts, all differently named.


You could also go the shady route and just make a font out of all the "reference character sheets" that the Unicode site has. Probably not legal and the result would not be pleasant to read, but that's one way to create a font containing all of Unicode.


I wasn’t aware of the 2^16 limitation. Thank you for the notes!


I'll tell you more: there are Unicode glyphs without known usage.


And then there are the ghost characters, which are known never to have been used.


I love this character and I love the fact that it is being updated. Just to get this right: at some point some person chose to doodle the letter instead of writing it the correct way, and now we have a corresponding Unicode character? Sort of amazing, and it also makes you think ...


There was a... "tradition" is a strong word, perhaps "trend" is better, among authors making copies of the Bible or related works in Cyrillic: the letter O (equivalent to Roman O) at the beginning of the word for "eye" would be stylized to look like an eye. There are a variety of glyphs along these lines: Ꙩ, Ꙫ, Ꙭ. All of them, including ꙮ, were added to Unicode as a single group.

The glyph "ꙮ" was used to refer to an Angel with a whole buncha eyeballs, as one does. In terms of texts that survive today, this specific glyph has exactly one use in a single manuscript from the 1400's. It might have been used more, in texts which don't survive. But it is part of a larger trend, and I bet that its inclusion in Unicode depends strongly on that.

But yeah, in itself the ꙮ character exists solely so that modern computers are capable of a more-faithful rendition of the transcription of a single handwritten copy of the Book of Psalms.


Thank you for describing the missing context. I couldn't understand why this stylized letter deserved a code point more than the uncountable others. I don't necessarily agree still, but the fact that this character was only unique within a larger trend makes it much more reasonable.


Hah, and here I thought I was making a joke when I called it a biblically accurate O!


> modern computers are capable of a more-faithful rendition of the transcription of a single handwritten copy of the Book of Psalms.

I wonder if there is even a copy of the book transcribed to actual characters or if it only exists as scanned PDF copies? If anyone did transcribe it, would they have any knowledge that the ꙮ character even exists on computers?


So you are saying that the glyph is now more biblically accurate?


The Bible doesn't specify how many eyes seraphim have.

"In the center, around the throne, were four living creatures, and they were covered with eyes, in front and in back. ... Each of the four living creatures had six wings and was covered with eyes all around, even under its wings."


"meme"


I attended a Unicode meeting (or maybe two? not sure?) and came away with the impression that Unicode is like those open source projects that are used by half of the world and maintained by a handful of skilled and benevolent people.

In Unicode's case I think most of them are paid, at least.


That is what I understood too. It doesn’t seem particularly hard to add new letters to Unicode too if you try a bit.

However, that is a bit harder with emoji, which have their own subcommittee and seem to be more bureaucratic and also more popular than the rest of Unicode. Everyone wants to make a new emoji.


It does raise interesting questions about what counts as decoration/formatting and what counts as part of the actual text. You could view these ocular O characters as purely decorative (like the fancy first character in a paragraph), but they could also be seen as a quirk of spelling which should be represented in unicode.

But the multiocular O really does seem like one monk got bored one time and did some doodling.


This is not exactly a correct description. Unicode does not specify the appearance of characters, only their meaning. It seems what’s changed is the reference presentation of the character in the Unicode tables, not the character itself. Unicode goes to great lengths to preserve backwards compatibility so changing the meaning of a code point would violate that principle. Your OS or application providing Unicode 15.0.0 support will not change the appearance of U+A66E. The appearance is dependent on the font.
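
A Python sketch of the point: what Unicode defines is the code point's identity (name and properties), and that is unchanged by the chart update:

    import unicodedata
    print(unicodedata.name("\uA66E"))  # CYRILLIC LETTER MULTIOCULAR O, as before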


Wait a minute, how will we refer to the old glyph in the future? Once this is updated the articles such as this one will have the new shape.


There was a joke that U+A66E should retain seven eyes and further eyes should be added with a ZWJ sequence [1]. If that character had somehow got very popular in modern texts, updating its glyph could have caused an interoperability problem, so such a solution would have been needed. But that didn't happen, so the glyph itself has been updated instead.

[1] https://twitter.com/BabelStone/status/1323440365429542919


"The character formely known as U+A66E"


If you open the proposal [0] it kinda just looks like someone doodled some flowers on the text rather than actually using a particular letter. And given it's the ONLY existing record of this letter, it's very suspect, isn't it?

[0]: https://www.unicode.org/wg2/docs/n5170-multiocular-o.pdf


It's in place of "goo" in "mnogoočimi" (many-eyed) in the phrase "many-eyed seraphims", so it at least makes sense.


my Old Church Slavonic is pretty rusty (well, nonexistent), but "mnogo" looks like modern Russian много (many), and the -imi I guess would be instrumental plural like -ими? but Russian for "eye" is глаз or око. I'm guessing oč -> око, and it's a compound word? or is the č an infix, something like "ogo" is eye, and mnogoočimi is such because the two -og-s (one from mnog and the other from go) fuse because "mnogoögočimi" would be awkward to pronounce?


Pretty sure it is mnogo-oči-tii. The word "oči" still means "eyes" (although mostly in poetry).


Here's the original tweet where the discrepancy was noticed in 2020, and a photograph of a page inside the book where it's used:

https://twitter.com/etiennefd/status/1322673792452354048


Related thread, about non-existent CJK characters ending up in Unicode through transcription mistakes ("ghost characters"):

https://news.ycombinator.com/item?id=32095502 ("A Spectre Is Haunting Unicode", 180 comments)

edit to add: The top thread in the 2020 repost was about ꙮ,

https://news.ycombinator.com/item?id=24955536


So it's a Unicode character that represents a... blob with 10 eyes?

Hordes of Wizards of the Coast lawyers getting ready for the big fight


Nah, Beholders have 11 eyes, so we're good here.


I feel like the spelling should be updated to Behꙮlders, or better yet, BehꙨꙮlders, to reflect that (of course, this would only make sense once the glyph update actually hits).



(┛ꙮДꙮ)┛彡┻━┻


  ꙮ>
 ===
The James Webb Space Telescope.


┬─┬ノ( ꙮ _ ꙮノ)


Be not afraid


Bee not afraid?


Bee Nut Afraid.

(When an apiarist is terrified.)


cant unsee


Don't worry, you'll forget about this one when it gets six more eyes.


> Unicode character “ꙮ” (U+A66E) is being updated

I fear this will lead to a lot of "bug fixes and performance improvements" in Android. /s


From the tweet’s image:

> written in an extinct language, Old Church Slavonic

It’s absolutely not extinct and is used by the Eastern Orthodox Church in their religious texts almost exclusively. It’s taught to children alongside their Sunday school curriculum and, of course, in seminaries.


Generally languages with only liturgical usage are not considered “living” languages, just as the Latin of the Catholic Church is still considered a “dead” language.


Too bad I have to adjust my business cards for ꙮ.world


Sadly, ꙮ is not eligible for engraving by Apple on AirPods.


As of right now, it's available for "adoption": https://www.unicode.org/consortium/adopt-a-character.html


Unicode can be ridiculous at times. It contains a character used once in a single manuscript in an extinct language, but not a standardized glyph for an external URL link.


This kind of stupid thing is my problem with Unicode. We have all this baggage for stuff that nobody uses, and we need to deal with it forever. The worst for me is that there is no possible way to encode a grapheme cluster at a constant size, so using Unicode makes it impossible to have simple character access like an old-style C string, no matter how big you make your char, even though it's totally possible with damn near every language that people actually use.

So then we all end up paying this massive complexity tax everywhere to support some Mongolian script that died out 200 years ago (or multi-codepoint encodings of simple things like é - just why, it was so avoidable).
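
The é case, in a quick Python sketch (two encodings, one grapheme):

    import unicodedata
    precomposed = "\u00E9"   # é as a single code point
    combining   = "e\u0301"  # e + U+0301 COMBINING ACUTE ACCENT
    print(len(precomposed), len(combining))                        # 1 2
    print(unicodedata.normalize("NFC", combining) == precomposed)  # True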


> encode a grapheme cluster as a constant size […] totally possible with damn near every language that people actually use

This is not true. For a concrete example: the languages Hindi and Marathi, with ~500 million speakers, use the Devanagari script (also used by Nepali and Sanskrit), in which a grapheme cluster is (usually) a sequence of consonants followed by a vowel. For instance, something like "bhuktvā" (भुक्त्वा) would be two grapheme clusters, one (भु) for "bhu" and one (क्त्वा) for "ktvā". In Unicode each vowel and consonant (here, bh, u, k, t, v, ā) is separately encoded, which is the only reasonable thing to do, and inevitably means that grapheme clusters can have different lengths (number of code points). The alternative would have been to encode every possible (sequence of consonants + vowel) as a single codepoint, which gets ridiculous quickly: these sequences can be up to 5 consonants long, so you'd end up having to encode (33^5 * 13 ≈ 500M) codepoints for Devanagari alone (or completely prevent certain sequences of consonants from being expressed, which makes no sense either), not to mention that most of the scripts of the Indian subcontinent and south-east Asia follow the same principle and have similar issues (e.g. Bengali with 250M speakers, Telugu, Javanese, Punjabi, Kannada, Gujarati, Thai with over 50M speakers each, etc).

(See chapters 12–17 of the Unicode standard, currently version 15: https://www.unicode.org/versions/Unicode15.0.0/ch12.pdf)
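
To make the भुक्त्वा example concrete, a Python sketch listing its code points:

    import unicodedata
    # bhuktvā: two grapheme clusters, but eight code points
    for ch in "\u092D\u0941\u0915\u094D\u0924\u094D\u0935\u093E":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # BHA, VOWEL SIGN U, KA, VIRAMA, TA, VIRAMA, VA, VOWEL SIGN AA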


Have you ever written software before Unicode? We had N different encodings for each language, each culture, each country. There were all kinds of bugs creeping up, and software that works perfectly well could be buggy for one random language. Unicode abstracted all of this away from the programmer in a pretty simple fashion. I simply do not see how we're paying the "complexity tax" by using Unicode, unless you're writing a library that handles Unicode (which you shouldn't do, you should use existing libraries) you don't need to know anything about Unicode.


Before Unicode, everyone who came up with a character encoding scheme probably thought their system was good enough for any reasonable use-case. But they all had limitations that made them inadequate for things less obscure than representing some dead Mongolian language.

It would be nice if we could come up with some magical system that optimally encodes all the text that "matters" and ignores everything else, but history has shown that to be very hard. So we're left with Unicode, which takes the approach of giving us (effectively) infinite code points to represent characters, with (effectively) infinite ways to visually represent them. That does lead to a bunch of "unnecessary" baggage and headaches, but it also solves a bunch of real problems that you probably don't know exist.

Unicode is a pain in the ass, but it's a solution to a very hard problem. You can feel free to design your own solution, but you'll probably run head-first into all the problems Unicode was trying to solve from 40 years ago.


I hear you. I loathe working with Unicode for this exact reason. It's a bit of a nightmare due to its complexity.

That said, what it's trying to do is enormously complex.


I'm getting the impression that this is only "obvious" from a Latin-Cyrillic-Greek alphabet point of view?

P.S.: Also, even for those, it would seem that one of the big reasons things like combining characters were added to Unicode was to be backwards compatible even with mutually incompatible encodings?


Your notion of character doesn't necessarily match others, and there are many cases where the number of possible "characters" in some notion is unbounded. Unicode provides a very well-defined superset of those notions for you. Collecting characters is only a minor portion of their jobs.



Being stuck on macOS Catalina with Unicode 12, I think there is a way to upgrade to newer versions and get new emoji support [1][2]

[1] https://apple.stackexchange.com/questions/278937/is-there-a-... [2] https://forums.macrumors.com/threads/updating-maverickss-emo...


Meanwhile I check back every now and again on MUFI (Medieval Unicode Font Initiative) [1] and it's still not in.

[1]: https://mufi.info


Am I alone in thinking that this is not so much a separate character, as a doodle a bored monk made to relieve a tiny bit of the tedium of copying manuscripts?


And its new official name shall be the Trypophobigon.


Is there any end to this? E.g., why not include Galileo's pictograms of Saturn as seen here: https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=...


Galileo's pictograms aren't being used as a letter, though: they're explicitly a diagram (I believe he says "like this: <picture>").


Crazy that it renders in HN comments (which rejects a lot of Unicode) ꙮ


I'm always happy to see some esoteric unicode updates


They should put in a few additional eyes as hot spares.


That character triggers my trypophobia.


I was astonished not to see this mentioned at all when I saw the post earlier! Almost commented about it myself but I wanted to think about something else.


Thank heavens. I was always so annoyed the way my multiocular O's came out, especially on 4chan


But how do we opt out of the update!


Use a font that contains the previous glyph. This is just an update to the reference glyph, and there is nothing preventing you from using a font that has an upside-down A in the place of U+0041.


Cut & paste from the pdf: многочитїй.

Hmmm. Well, on my screen, it’s lips/kiss. A Unicode fail.


This is similar to "man in business suit levitating" emoji.

How does this stuff make it to Unicode?!


Levitating man is just a Unicode encoding of an old Webdings (or Wingdings?) glyph.

There was an accepted proposal to add many Wingdings and Webdings characters as Unicode code points. Thus, levitating man in a suit.


I miss the good old days when character sets didn't feel the need for annual updates.


Finally!


Personally I'm waiting for Kaktovik numerals in unicode.


Aren't they there already? 𝋀𝋁𝋂𝋃𝋄𝋅𝋆𝋇𝋈𝋉𝋊𝋋𝋌𝋍𝋎𝋏𝋐𝋑𝋒𝋓 Or do you mean waiting for more fonts to support them?
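
They were added in Unicode 15.0, with the twenty digits at U+1D2C0 through U+1D2D3, so a one-line Python sketch prints them all:

    print(" ".join(chr(cp) for cp in range(0x1D2C0, 0x1D2D4)))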


Which fonts do support them?


And yet no word on adding Cistercian numerals?


ꙮ I'll check back in the future.


Since this link list https://slatestarcodex.com/2017/05/09/links-517-rip-van-link... , the character has become somewhat of a meme in the appropriate circles. Not surprised it's getting some love at last.


Biblically accurate O?


[flagged]


There's an emoji for handgun, but Apple and other big tech decided it needed to be a water gun. There is also a rifle character intended to represent the sport of shooting in a pentathlon, but again Apple threw its weight around and, while the character became codified in Unicode, it never became an emoji and no font from big tech supports it.


I guess because the goal of Unicode is to be able to represent every character that's appeared in language. This one is in a published book, while guns and a sexual intercourse symbol aren't.

Emoji was a weird value-add that Japanese mobile providers added to their phones before Unicode. To get them to move to Unicode, they had to keep them. That's why there's a Tokyo Tower emoji, but not an Eiffel Tower. That's why the post office emoji has a 〒 on it. That people get any use out of emoji outside of Japan is really pure luck.


I've even heard emoji referred to as "the carrot that keeps the implementations current." Every time a new version of Unicode is published, a few more emoji are tacked on. It acts as incentive for all the cellphone carriers and such to put the money into updating their implementations, because nobody wants to be the one on the block with the one phone that can't render "Mirror Ball" .

(ETA: LOL, Hacker News drops "Mirror Ball" https://emojipedia.org/mirror-ball/ from the comment when you post)


Incidentally, Windows doesn't have the mirror ball. I guess it is a carrot to get me to upgrade to Windows 11, which I am skipping. (The key with Windows is to only use the good versions; XP, 7, 10, ???. Hoping ??? arrives soon ;)


It's not in Win11 yet.


I believe the majority of emoji do not work on hacker news.


That seems actually logical when you consider that kanji presumably began as simple depictions of objects that could be drawn quickly. Perhaps the only difference between emoji and kanji is time.


We do have a Unicode character for a gun: U+1F52B PISTOL. Most fonts that have it choose to style it as a water gun, though.


There are hieroglyph dicks in unicode, see U+130B8.


I even posted phallus with emission in my comment above.

I can see it on latest iOS, but not on Windows 10 + Chrome.


I want to be that person that has so much time on their hands they can afford to waste it on pointless things like this.


There's a career path to get there. It involves becoming someone who cares deeply about the ways and means of digitizing data stored in analog media. Drill down deep enough, and you'll find yourself in a fascinating world of encoding errors.

There are things like the "ghost characters," which are codepoints in Japanese that map to characters that were basically transcription errors when the team was putting together a full set of Kanji. Some characters with an extra horizontal line snuck into the set; they were likely caused by a transcription error because the character got split onto two pieces of paper by lines of text being copy-pasted into a records book, and the shadow cast by the thin extra layer of paper was misinterpreted as another stroke.

https://www.dampfkraft.com/ghost-characters.html


And then people wonder why software developers don't care to support Unicode properly. The first 60,000+ characters made sense, then a few more were needed, and Unicode suddenly got to play with 1,000,000+ and just went off the rails.


You can support Unicode without ever having to display all possible characters "correctly".


I think the big issue with Unicode is that it is centralized and there are politics about what characters get included (see Klingon)

I think I have a solution to decentralize Unicode:

1. Extend Unicode to 128-bits. We can still use UTF-8 variable length encoding which will limit the real size.

2. Use a blockchain to coordinate the characters. That way whoever wants to add a character can do it without gatekeeping.

These simple suggestions will go a long way in making Unicode less centralized.



