
"Han unification": in Unicode, many Japanese characters share a code point with Chinese characters whose glyphs look different (and, subjectively, ugly). The Unicode Consortium's answer is that you're supposed to use a different font or something when displaying Japanese, which is pretty unsatisfying. For example, if you want a block of text that contains both Japanese and Chinese, you can't represent it as just a Unicode string; it has to be some kind of rope of segments, each with its own font. At that point, frankly, you might as well go back to bytes-with-encoding, which at least breaks very clearly and visibly if you get it wrong.


You can use the deprecated language tag characters (U+E0001 LANGUAGE TAG followed by tag characters from the Tags block) to distinguish unified code points. They are unlikely to be well supported, but they are there.
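For the curious, here is a rough Python sketch of what such a tagged string might look like. It assumes the Tags block layout, where U+E0020..U+E007E mirror printable ASCII, and that the pair U+E0001 U+E007F cancels a language tag. The helper names (`lang_tag`, `tagged`, `strip_tags`) are made up for illustration, and nothing here implies that any renderer will actually honor the tags.

```python
# Sketch only: builds language-tagged strings with the deprecated
# Unicode tag characters. Helper names are hypothetical.

LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG (deprecated)
CANCEL_TAG = "\U000E007F"    # U+E007F CANCEL TAG
TAG_BASE = 0xE0000           # tag character = TAG_BASE + ASCII code


def lang_tag(code: str) -> str:
    """Encode a printable-ASCII language code such as 'ja' as a tag sequence."""
    if not all(0x20 <= ord(c) <= 0x7E for c in code):
        raise ValueError("language code must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_BASE + ord(c)) for c in code)


def tagged(text: str, code: str) -> str:
    """Wrap text in a language tag and cancel the tag afterwards."""
    return lang_tag(code) + text + LANGUAGE_TAG + CANCEL_TAG


def strip_tags(text: str) -> str:
    """Drop every character in the Tags block (U+E0000..U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


# U+76F4 is a commonly cited unified ideograph whose Japanese and
# Chinese glyphs differ; the code point is the same in both segments.
mixed = tagged("\u76f4", "ja") + tagged("\u76f4", "zh")

print([f"U+{ord(c):04X}" for c in lang_tag("ja")])
# ['U+E0001', 'U+E006A', 'U+E0061']
```

Note that the tags are invisible, so `strip_tags(mixed)` gives back two identical characters. That is roughly the original complaint: without the out-of-band language information, the plain string cannot tell the two apart.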



