
The problem is not with the characters themselves, though; it's with how CoreText processes and parses the Unicode symbols.

Don't get me wrong, I know fonts can even be malicious... but given the history[0] here with these Unicode[1] issues, I think it's pretty safe to say the issue is not with a specific font per se, but with CoreText and Unicode parsing. For example, when I ran OS X on my old MacBook Pro, I used open source fonts in my terminal, and this Unicode parsing bug still happened.

In 2015, an Apple spokesperson had this to say: "We are aware of an iMessage issue caused by a specific series of unicode characters and we will make a fix available in a software update."

> BOM or endianness don't seem to be relevant to this bug.

Yet on your blog you allude to just that with the left/right comments, though, to be fair, you state that you really don't know the problem:

> I don’t really have one guess as to what’s going on here – I’d love to see what people think – but my current guess is that the “affinity” of the virama to the left instead of the right confuses the algorithm that handles ZWNJs after viramas into thinking the ZWNJ applies to the virama (it doesn’t, there’s a consonant in between), and this leads to some numbers not matching up and causing a buffer overflow or something.

This is claimed to be a dissection of the issue, but there is not even a stack trace present, and yet you joke about that...

> Yes, I could attach a debugger to the crashing process and investigate that instead, but that’s no fun

Nah, you should do that... and you'll likely see that it's CoreText being the same old piece of shit as usual. If it were a font problem, then loading that font on a different system that uses a different rendering engine should reproduce the same problem. It doesn't; I tried that years ago.

Sure, this could potentially be a whole different bug, but given the history and the repeated failed attempts to fix this entire class of issues, it's safe to say that iOS and OS X do not handle Unicode very well.

[0] https://www.theregister.co.uk/2013/09/04/unicode_of_death_cr...

[1] https://www.theregister.co.uk/2015/05/27/text_message_unicod...



I'm not talking about malicious fonts, but font stacks are complicated regardless of the encoding being fed in.

You seem to think I'm blaming it on the font. I'm not. I'm blaming it on the font stack (CoreText).

I avoided using the term "Unicode" to refer to a bug in the font stack because font stack bugs don't always have to be specific to Unicode.

You keep saying "unicode parsing"; that's a meaningless term in this context.

> Yet on your blog you allude to just that with the left/right comments, though, to be fair, you state that you really don't know the problem:

That has nothing to do with endianness or BOM; it's a totally different kind of ordering. It's the ordering of the code points, not the byte order of the code units.
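To make the distinction concrete, here is a rough Python sketch. The three-character sequence (a Telugu consonant, the Telugu virama, and a ZWNJ) is only an illustration I picked, not the crashing string, and the helper names are made up for this example. Endianness changes the bytes inside each UTF-16 code unit, while the order of the code points stays the same:

```python
# Illustrative sequence only (an assumption, NOT the crashing string):
# TELUGU LETTER KA, TELUGU SIGN VIRAMA, ZERO WIDTH NON-JOINER.
text = "\u0c15\u0c4d\u200c"


def code_points(s):
    """Return the code points of s, in logical order."""
    return [f"U+{ord(ch):04X}" for ch in s]


def utf16_code_units(s, byteorder):
    """Return the UTF-16 code units of s as hex strings of their raw bytes.

    byteorder is "little" or "big"; no BOM is emitted by these codecs.
    """
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    raw = s.encode(codec)
    return [raw[i:i + 2].hex() for i in range(0, len(raw), 2)]


# Code point order: independent of any encoding.
print(code_points(text))

# Byte order inside each code unit: this is all that endianness changes.
print(utf16_code_units(text, "little"))
print(utf16_code_units(text, "big"))

# Both byte orders decode back to the same code point sequence.
same = text.encode("utf-16-le").decode("utf-16-le") == \
    text.encode("utf-16-be").decode("utf-16-be")
print(same)
```

The bytes within each unit flip between the two encodings, but the virama is still the second code point and the ZWNJ the third either way. The "left/right" question in the blog post is about that second kind of ordering.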

> This is claimed to be a dissection of the issue, but there is not even a stack trace present, and yet you joke about that...

It's a dissection of the string, not a full debugging of the issue.



