Read it on the internet (if you can)!

“So many boxes to be found on the internet!”

I’m guessing that you’ve seen something like this, perhaps on Facebook or Twitter, in email, etc. Right? And perhaps you know that the boxes represent characters that are missing from the fonts on your device.

But fonts are only part of the issue. What’s also going on is that your device may be missing more than a font; it might just not know how to deal with a particular writing system, regardless of the font in which it is written.
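If you're curious which writing systems are hiding in a piece of text, here is a minimal sketch in Python. It rests on an assumption of mine: the first word of a character's official Unicode name (for example, TELUGU in "TELUGU LETTER TA") is a good-enough stand-in for its script. That is a rough heuristic rather than the real Unicode script property, and the function names `script_guess` and `scripts_in` are made up for illustration.

```python
import unicodedata


def script_guess(ch):
    """Rough script label: the first word of the character's Unicode name.

    This is a heuristic (an assumption for illustration), not the official
    Unicode Script property, which the standard library does not expose.
    """
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def scripts_in(text):
    """Return the sorted rough script labels of the letters in a string."""
    # str.isalpha() skips spaces, digits, punctuation, and combining marks.
    return sorted({script_guess(ch) for ch in text if ch.isalpha()})


sample = "Read తెలుగు and हिन्दी"
print(scripts_in(sample))
```

Knowing which script a character belongs to is only the first step. Whether your device can actually display it still depends on having both a suitable font and software that knows how to shape that script.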

So how many different writing systems are there in the world? Which ones can you name? If this is not one of your areas of expertise, you might start with Latin (which you possibly call English or maybe Roman), Greek, Hebrew, Arabic, Cyrillic (which you probably call Russian), Chinese, and then… you run out of steam. That’s six. If you know a bit more about writing systems, you’ll add Japanese, perhaps Korean and Thai (if you have good taste in restaurants and pay attention to the scripts), Devanagari (which you will recognize, probably also from restaurants, even if you don’t know the name), and… well, that’s about it for the unwashed masses. So maybe ten.

It turns out that you’re just scratching the surface. There are literally hundreds of different writing systems.

You can see all of them on the internet. Check out Wikipedia, for instance.

Well, no, actually. That’s wrong.

You can find images of practically any writing system, but you can’t read or write in them without font support. Check out Zachary Scheuren’s 20-minute talk at TypeCon, which includes a mesmerizing slide show of 143 different writing systems. Most of them are surprisingly beautiful in their own ways, and together they represent only slightly more than half of the world’s scripts. The slide show portion goes by much too fast, but that’s kind of the point: you’re supposed to get a rapid overview, not a lesson in specific scripts. I hope it will whet your appetite for more. The quotation in the first line of this post comes from Scheuren’s talk and illustrates the problem.

There are some important and interesting tidbits in the rest of the talk. For instance, the mystery of why a text message in Telugu script, just text, crashes an iPhone! (Don’t tell me you haven’t heard of Telugu. It’s spoken by roughly eighty million people, more than ten times the population of Massachusetts, so don’t ignore it.) And the inability of most software to render Arabic and Devanagari correctly, even though those are two of the most widely used scripts in the world. So go watch that short video. But first… Can you recognize these four scripts, and can you figure out how two of them are typeset with significant errors? You don’t have to know the languages, just a bit about the scripts in which they are written.
