The past few months I’ve been hitting the kanji study hard. Between the two web-apps I use, I’ve regularly been reviewing at least a couple of hundred items a day. What this has really driven home for me is that QWERTY is just not a great fit for typing Japanese. Sure, it does the job, but it feels a little… off. Like wearing your belt one hole too tight. Off-balance almost.
So I began looking for reasons as to why this would be.
How do you type in Japanese?
Nowadays there are two main methods for typing in Japanese. Both require an IME (input method editor), a piece of software which turns your keystrokes into kana or kanji.
The first, and by far the most common, method (for both Japanese and English speakers) involves typing in a romanized version of a Japanese word. The romanization is automatically converted by the IME into hiragana. At this point the user can choose to select a kanji (or series of kanji) to replace the hiragana.
To illustrate, say I wanted to enter the word for ‘soy sauce’, shōyu.
- First, I’d type the romanization: shouyu.
- This would be converted into: しょうゆ.
- If I was happy just typing the kana I could hit Enter and my cursor would move on.
- Alternatively I could cycle through the list of kanji that match this reading by hitting Space and selecting the correct characters, in this case 醤油, and then press Enter.
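The steps above can be sketched in a few lines of Python. This is only a toy: the lookup table and the candidate list are made up for this one word, and a real IME uses a far larger table and a proper dictionary. The names `ROMAJI_TO_KANA`, `KANJI_CANDIDATES` and `to_hiragana` are my own, not part of any IME.

```python
# Tiny, hand-picked romaji -> hiragana table (an assumption for illustration;
# a real IME ships a far larger table).
ROMAJI_TO_KANA = {
    "sho": "しょ",
    "shi": "し",
    "yu": "ゆ",
    "yo": "よ",
    "u": "う",
    "o": "お",
}

# Hypothetical candidate list, keyed by kana reading.
KANJI_CANDIDATES = {"しょうゆ": ["醤油"]}


def to_hiragana(romaji: str) -> str:
    """Convert romaji to hiragana by greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(romaji):
        # Try the longest chunk first so "sho" wins over shorter matches.
        for length in (3, 2, 1):
            chunk = romaji[i:i + length]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no kana for input at position {i}: {romaji[i:]!r}")
    return "".join(out)


kana = to_hiragana("shouyu")                 # what you see before pressing Space
candidates = KANJI_CANDIDATES.get(kana, [])  # what Space would cycle through
print(kana, candidates)
```

Pressing Enter straight away corresponds to keeping `kana`; pressing Space first corresponds to picking an entry from `candidates`.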
The second method is largely the same except that there’s no conversion from the romanization to hiragana. You type directly in kana. Since Japanese has a syllabic, rather than an alphabetic, writing system, this requires a completely different layout from QWERTY.
In this layout each syllable, e.g. ka, has its own key. Because there are more syllables in the Japanese syllabary than letters in the Latin alphabet, the keys used for numerals on a standard US keyboard get taken over by kana. A separate key is used to mark voicing. Also notice the tall Enter key and small Backspace (the latter being one of my pet peeves), which make room for more kana.
So if everyone uses QWERTY, what’s so bad about it?
From my personal experience, it feels like your right hand does too much work.
Let’s compare what characters each hand is typing on QWERTY.
Right hand: Y U I O P H J K L N M (B)
Left hand: Q W E R T A S D F G Z X C V (B)
Now consider what consonants above are actually used in romanization. Even before taking into account character frequency, things certainly look weighted towards old righty. On the right-hand side, every key except l is used in typing Japanese. The situation is very different on the left. To illustrate, look at the unused keys for each hand.
Right hand: L
Left hand: Q (F) X (C) V
(The characters in parentheses can be used to type fu and chi respectively, though these sounds are normally typed H-U and T-I)
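The same comparison can be written as a quick set operation. This is a sketch that takes the hand assignments and the unused letters straight from the lists above. B is left out because it is listed for both hands, and the names (`NEVER_USED`, `OPTIONAL`, `unused_keys`) are my own.

```python
# Hand assignments as listed above (B omitted: it is shared between hands).
RIGHT = set("YUIOPHJKLNM")
LEFT = set("QWERTASDFGZXCV")

# Letters the romanization never needs, and letters it can do without
# (F for fu, C for chi), both taken from the lists above.
NEVER_USED = set("LQXV")
OPTIONAL = set("FC")


def unused_keys(hand: set, count_optional: bool = True) -> list:
    """Return the letters on one hand that romanized input can skip."""
    dead = NEVER_USED | (OPTIONAL if count_optional else set())
    return sorted(hand & dead)


print("Right hand:", unused_keys(RIGHT))
print("Left hand: ", unused_keys(LEFT))
print("Left hand, keeping F and C:", unused_keys(LEFT, count_optional=False))
```

Either way you count F and C, the left hand ends up with more idle keys than the right.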
Setting aside the inefficiency of having unused keys in the first place, there’s more dead space on the left-hand side.
But that’s a very crude metric. Character frequency also needs to be considered. Tamaoka and Makioka (2004) looked at the frequency of phonemes, morae and syllables in the Japanese newspaper Asahi between 1985 and 1998. Using their data, it’s possible to get a rough idea of how often phonemes, and their corresponding roman letters, occur. There’s quite a lot to unpack in this paper – it’s worth revisiting in a future post – but I’ll briefly discuss some key findings below in relation to typing.
Japanese has a five-vowel system. Three of the vowels (/u/, /i/, /o/) are typed with the right hand. Two vowels are typed with the left (/a/, /e/). Tamaoka and Makioka found that each vowel, occurring either as its own syllable/mora or as part of a consonant + vowel cluster, accounted for roughly 22% of vowel tokens, with the exception of /e/, which accounted for only 11%.
So your left hand types just two of the five vowels, and one of those two, e, occurs only half as frequently as each of the others.
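To put a rough number on that, here is the arithmetic using the rounded percentages quoted above (about 22% for each vowel, 11% for /e/). These are approximations of the paper's figures, not the exact values, and they sum to 99 rather than 100, so the sketch normalizes by the total.

```python
# Approximate share of vowel tokens per vowel, in percent (rounded figures
# quoted above, not exact values from the paper).
VOWEL_SHARE = {"a": 22, "i": 22, "u": 22, "e": 11, "o": 22}
RIGHT_HAND_VOWELS = {"u", "i", "o"}

total = sum(VOWEL_SHARE.values())  # 99 because of rounding
right = sum(share for vowel, share in VOWEL_SHARE.items()
            if vowel in RIGHT_HAND_VOWELS) / total
left = 1 - right

print(f"right hand: {right:.0%} of vowel tokens, left hand: {left:.0%}")
```

On these figures the right hand handles about two thirds of all vowel keystrokes.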
The easiest way to present these frequencies is through tables. Below you can see the token frequency of each consonant (in percent) along with the hand it would be typed with using QWERTY.
To get a better idea of how much each hand is doing, I’ve also split the table in two, based on the hand used, and tallied the frequencies.
The takeaway here is that, of the consonants listed above, the left hand types more than the right, both for types, i.e. there’s a greater number of different consonants on the left, and for tokens, i.e. the total number of instances of left hand consonants is greater in their corpus. Obviously the token frequencies run contrary to my intuitions. However, as will be discussed soon, the above frequencies don’t tell the whole story.
Semivowels + Contracted sounds
Here “semivowels” refers to the consonants at the beginning of the syllables wa, ya, yu, yo (/w/ and /j/), whereas “contracted sounds” refers to syllables with onsets of C + /j/, e.g. gya, gyu, gyo.
Overall, these sounds account for a much smaller proportion of syllables than CV clusters, so I won’t go into too much detail. However, it is worth noting that, barring wa, the romanization for each of these syllables must contain one of the following: j, as in ju, h, as in shu, or y, as in yu. All of these are typed with your right hand.
As a result, the consonant frequency tables presented above are a little more skewed to the left than they should be, as they don’t account for the various syllables containing /j/, either as ya, yu and yo or as the “contracted sound” in syllables like gya, gyu, gyo.
So, am I right?
Well, it’s complicated.
There’s more wasted space on the left-hand side of the keyboard, but the left hand consonants occur more frequently than the right hand ones. Muddying the waters is the semivowel /j/, which is always typed with the right hand, but doesn’t feature in the consonant tables above. Simply put, the distribution of consonants is less slanted to the left than the tables alone suggest. And naturally, part of the reason the right hand appears to be doing less based on the consonant tables alone is that much of the prime real estate on the right is taken up by the vowels U, I and O.
Moreover, the nasal /n/ hasn’t yet been discussed. This is typed N, using the right hand. Importantly, this key often has to be double-tapped to tell the IME you want to type only N and not a syllable like ni or nya. For example, the key sequence for typing the word shinya (syllabified as しんや shin-ya) ‘late-night’ would be: S-I-N-N-Y-A. Typing S-I-N-Y-A would yield the same romanization, shinya, but with different syllables, namely しにゃ shi-nya.
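The difference between the two key sequences is easy to reproduce with a toy converter. This is a sketch under simplifying assumptions: the table only covers this one example, and it only emits ん for a doubled N, whereas real IMEs also accept a single N before most consonants. `KANA_TABLE` and `convert` are hypothetical names.

```python
# Minimal table covering only the shinya example (an assumption; real IME
# tables are much larger and also turn a lone "n" before a consonant into ん).
KANA_TABLE = {
    "si": "し",
    "nn": "ん",
    "nya": "にゃ",
    "ni": "に",
    "ya": "や",
}


def convert(keys: str) -> str:
    """Greedy longest-match conversion of a key sequence to hiragana."""
    out = []
    i = 0
    while i < len(keys):
        for length in (3, 2):
            chunk = keys[i:i + length]
            if chunk in KANA_TABLE:
                out.append(KANA_TABLE[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot convert {keys[i:]!r}")
    return "".join(out)


print(convert("sinnya"))  # doubled N: shin-ya
print(convert("sinya"))   # single N: shi-nya
```

With the doubled N the converter sees `nn` before it can see `nya`, so the two sequences come out as しんや and しにゃ respectively.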
In light of the above we can assume consonant frequencies are probably nearing equal between the two hands, probably with a slight lean to the left. And if that’s the case, the right hand is still probably typing more overall, as it has to type three of the five vowels, roughly 66% of vowel tokens going by the frequencies quoted earlier. Considering most syllables in Japanese are (C)V, there must be at least as many vowels as consonants overall.
Without really digging into the numbers[1] though, I don’t feel confident declaring the right hand to be overworked using QWERTY + romanized input. Though I can say that it certainly feels that way to me, and that it’s very plausible, based on Tamaoka and Makioka’s work, that this is the case. Finding out the degree of asymmetry – if any – would be an interesting research question.
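If you wanted to start digging, a first step might be a keystroke tally per hand. The sketch below assumes the hand assignments from the QWERTY lists earlier in the post, counts B separately because it is listed for both hands, and works on whatever romanized key sequence you feed it. The function name `hand_load` is my own, and a serious answer would need a large corpus of real key sequences rather than a couple of words.

```python
from collections import Counter

# Hand assignments from the QWERTY lists earlier in the post.
RIGHT = set("yuiophjklnm")
LEFT = set("qwertasdfgzxcv")


def hand_load(keystrokes: str) -> Counter:
    """Count keystrokes per hand for a romanized key sequence."""
    load = Counter()
    for key in keystrokes.lower():
        if key in RIGHT:
            load["right"] += 1
        elif key in LEFT:
            load["left"] += 1
        elif key == "b":
            load["either"] += 1  # B can reasonably go to either hand
        # Anything else (spaces, punctuation) is ignored.
    return load


for word in ("sinnya", "shouyu"):
    print(word, dict(hand_load(word)))
```

For the two example words used in this post, the right hand does most of the work: four of six keystrokes for S-I-N-N-Y-A and five of six for S-H-O-U-Y-U. Two words prove nothing, of course, but the same function could be run over a much larger sample.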
How did Japan come to use QWERTY in the first place?
Probably for much the same reason that QWERTY is used all over the world. It works well enough and is so ubiquitous that the downsides are outweighed by the utility of knowing a standardized layout.
As I said at the start of this post, I only suspected an asymmetry once I’d ramped up my typing day-to-day. For anyone who isn’t hammering away at a keyboard all day, it probably doesn’t matter much at all. Also, if you hunt-and-peck rather than touch-type, then every layout is much the same.
Interestingly, quite a number of alternative layouts and keyboard designs have cropped up over the years. For a nice run-down, I’d recommend Xah Lee’s blog post on the topic. None of them really caught on, which makes you wonder how much better they really could have been for the average user.
A (time-consuming and possibly completely ineffective) solution to my problem
In response to my dissatisfaction with QWERTY, I’ve been trying to learn and use the kana layout over the past few weeks. At the moment it’s unclear whether this will end up being faster or more ergonomic in the long-run, but I’m willing to give it a decent shot.
And boy is it a strange experience to learn a completely new layout from scratch. The kana layout bears no relation to QWERTY. At all. From the outset, typing even short words inevitably leaves you bewildered. Furious keyboard scanning is a given until you start to get a feel for things.
However, surprisingly, it hasn’t taken me very long to memorize the layout. And I don’t seem to be unique in this respect either. Within a few days I’d managed to internalize most of it and I now feel reasonably confident touch typing. The caveat here being that I’m not especially fast and have to think a lot about what my fingers are doing. Oh, and I make a ton of mistakes. Having to use the number keys so often is really off-putting.
As tricky as it’s been I think I’ll stick with it for a while yet. Even if it’s just to feel like I’m keeping RSI at bay.
Tamaoka, K., & Makioka, S. (2004). Frequency of occurrence for units of phonemes, morae, and syllables appearing in a lexical corpus of a Japanese newspaper. Behavior Research Methods, Instruments, & Computers, 36 (3), 531-547.
[1] Coda /n/ and geminate consonants are the only exceptions to this.