How to Type Chinese

Communication is important. It’s a precondition for almost all of the projects of our lives. Communication requires infrastructure. In modern life, industry builds and supports communication infrastructure. It seems weird, I know. Can’t we just talk? But someone comes up with a better infrastructure, like texting each other or using Facebook rather than talking. At first just a few people use it. Then lots use it. Then everyone is using it. It helps them. It helps you. As industry advances to allow communication at a larger scale (e.g. web publishing), with greater clarity (e.g. HD TV), or with a better distribution network (e.g. Twitter), there are specialized machines and standards that you come to need. People are trained just to support specific applications and standards. Change becomes difficult. People still exchange phone numbers, even if they only text and never call. Microsoft Office doesn’t support emojis. People still rent DVDs, but Betamax died young.

So we are locked in, really, to information technology. We need, see, typewriters and telegraphs and letters and typefaces and speakers and microphones and word processors and so forth. Or else, we need upgrades to these. (An upgrade is the same thing but better, so it’s not a big shift usually.)

Yes, of course, an individual can go without. But if you want to plan a wedding, find a job, publish a novel, get a loan, or move to Chicago, you and your collaborators will need telephones, printers, email, and the rest. It would be quite inconvenient to try to sign a loan without printing digital documents, and while it could be done in theory, in practice it just won’t be.

For about a hundred years, the typewriter was a fundament of communication. An office with a typewriter could immediately produce more documents, with better legibility, using less labor power. A typewriter meant you were serious, modern, professional, organized, and authoritative.

The Chinese Typewriter is a story of a technology that didn’t work for a long time. And a story about trying to make it work. The basic problem was that an industrial basis for communication became dominant globally, but Chinese couldn’t use it. Because Chinese has no alphabet.

(Mullaney notes that China was alone at this point in history in having a living language without alphabetical expression. Critics within China lamented that Chinese was a difficult language for information science and technology. Its characters were hard to recognize, hard to remember, hard to write, and hard to find. Many just wanted to reform the Chinese language and move on.)

Typewriters work by plunking down one letter at a time. (Or one numeral or punctuation mark.) The system works wonderfully for languages written in the Latin alphabet (the languages for which typewriters were first developed). It works all right for languages with similar alphabets. It fails for languages without an alphabet.

Mullaney warms up with the history of the Siamese typewriter. The language used an alphabet of 86 characters, way more than the usual Roman 26. But the Smith brothers (as in Smith-Corona) figured out a way to squeeze the Siamese language onto two keyboards, and made good money selling a two-keyboard typewriter in Siam. They used two special tricks. First, they included several “dead” keys that made a mark but did not advance the platen. Second, they cut out two less important letters. To this day, those two letters are gone from Thai! The two-keyboard machine was cumbersome, but complete.

The double keyboard was a winning solution in 1891, but fashion began to change. In the following decades, a new style of typewriter emerged that showed the typist the letters as they hit the page, which reduced errors and must have been pretty satisfying emotionally. Also, portable typewriters became cheaper and more popular.

The Smith brothers couldn’t create a new design through their existing Smith Premier company, so they founded a new company and released a different typewriter for Siam. This one had a single keyboard, a smaller size, and showed the letters land on the page. Half the alphabet could be typed by hitting the keys as usual; the other half required the shift key. The Smiths had considered this solution while designing their first model, but decided it would increase operator error, as a simple ergonomic issue. (The typist must move her hand from its usual position to press shift.)

This book has such nice pictures! Here’s the Remington model that comes after all of the above, using the Smith shift key design.

Mullaney shows design strategies, spread across decades of fashion and industry and political change, to adapt the interface of a known information processing powerhouse. What he’s getting at ultimately is the invention of “input” as an abstracted flow for text. It’s nice to shake this one up a bit, because a book on how image processing takes weird inputs would be obvious, but here he’s showing how the relatively abstract grid of keystrokes provided by the typewriter was still not sufficiently abstract to handle Chinese.

Mullaney reviews many other national contexts where the typewriter was modified to accommodate difference. Hebrew required the carriage to move backwards. Arabic required a very different approach to typefaces.

In each case, there were willing entrepreneurs who took the existing industrial technology and tried to assimilate other languages to its method. He calls this techno-linguistics and suggests there may be, in this story of typewriters, a history of techno-linguistic imperialism. But Mullaney dodges any political implications from this. (He is a tenured professor at a prestige-based institution, so it just has to be “good” scholarship, not helpful for actual populations or just or anything.)

This machine gives the operator a plate and a punch. The characters are all on the plate and the operator slides around to the correct one (left hand), then drops the punch (right hand), printing on the paper beneath. Must have been chilly in the office, judging by the jacket.

His treatment of Chinese typewriters is extensive, beginning with a very exciting system of blocks grabbed and used by hand. (This one was invented by a missionary who was sick of working with locals who misunderstood him, a motivation very similar to the one behind computing for knowledge workers in the US in the 1960s and 1970s. Idiots motivate innovators.) The next generation put such blocks on a plate, and asked the operator to shift around the plate to find the next character. This existed in various forms, with a disc or plates. Another popular set of designs asked the operator to assemble the character from smaller components, “radicals,” which would need to be rotated and scaled. Then, finally, came a machine that looked like a typewriter and used key combinations to produce characters. This mapping was somewhat arbitrary, as the placement of the wood blocks had been, since a particular combination could trigger one of several possible characters, and the machine had to present these options to the user.

Sorry for the photo of a projection of a slide of a photo of a postcard, but this was the first successful design that used the combinatorial approach.

In this technical evolution, Mullaney takes a bigger-picture view than the Siamese shift key and portability dilemma. He elaborates on several fascinating problematics (“puzzlings”) that different designs used or avoided.

To me, this is the best part of the book: abstract design problematics for this most epic challenge. Here are the problematics and how they afforded chances to make Chinese fit on a typewriter.

Common usage
There are too many Chinese characters to put on the keyboard; can we omit some? The common usage approach is: don’t support the edge cases. Everyone can make do with a reduced set of options, or at least enough people can do enough work that it will be fine. This is how most restaurants work: we do not serve duck, but you can have chicken. For the Chinese language, it meant ignoring the more obscure characters. For Thai, it meant chopping off a couple of less common characters. Phonetic spelling systems proposed to simplify English have made similar suggestions, e.g. the lost book Skool Reform.
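The trade-off can be put in numbers. Here is a minimal sketch, assuming a hypothetical `coverage` function and a made-up toy sample (not real frequency data from the book), of how much of a text a tray holding only the most frequent characters could type:

```python
from collections import Counter


def coverage(text: str, tray_size: int) -> float:
    """Fraction of character occurrences in `text` that could be typed
    with a tray holding only the `tray_size` most frequent characters."""
    # Count every non-whitespace character in the text.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum the occurrences of the characters that make it onto the tray.
    covered = sum(n for _, n in counts.most_common(tray_size))
    return covered / total


# Toy sample for illustration only: 5 distinct characters, 11 occurrences.
sample = "的的的的一一一是是不了"

print(coverage(sample, 2))  # the two most common characters cover 7 of 11
print(coverage(sample, 5))  # all five characters cover everything: 1.0
```

On a real corpus the curve would presumably flatten quickly, which is the whole bet of the common usage approach: a small tray covers most of the work, and the rare characters are the duck you don't serve.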

Combinatorialism
Rather than accepting characters as the smallest possible unit of the language, combinatorialism views characters as assemblies of smaller parts: strokes and radicals. A square, for example, is part of many characters, and so a language processing machine could allow the operator to select a square, then position it. Similar approaches can be found throughout software, and social theory, where a particular object, such as “an email”, can be broken down into multiple member parts, which are simply assembled into larger objects later treated as wholes by another system. In many systems, such as handwriting, this is how accented letters must be produced. Accent + letter = accented letter.
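The "accent + letter" equation survives in modern text encoding. As a small sketch (this is Unicode normalization from Python's standard library, not anything described in the book; the `compose` helper name is my own), a base letter followed by a combining accent can be fused into a single accented character and split apart again:

```python
import unicodedata


def compose(letter: str, accent: str) -> str:
    """Fuse a base letter and a combining accent into one character
    where Unicode defines a precomposed form (NFC normalization)."""
    return unicodedata.normalize("NFC", letter + accent)


ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

e_acute = compose("e", ACUTE)
print(e_acute, len(e_acute))  # é 1

# Going the other way: NFD splits the whole back into its parts.
parts = unicodedata.normalize("NFD", e_acute)
print(len(parts))  # 2 (the letter "e" plus the combining accent)
```

This only illustrates the accent case. Assembling Chinese characters from radicals also involves positioning and scaling each part, which plain normalization does not do.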

Predictivism
An extension of vernacular taxonomy and natural language, predictivism attempts to put the most often used objects in easiest reach. Modern autocomplete and predictive smartphone keyboards use the same basic approach. In fact, most of what designers do is try to put the most used thing front and center.
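A minimal sketch of the idea, assuming a hypothetical `PredictiveTray` class of my own invention (real predictive keyboards use context and much richer models than a raw usage count):

```python
from collections import Counter


class PredictiveTray:
    """Keeps a set of items and lays them out most-used first."""

    def __init__(self, items):
        self.items = list(items)
        self.uses = Counter()

    def pick(self, item):
        """Record one use of an item that is on the tray."""
        if item not in self.items:
            raise ValueError(f"{item!r} is not on the tray")
        self.uses[item] += 1

    def layout(self):
        """Return items ordered by use count, most used first.
        sorted() is stable, so ties keep their original order."""
        return sorted(self.items, key=lambda it: -self.uses[it])


tray = PredictiveTray(["a", "b", "c"])
for choice in ["c", "c", "b"]:
    tray.pick(choice)

print(tray.layout())  # ['c', 'b', 'a']
```

The layout shifts as the operator works, so whatever was used most drifts to the front.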

Natural Language and Vernacular Taxonomy
Make the machine fit the use. Rather than ordering characters by “the rules” of “what goes first,” typewriter users put characters in positions that made it easier for their writing. Whether writing about Jesus or Socialism, the character tray could be reorganized to make certain words easier to form. Over time, typists learned from each other and developed a bottom-up taxonomy for organizing information. In other fields this is called a “folksonomy” and corresponds to the contemporary UX “card sort,” where a researcher asks a user to order items in a way that makes sense to them.

Surrogacy
Codebooks and path-dependency. Rather than presenting all possible characters as options, surrogacy offers a different set of options which are chosen in succession to build up a final selection. Mullaney discusses Chinese telegraphy, where each character had its place in a codebook; an operator could simply signal “1132” and this meant “the character on page 11, row 3, column 2.” Ultimately surrogacy was part of the “input method editor,” the common solution to Chinese typing used today in computing.
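The codebook lookup can be sketched in a few lines. This takes the "page 11, row 3, column 2" reading above at face value; the `decode` and `lookup` names and the one-entry codebook are hypothetical placeholders, not real telegraph code assignments:

```python
from typing import Dict, Tuple

Position = Tuple[int, int, int]  # (page, row, column)


def decode(code: str) -> Position:
    """Split a four-digit code into page (two digits), row, and column."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected four digits, got {code!r}")
    return int(code[:2]), int(code[2]), int(code[3])


# Placeholder codebook: the character stored here is arbitrary,
# chosen only so the lookup has something to return.
CODEBOOK: Dict[Position, str] = {(11, 3, 2): "字"}


def lookup(code: str) -> str:
    """Return the character a code stands for in the placeholder codebook."""
    return CODEBOOK[decode(code)]


print(decode("1132"))  # (11, 3, 2)
print(lookup("1132"))  # 字
```

The operator never touches the character itself, only a surrogate that points to it. That indirection is what an input method editor generalizes: keystrokes select from a table rather than print directly.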