Samuel H. Caldwell (standing) with Vannevar Bush sitting in 1949 while Caldwell was developing what became the Sinotype system in 1959. This photograph, with the image of Caldwell removed, was clearly the source or inspiration of the cover of Popular Mechanics.
First page of Caldwell's paper on the Sinotype
Photograph of perhaps the first implementation of Chinese text processing on an Apple personal computer using the Sinotype III system. Circa 1981.

Samuel H. Caldwell Develops the Sinotype, a System for Phototypesetting & Computer Processing the Chinese Language

1950 to 1981
Vannevar Bush working with Samuel Caldwell's computer on cover of Popular Mechanics
Image credit: Jeremy Norman Collection of Images (Creative Commons license)

The earliest efforts to provide a system for phototypesetting and computer processing of the Chinese language began about 1950 in the United States through the work of computer scientist and logical circuit designer Samuel Hawks Caldwell. The main backers and funders of the project were William Garth, Jr. and the Graphic Arts Research Foundation (GARF), supported by computer and information retrieval visionary Vannevar Bush; the Carnegie Foundation, then directed by Bush; and the U.S. Army Quartermaster Corps. Because non-Latin-script languages were used in developing countries that had little or no access to computers prior to the microcomputer revolution of the 1980s, this work represented the earliest effort to computerize those languages.

Because the Lumitype Photon typesetting system could store 17,000 characters on a single optical disc, Caldwell, Bush and William Garth, Jr. recognized its potential as a method of phototypesetting non-alphabetic Asian languages. Thus some of the early development of the Lumitype Photon was tied to its strategic application to non-alphabetic languages such as Chinese and Devanagari.

Before the Chinese could develop an electronic computing system that could process text as well as numbers, computer scientists had to invent a way for computers to process the non-alphabetic Chinese writing system, which consists of tens of thousands of characters. The challenges were many, including how to develop a special-purpose computer interface to allow the typesetting of so many characters via a typewriter or computer keyboard, and how to work with large character sets in computers with very limited memory and performance. As early as 1953 Garth and Caldwell produced a report, preserved in the archive, entitled “Study Leading to Specifications for Equipment for the Economical Composition of Chinese and Devanagari.” Because of funding limitations, the project initially concentrated on the Chinese language. From the beginning the U.S. government recognized that a way to phototypeset Chinese would have strategic value both for information production in Chinese and perhaps for gaining influence in China. It also recognized the strategic value of being able to phototypeset other “complex language forms” besides Chinese. According to documents in the archive, on May 13, 1954 Nelson Rockefeller sponsored a conference entitled “Development of Suitable and Economical Methods for the Composition of Complex Language Forms.” Around 50 people attended from state and federal agencies, including the USIA, Dept. of Commerce, HEW, Library of Congress, Army, and CIA.

In the process of developing the Sinotype, Caldwell, who never learned Chinese himself, worked with Chinese scholars to analyze the way Chinese was written, and developed a method of simplifying the keyboarding of the strokes that make up Chinese characters so that Chinese could, for the first time ever, be “typed” on a standard typewriter-style keyboard. Through these consultations Caldwell discovered that a Chinese student learns to write ideographic characters "very much as his alphabetic brother learns to write words.... Every Chinese learns to write a character by using exactly the same strokes in exactly the same sequence."

"As an expert on logical circuit design, the idea of consistent Chinese ‘spellings’ whetted Caldwell’s intellectual curiosity: if every Chinese character was composed in precisely the same way, might it be possible to design a logical circuit that, being fed such Chinese strokes as input data, outputted Chinese characters? If Chinese, despite being a non-alphabetic language, exhibited its own ‘spelling,’ might it be possible to build something that had eluded engineers for years: a computer for the Chinese language?

"Caldwell sought the help of Lien-Sheng Yang, a professor of Far Eastern Languages at Harvard. Caldwell relied upon him to conduct a thorough analysis of the structural make-up of Chinese characters, and to determine the stroke-by-stroke ‘spelling’ of approximately 2,000 common-usage words. Caldwell and Yang ultimately settled upon 22 strokes in all: an ideal number to place upon the keys of a standard Western-style typewriter keyboard.

"Instead of the QWERTY keyboard layout, Caldwell would outfit the keys of the Sinotype with Chinese brushstrokes, which the typist would use to compose—or more accurately to describe and retrieve—Chinese characters. In his own terms, Caldwell’s objective was ‘to furnish the input and output data required for the switching circuit, which converts a character’s spelling to the location coordinates of that character in the photographic storage matrix’.

"In the course of his research, Caldwell made a second startling discovery. Not only did Chinese characters have a spelling, but, as he wrote, ‘the spelling of Chinese characters is highly redundant’. It was almost never necessary for Caldwell to enter every stroke within a character in order for the machine to retrieve it from memory. For a character containing 15 strokes, for example, it might only be necessary for the operator to enter the first five or six strokes before the Sinotype arrived at a positive match." [1]

By exploiting this redundancy, Caldwell in effect invented the first auto-complete method of typing, now one of the most widely used features of computing.
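The retrieval idea described above (enter strokes until only one character remains, then look up its position in the storage matrix) can be sketched in a few lines of Python. This is only an illustration, not a reconstruction of Caldwell's switching circuit: the four stroke codes, the five-character lexicon, and the grid coordinates are all assumptions made for the example, and the real Sinotype used 22 strokes and a much larger character set.

```python
# Toy lexicon: character -> (stroke spelling, (row, column) in a storage grid).
# Stroke codes and coordinates are illustrative assumptions, not Caldwell's:
# h = horizontal, s = vertical, p = left-falling, n = right-falling.
LEXICON = {
    "木": ("hspn", (0, 0)),
    "大": ("hpn", (0, 1)),
    "天": ("hhpn", (0, 2)),
    "王": ("hhsh", (1, 0)),
    "人": ("pn", (1, 1)),
}


def candidates(typed):
    """Return the characters whose spelling begins with the strokes typed so far."""
    return [ch for ch, (spelling, _) in LEXICON.items() if spelling.startswith(typed)]


def resolve(typed):
    """Return (character, coordinates) once the typed strokes match exactly one entry."""
    matches = candidates(typed)
    if len(matches) == 1:
        ch = matches[0]
        return ch, LEXICON[ch][1]
    return None  # still ambiguous, or no match at all


def minimum_spelling(ch):
    """Return the shortest stroke prefix that singles out ch, or None if none does."""
    spelling = LEXICON[ch][0]
    for length in range(1, len(spelling) + 1):
        if candidates(spelling[:length]) == [ch]:
            return spelling[:length]
    return None
```

In this toy lexicon, the four-stroke character 木 is already identified after its first two strokes, which is the kind of redundancy the quoted passage describes. A fuller sketch would also need an end-of-character signal for cases where one character's complete spelling is the beginning of another's.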

By 1959 Caldwell had a working system that used a special-purpose electromechanical computer of his own design to drive a Lumitype Photon machine to set type in Chinese, using the auto-complete method of typing. Caldwell’s most detailed published report on the system was “The Sinotype—A Machine for the Composition of Chinese from a Keyboard,” Journal of the Franklin Institute, 267 (1959) 471-502. In his Concluding Remarks to that paper he wrote:

“Many will wonder why this work was ever done or why our military establishment devoted substantial funds and attention to the project. The answer to this question seems simple and clear. In selling the idea to the military authorities, the writer had only one real argument. To be sure, it was a fascinating project, but mere fascination was not a sufficient reason for supporting it. The argument that counted was to the effect that a machine for composing Chinese would improve communication among men, and that no improvement of communication ever harmed the cause of peace among men. The writer is burstingly proud of the way the military establishment of the United States of America has supported, both in funds and in enthusiasm, this project to wage peace.”

By the 1970s there were numerous competing code systems, such as Standard Telecode, Chinese Character Indexes, and the OSCO Onsight encoding developed by Dr. Zhi Ping-yi of the Shanghai Instrument Research Institute. At this time Garth's Graphic Arts Research Foundation resumed support of non-alphabetic typesetting research.

By 1978-1980, with advances in electronic computing, electronic memory, and software, the optical typesetting system of the Lumitype Photon had been made obsolete by computer applications. Roy Hofheinz, Jr., the sinologist and director of the Fairbank Center at Harvard, developed the Sinotype II system, using an expensive Data General minicomputer and the OSCO input system. This system incorporated a real-time dot-matrix on-screen display of Chinese.

Hofheinz’s Sinotype II minicomputer system evolved into the Sinotype III system, developed by GARF beginning in 1981. Moving to an Apple II personal computer yielded huge cost savings. The software provided cursor control, up/down text scrolling, insertion and deletion of text, file saving, and printing. A low-resolution Chinese font editor was included with the software. The system worked with multiple input code systems, including Caldwell, STC, ISENS, and OSCO, and could handle Latin text intermixed with Chinese.

With the introduction of the Apple II+, a personal computer with 32K RAM (extended to 80K RAM for Sinotype III) and Apple II High Res graphics, the Sinotype III operated with a Corvus 5-megabyte hard drive, an Epson MX-70 printer, and a 5.25-inch floppy disc drive. The system output 16 x 16 dot-matrix characters both on screen and on its dot-matrix printer. It retained the 100 most frequently used Chinese characters in RAM for fast screen refresh and stored the rest on the hard drive.
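The memory trade-off behind that last design choice can be made concrete with a rough sketch. The Python below assumes one bit per dot, so a 16 x 16 glyph occupies 32 bytes and 100 resident glyphs about 3,200 bytes; the actual Sinotype III storage format is not described in the source. The class and function names are hypothetical, and the "disk" is a stand-in function rather than a real drive.

```python
GLYPH_SIZE = 16                                   # 16 x 16 dots per character
BYTES_PER_GLYPH = GLYPH_SIZE * GLYPH_SIZE // 8    # assumes 1 bit per dot: 32 bytes


class GlyphStore:
    """Keep the most frequent glyphs in memory; read all others from slower storage."""

    def __init__(self, frequent_chars, load_from_disk):
        self.load_from_disk = load_from_disk
        self.disk_reads = 0  # counts reads made after start-up only
        # Preload the resident set once, at start-up.
        self.resident = {ch: load_from_disk(ch) for ch in frequent_chars}

    def glyph(self, ch):
        """Return the bitmap for ch, touching slow storage only on a miss."""
        if ch in self.resident:
            return self.resident[ch]
        self.disk_reads += 1
        return self.load_from_disk(ch)

    def resident_bytes(self):
        """Memory taken by the resident glyph bitmaps."""
        return len(self.resident) * BYTES_PER_GLYPH


def blank_glyph(ch):
    """Stand-in for a disk read: returns an all-zero bitmap of the right size."""
    return bytes(BYTES_PER_GLYPH)


# Usage: 100 resident characters (arbitrary code points, chosen only for the demo).
store = GlyphStore([chr(0x4E00 + i) for i in range(100)], blank_glyph)
```

Under these assumptions the resident set is small next to the 80K of RAM mentioned above, while a full character set numbering in the thousands would not fit, which is consistent with keeping the remainder on the hard drive.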

[1] Thomas S. Mullaney, “America’s Secret Cold War Mission to Build the First Chinese Computer,” The Atlantic, September 14, 2016.


