The engineering courage that created the first Chinese personal computer – TechCrunch

2021-12-16 08:16:05 By : Mr. JC Chan

China today is one of the world's richest digital economies, with an unparalleled hardware supply chain and many well-known, lucrative companies such as Alibaba, Tencent and ByteDance leading the world. All of this cutting-edge innovation, however, rests on a 40-year-old solution to one of computing's great challenges: the development of Chinese word processing.

Beginning in the early 1980s, China greatly expanded its purchases of computers from the United States and the West. In 1980, it imported only 600 foreign-made microcomputers; in 1985, it imported 130,000. Companies in the United States, Japan and Europe scrambled to get in on this "crazy buying," as one observer put it.

For would-be Chinese computer users and Western manufacturers alike, however, there was a major problem: no Western-made personal computer, printer, monitor, operating system or program could handle the input or output of Chinese characters, at least not in the early and mid-1980s, and certainly not "out of the box." Without major modifications, mass-produced personal computers were practically useless to anyone who wanted to work in Chinese.

One of the most important reasons was memory, and in particular the memory required for Chinese fonts. In the early days of Latin-alphabet computing, Western engineers and designers determined that English fonts could be built on a 5×7 bitmap grid, so that each symbol required only 5 bytes of memory. Although far from beautiful, the grid provided enough resolution to display the letters of the Latin alphabet legibly on a computer terminal or paper printout. Storing the 95 printable characters of US ASCII required only 475 bytes of memory, a small fraction of, for example, the Apple II's 48 KB of motherboard memory.

For Chinese characters, a 5×7 grid was too small to achieve even minimal legibility. When designing bitmap fonts for Chinese, engineers had to enlarge the Latin alphabet grid geometrically, from 5×7 pixels to 16×16 pixels or more, or at least 32 bytes of memory per Chinese character. For the 8,000 most commonly used Chinese characters, the total memory required to store the bitmaps alone (in simplified or traditional form, but not both, and with no accompanying metadata) was about 256 KB, roughly four times the total memory capacity of most off-the-shelf personal computers in the early 1980s. And all of this was before accounting for the RAM requirements of the operating system and application software.
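The figures above can be checked with a little arithmetic. The sketch below assumes each glyph is stored one column at a time, with every column padded to a whole number of bytes; that layout is an illustrative assumption (it reproduces the 5-byte and 32-byte figures quoted here), not a description of how any particular machine stored its fonts. The function name `glyph_bytes` is likewise made up for this example.

```python
import math

def glyph_bytes(columns: int, rows: int) -> int:
    """Bytes for one bitmap glyph, assuming each column is padded to whole bytes."""
    return columns * math.ceil(rows / 8)

# Latin alphabet: 5x7 grid, 95 printable US ASCII characters.
ascii_total = 95 * glyph_bytes(5, 7)        # 95 * 5 = 475 bytes

# Chinese: 16x16 grid, 8,000 commonly used characters.
hanzi_total = 8000 * glyph_bytes(16, 16)    # 8000 * 32 = 256,000 bytes

# Compare with a 64 KB machine (an assumed "typical" early-1980s ceiling).
ratio_to_64k = hanzi_total / (64 * 1024)

print(ascii_total, hanzi_total, round(ratio_to_64k, 1))
```

The Chinese bitmaps alone come to roughly four times a 64 KB address space, before any program code is loaded.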

Draft of bitmap from Sinotype III Chinese font prepared before digitization. Image source: Louis Rosenblum paper, Stanford University Special Collection

This is the backdrop to one of the great engineering stories of modern computing. It is a story of entrepreneurial courage and engineering ingenuity, and it offers a unique perspective on the global development of the digital revolution.

This is the first of two articles on TechCrunch in which I examine Sinotype III, an experimental machine and one of the first personal computers to handle Chinese input and output. Built on a store-bought Apple II, but equipped with a custom-programmed word processor and operating system, Sinotype III served as a "proof of concept," demonstrating how a Western-made computer could be "translated" into Chinese and thereby open up a vast new market.

In this first part, I examine the profound technical challenges that the creators of Sinotype III faced with computer memory, fonts and operating systems, and how they designed novel solutions to overcome them.

Our story begins with the Graphic Arts Research Foundation (GARF), where Chinese computing can be said to have been born. The ideographic typesetting machine, also known as the Sinotype, was invented by MIT electrical engineer Samuel Hawks Caldwell in the late 1950s with funding from GARF. After his untimely death in 1960, the project came to a halt. Over the 1960s and 1970s, the Sinotype project received support from a number of different parties, including Itek, RCA and, finally, GARF once again.

The Sinotype I keyboard was designed by Samuel Caldwell in the late 1950s. Image source: Louis Rosenblum paper, Stanford University Special Collection

The revival of Sinotype was largely the work of one person: Louis Rosenblum. Born in New York City in 1921, he was another member of the MIT family, graduating in 1942 with a major in applied mathematics. A student of the world-famous electrical engineering professor Harold Edgerton (who took the famous "Milk Drop Coronet" photograph), Rosenblum went to work at Polaroid immediately after graduation, collaborating with Edwin Land on various projects, including the development of instant photography. In 1954, he moved to Photon, where he worked on phototypesetting for non-Latin scripts. Rosenblum was very familiar with the late Caldwell's pioneering work on Sinotype, and he effectively adopted the project and revived it when he joined GARF as a consultant in the mid-1970s.

The figure shows the configuration of the Sinotype II system running on the Nova 1200 CPU. Image source: Louis Rosenblum paper, Stanford University Special Collection

GARF continued to work on the Sinotype project into the early 1980s, by which time it had assembled an advisory committee of well-known scholars with deep experience of China. Harvard linguist Susumu Kuno joined; so did Richard Solomon, who played a pivotal role in Richard Nixon's visit to China in 1972 and later became director of the RAND Corporation's social science department.

As distinguished as this brain trust was, however, GARF's major breakthrough on the Sinotype project, the move from a minicomputer-based system (Sinotype II) to a microcomputer-based system (Sinotype III), was catalyzed by a college student whose only experience at GARF up to that point was two weeks of data management work for the Sinotype II project in 1979. He was Bruce Rosenblum, the son of Louis Rosenblum.

Bruce Rosenblum uses the Sinotype III system. Image source: Louis Rosenblum paper, Stanford University Special Collection

As an undergraduate at the University of Pennsylvania and an aspiring photojournalist, Bruce divided his time between coursework and photo editing for the independent, student-run Daily Pennsylvanian. The newspaper was very advanced, both in the equipment it operated and in the deep professional knowledge of the students who ran it.

By the fall of Bruce's junior year, the newspaper's existing typesetting equipment (two Compugraphic typesetting machines) had reached the end of its useful life and needed to be replaced. Bruce and three of his student colleagues helped research potential alternatives, and the paper eventually signed a $125,000 contract with two companies: Mycro-Tek of Wichita, Kansas, and Compugraphic of Wilmington, Massachusetts.

As for the Sinotype project, which Bruce knew well thanks to his father but had not taken part in, a critical moment came in early May 1981 at the newspaper. His colleague Eric Jacobs was there, working away on a Radio Shack TRS-80 Model II personal computer. Jacobs was considering how to use the microcomputer to run the newspaper's business operations. Bruce watched for about 30 minutes, then went on with his day.

Those 30 minutes stayed with him, however. "That was the first time I saw someone working on a microcomputer," Bruce recalled in an email to me. "Those few minutes inspired the entire Sinotype III project and ultimately launched my career in computing."

Later that same week, Bruce made an offhand remark on the phone with his father. Speaking of the huge cost of the Data General hardware GARF was then using to build Sinotype II, Bruce said that someone could probably write something equivalent or better on a microcomputer at a fraction of the cost: perhaps only $10,000 worth of hardware, compared with the more than $100,000 worth of equipment GARF was then funding.

His father was intrigued. Louis asked Bruce whether he was up to the task of programming such a machine. Bruce had no formal training in computer science, although he had used computers extensively in high school and had taught himself PDP-8 assembly language and BASIC. "Of course," he answered, with what he called "the guts of a fresh graduate who has no direct job prospects."

During his world travels, Bruce Rosenblum continued to work on the Sinotype III project, including sending notes from New Delhi. Image source: Louis Rosenblum paper, Stanford University Special Collection.

In June 1981, Bruce had a formal meeting in New York with Bill Gas, Prescott Low and his father Louis to present his Sinotype III proposal. Bruce dressed the part, wearing a three-piece suit. In his formal proposal, he quoted a total of $7,500 in hardware costs, plus $5,000 in programming costs. The plan promised to deliver a Chinese word processor running on an Apple II in about four months. If it worked, it would reduce the cost of such machines by an order of magnitude.

Bruce got the job and wrote "Sinotype III" from June to November 1981, dividing his time between the project and his full-time job as a National Park Service tour guide at Independence Hall in Philadelphia. During breaks in the day, he wrote assembly code by hand, and at night he typed it in. When Labor Day came in 1981, Bruce's tour guide job ended, and he spent the next two months completing the code and delivering it to GARF.

The first problem GARF and the Rosenblums faced was computer memory. Developers of early Chinese personal computers explored every available option for squeezing as much memory as possible out of a system. We will look at two strategies in particular, sometimes used alone but usually used in combination: adaptive memory and Chinese character cards.

The Sinotype III system consisted of five parts: a Sanyo DM5012CM 12-inch display; an Epson MX-70 printer; a Corvus 10 MB "hard disk memory," used to store the Chinese character bitmap database and the corresponding "descriptor codes"; Apple disk drives for storing text files; and the Apple II itself.

Out of the box, the Apple II came with 32 KB of RAM, expandable to 48 KB on the motherboard. "We maximized the Apple II even before it left the store," Bruce Rosenblum commented in an email to me. However, 48 KB of memory was still too little for his purposes, so Bruce opted for a modification that was completely standard at the time, commonly used by the so-called "advanced users" of that era: an extra 16 KB memory card inserted in slot 0, bringing the total available memory to 64 KB. Even this was too little. "I need more RAM to store the complete encoding system," he said, "and 16×16 bitmaps of the 100 most commonly used ideograms."

He began to explore a "mod" of the Apple II that few people had tried before. "Somehow," he said, "I found out that I can put a second 16 KB board in slot 2 of the Apple II, so that I have a total of 80 KB. It's completely non-standard," he continued, "but it works with off-the-shelf components."

This modification, however, pushed the machine beyond its own limits. The Apple II's 6502 microprocessor can directly address only 64 KB of memory, which meant that even once Bruce got the machine to boot with a second memory board and its additional 16 KB, the Apple II had no built-in way to access those additional addresses at the same time. The mod was so "non-standard" that when Bruce described it to an Apple engineer in one of their many conversations, the Apple representative was shocked: he had never heard of or thought of doing such a thing.

To let the Apple II access 80 KB of memory rather than just 64 KB, Bruce abandoned the out-of-the-box operating system and wrote his own in assembly language. The key to his custom-designed program was the ability to "choose between two overlapping 16 KB banks." In other words, although only 64 KB of memory locations could be addressed at any one moment, by switching rapidly back and forth between the two memory expansion cards he could in effect trick the computer into seeing both, at speeds that, from the user's point of view, were negligible. This squeezed 25% more memory out of the system, enough for onboard memory to hold more than 400 Chinese characters.
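The bank-switching idea can be illustrated with a small simulation. This is only a sketch of the general technique, not Bruce Rosenblum's assembly code: the class name `BankedMemory`, the `select` method and the choice of the top 16 KB of the address space as the switched window are all assumptions made for illustration.

```python
BANK_SIZE = 16 * 1024
ADDRESS_SPACE = 64 * 1024
WINDOW_START = ADDRESS_SPACE - BANK_SIZE  # assumed location of the switched window

class BankedMemory:
    """A 64 KB address space whose top 16 KB maps to one of two physical banks."""

    def __init__(self) -> None:
        self.main = bytearray(WINDOW_START)                    # always visible (48 KB)
        self.banks = [bytearray(BANK_SIZE), bytearray(BANK_SIZE)]
        self.active = 0                                        # bank currently mapped in

    def select(self, bank: int) -> None:
        """Map the given bank into the window (the 'switch')."""
        if bank not in (0, 1):
            raise ValueError("bank must be 0 or 1")
        self.active = bank

    def _locate(self, addr: int):
        if not 0 <= addr < ADDRESS_SPACE:
            raise ValueError("address outside the 64 KB address space")
        if addr < WINDOW_START:
            return self.main, addr
        return self.banks[self.active], addr - WINDOW_START

    def read(self, addr: int) -> int:
        buf, offset = self._locate(addr)
        return buf[offset]

    def write(self, addr: int, value: int) -> None:
        buf, offset = self._locate(addr)
        buf[offset] = value & 0xFF

    def total_bytes(self) -> int:
        """Physical memory behind the 64 KB address space."""
        return len(self.main) + sum(len(b) for b in self.banks)

mem = BankedMemory()
mem.select(0)
mem.write(WINDOW_START, 0xAA)   # lands in bank 0
mem.select(1)
mem.write(WINDOW_START, 0x55)   # same address, different physical bank
mem.select(0)
print(hex(mem.read(WINDOW_START)), mem.total_bytes() // 1024)
```

The same address holds different data depending on which bank is selected, so the processor never sees more than 64 KB at once while 80 KB of physical memory sits behind it.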

Bruce submitted the final code to GARF a week before Thanksgiving, and then set off on a backpacking trip around the world that would take him across Europe and Asia. From then on, the development of Sinotype III was largely in the hands of Louis Rosenblum and GARF, although Bruce continued to serve as a consultant, corresponding frequently with his father from Europe, China, India or wherever else he happened to be.

Even with this clever mod, however, Louis and Bruce estimated that only 600 to 1,000 Chinese characters could fit into onboard memory. Once the operating system, the application program and the memory requirements of each Chinese character were taken into account, most of the characters in Sinotype III's lexicon would need to be stored elsewhere, whether on floppy disks, on an external hard drive or through some other hardware solution.

Sinotype III computer monitor. Image source: Louis Rosenblum paper, Stanford University Special Collection

Early on, Bruce briefly considered using PROM (programmable read-only memory) chips, but the idea quickly proved to be a dead end. Around 1981 and 1982, the largest PROM chips on the market held at most 2 KB, which translated to only 28 to 51 Chinese characters per chip. To store 7,000 Chinese characters this way, Bruce would have needed between 138 and 250 PROM chips. "That's a lot of chips," he said.
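The chip counts follow directly from the per-chip capacity. A minimal check, using only the figures quoted above (7,000 characters, 28 to 51 characters per 2 KB chip); the helper name `chips_needed` is made up for this example:

```python
import math

def chips_needed(total_chars: int, chars_per_chip: int) -> int:
    """Whole chips required; a partly filled chip still counts as one chip."""
    return math.ceil(total_chars / chars_per_chip)

best_case = chips_needed(7000, 51)    # 138 chips
worst_case = chips_needed(7000, 28)   # 250 chips

print(best_case, worst_case)
```

Read the other way, 28 to 51 characters in 2,048 bytes implies a budget of roughly 40 to 73 bytes per character, which is more than the 32 bytes of a bare 16×16 bitmap. That gap is an inference from the numbers, presumably reflecting descriptor codes or other per-character data, and is not something the source states.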

Bruce then considered storing the characters on floppy disks. This also proved unworkable, not only because it would have required a large number of disks, but also because retrieving character bitmaps from a floppy disk drive was very slow. GARF chose a third solution: equipping Sinotype III with an external hard drive, which at the time was almost unheard of as a microcomputer accessory. To overcome its severe memory limitations, GARF would store thousands of low-frequency Chinese characters "off-site" on the system's external hard drive: a 10 MB Corvus "rigid disk storage."

This, however, took a toll on Sinotype III's operating speed. In the space-time continuum of computing, where most operations are performed at extremely fast sub-second speeds, hard drives are cumbersome beasts. At that time in particular, they relied on disks spinning inside the device, called "platters," not unlike the records on a record player. The contents of the various "tracks" are read by a magnetic head, much as a needle reads the grooves of a record. Retrieval speed depends on the position of the head and the rotational position of the disk at the moment of the retrieval request. Like arriving at the station just as the bus is pulling away, one can only wait for it to come back around.

Specifically, retrieving a Chinese character stored on the hard disk was 10 times slower than retrieving one stored in RAM. Characters stored in RAM could be retrieved in approximately 100 milliseconds each, a span of time that human cognition cannot perceive. For characters stored in external memory, however, entering any one of them took up to a full second of access and retrieval time, well within the threshold of human perception.

In the personal computing environment of the mid-1980s, when users working in English were quickly growing accustomed to real-time typing, one second of input time was extremely slow. And because one second is 10 times as long as 100 milliseconds, ordinary users would feel the difference every time they entered a low-frequency character.

To alleviate this problem, Louis Rosenblum came up with an idea he called "adaptive temporary storage." Sinotype III would adjust the set of characters stored in RAM according to the user's recent input. At initial startup, Sinotype III's onboard RAM would hold only a predetermined set of high-frequency characters. As noted above, entering any infrequently used, hard-drive-based character would take up to one second. However, "because every less commonly used ideogram is entered by the keyboard," he explained in a letter at the time, "its code and dot pattern will be recorded in random access memory." In other words, these characters would be temporarily copied from the hard disk into an onboard RAM cache, reducing the time of any subsequent retrieval.
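In modern terms this is a cache in front of slow storage. The sketch below illustrates the idea under stated assumptions: the class name `AdaptiveGlyphStore`, the dictionary standing in for the hard disk, and the least-recently-used eviction rule are all hypothetical. The source describes copying characters into RAM on first use but does not say how Sinotype III chose what to discard when RAM filled up.

```python
from collections import OrderedDict

class AdaptiveGlyphStore:
    """RAM cache of glyph bitmaps in front of a slow 'disk' (here, a dict)."""

    def __init__(self, disk: dict, preloaded: list, adaptive_slots: int) -> None:
        self.disk = disk                                      # stand-in for the hard drive
        self.pinned = {ch: disk[ch] for ch in preloaded}      # high-frequency set, always in RAM
        self.adaptive = OrderedDict()                         # recently used low-frequency glyphs
        self.adaptive_slots = adaptive_slots
        self.disk_reads = 0                                   # counts slow retrievals

    def fetch(self, ch: str) -> bytes:
        if ch in self.pinned:
            return self.pinned[ch]
        if ch in self.adaptive:
            self.adaptive.move_to_end(ch)                     # mark as most recently used
            return self.adaptive[ch]
        # Slow path: read from disk, then keep a copy in RAM for next time.
        self.disk_reads += 1
        bitmap = self.disk[ch]
        self.adaptive[ch] = bitmap
        if len(self.adaptive) > self.adaptive_slots:
            self.adaptive.popitem(last=False)                 # assumed policy: evict least recently used
        return bitmap

# Placeholder 32-byte bitmaps; real glyph data is not reproduced here.
disk = {ch: bytes(32) for ch in "的我龘鑫"}
store = AdaptiveGlyphStore(disk, preloaded=["的", "我"], adaptive_slots=1)

store.fetch("的")   # preloaded: no disk read
store.fetch("龘")   # first use: one disk read
store.fetch("龘")   # now served from RAM
print(store.disk_reads)
```

With a single adaptive slot, fetching a second rare character would evict the first; a real system would size the adaptive region to whatever RAM remained after the program and the preloaded set.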

The internal GARF file shows the Sinotype III character database and metadata. Image source: Louis Rosenblum paper, Stanford University Special Collection

Even with the help of bank switching and adaptive memory, thousands of characters remained beyond the reach of such strategies. Although high-frequency Chinese characters account for a large share of overall usage, producing any kind of technical or professional content was bound to send users into the "off-site" character library again and again. If the Chinese computing experience was to come close to the sense of immediacy enjoyed by English-language counterparts, more of these "low-frequency" characters would need to be brought "on-site."

Engineers in the late 1970s and early 1980s began to explore a different hardware solution, known variously as "Chinese character cards" (Hanka), "Chinese cards" (Zhongwenka), "Chinese character generators" (Hanzi zimo fashengqi) or, as one article cheerfully called them, "Chinese on a chip." Just like memory cards and graphics cards, a "Chinese character card" was designed to be installed directly in a motherboard expansion slot. Hard-wired into these cards were thousands of Chinese bitmaps and input codes. In effect, they served the same purpose as an external hard drive, only with faster and more reliable performance.

"Chinese on a chip" cards were not a focus of GARF's research. They originated instead in early custom-designed Chinese systems, all of which predated the personal computing revolution. These included systems such as Chan Yeh's Ideographix IPX and the Olympia 1011, which were equipped with microprocessors whose sole purpose was to generate character bitmaps and store input descriptors. On the Olympia 1011 Chinese word processor, essentially a single-purpose electric Chinese typewriter, one of the three Intel 8085 processors was dedicated to Chinese character generation.

In the early 1980s, this kind of character generator was commercialized and turned into a marketable product. It was no longer necessary to buy a full-featured word processor (such as the Olympia 1011) in order to use an onboard character generator. Instead, people could buy a "Chinese character card" and install it in the personal computer of their choice.

One of the first Chinese computing centers to pay attention to Chinese character cards was Tsinghua University, where researchers developed an early card that could store approximately 6,000 Chinese bitmap patterns in a 32×32 dot matrix format. By the mid to late 1980s, there were dozens of different "Hanka" on the market, manufactured and sold by companies in Japan, China, Taiwan, Hong Kong, the United States and other places.

By the mid to late 1980s, the "Chinese chip" method became so important and common that almost all computers with Chinese or Japanese capabilities were equipped with one or another character generation card.

From Caldwell's Sinotype in the 1950s to the father-and-son Rosenblum team and GARF's work on Sinotype III in the 1980s, then, solving the memory problem posed by Chinese characters was the key to opening up the Chinese computing market. It took hacking computers to add more memory, creating adaptive memory algorithms to prioritize characters and building dedicated hardware to solve this problem and set off a computing revolution in China.

The next step, however, was to extend these solutions from the computer itself to everything that might be connected to it. In the second part of this series, which will be released on TechCrunch, our discussion will turn to the challenges of designing and programming early computer monitors, printers and other peripherals that could handle Chinese text output.
