This machine had a 36-bit word, as did its predecessor and the later DEC 10 and DEC 20 systems famously used at MIT and elsewhere, and I understand that people who used that word size extensively found it to have various advantages.
But how was 36 bits originally decided upon back in the 1950s? What were the tradeoffs that made them say "two 18-bit half-words are just right, but two 20-bit half-words are clearly too big", and so on?
These days everyone takes power-of-two character, integer, and floating-point sizes for granted; I'm not talking about that, I'm just wondering how they looked at it back at the beginning.
The influential machine to look at here, then, would be the IBM 701.
Edit: The IBM 701 patent <https://patents.google.com/patent/US3197624A> says, “A binary number of a full word of 36 bits has a precision equal to that of about a 10 decimal digit number”. It doesn't mention characters, and the machine was designed for scientific (military) calculations, not text/data processing. The associated IBM 716 printer did not use a 6-bit code. However, this does not rule out a 6-bit character code as a design consideration, even if this machine didn't use one, since IBM did have a large business in pre-computer data processing, using punched-card character sets that would fit in no less than 6 bits. So that may have driven the design of shared peripherals like 7-track tape (6 bits plus parity) and led to a multiple-of-6 word size.
Edit 2: The paper "The logical organization of the new IBM scientific calculator" <https://doi.org/10.1145/609784.609791> again mentions 10 decimal digits. It also describes card and printer I/O, and it is explicit that these work on binary card images, row by row, not character by character. The machine handles only 72 columns of an 80-column card, reading a row at a time into two 36-bit words. A 40-bit word would have let it use the whole card!
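To make the row-binary arithmetic concrete, here is a minimal Python sketch (mine, not from the paper; the left-to-right bit ordering is an assumption) of packing one card row into fixed-size words:

```python
# One 80-column card row, read row-binary: each column contributes one bit.
# Assumption (mine): columns fill words left to right, high bit first.
WORD_BITS = 36

def row_to_words(row_bits, word_bits=WORD_BITS):
    """Pack one card row (a list of 0/1, one per column) into whole words."""
    usable = row_bits[: 2 * word_bits]        # the 701 captures only 72 of 80 columns
    words = []
    for i in range(0, len(usable), word_bits):
        value = 0
        for bit in usable[i : i + word_bits]:
            value = (value << 1) | bit
        words.append(value)
    return words

row = [1, 0] * 40                             # a made-up 80-column row
print(len(row_to_words(row, 36)))             # 2 words x 36 bits -> 72 columns
print(len(row_to_words(row, 40)))             # 2 words x 40 bits -> all 80 columns
```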
From Stephen H. Kaisler, First Generation Mainframes: The IBM 700 Series:
> IBM conducted a survey of existing machines and applications, ultimately determining that a word length of between 10 and 12 digits or about 35 to 41 bits was desirable. Additionally, single address machines required at least 18 bits for storage of operation codes and addresses given a specified memory size. Since the secondary memory was magnetic tape, a tape technology study showed that 6 parallel channels for data storage were optimal. Thus, the word size should be a multiple of 6 bits so that subparts of the word could be stored in successive locations on tape. As a result, IBM arrived at 36 bits for the IBM 701 word size.
> The 36-bit word size was probably unfortunate, and it most certainly stretched the abilities of customer problem analysts and programming experts later. When you chopped off a piece for the exponent (and sign) in floating point - which in spite of my Jeremiads all the customers except maybe the cryptologists insisted on using - there wasn't really enough left for accurate matrix calculations. Or astronomy, but I could already see before the design was fully roughed out that not much of Wallace's stuff was ever going to run on a Defense Calculator! Anyhow, it was too late; looking back from the Eighties, I guessed 48 bits would have been optimal; the 360/370 experience showed us that 32 is not enough, and 64 would have been wasteful.
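For what it's worth, the quoted survey numbers check out. A back-of-the-envelope check in Python (my arithmetic, not IBM's): the bits needed for d signed decimal digits, rounded up to the next multiple of 6 for the tape format:

```python
import math

# Bits for d signed decimal digits, rounded up to a multiple of 6 (tape channels).
for d in (10, 11, 12):
    bits = math.ceil(d * math.log2(10)) + 1   # magnitude bits plus a sign bit
    word = 6 * math.ceil(bits / 6)            # round up to a multiple of 6
    print(f"{d} digits -> {bits} bits -> {word}-bit word")

# 10 digits -> 35 bits -> 36-bit word
# 11 digits -> 38 bits -> 42-bit word
# 12 digits -> 41 bits -> 42-bit word
```

A 36-bit word also splits evenly into two 18-bit half-words, which covers the single-address instruction requirement mentioned in the quote.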
I've previously had a look at IBM Stretch and how they considered and rejected 60 bits. A '6-bit' byte was considered to be important - more so than 8-bit - so having a word that was a multiple of 6 was seen as an advantage.
Useful and interesting historical information, thanks. Seymour Cray was amazing, inventing and proving CPU architectural techniques (like out-of-order execution) that continue to be used today.
"Control Data Corporation 6600 was the fastest computer in the world from 1964 to 1969. It was, in many ways, a pioneer of modern computing using, for example, some of the techniques that would form the basis of RISC architectures... Wikipedia: Generally considered to be the first successful supercomputer, it outperformed the industry's prior recordholder, the IBM 7030 Stretch, by a factor of three."
Your link does say: "Early binary computers aimed at the same market therefore often used a 36-bit word length. This was long enough to represent positive and negative integers to an accuracy of ten decimal digits (35 bits would have been the minimum). It also allowed the storage of six alphanumeric characters encoded in a six-bit character code."
Which helps a little, but it still raises the question: why ten decimal digits? Why not nine or eleven or something?
Are they implying that six characters of six bits was the critical issue? If so, why not seven characters? Or five? Etc.
If you're keen to go down the Wikipedia hole, https://en.wikipedia.org/wiki/Six-bit_character_code and then https://en.wikipedia.org/wiki/BCD_(character_encoding) explain that IBM created a 6-bit card punch encoding for alphanumeric data in 1928, that this code was adopted by other manufacturers, and that IBM's early electronic computers' word sizes were based on that code. (Hazarding a guess, but perhaps to take advantage of existing manufacturing processes for card-handling hardware, or for compatibility with customers' existing card-handling equipment, teletypes, etc.)
So backward compatibility is likely the most historically accurate answer. Fewer bits wouldn't have been compatible; more bits might not have been usable!
I'm guessing it was the smallest practical size to encode alphanumeric data, and making it bigger than it needed to be would have added mechanical complexity and expense.
https://en.wikipedia.org/wiki/Six-bit_character_code: "Six bits can only encode 64 distinct characters, so these codes generally include only the upper-case letters, the numerals, some punctuation characters, and sometimes control characters."
IIRC, six characters was also the maximum length of global symbols in C on early Unix systems, possibly just because that's what everyone was used to on earlier systems.
But note that I asked about why six characters, not why six bits per character -- however, your note is perhaps suggestive -- maybe the six-character limit is similar to the six-bit character after all: something established (possibly for mechanical reasons) in 1928? Perhaps?
Right, good questions. Pure conjecture on my part: maybe it's just that 36 is the smallest integral multiple of 6 that also had enough bits to represent integers of the desired width?
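That conjecture is easy to sanity-check (a quick Python loop of my own, not from any source): among multiples of 6, 36 is indeed the smallest that holds ten full signed decimal digits:

```python
import math

# Full signed decimal digits that fit in a w-bit word (one bit reserved for sign).
for w in (24, 30, 36, 42):
    digits = math.floor((w - 1) * math.log10(2))
    print(f"{w}-bit word -> {digits} decimal digits")

# 24-bit word -> 6 decimal digits
# 30-bit word -> 8 decimal digits
# 36-bit word -> 10 decimal digits
# 42-bit word -> 12 decimal digits
```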
One reason: Because 10 was enough to accurately calculate the differences in atomic masses, which was essential for atomic weapons design. (Source: my mother worked on 36-bit machines back in the 1950s. This was her explanation of the reason for the word size.)
Six characters is (in)famously the maximum length of linker symbols on IBM systems, at least for FORTRAN. Perhaps that had something to do with it. And of course, there comes a time when you have to pick a number, so why not 6x6, which is also good for 10^10 integers?
The Wikipedia article cited above has some practical reasons:
> Early binary computers aimed at the same market therefore often used a 36-bit word length. This was long enough to represent positive and negative integers to an accuracy of ten decimal digits (35 bits would have been the minimum). It also allowed the storage of six alphanumeric characters encoded in a six-bit character code.
The entirety of my knowledge about CTSS is from Hackers: Heroes of the Computer Revolution which, being a book about a bunch of misfits who hated the cloistered culture of IBM computing, wasn't complimentary. ITS (the Incompatible Timesharing System) was also developed with help from Project MAC.
It's interesting to note that those early 36-bit machines could store 2 addresses (plus maybe a few more bits) in a single word. I wonder what LISP would have been like if that had not been possible.
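Just to illustrate that point: two 18-bit pointers fit exactly in one 36-bit word, which is what made a one-word cons cell so natural. A small Python sketch (illustrative only; which half holds CAR is my arbitrary choice, not how any particular LISP laid it out):

```python
HALF_MASK = (1 << 18) - 1        # 18-bit address field
WORD_MASK = (1 << 36) - 1        # 36-bit word

def cons_word(car_addr, cdr_addr):
    """Pack two 18-bit addresses into one 36-bit word (CAR in the high half)."""
    assert 0 <= car_addr <= HALF_MASK and 0 <= cdr_addr <= HALF_MASK
    return ((car_addr << 18) | cdr_addr) & WORD_MASK

def car(word):
    return (word >> 18) & HALF_MASK

def cdr(word):
    return word & HALF_MASK

cell = cons_word(0o1234, 0o5670)
assert car(cell) == 0o1234 and cdr(cell) == 0o5670
```

On the PDP-10, the half-word instructions could pull either half out of a word in a single operation, which is part of why this layout was so convenient for LISP.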