Non-power-of-2 sizes are awkward from a hardware perspective. A lot of designs for e.g. optimized multipliers depend on the operands being divisible into halves; that doesn't work with units of 9 bits. It's also nice to be able to describe a bit position using a fixed number of bits (e.g. 0-7 in 3 bits, 0-31 in 5 bits, 0-63 in 6 bits), say to encode a shift amount or to select a bit from a byte; this also falls apart with 9, where you'd have to use four bits and accept a bunch of invalid values.
To summarize the relevant part of the video: the RDP wants to store each pixel in 18 bits (5 bits red, 5 bits green, 5 bits blue, 3 bits triangle coverage), and it then uses that coverage information to do a primitive but fast form of antialiasing. So SGI went with two 9-bit bytes per pixel and some magic in the RDP (remember, it's also the memory controller) so the CPU sees the 8-bit bytes it expects.
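To make that concrete, here is a rough C sketch of how an 18-bit pixel could be split across two 9-bit bytes. The field order and names are my own guess for illustration, not the documented RDP framebuffer layout.

    #include <stdint.h>

    /* Hypothetical 18-bit pixel: 5 bits red, 5 green, 5 blue, 3 coverage,
     * split into an upper and a lower 9-bit "byte". */
    typedef struct {
        uint16_t hi9;   /* bits 17..9 */
        uint16_t lo9;   /* bits 8..0  */
    } Pixel18;

    static Pixel18 pack_pixel(unsigned r, unsigned g, unsigned b, unsigned cov)
    {
        uint32_t word = ((r & 0x1Fu) << 13) | ((g & 0x1Fu) << 8) |
                        ((b & 0x1Fu) << 3)  |  (cov & 0x7u);
        Pixel18 p = { (uint16_t)(word >> 9), (uint16_t)(word & 0x1FFu) };
        return p;
    }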
Memory on the N64 is very weird; it's basically the same idea as PCIe, but for main memory. PCI: a big, fat bus that is hard to speed up. PCIe: a narrow, super-fast bus. So the CPU was clocked at 93 MHz, but the memory was a 9-bit bus clocked at 250 MHz. They were hoping this super-fast narrow memory would be enough for everyone, but having the graphics chip also be the memory controller proved to make the graphics very sensitive to memory load, to the point that the main thing that helps an N64 game get a higher frame rate is having the CPU do as few memory lookups as possible, which in practical terms means having it idle as much as possible. This has a strange side effect: while a common optimizing operation for most architectures is to trade calculation for memory (unroll loops, lookup tables...), on the N64 it can be the opposite. If you can make your code do more calculation with less memory, you can utilize the CPU better, because it is mostly sitting idle to give the RDP most of the memory bandwidth.
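As a concrete (and deliberately non-N64-specific) illustration of that trade-off, here is the classic lookup-table popcount versus the register-only version in C; the first leans on memory, the second leans on the ALU:

    #include <stdint.h>

    /* Table-driven popcount: a 256-entry table plus one load per byte.
     * Cheap when memory is fast, painful when the CPU should stay off the bus. */
    static uint8_t popcount_table[256];

    static void init_popcount_table(void)
    {
        for (int i = 0; i < 256; i++) {
            int n = 0;
            for (int v = i; v; v >>= 1)
                n += v & 1;
            popcount_table[i] = (uint8_t)n;
        }
    }

    static int popcount_lut(uint32_t x)
    {
        return popcount_table[x & 0xFF] + popcount_table[(x >> 8) & 0xFF] +
               popcount_table[(x >> 16) & 0xFF] + popcount_table[x >> 24];
    }

    /* Register-only popcount: more ALU work, zero data-memory traffic. */
    static int popcount_calc(uint32_t x)
    {
        x = x - ((x >> 1) & 0x55555555u);
        x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
        x = (x + (x >> 4)) & 0x0F0F0F0Fu;
        return (int)((x * 0x01010101u) >> 24);
    }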
> a common optimizing operation for most architectures is to trade calculation for memory(unroll loops, lookup tables...)
That really depends. A cache miss adds eons of latency and is thus far worse than doing a few extra cycles of work, although depending on the workload the reorder buffer might manage to hide the impact entirely. Memory bandwidth as a whole is also incredibly scarce relative to CPU clock cycles.
The only time it's a sure win is if you trade instruction count for data in registers or L1 cache hits, but those are themselves very scarce resources.
Yeah, but if the CPU can't use it, then it's kinda like saying your computer has 1,000 cores, except they're in the GPU and can't run general-purpose branchy code.
In fact, it's not even useful to say it's a "64-bit system" just because it has some 64-bit registers. It doesn't address more than 4 GB of anything, ever.
> In fact, it's not even useful to say it's a "64-bit system" just because it has some 64-bit registers.
Usually the size of general purpose registers is what defines the bitness of a CPU, not anything else (how much memory it can address, data bus width, etc.).
For instance, the 80386SX was considered a 32-bit CPU because its primary register set is 32-bit, despite the fact it had a 24-bit external address bus and a 16-bit external data bus (32-bit requests are split into two 16-bit requests, this was done to allow the chip to be used on cheaper motherboards such as those initially designed with the 80286 in mind).
Note that this is for general purpose registers only: a chip may have 80-bit floating-point registers in its FPU parts (supporting floating point with a 64-bit mantissa) but that doesn't make it an 80-bit chip. That was a bit more obvious when FPUs were external add-ons like the 8087 (the co-pro for the 16-bit 8086 family back in the day, which like current FPUs read & wrote IEEE 754 standard 32- & 64-bit format floats and computed/held intermediate results in an extended 80-bit format).
>Usually the size of general purpose registers is what defines the bitness of a CPU
The Motorola 68000 has 32-bit registers but it's usually considered a 16-bit CPU because it has a 16-bit ALU and a 16-bit data bus (both internal and external).
The Motorola 68k is a curious case because it was originally supposed to be a 16-bit CPU, not 32-bit, and the 24-bit addressing that ignored the upper 8 bits didn't help the perception.
Ultimately, the 68k being "16-bit" is a marketing thing from home computers that upgraded from the 8-bit 6502 and the like to the m68k but didn't use it fully.
I'd still call it a 32-bit CPU as it had 32-bit registers and instructions (and not just a few special-case 32-bit instructions, IIRC). Like the 386SX it had a 16-bit external data bus, but some of its internal data routes were 16-bit also (whereas the 386SX had the full 32-bit core of a 386, later renamed 386DX, with the changes needed to narrow the external data bus), as were some of its ALUs, hence the confusion about its bit-ness.
In a way, the fact that you have the home computer market calling it 16-bit, while at the same time workstation systems plainly talk about a 32-bit ISA, shows how much of a marketing issue it is :)
I'm not aware of that one off the top of my head. If it naturally operated over 16-bit values internally (i.e. it had 16-bit registers and a primarily 16-bit¹ instruction set), at least as fast as it could work with smaller units, then probably yes.
----
[1] So not a mostly 8-bit architecture with 16-bit add-ons. The 8086 had a few instructions that could touch 32 bits, multiply being able to give a 32-bit output from two 16-bit inputs for instance (though the output always went to a particular pair of its registers), but a few special cases like that don't count, so it is definitely 16-bit.
Well, the 6809 was basically the same in these respects.
Internal registers are 16-bit, with the 16-bit accumulator (D) usable as two 8-bit registers (A, B) as needed. Index X, Y, Stack, User Stack, and PC are all 16-bit registers.
The Hitachi 6309 adds to that with up to 32-bit register sizes in specific cases.
In any case, the ALU and data transfers are 8 bits, and I am not sure I ever saw the 6809 referenced as a 16-bit device.
I'd say that it's a somewhat extended 8-bit device, because it's still an 8-bit-focused architecture (6800) with extensions towards better handling of 16-bit values, and certain common parts involved, including the zero/direct page, are also effectively an increase in flexibility for 8-bit code, not so much a move to 16-bit.
I'm glad you like it. I've used that term for years because I don't know what to call a chip like the 6809.
It certainly can punch well above its weight class, at least when compared with the 6502, Z80, and some others.
I really can't call it 16-bit, because of the small address space, and the fact that the ALU is 8-bit. But you can't always go by the ALU, because I believe the Z80 and 8080 have 4-bit ALUs. And I don't think there's anyone that would call those chips 4-bit.
Motorola seemed to design things in a specific way that people really liked, and this pushing of the limits of what an 8-bit design can be seems to be one of those things, because even going back to the 6800, the one index register was 16-bit.
And lastly, the 68k is an exemplary design, but in the same design language it is 32-bit curious.
Plato argued that 7! was the ideal number of citizens in a city because it was a highly factorable number. Being able to cut numbers up is a time-tested favorite. That's why there are 360 degrees.
360 degrees in a circle predates Plato by quite a lot (2000 years I think!). It comes from the Sumerians more than 4000 years ago. They used a method of counting on fingers that goes up to 12 on one hand and 60 using both hands, so their numbering system was based on 60. 360 is 6 * 60 and also roughly how many days in a year.
Later societies inherited that from them, along with 60 minutes in an hour.
Not that these are exclusive, but I thought it's a rounding of 365.25 days a year stemming from Egypt. 360 is a pretty useful number of degrees for a starry sky that changes once a night.
I just can't resist pointing out that a "minute" is what you get when you split up an hour into 60 minute (i.e. the word pronounced my-newt) pieces, and a "second" is what you get if you break a minute into 60 pieces (i.e. you've performed the division a "second" time).
By this logic, 0.0166... (recurring) seconds should be called a "third".
I've always held the opinion that the ideal base for our day-to-day computation is 12. It's close enough to 10 that most things would work just as well (like you just need to remember 2 more digits), but it's actually divisible by 3, 4, and 6, which is a lot more useful than 5, compared to base 10.
> "(like you just need to remember 2 more digits)"
"The standard among mathematicians for writing larger bases is to extend the Arabic numerals using the Latin alphabet, so ten is written with the letter A and eleven is written with the letter B. But actually doing it that way makes ten and eleven look like they're too separate from the rest of the digits so you can use an inverted two for ten and an inverted three for eleven. But those don't display in most fonts so you can approximate them with the letters T and E which also happen to be the first letters of the English words ten and eleven. But actually as long as we're okay for using the Latin alphabet characters for these digits then we might as well use X for ten like in Roman numerals. But actually now we're back to having them look too different from the other ten digits so how about instead we use the Greek letters Chi and Epsilon but actually if we're using Greek letters then there's no association between the X looking letter and the number ten, so maybe you can write ten with the Greek letter delta instead.
And all you really need to learn is those 'two new digits' and you're ready to use dozenal."
- Jan Misali in his comedy video on why base 6 is a better way to count than base 12 or base 10 https://www.youtube.com/watch?v=qID2B4MK7Y0 (which is a pisstake and ends up making the point that Base 10 isn't so bad).
("in dozenal, a seventh is written as 0.186X35 recurring because it's equal to one gross eight dozen ten great gross ten gross three dozen five eleven gross eleven dozen eleven great gross eleven dozen eleventh's").
Ideally you learn with what you are born with. It's easy to have base 10 as you have ten fingers. If we only had 8 fingers, we could have ended up with octal.
Yeah, metric is cool and all, you can divide by ten and multiply by ten. But even better would be a hexadecimal system so that you could halve, third and quarter it. Plus it's n^2 so it's a perfect square \s
7! = 5040 has the less-than-useful property of being quite large for interacting with human scales.
5! = 120, however, lacks the fine precision required at human scale. Haven't done the math, but it's probably something like using 3.1 as the analog of pi.
360 seems like it might have been chosen based on a mix of precision and practicality. Many small prime factors (2, 2, 2, 3, 3, 5), with an extra copy of each earlier prime for every prime added. 75600 is too big, and 12 is what analog clock faces use as their primary number.
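(If I'm reading that pattern right: 2 x 2 x 2 x 3 x 3 x 5 = 360, and adding the next prime plus one more copy of each earlier prime gives 2^4 x 3^3 x 5^2 x 7 = 75,600.)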
And many of the conversions between metric and imperial align with the Fibonacci sequence at any order of magnitude. 130 km/h is roughly 80 mph simply because the Fibonacci sequence has 8 and 13.
Obviously not an emergent property, but it shows how these things were designed.
I don’t think any common conversions fall near there other than miles->km. It’s certainly not the case that the systems were designed to have the golden ratio as conversions between the two.
1 mile = 1,000 [double] paces, each two 0.8m steps = 1,600m
1m = 1e-7 times the quarter-meridian from the North Pole to the equator, via Paris for a croissant, apparently.
So kind of a coincidence... but a very neat one. Meanwhile, the ratio of adjacent Fibonacci numbers converges to the golden ratio, (1 + sqrt(5))/2, which is approx 1.618.
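For anyone who wants the coincidence in numbers, a quick C check (the km-per-mile figure is the exact modern definition; everything else is just printed for comparison):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double km_per_mile = 1.609344;                /* exact by definition */
        double phi         = (1.0 + sqrt(5.0)) / 2.0; /* golden ratio */
        printf("km per mile:  %.4f\n", km_per_mile);  /* 1.6093 */
        printf("golden ratio: %.4f\n", phi);          /* 1.6180 */
        printf("13/8:         %.4f\n", 13.0 / 8.0);   /* 1.6250 */
        printf("80 mph = %.1f km/h\n", 80.0 * km_per_mile); /* ~128.7 */
        return 0;
    }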
Author here. It's true that you'd need one more bit to represent a bit position in a word, like for shifts, but we're already vastly over-provisioned; even in 64-bit registers we're only using six of eight bits. (Plus, in a lot of places we'd have that extra bit around!)
Some hardware circuits are a bit nicer with power-of-two sizes, but I don't think it's a huge difference, and hardware has to include weird stuff like 24-bit and 53-bit multipliers for floating point anyway (which in this alternate world would probably be 28-bit and 60-bit?). Not sure a few extra gates would be a dealbreaker.
Weird byte lengths were very much the norm in early computing, but no one (seemingly) ever mass-produced a 10-bit computer[1].
In the first 3⁄4 of the 20th century, n is often 12, 18, 24, 30, 36, 48 or 60. In the last 1⁄3 of the 20th century, n is often 8, 16, or 32, and in the 21st century, n is often 16, 32 or 64, but other sizes have been used (including 6, 39, 128).
"DEC's 36-bit computers were primarily the PDP-6 and PDP-10 families, including the DECSYSTEM-10 and DECSYSTEM-20. These machines were known for their use in university settings and for pioneering work in time-sharing operating systems. The PDP-10, in particular, was a popular choice for research and development, especially in the field of artificial intelligence. "
"Computers with 36-bit words included the MIT Lincoln Laboratory TX-2, the IBM 701/704/709/7090/7094, the UNIVAC 1103/1103A/1105 and 1100/2200 series, the General Electric GE-600/Honeywell 6000, the Digital Equipment Corporation PDP-6/PDP-10 (as used in the DECsystem-10/DECSYSTEM-20), and the Symbolics 3600 series.
Smaller machines like the PDP-1/PDP-9/PDP-15 used 18-bit words, so a double word was 36 bits."
Personally I think 12/48/96 would be more practical than the current 8/32/64. 32 bits is almost trivially easy to overflow whereas 48 bits is almost always enough when working with integers. And 64 bits is often insufficient or at least uncomfortably tight when packing bits together. Whereas by the time you've blown past 96 you should really just bust out the arrays and eat any overhead. Similarly I feel that 24 bits is also likely to be more practical than 16 bits in most cases.
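(For scale: 2^32 is about 4.3e9, 2^48 is about 2.8e14, and 2^96 is about 7.9e28.)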
12-bit color would have been great. In the old days, 4 bits for each of RGB, or even packing 2 pixels per byte. Today 12 bits per channel would be awesome, although high-end cameras seem to be at 14 (which doesn't fit bytes well either).
Instruction sets - 12 bits for small chips and 24 for large ones. RISC-V instructions encode better in 24 bits if you use immediate data after the opcode instead of inside it.
Physical memory is topping out near 40 bits of address space, and some virtual address implementations don't even use 64 bits on modern systems.
Floating point is kinda iffy. 36 bits with a more-than-24-bit mantissa would be good. Not sure what would replace doubles.
Yeah, it would be much more practical for color. 12-bit rgb4, 24-bit rgb8 or rgba6, and 48-bit rgb16 or rgba12 would all have proper alignment. The obvious rgb12 would obviate the need for the unholy mess of asymmetric 32-bit packed RGB formats we "enjoy" today.
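A quick sketch of how those would pack, with the hypothetical 12- and 24-bit formats held in ordinary C integers for illustration:

    #include <stdint.h>

    /* rgb4 fills a single hypothetical 12-bit word exactly. */
    static uint16_t pack_rgb4(unsigned r, unsigned g, unsigned b)
    {
        return (uint16_t)(((r & 0xFu) << 8) | ((g & 0xFu) << 4) | (b & 0xFu));
    }

    /* rgba6 fills a 24-bit word exactly, with no padding bits. */
    static uint32_t pack_rgba6(unsigned r, unsigned g, unsigned b, unsigned a)
    {
        return ((r & 0x3Fu) << 18) | ((g & 0x3Fu) << 12) |
               ((b & 0x3Fu) << 6)  |  (a & 0x3Fu);
    }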
Address space - Intel added support for 57-bit virtual addresses (up from 48 bits) in 2019, and AMD in 2022. 48-bit pointers obviously address the vast majority of needs. 96-bit pointers would make the developers of GC'd languages and VMs very happy (lots of tag bits).
For floats presumably you'd match the native sizes to maintain alignment. An f48 with a 10-bit exponent and an f96 with a 15- or 17-bit exponent. I doubt the former has any downsides relative to an f32, and the latter we've already had the equivalent of since forever in the form of 80-bit extended-precision floats with a 15-bit exponent.
Amusingly, I'm just now realizing that the Intel 80-bit representation has an exponent as wide as IEEE binary128's.
I guess high-end hardware that supports f128 would either be f144 or f192. The latter maintains alignment, so presumably that would win out. Anyway, pretty much no one supports f128 in hardware to begin with.
The fixed-point TI DSP chips always had a long int that was 48 bits. Intel had 80-bit floating-point registers before SIMD registers took over. And the PDP-11...
Powers of two aren't as ubiquitous as they seem. If anything, the hardware uses whatever sizes it wants, and that gets abstracted from the rest of the world by compilers and libraries.
Not really - I worked on a DSP with 9-bit bytes in the '90s (it was focused on MPEG decode for DVDs, new at the time), largely because memory was still very expensive and MPEG-2 needed 9-bit frame-difference calculations (most people do this as 16 bits these days, but back then, as I said, memory was expensive and you could buy 9-bit parity RAM chips).
It had 512 72-bit registers and was very SIMD/VLIW, and was probably the only machine ever with 81-bit instructions.
The bit shifts were my first idea too for where this would break down; but actually, shifts of 1-8 bits would be just fine, and they can be encoded in 3 bits. 0 and 9 are special cases anyway (a nop and a full nonyte/nyte) for the programmer/compiler to become a tiny bit more clever about, or to use the shift-by-register instruction instead.
This is not the case for 18 or 36 bits; I would imagine an architecture like this wouldn't have a swap/swapb but a shuffle-type instruction to specify where each nyte is expected to end up, encoded in 4x2 bits in the most generic case.
With this, I think I can get behind the 9-bit archs with the niceties described in the post.
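Here is a rough C simulation of that shuffle idea, with the 36-bit word held in a uint64_t; the encoding (2 control bits selecting the source nyte for each destination slot) is just an assumption following the comment above, not an existing instruction:

    #include <stdint.h>

    static uint64_t nyte_shuffle(uint64_t word36, uint8_t control)
    {
        uint64_t result = 0;
        for (int dst = 0; dst < 4; dst++) {
            int src = (control >> (2 * dst)) & 0x3;          /* source nyte index */
            uint64_t nyte = (word36 >> (9 * src)) & 0x1FFu;  /* extract 9 bits */
            result |= nyte << (9 * dst);                     /* place in slot dst */
        }
        return result;
    }

For example, a control byte of 0x1B (binary 00011011) reverses the four nytes, which would cover the swap/swapb case.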
Reminds me of the GA144 Forth chips, where it is effectively a 20-bit architecture (vs a 32-bit architecture). The instructions are 5-bit, and so 4 instructions can fit in 20 bits.
There were very good reasons for it: indigenous designs were obsolete by the time they left the drawing boards, and countless design bureaus cost stupid amounts of money while producing dozens of incompatible computers. By the time they decided to adopt ES EVM they lagged by some 5 years, and they continued to lag further behind.
But with 5-valued electronics: up, down, left, right and charm...
You could have the equivalent of 45-bit numbers (44 + parity).
And you could have the operands of two 15-bit numbers and their result encoded in 9 quint-bits, or quits. Go pro or go home.
It works poorly at any speed. Hi-Z is an undriven signal, not a specific level, so voltage-driven logic like (C)MOS can't distinguish it from an input that's whatever that signal happens to be floating at. In current-driven logic like TTL or ECL, it's completely equivalent to a lack of current.
Times have changed. GNOME people will yell at you for mentioning things as innocuous as pixel measurements. You'd probably be crucified for suggesting there's a hardware-correct way of handling address space.
Don't those issues only apply to an odd number of bits, rather than non-power-of-2? For example, 12 isn't a power of 2 but doesn't suffer from any of those things you mentioned.