Having worked with devicetrees and hated it, I don't like this idea. I don't like the modern world of ARM and embedded where you have hardcoded firmware images for every single device. I like the x86 model, where you have a one size fits all boot disk which just loads different drivers. You don't port your OS to a new computer, you just write drivers for the new hardware parts. And your hardware describes itself to the OS.
Granted, UEFI is grotesquely complicated. Probably you could replace most of it with an EEPROM that has a mapping of device ID to memory address, and some instructions on how to speak to the embedded controller (basically, ACPI).
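To make that concrete, the kind of table I'm imagining could be as simple as this (purely hypothetical, nothing like a real standard; the IDs and addresses are made up):

    #include <stdint.h>

    /* Hypothetical: one entry per on-board peripheral, stored in a small
     * EEPROM that the OS reads at boot instead of parsing ACPI tables. */
    struct board_device {
        uint32_t device_id;   /* which IP block this is */
        uint64_t mmio_base;   /* physical base address of its registers */
        uint32_t mmio_len;    /* size of the register window */
        uint32_t irq;         /* interrupt line, if any */
    };

    /* Example contents for an imaginary board. */
    static const struct board_device board_devices[] = {
        { 0x1001, 0xFE200000, 0x1000, 33 },  /* UART */
        { 0x1002, 0xFE300000, 0x1000, 34 },  /* SD host controller */
    };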
Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world, so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
Historically, OpenFirmware provided exactly what you were asking for — self-describing devices, device driver methods hanging off the device nodes, and portable, architecture-independent bytecode-based option ROM drivers on expansion cards.
Devicetree is a pared down version of OF; they retained the tree containing key/value device metadata, and dropped all the good bits.
Something like the missing Forth interpreter would be invaluable on development boards as well as for diagnostics on embedded devices. Or simply for fun. I used to spend hours exploring PowerMacs and Forth programming, all thanks to OpenFirmware.
Microsoft won't sign EBC drivers, but even before Secure Boot made that difficult I think we'd found a total of 2 EBC UEFI drivers in the real world. It's honestly easier to just embed an x86 emulator.
How are they NOT the BEST bits? With Open Firmware you can build Open Source systems that boot on hardware that you’ve not only never seen but never will see. Open Firmware + ELF can still be our future if we embrace it.
I wonder if anyone could implement an OF binary for the Z-Machine.
Can OF read files? I mean, not just boot them into memory at $ADDRESS, but read/parse them.
I know of a Tetris, so it's doable.
100% agree. It’s soooo painful messing with device trees and building a boot image. I’ve got a rock pi n10 for which I’ve been struggling to compile a newer kernel image that works (the device vendor has only published Linux ISOs up to buster). The whole process is so weird and rough, working with u-boot and ARM Trusted firmware, device trees, etc. it’s like a 20 step process to go through. Installing Gentoo for the first time is 10x easier, in my opinion.
I too have found installing Gentoo to be way easier than building a working image for an ARM development board. I think a lot of that is due to Gentoo having better documentation and tools, though. Then again, part of the reason Gentoo can write good documentation is because every x86_64 computer works pretty much the same way!
>Having worked with devicetrees and hated it, I don't like this idea. I don't like the modern world of ARM and embedded where you have hardcoded firmware images for every single device. I like the x86 model, where you have a one size fits all boot disk which just loads different drivers. You don't port your OS to a new computer, you just write drivers for the new hardware parts. And your hardware describes itself to the OS.
Having worked with ARM too I have to disagree. I found device trees really flexible and easy to get going.
However, I worked with Rockchip hardware so I had pretty good documentation, lots of examples, and source code for everything (for an old Linux kernel, but still). This basically ensured I could do everything I wanted.
Of course, when a vendor doesn't provide documentation and example driver source this may not be so easy.
I guess the different experience is because I had to deal with Allwinner.
But even with Rockchip, different vendors have different levels of support. We had one potential vendor that had excellent documentation, delivered the source code to everything, and responded really quickly and helpfully - in English - to questions. But they were way too expensive. And we had another who just gave us some mystery Android + Linux images, GPL violation included. No schematics. I spent some time porting Linux from one board to the other - they were very similar - but never finished, and was constantly afraid of configuring a voltage regulator wrong and blowing it up, or something similar.
Compare to x86 where I can just order random parts from Newegg, screw them together and pop in a boot USB drive, and it will mostly work. I know that you get what you pay for, but at least I would expect the SBC vendors to do hardware enablement, and provide some kind of abstraction that I can run my OS on (or upstream their code).
Have you ever configured a generic MIPI LCD panel and shown a custom splash screen?
With EDK2, on x86?
Or configured and tuned some custom DDR that happens to already be soldered on the board? Or bootstrapped some parameters from an EEPROM or external microcontroller?
Compiling and customizing EDK2 for x86 (to boot Windows) was a nightmare and I never want to do it again. I'd rather "make aboot -j8" my entire life.
Please tell me you've found the secret formula to automagically add ANY display panel purchased somewhere in Asia, on x86. Otherwise I'll go back to praising the devicetree.
I’ve been struggling with Rockchip! Do you have any links to the docs you’ve used? The ones I have found for the rock pi n10 on radxa’s site are very incomplete.
Yes, but all my docs are for rk3566, pine's Quartz64-A and soquartz boards. Radxa uses the same chip in a few places, but I'm not sure if the board schematics will be of any use to you. If this is the chip you need info for, let me know by replying to this message and I'll do a pastebin listing the docs I have.
Everything I have came from 3 sources. There is a Linux BSP SDK file (36GB or so) and an Android SDK (75GB or so). Both have docs folders with lots of docs documenting various SDKs (for the GPU, NPU, camera, etc). Those files also contain driver source for old Linux kernels that can be adapted to newer ones with a bit of work. Both are linked on pine's wiki under soquartz. Just Google soquartz pine wiki.
>where you have a one size fits all boot disk which just loads different drivers
But that's exactly the point of device trees. You should be able to use a generic ARM kernel build, and it loads the appropriate drivers based on the dtb provided by system firmware.
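The driver side of that match is mundane: a driver just declares which compatible strings it handles, and the generic kernel binds it to whatever nodes the dtb describes. A stripped-down sketch of a Linux platform driver ("acme,foo-uart" is a made-up example; error handling and the actual device code omitted):

    #include <linux/module.h>
    #include <linux/of.h>
    #include <linux/platform_device.h>

    /* The dtb contains e.g. compatible = "acme,foo-uart"; the generic
     * kernel matches that string against this table and probes the driver. */
    static const struct of_device_id foo_uart_of_match[] = {
        { .compatible = "acme,foo-uart" },
        { /* sentinel */ }
    };
    MODULE_DEVICE_TABLE(of, foo_uart_of_match);

    static int foo_uart_probe(struct platform_device *pdev)
    {
        /* Register window, IRQ, clocks, etc. all come from the DT node,
         * not from code compiled for one specific board. */
        return 0;
    }

    static struct platform_driver foo_uart_driver = {
        .probe  = foo_uart_probe,
        .driver = {
            .name           = "foo-uart",
            .of_match_table = foo_uart_of_match,
        },
    };
    module_platform_driver(foo_uart_driver);

    MODULE_LICENSE("GPL");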
> Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world
What do you mean by that? Every ARM server has UEFI/ACPI, and it's now looking like every new PC using ARM will include it out of the box. The Google Pixel smartphones use UEFI. There are even UEFI images for Raspberry Pi 4. Sure, the embedded space is probably going to be holding out for a while longer, but seems disingenuous to say UEFI is going nowhere in the ARM world.
True, but most of today's SBCs and phones don't support UEFI. If you look at SBSA, it has server in the name.
You should be able to put vanilla Ubuntu on a SD card and pop it into a computer and get at least the basic features working. Instead, now you have the choice between opaque firmware images from the vendor (with an ancient kernel, and random additions to userspace), or building your own images with a lot of tinkering.
A happy medium might involve something like a BIOS that's just a vendor-provided library in an EEPROM but which somehow is guaranteed not to run anything in hidden cores, BMCs, or via interrupts that the OS does not set up.
Even then, it would be better still if that library were open source, so that the OS could carry its own artifacts of it. Not because open source is great, but so the OS can truly be in control. The point u/bcantrill makes about the OS not getting DRAM error information gives me cold sweats!
If the library is open source, then, yes, each boot loader and kernel will have to be built for the hardware it will boot on, at least until it gets far enough to get additional firmware/drivers from a boot image RAM disk or whatever.
I may be completely wrong, but I remember reading that Apple M1/M2 computers use a similar approach, with a minimal firmware in the boot image and the rest loaded with the OS.
> so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
From firsthand experience, I can say that at least installing Linux on a UEFI ARM system of the type that QEMU emulates was surprisingly straightforward, and even the ARM version of Windows seemed to detect the virtual hardware just fine.
...and I say this as someone who has worked on the PC platform for over 3 decades, and am a fan of the original BIOS. UEFI is a bloated mess, but it's better than nothing.
Most ARM hardware is cellphones, Raspberry Pis and the Mac M1, which certainly aren't that type.
But a lot of ARM hardware is that type. The keywords are SBSA / SBBR / SystemReady. If your hardware is SBBR compatible then Fedora and Ubuntu's ARM64 iso, and Windows ARM64, downloaded from their website, will at least boot fine (drivers are a different question as always).
DeviceTree is low-level enough that you can implement UEFI on top of it. There's a UEFI port for the Raspberry Pi 4 at https://rpi4-uefi.dev/ that produces an SBBR layer, allowing it to boot any off-the-shelf ARM64 SBBR distro.
I think it can be fairly said that the thesis of the talk is that UEFI is not, in fact, better than nothing -- and that in a literal sense, having nothing (along with a well-documented part!) is vastly preferable to UEFI.
Well-documented parts are few and far between in the ARM space. Anyway, if you have a really well-documented part, it will also be supported by lightweight solutions like coreboot (for x86) and U-Boot (for other archs). You're not going to be limited by UEFI either way.
The problem isn't with device trees. And actually, ACPI is just like DT really (but more complicated because it has "methods" in addition to just properties).
The difference is that you don't see it so much because the PC vendor has already written the ACPI tables.
The major difference on X86 is that it is (for historical reasons) much more standardised at the hardware level than on ARM (at least on embedded ARM, there is more standardisation in the ARM server space). It also makes more use of discoverable busses like USB and PCI so the parts that need to be described in ACPI are smaller than the parts that have to be described in DT on ARM (as much is either "implicitly standardised" or on a discoverable bus).
And that is because all PCs are basically the same whereas there is a huge variety in ARM SoC hardware and boards, mostly for understandable reasons (there are a huge variety of use cases and price points in embedded systems).
If you were to build an embedded x86 system not using the PC architecture (north bridge, south bridge, etc.) but using the processor in your own design, you would have exactly the same problem.
That said you don't have to have "hardcoded firmware images for every single device" in the embedded ARM world.
I build an in-house BSP for a product family of ~15 devices based on 3 different SoCs, and we have a single kernel and userspace (Debian-based) for all of them. All the DTs are on the boot partition and the bootloader selects the right one based on hardware information (typically a small EEPROM).
The only part that has to be different is the bootloader itself (and then not different for every device but for each SoC) - so we have 3 bootloaders all supplied as part of the BSP and when building the install media we say which one to write.
Even that, in theory, could be avoided, at least on SoCs with the same ISA by improving u-boot a bit to make stuff like clocks configurable from data (the u-boot device tree) like the kernel already does. This may one day be possible once u-boot completely adopts the driver model.
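For what it's worth, the DT selection logic itself is trivial. A minimal C sketch (the board IDs and .dtb names are invented; in practice it's a few lines of U-Boot board code or scripting):

    #include <stdint.h>

    /* Sketch only: the bootloader reads the ID from the small EEPROM,
     * picks the matching blob from the boot partition, and passes its
     * address to the kernel. */
    struct board_id {
        uint16_t model;
        uint16_t revision;
    };

    const char *dtb_for_board(const struct board_id *id)
    {
        switch (id->model) {
        case 0x0001: return "acme-gadget-a.dtb";
        case 0x0002: return "acme-gadget-b.dtb";
        default:     return "acme-fallback.dtb";
        }
    }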
You realize that you're just moving the ball around on who provides what, right? Someone has to write that software that "describes itself to the OS" and rather than having the hardware describe itself to the OS, why can't the OS itself determine what hardware it has?
That "describes itself to the OS" part is actually often being done by a complete OS as well.
> Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world, so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
That's a really good thing. Bryan rants about ARM trying to go that way and thinks it's a terrible idea. I'm glad it's failing, assuming that's actually the case.
> Bryan rants about ARM trying to go that way and thinks it's a terrible idea. I'm glad it's failing, assuming that's actually the case.
I'd be curious why because the PC model seems to be way more user-friendly, competition-friendly, and long-term-reliable than the current ARM model. I wish I could just download an ISO of Android 13 and install it on almost any Android phone like I could Windows.
The PC model worked because a lot of people put a lot of blood sweat and tears into making it work and then embraced the sunk cost fallacy to a degree only previously seen in politics and religion.
Open Firmware—between the device tree system and the bytecode driver system—plus ELF provide a truly universal boot protocol for 32-bit and larger systems. And not only that, major and tractable implementations are now Open Source too, including things like the bytecode compilers.
It’s practically unconscionable that modern ARM and RISC-V don’t simply require Open Firmware boot support.
In RISC-V, the boot model is standardized. There's SBI (from years ago, where the open implementation opensbi is used by most boards) and UEFI (implemented not in any random way, but after SBI runs, and as per the spec published earlier this year).
All the pieces are in place for RISC-V to effectively dodge boot chaos. Way before the server/workstation RISC-V hardware hit the market.
In ARM, it's a mess. Because ARM took way too long to do way too little re: boot standardization.
You also have to look at ARM's largest customers. They influence development, and the most prominent ones aren't necessarily interested in ARM becoming a more open platform. Which it fundamentally isn't, just as x86 isn't.
It is wildly easier to get a mouse to tell the operating system first off that it is a mouse, than it is for the operating system to analyze the mouse and make that determination for itself.
Imagine scaling that up to GPUs! The device needs to describe itself because IT is unique.
In fact the mouse doesn't tell the operating system that it's a mouse; the operating system reads off what the device is as part of the USB enumeration handshake. USB is an example of a system done largely correctly, as compared to all the other processors embedded into an SoC.
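Concretely, what the host reads during enumeration is a fixed-format descriptor that the device hands over; a C rendering of the standard device descriptor layout looks like this (field names are from the USB spec; for a mouse, the HID class and boot protocol are actually reported in the interface descriptor, but the self-description mechanism is the same):

    #include <stdint.h>

    /* Standard USB device descriptor, read by the host during enumeration.
     * The device describes itself; the host never has to guess what it is. */
    struct usb_device_descriptor {
        uint8_t  bLength;            /* 18 */
        uint8_t  bDescriptorType;    /* 1 = DEVICE */
        uint16_t bcdUSB;             /* spec version, e.g. 0x0200 */
        uint8_t  bDeviceClass;       /* 0 = defer to the interface descriptors */
        uint8_t  bDeviceSubClass;
        uint8_t  bDeviceProtocol;
        uint8_t  bMaxPacketSize0;
        uint16_t idVendor;           /* who made it */
        uint16_t idProduct;          /* what it is */
        uint16_t bcdDevice;
        uint8_t  iManufacturer;      /* indices of string descriptors */
        uint8_t  iProduct;
        uint8_t  iSerialNumber;
        uint8_t  bNumConfigurations;
    } __attribute__((packed));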
For what it's worth, a holistic system allows the user to bring their own code only above some layer. For Unix systems of ye olden days, that would be your C source code that you compile. For Chrome OS, that's gonna be websites and extensions. For Oxide, it seems like that is going to be VM images.
But it's important to note that these days, there will always be somebody who wants to bring their code in at a lower level. And the immediate reaction of all vendors is "But why would you want to do that?". Why would you want to bring your own Unix flavor to our hardware? Why would you want to run Linux executables on Chrome OS? Why would you want to run a different hypervisor on Oxide racks?
Those people will exist. And layers like BIOS and UEFI allow system vendors to say "Eh, sure, we don't think that what you're doing is wise, but whatever, here's a standard way we'll allow you to use". Holistic systems don't have these standards. They are only usable as their creators intended. And not everyone will agree with their creators' intentions. But the work required to refit them for another purpose (one the hardware is perfectly capable of, but the software fights against) is usually too high, and you'll be better off using something that is worse but that left you an avenue to customize it for your purpose easily.
My main takeaway from this talk is that we need some standards for phase-based booting, as there are some very cool improvements that could be done with it. But I see no need for holistic systems as described in that talk, I just see a need for better, more transparent systems. And a system doesn't have to be holistic to be transparent, or good.
>But I see no need for holistic systems as described in that talk, I just see a need for better, more transparent systems. And a system doesn't have to be holistic to be transparent, or good.
my takeaway is that it's not the holistic systems themselves that are needed -- it's that when holistic systems aren't enforced that vendors will go mad with power.
IBM Z systems are arguably the most holistic systems currently available, everything down to the silicon is designed in a coherent way. And yet, it is all incredibly proprietary.
Holistic systems don't prevent vendors from going mad with power. Competition and pressure to document their products does.
But it's terrible power for them because now they have to maintain that firmware, and that's very expensive, so they don't do a good job of it at all, and we can pretend it's OK right up until it's not.
Legal liability for broken, unmaintained firmware would very quickly bring about the world that BMC wants.
Legal liability for the madness of alarming, confusing systems in the flight control room, when the airplane goes down with more than 200 lives lost at sea, works out to about half of an exec's $400K yearly income.
This was my feeling as well. PC BIOS and UEFI are far from ideal, but does it really make sense that, for every single ARM (or whatever) board, every single OS vendor has to write their own bringup code? Certainly some of it will be reusable across OSes, but it's still a huge amount of work. This would also disadvantage a hobbyist OS developer who wants to build a new OS, whether as a toy or for eventual educational or production use. Suddenly they have to read through pages and pages of low-level documentation for the board they have, and then once they get it working, it still won't work on any other board.
I just don't think it's realistic to expect every vendor to provide deep-enough documentation to make this work. Even the cooperative ones (of which I assume there are currently very few) would probably have trouble with it.
And even if you do have a bunch of cooperative vendors who are great at releasing hardware documentation, that still doesn't save every OS developer from the tedious, error-prone process of converting that documentation into low-level bringup code.
Isn't it so much better that a Linux developer, when confronted with a new board, need only craft a device tree description (or use one provided by the vendor), and then -- aside from any exotic peripherals that may not have drivers written yet -- the board will mostly work? It's at least likely to boot, even if it comes up in a not-fully-functional state.
Cantrill talks about an experience he had at Joyent, where he diagnosed that memory was failing on systems, but the errors were suppressed by the firmware. Having adequate documentation of the firmware doesn't help you in this kind of situation. Holistic systems would not lie about what the system is doing in this way.
Good firmware tells no lies. But it doesn't have to be built with a specific OS in mind to be good - on the contrary, I believe that building firmware with specific OS in mind is what got us into this whole bad firmware mess.
And yet, holistic systems can lie. They can lie horribly about their state, and you'd be none the wiser. Being "holistic", created with the intention of the whole, does not mean that the system cannot lie. If the whole is created with the intention of lying to you, then it will.
We don't need holistic systems. We need good systems. A system doesn't need to be holistic to be good, and a system doesn't need to be good to be holistic.
So what would you want to have happening in the case Cantrill was talking about? The aim is to have good systems, and there have been plenty of bad holistic systems in the history of computing. But by good, non-holistic systems, do you mean that we have a motherboard with well-documented firmware that the buyer of the hardware can't see into except according to the interfaces the vendor has provided? This seems to be potentially hiding the kind of information about a malfunctioning system that Cantrill gets so worked up about.
I do think that one of the prerequisites of (firm/soft)ware being good is being open source. For hardware to be good, I do believe it has to be well documented.
If we go by those two requirements, we can see how it works. We can fix things ourselves. And most importantly, anyone can create their own good firmware, without the need for the system to be designed in a holistic way.
Literary tangent: The title seems to be an allusion to Shakespeare. "I come to bury Caesar, not to praise him." Except, the entire purpose of Antony's funeral oration is to praise Caesar. His intent was to defend Caesar's legacy and shift public opinion against Caesar's assassins.
Those summary paragraphs feel like they were written in a deliberately obfuscated style; not exactly a good first impression.
> Rather than have one operating system that boots another that boots another, we have returned the operating system to its roots as the software that abstracts the hardware: we execute a single, holistic system from first instruction to running user-level application code.
...also known as single-purpose firmware for an embedded system. I'm in agreement with the other comment here that this is not a good idea. Standardised interfaces like what the BIOS provides let the OS not be tied to subtle differences in hardware.
It's possible that a single-purpose firmware is too few stages, and it's also possible that the current acpi->firmware->uefi->grub->initrd->linux is too many stages.
UEFI does have many of the features of a full operating system including shell, multiprocessing, networking and it really should be Linux in this position instead. LinuxBoot is another attempt at reducing layers here.
This is an amazing talk, and I think I have seen it on HN before. It lays out the problem of operating systems on modern computers well. I'm not sure that it offers the right solution by continuing to offer a system with the same set of abstraction layers as a solution to the problem of the "operating system" abstraction.
I think the folks at Oxide are probably closer to the right track by silo-busting the BIOS, OS, and hypervisor layers of a modern VM-hosting stack.
Edit: I should also add that this talk lays out a huge, gaping hole in the field of OS research, which might be its most important contribution.
But it's not an OS problem; the OS being written that way is only the symptom because it can't touch the real hardware.
Cantrill and Roscoe both pointed at the SoC vendors: if there's a new problem to solve, they add more proprietary, undocumented cores and enclaves with tightly held secret functions and OSes of their own, all of which are out of visibility of the OS. Some of them are even designed at odds with the user's interests, such as DRM goop.
This gets back to the war on general-purpose computing, as Cory Doctorow put it. The hardware is ceasing to work for the user's interest and is starting to work for everyone else in the stack against the user.
I have a background in digital circuits and computer architecture, and I completely disagree with you that these SoC components should be run by the traditional CPU cores. Most of these cores have strict latency guarantees or security boundaries that are much harder to achieve when you are trying to share silicon with the general-purpose code running on the machine, and since the cores are so small, it does not meaningfully save silicon to use them.
OSes are generally terrible at providing strict latency guarantees. If you will crash the CPU if you don't do [X] every 10 microseconds, you should not be telling the operating system to do [X]. This is the case of audio systems, radio systems, PCIe, and almost every other complicated I/O function. These could all be done with hardware state machines, but it is better (cheaper) to use a small dedicated CPU instead.
When I refer to security boundaries, a lot of people think that "security through obscurity" is the idea. It is not. It is more similar to closing all the ports on a server and applying "API security" ideas to hardware. It is a lot easier to secure a function that has a simple queue interface to the main cores than to secure a function that shares the main cores - Spectre and Meltdown have shown us that it might be impossible. Yes, these secure enclaves are used for DRM crap and other nonsense, so I can see why you might not like the existence of that core, but even if you erase the DRM software from that core and make it work for the user completely, you will still want the boundary.
Not to mention that every modern motherboard these days has a board management controller, which cannot be part of the CPU, and controls power and resets.
From a hardware perspective, these SoCs and motherboards really need to be heterogeneous systems. It's up to the system software to work with that. Heterogeneous SoCs really have nothing to do with taking power away from the user. The user can program all of these cores, but we live with abstractions that make it very hard.
It's fine to have many smaller CPUs in the system, for all the purposes you state. The software that runs on them needs to be open, though, and something we can understand, fix, or even replace completely as needed. The operation of the components also needs to be documented so that it's possible to do that not just in principle but in practice.
We've added several small cores of our own to the systems that make up the Oxide rack, but critically we can control them with our own (open source) software stack. The large scale host CPUs where hypervisor workloads live can communicate with those smaller cores (service processor, root of trust, etc) in constrained ways across security and responsibility based boundaries.
It helps a lot that you are working with server-style computing, where you can realistically do this.
Things like audio and radio functions (eg bluetooth and wifi) have algorithms that are very proprietary and often patent-encumbered. The hardware architectures for radios are also similarly weird and proprietary. That kind of thing would result in a binary blob (at best) in a fully-open-source environment.
Hot take: I don't mind if Dolby or a wifi chipset vendor hides their software from me as long as it has very strict definitions for its role and a very narrow I/O interface.
There is another area that computer scientists are not studying much, namely new architectures for computers. Is von Neumann the best and only option? It seems so-called computer scientists are not really studying computers at all; they are studying applications. The fundamental science of computers is the hardware architecture and the operating system, and computer science almost completely ignores fundamental basic research in both of them.
It is also a worry that the hidden code in the SoC's subsystems might be hidden for a reason, namely to give control of computers to someone other than the user and OS. That seems a perfect way to compromise all computers while giving an illusion of security. That's why the Intel Management Engine has been controversial (TPMs also), but in truth there are many processors on SoCs that each have enormous security implications that the OS does not control. This has been an issue for decades. I remember people running code on the floppy drive MCU for Amigas, so this has been known about for a long time. Cell phones are intentionally designed so the OS has no control over basic cellular radio functionality. There is a totally separate processor with its own non-public firmware controlling the radio. Whoever writes the code for these subsystems has enormous power with very little oversight.
How different is this really from the fact that hardware designs themselves are proprietary? You can do computing without (what we commonly call) software, and much hardware does that - so even if 100% of the software running on your system were open source, that doesn't mean that your system is not compromised; you can very easily build a hardware key logger, and have it send the password over a cable (sending it over the internet through pure hardware would be quite hard, but modulating a signal shouldn't be). Note that even if the hardware design were open, you would still have a hell of a problem trying to check whether an IC actually matches the open design.
The only solution to these problems is actually regulation and trust. There is just no realistic way to actually check that your system is not doing some of the things you don't want it to do. Instead, you have to go to the source and make sure you can trust the vendors you buy from, and in order to do so, they themselves have to have hiring and shipment etc practices that allow this type of trust.
Sounds kind of horrible. PCs have a level of compatibility that is basically unheard of anywhere in tech.
Through whatever historical accident, the IBM PC is... pretty amazing. Things are standardized. Stuff just works. We can choose our OS. We don't have a ton of fragmentation. Distros aren't niche community things developed for specific hardware.
If you can make a single OS image that runs on any device of any manufacturer, the way that one Linux distro runs on almost any PC with any combination of parts, great.
But removing layers of abstraction seems like it could easily lead to incompatibility. I'd much rather have proprietary blobs everywhere than incompatibility.
There is something to be said for compatibility, but as the speaker points out, that's great if you want to install #CrazyLinuxDistro on some old machine and have fun.
However, if you want to run a whole cloud or have a significant amount of server hardware, you might not want to have 5 other cores with random OSes doing random things, eating errors, and so on. The same goes if you want to make secure devices like iPhones, Chromebooks and so on. I would like my standard Linux laptop to be like that too.
Also we could have both. If AMD/Intel documented these low-level systems then the needed patches could flow into the Linux kernel. You could likely boot LinuxBoot and still put UEFI on top if you really want to.
And if the system were that open you could still define layers of abstraction on top for situations where you want that compatibility. At the moment compatibility is forced on us with a really thicc layer of abstractions that do some useful things, but also do many not-so-useful things.
In theory it should work, but I'm still suspicious. In general, thicc compatibility layers seem to work in practice, and everything else never really seems to reach IBM PC levels of compatibility; fragmentation seems to happen fast if there's no consensus driving everything to follow some One True Standard.
We'd probably wind up with 5 UEFI alikes, 3 distros that don't use any of them, and there would probably be subtle compatibility issues.
Eventually the consensus would evolve to some new standard, the way systemd has, and then everyone would complain about that being too heavy and how they don't want a monoculture.
Or, worse, we could wind up in the same place ARM is, for a decade.
There's no technical reason for it, but there also doesn't seem to be a very strong pressure to standardize.
Consumers aren't exactly going to unionize and engineers love to fragment, so to me a standard with that level of popularity is something special and should only be messed with if you have a real clear plan to preserve compatibility.
One of the things that bugged me about this presentation (and, to a similar extent, with Oxide as a whole) is the assertion that the only way we can have safe and trustworthy systems is if each and every component is trustworthy. This is, obviously, a true statement—if we manage to wrangle every component of a system and put it under the control of safe and secure software, we have a really secure system. I don't, however, believe that this is the only way (or, in fact, even the most realistic). Rather, I think we will (for better or for worse) go straight in the opposite direction and end up treating the entire system as an autonomous sea of untrustworthy cores. SGX, for all of its flaws, I think got the nearest to this because it bootstrapped trusted compute in an environment where it couldn't even trust DRAM to not change underneath it. Assuming that every other system agent is hostile lets you use random proprietary garbage without needing to fully control every single core and microcontroller on the platform. It is truly a pain in the ass to program this way but it is, fundamentally, something we can do.
This, of course, isn't to say I wouldn't love to have a system where we can run our own open implementation on every bit of the platform, but rather that I don't believe we ever will. Oxide has made it pretty far, no doubt, and it's an incredible feat, but as Bryan mentioned, their staff is literally reverse engineering hardware and finding completely undocumented cores in the things they're putting in charge of their platform. Hell, most companies I've been at hardly even know the full capabilities of their shipped silicon (did we slap a chicken bit on that? did we fuse off that performance analysis module for production runs? did we fully remove that one feature that didn't verify before our tapeout deadline? did we backport that one bug fix to all relevant generations? and on and on), and so it's many times not even a case of companies being overprotective but rather of people not being able to even reason about these complex systems.
Both Bryan and Roscoe raise the question of who is at the helm of the ship and each find a different monster steering. The truth is that nobody is actually in control on these SoCs because nobody has the last word or some distinct power that cannot be compromised with some other random power. SoCs are not hierarchical; they're ridiculously complex systems of federated power, and we really need to treat them as such.
You still need at least one trusted core/chip, you need trusted comms to it (eg. it needs an embedded, or securely programmable secret/privkey, and it needs to be able to run some kind of crypto, key exchange), and then you are still left with wondering what the other cores/components are doing. Who is leaking/sniffing/corrupting data, when, and how...
So what we need is compartmentalization and a blessed central core ... and that's hierarchy anyway.
Yes, of course 99.99..% it's a forgotten fuse or whatever legacy junk. But apparently some folks believe there's a business model to be built on that last 0.00.. :)
I expected to like this talk much more than I actually did.
Basically what they did is do a reverse engineering and replay attack on a boot sequence that is already so convoluted that the company that built it didn't think that would be possible.
Good for them, but all this work will be wasted if the next hardware version comes around and needs a different boot sequence. They basically nailed themselves to a specific version of a specific hardware platform that AMD is replacing as we speak (the next iterations are called Genoa and Bergamo and are expected this year and next, respectively).
I was hoping this would feel like a breath of freedom, but it really feels like a few rebellious nerds trying to prove something, but not actually proving it. The talk feels to me like they actually proved the opposite. We are already too far down the path. You will never be able to, say, boot Windows on this hardware.
Can you boot a stock Linux on this hardware, or will you be dependent on patches from them?
As a Linux user I love the idea. As a PC user I never asked for UEFI or the Management Engine. I should be all for this. But somehow I'm not.
It's easy and sensible to see why you might be against this. The simple reality is that the BIOS did not prevent you from running anything you wanted on your machine. Recent developments are usually there to restrict the user, so any change would very likely be bad. Extrapolated experience does not need to be true, but it very well can be.
UEFI secure boot is an example. Yes, it can increase security (although I think the threat is specific or outdated), but it can well be used to limit the user practically. And if such a mechanism is established, there will be a class system of trusted and untrusted devices. This is not a development that is hard to predict. So UEFI already failed to a large degree, at least regarding the openness of systems. It is no accident that some companies push these developments enthusiastically. It is not for user security, it is simply for market dominance.
You expected a startup to solve all the problems AMD/Intel have created? That seems like expectations getting out of hand.
As he says in the video, what is important is documentation. Their code will serve as open documentation for a lot of those details.
Sure with the next version there will be some changes, but likely much of it will still be the same. And when they upgrade their product they will have to integrate those changes.
For 'stock Linux' to boot like that it would need to add those same low-level drivers, and I don't know if there is a reason the phased approach shouldn't work on Linux.
The hope is for many companies to do this and demand this documentation so that with each new version these things will quickly find their way into the open source software. This is now happening with coreboot where multiple companies collaborate on new CPU generations to get it in before the hardware is even released.
But with firmware there is never this amazing solution anybody can do unless the manufacturer simply does it. It's sad but it's the reality.
> Basically what they did is do a reverse engineering and replay attack on a boot sequence that is already so convoluted that the company that built it didn't think that would be possible.
That is not at all what was done here, and my apologies if I implied that it is. Indeed, quite the opposite was done: we determined how it actually worked and what the part actually needed -- discarding much of the needless gunk and repeated initialization. So it's not a "boot sequence" per se, it is initialization of various on-die components. Is this AMD-specific? Yes. Is it likely to change in Genoa, Bergamo and beyond? To a degree, yes -- but in broad strokes, unlikely. And because we have the OS (and importantly, its tooling) available where this enablement is taking place, we believe that we will be able to make the necessary changes for Genoa and beyond relatively faster than the extant approach of waiting for a proprietary BIOS.
Hi Bryan, thanks for the talk. I think we are saying the same thing just from a different point of view.
Your approach is very laudable and I have in fact been fanboying it for a while. I wish you the best of luck and profits for your company. I sure hope this approach is more sustainable than my gut feeling told me after viewing your talk.
One question, if I may: how sure are you that AMD won't sue you at some point over this? What you're doing ought to be in their best interest, but that has rarely stopped legal departments in the past.
Where you see a "this is an awesome opportunity for a company, I'll found one", I see a "would I really want to base my business on something that AMD may at any time decide voids the warranty"?
We have worked very closely with AMD, and they have broadly been very supportive of what we're doing here. As for the more general point of the peril of tightly integrating with partners (AMD is not the only one that we have gone very deep with): yes, it's a risk. But we have consciously decided that we would rather make deliberate decisions and deeply integrate than build a system composed of the lowest common denominators. Relationships are always a gamble (which is why it's important to enter them carefully!) but our belief is that the upside of a tight partnership more than compensates for the risk.
I was somewhat disappointed that there was only one "need" for holistic systems presented: the fact that correctable DIMM errors were getting eaten by underlying firmware. While I don't necessarily disagree with the speaker's premise, I'd have at least liked to see more than one example of why this is important - I can't say that I've walked away with a good idea of why that is. The fact that correctable versus uncorrectable DIMM errors were differentiable tells me that there already _is_ an interface that allows propagating both up to the OS, and that the firmware of the system just wasn't fully implementing it.
Dealing with existing PC compatible servers is just a constant onslaught of tedium. Some of it is as disruptive as the DIMM stuff Bryan talked about. Some of it is just the stuff people have seemingly become desensitised to. Some of the things I've had to deal with in the last few years:
* several different servers where you can set the boot order through the Redfish API, except that it only works about 60% of the time and there is no way to tell from the response if it took or not. Just have to keep doing it and rebooting until it hopefully eventually works!
* PXE boot support that is incredibly slow, so for any reasonable payload size you need to chainload iPXE every time
* servers with oddball Broadcom NICs where chainloading iPXE doesn't work at all
* servers that sit at the BIOS screen for twenty minutes unless you pull out all the U.2 NVMe devices, then they boot OK. Maybe a firmware update will fix it!
* firmware updates that don't install properly through the BMC
* firmware updates that don't _show_ they're installed properly until the BIOS boots completely and can report the new version through apparently some kind of HTTP request on an internal USB NIC to the BMC, even though the BMC has control of the SPI flash and thus is lying to you until that reboot occurs
* BMCs that just stop working until you remove power at the wall from the whole system
* IPMI serial over LAN redirection that drops about 4% of output characters but not in a predictable way, so copying and pasting, say, a serial number has to be done several times to be sure you got the whole thing
* an interrupt controller in a HP system that doesn't emulate fixed interrupts correctly, and so eventually after some random number of millions of interrupts the lines are just stuck on until you power cycle the system
That's just stuff that comes to mind at the moment. There's literally no way to compose a reliable automated production system on this ridiculous tower of packing peanuts. In contrast, in the lab, I can pretty much just ask an Oxide machine to replace the contents of the SPI ROM and the M.2 storage device and power cycle it and it does what it's told. There are comparatively few moving parts and we have the source to almost all of them.
There was more mentioned. The BMCs and MEs being proprietary, all-powerful, and insecure. The DIMM thing applies to other devices too, like SSD wear-leveling controllers. Then there's the whole trusted boot Pluton/TPMs thing: where you have to "measure" all these proprietary firmware blobs and bless them as trusted knowing very little about them, often nothing really other than they were there in option ROMs before you even booted the thing for the first time. The list is not insignificant.
Really good talk by Bryan Cantrill (co-founder of Oxide Computer Company) on the problems with closed source firmware, the need to move away from BIOS and EFI, and how they booted their own custom x86 AMD board, all with open source firmware and without using AMD's firmware.
I have booted a few SoCs without a BIOS or anything of the sort before (nothing nearly as big as an AMD Milan chip). Doing this with a huge, fast chip is really impressive from the folks at Oxide. DDR5 DRAM training is also a ridiculously complicated and touchy exercise, as is some of the PCIe link training that (according to the talk) Linux/Unix handles.
Apparently the DRAM training is one piece of AMD firmware I believe they re-used, though I had trouble following that part. I think they said that, and that they only didn't do it themselves because they didn't need to, since they had to pick and choose their battles.
Edit: Yes, I believe he says they used the AMD firmware for the PSP (Platform Security Processor).
Edit2: This post may actually be incorrect. Please go watch the talk. I'm not sure anymore.
No, you're correct: the PSP does DIMM training. Also, note that this is AMD Milan, so it's DDR4, not DDR5 -- DDR5 is still forthcoming from both AMD and Intel.
DDR4 link training is a bit easier to comprehend than DDR5, but still a headache. I think almost every high-performance SoC uses an auxiliary processing core for link training at this point (including large FPGAs).
I have been waiting for someone to find a security vulnerability in one of these cores.
One part I couldn't follow in your talk is which software is Oxide-designed and which is vendor-supplied. Have you really stripped out every piece of vendor software from the product? Or are there still some parts that are vendor-supplied black boxes?
The question after your talk implied that DIMM training was still being done by non-Oxide software.
On AMD, DIMM training is done by the PSP. And you have to run the PSP to run the SoC, so we have no alternative there.
More generally, we have implemented and opened everything we can; there still remain opaque bits like the PSP, as well as some smaller bits scattered through the machine (e.g., SSD firmware, MCU boot ROMs, VR firmware, etc.). We have endeavored to make as open a system as possible within the constraint of not making our own silicon. And while we haven't talked about it publicly, we will also open our schematics when we ship the rack, allowing everyone to see every component in the BOM. While we do have some (necessary) proprietary blobs in the system, we want to at least be transparent about where they are!
What's the "VR" in "VR firmware" stand for? Unless Oxide has invented a revolutionary VR-based user interface for server hardware, I expect I misread that line. ;)
Oh, sorry: "VR" is a "voltage regulator" in this context. You can see our drivers for these parts in Hubris, our all-Rust system that runs on our service processor.[0] All of our drivers are open, but the parts themselves do contain (small) proprietary firmware blobs.
Why not write your own firmware for these? They may be a lot more sophisticated than I am thinking, but voltage regulators usually have a datasheet/manual with a set of control registers, and you likely want to set those registers based on the physical hardware you have.
The regulators are actually quite sophisticated and have many undocumented registers that set how things like the communications with the processor work, nonlinear control algorithms, etc.
Very interesting. I would assume that parameters for the control algorithms are actually a few of the things you want to set for yourself, since that lets you optimize system stability under load switches. Your board might also have more or less inductance or capacitance than other motherboards, and this can affect the stability and performance of the control loop a lot. It's a shame they don't give you the documentation about those control systems to figure out how to set those parameters for yourself.
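For a sense of what the documented part of these parts usually looks like: it's typically the standard PMBus command set over I2C/SMBus, while the interesting loop-tuning and processor-interface registers sit in the manufacturer-specific space beyond it. A few of the standard command codes (from the PMBus spec), just to give a flavour:

    /* Standard PMBus command codes (defined by the PMBus spec); the
     * undocumented loop-tuning and telemetry-interface registers live
     * in the manufacturer-specific range beyond these. */
    enum pmbus_cmd {
        PMBUS_OPERATION    = 0x01,  /* enable/disable the rail */
        PMBUS_VOUT_MODE    = 0x20,  /* number format for voltage values */
        PMBUS_VOUT_COMMAND = 0x21,  /* set the output voltage */
        PMBUS_STATUS_WORD  = 0x79,  /* fault summary */
        PMBUS_READ_VOUT    = 0x8B,  /* telemetry: output voltage */
        PMBUS_READ_IOUT    = 0x8C,  /* telemetry: output current */
    };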
My very limited understanding of DIMM training is that it's about the digital system learning and setting the precise analog timing settings required to talk to the DIMM. Every memory cell has tiny manufacturing differences and so these need to be learned on the fly at computer boot and they change over time with use so need to be re-determined every boot.
Timing adjustment per x number of data bits has been required since DDR3, but DDR4 also has internal reference voltage calibration for DQ bits (VREF_DQ). This voltage sets the threshold by which the IO cell determines if a voltage represents a logic high or low. This VREF_DQ value is calibrated per x number of bits, in addition to adjusting the timing to try to find the best place to sample the signal.
This[1] page does a decent job of going into the initialization procedures of DDR4, and why the various steps are needed. Really quite fascinating and complex, due to the high speed.
The essence is that the signals are so high-speed, i.e. each bit takes a very short time, that the physical distances between DIMM modules and DRAM modules on a DIMM start to matter and have to be compensated for, so that all relevant signals arrive at the same time at a given DRAM module.
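As a toy illustration of the "best place to sample" step (the real training runs in PHY/PSP firmware and also sweeps VREF_DQ, handles per-lane deskew, and so on), the core of it is just an eye search over pass/fail results from reading back a known pattern:

    #include <stdbool.h>
    #include <stddef.h>

    /* Illustrative only: given pass/fail results from reading a known
     * pattern at each delay tap, pick the centre of the widest passing
     * window (the "eye"). */
    int center_of_widest_eye(const bool pass[], size_t ntaps)
    {
        size_t best_start = 0, best_len = 0, run_start = 0, run_len = 0;

        for (size_t i = 0; i < ntaps; i++) {
            if (pass[i]) {
                if (run_len == 0)
                    run_start = i;
                run_len++;
                if (run_len > best_len) {
                    best_len = run_len;
                    best_start = run_start;
                }
            } else {
                run_len = 0;
            }
        }
        if (best_len == 0)
            return -1;  /* no working setting found */
        return (int)(best_start + best_len / 2);
    }

DDR4 repeats that kind of search per group of data bits and per VREF_DQ setting, which is where the time and complexity go.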
I've worked on this before. We had explicit help from AMD - their documentation is better than Intel's for this stuff, but still.
if it weren't for linux's dependence on the bios pci and memory probes, you could really cut out all that crap. you need to program the memory controllers and bring in the kernel from storage.
oh no, my recollection was that at least at the time, linux used the acpi pci traversal to start driver discovery instead of doing its own. obviously after that it assumes complete control of the device tree.
I would really like to know if my x86 'linux PC', is actually running another 'hypervisor' OS that runs Linux - it seems like a recipe for security vulnerabilities.
If so, there are a lot of questions:
- what OS, and is it up to date?
- can this OS be communicated with from the network?
- on which chips does it run?
- can it be re-flashed / upgraded / replaced?
- can it interrupt/schedule my normal OS?
- how much CPU/power does it use?
I would certainly rather trust an Oxide-supplied (Rust) bare-metal open-source low-level OS to host Linux VMs on my dev machine than, say, a totally opaque binary blob that the US government has forbidden xyz company from talking about.. just to speculate wildly.
I also think Oxide has a wider market than just the server space - e.g. one has to do all sorts of shenanigans to get a core freed up so that you can run timing/latency-sensitive apps without getting interrupted by Linux threads doing noisy housekeeping on each core.
> I would really like to know if my x86 'linux PC', is actually running another 'hypervisor' OS that runs Linux - it seems like a recipe for security vulnerabilities.
There's not so much a "hypervisor" OS, but instead a congealed set of many different OSes, some realtime OSes, some possibly old Linux variants, some possibly closed source proprietary one-off OSes. I suggest taking a look at the talk Bryan mentioned: https://www.youtube.com/watch?v=36myc8wQhLo
It's kinda difficult to understand what the key differences are between their approach and what coreboot does with a Linux payload, which afaik does not use any BIOS/UEFI layer in between and also keeps the coreboot part pretty minimal => boot as early as possible into Linux
The difference is that coreboot comes up and runs a payload, something like LinuxBoot. LinuxBoot is then responsible to find the actual OS that you want to boot, and uses kexec to boot that OS. LinuxBoot still has to pass hardware information to the HostOS and then hand over control.
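At the syscall level, that LinuxBoot-style handoff is roughly the following (a sketch: the paths and command line are placeholders, and error handling is omitted). The next kernel is staged with kexec_file_load(2) and then jumped into with the KEXEC reboot command, with no firmware involved in between:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/reboot.h>

    int main(void)
    {
        const char *cmdline = "console=ttyS0 root=/dev/nvme0n1p2";
        int kernel_fd = open("/target/vmlinuz", O_RDONLY);
        int initrd_fd = open("/target/initrd.img", O_RDONLY);

        /* Stage the target kernel and initrd. */
        syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                strlen(cmdline) + 1, cmdline, 0UL);

        /* Equivalent of `kexec -e`: jump straight into the staged kernel. */
        syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
                LINUX_REBOOT_CMD_KEXEC, NULL);
        return 0;
    }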
In Oxide this doesn't happen; the Pico thingy boots into their OS and that's it. There is no in-between step. The OS then simply loads the additional things it needs. It never hands off control to another kernel.
I see the sibling comment, so here's my two cents -
I wonder if there's some confusion because LinuxBoot is being talked about relative to Oxide. LinuxBoot isn't trying to do that.
It was originally a project called "LinuxBIOS" at Los Alamos National Labs to get their supercomputer to boot up in a reasonable amount of time. Once the team started down the path of "rip out the legacy BIOS" they discovered that a lot of things become simpler.
LinuxBoot was written with the goal of booting up the machine to GRUB, like this:
(CPU Reset) > LinuxBIOS > GRUB > Linux
See the description of this video (though the video is worth the watch): https://vimeo.com/724454408 "Why Linux? Because firmware always evolves to become an operating system. Rather than wait for evolution to take its course, LANL decided to save some time and use Linux as the BIOS: hence LinuxBIOS."
And the answer to your question: "could you start booting up userland directly?" I think if the LinuxBoot authors were around they would mention something about how constrained the on-board flash chip is (SPI Flash). There's room for LinuxBoot and maybe GRUB, but there is not room for all the drivers needed to get to userland.
Someone from Oxide (maybe Bryan) talked a bunch about how the philosophy of Oxide differs from Coreboot somewhere. I remember either reading about it or hearing it, but I can't for the life of me remember where I heard/read it.
As others pointed out it's (mis)quoting Shakespeare's Julius Caesar. It is worth mentioning that I hope one day we (HN?) will be quoting code - that some portions of code (maybe elegant implementations of encryption or quadratic voting) will become canon just as Shakespeare is taught in schools.
It's always been a worry that if more people can quote Shakespeare than can do long division or understand a car engine, then we will have "uninformed" decisions.
I am dubious about that - I think if elections are meaningful, the populace takes them seriously and self-informs to a level they are happy with (!). But anyway, I like the idea that some of the Linux kernel may be in people's heads enough to pop up in middle age.
It should be illegal to hide CPU-facing hardware programming information, such as registers and functions, behind an NDA. Even DRAM training stuff. Let the open source community do it all from the ground up.
If the hardware is really a smaller CPU on the other side of a bus or HCI, and what's exposed to the CPU is really a remote-RAM-loading-and-communication interface, that's fine. That's not host CPU firmware, that's device CPU firmware and while that sucks for it to be under NDA/undocumented, at least it's mostly in its own little world. The registers/process needed to setup and talk to that remote embedded CPU should be fully documented and available to anyone.
ACPI is a travesty for enabling this hiding of specific hardware interfaces, often tied to things you need for your laptop to be useful, like battery controllers and fans. SMM is a travesty for giving it a place to live.
Very unclear what the business case for this actually is.
I watched all the talk, and it seems to be centered around the fact that opaque, vendor-specific blobs that run on hidden cores are a security weakness.
I agree with that, but to be fair, what segment of the market actually cares enough about this to pay Oxide money?
There may be a business case in there, but the talk certainly doesn't explain what that is.
From my own perspective, the business case is reliability. We can take responsibility for the entire rack. No “actually that bug is in code from our vendor, we’ll file a ticket upstream” and then maybe someday it’ll get fixed and maybe someday eventually roll out.
If (let’s be real, when: nothing is perfect…) problems in our system are found, we can fix them.
There’s other reasons too. But imho that’s the big one.
It’s simple: Everyone should use Open Firmware. It’s an IEEE standard, it works well, and device trees are already widely used. Just adopt the FORTH monitor too and embrace standardization.
A lot of comments here focusing on whether this is a good thing or not, but I'm worried about a more fundamental issue: how is it supposed to happen in the first place?
As I understand it, the proposal is that every bit of software running on a device that is currently considered "firmware" ought to be in the OS instead. What does the story of how we get from here to there look like?
1. We convince manufacturers that it would be a great idea to release thorough documentation for all of their hardware products, and that way open source kernel developers can write their own firmware and merge it into open source kernels. Bryan gestures in this direction several times throughout the talk, claiming that this would be good for the hardware manufacturers because it would enable us to buy more of their products. But ... no one's really convinced this is going to work, right? The hardware companies don't want to release documentation for the exact same reason they don't want to open source their hardware; it's a barrier (however small) against leaking trade secrets.
2. We focus our efforts on small companies like Oxide building integrated, top-to-bottom systems designed to run without firmware. This is great because it enables BIOS-less system design from the ground up and can be every bit as open source as we want it to be. But this means waiting and hoping that a tiny number of companies manage to reverse engineer the chips (like the AMD processors discussed in the presentation), which probably means running years behind and an extremely limited selection of hardware. This is not a solution that is going to deliver my next laptop, let alone my grandma's next laptop.
3. We convince manufacturers that EFI is a broken, sucky, insecure mess, and let them help lead a process designed to replace it. This could maybe happen (we did get rid of BIOS in favor of EFI, after all), but what is the end result likely to look like? It looks like binary blobs running unknown code necessarily tainting nearly every Linux system under the sun. And that's if they don't just decide to integrate with Microsoft and ignore Linux completely.
That's the thing about isolation. It has problems, as this talk discusses, but it also means that (putting aside a great number of bugs) an OS like Linux can exist without needing to reverse engineer firmware for the enormous amount of hardware it runs on. It's great to be able to run an OS with few-to-no blobs active, e.g. with open source Wi-Fi and graphics drivers. Closed firmware is in theory a security nightmare, but in practice it's not that often that anyone managed to exploit these vulnerabilities. Can you imagine having to run closed source Wi-Fi firmware in the Linux Ring 0?
My own personal war against UEFI is a losing battle (already can't use newest Intel generations), and I'd love to see this become a new standard. It might make OSes slightly more complicated, but the benefits of everything being open source and well-understood would more than make up for it.
Would every user having to create their own boot image for the specific device combination they own (processor, motherboard, RAM, disk, GPU, power source etc) really be worth it? I think if this became the norm, you would see Linux adoption on this brave new hardware drop even lower than it is today.
This talk just made me realize how little I understand about what the firmware is really doing. I had previously understood the BIOS to be a small piece of firmware that runs the bootloader at startup and maybe some other coordination between other hardware devices and the CPU. Then CPU takes over and the firmware takes a backseat. The speaker mentioned a whole can of firmware worms that I had no idea even existed.
I’m interested to see the progress of open source firmware now.
That's pretty much where I'm at - my understanding of most of this hasn't moved since CP/M.
CPU boots, PC sets to zero, starts reading instructions from ROM, optionally paging out the ROM if we really need to recover that memory window. I get that x86 is a bit messier because you have optional bioses that get run, and then you read and run the first page of the storage device, but .. man have they made life difficult since I last paid attention.
What would your criteria be for a holistic yet non-hacked-together boot process? How do you decide what layers/abstractions are needed and what are not?
I think holistic implies some principles guiding design decisions, drawing on research areas like formal methods, programming language theory, operating systems, hardware, etc., and unifying them, more along the lines of what the cited talk It's Time for Operating Systems to Rediscover Hardware covered.
Haha, that's pretty clever :) The intended point was more that the concept can be iterated upon. I think you'd be the last to claim Hubris ought to be the end-all-be-all of holistic operating systems.
the tooling in the demo was impressive, and I don't mean to be overly critical; I have gazed into the os/hardware abyss and I'm still paralyzed trying to figure out an approach. and maybe you will find what I'm looking for out there, or maybe we are looking for different things and that is fine too. it's certainly better than doing nothing.
Indeed, doing what they have done is impressive, and there are talented folks at play for sure. Not sure it's even vaguely commercially viable (heck, 0xide as a whole), but nerdy-interesting for sure. But I can't practically see why this is so much more useful than what society has built together and calls the x86 platform over the last few decades.