Having worked with devicetrees and hated it, I don't like this idea. I don't like the modern world of ARM and embedded where you have hardcoded firmware images for every single device. I like the x86 model, where you have a one size fits all boot disk which just loads different drivers. You don't port your OS to a new computer, you just write drivers for the new hardware parts. And your hardware describes itself to the OS.
Granted, UEFI is grotesquely complicated. Probably you could replace most of it with an EEPROM that has a mapping of device ID to memory address, and some instructions on how to speak to the embedded controller (basically, ACPI).
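To make that concrete, the kind of table I'm imagining could be as simple as this (purely hypothetical, nothing like a real standard; the IDs and addresses are made up):

    #include <stdint.h>

    /* Hypothetical: one entry per on-board peripheral, stored in a small
     * EEPROM that the OS reads at boot instead of parsing ACPI tables. */
    struct board_device {
        uint32_t device_id;   /* which IP block this is */
        uint64_t mmio_base;   /* physical base address of its registers */
        uint32_t mmio_len;    /* size of the register window */
        uint32_t irq;         /* interrupt line, if any */
    };

    /* Example contents for an imaginary board. */
    static const struct board_device board_devices[] = {
        { 0x1001, 0xFE200000, 0x1000, 33 },  /* UART */
        { 0x1002, 0xFE300000, 0x1000, 34 },  /* SD host controller */
    };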
Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world, so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
Historically, OpenFirmware provided exactly what you were asking for — self-describing devices, device driver methods hanging off the device nodes, and portable, architecture-independent bytecode-based option ROM drivers on expansion cards.
Devicetree is a pared down version of OF; they retained the tree containing key/value device metadata, and dropped all the good bits.
Something like the missing Forth interpreter would be invaluable on development boards as well as for diagnostics on embedded devices. Or simply for fun. I used to spend hours exploring PowerMacs and Forth programming, all thanks to OpenFirmware.
Microsoft won't sign EBC drivers, but even before Secure Boot made that difficult I think we'd found a total of 2 EBC UEFI drivers in the real world. It's honestly easier to just embed an x86 emulator.
How are they NOT the BEST bits? With Open Firmware you can build Open Source systems that boot on hardware that you’ve not only never seen but never will see. Open Firmware + ELF can still be our future if we embrace it.
I wonder if anyone could implement an OF binary for the Z-Machine.
Can OF read files? I mean, not just boot them into memory at $ADDRESS, but read/parse them.
I know of a Tetris, so it's doable.
100% agree. It’s soooo painful messing with device trees and building a boot image. I’ve got a rock pi n10 for which I’ve been struggling to compile a newer kernel image that works (the device vendor has only published Linux ISOs up to buster). The whole process is so weird and rough, working with u-boot and ARM Trusted firmware, device trees, etc. it’s like a 20 step process to go through. Installing Gentoo for the first time is 10x easier, in my opinion.
I too have found installing Gentoo to be way easier than building a working image for an ARM development board. I think a lot of that is due to Gentoo having better documentation and tools, though. Then again, part of the reason Gentoo can write good documentation is because every x86_64 computer works pretty much the same way!
>Having worked with devicetrees and hated it, I don't like this idea. I don't like the modern world of ARM and embedded where you have hardcoded firmware images for every single device. I like the x86 model, where you have a one size fits all boot disk which just loads different drivers. You don't port your OS to a new computer, you just write drivers for the new hardware parts. And your hardware describes itself to the OS.
Having worked with ARM too I have to disagree. I found device trees really flexible and easy to get going.
However, I worked with Rockchip hardware so I had pretty good documentation, lots of examples, and source code for everything (for an old Linux kernel, but still). This basically ensured I could do everything I wanted.
Of course, when a vendor doesn't provide documentation and example driver source this may not be so easy.
I guess the different experience is because I had to deal with Allwinner.
But even with Rockchip, different vendors have different levels of support. We had one potential vendor that had excellent documentation, delivered the source code to everything, and responded really quickly and helpfully - in English - to questions. But they were way too expensive. And we had another who just gave us some mystery Android + Linux images, GPL violation included. No schematics. I spent some time porting Linux from one board to the other - they were very similar - but never finished, and was constantly afraid of configuring a voltage regulator wrong and blowing it up, or something similar.
Compare to x86 where I can just order random parts from Newegg, screw them together and pop in a boot USB drive, and it will mostly work. I know that you get what you pay for, but at least I would expect the SBC vendors to do hardware enablement, and provide some kind of abstraction that I can run my OS on (or upstream their code).
Have you ever configured a generic MIPI LCD panel and shown a custom splash screen?
With EDK2, on x86?
Or configured and tuned some custom DDR that happens to already be soldered on the board? Or bootstrapped some parameters from an EEPROM or external microcontroller?
Compiling and customizing EDK2 for x86 (to boot Windows) was a nightmare and I never want to do it again. I'd rather "make aboot -j8" my entire life.
Please tell me you've found the secret formula to automagically add ANY display panel purchased somewhere in Asia, on x86. Otherwise I'll go back to praising the devicetree.
I’ve been struggling with Rockchip! Do you have any links to the docs you’ve used? The ones I have found for the rock pi n10 on radxa’s site are very incomplete.
Yes, but all my docs are for rk3566, pine's Quartz64-A and soquartz boards. Radxa uses the same chip in a few places, but I'm not sure if the board schematics will be of any use to you. If this is the chip you need info for, let me know by replying to this message and I'll do a pastebin listing the docs I have.
Everything I have came from 3 sources. There is a Linux BSP SDK file (36GB or so) and an Android SDK (75GB or so). Both have docs folders with lots of docs documenting various SDKs (for the GPU, NPU, camera, etc). Those files also contain driver source for old Linux kernels that can be adapted to newer ones with a bit of work. Both are linked on pine's wiki under soquartz. Just Google soquartz pine wiki.
>where you have a one size fits all boot disk which just loads different drivers
But that's exactly the point of device trees. You should be able to use a generic ARM kernel build, and it loads the appropriate drivers based on the dtb provided by system firmware.
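The driver side of that match is mundane: a driver just declares which compatible strings it handles, and the generic kernel binds it to whatever nodes the dtb describes. A stripped-down sketch of a Linux platform driver ("acme,foo-uart" is a made-up example; error handling and the actual device code omitted):

    #include <linux/module.h>
    #include <linux/of.h>
    #include <linux/platform_device.h>

    /* The dtb contains e.g. compatible = "acme,foo-uart"; the generic
     * kernel matches that string against this table and probes the driver. */
    static const struct of_device_id foo_uart_of_match[] = {
        { .compatible = "acme,foo-uart" },
        { /* sentinel */ }
    };
    MODULE_DEVICE_TABLE(of, foo_uart_of_match);

    static int foo_uart_probe(struct platform_device *pdev)
    {
        /* Register window, IRQ, clocks, etc. all come from the DT node,
         * not from code compiled for one specific board. */
        return 0;
    }

    static struct platform_driver foo_uart_driver = {
        .probe  = foo_uart_probe,
        .driver = {
            .name           = "foo-uart",
            .of_match_table = foo_uart_of_match,
        },
    };
    module_platform_driver(foo_uart_driver);

    MODULE_LICENSE("GPL");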
> Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world
What do you mean by that? Every ARM server has UEFI/ACPI, and it's now looking like every new PC using ARM will include it out of the box. The Google Pixel smartphones use UEFI. There are even UEFI images for Raspberry Pi 4. Sure, the embedded space is probably going to be holding out for a while longer, but seems disingenuous to say UEFI is going nowhere in the ARM world.
True, but most of today's SBCs and phones don't support UEFI. If you look at SBSA, it has server in the name.
You should be able to put vanilla Ubuntu on a SD card and pop it into a computer and get at least the basic features working. Instead, now you have the choice between opaque firmware images from the vendor (with an ancient kernel, and random additions to userspace), or building your own images with a lot of tinkering.
A happy medium might involve something like a BIOS that's just a vendor-provided library in an EEPROM but which somehow is guaranteed not to run anything in hidden cores, BMCs, or via interrupts that the OS does not set up.
Even then, it would be better still if that library were open source, so that the OS could carry its own artifacts of it. Not because open source is great, but so the OS can truly be in control. The point u/bcantrill makes about the OS not getting DRAM error information gives me cold sweats!
If the library is open source, then, yes, each boot loader and kernel will have to be built for the hardware it will boot on, at least until it gets far enough to get additional firmware/drivers from a boot image RAM disk or whatever.
I may be completely wrong, but I remember reading that Apple M1/M2 computers use a similar approach, with a minimal firmware in the boot image and the rest loaded with the OS.
> so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
From firsthand experience, I can say that at least installing Linux on a UEFI ARM system of the type that QEMU emulates was surprisingly straightforward, and even the ARM version of Windows seemed to detect the virtual hardware just fine.
...and I say this as someone who has worked on the PC platform for over 3 decades, and am a fan of the original BIOS. UEFI is a bloated mess, but it's better than nothing.
Most ARM hardware is cellphones, Raspberry Pis and the Mac M1, which certainly aren't that type.
But a lot of ARM hardware is that type. The keywords are SBSA / SBBR / SystemReady. If your hardware is SBBR compatible then Fedora and Ubuntu's ARM64 iso, and Windows ARM64, downloaded from their website, will at least boot fine (drivers are a different question as always).
DeviceTree is low-level enough that you can implement UEFI on top of it. There's a UEFI port for the Raspberry Pi 4 at https://rpi4-uefi.dev/ that produces an SBBR layer, allowing it to boot any off-the-shelf ARM64 SBBR distro.
I think it can be fairly said that the thesis of the talk is that UEFI is not, in fact, better than nothing -- and that in a literal sense, having nothing (along with a well-documented part!) is vastly preferable to UEFI.
Well-documented parts are few and far between in the ARM space. Anyway, if you have a really well-documented part, it will also be supported by lightweight solutions like coreboot (for x86) and U-Boot (for other archs). You're not going to be limited by UEFI either way.
The problem isn't with device trees. And actually, ACPI is just like DT really (but more complicated because it has "methods" in addition to just properties).
The difference is that you don't see it so much because the PC vendor has already written the ACPI tables.
The major difference on X86 is that it is (for historical reasons) much more standardised at the hardware level than on ARM (at least on embedded ARM, there is more standardisation in the ARM server space). It also makes more use of discoverable busses like USB and PCI so the parts that need to be described in ACPI are smaller than the parts that have to be described in DT on ARM (as much is either "implicitly standardised" or on a discoverable bus).
And that is because all PCs are basically the same whereas there is a huge variety in ARM SoC hardware and boards, mostly for understandable reasons (there are a huge variety of use cases and price points in embedded systems).
If you were to build an embedded x86 system not using the PC architecture (north bridge, south bridge, etc.) but using the processor in your own design, you would have exactly the same problem.
That said you don't have to have "hardcoded firmware images for every single device" in the embedded ARM world.
I build an in-house BSP for a product family of ~15 devices based on 3 different SoCs, and we have a single kernel and userspace (Debian-based) for all of them. All the DTs are on the boot partition and the bootloader selects the right one based on hardware information (typically a small EEPROM).
The only part that has to be different is the bootloader itself (and then not different for every device but for each SoC) - so we have 3 bootloaders all supplied as part of the BSP and when building the install media we say which one to write.
Even that, in theory, could be avoided, at least on SoCs with the same ISA by improving u-boot a bit to make stuff like clocks configurable from data (the u-boot device tree) like the kernel already does. This may one day be possible once u-boot completely adopts the driver model.
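For what it's worth, the DT selection logic itself is trivial. A minimal C sketch (the board IDs and .dtb names are invented; in practice it's a few lines of U-Boot board code or scripting):

    #include <stdint.h>

    /* Sketch only: the bootloader reads the ID from the small EEPROM,
     * picks the matching blob from the boot partition, and passes its
     * address to the kernel. */
    struct board_id {
        uint16_t model;
        uint16_t revision;
    };

    const char *dtb_for_board(const struct board_id *id)
    {
        switch (id->model) {
        case 0x0001: return "acme-gadget-a.dtb";
        case 0x0002: return "acme-gadget-b.dtb";
        default:     return "acme-fallback.dtb";
        }
    }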
You realize that you're just moving the ball around on who provides what, right? Someone has to write that software that "describes itself to the OS" and rather than having the hardware describe itself to the OS, why can't the OS itself determine what hardware it has?
That "describes itself to the OS" part is actually often being done by a complete OS as well.
> Unfortunately, UEFI/ACPI or similar seems to be going nowhere in the ARM world, so it doesn't seem installing Windows/Linux is going to be as easy on ARM as on x86 anytime soon.
That's a really good thing. Bryan rants about ARM trying to go that way and thinks it's a terrible idea. I'm glad it's failing, assuming that's actually the case.
> Bryan rants about ARM trying to go that way and thinks it's a terrible idea. I'm glad it's failing, assuming that's actually the case.
I'd be curious why because the PC model seems to be way more user-friendly, competition-friendly, and long-term-reliable than the current ARM model. I wish I could just download an ISO of Android 13 and install it on almost any Android phone like I could Windows.
The PC model worked because a lot of people put a lot of blood sweat and tears into making it work and then embraced the sunk cost fallacy to a degree only previously seen in politics and religion.
Open Firmware—between the device tree system and the bytecode driver system—plus ELF provide a truly universal boot protocol for 32-bit and larger systems. And not only that, major and tractable implementations are now Open Source too, including things like the bytecode compilers.
It’s practically unconscionable that modern ARM and RISC-V don’t simply require Open Firmware boot support.
In RISC-V, the boot model is standardized. There's SBI (from years ago, where the open implementation opensbi is used by most boards) and UEFI (implemented not in any random way, but after SBI runs, and as per the spec published earlier this year).
All the pieces are in place for RISC-V to effectively dodge boot chaos. Way before the server/workstation RISC-V hardware hit the market.
In ARM, it's a mess. Because ARM took way too long to do way too little re: boot standardization.
You also have to look at ARM's largest customers. They influence development, and the most prominent ones aren't necessarily interested in ARM becoming a more open platform. Which it fundamentally isn't, just as x86 isn't.
It is wildly easier to get a mouse to tell the operating system first off that it is a mouse, than it is for the operating system to analyze the mouse and make that determination for itself.
Imagine scaling that up to GPUs! The device needs to describe itself because IT is unique.
In fact the mouse doesn't tell the operating system that it's a mouse; the operating system reads off what the device is as part of the USB enumeration handshake. USB is an example of a system done largely correctly, as compared to all the other processors embedded into an SoC.
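Concretely, what the host reads during enumeration is a fixed-format descriptor that the device hands over; a C rendering of the standard device descriptor layout looks like this (field names are from the USB spec; for a mouse, the HID class and boot protocol are actually reported in the interface descriptor, but the self-description mechanism is the same):

    #include <stdint.h>

    /* Standard USB device descriptor, read by the host during enumeration.
     * The device describes itself; the host never has to guess what it is. */
    struct usb_device_descriptor {
        uint8_t  bLength;            /* 18 */
        uint8_t  bDescriptorType;    /* 1 = DEVICE */
        uint16_t bcdUSB;             /* spec version, e.g. 0x0200 */
        uint8_t  bDeviceClass;       /* 0 = defer to the interface descriptors */
        uint8_t  bDeviceSubClass;
        uint8_t  bDeviceProtocol;
        uint8_t  bMaxPacketSize0;
        uint16_t idVendor;           /* who made it */
        uint16_t idProduct;          /* what it is */
        uint16_t bcdDevice;
        uint8_t  iManufacturer;      /* indices of string descriptors */
        uint8_t  iProduct;
        uint8_t  iSerialNumber;
        uint8_t  bNumConfigurations;
    } __attribute__((packed));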
For what it's worth, a holistic system allows the user to bring their own code only above some layer. For Unix systems of ye olden days, that would be your C source code that you compile. For Chrome OS, that's gonna be websites and extensions. For Oxide, it seems like that is going to be VM images.
But it's important to note that these days, there will always be somebody who wants to bring their code in at a lower level. And the immediate reaction of all vendors is "But why would you want to do that?". Why would you want to bring your own Unix flavor to our hardware? Why would you want to run Linux executables on Chrome OS? Why would you want to run a different hypervisor on Oxide racks?
Those people will exist. And layers like BIOS and UEFI allow system vendors to say "Eh, sure, we don't think that what you're doing is wise, but whatever, here's a standard way we'll allow you to use". Holistic systems don't have these standards. They are only usable as their creators intended. And not everyone will agree with their creators' intentions. But the work required to refit them for another purpose (one the hardware is perfectly capable of, but the software fights against) is usually too high, and you'll be better off using something that is worse but that left you an avenue to customize it for your purpose easily.
My main takeaway from this talk is that we need some standards for phase-based booting, as there are some very cool improvements that could be done with it. But I see no need for holistic systems as described in that talk, I just see a need for better, more transparent systems. And a system doesn't have to be holistic to be transparent, or good.
>But I see no need for holistic systems as described in that talk, I just see a need for better, more transparent systems. And a system doesn't have to be holistic to be transparent, or good.
my takeaway is that it's not the holistic systems themselves that are needed -- it's that when holistic systems aren't enforced that vendors will go mad with power.
IBM Z systems are arguably the most holistic systems currently available, everything down to the silicon is designed in a coherent way. And yet, it is all incredibly proprietary.
Holistic systems don't prevent vendors from going mad with power. Competition and pressure to document their products does.
But it's terrible power for them because now they have to maintain that firmware, and that's very expensive, so they don't do a good job of it at all, and we can pretend it's OK right up until it's not.
Legal liability for broken, unmaintained firmware would very quickly bring about the world that BMC wants.
Legal liability for the madness of alarming, confusing systems in the flight control room, when the airplane goes down with more than 200 lives lost at sea, works out to about half of an exec's $400K yearly income.
This was my feeling as well. PC BIOS and UEFI are far from ideal, but does it really make sense that, for every single ARM (or whatever) board, every single OS vendor has to write their own bringup code? Certainly some of it will be reusable across OSes, but it's still a huge amount of work. This would also disadvantage a hobbyist OS developer who wants to build a new OS, whether as a toy or for eventual educational or production use. Suddenly they have to read through pages and pages of low-level documentation for the board they have, and then once they get it working, it still won't work on any other board.
I just don't think it's realistic to expect every vendor to provide deep-enough documentation to make this work. Even the cooperative ones (of which I assume there are currently very few) would probably have trouble with it.
And even if you do have a bunch of cooperative vendors who are great at releasing hardware documentation, that still doesn't save every OS developer from the tedious, error-prone process of converting that documentation into low-level bringup code.
Isn't it so much better that a Linux developer, when confronted with a new board, need only craft a device tree description (or use one provided by the vendor), and then -- aside from any exotic peripherals that may not have drivers written yet -- the board will mostly work? It's at least likely to boot, even if it comes up in a not-fully-functional state.
Cantrill talks about an experience he had at Joyent, where he diagnosed that memory was failing on systems, but the errors were suppressed by the firmware. Having adequate documentation of the firmware doesn't help you in this kind of situation. Holistic systems would not lie about what the system is doing in this way.
Good firmware tells no lies. But it doesn't have to be built with a specific OS in mind to be good - on the contrary, I believe that building firmware with specific OS in mind is what got us into this whole bad firmware mess.
And yet, holistic systems can lie. They can lie horribly about their state, and you'd be none the wiser. Being "holistic", created with the intention of the whole, does not mean that the system cannot lie. If the whole is created with the intention of lying to you, then it will.
We don't need holistic systems. We need good systems. A system doesn't need to be holistic to be good, and a system doesn't need to be good to be holistic.
So what would you want to have happening in the case Cantrill was talking about? The aim is to have good systems, and there have been plenty of bad holistic systems in the history of computing. But by good, non-holistic systems, do you mean that we have a motherboard with well-documented firmware that the buyer of the hardware can't see into except according to the interfaces the vendor has provided? This seems to be potentially hiding the kind of information about a malfunctioning system that Cantrill gets so worked up about.
I do think that one of the prerequisites of (firm/soft)ware being good is being open source. For hardware to be good, I do believe it has to be well documented.
If we go by those two requirements, we can see how it works. We can fix things ourselves. And most importantly, anyone can create their own good firmware, without the need for the system to be designed in a holistic way.
Literary tangent: The title seems to be an allusion to Shakespeare. "I come to bury Caesar, not to praise him." Except, the entire purpose of Antony's funeral oration is to praise Caesar. His intent was to defend Caesar's legacy and shift public opinion against Caesar's assassins.
Those summary paragraphs feel like they were written in a deliberately obfuscated style; not exactly a good first impression.
> Rather than have one operating system that boots another that boots another, we have returned the operating system to its roots as the software that abstracts the hardware: we execute a single, holistic system from first instruction to running user-level application code.
...also known as single-purpose firmware for an embedded system. I'm in agreement with the other comment here that this is not a good idea. Standardised interfaces like what the BIOS provides let the OS not be tied to subtle differences in hardware.
It's possible that a single-purpose firmware is too few stages, and it's also possible that the current acpi->firmware->uefi->grub->initrd->linux is too many stages.
UEFI does have many of the features of a full operating system including shell, multiprocessing, networking and it really should be Linux in this position instead. LinuxBoot is another attempt at reducing layers here.
This is an amazing talk, and I think I have seen it on HN before. It lays out the problem of operating systems on modern computers well. I'm not sure that it offers the right solution by continuing to offer a system with the same set of abstraction layers as a solution to the problem of the "operating system" abstraction.
I think the folks at Oxide are probably closer to the right track by silo-busting the BIOS, OS, and hypervisor layers of a modern VM-hosting stack.
Edit: I should also add that this talk lays out a huge, gaping hole in the field of OS research, which might be its most important contribution.
But it's not an OS problem; the OS being written that way is only the symptom because it can't touch the real hardware.
Cantrill and Roscoe both pointed at the SoC vendors: if there's a new problem to solve, they add more proprietary, undocumented cores and enclaves with tightly held secret functions and OSes of their own, all of which are out of visibility of the OS. Some of them are even designed at odds with the user's interests, such as DRM goop.
This gets back to the war on general-purpose computing, as Cory Doctorow put it. The hardware is ceasing to work for the user's interest and is starting to work for everyone else in the stack against the user.
I have a background in digital circuits and computer architecture, and I completely disagree with you that these SoC components should be run by the traditional CPU cores. Most of these cores have strict latency guarantees or security boundaries that are much harder to achieve when you are trying to share silicon with the general-purpose code running on the machine, and since the cores are so small, it does not meaningfully save silicon to use them.
OSes are generally terrible at providing strict latency guarantees. If you will crash the CPU if you don't do [X] every 10 microseconds, you should not be telling the operating system to do [X]. This is the case of audio systems, radio systems, PCIe, and almost every other complicated I/O function. These could all be done with hardware state machines, but it is better (cheaper) to use a small dedicated CPU instead.
When I refer to security boundaries, a lot of people think that "security through obscurity" is the idea. It is not. It is more similar to closing all the ports on a server and applying "API security" ideas to hardware. It is a lot easier to secure a function that has a simple queue interface to the main cores than to secure a function that shares the main cores - Spectre and Meltdown have shown us that it might be impossible. Yes, these secure enclaves are used for DRM crap and other nonsense, so I can see why you might not like the existence of that core, but even if you erase the DRM software from that core and make it work for the user completely, you will still want the boundary.
Not to mention that every modern motherboard these days has a board management controller, which cannot be part of the CPU, and controls power and resets.
From a hardware perspective, these SoCs and motherboards really need to be heterogeneous systems. It's up to the system software to work with that. Heterogeneous SoCs really have nothing to do with taking power away from the user. The user can program all of these cores, but we live with abstractions that make it very hard.
It's fine to have many smaller CPUs in the system, for all the purposes you state. The software that runs on them needs to be open, though, and something we can understand, fix, or even replace completely as needed. The operation of the components also needs to be documented so that it's possible to do that not just in principle but in practice.
We've added several small cores of our own to the systems that make up the Oxide rack, but critically we can control them with our own (open source) software stack. The large scale host CPUs where hypervisor workloads live can communicate with those smaller cores (service processor, root of trust, etc) in constrained ways across security and responsibility based boundaries.
It helps a lot that you are working with server-style computing, where you can realistically do this.
Things like audio and radio functions (eg bluetooth and wifi) have algorithms that are very proprietary and often patent-encumbered. The hardware architectures for radios are also similarly weird and proprietary. That kind of thing would result in a binary blob (at best) in a fully-open-source environment.
Hot take: I don't mind if Dolby or a wifi chipset vendor hides their software from me as long as it has very strict definitions for its role and a very narrow I/O interface.
There is another area that computer scientists are not studying much, namely new architectures for computers. Is von Neumann the best and only option? It seems so-called computer scientists are not really studying computers at all; they are studying applications. The fundamental science of computers is the hardware architecture and the operating system, and computer science almost completely ignores fundamental basic research in both of them.
It is also a worry that the hidden code in the SoC's subsystems might be hidden for a reason, namely to give control of computers to someone other than the user and OS. That seems a perfect way to compromise all computers while giving an illusion of security. That's why the Intel Management Engine has been controversial (TPMs also), but in truth there are many processors on SoCs that each have enormous security implications that the OS does not control. This has been an issue for decades. I remember people running code on the floppy drive MCU for Amigas, so this has been known about for a long time. Cell phones are intentionally designed so the OS has no control over basic cellular radio functionality. There is a totally separate processor with its own non-public firmware controlling the radio. Whoever writes the code for these subsystems has enormous power with very little oversight.
How different is this really from the fact that hardware designs themselves are proprietary? You can do computing without (what we commonly call) software, and much hardware does that - so even if 100% of the software running on your system were open source, that doesn't mean that your system is not compromised; you can very easily build a hardware key logger, and have it send the password over a cable (sending it over the internet through pure hardware would be quite hard, but modulating a signal shouldn't be). Note that even if the hardware design were open, you would still have a hell of a problem trying to check whether an IC actually matches the open design.
The only solution to these problems is actually regulation and trust. There is just no realistic way to actually check that your system is not doing some of the things you don't want it to do. Instead, you have to go to the source and make sure you can trust the vendors you buy from, and in order to do so, they themselves have to have hiring and shipment etc practices that allow this type of trust.
Sounds kind of horrible. PCs have a level of compatibility that is basically unheard of anywhere in tech.
Through whatever historical accident, the IBM PC is... pretty amazing. Things are standardized. Stuff just works. We can choose our OS. We don't have a ton of fragmentation. Distros aren't niche community things developed for specific hardware.
If you can make a single OS image that runs on any device of any manufacturer, the way that one Linux distro runs on almost any PC with any combination of parts, great.
But removing layers of abstraction seems like it could easily lead to incompatibility. I'd much rather have proprietary blobs everywhere than incompatibility.
There is something to be said for compatibility, but as the speaker points out, that's great if you want to install #CrazyLinuxDistro on some old machine and have fun.
However, if you want to run a whole cloud or have a significant amount of server hardware, you might not want to have 5 other cores with random OSes doing random things, eating errors, and so on. The same goes if you want to make secure devices like iPhones, Chromebooks and so on. I would like my standard Linux laptop to be like that too.
Also we could have both. If AMD/Intel documented these low-level systems then the needed patches could flow into the Linux kernel. You could likely boot LinuxBoot and still put UEFI on top if you really want to.
And if the system were that open you could still define layers of abstraction on top for situations where you want that compatibility. At the moment compatibility is forced on us with a really thicc layer of abstractions that do some useful things, but also do many not-so-useful things.
In theory it should work, but I'm still suspicious. In general, thicc compatibility layers seem to work in practice, and everything else never really seems to reach IBM PC levels of compatibility; fragmentation seems to happen fast if there's no consensus driving everything to follow some One True Standard.
We'd probably wind up with 5 UEFI alikes, 3 distros that don't use any of them, and there would probably be subtle compatibility issues.
Eventually the consensus would evolve to some new standard, the way systemd has, and then everyone would complain about that being too heavy and how they don't want a monoculture.
Or, worse, we could wind up in the same place ARM is, for a decade.
There's no technical reason for it, but there also doesn't seem to be a very strong pressure to standardize.
Consumers aren't exactly going to unionize and engineers love to fragment, so to me a standard with that level of popularity is something special and should only be messed with if you have a real clear plan to preserve compatibility.
One of the things that bugged me about this presentation (and, to a similar extent, with Oxide as a whole) is the assertion that the only way we can have safe and trustworthy systems is if each and every component is trustworthy. This is, obviously, a true statement—if we manage to wrangle every component of a system and put it under the control of safe and secure software, we have a really secure system. I don't, however, believe that this is the only way (or, in fact, even the most realistic). Rather, I think we will (for better or for worse) go straight in the opposite direction and end up treating the entire system as an autonomous sea of untrustworthy cores. SGX, for all of its flaws, I think got the nearest to this because it bootstrapped trusted compute in an environment where it couldn't even trust DRAM to not change underneath it. Assuming that every other system agent is hostile lets you use random proprietary garbage without needing to fully control every single core and microcontroller on the platform. It is truly a pain in the ass to program this way but it is, fundamentally, something we can do.
This, of course, isn't to say I wouldn't love to have a system where we can run our own open implementation on every bit of the platform, but rather that I don't believe we ever will. Oxide has made it pretty far, no doubt, and it's an incredible feat, but as Bryan mentioned, their staff is literally reverse engineering hardware and finding completely undocumented cores in the things they're putting in charge of their platform. Hell, most companies I've been at hardly even know the full capabilities of their shipped silicon (did we slap a chicken bit on that? did we fuse off that performance analysis module for production runs? did we fully remove that one feature that didn't verify before our tapeout deadline? did we backport that one bug fix to all relevant generations? and on and on), and so it's many times not even a case of companies being overprotective but rather of people not being able to even reason about these complex systems.
Both Bryan and Roscoe raise the question of who is at the helm of the ship and each find a different monster steering. The truth is that nobody is actually in control on these SoCs because nobody has the last word or some distinct power that cannot be compromised with some other random power. SoCs are not hierarchical; they're ridiculously complex systems of federated power, and we really need to treat them as such.
You still need at least one trusted core/chip, you need trusted comms to it (eg. it needs an embedded, or securely programmable secret/privkey, and it needs to be able to run some kind of crypto, key exchange), and then you are still left with wondering what the other cores/components are doing. Who is leaking/sniffing/corrupting data, when, and how...
So what we need is compartmentalization and a blessed central core ... and that's hierarchy anyway.
Yes, of course 99.99..% it's a forgotten fuse or whatever legacy junk. But apparently some folks believe there's a business model to be built on that last 0.00.. :)
I expected to like this talk much more than I actually did.
Basically what they did is do a reverse engineering and replay attack on a boot sequence that is already so convoluted that the company that built it didn't think that would be possible.
Good for them, but all this work will be wasted if the next hardware version comes around and needs a different boot sequence. They basically nailed themselves to a specific version of a specific hardware platform that AMD is replacing as we speak (the next iterations are called Genoa and Bergamo and are expected this year and next, respectively).
I was hoping this would feel like a breath of freedom, but it really feels like a few rebellious nerds trying to prove something, but not actually proving it. The talk feels to me like they actually proved the opposite. We are already too far down the path. You will never be able to, say, boot Windows on this hardware.
Can you boot a stock Linux on this hardware, or will you be dependent on patches from them?
As a Linux user I love the idea. As a PC user I never asked for UEFI or the Management Engine. I should be all for this. But somehow I'm not.
It's easy and sensible to see why you might be against this. The simple reality is that the BIOS did not prevent you from running anything you wanted on your machine. Recent developments are usually there to restrict the user, so any change would very likely be bad. Extrapolated experience does not need to be true, but it very well can be.
UEFI secure boot is an example. Yes, it can increase security (although I think the threat is specific or outdated), but it can well be used to limit the user practically. And if such a mechanism is established, there will be a class system of trusted and untrusted devices. This is not a development that is hard to predict. So UEFI already failed to a large degree, at least regarding the openness of systems. It is no accident that some companies push these developments enthusiastically. It is not for user security, it is simply for market dominance.
You expected a startup to solve all the problems AMD/Intel have created? That seems like expectations getting out of hand.
As he says in the video, what is important is documentation. Their code will serve as open documentation for a lot of those details.
Sure with the next version there will be some changes, but likely much of it will still be the same. And when they upgrade their product they will have to integrate those changes.
For 'stock Linux' to boot like that it would need to add those same low-level drivers, and I don't know if there is a reason the phased approach shouldn't work on Linux.
The hope is for many companies to do this and demand this documentation so that with each new version these things will quickly find their way into the open source software. This is now happening with coreboot where multiple companies collaborate on new CPU generations to get it in before the hardware is even released.
But with firmware there is never this amazing solution anybody can do unless the manufacturer simply does it. It's sad but it's the reality.
> Basically what they did is do a reverse engineering and replay attack on a boot sequence that is already so convoluted that the company that built it didn't think that would be possible.
That is not at all what was done here, and my apologies if I implied that it is. Indeed, quite the opposite was done: we determined how it actually worked and what the part actually needed -- discarding much of the needless gunk and repeated initialization. So it's not a "boot sequence" per se, it is initialization of various on-die components. Is this AMD-specific? Yes. Is it likely to change in Genoa, Bergamo and beyond? To a degree, yes -- but in broad strokes, unlikely. And because we have the OS (and importantly, its tooling) available where this enablement is taking place, we believe that we will be able to make the necessary changes for Genoa and beyond relatively faster than the extant approach of waiting for a proprietary BIOS.
Hi Bryan, thanks for the talk. I think we are saying the same thing just from a different point of view.
Your approach is very laudable and I have in fact been fanboying it for a while. I wish you the best of luck and profits for your company. I sure hope this approach is more sustainable than my gut feeling told me after viewing your talk.
One question, if I may: how sure are you that AMD won't sue you at some point over this? What you're doing ought to be in their best interest, but that has rarely stopped legal departments in the past.
Where you see a "this is an awesome opportunity for a company, I'll found one", I see a "would I really want to base my business on something that AMD may at any time decide voids the warranty"?
We have worked very closely with AMD, and they have broadly been very supportive of what we're doing here. As for the more general point of the peril of tightly integrating with partners (AMD is not the only one that we have gone very deep with): yes, it's a risk. But we have consciously decided that we would rather make deliberate decisions and deeply integrate than build a system composed of the lowest common denominators. Relationships are always a gamble (which is why it's important to enter them carefully!) but our belief is that the upside of a tight partnership more than compensates for the risk.
I was somewhat disappointed that there was only one "need" for holistic systems presented: the fact that correctable DIMM errors were getting eaten by underlying firmware. While I don't necessarily disagree with the speaker's premise, I'd have at least liked to see more than one example of why this is important - I can't say that I've walked away with a good idea of why that is. The fact that correctable versus uncorrectable DIMM errors were differentiable tells me that there already _is_ an interface that allows propagating both up to the OS, and that the firmware of the system just wasn't fully implementing it.
Dealing with existing PC compatible servers is just a constant onslaught of tedium. Some of it is as disruptive as the DIMM stuff Bryan talked about. Some of it is just the stuff people have seemingly become desensitised to. Some of the things I've had to deal with in the last few years:
* several different servers where you can set the boot order through the Redfish API, except that it only works about 60% of the time and there is no way to tell from the response if it took or not. Just have to keep doing it and rebooting until it hopefully eventually works!
* PXE boot support that is incredibly slow, so for any reasonable payload size you need to chainload iPXE every time
* servers with oddball Broadcom NICs where chainloading iPXE doesn't work at all
* servers that sit at the BIOS screen for twenty minutes unless you pull out all the U.2 NVMe devices, then they boot OK. Maybe a firmware update will fix it!
* firmware updates that don't install properly through the BMC
* firmware updates that don't _show_ they're installed properly until the BIOS boots completely and can report the new version through apparently some kind of HTTP request on an internal USB NIC to the BMC, even though the BMC has control of the SPI flash and thus is lying to you until that reboot occurs
* BMCs that just stop working until you remove power at the wall from the whole system
* IPMI serial over LAN redirection that drops about 4% of output characters but not in a predictable way, so copying and pasting, say, a serial number has to be done several times to be sure you got the whole thing
* an interrupt controller in a HP system that doesn't emulate fixed interrupts correctly, and so eventually after some random number of millions of interrupts the lines are just stuck on until you power cycle the system
That's just stuff that comes to mind at the moment. There's literally no way to compose a reliable automated production system on this ridiculous tower of packing peanuts. In contrast, in the lab, I can pretty much just ask an Oxide machine to replace the contents of the SPI ROM and the M.2 storage device and power cycle it and it does what it's told. There are comparatively few moving parts and we have the source to almost all of them.
There was more mentioned. The BMCs and MEs being proprietary, all-powerful, and insecure. The DIMM thing applies to other devices too, like SSD wear-leveling controllers. Then there's the whole trusted boot Pluton/TPMs thing: where you have to "measure" all these proprietary firmware blobs and bless them as trusted knowing very little about them, often nothing really other than they were there in option ROMs before you even booted the thing for the first time. The list is not insignificant.
Really good talk by Bryan Cantrill (co-founder of Oxide Computer Company) on the problems with closed source firmware, the need to move away from BIOS and EFI, and how they booted their own custom x86 AMD board, all with open source firmware and without using AMD's firmware.
I have booted a few SoCs without a BIOS or anything of the sort before (nothing nearly as big as an AMD Milan chip). Doing this with a huge, fast chip is really impressive from the folks at Oxide. DDR5 DRAM training is also a ridiculously complicated and touchy exercise, as is some of the PCIe link training that (according to the talk) Linux/Unix handles.
Apparently the DRAM training is one piece of AMD firmware I believe they re-used, though I had trouble following that part. I think they said that, and that they only didn't do it themselves because they didn't need to, since they had to pick and choose their battles.
Edit: Yes, I believe he says they used the AMD firmware for the PSP (Platform Security Processor).
Edit2: This post may actually be incorrect. Please go watch the talk. I'm not sure anymore.
No, you're correct: the PSP does DIMM training. Also, note that this is AMD Milan, so it's DDR4, not DDR5 -- DDR5 is still forthcoming from both AMD and Intel.
DDR4 link training is a bit easier to comprehend than DDR5, but still a headache. I think almost every high-performance SoC uses an auxiliary processing core for link training at this point (including large FPGAs).
I have been waiting for someone to find a security vulnerability in one of these cores.
One part I couldn't follow in your talk is which software is Oxide-designed and which is vendor-supplied. Have you really stripped out every piece of vendor software from the product? Or are there still some parts that are vendor-supplied black boxes?
The question after your talk implied that DIMM training was still being done by non-Oxide software.
On AMD, DIMM training is done by the PSP. And you have to run the PSP to run the SoC, so we have no alternative there.
More generally, we have implemented and opened everything we can; there still remain opaque bits like the PSP, as well as some smaller bits scattered through the machine (e.g., SSD firmware, MCU boot ROMs, VR firmware, etc.). We have endeavored to make as open a system as possible within the constraint of not making our own silicon. And while we haven't talked about it publicly, we will also open our schematics when we ship the rack, allowing everyone to see every component in the BOM. While we do have some (necessary) proprietary blobs in the system, we want to at least be transparent about where they are!
What's the "VR" in "VR firmware" stand for? Unless Oxide has invented a revolutionary VR-based user interface for server hardware, I expect I misread that line. ;)
Oh, sorry: "VR" is a "voltage regulator" in this context. You can see our drivers for these parts in Hubris, our all-Rust system that runs on our service processor.[0] All of our drivers are open, but the parts themselves do contain (small) proprietary firmware blobs.
Why not write your own firmware for these? They may be a lot more sophisticated than I am thinking, but voltage regulators usually have a datasheet/manual with a set of control registers, and you likely want to set those registers based on the physical hardware you have.
The regulators are actually quite sophisticated and have many undocumented registers that set how things like the communications with the processor work, nonlinear control algorithms, etc.
Very interesting. I would assume that parameters for the control algorithms are actually a few of the things you want to set for yourself, since that lets you optimize system stability under load switches. Your board might also have more or less inductance or capacitance than other motherboards, and this can affect the stability and performance of the control loop a lot. It's a shame they don't give you the documentation about those control systems to figure out how to set those parameters for yourself.
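For a sense of what the documented part of these parts usually looks like: it's typically the standard PMBus command set over I2C/SMBus, while the interesting loop-tuning and processor-interface registers sit in the manufacturer-specific space beyond it. A few of the standard command codes (from the PMBus spec), just to give a flavour:

    /* Standard PMBus command codes (defined by the PMBus spec); the
     * undocumented loop-tuning and telemetry-interface registers live
     * in the manufacturer-specific range beyond these. */
    enum pmbus_cmd {
        PMBUS_OPERATION    = 0x01,  /* enable/disable the rail */
        PMBUS_VOUT_MODE    = 0x20,  /* number format for voltage values */
        PMBUS_VOUT_COMMAND = 0x21,  /* set the output voltage */
        PMBUS_STATUS_WORD  = 0x79,  /* fault summary */
        PMBUS_READ_VOUT    = 0x8B,  /* telemetry: output voltage */
        PMBUS_READ_IOUT    = 0x8C,  /* telemetry: output current */
    };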
My very limited understanding of DIMM training is that it's about the digital system learning and setting the precise analog timing settings required to talk to the DIMM. Every memory cell has tiny manufacturing differences and so these need to be learned on the fly at computer boot and they change over time with use so need to be re-determined every boot.
Timing adjustment per x number of data bits has been required since DDR3, but DDR4 also has internal reference voltage calibration for DQ bits (VREF_DQ). This voltage sets the threshold by which the IO cell determines if a voltage represents a logic high or low. This VREF_DQ value is calibrated per x number of bits, in addition to adjusting the timing to try to find the best place to sample the signal.
This[1] page does a decent job of going into the initialization procedures of DDR4, and why the various steps are needed. Really quite fascinating and complex, due to the high speed.
The essence is that the signals are so high-speed, i.e. each bit takes a very short time, that the physical distances between DIMM modules and DRAM modules on a DIMM start to matter and have to be compensated for, so that all relevant signals arrive at the same time at a given DRAM module.
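As a toy illustration of the "best place to sample" step (the real training runs in PHY/PSP firmware and also sweeps VREF_DQ, handles per-lane deskew, and so on), the core of it is just an eye search over pass/fail results from reading back a known pattern:

    #include <stdbool.h>
    #include <stddef.h>

    /* Illustrative only: given pass/fail results from reading a known
     * pattern at each delay tap, pick the centre of the widest passing
     * window (the "eye"). */
    int center_of_widest_eye(const bool pass[], size_t ntaps)
    {
        size_t best_start = 0, best_len = 0, run_start = 0, run_len = 0;

        for (size_t i = 0; i < ntaps; i++) {
            if (pass[i]) {
                if (run_len == 0)
                    run_start = i;
                run_len++;
                if (run_len > best_len) {
                    best_len = run_len;
                    best_start = run_start;
                }
            } else {
                run_len = 0;
            }
        }
        if (best_len == 0)
            return -1;  /* no working setting found */
        return (int)(best_start + best_len / 2);
    }

DDR4 repeats that kind of search per group of data bits and per VREF_DQ setting, which is where the time and complexity go.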
I've worked on this before. We had explicit help from AMD - their documentation is better than Intel's for this stuff, but still.
if it weren't for linux's dependence on the bios pci and memory probes, you could really cut out all that crap. you need to program the memory controllers and bring in the kernel from storage.
oh no, my recollection was that at least at the time, linux used the acpi pci traversal to start driver discovery instead of doing its own. obviously after that it assumes complete control of the device tree.
I would really like to know if my x86 'linux PC', is actually running another 'hypervisor' OS that runs Linux - it seems like a recipe for security vulnerabilities.
If so, there are a lot of questions:
- what OS, and is it up to date?
- can this OS be communicated with from the network?
- on which chips does it run?
- can it be re-flashed / upgraded / replaced?
- can it interrupt/schedule my normal OS?
- how much CPU/power does it use?
I would certainly rather trust an Oxide-supplied (Rust) bare-metal open-source low-level OS to host Linux VMs on my dev machine than, say, a totally opaque binary blob that the US government has forbidden xyz company from talking about.. just to speculate wildly.
I also think Oxide has a wider market than just the server space - e.g. one has to do all sorts of shenanigans to get a core freed up so that you can run timing/latency-sensitive apps without getting interrupted by Linux threads doing noisy housekeeping on each core.
> I would really like to know if my x86 'linux PC', is actually running another 'hypervisor' OS that runs Linux - it seems like a recipe for security vulnerabilities.
There's not so much a "hypervisor" OS, but instead a congealed set of many different OSes, some realtime OSes, some possibly old Linux variants, some possibly closed source proprietary one-off OSes. I suggest taking a look at the talk Bryan mentioned: https://www.youtube.com/watch?v=36myc8wQhLo
It's kinda difficult to understand what the key differences are between their approach and what coreboot does with a Linux payload, which afaik does not use any BIOS/UEFI layer in between and also keeps the coreboot part pretty minimal => boot as early as possible into Linux
The difference is that coreboot comes up and runs a payload, something like LinuxBoot. LinuxBoot is then responsible to find the actual OS that you want to boot, and uses kexec to boot that OS. LinuxBoot still has to pass hardware information to the HostOS and then hand over control.
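At the syscall level, that LinuxBoot-style handoff is roughly the following (a sketch: the paths and command line are placeholders, and error handling is omitted). The next kernel is staged with kexec_file_load(2) and then jumped into with the KEXEC reboot command, with no firmware involved in between:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/reboot.h>

    int main(void)
    {
        const char *cmdline = "console=ttyS0 root=/dev/nvme0n1p2";
        int kernel_fd = open("/target/vmlinuz", O_RDONLY);
        int initrd_fd = open("/target/initrd.img", O_RDONLY);

        /* Stage the target kernel and initrd. */
        syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                strlen(cmdline) + 1, cmdline, 0UL);

        /* Equivalent of `kexec -e`: jump straight into the staged kernel. */
        syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
                LINUX_REBOOT_CMD_KEXEC, NULL);
        return 0;
    }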
In Oxide this doesn't happen; the Pico thingy boots into their OS and that's it. There is no in-between step. The OS then simply loads the additional things it needs. It never hands off control to another kernel.
I see the sibling comment, so here's my two cents -
I wonder if there's some confusion because LinuxBoot is being talked about relative to Oxide. LinuxBoot isn't trying to do that.
It was originally a project called "LinuxBIOS" at Los Alamos National Labs to get their supercomputer to boot up in a reasonable amount of time. Once the team started down the path of "rip out the legacy BIOS" they discovered that a lot of things become simpler.
LinuxBoot was written with the goal of booting up the machine to GRUB, like this:
(CPU Reset) > LinuxBIOS > GRUB > Linux
See the description of this video (though the video is worth the watch): https://vimeo.com/724454408 "Why Linux? Because firmware always evolves to become an operating system. Rather than wait for evolution to take its course, LANL decided to save some time and use Linux as the BIOS: hence LinuxBIOS."
And the answer to your question: "could you start booting up userland directly?" I think if the LinuxBoot authors were around they would mention something about how constrained the on-board flash chip is (SPI Flash). There's room for LinuxBoot and maybe GRUB, but there is not room for all the drivers needed to get to userland.
Someone from Oxide (maybe Bryan) talked a bunch about how the philosophy of Oxide differs from Coreboot somewhere. I remember either reading about it or hearing it, but I can't for the life of me remember where I heard/read it.
As others pointed out it's (mis)quoting Shakespeare's Julius Caesar. It is worth mentioning that I hope one day we (HN?) will be quoting code - that some portions of code (maybe elegant implementations of encryption or quadratic voting) will become canon just as Shakespeare is taught in schools.
It's always been a worry that if more people can quote Shakespeare than can do long division or understand a car engine, then we will have "uninformed" decisions.
I am dubious about that - I think if elections are meaningful, the populace takes them seriously and self-informs to a level they are happy with (!). But anyway, I like the idea that some of the Linux kernel may be in people's heads enough to pop up in middle age.
It should be illegal to hide CPU-facing hardware programming information, such as registers and functions, behind an NDA. Even DRAM training stuff. Let the open source community do it all from the ground up.
If the hardware is really a smaller CPU on the other side of a bus or HCI, and what's exposed to the CPU is really a remote-RAM-loading-and-communication interface, that's fine. That's not host CPU firmware, that's device CPU firmware and while that sucks for it to be under NDA/undocumented, at least it's mostly in its own little world. The registers/process needed to setup and talk to that remote embedded CPU should be fully documented and available to anyone.
ACPI is a travesty for enabling this hiding of specific hardware interfaces, often tied to things you need for your laptop to be useful, like battery controllers and fans. SMM is a travesty for giving it a place to live.
Very unclear what the business case for this actually is.
I watched all the talk, and it seems to be centered around the fact that opaque, vendor-specific blobs that run on hidden cores are a security weakness.
I agree with that, but to be fair, what segment of the market actually cares enough about this to pay Oxide money?
There may be a business case in there, but the talk certainly doesn't explain what that is.
From my own perspective, the business case is reliability. We can take responsibility for the entire rack. No “actually that bug is in code from our vendor, we’ll file a ticket upstream” and then maybe someday it’ll get fixed and maybe someday eventually roll out.
If (let’s be real, when: nothing is perfect…) problems in our system are found, we can fix them.
There’s other reasons too. But imho that’s the big one.
It’s simple: Everyone should use Open Firmware. It’s an IEEE standard, it works well, and device trees are already widely used. Just adopt the FORTH monitor too and embrace standardization.
A lot of comments here focusing on whether this is a good thing or not, but I'm worried about a more fundamental issue: how is it supposed to happen in the first place?
As I understand it, the proposal is that every bit of software running on a device that is currently considered "firmware" ought to be in the OS instead. What does the story of how we get from here to there look like?
1. We convince manufacturers that it would be a great idea to release thorough documentation for all of their hardware products, and that way open source kernel developers can write their own firmware and merge it into open source kernels. Bryan gestures in this direction several times throughout the talk, claiming that this would be good for the hardware manufacturers because it would enable us to buy more of their products. But ... no one's really convinced this is going to work, right? The hardware companies don't want to release documentation for the exact same reason they don't want to open source their hardware; it's a barrier (however small) against leaking trade secrets.
2. We focus our efforts on small companies like Oxide building integrated, top-to-bottom systems designed to run without firmware. This is great because it enables BIOS-less system design from the ground up and can be every bit as open source as we want it to be. But this means waiting and hoping that a tiny number of companies manage to reverse engineer the chips (like the AMD processors discussed in the presentation), which probably means running years behind and an extremely limited selection of hardware. This is not a solution that is going to deliver my next laptop, let alone my grandma's next laptop.
3. We convince manufacturers that EFI is a broken, sucky, insecure mess, and let them help lead a process designed to replace it. This could maybe happen (we did get rid of BIOS in favor of EFI, after all), but what is the end result likely to look like? It looks like binary blobs running unknown code necessarily tainting nearly every Linux system under the sun. And that's if they don't just decide to integrate with Microsoft and ignore Linux completely.
That's the thing about isolation. It has problems, as this talk discusses, but it also means that (putting aside a great number of bugs) an OS like Linux can exist without needing to reverse engineer firmware for the enormous amount of hardware it runs on. It's great to be able to run an OS with few-to-no blobs active, e.g. with open source Wi-Fi and graphics drivers. Closed firmware is in theory a security nightmare, but in practice it's not that often that anyone managed to exploit these vulnerabilities. Can you imagine having to run closed source Wi-Fi firmware in the Linux Ring 0?
My own personal war against UEFI is a losing battle (already can't use newest Intel generations), and I'd love to see this become a new standard. It might make OSes slightly more complicated, but the benefits of everything being open source and well-understood would more than make up for it.
Would every user having to create their own boot image for the specific device combination they own (processor, motherboard, RAM, disk, GPU, power source etc) really be worth it? I think if this became the norm, you would see Linux adoption on this brave new hardware drop even lower than it is today.
This talk just made me realize how little I understand about what the firmware is really doing. I had previously understood the BIOS to be a small piece of firmware that runs the bootloader at startup and maybe some other coordination between other hardware devices and the CPU. Then CPU takes over and the firmware takes a backseat. The speaker mentioned a whole can of firmware worms that I had no idea even existed.
I’m interested to see the progress of open source firmware now.
That's pretty much where I'm at - my understanding of most of this hasn't moved since CP/M.
CPU boots, PC sets to zero, starts reading instructions from ROM, optionally paging out the ROM if we really need to recover that memory window. I get that x86 is a bit messier because you have optional bioses that get run, and then you read and run the first page of the storage device, but .. man have they made life difficult since I last paid attention.
What would your criteria be for a holistic yet non-hacked-together boot process? How do you decide what layers/abstractions are needed and what are not?
I think holistic implies some principles guiding design decisions, drawing on research areas like formal methods, programming language theory, operating systems, hardware, etc., and unifying them, more along the lines of what the cited talk It's Time for Operating Systems to Rediscover Hardware covered.
Haha, that's pretty clever :) The intended point was more that the concept can be iterated upon. I think you'd be the last to claim Hubris ought to be the end-all-be-all of holistic operating systems.
the tooling in the demo was impressive, and I don't mean to be overly critical; I have gazed into the os/hardware abyss and I'm still paralyzed trying to figure out an approach. and maybe you will find what I'm looking for out there, or maybe we are looking for different things and that is fine too. it's certainly better than doing nothing.
Indeed, doing what they have done is impressive, and there are talented folks at play for sure. Not sure it's even vaguely commercially viable (heck, 0xide as a whole), but nerdy-interesting for sure. But I can't practically see why this is so much more useful than what society has built together and calls the x86 platform over the last few decades.