
> And since nobody is still able to compile things from scratch, everybody just downloads precompiled binaries from random websites. Often without any authentication or signature.

Apache has official mirrors that host repo files for various package managers, so you can install with apt-get or whatever it is that replaced yum (dnf?):

https://www.apache.org/dyn/closer.lua/bigtop/bigtop-1.2.1/re...

So precompiled binaries from official sources are certainly available.



I think the author's point was this:

"Unless you compile it yourself, you can't trust it."


There are over 2.9 million lines of code in Apache Hadoop alone, not counting dependencies. If you can't trust Apache, you can't trust Hadoop, regardless of whether or not you can compile it yourself.


EXACTLY. It’s just software. There are no easy answers here. There are vulns in everything from hypervisors to node modules. Building from scratch isn’t going to help.

Pragmatic solutions where possible, like scanning containers, running OWASP tools on your repos, etc.


I wish I could upvote this twice. Auto scanning containers should become the default process for everyone.
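For anyone curious, automating that can be as small as one scanner invocation per image. A hedged sketch using the open-source Trivy scanner (Grype and Clair are alternatives); the image name is a placeholder, and it degrades gracefully when trivy isn't installed:

```shell
# Hypothetical sketch: scan a container image for known CVEs with Trivy.
# "myorg/myapp:latest" is a placeholder image name, not from this thread.
scan() {
  if command -v trivy >/dev/null 2>&1; then
    # report only HIGH and CRITICAL findings
    trivy image --severity HIGH,CRITICAL "$1"
  else
    echo "trivy not installed; skipping scan of $1"
  fi
}

scan myorg/myapp:latest
```

Wiring a call like this into CI makes the scan automatic on every build rather than an occasional manual step.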


There are nearly 10 million lines of code in libreoffice, and yet I can and have built it from source just by typing:

1. $ git clone git://anongit.freedesktop.org/libreoffice/core

2. $ apt-get build-dep libreoffice

3. $ ./autogen.sh && make

Just because something has a large code base doesn't mean we shouldn't be able to build it from source ourselves.


Did you read all those lines yourself? Did you even confirm checksums matched before running them?

I think that's the parent's point. You can build from source, but how do you trust the source? Is it any more egregious to trust a prebuilt binary from a specific website than it is the raw source? If you can't trust the binary being hosted by the author/caretaker, can you really trust the source being hosted or maintained by the author/caretaker?
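For what it's worth, the checksum step itself is cheap. A minimal sketch, simulated with a local file so it runs offline; in practice the tarball comes from a mirror while the .sha512 file is fetched separately from the project's canonical HTTPS site, so a compromised mirror can't forge both:

```shell
# Simulate upstream publishing a release plus its checksum file.
echo "pretend this is a release tarball" > release.tar.gz
sha512sum release.tar.gz > release.tar.gz.sha512   # upstream publishes this

# The downloader verifies before unpacking or running anything;
# a non-zero exit status here means the artifact was altered in transit.
sha512sum -c release.tar.gz.sha512
```

This catches tampering with the artifact, though of course it doesn't help if the party publishing the checksum is itself compromised.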


I don't think his point is so much about the source as it is about updating N containers. For instance, say there's a known libssl bug. Can you tell how many of your containers are running that version of libssl? And how do they get updated?


1) List the containers running pre-fix versions of libssl-using server images.

2) Bump the base images of your server images to a post-libssl-fix version, rebuild, and push.
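Step 1 can be scripted. A hedged sketch assuming Debian-based images with dpkg available; the package name (libssl1.1) and fixed version (1.1.1k) are illustrative placeholders:

```shell
# true if $1 sorts strictly before $2 under version ordering
is_older() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Inventory libssl versions across running containers and flag old ones.
# Falls through quietly if docker isn't available here.
ids=$(docker ps -q 2>/dev/null || true)
for id in $ids; do
  ver=$(docker exec "$id" dpkg-query -W -f='${Version}' libssl1.1 2>/dev/null) || continue
  is_older "$ver" "1.1.1k" && echo "$id runs vulnerable libssl $ver"
done
```

Images built from Alpine (apk) or RPM-based distros would need a different query per package manager, which is part of why this inventory problem is annoying in practice.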


I think the point isn't that we can build from source, but why. If it's a huge codebase, you can't independently audit that source code. So whether you compile it yourself or the organization making it does is ultimately irrelevant for purposes of trusting that code not to be malicious.


How is it more secure? Do you read the entire source code to search for backdoors?


Web servers hosting binaries seem to be compromised more often than git repos. Transmission comes to mind as a semi-recent example.


But this isn't a website hosting a binary. These are binary repos hosted by Apache, who self-hosts their VCS repos as well. The idea that Apache can be trusted to host one safely but not the other is absurd, and the idea that you are more likely to notice malicious tampering via MitM attack on 2.9 million lines of code than you are a binary is laughable.


I didn't say I agreed with the statement. Which I don't.


I believe this is true and if so, then the argument should be against package managers, not Docker specifically. Most (if not all) of the official Docker images are built either by compiling the binaries from source, properly installing the binaries from the base distro's package manager, or pulling and verifying a pre-built binary from the vendor's website. For most cases, I don't see anything wrong with any of these.

Personally, what I like is no longer having to set up arch-specific build machines containing all of the build tools and dependencies for every binary I wish to self-compile. Instead, I either use the vendor's Dockerfile, which already contains everything it needs to build from source, or simply write my own if one isn't available. Building and distributing these binaries as Docker images is a breeze with GitLab CI and its container registry, and just as easy with a small VPS and Docker Hub.
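As a sketch, the GitLab CI side of that workflow can be a single job built on GitLab's predefined CI variables (the docker image tags here are placeholders, not anything from this thread):

```yaml
# Hypothetical .gitlab-ci.yml fragment: build from the repo's Dockerfile
# and push the result to GitLab's built-in container registry.
build-image:
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```

Tagging with the commit SHA keeps each image traceable back to the exact source revision it was built from.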


Actually, I think there's a subtle distinction everyone's missing (which the original article may or may not have been making):

Unless one can compile it oneself, how can one trust that a particular version of a binary release corresponds to a particular version of a source release?

If the process is reproduced by another trusted-enough source and is identical to the official release, then I'd say one can go ahead and trust the binary release of either one.

Sadly, I don't think this is generally done, though perhaps one's own spot-checking of the official release is enough.
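Such a spot-check reduces to a hash comparison. A sketch simulated with two local files standing in for the official artifact and an independent rebuild; with a genuinely reproducible build, each hash would come from a separate builder:

```shell
# Stand-ins for the official binary release and your own rebuild.
printf 'identical-binary-bytes' > official-release.bin
printf 'identical-binary-bytes' > my-rebuild.bin

official=$(sha256sum official-release.bin | cut -d' ' -f1)
mine=$(sha256sum my-rebuild.bin | cut -d' ' -f1)

if [ "$official" = "$mine" ]; then
  echo "match: binary release corresponds to this source"
else
  echo "MISMATCH: do not trust the binary" >&2
fi
```

The hard part isn't this comparison but making the build reproducible at all: timestamps, build paths, and toolchain versions all have to be pinned before two builders can expect bit-identical output.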

That's supposed to be the basis of modern science, too, though, of course, it's not generally done there, either.


No, the point was "when you routinely use binaries from a bajillion different sources of varying degrees of trustworthiness, bad stuff is bound to happen".



