
Transactional installs are an advantage. I'm trying to understand and brainstorm whether any disadvantages exist, especially with instances carrying multiple copies of every subcomponent.

* Transactional updates across instances: Let's say I have app, web, db, and some other roles of servers. How can I ensure that all coordinating sets of instances get updated together, or not at all? For example, I don't want my app servers to end up with a previous version of the postgres adapter while my database is already updated.

* Memory requirements: does the approach increase the total memory requirements?

* Security: do we need to rely on a 3rd party for updates, or can we still compile our own subcomponents? (We had to with the recent bash vulnerabilities.)

* Security: If every image ships with its own copies and versions of each subcomponent, do we end up having to prepare every permutation of different images to ensure all of them are fine?

* Updates: Does it make integrators lazier, so we end up with a lot of obsolete or unimproved versions of many subcomponents?

* Architecture: Do we give up the idea of reusable shared building blocks at this level of abstraction (sub-instance)?



I can address some of those questions based on my reading of the literature and what they will integrate with:

* This seems like it would be coordinated by fleet, mesos, kubernetes. If I recall, some of these would allow you to direct new connections to new instances. For databases where clustering requires more sophisticated upgrades, it might have to be manually rolled/scheduled, but could probably be scripted with these.
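The loop such an orchestrator runs can be sketched roughly as below; this is a minimal illustration, and `drain`, `update`, and `healthy` are hypothetical hooks standing in for whatever fleet/mesos/kubernetes actually exposes:

```python
# Minimal sketch of a rolling update across a fleet of instances.
# drain/update/healthy are hypothetical callbacks; a real orchestrator
# provides its own equivalents for connection draining and health checks.

def rolling_update(instances, new_image, drain, update, healthy):
    updated = []
    for inst in instances:
        drain(inst)                  # stop routing new connections here
        update(inst, new_image)      # swap in the new transactional image
        if not healthy(inst):        # verify before touching the next one
            raise RuntimeError(f"update of {inst} failed; halting rollout")
        updated.append(inst)
    return updated
```

The point is that the transactional image swap is the easy step; the coordination logic (drain, verify, halt on failure) is what the cluster scheduler adds on top.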

* Memory requirements: Generally yes, but the thought process is that by having a read-only filesystem for most data, deduplicating filesystems (BTRFS, ZFS) can reduce your memory and storage requirements.

* Security is the toughest nut to crack. You're right: if a package incorporates bash as a simple shell to run exec against, then you end up depending on the app provider to update it. Likewise with openssh, libc, and other libraries — you could get stuck with whatever the app developer has packaged. On the other hand, if there is a security fix, it should be easy to handroll your own temporary version by unpacking a package, dropping in a new lib, and repackaging. Hopefully they're not pushing for static compilation (which would defeat my argument on memory as well.)
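The unpack/patch/repack step could look something like this sketch. The flat-tarball layout is purely illustrative — real image formats add manifests and signatures on top, which you would have to regenerate:

```python
import os
import shutil
import tarfile

def repack_with_fix(image_tar, member_path, fixed_file, out_tar, workdir):
    """Unpack an image tarball, drop in a patched file, and repack it.
    Illustrative only: real image formats also carry manifests and
    signatures that a hand-rolled repack would need to redo."""
    os.makedirs(workdir, exist_ok=True)
    with tarfile.open(image_tar) as tf:
        tf.extractall(workdir)
    # Swap the vulnerable library for the patched build.
    shutil.copy(fixed_file, os.path.join(workdir, member_path))
    with tarfile.open(out_tar, "w:gz") as tf:
        tf.add(workdir, arcname=".")
    return out_tar
```

This only helps as a stopgap until the app provider ships a proper rebuild.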

* Updates: Yes, but the same problem happens when everyone has long dependency chains. Instead of laziness, it becomes a hurdle to overcome to get people to up their constraints and incorporate fixes. At least this way, every app developer can ship what works for them.

* Architecture: The reusable component aspect would likely shift closer to compiler/build process. e.g.: Look at how Cabal for Haskell and Cargo for Rust work (and occasionally, fail to work.) I think the goal would be to have reliable, repeatable builds using components managed by something else, using repositories of source code/binaries to build against.


> deduplicating filesystems (BTRFS, ZFS) can reduce your memory and storage requirements

This is getting into a pretty tangential discussion, but I'd be surprised if there are net memory savings from deduplication. Disk yes, but the dedup process itself has significant memory overhead (both in memory usage, and memory accesses), which would need to be offset to have a net win. At least on ZFS, it's usually recommended to turn it on only for large-RAM servers where the saving in disk space (and/or reduction in uncached disk access) is worth allocating the memory to it.


For online deduplication you are correct, but there is not much need for online deduplication on a mostly read-only system.

BTRFS currently only supports offline, and I believe the current state of ZFS is online only. A curious situation, but I imagine ZFS will eventually support offline dedupe, and with that the memory requirements for what needs to be cached will fall.

And memory usage would decrease, because offline dedupe on read-only files reduces duplication in cache. Even memory-only deduplication would be sufficient. I'm not sure if zswap/zram/zcache support it, but it seems like a worthwhile feature.
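As a toy illustration of the offline case: whole-file dedup on a read-only tree can be sketched as hashing contents and hardlinking duplicates. Real filesystem dedupe (e.g. BTRFS offline dedupe) works at the extent/block level, but the effect on cache is the same — one copy on disk backs many names:

```python
import hashlib
import os

def offline_dedup(root):
    """Toy whole-file offline dedup: hash every file under root and
    replace later duplicates with hardlinks to the first copy seen.
    Safe only for files that are effectively read-only."""
    seen = {}
    saved = 0
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            if digest in seen:
                saved += os.path.getsize(path)
                os.remove(path)
                os.link(seen[digest], path)  # same inode -> one page-cache copy
            else:
                seen[digest] = path
    return saved
```

Because both names now share one inode, the page cache holds the data once — which is the memory win being argued for above.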


BTRFS has no built-in dedup support as of yet, but any snapshots made initially share all data, which is very fast and lightweight.


ZFS dedupe has memory overhead for writes. Technically the read overhead is zero, since the metadata on disk just points to the correct file chunks.




