
Transactional installs are an advantage. I'm trying to understand and brainstorm whether any disadvantages exist, especially with instances carrying multiple copies of every subcomponent.

* Transactional updates across instances: Let's say I have app, web, db, and some other roles of servers. How can I ensure that all coordinating sets of instances get updated together, or not at all? For example, I don't want my app servers to end up with a previous version of the postgres adapter while my database is already updated.

* Memory requirements: does the approach increase the total memory requirements?

* Security: do we need to rely on a 3rd party for updates, or can we still compile our own subcomponents? (We had to with the recent bash vulnerabilities.)

* Security: If every image ships with its own copies and versions of each subcomponent, do we end up having to prepare every permutation of different images to ensure all of them are fine?

* Updates: Does it make integrators lazier, so we end up with a lot of obsolete or unimproved versions of many subcomponents?

* Architecture: Do we give up the idea of reusable shared building blocks at this level of abstraction (sub-instance)?



I can address some of those questions based on my reading of the literature and what they will integrate with:

* This seems like it would be coordinated by fleet, mesos, kubernetes. If I recall, some of these would allow you to direct new connections to new instances. For databases where clustering requires more sophisticated upgrades, it might have to be manually rolled/scheduled, but could probably be scripted with these.
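The loop such an orchestrator runs can be sketched roughly as below; this is a minimal illustration, and `drain`, `update`, and `healthy` are hypothetical hooks standing in for whatever fleet/mesos/kubernetes actually exposes:

```python
# Minimal sketch of a rolling update across a fleet of instances.
# drain/update/healthy are hypothetical callbacks; a real orchestrator
# provides its own equivalents for connection draining and health checks.

def rolling_update(instances, new_image, drain, update, healthy):
    updated = []
    for inst in instances:
        drain(inst)                  # stop routing new connections here
        update(inst, new_image)      # swap in the new transactional image
        if not healthy(inst):        # verify before touching the next one
            raise RuntimeError(f"update of {inst} failed; halting rollout")
        updated.append(inst)
    return updated
```

The point is that the transactional image swap is the easy step; the coordination logic (drain, verify, halt on failure) is what the cluster scheduler adds on top.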

* Memory requirements: Generally yes, but the thought process is that by having a read-only filesystem for most data, deduplicating filesystems (BTRFS, ZFS) can reduce your memory and storage requirements.

* Security is the toughest nut to crack. You're right: if a package incorporates bash as a simple shell to run exec against, then you end up depending on the app provider to update it. Likewise with openssh, libc, and other libraries — you could get stuck with whatever the app developer has packaged. On the other hand, if there is a security fix, it should be easy to handroll your own temporary version by unpacking a package, dropping in a new lib, and repackaging. Hopefully they're not pushing for static compilation (which would defeat my argument on memory as well.)
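The unpack/patch/repack step could look something like this sketch. The flat-tarball layout is purely illustrative — real image formats add manifests and signatures on top, which you would have to regenerate:

```python
import os
import shutil
import tarfile

def repack_with_fix(image_tar, member_path, fixed_file, out_tar, workdir):
    """Unpack an image tarball, drop in a patched file, and repack it.
    Illustrative only: real image formats also carry manifests and
    signatures that a hand-rolled repack would need to redo."""
    os.makedirs(workdir, exist_ok=True)
    with tarfile.open(image_tar) as tf:
        tf.extractall(workdir)
    # Swap the vulnerable library for the patched build.
    shutil.copy(fixed_file, os.path.join(workdir, member_path))
    with tarfile.open(out_tar, "w:gz") as tf:
        tf.add(workdir, arcname=".")
    return out_tar
```

This only helps as a stopgap until the app provider ships a proper rebuild.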

* Updates: Yes, but the same problem happens when everyone has long dependency chains. Instead of laziness, it becomes a hurdle to overcome to get people to up their constraints and incorporate fixes. At least this way, every app developer can ship what works for them.

* Architecture: The reusable component aspect would likely shift closer to compiler/build process. e.g.: Look at how Cabal for Haskell and Cargo for Rust work (and occasionally, fail to work.) I think the goal would be to have reliable, repeatable builds using components managed by something else, using repositories of source code/binaries to build against.


> deduplicating filesystems (BTRFS, ZFS) can reduce your memory and storage requirements

This is getting into a pretty tangential discussion, but I'd be surprised if there are net memory savings from deduplication. Disk yes, but the dedup process itself has significant memory overhead (both in memory usage, and memory accesses), which would need to be offset to have a net win. At least on ZFS, it's usually recommended to turn it on only for large-RAM servers where the saving in disk space (and/or reduction in uncached disk access) is worth allocating the memory to it.


For online deduplication you are correct, but there is not much need for online deduplication on a mostly read-only system.

BTRFS currently only supports offline, and I believe the current state of ZFS is online only. A curious situation, but I imagine ZFS will eventually support offline dedupe, and with that the memory requirements for what needs to be cached will fall.

And memory usage would decrease, because offline dedupe on read-only files reduces duplication in cache. Even memory-only deduplication would be sufficient. I'm not sure if zswap/zram/zcache support it, but it seems like a worthwhile feature.
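As a toy illustration of the offline case: whole-file dedup on a read-only tree can be sketched as hashing contents and hardlinking duplicates. Real filesystem dedupe (e.g. BTRFS offline dedupe) works at the extent/block level, but the effect on cache is the same — one copy on disk backs many names:

```python
import hashlib
import os

def offline_dedup(root):
    """Toy whole-file offline dedup: hash every file under root and
    replace later duplicates with hardlinks to the first copy seen.
    Safe only for files that are effectively read-only."""
    seen = {}
    saved = 0
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            if digest in seen:
                saved += os.path.getsize(path)
                os.remove(path)
                os.link(seen[digest], path)  # same inode -> one page-cache copy
            else:
                seen[digest] = path
    return saved
```

Because both names now share one inode, the page cache holds the data once — which is the memory win being argued for above.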


BTRFS has no built-in dedup support as of yet, but any snapshots made initially share all data, which is very fast and lightweight.


ZFS dedupe has memory overhead for writes. Technically the read overhead is zero, since the metadata on disk just points to the correct file chunks.




