Understanding Disk Usage in Linux (ownyourbits.com)
118 points by octosphere on Sept 26, 2019 | 22 comments


Website seems to be hugged to death, here's an archived copy: https://web.archive.org/web/20190429214402/https://ownyourbi...

The article is well worth reading as it's reasonably comprehensive, including a small foray into COW filesystems.


Reminder that all HN submissions are archived at archive.is; just prepend archive.is/ to the URL.


Is that a reference to Hug Bot? https://pbfcomics.com/comics/hug-bot/


Not sure, but I always thought "hug of death" came from reddit:

https://en.wikipedia.org/wiki/Slashdot_effect


Previously, that was called "slashdotting".

(Knowing such things makes me feel kind of old).


Ah yes that is surely it.


One feature not mentioned in this article is that Btrfs (and some other filesystems) supports transparently compressing files. It's another (very important) reason why the number of bytes a file contains might not match how much of your disk space the file is actually using.
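For example, something like this (paths hypothetical; compsize is a separate btrfs-specific tool):

    # mount with transparent zstd compression
    $ sudo mount -o compress=zstd /dev/sdb1 /mnt/data
    $ cp big.log /mnt/data/
    # apparent size vs. allocated size can now differ
    $ du -h --apparent-size /mnt/data/big.log
    $ du -h /mnt/data/big.log
    # compsize breaks down compressed vs. uncompressed usage on btrfs
    $ sudo compsize /mnt/data/big.log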


To show the extents of a file - including shared ones - you can use filefrag -v. It's more general than btrfs filesystem du, since it uses the FIEMAP and FIBMAP ioctls, which are supported by many (but not all) filesystems.
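For instance, after a reflink copy (file names hypothetical; the "shared" flag shows up on filesystems whose FIEMAP implementation reports extent sharing):

    # -v prints one row per extent: logical offset, physical offset, length, flags
    $ filefrag -v vm.img
    $ cp --reflink=always vm.img vm-clone.img
    # extents backing both files should carry a "shared" flag in the output
    $ filefrag -v vm-clone.img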


Related: is it possible to reliably maintain physical disk space quotas in Linux (similar to cgroups)?

Furthermore, is it possible to say how much space you would use if you were to create a file of a given size, accounting for block size, fragmentation, and metadata? Matters such as block size, inode usage, and metadata seem to make this very difficult even if you add special integration to the userspace application, for example via stat or statfs. This could help prevent quota overruns, for example.

These seem like hard problems unfortunately, and I suspect the best solution is to just create separate disk partitions for each quota group.
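For illustration, the allocated-vs-apparent gap is at least visible after the fact via stat, even though that doesn't predict overhead for a file you haven't created yet:

    # apparent size in bytes vs. blocks actually allocated (block unit is %B)
    $ stat --format='%s bytes apparent, %b blocks of %B bytes allocated' somefile
    # filesystem-wide free blocks and inodes
    $ stat -f --format='%a free blocks of %S bytes, %d free inodes' /mount/point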


On the first question: quotas have been supported on Linux for a very long time. All major (and native) filesystems support them.

On the second: disk-usage accounting that covers metadata as well as regular file data can be tricky, depending on the filesystem. ZFS always tells you how much data+metadata is used by a file, helped by the fact that metadata is dynamically allocated on ZFS like everything else. Filesystems like ext4 that have fixed metadata locations on disk don't report metadata allocation with the file; it wouldn't really be useful to see that information anyway, since removing the file doesn't free any metadata in the ext4 case.
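For reference, one traditional quota workflow on ext4 looks roughly like this (device and user names hypothetical):

    # mount with quota options, build the quota files, switch quotas on
    $ sudo mount -o usrquota,grpquota /dev/sdb1 /home
    $ sudo quotacheck -cug /home
    $ sudo quotaon /home
    # edit per-user limits, then report current usage
    $ sudo edquota -u alice
    $ sudo repquota /home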


Project quotas appear to be more similar to cgroups. They are available in XFS and ext4: https://lwn.net/Articles/623835/
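On XFS that looks roughly like the following (project name/ID and paths hypothetical; the filesystem must be mounted with prjquota):

    # register the project: directory tree 42, named "builds"
    $ echo '42:/srv/builds' | sudo tee -a /etc/projects
    $ echo 'builds:42' | sudo tee -a /etc/projid
    # mark the tree as belonging to the project, then cap it at 10 GiB
    $ sudo xfs_quota -x -c 'project -s builds' /srv
    $ sudo xfs_quota -x -c 'limit -p bhard=10g builds' /srv
    $ sudo xfs_quota -x -c 'report -p' /srv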


The traditional 4.2BSD-style quotas (1983!) on Linux also support quotas on Unix groups. Not sure if that's what you had in mind, but anyway.

I suppose project quotas as outlined here would allow multi-group support though.

Another option could be thin-provisioned COW LVM volumes.
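If you go that route, the LVM side is roughly (volume group name hypothetical):

    # carve a thin pool out of the volume group, then thin volumes on top of it
    $ sudo lvcreate -L 100G --thinpool tpool vg0
    $ sudo lvcreate -V 20G --thin -n vol1 vg0/tpool
    # snapshots of thin volumes are also COW and nearly free until written to
    $ sudo lvcreate -s -n vol1-snap vg0/vol1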


Questions about "cp --reflink" (I have never used that option so far, but it sounds useful).

Quoting "man":

"When --reflink[=always] is specified, perform a lightweight copy, where the data blocks are copied only when modified. If this is not possible the copy fails"

Q1: this is copy-on-write, right?

Q2: once/if the command completes successfully, are there any (potential) dangers (e.g. if I then immediately delete the original file) or can the new file be treated 100% as if I had copied it the classical way (without that option)?

Thx


1) Yes, it's copy-on-write, and it requires support from the underlying filesystem (Btrfs, and maybe XFS; not sure what others support it)

2) It's intended to be treated 100% as if it had been copied the classical way. It's not a hardlink or a symlink, and the result can be treated as a completely new file.
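A quick way to convince yourself, on a reflink-capable filesystem (file names hypothetical):

    $ cp --reflink=always big.img big-clone.img   # fails cleanly if unsupported
    # on btrfs, this shows the shared extents counted once
    $ sudo btrfs filesystem du big.img big-clone.img
    # deleting the original is safe; the clone keeps its own reference to the data
    $ rm big.img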


XFS has supported reflinks for a while now, although it requires a mkfs-time option to enable them. That option was just turned on by default in xfsprogs 5.1.0, which is now in Fedora 31 (still prerelease).
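For older xfsprogs you can still opt in at mkfs time (device hypothetical):

    # enable the reflink feature explicitly; xfsprogs >= 5.1.0 defaults to it
    $ sudo mkfs.xfs -m reflink=1 /dev/sdc1
    # check an existing filesystem for the feature
    $ xfs_info /mount/point | grep reflink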


Cool. I'll probably reformat the partition hosting my VM images on my root server from ext4 to XFS. Then, to make full backups from time to time, I would just have to shut the VMs down for a minute, do a cp --reflink of their image file on the host/dom0, start them up again, and do the slow download of the copied file any time later.
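Sketching that workflow (libvirt assumed purely for illustration; paths hypothetical, and /backups must be on the same filesystem, since reflinks can't cross filesystems):

    # quiesce the guest so the image is consistent
    $ virsh shutdown vm1
    # instant, nearly space-free "copy" of the image on the XFS host
    $ cp --reflink=always /var/lib/libvirt/images/vm1.img /backups/vm1.img
    $ virsh start vm1
    # the slow off-host transfer can then happen at leisure
    $ rsync -av /backups/vm1.img backup-host:/backups/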

Any recommendations for a utility (working at a layer above the filesystem) that can synchronize only the chunks of a specific file that have changed since the last sync, and which produces a "normal" file as the final result? No behind-the-curtains database: just copy the identical data from the local/old file and the changed/new data from the remote location to get an ordinary, up-to-date local file.


I believe rsync is the utility you are looking for.
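One caveat for the VM-image case: rsync's delta algorithm is only on by default over a network, and you'll want --inplace so the target file is patched block-by-block rather than rewritten. Roughly:

    # network sync: delta transfer on by default; --inplace patches the target
    $ rsync -av --inplace vm1.img backup-host:/backups/vm1.img
    # local sync: force the delta algorithm, which rsync skips for local copies
    $ rsync -av --inplace --no-whole-file vm1.img /mnt/backup/vm1.img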


Damn, now I'm ashamed of myself: I've been using rsync forever... :)

I think that when I tested this with rsync a long time ago I made a mistake similar to this one https://stackoverflow.com/questions/28819379/block-level-cop... (doing the sync locally instead of over a network) and/or this one https://www.reddit.com/r/linux/comments/3nhx0p/rsync_block_l... (testing with a file where bytes were added/removed instead of just changed in place, as would happen in the preallocated img file of the VMs).

I'll test it again - thanks a lot for the info!!


Yeah, my memory was fuzzy; I knew it was planned and being worked on, but couldn't remember whether it had happened yet. I think F2FS and bcachefs are also supposed to support it, but I'm not sure about their level of support either.


It essentially duplicates the file metadata into a new inode with shared data extents, and increments the extent backref count so that the extent tree shows there are multiple referrers to those extents.

If either file is written to, the modified extents are copied on write.

You can treat both files as if they are normal files, as if they're identical copies. It's just that the duplication doesn't take up additional space other than metadata.

A general caution on Btrfs: each subvolume gets its own pool of inode numbers. Within a subvolume, inode numbers are unique; you won't see the same inum in use more than once. But inode numbers aren't unique across the whole volume, so two completely unrelated files can have the same inum as long as they're in different subvolumes (including snapshots, where this is particularly noticeable because a snapshot is really just a reference to the parent subvolume's data and all the inums are unchanged).
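You can see the inode-number caveat directly (paths hypothetical):

    # snapshot a subvolume; inode numbers are preserved inside the snapshot
    $ sudo btrfs subvolume snapshot /mnt/data /mnt/data-snap
    # same inum, different subvolumes; stat also reports a different device
    # ID per subvolume, which is how tools tell the two files apart
    $ stat -c '%i %n' /mnt/data/file.txt /mnt/data-snap/file.txt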


Q2: (not 100% sure) this probably wouldn't apply across filesystem boundaries, so cloning to another filesystem would probably occupy more space than on the source. Also, many programs save files by writing to a temporary copy, removing the old file, and moving the new one into place; that likely wouldn't preserve the sharing either. Not really about the "immediate safety" of the file, but secondary concerns. Similar caveats apply to traditional soft/hard links and sparse files.


Add (2018).



