The thing that interests me, though, is the idea of modifying your hard drive firmware for better performance.
My understanding is that the effective width of the write head is about 10x the width of the read head. So with the right firmware, if you're okay with a write-once medium, it should be possible to write the outermost track, move the write head 1/10th of what you'd normally move it, write the next track, and so on... and get 10x the space out of the drive you normally would. In theory, the read head wouldn't have trouble. (Of course, this would be write-once storage, since the effective width of your write head is still pretty huge; but for a bunch of things? I can totally work with that... if more than X% of a drive becomes garbage data, I copy the good data to a new drive and reformat the old one. Done.)
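To put rough numbers on it, here's a tiny back-of-the-envelope sketch of the overlapped-track idea. The head widths, band size, and the 10x ratio are all made-up illustration numbers, not real drive geometry:

    # Back-of-the-envelope model of overlapped ("shingled") writes.
    # All numbers are illustrative, not real drive geometry.
    WRITE_HEAD_WIDTH_NM = 500   # assumed effective write-head width
    READ_HEAD_WIDTH_NM = 50     # assumed read-head width, ~10x narrower

    def track_count(band_nm, track_pitch_nm):
        """How many tracks fit in a radial band at a given track pitch."""
        return band_nm // track_pitch_nm

    band = 20_000_000  # usable radial band on the platter, in nm (made up)

    # Conventional layout: tracks spaced a full write-head width apart.
    conventional = track_count(band, WRITE_HEAD_WIDTH_NM)

    # Overlapped layout: step the write head only a read-head width per track,
    # leaving a readable sliver of every previous track. Rewriting any track
    # would clobber its neighbours, hence "write once" / append-only.
    shingled = track_count(band, READ_HEAD_WIDTH_NM)

    print(f"conventional tracks: {conventional}")
    print(f"shingled tracks:     {shingled}  (~{shingled / conventional:.0f}x)")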
I hear rumors that both the major drive manufacturers are actually shipping drives with this technology, but are only selling those drives to really big players, for some reason.
Here's a reasonable reference to the 'shingle' technology, and the roadmap for the rest of us:
http://www.theregister.co.uk/2013/06/25/wd_shingles_hamr_roa...
But that's the thing: with the datasheets (and, well, a lot more skill than I personally have) we should be able to set up something like shingling on the cheap disks we have today.
Of course, from reading the article, I'm not sure I'm any closer to that particular dream.
Shingled writes require a special asymmetrical write head; you can't do it with current drives. Actual shingled-write drives are not yet shipping, AFAIK.
I'm just using shingled writes as one example. Your kernel could, for example, more efficiently reorder reads and writes with more information about the physical drive layout. Hell, just removing the bad-sector remapping (and moving it up to the kernel or the like) would help solve the performance degradation that remapped sectors cause during apparently sequential reads/writes.
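To illustrate the remapping point, here's a toy sketch of layout-aware reordering. The remap table, the distance-based cost model, and the sector numbers are invented for the example; no real I/O elevator works off numbers like these:

    # Toy illustration of why a remapped sector wrecks a "sequential" run,
    # and what the kernel could do if it knew the physical layout.
    remap_table = {560: 1_000_000}   # logical sector 560 lives off in the spare area

    def physical(sector):
        return remap_table.get(sector, sector)

    def seek_cost(order):
        """Crude cost model: total physical distance travelled between sectors."""
        pos = [physical(s) for s in order]
        return sum(abs(b - a) for a, b in zip(pos, pos[1:]))

    request = [559, 560, 561, 562]

    # What happens today: issue in logical order and hope it's physically sequential.
    naive = seek_cost(request)

    # With the remap table exposed: service the in-place sectors sequentially
    # and take the one long seek to the remapped sector last.
    smart = seek_cost(sorted(request, key=physical))

    print(f"logical-order cost: {naive}")
    print(f"layout-aware cost:  {smart}")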
Ignoring the doubters, there are plenty of cases where custom drive firmware might be useful.
I read online (probably on HN or similar) that Amazon Glacier is using drives with custom firmware that keeps them spun down to 300rpm so that more drives can fit in a rack without power and cooling concerns.
That's certainly an interesting case, and one that wouldn't have been possible without drive manufacturers stepping up to Amazon's wishes. Being able to do custom mods like this to your own disks would be pretty excellent as well.
I'm sure the people who make the drives are trying to get as much performance as possible from the firmware. They're also working with information you won't have.
While this is true, they are also optimizing for a wide range of operating environments. If you could reduce the size of the optimization space, you could get a better result. There are many places where I can think of better behavior out of the disks; their current interface is somewhat limited and very much a black box. If you could open it up a little and maybe provide some extra operations, you could gain quite a bit.
For example, most RAID systems don't really care so much about the first error on a disk: if the disk fails to read, we can save a lot of time by not retrying too much and just rebuilding from the RAID. If by any chance this is the second (RAID5) or third (RAID6) error, then you want much stronger retry logic. Current disk firmware does not allow for such logic.
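Something like this sketch, roughly; the disk model, function names, and retry counts are hypothetical, just to show the policy, not any real RAID stack:

    import random

    # While the array still has redundancy, give up on a flaky read almost
    # immediately and rebuild from the other disks; once this is the last
    # copy, retry hard. All names and numbers are made up for illustration.
    class FlakyDisk:
        def __init__(self, fail_rate=0.8):
            self.fail_rate = fail_rate

        def try_read(self, lba):
            return None if random.random() < self.fail_rate else f"data@{lba}"

    def read_with_policy(disk, lba, remaining_redundancy):
        # First error on a healthy RAID5/6: one quick attempt, then let the
        # RAID layer reconstruct. Second (RAID5) / third (RAID6) error: this
        # may be the only copy left, so retry much harder.
        retries = 1 if remaining_redundancy > 0 else 20
        for _ in range(retries):
            data = disk.try_read(lba)
            if data is not None:
                return data
        return None   # caller rebuilds from parity, or reports data loss

    disk = FlakyDisk()
    print(read_with_policy(disk, 12345, remaining_redundancy=1))  # gives up after one try
    print(read_with_policy(disk, 12345, remaining_redundancy=0))  # retries hard first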
Not everything worth doing has already been done. I'd rather encourage others to try novel ideas and approaches. Most won't work, but a working model in this space could become a great company. (Because the potential market for smarter disk interaction is huge.)
>I'm sure the people who make the drives are trying to get as much performance as possible from the firmware.
Huh. I think it's fairly common that companies engage in price discrimination by producing a lot of the same hardware, then crippling the hardware sold to the lower end. Note, my example of hard drive manufacturers doing this has to do with the next bit of your quote:
>They're also working with information you won't have.
So the 'crippling' I whine the most about is the difference between 'consumer' and 'enterprise' hard drives.
If you aren't running a hard drive in a RAID, if it's just one drive in a desktop, generally speaking, if there's a problem? you want the thing to keep retrying, if there is any chance at all that it might be able to resolve the problem.
If it's just one drive in a desktop, it's almost always best to do something that will make the drive go slower than to cause the drive to fail.
My situation? where drives are sitting in a RAID? almost the exact opposite.
So yeah; me? I spend twice as much money to get "enterprise" drives that are almost identical, mechanically, but come with slightly better firmware. Firmware that just fails fast, rather than hanging and waking me up in the middle of the night.
(A friend of mine has been telling me: "Luke, a hung drive is just a special case of a slow drive; you need to monitor read/write latency and proactively fail slowish drives. Check out blocktrace" - and he's probably right.)
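Roughly the kind of check he means, I think. The window size, outlier threshold, and bookkeeping below are invented for illustration, not anything blocktrace (or any real monitoring stack) actually does:

    import statistics

    # Track per-drive I/O latencies and flag any drive whose recent median
    # latency is far above its peers, before it fails outright.
    WINDOW = 1000        # keep the last N latency samples per drive
    OUTLIER_FACTOR = 5   # flag a drive this many times slower than the median peer

    latencies = {}       # drive name -> list of recent latencies in ms

    def record(drive, latency_ms):
        samples = latencies.setdefault(drive, [])
        samples.append(latency_ms)
        del samples[:-WINDOW]   # keep only the most recent WINDOW samples

    def suspect_drives():
        medians = {d: statistics.median(s) for d, s in latencies.items() if s}
        if len(medians) < 2:
            return []
        overall = statistics.median(medians.values())
        return [d for d, m in medians.items() if m > OUTLIER_FACTOR * overall]

    # Example: sda and sdb behave, sdc is getting slow but hasn't failed outright.
    for _ in range(100):
        record("sda", 8); record("sdb", 9); record("sdc", 120)
    print(suspect_drives())   # ['sdc']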
Note, WD has TLER, which they say you can change with WDTLER.exe. In my experience? it works on about half the drives you try, and even then those drives are far more likely to get slow (but not completely hang) than an 'enterprise' drive.
Now... let's talk about bad sectors. Filesystems have been handling bad sectors for most of my life now; they can do it fairly well.
The problem with letting the firmware handle bad sectors is that the OS doing read/write reordering assumes that if you write sectors 559, 560, 561, those are physically sequential. Once the drive firmware remaps sector 560 off into the fucking boonies, my nice sequential read is now completely fucking random... and way slower. My point is that something like ZFS can handle bad sectors way better than the drive firmware can, because it's got a lot more information. A lot more information in the case of read errors, too... all the firmware can do is hang you up retrying; the RAID layer could actively grab that block from another drive.
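For the curious, a minimal sketch of that "grab it from another drive" behavior: a mirrored read that falls back to the second copy the moment the first one errors, instead of stalling in the drive's own retry loop. The Mirror class is hypothetical, not ZFS or md code:

    class Mirror:
        def __init__(self, *copies):
            self.copies = copies    # each copy: dict of lba -> bytes (missing = unreadable)

        def read(self, lba):
            for copy in self.copies:
                data = copy.get(lba)     # a real stack would also verify a checksum
                if data is not None:
                    return data          # first good copy wins, no retry storm
            raise IOError(f"unrecoverable read error at LBA {lba}")

    good = {559: b"a", 560: b"b", 561: b"c"}
    damaged = {559: b"a", 561: b"c"}         # LBA 560 unreadable on this disk
    print(Mirror(damaged, good).read(560))   # served from the healthy mirror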
So yeah, they have information I don't have... and my computer would go dramatically faster if I could have that information. My pager would be a lot quieter, too.
From what I know, when a drive "reallocates a sector" it actually reallocates a whole track, or something very close to that, so at least the rotational sequencing doesn't change much. Of course, the track that used to be just one track-seek away is now much further off.
There are also several spare areas spread along the drive where reallocations can go, and the drive tries to pick the one closest to the reallocated track to avoid overly large seeks.
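Presumably something like this; the spare-area positions are made-up numbers, just to sketch the "nearest spare pool" idea:

    # Given a handful of reserved regions spread across the drive, send the
    # remap to whichever one minimises the extra seek. Positions are invented.
    SPARE_AREAS = [0, 250_000, 500_000, 750_000, 1_000_000]   # track positions of spare pools

    def nearest_spare(bad_track):
        return min(SPARE_AREAS, key=lambda spare: abs(spare - bad_track))

    print(nearest_spare(260_000))   # -> 250000, the closest reserved region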