
Those numbers make me think of "store all the things" rather than useful statistical data.

1PB is arguably enough data to store genetic variation across all human beings.



1 PB / (100 GB per genome) = 10,000 genome sequences. And that's just raw data from one platform. If you're interested in, e.g., splicing diversity, you would also want to do long-read RNA sequencing. Leaving room for intermediate results (alignments, assemblies), you would only have room for about a thousand people.
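For anyone who wants to check that arithmetic, here is a rough sketch. It assumes decimal units (1 PB = 10^15 bytes) and reads the "thousand people" figure as roughly a 10x overhead for intermediate results; that factor is my guess at what was implied, not something stated above.

```python
# Back-of-the-envelope storage arithmetic, decimal units assumed.
PB = 10**15
GB = 10**9

# Figure quoted in the comment: raw reads from one sequencing platform.
raw_bytes_per_genome = 100 * GB
genomes_raw = PB // raw_bytes_per_genome

# Assumed (hypothetical) 10x overhead for alignments, assemblies, etc.
overhead_factor = 10
genomes_with_intermediates = PB // (raw_bytes_per_genome * overhead_factor)

print(genomes_raw)                 # 10000
print(genomes_with_intermediates)  # 1000
```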


Thanks for picking up on this :) As I said, it's arguable.

I was working on the principle that the effective population size of humans is 10,000.

(And your genome is oversized, no? 3 billion base pairs at 2 bits per base is about 750 MB, which is less than 1 gigabyte.)
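A minimal sketch of that estimate, assuming a single haploid sequence packed at 2 bits per base (A, C, G, T only), with no quality scores, no read coverage, and no compression. Raw sequencer output carries all of those extras, which is where the much larger per-genome figure comes from.

```python
# Size of one haploid human genome packed at 2 bits per base.
BASES = 3_000_000_000
BITS_PER_BASE = 2  # four symbols: A, C, G, T

packed_bytes = BASES * BITS_PER_BASE // 8
packed_gb = packed_bytes / 10**9

# How many such packed genomes fit in 1 PB (decimal units assumed).
genomes_per_pb = 10**15 // packed_bytes

print(packed_gb)       # 0.75
print(genomes_per_pb)  # 1333333
```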


I'm referring to guys I know who have worked or presently work at CERN.


In addition to CERN, the LSST is expected to generate 15 terabytes per day.



