Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

For any small data set, the inefficiencies and Library Hell of $tooling don't matter... much.

For any non-trivial data set, the efficiency and internal consistency of $tooling is critical. Throwing stupendous hardware resources at a problem because the $tooling is unstable rubbish is the wrong approach.

For any non-trivial data set, the data must have a cohesive reason to exist. In other words... Big Data Garbage In, Big Data Garbage Out. Telemetry is often BDGI.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: