I work at Prezi.
We have about a petabyte of data.
It's usage data coming from the product and the website, such as clicks in the editor.
Then we have a data warehouse with cleaned and accurate datasets, which is much smaller.
We are on AWS: we use S3, EMR for Hadoop, Pig, Redshift for SQL, Chartio, etc. We have our own hourly ETL written in Go, which we will open source this year.
I recently gave a talk at Strata; here's the Prezi:
https://prezi.com/d1889jmlziks/strata-2014/