Can I ask a non-technical question about analysing the real-time web...
What's the business benefit? I don't think enough happens in 'real time' for it to make much of a difference - reading a few blogs, news websites, Digg / Reddit and trending topics on Twitter is enough to catch up on what happened in the world in a day.
Has anyone actually thought of anything that could make this useful?
The real-time web is basically about breaking news. That's a significant proportion of search queries, perhaps 20-30% (you can quantify with the AOL query dataset).
I don't personally care so much about 5000 Tweets saying "Michael Jackson RIP". That's ultra-mainstream breaking news that you would hear about anyway.
Instead the interesting thing about Twitter is that it's gotten big enough to explore the tail of breaking news in areas I care about, e.g. conferences and events that I want to follow without being present. A lot of that stuff would never make a full news piece or journal article, but is important to gauge attitudes and trends.
Basically it reduces the threshold of a "least publishable unit", and in doing so unlocks a large amount of breaking news that would not otherwise get out there.
Has anyone actually thought of anything that could make this useful?
Yes. This is part of what we are doing at http://causata.com/. Understanding people's actions in near real time is important if you want to target them with the right information, ads, offers, etc.
But why does this have to be real time and not, say, 4 hours delayed? If a person is interested in soccer at 8am, then they probably still are interested in soccer later in the day.