It's not like all forum-software-innovation stopped in June 2005 when the 2 of us launched reddit to the world.
The hard part is going to be quantifying "quality of submissions" in a scalable way. We thought a lot about this and while it's not perfect, the vast majority of content on reddit across those half million communities is indeed good.
It's a fascinating problem that I hope someone can solve -- improve on Steve's hotness algorithm!
Right now, the primary thing that causes something to succeed on reddit is the rate of upvotes. Anything that takes time to upvote will be less likely to succeed, because it will receive its initial upvotes at a lower rate. (It takes at least an hour to upvote a great New Yorker article, vs. something that will be voted on based on the title alone or after a 1 second click.)
To fix this, you need to track the click -> upvote interval and correct for it.
This is the main reason why subreddit quality goes down with size. Only the extreme head of the "upvote rate" distribution has an opportunity to succeed when the subreddit is large, so the "upvote rate" drowns out the "upvote ratio" as a factor.
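A rough sketch of what "correct for the click -> upvote interval" could mean, in Python. This is only one reading of the idea and not reddit's actual code: the `Vote` record, the function names, and the 10-minute window are all made up for illustration. The idea is to credit each upvote at the moment the voter clicked through rather than the moment they voted, so slow-to-read content isn't penalized in the early-rate calculation.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    # Hypothetical record; both times are seconds since the submission was posted.
    clicked_at: float  # when the voter opened the link
    voted_at: float    # when the voter cast the upvote

def raw_rate(votes, window):
    """Upvotes per second, counting votes cast within the first `window` seconds."""
    return sum(1 for v in votes if v.voted_at <= window) / window

def corrected_rate(votes, window):
    """Same rate, but each upvote is credited at its click time instead."""
    return sum(1 for v in votes if v.clicked_at <= window) / window

# Three readers click at the same moments on both submissions;
# the article takes an hour to read, the image takes a second.
long_read = [Vote(60, 3660), Vote(120, 3720), Vote(180, 3780)]
quick_image = [Vote(60, 61), Vote(120, 121), Vote(180, 181)]

window = 600  # assumed 10-minute early window
raw_long, raw_image = raw_rate(long_read, window), raw_rate(quick_image, window)
fixed_long, fixed_image = corrected_rate(long_read, window), corrected_rate(quick_image, window)
```

With the raw rate the article scores zero in the early window while the image does not; with the corrected rate the two come out equal. The catch is that the correction is only known once the vote actually arrives, so a real ranking would have to re-score items retroactively or estimate the interval from past behaviour.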
Isn't this an integral problem with the reddit model though, that you can point to half a million communities and say "Look, so many people doing so much good" while what many will point to is "Look, you've made thousands of dollars from celebrity leaks" and "You've got huge communities of people sharing images of underage girls"?
I guess my commentary would be that there are a lot of places for people to be pleasant to each other and to discuss their shared interests, be it enthusiast forums or Facebook. The risk is being the place people go to be abusive and share their degradation of other people, and it's difficult to just take the rough with the smooth in that respect, when other communities are held to account for that sort of behaviour.
I would agree with you that the vast majority of the content is indeed good. Unfortunately, the bad is often concentrated in a few subreddits, and at reddit's scale that is still a lot of bad.
I think it's an interesting issue because the primary issue is what interests people, not the website itself. If a majority of people want to concentrate on the bad, then the bad shows up more. If the mods or admins make the site such that it's impossible to concentrate on the bad, then that would involve some kind of censorship that could be very biased towards someone's definition of good.
This is a great point. There is a demand for violence. It's counter-intuitive and non-PC, but people pay good money to see it. It has nothing to do with reddit; IMHO it's more of a media phenomenon. Look at the Middle East.
I didn't know reddit data dumps were available, other than by crawling with the API. I have plenty of hardware and would love to play around with the data. Could you make it available on an Amazon S3 bucket or something?