It tests whether the language has cheap, easy concurrency. C only works at the system-thread level, but the languages that beat it ship built-in lightweight concurrency.
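To make "system-thread level" concrete, here is a minimal sketch in C (the worker function and the thread count are purely illustrative): every concurrent task costs a pthread_create call, which on typical systems allocates a full kernel-scheduled thread.

    #include <pthread.h>
    #include <stdio.h>

    /* Each pthread_create asks the OS for a kernel-scheduled thread with
       its own stack (typically megabytes by default on Linux), which is
       why spawning tens of thousands of these the way Erlang spawns
       processes is not practical. */
    static void *worker(void *arg) {
        printf("worker %ld running on its own OS thread\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t threads[4];
        for (long i = 0; i < 4; i++)
            pthread_create(&threads[i], NULL, worker, (void *)i);
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }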
"At this writing Cheap Threads is beta software. I am releasing it without having yet used it in a real application, beyond the artificial drivers used for testing and development."
This is not a compelling alternative to what Erlang has to offer.
Supposing you do get cheap concurrency in place, you still have to implement the logic for your application, and you're going to do that in C. If you're writing something roughly as low-level as a kernel or a device driver, super. If it's a business application with rapidly changing requirements, good luck.
Well, now you're making a different argument. You've gone from "Erlang beats the crap out of C/C++ on a benchmark", to "Erlang beats the crap out of C/C++ in this particular circumstance, unless, of course, you happen to make a fair comparison to C/C++, but that would be hard, therefore Erlang is better."
What? It's not at all clear that you can reproduce Erlang's concurrency ability in C. Someone gave a link to an unknown library that has barely been tested outside its own artificial drivers, and that's supposed to be equivalent to 2 MLOC codebases with 99.9999999% uptime?
I was suggesting it in the interest of fairness. It has not yet been demonstrated: I have not seen any benchmarks to the contrary, or anything approaching real data, and certainly no instances of this happening in real production software.
Thus, the opportunity to make a fair comparison never arose, so you have some work to do before you say "thppt".
The point was that it's easier to get correctness in languages like Erlang, irrespective of the speed being achieved. Of course, if you're happy with buffer overflows and core dumps, more power to you.
It has been written _once_ in C. That does not mean that re-implementing it in C would be a good idea, or that doing so would get you the same set of benefits as the existing Erlang VM and its associated libraries. Just because one particular team has accomplished a task in C does not mean that any random collection of C coders has the skill or experience to do the same.
(A pedantic point, perhaps, but the quote you reference said _reproduce_...)
It's not a question of whether the language has cheap or easy concurrency; that's a property of the implementation. There's nothing inherent in C that dictates how threading works. These benchmarks just happen to use POSIX threads, which on a Linux system happen to map one-to-one onto kernel threads. In many circumstances, that is exactly what people want. If they want something else, they can implement it themselves or use a pre-existing library (see the sketch below).
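As a rough illustration of "implement it themselves" (a sketch only, assuming a platform that still ships the obsolescent but widely available ucontext API; the names task and main_ctx and the stack size are arbitrary), user-level context switching is the building block such a lightweight-threads library would rest on:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_ctx;

    /* A "green" task: it runs on a user-level context, yields back to
       main, then finishes. No kernel thread is created for it. */
    static void task(void) {
        printf("task: running in a user-level context\n");
        swapcontext(&task_ctx, &main_ctx);   /* cooperative yield */
        printf("task: resumed, finishing\n");
    }

    int main(void) {
        static char stack[64 * 1024];        /* the task's private stack */

        getcontext(&task_ctx);
        task_ctx.uc_stack.ss_sp = stack;
        task_ctx.uc_stack.ss_size = sizeof stack;
        task_ctx.uc_link = &main_ctx;        /* where to go when task() returns */
        makecontext(&task_ctx, task, 0);

        swapcontext(&main_ctx, &task_ctx);   /* run task until it yields */
        printf("main: task yielded, resuming it\n");
        swapcontext(&main_ctx, &task_ctx);   /* run task to completion */
        printf("main: done\n");
        return 0;
    }

A real library would multiplex many such contexts over a small pool of kernel threads and add a scheduler on top, which is roughly what the Erlang VM does internally; the point of contention here is how much work that actually is.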
What optimizations of that sort are possible? One that comes to mind is transactional memory, which is being implemented by Azul and Sun, but I don't know of any others.