Source: I'm an author of the research in question (but unaffiliated with this blog).
Note to mods: article title is "Audio Fingerprinting using the AudioContext API". Submitter title is "Sites are using audio (no permissions needed) to track users", which may violate the site guidelines.
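For readers unfamiliar with the technique named in the article title, here's a minimal sketch of how AudioContext fingerprinting is commonly done (a generic illustration of the idea, not the article's or the paper's actual code): render a fixed signal through an OfflineAudioContext, then reduce the output samples to a number, which varies subtly across browser/OS/hardware combinations.

```typescript
// Minimal sketch of a typical AudioContext fingerprint (browser environment).
// No audio is ever played and no permission prompt appears: the signal is
// rendered offline, and the tiny floating-point differences in the result act
// as a device/browser fingerprint.
async function audioFingerprint(): Promise<number> {
  // 1 channel, 44100 samples at 44.1 kHz = one second of offline rendering.
  const ctx = new OfflineAudioContext(1, 44100, 44100);

  const oscillator = ctx.createOscillator();
  oscillator.type = "triangle";
  oscillator.frequency.value = 10000;

  // The compressor's nonlinear processing amplifies implementation differences.
  const compressor = ctx.createDynamicsCompressor();
  oscillator.connect(compressor);
  compressor.connect(ctx.destination);
  oscillator.start(0);

  const rendered = await ctx.startRendering();
  const samples = rendered.getChannelData(0);

  // Collapse a slice of the rendered samples into a single number.
  let sum = 0;
  for (let i = 4500; i < 5000; i++) {
    sum += Math.abs(samples[i]);
  }
  return sum; // stable for a given browser + device, different across them
}
```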
Coauthor here. As it turns out, this is one of three papers released near-simultaneously that uncover the extent of tracking on TVs or IoT devices more generally. I've written up a survey of the three papers and what I thought were especially interesting findings, along with some thoughts on why targeted advertising as a business model for TV platforms is harmful to users: https://twitter.com/random_walker/status/1177570679232876544
Just curious: during your research, did you face any issues with auto-updates at the app or device level that set you back? If so, how did you work around them other than disabling updates?
I noticed you include gstatic.com and a subdomain of cloudfront.com in the tracker domains list. Are these really known to be used for tracking or are they included because they're controlled by Google & Amazon?
I saved the paper to read later, so this may already be discussed in it, but enough information leaks through the Referer HTTP header anyway when browsing the web in a traditional browser.
I've never inspected Recaptcha (on gstatic.com), but it does some degree of tracking, ostensibly to detect unusual usage patterns and pick who gets to help train Google's ML models with distorted street objects, and who's never shown the captcha window.
OP here. A small clarification: there are some legitimate criticisms of the Register piece that I cited in the first tweet, but I merely cited it as an example of why I think the hype is calming down. The arguments I make are independent of that piece; the limitations I point out have always been there, rather than something new that happened in 2018.
I also wanted to add a couple of points that I didn't get to in the Twitter thread.
Economics. Blockchain technologists seem to overestimate the extent to which new insights in economics are needed to understand cryptocurrencies and blockchains, as opposed to applying basic principles from economics and game theory. For example, a recent paper shows that thinking about miners and attackers in terms of stock and flow exposes important limitations of the security of Proof of Work. [1] I learnt of many other such examples at a recent conference on the economics of blockchains. [2] So I think a lot of the "cryptoeconomics" hype is misplaced.
Privacy. It's often taken for granted that decentralized architectures will improve privacy. This seems obvious given everything we've learnt about Facebook, but a better way to think about it is that decentralized systems exchange one set of privacy problems with another. I coauthored a paper a few years ago skeptical of the "decentralization ==> privacy" story in the context of social networks [3], but I think many of the arguments in that paper apply to blockchain/dApps that are being built today.
As I mentioned in the Twitter thread, my chief criticism of cryptocurrency since the beginning is its naivete about the history and politics of money.
The discussions about applying economics here focus too much on microeconomic rather than macroeconomic arguments, so they end up missing the point entirely, which is this:
Money isn't a neutral instrument and it's not something that can be decoupled from societal/political concepts and structures.
You can evaluate it as though it is neutral, but it has never been and will never be a neutral instrument, because of the historical tendency for resources to pool and thereby become power centers, among other things.
Therefore, if you view cryptocurrency as having the primary intention to decouple commerce from coercive power structures [1], then you'll immediately see the historical problem. Namely, history doesn't favor distributed power, especially not when it can easily be co-opted by the powerful systems on which it relies.
In the best-case scenario for crypto-utopians, millions of people transact and do commerce in a decentralized currency. In that case, whichever sovereign is most impacted will then either outlaw commerce in crypto by force, or require parity with the sovereign currency and control of the cryptocurrency through tax payments. This has happened multiple times over the centuries with alt-currencies: they get killed off by the most powerful (read: most powerful economy/military) organization.
Politically, there is nothing different about Bitcoin than there was with the confederate states dollar for example.
[1] From the original whitepaper: "What is needed is an electronic payment system based on cryptographic proof instead of trust, allowing any two willing parties to transact directly with each other without the need for a trusted third party." https://bitcoin.org/bitcoin.pdf
In that case, why do we have ~200 nations instead of a single world government? Clearly, sovereignty matters to groups of people.
Italy alone used to be a dozen+ independent city states. Dividing the world 200 ways seems like a lot unless you look at how much it's been divided historically. And in comparison, that's extremely centralized.
The history of political organization is driven by the history of military technology. Historically, the earth has flip-flopped between centralized and decentralized power several times, e.g. Rome (legion) => feudalism (mounted knight) => Mongol Empire (mounted archers) => Renaissance city states (crossbow, musket) => nation state (industrialization) => global superpower (atom bomb, ballistic missile). Basically, whenever the dominant military technology is expensive to produce or benefits from large, highly organized armies, the trend is toward centralization; whenever it is cheap to produce and gives the advantage to independent guerrilla fighters, the trend is toward decentralization. There's some evidence (break-up of the Soviet Union, Eastern Europe, Islamic terrorism) that we're headed toward a new era of decentralization, driven by microchips, the Internet, drones, and cheap encrypted communication.
I agree with your main point, but not the last sentence. The complexity and expense of state of the art military technology is beyond the means of smaller nation-states, let alone rebel groups, and that trend has been accelerating since WWII.
Stealth, space supremacy, global surveillance, carrier groups, strike wings, nuclear arsenals, reaper drones and all the other crazy battlefield robot tech in service or development can cut to pieces any obsolete nation state military, let alone some rebels armed with cheap 3d-printed guns.
"Stealth, space supremacy, global surveillance, carrier groups, strike wings, nuclear arsenals, reaper drones and all the other crazy battlefield robot tech in service or development can cut to pieces any obsolete nation state military, let alone some rebels armed with cheap 3d-printed guns. "
So why isn't the US (which has all this tech) winning in Afghanistan over some rebels with AK 47s? Likewise Vietnam after literally a decade of fighting.
Weapons superiority is one factor in winning wars. Likewise, to address the gp's point, the history of political organization is partly driven by military technology.
There are definitely other factors at play, but have no illusions about it: if the United States didn't mind killing and maiming a large number of innocent civilians, it could just knock down entire cities and wipe out entire populations if it wanted to. The main reason that those conflicts are tough is because the US tries to make a show that they care about human rights instead of just shelling the crap out of insurgents.
And the current struggles of the American military only result in hundreds or maybe thousands of American deaths, which, historically, would be a mere rounding error. Afghanistan, Iraq, Syria, et al. are a mess, but they don't come at a significant human cost to America.
The United States and its voting populace don't really care whether developing countries are war-torn and quality of life there is decimated. America and other states have completely destroyed the backbone of several societies in the Middle East and caused the deaths and relocation of hundreds of thousands of innocents, but you don't feel the pain of that when you live stateside. So the country has little incentive to quickly resolve these conflicts besides the bad PR; ongoing conflicts mostly buy time for protracted, proxy diplomacy with other major powers to lay claim to natural resources, and aren't viewed as conflicts with a tangible human cost.
In both those conflicts it was mostly not the US being at war with Vietnam or Afghanistan, but the US supporting one political faction within the country against another, which is a much messier business.
Italy used to be a dozen city states... And before that, Italy, France, Greece, Egypt, Turkey, Israel, Lebanon, a bunch of Balkan nations, England, and a bunch of others were all one state.
The number of states waxes and wanes throughout history, there's not a clear trend.
> Italy alone used to be a dozen+ independent city states. Dividing the world 200 ways seems like a lot unless you look at how much it's been divided historically. And in comparison, that's extremely centralized.
What's interesting is to consider that while the world is very centralized in the sense you described, "day-to-day" operations are even more decentralized than ever. That is, (very generally speaking) order is established and maintained, and people go about their lives without huge, burdensome micro-management from "above" in the important aspects of their daily life.
It probably turns out that the most consequential decision-making affecting the modern person's daily life is happening very locally in the social & spatial/temporal sense. So, in that way, authority is effectively decentralized.
The long run trend is very obviously toward a single government. Consider that until the Sumerians established cities less than 7000 years ago, humanity was just thousands of small traveling groups of nomads with no defined borders.
Sovereignty gets a lot of lip service, but in practice most people don't really want it. It's a lot of work.
> The long run trend is very obviously toward a single government
Is it? I believe there are far more countries now than 30 years ago.
We seem to have some more powerful international organisations now than a century ago, but earlier than that we had world-spanning empires, and Europe had fewer than 10 countries before WW1.
I want to believe that we're heading towards a Star Trek-like unified post-scarcity planet, but I'm not sure we aren't just seeing a local trend.
It's misleading, actually, because unincorporated territories or completely unmanaged or ungoverned groups could be counted as a "sultanate" and it would effectively mean nothing.
Much like how King Arthur, Lord of the Britons, was not known to the members of the anarcho-syndicalist commune.
>The long run trend is very obviously toward a single government.
Hmm, I think you will have to justify that statement, rather than simply claim it is a very obvious fact. Generally, there were fewer governments in the past as compared to the modern day.
That depends very much on how you define government. If you are a band of hunter-gatherers, does the "Chief" or "Shaman" that your band looks up to for leadership count as your government? If the answer is yes, then there were way more governments in the past than there are in the modern day.
If the answer is no then your point stands. However I tend to think that the answer is in fact yes and the trend does indeed seem to be toward centralized world government given a long enough time period.
“The world under heaven, after a long period of division, tends to unite; after a long period of union, tends to divide. This has been so since antiquity.” - Luo Guanzhong, Romance of the Three Kingdoms
Our world isn't under heaven, though. Increasingly, a lot of bureaucracy can be automated away - it just ISN'T, because a fair deal of automation means taking away control from politicians. Add to that the general technical illiteracy in the political sphere, and there is no push for automation apart from authoritarian governments like China.
All we need is an event that demonstrates the need for a world government - e.g. a massive asteroid headed at Earth, or a planetary-scale food shortage - to see these systems implemented.
That's the fallacy of technocrats. The Soviets used to think that with their centrally planned economy. You should probably see Adam Curtis' Hypernormalisation documentary. It's an eye opener.
AI will declare an independent government, and all humans will join that government eventually. We will be batteries in exchange for great lifestyles. No matter how hard we try, we can't optimize resources better than machines can.
What's the difference between a single government and a decentralized government? Under a single government there would necessarily be many subgovernments competing over resources, just as we have now under global capitalism.
> why do we have ~200 nations instead of a single world government?
It's easier for powerful central organizations to influence smaller countries which are divided against each other, while believing that they are relatively independent.
Also: latency of information transfer and the speed of light, which lead to local variance and central filtering of changes which are allowed to diffuse elsewhere.
We already have 3 superpowers today with potentially 2 more developing. The others don't matter at scale.
It's more likely that cities, rather than nations, will become the dominant form of a citizen's home sovereignty, with only a few national boundaries around the world. People are inherently tribal and human behavior isn't changing anytime soon, but countries are quickly becoming stretched too thin to keep up with all their varied and changing populations and needs.
>In that case, why do we have ~200 nations instead of a single world government?
The correct starting question is "why do we have 200 nations, instead of millions of individuals and families living autonomously?" and the answer is: human beings are basically tribal people, and military conquest.
And then the answer to your question about why 200 is that military conquest never got any farther before the atom bomb eliminated direct warfare between major nations.
Why don't we have a single registrar for all TLDs? Why don't we have GitHub as the single repository for ALL software in the world?
My point is that sovereignty, innovation, etc. are valuable things that centralization can't deliver. "Monopolies will maximize profits and therefore under-supply the product" is Economics 101. So, yes, economies of scale and network effects do encourage centralization, but there are counter-balancing forces.
What we need are protocols that allow a sliding scale, as opposed to non-interoperable silos. DNS, TCP/IP, and Git delivered exactly that. While GitHub has the scale and network effects, plenty of people are happily using GitLab, Bitbucket, etc.
> why do we have ~200 nations instead of a single world government
Deltas in space and time, plus the fact that face-to-face conversations give you a better idea of the sender's mental state. Comms over the Internet are untrustworthy if you don't have prior experience with the sender.
Also, language, which is related to deltas in time and space.
The EU will be less of a stretch after the U.K. Brexits out of membership in the next nuclear superpower, in order to complete their transition to an offshore banking haven in the City. Or am I reading the situation wrong?
Only because the current level of information technology is not able to support world government (in practice, not in theory). We can see the process of further unification by looking at the EU, which seems to be the current maximum level of unification.
Further EU unification is already falling apart, because they're trying to do it by diktat rather than voluntarily. And then there are language and cultural barriers.
A subtlety bears mentioning: History favors both centralized and distributed power, at different times, in different places, in quasi-cyclical fashion. Examples of history thwarting centralization include the collapses of: the Roman Empire, the USA in the 1860s, the 3rd Reich, Yugoslavia, the whole Warsaw Pact, and the USSR. Some of these were later "re-centralized" and some weren't. When things get too decentralized, centralization kicks in. When things get too centralized, decentralization kicks in. It's a balance.
I read The Fourth Turning some years ago and it makes effectively the same argument. Before that, the Histories of Polybius.
From my study of it, it's closer to a ratchet. So while dissolution might happen at the highest organizational level, the hierarchies trend toward growing over successive cycles.
Also, the number of sovereign nations is just one measure of centralization of power. Economic structures are now more centralized than ever (with, in many cases, a handful of multinational megacorps controlling most of each sector).
> Politically, there is nothing different about Bitcoin than there was with the confederate states dollar for example.
The big political difference is that Bitcoin transcends borders and national governments.
While the United States could have outlawed the Confederate dollar, the United States cannot outlaw Bitcoin because it can be sustained by miners globally.
Interestingly, the United States didn't have to outlaw the Confederate dollar, or the Texas dollar, because both of them were destroyed by their creators through over-printing.
>> whatever sovereign is most impacted, will then either outlaw commerce by crypto by force or require parity with the sovereign currency and control of the crypto currency through tax payments
That's not possible. Even if all the governments of the developed world coordinated to ban cryptocurrencies or impose taxes on them, this would be an opportunity for developing countries to use cryptocurrencies as a competitive advantage to catch up financially.
Very curious about how you foresee "developing countries to use cryptocurrencies as a competitive advantage to catch up financially" working if the cryptocurrencies are unusable in larger economies.
Why the downvotes? "traditional" cryptocurrencies are more useful in absence of a stable and reliable banking system and political system.
As smartphones spread across developing countries, hypothetical non-minable cryptocurrencies that work well on intermittent or local networking might become relevant.
> Economics. Blockchain technologists seem to overestimate the extent to which new insights in economics are needed to understand cryptocurrencies and blockchains, as opposed to applying basic principles from economics and game theory. For example, a recent paper shows that thinking about miners and attackers in terms of stock and flow exposes important limitations of the security of Proof of Work. [1] I learnt of many other such examples at a recent conference on the economics of blockchains. [2] So I think a lot of the "cryptoeconomics" hype is misplaced.
Are people suggesting that new economics are needed? I don't think most people are suggesting that. They are suggesting that blockchains allow you to implement economic incentive schemes that weren't previously feasible.
> Privacy. It's often taken for granted that decentralized architectures will improve privacy. This seems obvious given everything we've learnt about Facebook, but a better way to think about it is that decentralized systems exchange one set of privacy problems with another. I coauthored a paper a few years ago skeptical of the "decentralization ==> privacy" story in the context of social networks [3], but I think many of the arguments in that paper apply to blockchain/dApps that are being built today.
Nobody serious says that decentralization == privacy. However, decentralization does allow you to implement strong privacy systems, like Monero/Zcash.
To a certain degree, as with many widely held beliefs, dispelling the biggest cryptocurrency zealotry becomes a whack-a-mole game: of the thousand disprovable beliefs, any given person presented with any given disproof will claim, "Well, I certainly don't believe that, so you've done nothing against my specific (possibly shifting) 17 beliefs."
In particular, re: "Nobody serious says that decentralization == privacy" -- this is presented as a tautology on a regular basis, by people who certainly hold themselves out as serious and whom I'm willing to take as such, on Hacker News and other IT-related forums, as well as many other venues.
"Are people suggesting that new economics are needed? I don't think most people are suggesting that" - a lot of more zealous cryptocurrency supporters are very much on the "this is like nothing we've ever seen before" bandwagon, and/or are explicitly desiring as a goal/hope a massive disruption of economic and financial systems.
Just as, if there are N Christians, there are in my experience N+1 Christian belief systems, so it appears to be with cryptocurrencies. People more patient than I am are trying to make a dent by tackling those N+1 arguments, hopefully cognizant that for any set of beliefs P, they'll be met by people exclaiming that their set of beliefs Q obviously has nothing to do with P, that those into P are deluded/irrelevant, and that they would never agree with them (unless it offered them a temporary advantage :).
> In particular, re: "Nobody serious says that decentralization == privacy" -- this is presented as a tautology on a regular basis, by people who certainly hold themselves out as serious and whom I'm willing to take as such, on Hacker News and other IT-related forums, as well as many other venues.
I think maybe you're reading them uncharitably. Decentralization does enable privacy. But it isn't identical and equivalent to it.
I don't think those people are saying a new economics is needed. I think they're saying that crypto allows you to do new things, many of which traditional economics would have liked to have done, but didn't have the technical tools to do. Though maybe we're listening to different people.
I've been trying to tell people similar things for about a year now. Most people just think I haven't grasped how truly amazing and revolutionary blockchain is.
Thanks Arvind for all your work. You're doing some of my favorite work in CS. Every time I hear about an awesome project in privacy/security it turns out you're involved.
Full disclosure: I started the intercoin.org project. So I am speaking from about 2 years of experience working on and thinking hard about these problems, such as double-spending and distributed hash timestamping.
There have been very interesting PARALLELIZABLE distributed Byzantine fault tolerant systems being built, like MaidSafe and Holochain. They are based around Distributed Hash Tables and have no centralized bottlenecks, unlike the miners in proof of work. This is much older than cryptocurrency - we are talking 2002: Kademlia and Merkle trees, used in e.g. BitTorrent.
That is the key. Any technology, like a blockchain, where the entire network has to store all the data scales ridiculously poorly. To coin a phrase, it's EMBARRASSINGLY UNSCALABLE.
You can call the opposite of that sharding, but I will call it PARALLELIZABLE. The good news is that the future is simply to take EXISTING systems (e.g. the Web), INCREMENTALLY make them end-to-end encrypted, and turn the backend into a DHT where each activity or token is watched by SOME but not all computers.
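To make the "watched by SOME but not all computers" idea concrete, here is a toy sketch using rendezvous (highest-random-weight) hashing. This is my own illustration of the general idea, not the actual assignment scheme of Intercoin, MaidSafe, or Holochain:

```typescript
import { createHash } from "crypto";

// Deterministic pseudo-random score for a (node, token) pair.
function score(nodeId: string, tokenId: string): bigint {
  const hex = createHash("sha256").update(`${nodeId}:${tokenId}`).digest("hex");
  return BigInt("0x" + hex);
}

// The k nodes with the highest scores "watch" the token. Any node can compute
// the same answer locally, so there is no global ledger and no coordinator:
// each token's history is replicated on only k of the N nodes.
function watchersFor(tokenId: string, nodes: string[], k: number): string[] {
  return [...nodes]
    .sort((a, b) => (score(b, tokenId) > score(a, tokenId) ? 1 : -1))
    .slice(0, k);
}

// Example: 100 nodes, each token watched by only 5 of them.
const nodes = Array.from({ length: 100 }, (_, i) => `node-${i}`);
console.log(watchersFor("token-abc123", nodes, 5));
```

Rendezvous hashing is just one way to do it; Kademlia-style DHTs pick watchers by XOR distance to the token's key instead, but the parallelization property is the same: adding nodes adds capacity, because no single node has to see everything.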
The current Web topology is scalable but TERRIBLE for security, as evidenced by the steady stream of hacks and leaks and untrustworthy behavior by platforms. We throw up our hands like after every school shooting, as if there is nothing we can do. We can, and blockchains are not necessary.
More details can be found here, including mathematical results from 2009 on the probability of a successful double-spend attack as a function of how many computers watch each token. The results are perhaps surprising:
https://forum.intercoin.org/t/intercoin-technology-the-ledge...
If you want to discuss the architecture of DLTs in detail, you are welcome to post in our forum. (The guy arguing in that particular thread is the chief cryptographer of Ripple; he signed up to debate the ideas behind the architecture.)
Anyway, the short story is that any "blockchain" system with global consensus simply can't scale - not enough to support actual transactions. Has a single ETHEREUM token been used for its intended purpose in daily transactions? (CryptoKitties is the closest, and it nearly brought the whole system to a halt.) It's all been relegated to speculation on upcoming things, and this is what killed the dream. The exchanges are centralized databases. The actual ledger they post to when you cash out can handle 10 transactions per second regardless of the number of computers on the network. The same is NOT true of almost any other distributed protocol (email, web, etc.)
> there are some legitimate criticisms of the Register piece that I cited in the first tweet, but I merely cited it as an example of why I think the hype is calming down.
On the one extreme, you've got the evangelicals who insist that everybody will benefit from cryptocurrency today and the only reason it hasn't replaced all of modern finance is because of a conspiracy. These people ignore anyone's attempt at a rational counterargument.
On the other extreme you've got people who are very strongly against the idea of cryptocurrencies. They see that tremendous harm can come to individuals who jump into this system unprepared and who aren't aware of the security or privacy implications (for example), because the previous group doesn't admit that there are any. They are also able to see through the greed and the overhype and as a result dismiss Bitcoin for being nothing.
The thing to recognize is that these are both extremes. They both contain some truth, but it feels that the public discourse is largely just a fight between the extremes rather than any meaningful discussion.
If you want to contribute something meaningful to the discussion, be honest. You have some good points in this followup and in your twitter comments, but your opening line is just blatantly dishonest.
> For example, a study of 43 use cases found a 0% success rate
So, apparently nobody is buying drugs with cryptocurrency anymore. Seriously! This line is demonstrated to be blatantly misleading if you give it TEN SECONDS of thought. Did you? In truth, I think you did give it thought, and then you dismissed it because it's a good line that supports the way you want things to be and supports the message you want to push.
> but I merely cited it as an example of why I think the hype is calming down.
You said a misleading thing; own up. Don't downplay it. Nobody expects an apology or anything, but just admit to yourself that it was to some degree dishonest, and think about whether that's really an attribute you want to preserve in your writing and in the way you present yourself, and whether exaggeration which doesn't present itself as such really does anything to advance the discourse.
I find it interesting that OP says public crypto failed because of both fundamental misunderstandings and immature tech (internet trying to compete with newspapers in the 80s). Eventually, the internet was able to compete with newspapers and I think eventually public blockchain will compete with centralized companies.
Networks can benefit from economies of scale without the rent seeking bottlenecks of centralized agents. I agree with op that in some cases centralization naturally happens but it doesn't follow that this is true in all cases.
On a different note, has anyone read The Master Switch that OP recommended? Worth reading?
I'm scratching my head trying to think of any significant distributed and decentralised economic entities.
In fact I think this is the Big Lie of Disruption. The reality is that instead of breaking up existing monopolies, "disruption" simply creates new, bigger, even more centralised and powerful monopolies.
Economic systems are rather like cellular automata. They either splutter around for a while and die out, or they take over the board and assimilate everything.
Although it claims to solve this problem, blockchain doesn't - not even close. You can only solve it by consciously moving entropy around the system to keep it in a permanently metastable state - supporting smaller structures that generate new information, and splitting up large structures that threaten to engulf everything around them.
> Economic systems are rather like cellular automata. They either splutter around for a while and die out, or they take over the board and assimilate everything.
This metaphor is amazing. I wanted to reply that cellular automata can in fact produce highly localized, stable systems (like glider guns in Conway's GoL). Then I realized that while those systems are stable, they are also so brittle that a single flipped bit will destroy them incredibly fast. That sounds a lot like economics.
> I find it interesting that OP says public crypto failed because of both fundamental misunderstandings and immature tech (internet trying to compete with newspapers in the 80s). Eventually, the internet was able to compete with newspapers and I think eventually public blockchain will compete with centralized companies.
Keep in mind that both of those technologies took years. TCP started development in 1973. It didn't fully mature for another decade, and then it wasn't mainstream until 1994-1996.
And even then it was the web that drove the adoption of the internet, not the other way around. Until the web, the internet was more of an academic curiosity.
> Networks can benefit from economies of scale without the rent seeking bottlenecks of centralized agents.
But there are side effects of non-centralized agents -- everyone needs to agree upon running the same algorithm (forking, security holes, etc). And immature networks can suffer a majority attack.
Meanwhile from a security perspective, there's greater efficiency with a centralized system. One security hole can be fixed for everyone on the network in one spot.
It is not reasonable to compare blockchain versus TCP. Only advocates do this because, despite blockchain technology being completely useless, it's a simple line that excuses all lack of progress / use cases, whilst simultaneously implying it will still take over the world.
It's a dishonest line to come out with and people on Hacker News should know better.
The point wasn't to defend bitcoin and blockchain, but just the opposite: to point out how immature the technology is compared to something vastly simpler, moving data from point A to B on the internet. And TCP took years to get right.
"The Master Switch" is a good book to read. If we apply the ideas from that book to the blockchain space, we'd expect large players to eventually dominate, as Apple,Amazon,Facebook,Google... dominate today. I.e. you might expect companies (such as the Consensys conglomerate that invests in the Ethereum ecosystem) to dominate if the space continues to grow. One possible counter to this line of thought is projects such as Ethereum, Augur, 0x, and Gnosis have non-profit foundations behind them and are working on protocols and APIs and thus do not have the same for-profit motive to grow large.
When we were talking about the internet competing with newspapers back in the day... what we see now isn't what we wanted. It's still mostly the same entities in control. In a sense, the internet lost.
And yet here we are freely discussing the diverse opinions of many involved individuals around the world, around a Twitter thread. Back then the only relevant opinion you'd hear of was the journalists'.
The name for what you're doing here is "argument to moderation" and it's a fallacy.
The fact you're able to imagine two different positions and label them both "extreme" does not in fact mean the truth lies as you'd prefer somewhere in the middle.
"Four is an odd number"
and
"Four is an even number"
... can be portrayed as two extremes with the option to take some vague bullshit middle ground position like "Maybe four is sometimes an odd number" but actually the situation is just that one of them is right and the other is wrong.
> "Four is an odd number" and "Four is an even number"
How on earth is this a good metaphor for the positions being discussed here?
How about "all integers are odd" and "all integers are even"? That's much more similar to the views being discussed above, roughly "all blockchain applications are bullshit" v.s. "all blockchain applications are beneficial".
Is the statement, "some blockchain applications are bullshit" or "some blockchain applications are beneficial [to some group]" really a "bullshit middle ground"?
If your answer is "yes", then what you're doing is exercising willful ignorance -- there are examples of both bullshit and beneficial applications in this thread and that proves both extremes false.
If the answer is "no", then take a break from the ill-fitting metaphors. A metaphor should preserve the novel properties of the thing it mirrors while illustrating things in a way the reader can more easily understand. If it doesn't preserve the properties being emphasized -- say, if it reduces a continuum of possibilities down to a binary option -- then it's not a good metaphor. It's misleading; its author is pursuing some goal other than the truth.
So, it might seem that getting so much heat against this one component of your writing is just overreaction. Realize that I'm using the above comment also as a reply to the hundreds of people I see all over HN (and elsewhere) who align themselves with one extreme or the other.
Extremism is ruining the discourse in the public sphere. It happens in politics, but now more in tech too. Everybody seems to have an opinion and their goal is to push that opinion at all costs, including blatantly disregarding truths. How is this possibly a good thing for communication? Increasingly if I want to have any meaningful conversation -- one where I might get something from it (e.g. one where I will understand more sides to an argument, one where I might change my opinions, one where I might improve my philosophies), I have to go hunt down the people who I know are truthful. That almost always takes me offline, or at least out of the public sphere. And it seems like an oversight that all the real discussion of a thing has to happen in private. It feels like so many lost opportunities that could have benefited more people.
OP here. The number and variety of special-purpose computing devices that existed before general purpose computers is astounding. The surprising (to me) conclusion is that the main impediment to the development of computers wasn't technology. After all, Babbage's machine could have been built in his time if funding hadn't run out.
Rather, the limitation was that people didn't have the abstractions, vocabulary, and mental tools to properly conceive of general purpose computers as a concept and to understand their usefulness. They couldn't see that devices as seemingly disparate as tide prediction machines[1], census tabulation machines, and loom controllers were all instances of a single, terrifyingly general idea.
From what I can tell, Babbage mostly understood this, but it was Ada Lovelace who grasped it fully. But her writings weren't understood in her time and had to be "rediscovered" a century later. For example, she wrote [2]:
> Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.
This leads me to wonder: what abstractions are we missing today that will be obvious to future generations?
BTW I have a follow-up thread on the optical telegraph, a form of networking that long predates the Internet. [3] My long-term goal is to teach a course on computing/networking/information processing before computers, with a view to extracting lessons that are still applicable today.
No, that's a misreading of computing history. The big problem, all the way into the early 1970s, was storage, not compute power. Leibniz built the first mechanical multiplier in 1694. But significant amounts of storage (kilobytes) didn't become available until the 1950s, and memory was still a million dollars a megabyte in the early 1970s.
Read the history of the IBM 400 and 600 series, the business line from IBM. First machines that could add and count, then print, then multiply, then divide, and then a long struggle to get some memory. Machines had a few memory locations, then tens of memory locations, then a hundred or so. The people involved knew they could do more if they could store more info. It wasn't a conceptual problem. Finally they got to the IBM 650, with drum memory, the first business machine we'd recognize today as a computer. Knuth learned on one of those.
Read "IBM's Early Computers", von Neumann's report on the EDVAC, learn what Zuse and Atanasoff did, find an Analytical Engine simulator and learn what the machine could do, and read up on the history of punched card and tote machines.
Regardless of whether you're correct or not, this is still a disrespectful comment.
Nevertheless, he's referring to Ada Lovelace's comments from 1843 and saying they weren't understood for ~100 years i.e. 1940s. That pre-dates most of what you're talking about.
In the twitter thread the OP clearly has read up on the history of punched cards and the history of IBM machines.
My point is that machine arithmetic preceded machine data storage. Early computing was a struggle to get something done with very, very little memory. Like 5 to 20 numbers. Babbage wanted to build something the size of a locomotive for storage - a huge drum of counter wheels, able to store 1000 numbers. Atanasoff had a rotating wheel of capacitors, refreshed on each turn, like DRAM, which was good thinking for 1939. The code-breaking machines of WWII were all very limited memory, more like a Bitcoin ASIC than a general purpose computer. IBM was plugging away with punched cards, which were their form of permanent data storage. Shannon wrote a paper on the minimum memory requirements for a telephone switch.
In the IBM punched card world, "sorting" meant ordering, and distributing into groups was "selecting". Sorting was usually required before selecting, but if you only needed two or three output groups, you could select cards using a collator, which was capable of moderately complex logic operations. You might do that for "customers who are past due", for example.
Building a stored program computer wasn't a conceptual problem. It was that, despite massive efforts, there was nothing to store the program in. The concept of algorithms dates back to at least the 9th century, and probably back to Euclid.
(Although there's a long history of programmable cam-driven machinery. I've seen the Jaquet-Droz automata, built in the 1770s, demonstrated at the museum in Neuchâtel, where they are operated once a month. That was the high point of cam-driven machinery for a long time.)
First off, thanks for the comment—pretty interesting stuff. One thing continues to bother me though.
> My point is that machine arithmetic preceded machine data storage.
> The concept of algorithms dates back to at least the 9th century, and probably back to Euclid.
I don't think that's the issue at hand. Special-purpose computers implement algorithms and may use arithmetic—those are not the key differentiators from general purpose computation. Perhaps they were important first steps, but clearly what e.g. Turing/Church did was something else entirely. And I'm aware Leibniz was very much concerned with the problem earlier on, but where I get a bit fuzzy is whether general purpose computers really needed the more fleshed out results of e.g. Turing (or perhaps a predecessor) in order to really build a general purpose computer—and whether that was a major limiting factor even after we had access to the necessary memory, or if we knew conceptually that what we were shooting for all along was fully general purpose computation and we were really just waiting for the hardware to be possible.
See Eckert's work at Columbia University.[1] In the early 1930s, IBM came out with punched card equipment that could multiply. Eckert managed to kludge several IBM tabulating machines into a sort of programmable computer. "The control box had settings that told the different machines what they should be doing at a given time and these gave the broad mechanical programming of the problem. The punches on the cards were used to program the specific details." This was all electromechanical, with relays, gears, and clutches, not vacuum tubes. IBM eventually turned that mess into the IBM Card Programmed Calculator. Still not enough memory for programs. Eckert went on to design many more computing machines, including the ENIAC. Finally, no moving parts, but still mostly plugboard-wired programs.
Eckert is credited with inventing delay line memory.[2] That was the first useful computer memory technology. Suddenly, everybody in number-crunching was building delay line computers - EDVAC, EDSAC, LEO, IAS, UNIVAC I...
All were stored program machines. As soon as there was memory, people started putting programs in it. That's when computing took off.
Interesting—though I'm not convinced that the IBM approach to generalizing computation would have led us to general purpose computers as we know them now. Sure, it was more general than what had come before, allowing some parameterization of behavior—but 'more general' and 'universal' may be worlds apart (I don't know enough of the details of IBMs machines to say how much though...). Or even an architecture technically capable of complete generality, but not doing it so well, may be worlds apart from an architecture based on a simple theoretical grounding of that complete generality.
Eckert, it seems, was inspired by von Neumann, whom he worked with on the ENIAC, and von Neumann was an early appreciator of Turing's work on universal computation (from 1936) - no surprise that his eponymous architecture mirrors a universal Turing machine not only in power but in form (i.e. "the concept of a computer able to store in its memory its program of activities and of modifying that program in the course of these activities"[1]).
In view of that, I'm inclined to give similar importance to Turing's theoretical work still. Maybe I'm overestimating the importance of the architectural insights it led to—but I wonder if, say, we even had the Lambda Calculus, but no Turing—would we have fully generally programmable computers as we do now (which are as general in practice as well as in theory, if that makes sense)? I'm not sure...
Eckert was building working hardware in the early 1930s, long before von Neumann was involved.
There was a performance penalty to doing only one thing at a time, the stored program way. Stored programs have a lot of overhead - fetching instructions, decoding instructions, doing instructions that didn't directly do arithmetic - that were not present in the plugboard-wired machines. Those usually did multiple operations on each cycle. When clock rates were a few kilohertz, this mattered, a lot.
One result was a split between "scientific" and "business" computers. Scientific computers were usually binary, funded by the military, and were mostly one-off machines in the early days. Business machines were decimal, had to be cost-effective, and were mass produced. The two sides finally came together with the IBM System/360. By that point, both were stored-program.
As for an "an architecture technically capable of complete generality, but not doing it so well", that was the IBM 1401, a very successful machine with a very strange architecture. The Computer Museum in Mountain View has two of them working. It was a true stored program computer, quite different from any of the scientific computers. It had a much lower cost and parts count, and ran most of America's middle sized businesses for years.
It asserts authoritatively that this is a misreading of computing history - in history, there is rarely a single “correct reading”, but rather many interpretations that at times reinforce and at times weaken each other.
Using the imperative “read”, “learn”, etc is also heavy handed and unnecessary. It’s the kind of language bad teachers - the teachers who have to use their authority rather than mentorship skills - tend to use (“Check out those books for more” flows much better, for instance).
As a result, the comment comes across a little too much as “no, you don’t have the knowledge, I have the knowledge!”, which sadly can be a little too common in hacker circles and doesn’t really encourage open, fruitful conversations the way “You know stuff, I know other stuff, let’s put it together and see what we come up with” does.
> ... in history, there is rarely a single “correct reading”, but rather many interpretations that at times reinforce and at times weaken each other.
As I was reading the GP's interpretation of these historical events, it occurred to me how difficult it really is to get at a reliable interpretation of something like this: I've already read a couple of books on the subject, but I can see the cracks, so to speak, where the GP's view may be more accurate.
That said, difficulty isn't the same as impossibility—and yes, it is technically impossible (or at least meaningless) for certain definitions of 'correct' interpretation, but if you're okay with correct up to "for all intents and purposes"—then it's a non-issue. For instance, in this case there is a pretty definite question we'd like answered: which innovation was the limiting factor on our original development of general purpose computation: increased memory capacity, or theoretical knowledge of general purpose computation—or was their role more alike than different?
Depending on how things actually played out, there are definite things you can say about that. For instance, perhaps we need at least 500 bits of memory for general purpose computation—did we have access to that while no one had yet thought to build the general purpose computer because we were still waiting for Zuse, Turing/Church? If so, we answer one way; if not, we answer the other way. The only case where it gets endlessly complicated, perhaps giving apparent grounds for claiming impossibility of correct interpretation, actually fits neatly into the last option I mentioned in the original question statement: that there wasn't a significant difference in the two innovations' roles as limiting factors.
It seems valuable to understand the difficulty and limits of interpretation, but I think more harm than good is being done by speaking so generally of "correct readings" being impossible.
> Using the imperative “read”, “learn”, etc is also heavy handed and unnecessary. It’s the kind of language bad teachers - the teachers who have to use their authority rather than mentorship skills - tend to use (“Check out those books for more” flows much better, for instance).
I really wish this was better known. It's a pretty consistent pattern and useful to be able to recognize. My read on it is that it's typically used as a defensive strategy—a way to prevent an interlocutor from questioning the speaker any further on the subject.
In case the downvotes on the above confuse others: the comment Animats made is disrespectful, but they are not some random Internet commenter, and in Jobsian fashion, disrespect is cool so long as you can back it up (and John Nagle can back it up).
Well, I didn't read it as disrespectful, and that's my explanation of the downvotes (which doesn't have any value and which I only offer as a counterpoint here). I have no idea who that poster is, I don't care, and that shouldn't be a factor when considering rudeness, tbh. As long as you can back up your reasonable claims and don't attack the person behind the keyboard, what's disrespectful?
He's telling the OP to go and read up on various books, with the implication that they haven't read anything on the topic. Whereas from the Twitter thread they clearly have read books on the topics mentioned (IBM and punch cards).
From the guidelines:
> Be civil. Don't say things you wouldn't say face-to-face.
I don't think the comment made by Animats meets this guideline.
Out of curiosity, if the comment started with "In my opinion that's a misreading of computer history...", with everything else left unchanged, would it still be uncivil in your opinion?
I'm aware of who Animats is but the OP is also a Princeton Professor of CS. Given that Animats has so much reputation on here, I'd hope he would be more likely to be civil and uphold the guidelines.
> what abstractions are we missing today that will be obvious to future generations?
I think we're missing a science of meaning, and (closely related) trust.
We have a science of raw data and bits (basic thermodynamics into information theory) and a science of sending streams of bits (information theory), which provide rigorous numerical bounds on, e.g., how quickly one can signal if you know the medium.
We don't have a predictive, numerical theory of meaning, which is how interpretations of data or information cohere to the real world, and how those meanings can be trusted and transmitted from one party to another in a robust way.
We're seeing the failure of meaning all around us today as we're drowned by real data and fake data and noise, as competing voices and groups spread selfish memes and narratives that ultimately obscure truth.
We're seeing the failure of meaning as our overcompetition in academia spawns a raft of fake or non-replicable studies and p-hacking, threatening to drown the signal with noise.
Anyway, there are hints of a 'science of meaning' in different fields: Graphical statistical models hint at how causality can be inferred from downstream facts, game theory tells us some (very limited) conditions under which selfish players can learn to trust one another, evolutionary theory tells us how groups can learn to cooperate and share models of the world. But none of these provide predictive and quantitative bounds, so as of yet, meaning is an art left for cultural leaders and manipulated by advertisers and politicians, and not yet a science.
A lot of the cultural and political issues that surround digital technology appear because there's a predisposition to consider all bits equally valuable and useful.
Another word for meaning could be quality. Socially and culturally, not all information is of equivalent quality and value.
In fact we use the profit motive and political power as metrics of quality, and in practice they're turning out to be a bad way to maximise long-term social value.
> Rather, the limitation was that people didn't have the abstractions, vocabulary, and mental tools to properly conceive of general purpose computers as a concept and to understand their usefulness.
Maybe that was it, or maybe society had not achieved a scale and a regularity where a more primitive and expensive version of a programmable computer would have been useful (i.e., where the cost-benefit exchange was obvious). I'd note that computers effectively came into use during WWII, when there was immense pressure to optimize large-scale organizational activity as well as large-scale industrial processes (the Manhattan Project especially).
There was also an era of “human computers”, and when someone ultimately decided to automate their jobs away, people gradually realized that a computer could do things we never thought of as computation before, and that information could be encoded as numbers, and a lot of wild things like that.
One thing I learned in college was a means of encoding a secret: n people all learn a separate point on the curve and the degree of the polynomial for the curve, and you encode the secret as the Y-intercept. That way any k of n of those people (depending on the degree of the polynomial) can recover the secret. I thought, that’s not so hard, and I asked why this wasn’t discovered centuries ago. It’s because encoding arbitrary information as a number is an idea that never occurred to anyone before computers, aside from things like bible codes and numerology that are usually lossy anyway. But you can do this with a pad and paper if you wanted.
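For anyone curious, here is a toy version of that scheme (it's Shamir's secret sharing): a pedagogical sketch over a small prime field, not production crypto - among other things, Math.random() is not a secure source of randomness.

```typescript
// Toy Shamir secret sharing over GF(p), p = 2^31 - 1 (a Mersenne prime).
const P = 2147483647n;

const mod = (a: bigint): bigint => ((a % P) + P) % P;

// Modular inverse via Fermat's little theorem: a^(p-2) mod p.
function inv(a: bigint): bigint {
  let result = 1n, base = mod(a), e = P - 2n;
  while (e > 0n) {
    if (e & 1n) result = mod(result * base);
    base = mod(base * base);
    e >>= 1n;
  }
  return result;
}

// Split `secret` into n shares (points on a random polynomial of degree k-1
// whose y-intercept is the secret); any k of them recover it.
function split(secret: bigint, n: number, k: number): Array<[bigint, bigint]> {
  const coeffs = [mod(secret)];
  for (let i = 1; i < k; i++) coeffs.push(BigInt(Math.floor(Math.random() * 2 ** 30)));
  const shares: Array<[bigint, bigint]> = [];
  for (let x = 1n; x <= BigInt(n); x++) {
    // Evaluate the polynomial at x using Horner's rule.
    let y = 0n;
    for (let i = coeffs.length - 1; i >= 0; i--) y = mod(y * x + coeffs[i]);
    shares.push([x, y]);
  }
  return shares;
}

// Recover the y-intercept by Lagrange interpolation at x = 0.
function recover(shares: Array<[bigint, bigint]>): bigint {
  let secret = 0n;
  for (const [xi, yi] of shares) {
    let num = 1n, den = 1n;
    for (const [xj] of shares) {
      if (xj === xi) continue;
      num = mod(num * -xj);
      den = mod(den * (xi - xj));
    }
    secret = mod(secret + yi * num * inv(den));
  }
  return secret;
}

// Example: 3-of-5 sharing of the secret 123456789.
const shares = split(123456789n, 5, 3);
console.log(recover(shares.slice(0, 3))); // 123456789n
```

Any 3 shares reconstruct the secret; any 2 reveal nothing about it, since every possible y-intercept is still consistent with them.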
Who was the guy who patiently explained IN THE 70S to an interviewer why general purpose computers would be needed? I remember hearing the interview, and the interviewer genuinely couldn't fathom why they'd be needed to replace regular libraries and so on.
This is really great stuff you are collecting. It would be awesome to read it in a more organized manner than Twitter posts: Twitter really isn't so great for organizing knowledge. Don't you keep a blog or something? Maybe consider it?
By the way, I remember you from your Coursera course; it was great. Thank you a lot for that.
The main reason I'm using Twitter for this is that it's a bit too preliminary for a blog post. I don't yet have as good an understanding of the history as I would like. This way, when I discover new stuff, I can simply add a tweet to the thread.
But TBH I think Twitter is underrated as a publishing medium. For example, I've had probably 10x the number of responses from other people as I would have gotten in the form of blog comments.
In any case, I'm definitely planning to make this more organized once I'm happy with my level of understanding. At least a series of blog posts; probably a paper and/or online lecture.
> But TBH I think Twitter is underrated as a publishing medium. For example, I've had probably 10x the number of responses from other people as I would have gotten in the form of blog comments.
What was your impression of the quality of those comments, compared to the ones you get on blog comments?
> What abstractions are we missing today that will be obvious to future generations?
The first that comes to mind is rooted in the problem that our concept of "tech" is still incredibly mechanistic. Look at Foucault's "technologies of the self" and you realize that technology is not just something we create in teams at work using this or that set of tools. Rather than a "thing" outside of us, tech is very much already a part of us. We are compelled to develop it (externalize it; give it objective manifestation) simply by living.
A computer is us, but even the word computer is no longer sufficient; people needlessly mechanize humans when they think in those terms and this creates fear, fear of AI and fear of becoming mere batteries; meanwhile our fault-protection circuits have already accounted for this and we can trust them (while leaving them engaged).
We must zoom even farther out from the typical computer model in order to compute at the next level. "Tech" as an industry (in hindsight to be found a ridiculous and temporary construction) has turned too far inward. In fact there is nothing to fear outside, even in the foreign-to-tech irrational world. We can accept that world, live in it, give it the respect it seems now to unfairly demand, and learn it as a technological mode. We can bridge with it, build more technologies to abstract our relationship with it. At that point we become truly powerful, able to spin technology out of limitless resources and into limitless solutions.
Regarding externalized systems, a relevant quote from the classic computer game Deus Ex (2000):
> Helios: The checks and balances of democratic governments were invented because human beings themselves realized how unfit they were to govern themselves. They needed a system, yes, an industrial-age machine.
> JC Denton: Human beings may not be perfect, but a computer program with language synthesis is hardly the answer to the world's problems.
> Helios: Without computing machines, they had to arrange themselves in crude structures that formalized decision-making - a highly imperfect, unstable solution.
O'Neill likewise skewers the human-condition-social-upgrade-solution (referring to various -isms like Socialism) in his futurism work, _2081_. IMO it's the bipolar casting of the thought process there that is "wrong" if anything is. We are gradually becoming capable of merging and finessing subject and object using various technologies (I count therapeutic-environment-building and institutional life-examination-encouragement as examples of those technologies, but wouldn't exclude something more mechanistic than those) to where the question of "did a computing machine help or was it old fashioned social structuring" will seem pretty uneducated if it doesn't already.
If that was the original concept, I wish they had kept it.
I mean, it's such an obvious one. I must have independently thought of it a dozen times as a teenager. Hell, the movie itself sort of makes use of it - after all, AFAIR, the Matrix is simulated in people's heads. Why they would go with this human-battery nonsense is beyond me. I guess someone with influence over the movie script must have thought people were too dumb for the plot to make sense.
It’s not important to the plot at all and they probably didn’t want to get into the task of explaining the processing stuff to the dumber half of the audience. Everyone in the 90’s was familiar with disposable AA batteries, but not everyone had used a computer much.
I asked the same thing several years ago and got an answer [0]: You're correct, they didn't care so much about that specific detail and let it be changed because it wasn't actually too important to the main plot.
Feyerabend explores the difficult process of scientific progress in the classic Against Method. He describes how society finds it difficult to develop new scientific theories to better explain new evidence because of our confirmation bias toward conventional theory.
For example, the earth was once understood to be at the center of the solar system; when telescopes could better observe planetary behavior, it at first led simply to more precise epicycles.
In the book he points out that one way we tend to break through this hold on how we frame evidence is by considering metaphors from the Arts. For example, religious writings helped inform the works of Copernicus and Newton.
So to generally answer your question, I think some insights will come to us from the Arts, like science fiction books/movies/games. For example, the show Black Mirror provides some ideas that we may at some point take for granted.
Near the end of your proposed course, I would heartily recommend including https://www.theatlantic.com/magazine/archive/1945/07/as-we-m.... It was the 1945 paper that introduced the ideas behind both hypertext and the science citation index. The two ideas famously got recombined some decades later by Google's PageRank algorithm.
Now reading that article, one might wonder how someone could have such a solid grasp of how computing and people could work together in practice when the transistor had not yet been invented, and stored program computers were first described only a month earlier.
The reason is that Vannevar Bush had been building and working with computers for close to 20 years!
> After all, Babbage's machine could have been built in his time if funding hadn't run out.
Funding ran out in large part because of cost overruns due to the fact that the technology of the time wasn’t capable of building the analytic engine design.
That's possible, but an alternative explanation for the cost overruns that I've read is that Babbage had terrible project management skills.
Wikipedia has this to say:
In 1991, the London Science Museum built a complete and working specimen of Babbage's Difference Engine No. 2, a design that incorporated refinements Babbage discovered during the development of the Analytical Engine. This machine was built using materials and engineering tolerances that would have been available to Babbage, quelling the suggestion that Babbage's designs could not have been produced using the manufacturing technology of his time.
(unfortunately overshadowed in Google Search results by a William Gibson book of the same name)
It gives a lot of color on Babbage, but yes, the conclusion was that Babbage's design basically worked, and could have been built. There were errors in his drawings that they had to correct, but nothing fundamental.
The group at the Science Museum spent over 6 years doing this! This is the group that holds most of his papers, drafts, and unfinished machines.
Although there are a couple of things I want to follow up on. They weren't that specific about what computation they did. And does it still work today? It was extraordinarily finicky. It produced a lot of bit errors, as did mechanical computing devices that came later, which sort of defeated the purpose (it was supposed to calculate tables of logarithms and such with higher accuracy than humans).
The examples built do indeed keep on working with non-prohibitive maintenance - the 2nd #2-design engine (built in the 2000s for Nathan Myhrvold) was on display at the CHM in Mountain View for 8 years with daily or twice-daily demonstration runs. It sadly went off display in 2016 (probably to go to Myhrvold's private collection) but I saw the demonstration a couple of times and can answer some of your questions:
1. The concrete computation performed was to use the Finite Difference Method (https://en.wikipedia.org/wiki/Finite_difference_method - hence the "Difference Engine" name) to calculate arbitrary polynomials of degree up to IIRC 10. By using Taylor Series, this method could be used to calculate arbitrary functions, like log and sine. This was in fact the same method used to construct logarithmic tables by hand at the time, and had similar nominal precision; the singular goal was to eliminate the bit errors rampant in the old, manual process. (A code sketch of this tabulation-by-differences follows the list below.)
2. The machine removed not just errors in calculation, but also in typesetting; about half the part-count of the original design was in its printer, which could be configured with all kinds of options for typesetting the results. It would output a "print preview" onto paper locally (this was not publicly demonstrated at the CHM because of the enormous mess of ink spills, but the machinery was run dry), and an identical wax mold ready for use in mass printing. This was because many of the bit errors in the existing log/sine/etc. tables were introduced not by the (human) computers, but by the multiple copying steps involved in transforming calculated values into printed pages.
3. Computation was quite reliable - the machine worked in base 10, and mechanisms were carefully designed to freeze up (and be easily resettable to a known-good state, as demonstrations showed) before introducing errors. As far as I know bit errors were unheard of in the demonstration runs. This reliability, like in later electronic computation, was the motivation for using digital rather than analog logic. (Finickiness was mostly limited to those halting conditions - it proved quite sensitive to clock speed (rate of crank turn), but only by the standards of the hand cranking used in demonstrations; connected up to a steam engine with 19th-century rate governors, input power could have been kept clean enough to run with long MTTF.)
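A worked sketch of the tabulation-by-differences described in point 1 (illustrative code only; the function names and the sample polynomial are mine, and this is of course not how the engine itself was "programmed"): once the difference table is seeded, every further value of the polynomial comes from additions alone, which is exactly what the machine's columns mechanized.

    // Seed the difference table for a degree-d polynomial by evaluating it at
    // d+1 consecutive points and taking successive differences.
    function differenceTable(poly: (x: number) => number, start: number, degree: number): number[] {
      let row = Array.from({ length: degree + 1 }, (_, i) => poly(start + i));
      const table = [row[0]];
      for (let d = 0; d < degree; d++) {
        row = row.slice(1).map((v, i) => v - row[i]); // next column of differences
        table.push(row[0]);
      }
      return table; // [f(start), Δf(start), Δ²f(start), ...]
    }

    // Each "turn of the crank" adds every difference into the column above it,
    // producing the next tabulated value using only additions.
    function tabulate(table: number[], count: number): number[] {
      const diffs = [...table];
      const out: number[] = [];
      for (let i = 0; i < count; i++) {
        out.push(diffs[0]);
        for (let j = 0; j < diffs.length - 1; j++) diffs[j] += diffs[j + 1];
      }
      return out;
    }

    // Example: f(x) = 2x^2 + 3x + 1 tabulated from x = 0.
    console.log(tabulate(differenceTable((x) => 2 * x * x + 3 * x + 1, 0, 2), 6));
    // -> [1, 6, 15, 28, 45, 66]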
The parts required tolerances that could be achieved in the day, but not reliably, and certainly not as standard practice. When the funding was cut they were basically running a research program to develop better machining. Whether that research program failed due to bad project management is kind of irrelevant; my point was that it was still in the territory of a research program.
I think an important difference between Babbage's attempt and that of the London Science Museum is that, for the latter, it was much more economical to produce the parts to the required tolerances.
The difference between “takes an expert engineer a month to create” and “can be CNC’d in a day” is enormous, even if the end results are the same.
They still haven’t built the Analytical Engine, which was the general-purpose computer and significantly more complex.
Also, building something with the benefit of 150 years of technological progress gives you quite a leg up on the past even if you use techniques that would have technically been available back then.
Had anyone in the 19th century realized the generality of computing in binary there would have been successful mechanical computers of the age. Doing everything in decimal was part of the problem.
Charles S. Peirce in the 1880s understood that switching circuits embody Boolean logic, though he never published except in a letter to a student/colleague who was going to work on this but abandoned it. That Babbage and Peirce both had trouble dealing with people may have had a pretty big effect on history.
That's the trick of "autistic" personalities -- better with computers but worse with people. A person better with people but worse with computers wouldn't have succeeded either. Being better at both is rare.
According to the London Science Museum, Babbage looked at binary, as well as many other bases, and made an informed decision against it. The supremacy of binary is not as obvious with mechanical systems as it is with electrical semiconductors. These machines are built with gears, not switches.
Probably if Babbage had employed watchmakers, he could have got it done, but clockwork run on springs isn't as impressive as massive, clanking machinery driven by steam engines.
He would still have had the problem of driving typesetting equipment, but that seems like a smaller problem. Output on punched paper tape was the solution the later generation chose, and would have worked.
Amusingly, the original telegraph inventions (plural) output dots on paper, but accidents of finance left us with the inferior audio clickery. Sort of like how we are still using x86.
If you haven't seen it: physicists used physical lenses to compute Fourier transforms before computers.
Also, although I didn't read closely enough, I think "collation" or "partitioning" is the right term for sorting things and then placing them into buckets. Sorting is just ordering, not partitioning.
Dictionaries are not authoritative sources for word definitions. Rather they are, and can only be, historical references.
Words mean what people use them to mean.
“Sort these pages” with no other instructions will almost always mean, to the average person in a general context, put the pages in ascending order using the most obvious sequence present.
“File these documents” will generally mean put these pages in their relevant folders, either in a filing cabinet or computer storage.
Well, that's their common usage outside of computer-specific applications here in Australia.
Both are incorrect. "Filing" is bucketing according to an external value.
For example, when a doctor files some test results, the test results aren't being put in order with other test results. Those papers are first bucketed by patient, and the buckets (files) are what's kept sorted.
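To make the distinction concrete, here is a tiny sketch (the TestResult shape and patientId field are made up for illustration): sorting only orders a list, whereas filing buckets items under an external key, and it's the buckets themselves that are kept in order.

    interface TestResult {
      patientId: string; // the external value the documents are filed under
      value: number;
    }

    // "Sort these results": ordering only, no grouping.
    function sortResults(results: TestResult[]): TestResult[] {
      return [...results].sort((a, b) => a.value - b.value);
    }

    // "File these results": bucket by patient; the buckets (files) are what stay sorted.
    function fileResults(results: TestResult[]): Map<string, TestResult[]> {
      const files = new Map<string, TestResult[]>();
      for (const r of results) {
        const bucket = files.get(r.patientId) ?? [];
        bucket.push(r);
        files.set(r.patientId, bucket);
      }
      // keep the files themselves in order by patient id
      return new Map([...files.entries()].sort(([a], [b]) => a.localeCompare(b)));
    }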
This leads me to wonder: what abstractions are we missing today that will be obvious to future generations?
Assuming that at some point we will stop writing code and machines will generate programs for us instead, then we need abstractions that make this possible: a way to describe the purpose of a program that is precise enough to generate code satisfying the purpose and that can be derived from normal human communication instead of having to be explicitly coded. In other words: a way of describing why the program exists, instead of describing what it should do.
Additionally, to avoid paperclip maximizers we would need a system of ethics and values robust enough that it becomes computable. Then it can be integrated into the code-writing code in such a way as to constrain the code it generates to only choose approaches that are in line with the value system, and to refuse to implement purposes at odds with the value system. As long as our best stab at ethics largely boils down to long lists of behavioral rules written in holy books with few underlying principles we aren't ready to build truly advanced AI.
Probably some philosopher has already figured out the right model of abstract ethics, but their work has gone mostly undiscovered so far.
> Assuming that at some point we will stop writing code and machines will generate programs for us instead, then we need abstractions that make this possible: a way to describe the purpose of a program that is precise enough to generate code satisfying the purpose
The industry-standard term for a specification precise enough to enable building a program from it is "code"
>> This leads me to wonder: what abstractions are we missing today that will be obvious to future generations?
That's a very nice question. But I tend to see it like this: if you knew, it would mean that somehow you had predicted the future.
I'd say that abstractions are built all the time, but only the good ones survive. So the only way to answer your question is to wait 50 years; then we'll know what those missing abstractions were.
But the simple act of asking the question already opens the way to a critique of current abstractions, which will lead to new ones and maybe one of the future :-)
Are there public places where such things are discussed? I mean, places where you don't need to be inside a university to actually participate.
> what abstractions are we missing today that will be obvious to future generations?
And how many people have already recognized them, only to have been ignored because they're working at too high a level for their listeners to appreciate?
> Babbage's machine could have been built in his time if funding hadn't run out.
And if it hadn't, so what? The world would not have changed; we'd just have an absurdly impractical computer which no one had a use for. Had it been useful enough to keep making them and improving them, we'd have developed the abstractions.
When you have to build a device pretty much by hand, why bother building a general purpose machine?
And here we are, 181 years after Babbage first described his Analytical Engine, and Application-Specific Integrated Circuits and Graphics Processing Units are all the rage.
> When you have to build a device pretty much by hand, why bother building a general purpose machine?
Because you'd conquer many industries with it simultaneously. Instant path to riches.
> And here we are, 181 years after Babbage first described his Analytical Engine, and Application-Specific Integrated Circuits and Graphics Processing Units are all the rage.
It's a different thing. We usually first figure out a description of a computation - one that could be run on a general-purpose machine - and then turn it into an ASIC to make it more efficient, by performing only the work necessary for that one particular calculation. Back in Babbage's times, they didn't have a coherent notion of computation yet.
> Because you'd conquer many industries with it simultaneously. Instant path to riches.
I just mean to say, from my limited knowledge, the available applications were all (mostly?) bespoke anyways, so the inputs and outputs probably needed to be also. So perhaps a modular mechanical general purpose computer may have helped.
> It's a different thing.
Yes, you've made a good point, I agree. Perhaps, then, if you were to build a general-purpose mechanical computer ~180 years ago, you might use it for whatever application is required, observe its most commonly used computational paths, then swap it out for an application-specific computer of (possibly?) reduced complexity. You can then take your whizz-bang-high-tech-expensive computer to the next client, rinse and repeat. Maybe.
> So perhaps a modular mechanical general purpose computer may have helped.
That's what I was imagining when writing that comment. And I suppose experience turning computation into work in one industry would let you design bespoke endpoints for another faster.
BlockSci is an academic research project at Princeton, but we're committed to maintaining it as open-source software, and we hope it's more broadly useful. If you're interested in using it or contributing to it, here's a list of ideas that we'd love to see implemented.
1. Create a Block Explorer. BlockSci would make a good backend for a block explorer website, because it would benefit from the built-in analysis library, with features like address clustering and parsing multisignature scripts.
2. Support more blockchains. BlockSci supports several blockchains, but there are limitations detailed in the paper [1]. For example, currently we don’t support any script operations not found in Bitcoin. Supporting more altcoins/blockchains would make BlockSci more useful.
3. Identify cold wallets and associated usage patterns. Cold wallet addresses could be identified by various patterns on the blockchain such as infrequent large withdrawals. After identifying these addresses, there are many interesting questions to ask such as studying the rate of deposits vs withdrawals.
4. Improve clustering heuristics. BlockSci’s address linking is based on the two heuristics from the Fistful of Bitcoins paper [2]. These heuristics have known limitations, leading to false positives and negatives; there’s a lot of room for improvement here.
5. Extract hidden messages. There are many messages encoded into the Bitcoin blockchain ranging from Wikileaks cables to Rickrolls [3]. We can find them if we can guess how they are encoded. But can we automatically extract and decode these hidden messages, say, by looking for address strings that look non-random?
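For item 5, here is a minimal sketch of one possible non-randomness check (illustrative only, not part of BlockSci's API; the function names and the 0.9 threshold are arbitrary): base58-decode each output address and flag those whose hash payload is mostly printable ASCII, since "burn" addresses carrying messages tend to spell out text there.

    const B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

    // Decode a base58 address to raw bytes (no checksum verification here).
    function base58Decode(addr: string): Uint8Array {
      let n = 0n;
      for (const ch of addr) {
        const idx = B58.indexOf(ch);
        if (idx < 0) throw new Error(`invalid base58 character: ${ch}`);
        n = n * 58n + BigInt(idx);
      }
      const bytes: number[] = [];
      while (n > 0n) {
        bytes.unshift(Number(n % 256n));
        n /= 256n;
      }
      // Each leading '1' in the address encodes a leading zero byte.
      for (const ch of addr) {
        if (ch !== "1") break;
        bytes.unshift(0);
      }
      return new Uint8Array(bytes);
    }

    // Flag addresses whose 20-byte hash payload is mostly printable ASCII.
    function looksLikeText(addr: string, threshold = 0.9): boolean {
      const decoded = base58Decode(addr);
      const payload = decoded.slice(1, -4); // drop version byte and 4-byte checksum
      if (payload.length === 0) return false;
      const printable = [...payload].filter((b) => b >= 0x20 && b <= 0x7e).length;
      return printable / payload.length >= threshold;
    }

    // Hypothetical usage over the output addresses of a block:
    // const suspects = addressesInBlock.filter((a) => looksLikeText(a));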
I am interested in contributing to making a block explorer. You guys have something like this https://blockexplorer.com/ in mind? Is it OK if I use React? I guess Django for the backend.
One of the authors here. It can be used for realtime analysis. The data is updated using a parser program which performs incremental updates. Adding a single block is extremely rapid. Running this repeatedly (for instance using a cron job) will keep the data up to date. I'm planning on adding a daemon mode in the next version to simplify the process of live updates.
I'm the lead author of this textbook/lecture series. It's been a couple of years, and we've been thinking about an update. Let me know what topics you'd most like to see. Note that our goal is not so much to teach the details of specific cryptocurrencies as the concepts underlying them. For example, covering Byzantine Fault Tolerance and its application to blockchain protocols is high on my list.
Took the class on Coursera and it was very informative. I often reference the book as well. To be honest, it wasn't the technical aspects of the class but the "political economy" implications of crypto-anarchy that were particularly fascinating. At the time, the Isle of Man regulations had just come out as the first attempt to integrate bitcoin into the market. Prof. Felten's mild pushback against the consequences of a shadow, unregulated monetary system really moderated my own opinions. I would love to see a deeper historical framing of digital money as a way to think about the future.
The recent Filecoin ICO has also gotten attention. And this might be a good place to jump off for a second course: "Applied Crypto Entrepreneurship". How to package, market and sell your existing computing resources, knowledge and skills for financial gain and social good. How blockchains and DLTs can be used not just to encode trust, but transactions for all digital goods and data. Even a section on using machine learning for time series analysis and building algorithmic trading bots. And also including lots of practical exercises in smart contract development and introductions to third-party tools, APIs and BaaS platforms.
Good luck and looking forward to seeing the next version! I'd say you've really got your work cut out for you trying to stay up-to-date on this ever shifting foundation.
I'm utterly sick of the politics around the scaling debate, but it would be comforting to read an impartial design document analyzing the many legitimate technical approaches to adjusting the number of end-user Bitcoin transactions that can happen in a given unit of time. The technical question is not as zero-sum as the warring political factions want us to believe.
Such a discussion would be generally applicable to an abstract cryptocurrency, and thus pertinent to this lecture series.
Thank you so much for doing this and posting it online. I have learned a tremendous amount from your Coursera course.
Whilst I appreciate that you are not seeking to teach the details of specific currencies, could you use some of the newer currencies to teach by example?
For example, could you discuss some of the concepts detailed in several of the white papers (and the legitimacy thereof) such as IOTA's tangle or some of the features of the new coins built on ethereum or the utilisation of XRP by banks?
Could you discuss future possibilities such as decentralized exchanges?
Could you cover the topic of SegWit? Since Bitcoin core has adopted that as their first scaling code change post 1MB blocks, I think it would be helpful to understand what's going on with that.
Seems that some new coins such as Ethereum have relatively different designs than those of the old coins. It would be great if you could explain those a bit more.
no, i was talking about the internals of https://github.com/bitcoin/bitcoin (bitcoin script might seem low level, but there's an actual implementation of it!)
I'm a coauthor of this paper. It was published a couple of years ago; pleasantly surprised to see it here.
The paper shows that programmers have distinctive styles in their source code that can be extracted to create a fingerprint of coding style. It's not just the obvious stuff like spaces vs. tabs -- parsing the code and looking at the Abstract Syntax Tree is what results in a powerful fingerprint.
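To make the AST-feature idea concrete, here is an illustrative sketch, not the paper's actual feature set (the paper targeted C/C++ and used many more features): it computes two style signals, maximum tree depth and parent-to-child node-kind bigrams, using the TypeScript compiler API, both of which survive reformatting by linters.

    import * as ts from "typescript";

    function astFeatures(source: string): { maxDepth: number; bigrams: Map<string, number> } {
      const sf = ts.createSourceFile("sample.ts", source, ts.ScriptTarget.Latest, true);
      const bigrams = new Map<string, number>();
      let maxDepth = 0;

      const visit = (node: ts.Node, depth: number): void => {
        maxDepth = Math.max(maxDepth, depth);
        ts.forEachChild(node, (child) => {
          const key = `${ts.SyntaxKind[node.kind]}->${ts.SyntaxKind[child.kind]}`;
          bigrams.set(key, (bigrams.get(key) ?? 0) + 1);
          visit(child, depth + 1);
        });
      };
      visit(sf, 0);
      return { maxDepth, bigrams };
    }

    // Features like these (plus lexical ones) would then be fed to a classifier,
    // e.g. a random forest trained on code with known authors.
    console.log(astFeatures("function f(xs: number[]) { return xs.filter(x => x).map(x => x * 2); }"));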
I've been asked this a lot, especially since I also research and teach cryptocurrency technology [1]. I haven't personally tried to do that. Satoshi clearly wished to maintain their pseudonymity, and I'd rather respect that. I also think Satoshi's pseudonymity is a powerful statement about decentralized cryptocurrencies, namely that their viability rests on their technical merits, with no need to know or trust their creators.
That said, I don't blame people for trying to uncover Satoshi's identity, and it's possible that the techniques in our paper can help. The big caveat, of course, is access to a corpus that includes Satoshi's code labeled with their true identity.
Thank you for respecting that. I believe you understand.
I have to say I'm completely unsurprised at your follow-up result - I've always held that view myself. Code style isn't just about variable names and brace placement, it's also about the abstract design and how you choose to reduce the problem you want to solve to the method that you want to solve it with - and the choices made in that process carry through all the way to the object code, data structures, file formats and beyond. That stylometry may be possible to some degree from object code follows naturally from that viewpoint.
A little romantic as that might be, I've always felt that reverse-engineering can sometimes seem like, via the medium of what they've created, being a few steps removed from reading someone else's thoughts.
In response to your previous comment, my impression is that this may have come up due to a recent Medium article that was killed on HN.[1]
In response to your last remarks, the gist of the claim in the Medium article is that state agencies already know Satoshi's identity, and the method used was stylometric analysis of his prose (not code) cross-referenced with personal communications that NSA is purported to have access to. That's the claim, at least.
I've read the article and although entertaining it doesn't contain any substantial information or sources. So all in all it was an interesting anecdote.
By the way the hacker news transparency looks really cool!
> Satoshi clearly wished to maintain their pseudonymity, and I'd rather respect that.
Actually that was just a joke. I don't need their identity but sometimes I wish there was a way of discussing some design decisions that went into Bitcoin (I did a little bit of technical analysis of Bitcoin and was involved in electronic crypto-based asset projects before Bitcoin was conceived). Unfortunately that would make it harder for Satoshi to stay anonymous.
You're far too accusatory toward the author here. Various LEAs and other parties not associated with law enforcement have vested interests in pursuing this technology anyway. If it was possible, one of them was going to find it. It's very possible that similar discoveries were already made but kept secret.
At least now the rest of the world knows this is possible and can mount a defense if they feel it is necessary.
Even more tangent, this reminds me of a manga where there was a character who could guess your sexual personality based on your code (read right to left): http://imgur.com/a/FJPMS
Not the author, but I don't see why not. I'm sure you could apply some machine learning techniques that combine the stylometric features extracted with demographic information.
I've always suspected it was Hal Finney, but Hal passed away without clarifying whether he was or not. It would have been just like Hal to play such a prank :-)
Interesting. I can guess with high accuracy who wrote the code at work, sometimes by which PEP8 recommendation they've broken; sometimes by seeing an object mutated in two consecutive lines, trailing whitespace left on a line, or a ^M character, etc.
But:
> We have a follow-up to this paper showing that surprisingly, coding style survives in compiled binaries
Now that is interesting. Have you noticed similarities in coding styles for different demographics? Is there a Russian style, a Californian style? Or does it lend itself better to a "cohort analysis"? Surely this is one of the first applications that popped into your mind.
My "Surely this is one of the first applications that popped into your mind." and mentioning "demographics" was a priming attempt partly prompted by the authors' affiliations.
Another question is: how pertinent could this be and could it serve as evidence. Is it as "receivable in court" as fingerprints? If it can be "evidence", how to deal with tampering and entities trying to frame each other by mimicking each other's fingerprint to 'cover their tracks'? Could there be a "market" for fingerprints where you just download a set of fingerprints, write [and compile] your code, and then "apply" the prints to the code? Would there be tools to detect genuine, organic, fingerprints from synthetic or after the fact prints? Can you detect how "fresh" these prints are? Can there be a "glove" equivalent so your fingerprints don't show? Can there be a tool to remove your fingerprints after the fact? Can you apply multiple prints successively? This is indeed fascinating.
I'm just starting to dive into NLP and I'm curious if this type of technique could be viable for identifying the same users across multiple social media accounts.
If users have the slightest inclination to attempt to be distinct across accounts, wouldn't that defeat this sort of de-anonymization?
If for example I apply different linters or stylers, and don't write docs, and don't use camelCase, and don't write tests, won't I appear to be a different programmer? Can we link IOCCC entries with their author's regular day-job work?
Stylometry of regular text can be fairly reliably defeated even by people who don't know how it works. It remains interesting because most people don't do that.
The code stylometry in the linked paper is interesting because it also looks at things likely preserved by stylers and linters: they parse the code and look at things like the depth of the abstract syntax tree, or the frequency of certain AST-node bigrams.
Obviously there's opportunity for a lot more research here. But compared to other variants of stylometry, this approach might be fairly robust
I seem to recall a paper a couple of years ago showing that this can already be done quite successfully?!? Don't remember the details, unfortunately.
(It kind of seems like it would be "easier" in a way, since natural language doesn't have to obey any special syntax.)
Coauthor here. Here's some context for how this essay came about.
When we released a draft of the Princeton Bitcoin textbook [1], one piece of feedback was that we focused on cryptocurrency technology as it is today, and ignored the juicy and tumultuous history of how the ideas developed over the last few decades. So I invited Jeremy Clark, who's connected to some of this history, to write a preface to the book. If you're interested in the history, you might enjoy that chapter. [2]
Jeremy and I then got together to develop the ideas further, resulting in the present article, where we also provide some commentary on the current blockchain hype and draw lessons for practitioners and academics.
We need more people like you to write about and help demystify Bitcoin, to counter the hype surrounding blockchain technology.
I hope you get an opportunity to write about this for a lay audience too, because mainstream media, with few exceptions, has done a poor job at covering the technology.
When you were researching the ledger part, I'm curious whether you've come across a DAG-based ledger. I've been reading the Byteball [1] paper and still can't tell whether it's baloney or whether the DAG really is a consensus mechanism that does not require PoW... I suspect it's neither; there are trade-offs, but I could not find much of anything good on the subject to read.
A great DAGchain paper is "SPECTRE - Serialization of Proof-of-work Events: Confirming Transactions via Recursive Elections". It's peer-reviewed and contains rigorous security proofs.
One thing I found interesting about the conclusion of OP's article is the role of academia vs. practical implementation.
> Many academic communities informally argued that Bitcoin couldn't work, based on theoretical models or experiences with past systems, despite the fact that it was working in practice.
It will be interesting to see the academically grounded SPECTRE competing with another DAG-based coin such as Byteball: well-measured research and a peer-reviewed foundation against practical implementation, first-mover advantage, and continuous improvement.
Academia and industry both have filtering problems, how to tell good ideas from bad ideas.
The industry solution tends to be to try things and see what works in practice. This is extremely expensive in time and only a small number of ideas can be tried. Furthermore the success or failure depends on the execution and marketing. If Bitcoin had not had the developer commitment in the early stage it would be dead and forgotten despite the great ideas.
The academic solution is that ideas should come with detailed arguments about why the solution works, what its flaws are and how it compares to other work. This allows ideas to be compared and judged more quickly at a lower expense. However constructing these arguments is hard, requires rare knowledge and is not always possible.
Academics dismissed Bitcoin because it did not have these arguments. They had no way to know if it would work when it was running with real money on the line. Distributed systems ideas are very hard to get right, and Bitcoin had all sorts of quirks that Satoshi didn't foresee; however, PoW turned out to be a very robust mechanism.
Thank you, this is what I was looking for. It'll probably take the whole upcoming weekend for me to grok this, but the first thing that caught my eye was that SPECTRE still has PoW, while Byteball somehow claims that it isn't necessary... or maybe I'm misreading something. Thanks again.
The surprise here isn't that Bitcoin isn't perfectly anonymous. There are two new findings. The first is the extent to which your Bitcoin payment details get leaked to third party trackers. I've been writing about the excesses of third party tracking for years [1], and I'm pretty jaded, but the extent of the leaks surprised me.
The second main finding is that CoinJoin isn't enough to protect yourself. We tested this on our own transactions, but also by coming up with a way to identify essentially all existing CoinJoins on the blockchain and analyzing their anonymity.
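As a rough illustration of what identifying CoinJoins involves, here is a deliberately simplified heuristic (not the paper's actual criteria; the Tx shape and the minParticipants threshold are made up): look for several outputs of exactly equal value with at least that many inputs contributing. Real analyses layer further structural checks on top of something like this.

    interface Tx {
      inputs: number[];  // input values in satoshis
      outputs: number[]; // output values in satoshis
    }

    function looksLikeCoinJoin(tx: Tx, minParticipants = 3): boolean {
      // Count how many outputs share each exact value.
      const counts = new Map<number, number>();
      for (const v of tx.outputs) counts.set(v, (counts.get(v) ?? 0) + 1);
      const maxEqualOutputs = Math.max(0, ...counts.values());
      // Several equal-valued outputs, each presumably funded by a distinct participant.
      return maxEqualOutputs >= minParticipants && tx.inputs.length >= maxEqualOutputs;
    }

    // Hypothetical usage over a block's transactions:
    // const candidates = blockTxs.filter((tx) => looksLikeCoinJoin(tx));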
Even if you have a 100% anonymous cryptocurrency, you are spending it on a site that has information about you and your activities on that site. If you are dealing with physical items, it has your address.
Agreed. However, if someone is leaking your physical address, they really don't need bitcoin addresses at all. They can just join on your address :)
This made me curious about the feasibility of an anonymous postal service. One which you pay for with x-coin, with no identity attached, and you get physical deliveries at that address (placed into a box only you can access, maybe with some sort of private key).
With the security cams and all it might be hard, but probably not impossible if enough people are using it.
If you want "100% anonymity", you make sure that sites have no information about your meatspace identity. You certainly don't share your address. If you're leasing a VPS or whatever, you invent some persona for the account, and SSH via chained VPN services and Tor.
And connect that through a quadracopter mesh network to your local starbucks ISP. Also make sure you don't pick up your delivery until a month after it's shipped, just in case it's being watched.
The point is you can take as many precautions as you want, but you'll never attain 100% anonymity. You get diminishing returns after a while.
The main point is that having stuff delivered is the major risk. Using Tor through nested VPN chains is easy. I can't imagine "quadracopter mesh network". That would attract too much attention, I think.
Doesn't matter, nobody would know where you are, just the quadracopters. You could use dozens of drones if you want obscurity and to strengthen the mesh network. I was also joking.
Yeah, that is what I keep saying every time these "bitcoin is not anonymous" stories pop up. Ditch it and use a cryptocurrency that was designed, engineered and built with anonymity and privacy in mind, like Monero.
dang, would be nice to change title/link to the paper (https://arxiv.org/abs/1708.04748 for non-PDF link). Too much commentary here is reacting to the article title.
It's been clear for years that Bitcoin transactions aren't anonymous, and that web tracking is pervasive. To use Bitcoin anonymously, one must use mixers, such as Bitcoin Fog. CoinJoin just doesn't cut it.[0] With mixers, you get totally unrelated Bitcoin. Also, one must use VPN services and/or Tor, to avoid tracking. It's rather misleading to write a paper like this without letting users know how to do it right.
Note that this fingerprinting technique exploits differences in the behavior of the AudioContext API, but does not (and cannot) actually record audio.
Paper: https://webtransparency.cs.princeton.edu/webcensus/index.htm...
Demonstration (test your own audio fingerprint): https://audiofingerprint.openwpm.com
Discussion from 2016: https://news.ycombinator.com/item?id=11729438
Full list of websites where audio fingerprinting scripts were found (in March 2016): https://webtransparency.cs.princeton.edu/webcensus/audio_fp_...
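For readers who want to see the shape of the technique, here is a minimal browser-side sketch modeled on open-source fingerprinting libraries in general (not the specific scripts found in the census; the sample window and node parameters are arbitrary choices): render a fixed oscillator through a DynamicsCompressorNode in an OfflineAudioContext and summarize the output samples, which differ slightly across browser and hardware audio stacks.

    // Renders one second of a 10 kHz triangle wave through a compressor, entirely
    // offline -- no microphone access, no permission prompt -- then reduces the
    // floating-point output to a single number that varies by implementation.
    async function audioFingerprint(): Promise<number> {
      const ctx = new OfflineAudioContext(1, 44100, 44100); // 1 channel, 1 s, 44.1 kHz
      const osc = new OscillatorNode(ctx, { type: "triangle", frequency: 10000 });
      const comp = new DynamicsCompressorNode(ctx, {
        threshold: -50,
        knee: 40,
        ratio: 12,
        attack: 0,
        release: 0.25,
      });
      osc.connect(comp).connect(ctx.destination);
      osc.start(0);

      const rendered = await ctx.startRendering();
      const samples = rendered.getChannelData(0);
      let sum = 0;
      for (let i = 4500; i < 5000; i++) sum += Math.abs(samples[i]); // arbitrary window
      return sum; // in practice, hashed together with other browser features
    }

    // Usage: audioFingerprint().then((fp) => console.log("audio fingerprint:", fp));

The resulting value is stable for a given browser build and audio stack but differs between them, which is what makes it usable as a tracking signal without any permission prompt.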