I still find the idea of CDNs repugnant. No matter how you slice it, you rely on an external resource for important parts of your application. "What if it goes down?" is one question. But you should also be asking yourself what happens if it gets hacked. There are also user privacy issues, which get completely overlooked in the chase to shave several milliseconds off request time.
A much better architecture would be to serve JavaScript from your own server by default, but allow for distributed content-based caching. For example, your script tag could look like this:
<script src="some.js" hash="ha3ee938h8eh38a9h49ha094h" />
The hash would be calculated from the content of the file. The browser could then fetch it from whatever source it wants. Users could cache files locally (across websites) without needing to dial into a CDN every time. You could even use a torrent-like network for distributed delivery of popular script libraries.
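A minimal sketch of the verification step this would require, assuming a SHA-256 hash (the `ha3ee...` value in the example tag is just a placeholder): the browser accepts a copy from any source only if its hash matches the one declared in the tag.

```python
# Sketch of content-based verification for the proposed hash attribute.
# SHA-256 is an assumption here; the proposal doesn't name an algorithm.
import hashlib

def verify_script(content: bytes, declared_hash: str) -> bool:
    """Accept a fetched file only if its hash matches the tag's declared hash."""
    return hashlib.sha256(content).hexdigest() == declared_hash

library = b"console.log('hello');"
declared = hashlib.sha256(library).hexdigest()  # what the site would publish

assert verify_script(library, declared)        # genuine copy: accepted
assert not verify_script(b"evil()", declared)  # tampered copy: rejected
```

Since the hash pins the exact bytes, it no longer matters which mirror, peer, or cache the file came from, which is what makes the torrent-like delivery idea safe.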
It's not just a few milliseconds though. For example at https://starthq.com, we are based in Finland, but host on Amazon in US East. A round trip to the US is 200ms+ whereas with CloudFront it's 8ms. Before we used a CDN our page took a few seconds to load - now it takes around 200ms.
I should also mention that all this happens only on first load. We embed etags in the URLs and use far-future Cache-Control expiry dates, so subsequent page loads get the JS and CSS from the browser's cache.
I think there's confusion here about the use of the term CDN. There are public CDNs, like Google AJAX APIs, that allow a shared copy of an open-source library to be downloaded from a known-good location. This lets users reuse the same copy their browser has already cached across multiple pages, but as romaniv and the OP have pointed out, you are then trusting Google to be a good steward of that resource.
Conversely, you control what shows up on your own private CDN, like CloudFront. Sure, there may be downsides outside of your control, but nobody is going to be able to alter the resources there without your permission.
> Conversely, you control what shows up on your own
> private CDN, like CloudFront. Sure, there may be
> downside outside of your control, but nobody is going
> to be able to alter the resources there without your
> permission.
Well, CloudFront could, since they control the machines that your users are connecting to.
I don't want to imply that you personally shouldn't use a CDN, but the page you linked to loads 43 files. If you consolidated, removed links to, or inlined some of them, the difference with and without a CDN would likely be much smaller.
Actually the core StartHQ app is less than 10 files: the libraries and application each have their own JS and CSS files, then there's Font Awesome and a couple of images loaded by Bootstrap. The rest is one third-party analytics JS file and iframes for social media sharing buttons, which don't block page rendering and which we don't control.
Isn't the concern of downtime and hacking only relevant if you have reason to believe that they are more likely to happen with the CDN than with your own servers?
If you host some stuff on server 1 and some stuff on server 2, and you need both to function, then you have two points of failure.
This is kind of a simplistic argument: increasing the number of "points of failure" doesn't, or shouldn't, increase your odds of failing. Adding cache layers and CDNs may add servers to your architecture, but it should also be done in a way that reduces overall downtime.
Increasing the number of points of failure absolutely does increase the chances of failure, even if each 'point' is more reliable. 1 server with a 98% uptime is more reliable than 5 servers with a 99% uptime, if all 5 have to be working for everything to 'work'.
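The arithmetic behind this claim checks out: when every component in a chain must be up, their availabilities multiply, so five 99% components are jointly less available than one 98% component.

```python
# Availability of components in series multiplies.
one_server = 0.98
five_servers = 0.99 ** 5  # all five must be up simultaneously

# 0.99^5 is roughly 0.951, i.e. worse than the single 98% server.
assert five_servers < one_server
```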
While I mostly agree with what you're saying... the truth is, why worry about Google getting hacked and the attacker modifying their CDN copy of jQuery (or leaking user behaviour and identity), when I'm already letting them load code that's unknown to me in ga.js?
To me, it's a trade-off I've chosen, and while I'm not 100% comfortable with having the availability of my site depend on Google, "repugnant" is _way_ too strong a word to describe the downside of the pragmatic choice I've made.
That's a great idea. The src should still point at the CDN though, and the browser should download the file into something like local storage and make it available in the future.
No, if you want to protect the user's privacy, the source has to be your site, otherwise you're giving the CDN info about your customer. With the hash mechanism, 500 sites could share the same library, but only the site the user is visiting can ever know the user visited that site. Sure, one of those 500 takes the performance hit on the initial cache load, but averaged over all visitors to all sites, that's probably a comfortable trade.
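The sharing argument can be sketched as a content-addressed cache keyed by the hash rather than the URL; the API names here are hypothetical, just to show why only the first site pays for the fetch and no third party learns who visited whom:

```python
# Sketch of a cross-site, content-addressed script cache. The cache key is
# the file's hash, so different sites referencing the same library share
# one cached copy without any of them routing the user through a CDN.
import hashlib

cache: dict[str, bytes] = {}  # hash -> content, shared across all origins
fetch_count = 0

def fetch_from_origin(url: str) -> bytes:
    """Stand-in for a real network fetch from the visited site itself."""
    global fetch_count
    fetch_count += 1
    return b"/* jquery */"

def load_script(url: str, declared_hash: str) -> bytes:
    if declared_hash in cache:            # cache hit: no network, no tracking
        return cache[declared_hash]
    content = fetch_from_origin(url)
    if hashlib.sha256(content).hexdigest() != declared_hash:
        raise ValueError("hash mismatch")  # tampered copy, reject it
    cache[declared_hash] = content
    return content

h = hashlib.sha256(b"/* jquery */").hexdigest()
load_script("https://site-a.example/jquery.js", h)  # first site pays the fetch
load_script("https://site-b.example/jquery.js", h)  # later sites hit the cache
assert fetch_count == 1
```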
I think it would take Mozilla or Apple to push this. Google probably has too much skin in the CDN-info-gathering game.