It always amazes me how people are so free to judge others' situations.
It's great that your web app only costs $10/month, but others may have web apps that are more computationally intensive (e.g. video processing or ML inference), or that simply can't join everything they need at runtime.
And it's great that you're willing to deny those 50k users a day access to your service when that cheap VPS inevitably falls over. But others may be monetising that traffic and will want a HA solution so their revenue isn't impacted.
All of those add complexity and cost to an architecture.
TekMoi is right. I have an Alexa top 6k site which is vastly more complicated (media hosting, load balancing, multiple VMs, DDOS protection, transactional emails, automated backups) which costs $200 / month on AWS.
The fact that this person is spending nearly as much to support 50k users a day as I do to support more than 4 million cannot be hand-waved away by "people are so free to judge other's[sic] situations". The matter is worsened by the fact that the application is so simple that it doesn't even support user accounts. There is room for discussion here about efficiency in application architecture. More importantly, an article billing itself as "Costs of running a Python webapp for 55k monthly users" is silly because there is no way this is representative of anything. I'm afraid new hackers will be scared by the high costs listed here and be discouraged in their own efforts.
If you support 4 million monthly visitors on a media site, and have multiple EC2 instances running, I'd love to see a cost breakdown structure, because in my (obviously incomplete and possibly naive) calculations, the bandwidth alone would cost more than $200/mo.
CloudFlare covers the media bandwidth costs for a mere $20 / month. The uploading and media conversion is the difficult and costly (in terms of CPU) portion.
>CloudFlare covers the media bandwidth costs for a mere $20 / month.
Your media files must be extremely small.
I'm guessing they're less than 20MB on average, because a) CF hasn't shown you the door yet b) they don't even cache anything bigger than half a gigabyte.
Similar situation (sub $1k/month) to the setup I'm doing for a startup in Indonesia (top 7k in Alexa, top 150 in Indonesia), except we have to pay extra for media because we need to process the logs (long-tail pdf/data/docx hits), and we need to control the DNS and how other domains route to us, so we can't use Cloudflare. And there's still plenty of room to cost-optimize.
Another consultant that came in and tried to do this made our costs go up by 10x per month… so I'm not surprised when I see stuff like this here…
The knowledge of one's tools available at one's fingertips, and the relative costs of such, seems to make the difference for these things.
Let's leave this kind of snark out of this community. It's comments like this that slippery slope a community from helpful to harmful. You see it each time a reddit community gets too large.
No, what's harmful is "oh, just spend $50,000 on managed Kubernetes to run a Django web app". That costs real time and real money and makes young engineers think that a phpBB forum is impossible without a five-digit AWS bill.
Both you and TekMoi should probably value your own time higher. The cost savings are great, but once you spend 4 hours on configuring a database server, that'll probably be $200+ worth of your time and, thus, wipe out most of the savings.
It doesn’t cost anything to spend 4 hours of your time doing anything, so it doesn’t really wipe out any savings. Reducing a bill from $400 month to $200 is real money, not some theoretical time/value judgement.
And, as others have mentioned, there is enormous value in knowing how to operate on a fairly lean tech stack. It makes it so much simpler to scale effectively while keeping costs down.
This is only true if you do not value your free time. In your example, you've spent 4 hours to save $200, so your work was worth $50/hour. A freelancer with a $100/hour rate might do 2 hours of work instead, spend the money, and gain 2 hours of free time.
There are other factors of course, but in general, many people come up with a rate for their own free time (which is often higher than the actual rate they charge clients)
It’s not only true if you don’t value your free time. It’s only true if you’re a freelancer who is turning down hours at a higher rate than what you’re saving.
Many people working on products, both on their own and within a company, aren’t turning down other profitable work to optimize existing solutions.
If I watch a two hour movie instead of spending two hours saving $100/month, it doesn’t matter how much I value my time, no one is paying me $100/hour to watch a movie.
There's a difference between the initial example and watching a movie. That's what I summed up with "there are other factors (than money)". If I'm only interested in the _outcome_ and I treat the way to achieve it as work, I definitely weight the time cost vs benefit (and I am not a freelancer). If it is something I enjoy doing (which may or may not be true for the initial example), I'll take this into account as well. Time is a limited resource, and I treat it as such.
Except that now you know how to configure a database server, and know how your database server is configured. So if you ever get a problem with the database in the future, you'll know how to solve it faster. Which will save you time and money in the future.
And you're less likely to make mistakes like upgrading your server instance to try and solve a problem that can't be solved like that.
And the more you do it, the cheaper and faster it gets. That knowledge and skill has value.
I value four hours of my time at considerably more than $200 and the reason I'm able to do that is because I know things like how to configure a database server.
But your argument is nonsensical because a one-time investment of time (much less than four hours for me, but I've been doing this for 20 years) can save you several hundred dollars a month. AWS AuroraDB, for instance, (which I also know how to configure, by the way) has much higher latency than a hand-rolled instance and will cause bottlenecks throughout your code in a DB-driven application. If I hadn't experienced the difference firsthand, or had failed to profile my app's performance adequately, I might assume I need to solve the problem by spinning up more ec2 instances to distribute the load. I've had the misfortune of working with a company that had exactly that problem and knowing how to spin up a new DB server saved the company thousands of dollars a month and took considerably less than four hours. Transferring a 3TB database to a new server without downtime did take considerably longer, however, but I was being paid hourly anyway, and it was still a worthwhile investment for the company which saved considerably more than my fee.
Any tradesperson should know their tools. A programmer is no different, and if you don't know how to use your tools because you "value your own time higher", then thank you: you're the guy who ends up getting me called in to fix things at a much higher hourly fee.
the main issue I have with the DIY mentality is that, for me, it's endless, and even more so, it's an unpaved path, ad-hoc, with few integrated well defined paths. we've got lots of open source software, but the ops of being online is hard fought experience.
* `apt-get install postgresql` would have worked fine for my needs on my VPS. oh but i need roles, so let's start an ansible playbook. maybe let's tweak some settings. fine, still all short, easy to do.
* ahhh i should probably have backups. how am i going to manage that storage, where is that going to live?
* then i introduce a new feature & my database is running slow. explain query helps, but i also could use some metrics for these boxes, so probably need to start thinking about prometheus & node-exporter, &c.
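for the backup item in that list, a rough sketch in python of the nightly pg_dump job you'd end up cron-ing (db name and paths here are placeholders, adjust for your own setup):

```python
import datetime
import subprocess

def backup_command(db_name, backup_dir="/var/backups/pg"):
    """Build a pg_dump command for a timestamped, compressed dump.

    db_name and backup_dir are placeholders; adjust for your setup.
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    outfile = f"{backup_dir}/{db_name}-{stamp}.dump"
    # -Fc: custom (compressed) format, restorable later with pg_restore
    return ["pg_dump", "-Fc", "-f", outfile, db_name], outfile

def run_backup(db_name):
    cmd, outfile = backup_command(db_name)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return outfile
```

then the "where does that storage live" question is still yours to answer: rsync/upload the dump somewhere off-box, and prune old ones.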
i am radically in favor of a) personally facing these challenges and b) open-sourcing the operational knowledge & tools for setting up AND OPERATING systems.
yet at the same time i also think spending $171/mo for a year is an exceedingly wonderful option to have on the table. running my own servers is, to me, a lifelong project, something i want to deeply invest in. there's plenty of ways to go about it that aren't so arduous (k8s+postgres-operator+rook+tbd monitoring+tbd directory-services), but that willingness to keep engaging, supporting, maintaining, scaling things can be a very serious concern that extends well past the time it takes to set a database up: it's an ongoing "giving a shit" burden even when (seemingly) working fine.
being willing and able to hack through is great, and i am all for the coalition of the willing who elect to march through, hopefully not getting bogged down along the way. but wow if you are trying to start a business, it sure is nice being able to pay someone to spin up, back up, monitor, scale some services for you.
i hope some day "we" are better at such things, systematically, i hope open source ops helps give us better paths to doing these kinds of things easily, safely, observably, resiliently. we're not there yet. but wow, this challenge to me- how we move open source from an older "software" model to an online service model, that empowers people to set up online systems as easily as opening an editor, that's the challenge at the heart of open source today. it's one that needs a lot more effort, a lot more work, such that we have good ways to stand up & keep up a database server.
Being well versed in setting up your own database server, bereft of cloud-provider hand-holding, easily pays for the initial time investment over time. I don't think most understand just how much a mastery of the basics is capable of generating in value when you're essentially vendor-lock-in proof. There is so much blindness to the voluntary hanging of one's arse out the window that vendor overreliance creates.
It says in the article that the author gets 34k daily users, and 50k unique users/month. It would have been clearer if the author had talked about sessions (which are therefore > 1M/mo) for sure, but you're still making a very big (and invalid) assumption.
EDIT: Please disregard the above. I need an eye test, or maybe just to put my glasses on! Daily users are 3.4k (3400), not 34k. My apologies, I take it all back!
I'm such a klutz - sorry. This is what I get for reading HN when I'm still in bed and without putting my glasses on first. This is a terrible habit that I need to break.
Or 0 per minute and 25,000 per hour for two days a month. Traffic can be bursty; don't assume that X/month means they're getting exactly X/30 per day.
What are the right resources you would suggest to someone who has to set up his servers properly? I would really appreciate it if you could refer me to some books/videos/articles. Thanks.
If you're talking about serving many requests cost-effectively, then really the problem is not the servers (except over-provisioning, which is rampant. Learn to use tools like AWS' auto-scaling system instead) - it's the code.
If you understand the basics of algorithmic time complexity (that's your Big-O notation) and profiling your code then you're ahead of 98% of other developers in practice. I'm constantly amazed at how many developers think adding more libraries, newer frameworks, or more layers of tooling will magically speed up their code because "it's so fast". If you actually time things you'll find out doing it the "slow" way is frequently an order of magnitude faster.
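To make that concrete, here's a quick illustrative timing in Python: the exact same membership test done against the wrong data structure is orders of magnitude slower, and no framework or extra library will paper over a bad complexity class.

```python
import timeit

# Same lookup, two data structures: a list is an O(n) scan,
# a set is an O(1) hash lookup.
items_list = list(range(100_000))
items_set = set(items_list)

# Searching for a missing element forces the list to scan everything.
list_time = timeit.timeit(lambda: -1 in items_list, number=200)
set_time = timeit.timeit(lambda: -1 in items_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

The absolute numbers depend on your machine, but the ratio is typically thousands of times in favour of the set. Profiling tells you where this matters in your own app; timing tells you whether a "fix" actually fixed anything.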
To be fair, the post shows that about two-thirds of that spend is from the decision to run twice the needed capacity to be able to do blue/green deployments, and to cloud-host their Metabase analytics.
And it explains there's currently zero revenue.
So it seems fair to judge the situation there based off those pieces of information we've been given.
As I posted elsewhere, the OP's choice to run dual redundant green/blue capable instances and cloud hosted metabase might have good reasons, but right now those reasons are not "wanting a HA solution so revenue isn't impacted"...
That's not how green/blue deployments work. You don't keep both colors up unless you have completely failed to understand the concept.
Green/Blue is all about saving resources and costs, not keeping them around. You misread the cause here. It has nothing to do with deployment strategies.
That's not how I read what the OP's doing in the article. Sure, maybe he's not doing "proper blue/green", but that is what he uses to explain running a duplicated pair of web/app servers full time...
I have a web app that's struggling if there are more than 2 concurrent users per CPU core. It's displaying incompressible large-resolution images with <100ms latency.
EDIT: not sure why I'm downvoted, I'm just presenting my use case. They are multigigabyte images encoded with custom wavelet compression that are cut into tiles (think google earth), each user needs 5-10 tiles every second
My guess would be that your webapp is serving images in a blocking fashion, meaning every time a user requests an image it will fetch the image for the user AND block the HTTP-serving thread until the image is sent. Can you provide more context (e.g. tech stack)?
What's blocking is the number of HTTP connections between browser and server. Most browsers only allow 6, and each tile takes about 100ms, so getting 10 in under a second doesn't always happen.
Do you know if the old open street map trick of having multiple tile servers (that are each just aliases of the same server) still works? I think this was how they tried circumventing the 6 connection limit
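For what it's worth, the shape of that trick is just deterministic host sharding (hostnames below are invented; under HTTP/2 this workaround is largely unnecessary, as a sibling comment notes):

```python
import hashlib

# Hypothetical aliases that all resolve to the same tile server.
TILE_HOSTS = [f"tiles{i}.example.com" for i in range(4)]

def tile_url(z, x, y):
    """Map a tile to one of N host aliases, stably.

    Hashing the tile key keeps the host choice deterministic, so the
    browser's per-host connection limit is spread across aliases while
    each tile still lands on one host and stays cacheable.
    """
    key = f"{z}/{x}/{y}".encode()
    host = TILE_HOSTS[int(hashlib.md5(key).hexdigest(), 16) % len(TILE_HOSTS)]
    return f"https://{host}/tiles/{z}/{x}/{y}.png"
```

The important part is stability: a random choice per request would defeat browser caching.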
Your webserver still has to spin off a thread for each request if you want to do substantial CPU work for each request, but rest assured you'll get all the requests at once from the browser. Not 6 at a time like in the dark ages
If you want to provide uninterrupted service to your clients, you’ll have to spend some $. You want to have redundancy, machines hosted in different locations, backup prod servers, monitoring, analysis tools. Even if it is for 1k monthly users, if you want reliability, it will increase the costs.
In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime than a single VPS. The reason is probably that there are more moving parts and more has to be maintained / can go wrong.
These days, a single VPS in the right datacenter has excellent uptime.
Agreed. I'll probably be downvoted, but these setups strike me as people who prefer to drink the Kool-Aid rather than be pragmatic and only use what they need.
I've also had very high reliability rates with a single VPS. They've actually given me less downtime than AWS services at times.
At work, I aim for four nines. (We put three nines in the legal paperwork).
I can't hit four nines reliably with single VPS platforms on my typical workloads, I need load balancers and redundant app servers. I could quite likely hit three nines using single VPSes. But if a client wants 99.9% SLAs, they'll be paying for HA and I'll deploy redundant ec2 instances, multi region RDS, and an ELB. And charge them 3 or 4 times what the OP is spending for it. (And I'll almost always deliver 99.99% availability.)
For my stuff, or friends, or people I'm doing cost-saving favours for, I'll explain how much extra it costs to guarantee less than an hour of downtime a month, the realistic expectations and historical experience of how much downtime a non-HA platform might have in their use case, and often choose along with them a single VPS (or even dirt-cheap cPanel hosting) while understanding and accepting the risks associated with saving upwards of a couple of hundred bucks per month.
I think ec2 gives 99.99% availability in their SLA, no need to scale across regions or even AZs. Multi AZ RDS is 99.95%. We have a simple ELB/EC2/RDS/S3 stack on us-east-1 and need high availability for a very small amount of users and run very cheap.
A single VPS set-up might be OK for serving content over web, but in my experience, the pain begins when your software starts doing async processing - long-running cron jobs, queue processing. If you're doing it on your web server machine, there will be downtime.
I know this, because I have gone through these issues with each of my projects. Just recently an infinite loop bug in a cron job ground my "single VPS" setup to a halt (and took the web server with it).
> In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime than a single VPS. The reason is probably that there are more moving parts and more has to be maintained / can go wrong.
> These days, a single VPS in the right datacenter has excellent uptime.
Again, maybe in your experience but that's not universal. There's literally no redundancy with running everything off a single VPS and if that datacenter has network or hardware problems, then your service is down.
Is redundancy necessary for the scale of OP's app considering it provides 0 income? Most likely not, but that's a decision they've decided on and there's nothing wrong with that.
What does excellent uptime mean in your book? With Digital Ocean's AMS2 region I had regular downtime every few weeks, and while I'm alright with it, if I had another VPS in another datacenter it would've had next to no effect on the customer experience. But an hour or more of downtime every two weeks isn't excellent.
https://aws.amazon.com/message/41926/ this lasted hours and affected almost everyone using us-east-1, large portion of internet was unavailable because they had no multi-region setups.
Two Hetzner CPX31 boxes sounds like it'd do just fine here too, providing the redundancy you mention for a fraction of the cost. Or get the boxes from different companies, for the same sort of overall price.
Yes some of the other tools could arguably be worth paying for, but if the author's concern is that he's short on money and $140 is a lot, why didn't they KISS and only use what they need? Then scale as and when needed in the future.
And $140/month is pretty good value there probably... Even if that's just being able to point potential employers/recruiters at this blog post as evidence of experience building and running an HA website with more-advanced-than-free-Google-Analytics user behaviour tracking.
If your 55k MAU want uninterrupted service, they need to be paying for it (in dollars or monetisable attention and/or privacy).
On a site currently generating zero revenue, I hope the OP is happily enough paying most of that $145/month as a learning experience or for resume bullet points (which are perfectly valid was to spend your money). They've admitted elsewhere in the comments that the two $40/month droplets are way oversized (from an attempt to solve a problem that turned out not to be droplet size/resource related) - so without redundancy and without AWS hosted metabase, this would be about $100/month less expensive to run.
I still think that's over-provisioned or under-engineered. Like others have commented, I'd be surprised if the features you can see on the site require any more than the $15/month the FAQ claims it costs to run, plus perhaps the $10/month Disqus expenses. That seems about where a hobby/side-gig project should sit for a lot of devs before you start thinking about how to make it pay for itself... YMMV, especially if you're not comfortable earning at least junior dev salary already in some reasonably well paying part of the world.
I get what you're saying, but I think you're being a bit harsh.
In my mind, choosing a dual redundant prod platform so you can do blue/green deployments is _totally_ unnecessary for a 55k MAU site generating no revenue. Same with cloud hosted metabase. You could run that on your own hardware - a spare laptop or probably even a raspberry pi for effectively nothing.
On the other hand, a hobby project/side gig where you can demonstrate real world experience in those two things could _easily_ pay off in the first week of a new job it helps you land.
If it's _just_ that extra $100/month they're spending there, it seems difficult to justify. If it's commercial experience doing that which is helping him try and land a job paying 20k or more extra per year? That's totally money well spent, in my opinion...
Given that he's complaining about the price I don't think the wasteful costs here are justified. I'll also note the article was just edited and the costs are now up to $171 / month.
This app could easily be run on the AWS free tier, although even in the paid tier he could probably be managing a lot more than this workload for under $40 / month. (That's for two servers - which as has been pointed-out is wholly unnecessary for an app of this size.) The price he's paying is presently listed at $95.
Maybe I read his conclusion differently, but he said "would be peanuts" about the cost and "The bigger issue is that on the revenue side there’s a big fat zero."
Seems to me he's acknowledging he's built a thing that requires generating revenue to support itself, but that he's neglected the revenue generation part of his project, rather than complaining too much about the price of running it?
I'm mostly agreeing with you (and tristanperry and TekMol), but I'm probably being more sympathetic and ascribing unsupported motivations for why he's happy enough building it this way and spending this much money to run it. (Probably because I've been there before myself, and sometimes that expensive hobby project has paid off, sometimes it hasn't. I've never spent so much that I've seriously regretted any of my failures though...)
If we were on a site called "CV News" that teaches people how to get jobs in dull corporations, I might agree with you. On the other hand, I would not frequent such a site. So we would never have had this discussion.
You can apply the skills that you learn in any job, not just the dull ones. The skills that you learn by implementing this kind of setup are valuable to lots of interesting jobs I'd say.
Not to sound mean, but if there aren't posts/blogs/whatever explicitly telling people how to minimize costs... then they'll continue to follow the ones that make them pay $100+.
Don't do blue-green deploys when you have no revenue. Downtime costs you nothing and at 3,400 visitors a day you'll be lucky to drop a single request while deploying.
Pre-compute and cache to reduce your need for beefy servers.
When you can run something on your machine or in the cloud, choose your machine.
EDIT: It's less about saving money and more about not spending it.
Caching is, in general, a good thing. But it's worth thinking about how you do it. I recently sped up a large system by dropping the use of redis entirely - because it was being badly used.
During a single request there might be 50+ cache-lookups, each taking a round-trip to a remote redis server to fetch a single key at a time. Batching those up to a set/hash would have been more efficient, but the codebase had evolved in such a way as to make that difficult.
Instead of making 50+ redis fetches it turned out that just fetching all the stuff from the database was faster.
(There will be refactoring to batch up the key fetches, but for the moment there was a measurable increase in performance under current loads just by removing redis.)
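To put numbers on the round trips, here's a toy simulation (not real Redis, just counting trips; with real Redis you'd reach for MGET or a pipeline):

```python
class SimulatedCache:
    """Toy stand-in for a remote cache that counts network round trips."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        # One round trip per key.
        self.round_trips += 1
        return self.data.get(key)

    def mget(self, keys):
        # One round trip for the whole batch.
        self.round_trips += 1
        return [self.data.get(k) for k in keys]

cache = SimulatedCache({f"k{i}": i for i in range(50)})
keys = [f"k{i}" for i in range(50)]

naive = [cache.get(k) for k in keys]   # 50 round trips
trips_naive = cache.round_trips
cache.round_trips = 0
batched = cache.mget(keys)             # 1 round trip
```

At ~0.5ms per round trip on a LAN, the naive version burns 25ms of pure latency per request before any actual work happens, which is exactly how a database query can end up winning.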
Perhaps, yes. It was the latency of the network calls that was killing performance, so something local, or even redis on localhost, would have been better.
My postgres process doesn't come close to using up enough resources to push me out of even the cheap VPS tiers and I don't have to worry about locking if there's a heavy write load.
I think the point here is that a SQLite setup would provide satisfactory results at this scale.
If the choice is between a PaaS database offering and SQLite, you can pick SQLite. If you have the skills / are prepared to manage the DB server yourself, then do that.
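If you do go the SQLite route, a minimal sketch: enabling WAL mode is the one knob worth knowing, since it lets readers proceed while a writer is active, which covers a surprising amount of small-site traffic.

```python
import os
import sqlite3
import tempfile

# WAL mode needs a file-backed database (it doesn't apply to :memory:).
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# Write-Ahead Logging: readers no longer block on writers.
conn.execute("PRAGMA journal_mode=WAL")

conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts (title) VALUES (?)", ("hello",))
conn.commit()

rows = conn.execute("SELECT title FROM posts").fetchall()
```

For a read-heavy site at this scale that's often the entire database setup, with backups reduced to copying one file.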
If he serves hundreds of thousands of monthly users and makes a million a month with a single VPS, you certainly won't have scaling issues when you start out or just have tens of thousands of users.
Step 2:
If you really scale beyond that, resist the urge to bloat your stack. Think long and hard about every piece you add to it. Really understand each piece you add to it. Don't fall into the trap of paid services. Don't fall into the trap of "best practices".
Haha, the man is clearly a beast of a product designer and executor. I don't think I would succeed with the same tools.
Honestly, spending $200/mo is insignificant to me. And I'm pretty happy to answer the question of "Why can this guy build this thing on a single VPS and you can't?" with "Well, because he's better than me".
I can be up in 15 mins on Heroku with a Rails+React web app. And in an hour have a thing. Or for a static no-login thing, faster with Netlify.
But it doesn't matter. Because I never made a product as nice as the one he made. If the outcome is I'm -$200/mo that is irrelevant to me. If the outcome is I'm +$50k/mo that is very relevant. So I'm going to optimize for how I can do the latter.
I don’t know how many times I have seen “best practices” used as an excuse to avoid thinking. Usually by people who don’t even have the problem that the supposed “best practices” are meant to solve.
While this article has steps in the right direction, it's a lot of "use this thing". I operate a simple Flask/PostgreSQL webapp for CS Education through the luxury of my university. If/When I graduate from my program, what are the aspects I can do to minimize my costs for hosting the app? Of course I'll look for best approaches but why does that need to be forbidden knowledge known to a select few?
It's an aside, but the mentality of "let them figure it out" is a major issue in education. Foundational knowledge should be easy to acquire so I can worry about higher-level thinking issues. Literally spending hours trying to figure out how to set things up through Googling doesn't really help that, nor does it promote the "figuring it out" people think it does - it's just stumbling upon the right set of commands that lets me move past this particular hurdle.
From the devops perspective, what about telling someone how to set up their own server to do/minimize X is so taxing?
- cloudflare free tier for caching, DNS, page rules, etc
- run everything on one VPS(digital ocean, linode, etc) pick cheapest that has specs you need
- any non-trivial storage (media, big files) moves to Backblaze B2; it's cheap (you can use free-tier Cloudflare Workers to redirect to B2 for free bandwidth via the Bandwidth Alliance)
- free static page from Netlify (I can redirect to this with Cloudflare in case my VPS falls over or something, to provide info/links)
- If I want to look at logs or something I rsync them to my local machine (if I cared I could set up a process to push logs/backups etc to a private B2 bucket)
You may not need exact same setup, I am optimizing for caching and cheap storage because my site stores/serves lots of media files.
>Literally spending hours trying to figure out how to set things up through Googling doesn't really help that, nor does it promote the "figuring it out" people think it does - it's just stumbling upon the right set of commands that lets me move past this particular hurdle.
That's basically all of software development for your entire career. I've never had a day or a week that wasn't like that.
That is true, and I recognize the value of being a proficient Googler; however, my concerns stem from the fact that many of those skills are not taught at all, or are expected to be learned in situ through programming assignments.
Here are examples of what I mean:
- An undergraduate Networking/Security course may not provide practice on appropriately salting passwords. It is merely discussed as part of some larger conceptual model. Students are browbeaten in earlier courses to not simply copy/paste code they find on the internet
- Debugging practice has to come from the student's own generated code, but if they made a mistake, they already are showing they do not fully grasp the material. There have been efforts to explicitly train debugging [1] but they are still in early stages of researching their benefits.
I don't think they need to be explicit courses, since that means adding credits and charging students more. Rather, my research is about providing exercises specifically targeting those lower-level skills. Since many of them are only a fraction of the "programming problem", they do not require the expected hours and can be completed quickly. My hypothesis is that doing these types of problems will help reduce time on task for coding.
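As an example of the kind of small targeted exercise I mean, the salting point above fits in a dozen lines of stdlib Python (illustrative only; in production a vetted library such as argon2 or bcrypt is preferable):

```python
import hashlib
import hmac
import secrets

def hash_password(password, iterations=200_000):
    """Derive a PBKDF2 hash with a fresh random salt per password."""
    salt = secrets.token_bytes(16)  # unique salt defeats rainbow tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=200_000):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(candidate, digest)
```

An exercise like "explain why two hashes of the same password differ, and why that's the point" takes minutes, not the hours a full assignment costs.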
If I have a "million a month" business (which is coming from rich recruiting fees on a trivial website, not amazing tech), no way am I sweating $171/mo in server bills.
Basically, most people don't do that to save costs, but because it makes sense to them. And there are lots of tutorials explaining how to run your own webapp.
4+k daily visitors on one of the biggest fan sites for a popular mobile game, and I'm still running it with sqlite as DB backend and no caching on a shared VPS.
The additional expense in the OP had nothing to do with scale; it arises from redundancy, analytics, and off-the-shelf integrations.
It is a common misunderstanding amongst software engineers that infrastructure serves "performance". Most of the complexity comes from redundancy and analytics (realtime analytics especially).
Right, but how much do these nice-to-haves add to the revenue? Is it truly worth it to bake them all in from the very beginning? E.g. five minutes of downtime in morning hours will affect only a handful of users, and if so, why bother with blue-green deployments other than out of professional interest?
the servers are oversized for the load we're currently seeing. The reason for that is that we tried to solve a production issue by increasing the server specs. It didn't solve the problem, and now we can't down-size the servers without re-provisioning them.
I have found the same with Digital Ocean; it is hard to downsize, but that is only true because I have so much unicorn data, and a move has become very painful, so I'm stuck paying for 2x the storage cost.
You should really learn to add new servers by provisioning fresh ones and removing the old ones. Your app seems a perfect fit for moving between servers.
User count matters less than the amount of data and the amount of processing power required per user. Web servers are extremely efficient, so if most of that occurs within the context of requests, it's easy.
I have web applications running that reliably serve 50k users per day and cost me $10/month, running on a single, cheap VPS.