
So there are 2 kinds of big tech monorepos.

One is the kind described in the article here: "THE" monorepo of (mostly) the entire codebase, requiring custom VCS, custom CI, and a team of 200 engineers supporting the whole thing. Uber and Meta and I guess Google do it this way now. It takes years of pain to reach this point. It usually starts with the other kind of "monorepo":

The other kind is the "multirepo monorepo" where individual teams decide to start clustering their projects in monorepos loosely organized around orgs. The frontend folks want to use Turborepo and they hate Bazel. The Java people want to use Bazel and don't know that anything else really exists. The Python people do whatever the Python people do these days after giving up on Poetry, etc... Eventually these might coalesce into larger monorepos.

Either approach costs millions of dollars and millions of hours of developers' time and effort. The effort is largely defensible to the business leaders by skillful technology VPs, and the resulting state is mostly supported by the developers who chose to forget the horror that they had to endure to actually reach it.



As a former IC at a large monorepo company, I preferred monorepos over polyrepos.

It was the "THE" monorepo, and it made understanding the company's service graph, call graph, ownership graph, etc. crystal clear.

Polyrepos are tribal knowledge. You don't know where anything lives and you have no way to discover it. Every team does their own thing. Inheriting new code is a curse. Code archeology feels like an adventure in root cause analysis in a library of hidden and cryptic tomes.

Polyrepos are like messages and knowledge locked away inside Discord or Slack channels with bad retention policies. Everything atrophies in the dark corners.

If monorepos cost millions, I'd say polyrepos do just the same in a different way.

Monorepos are a continent of giant megafauna. Large resources, monotrophic.

Polyrepos are a thousand species living and dying, some thriving, some never to be known, most completely in the dark.


Every time I've seen monorepos compared with polyrepos, it's always "monorepo plus millions of dollars of custom tool engineering" vs "stock polyrepo".

Why can't we add millions of dollars of tool engineering on top of polyrepos to get some of the benefits of monorepos without a lot of the pain? E.g. it wouldn't be too hard to create "linked" PRs across repos for changes that span projects, with linked testing infrastructure
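Even without forge support, the gating rule is simple to state. A toy sketch (all names hypothetical) of what "linked" could mean: the PRs merge all-or-nothing, and only once every leg is green:

```python
from dataclasses import dataclass, field

@dataclass
class PullRequest:
    repo: str
    number: int
    ci_green: bool = False

@dataclass
class LinkedChange:
    """A change spanning several repos: its PRs merge all-or-nothing."""
    prs: list = field(default_factory=list)

    def mergeable(self) -> bool:
        # Allow the atomic cross-repo merge only once every linked PR passes CI.
        return all(pr.ci_green for pr in self.prs)

change = LinkedChange([
    PullRequest("org/frontend", 101, ci_green=True),
    PullRequest("org/backend", 202),          # still red
])
assert not change.mergeable()
change.prs[1].ci_green = True                 # backend CI passes
assert change.mergeable()
```

The rest is plumbing: a bot that opens the PRs from one logical branch name and merges them together when `mergeable()` flips.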

And I don't see how discovery changes significantly from browsing through loads of repositories instead of loads of directories in a repository


> Every time I've seen monorepos compared with polyrepos it's always "monorepo plus millions of dollars of custom tool engineering" vs "stock polyrepo"

The costs of the infra/build/CI work are of course more visible when there is a dedicated team doing it. If there is no such central team, the cost is just invisibly split between all the teams. In my experience this is more costly overall, due to every team rolling their own thing and requiring them to be jack-of-all-trades in rolling their own infra/build/CI.

> And I don't see how discovery changes significantly from browsing through loads of repositories instead of loads of directories in a repository

If repository permissions aren't set centrally but every team gets to micromanage them, then they usually end up too restrictive and you don't get even read-only access.


Great call out. Amazon has an extremely effective polyrepo setup and it’s a shame there’s no open source analog. Probably because it requires infrastructure outside of the repo software itself. I’ve been toying around with building it myself but it’s a massive project and I don’t have much free time.


The Amazon polyrepo setup is an engineering marvel and a usability nightmare, and it doesn't even solve all the major documented problems of polyrepos. The "version set" idea was probably revolutionary when it was invented, but everyone I know who has ever worked at Amazon has casually mentioned that their team has at least one college hire spending 25%+ of their time keeping their dependency tree building.


This really shouldn't be the case anymore: as of about 5 years ago, a massive effort got all version sets merging from live regularly, and things were much healthier after that. For what it's worth, I suspect the usability of Brazil before then was still on par with or better than that of an unkempt monorepo (which is unfortunately all too common).
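For those unfamiliar, the mechanic is roughly a named package->version pin map per team that can pull newer pins from a shared "live" set. A rough sketch of the idea as I understand it (made-up package names, not Brazil's actual data model):

```python
# A "version set" is a frozen package -> version map a team builds
# against; "merging from live" takes the shared set's newer pins,
# except for packages the team deliberately holds back.

def merge_from_live(team_set, live_set, hold_back=()):
    merged = dict(team_set)
    for pkg, version in live_set.items():
        if pkg not in hold_back:
            merged[pkg] = version
    return merged

live = {"libfoo": "2.4", "libbar": "1.9"}
team = {"libfoo": "2.1", "libbar": "1.8", "internal-tool": "0.3"}

# Regular merge: pick up everything live has moved forward.
assert merge_from_live(team, live) == {
    "libfoo": "2.4", "libbar": "1.9", "internal-tool": "0.3"}

# Hold back a package you know breaks you; keep merging the rest.
held = merge_from_live(team, live, hold_back={"libfoo"})
assert held["libfoo"] == "2.1" and held["libbar"] == "1.9"
```

The pain shows up in the `hold_back` set: every held-back pin is deferred work someone eventually has to do.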


Sounds interesting. Are there any public articles available describing this more?


I think one reason is that there are various big companies (Google, Microsoft, Meta) who have talked about the tech they've deployed to make monorepos work, but I at least have never seen an equivalent big successful company describe their polyrepo setup, how they solved the pain points and what the tech around it looks like.


> ...equivalent big successful company describe their polyrepo setup, how they solved the pain points and what the tech around it looks like.

I've worked at big successful F500 boring companies with polyrepo setups, and the setup is boring as well. At one company: Jenkins checked out the repo, ran the Jenkinsfile, and the resulting artifact was stuck into JFrog Artifactory. We would update the Puppet file in our repo, and during the approved deploy window in ServiceNow, Puppet would do the deploy. Because of this, repos had a certain fixed structure, which was annoying at times.

The pain points that were not solved: four different teams (Jenkins, Puppet, InfoSec, and the dev team) involved in touching everything, and the breakdowns that would happen.


I keep meaning to write a blog post...

The short answer: start with a package management system like Conan or npm (we rolled our own, releasing 1.0 the same month I first heard of Conan, which was then around version 0.6 - don't follow our example). Then you just need processes to ensure that everyone constantly has the latest version of all the repos they depend on - which ends up being a full-time job for someone to manage.

Don't write your own package manager - if you use a common one, your IDE will know how to work with it. Our custom package manager has some nice features, but we have to maintain our own IDE plugin so the IDE can figure out the builds.
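That full-time "keep everyone current" job is mostly bookkeeping, which is why so much of it can be automated. A minimal sketch of the stale-pin report it boils down to (data shapes are made up):

```python
# Report, per consuming repo, every dependency pinned behind the
# latest published release - the list our full-time person chases.

def stale_pins(repo_pins, latest):
    report = {}
    for repo, pins in repo_pins.items():
        behind = {dep: (pinned, latest[dep])
                  for dep, pinned in pins.items()
                  if latest.get(dep, pinned) != pinned}
        if behind:
            report[repo] = behind
    return report

latest = {"libfoo": "3.0", "libbar": "1.2"}
pins = {
    "service-a": {"libfoo": "3.0", "libbar": "1.1"},
    "service-b": {"libfoo": "2.9"},
}
assert stale_pins(pins, latest) == {
    "service-a": {"libbar": ("1.1", "1.2")},
    "service-b": {"libfoo": ("2.9", "3.0")},
}
```

The un-automatable part is what happens next: convincing each team to actually take the bump, which is a people problem.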


> Then you just need processes to ensure that everyone constantly has the latest version of all the repos they depend on - which ends up being a full time job for someone to manage.

One full time job equivalent can buy a lot of tooling. Tooling that not only replaces this role but also shifts the feedback a lot closer to dev introducing the breaking change.


I realize this is more than a week ago and nobody will see it, but...

Every large project has this position. On smaller projects it isn't a full-time job, so they distribute it among the team members and don't track the cost. On larger projects it is too large for one full-time person, so they are forced to distribute the costs and just cannot track them. We happen to be in the sweet spot where a full-time person can do the job, so we can track that cost. But make no mistake: everything this person does is done on every project.

I agree with tooling being important. The person we have doing this job is a great engineer who automates everything he can, but there is still a lot of work that needs to be done, some of which cannot be automated (many of the problems are people problems).


I also think a lot of it is quiet for a reason. There aren't interesting problems to solve; a lot of it is boring. It isn't without pain, but most of the pain consists of lots of little papercuts rather than giant showstopping injuries. Many of the papercuts are itches just annoying enough to notice but not worth scratching. Or they are solved with ecosystems of normal, boring tools like Jenkins or GitHub Advanced Security or SonarQube or GitHub Actions or… Boring off-the-shelf tools for boring off-the-shelf pain points.


My company has millions of dollars in tooling for our polyrepo. It would not be hard to throw several more million into the problem.

If you have a large project there is no getting around the issues you will have. Just a set of pros and cons.

There are better tools for polyrepos you can start with, but there are a lot of things we have that I wish I could get upstreamed (there's good reason the open source world would not accept our patches even if I cleaned them up).


a) At least with GitHub Actions it is trivial to support polyrepos. At my company we have thousands of repositories, which we can easily handle because we can sync templated CI/CD workflows from a shared repository to any number of downstream ones.

b) When you are browsing through repositories you see a description, tags, technologies used, contributors, number of commits, releases etc. Massive difference in discovery versus a directory.


Curious how you do the sync - do you just git include and occasionally pull from upstream, or another mechanism?


Exactly. Take your monorepo, split it into n repos by directory at a certain depth from the root, write a very rudimentary VCS wrapper script to sync all the repos in tandem, and you have already solved a lot of pain points.
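The routing part of such a wrapper is almost trivial. A toy sketch (depth 1, hypothetical paths) of mapping a commit's changed files to the child repos it has to be synced into:

```python
from collections import defaultdict
from pathlib import PurePosixPath

def target_repo(path, depth=1):
    # Route a monorepo path to the child repo named after its first
    # `depth` directories; loose root files go to a catch-all repo.
    parts = PurePosixPath(path).parts
    return "/".join(parts[:depth]) if len(parts) > depth else "misc"

def split_commit(changed_files):
    # Group one commit's changed files by target repo, so the wrapper
    # can replay the change into each child repo in tandem.
    grouped = defaultdict(list)
    for f in changed_files:
        grouped[target_repo(f)].append(f)
    return dict(grouped)

files = ["payments/api/server.py", "payments/api/client.py",
         "web/app.ts", "README.md"]
assert split_commit(files) == {
    "payments": ["payments/api/server.py", "payments/api/client.py"],
    "web": ["web/app.ts"],
    "misc": ["README.md"],
}
```

The hard part is what this sketch leaves out: keeping the replayed commits and history consistent across repos when pushes race each other.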

> E.g. it wouldn't be too hard to create "linked" PRs across repos for changes that span projects, with linked testing infrastructure

Bitbucket does this out-of-the box :)


That sort of directory-based splitting almost never works in my experience. The code between those directories is almost always tightly coupled. Splitting arbitrarily like this gives the illusion of a non-tightly coupled code base with all the disadvantages of highly coupled dependencies. It’s basically the worst possible way to migrate workflows.


> Take your monorepo, split it into n repos by directory at a certain depth from the root, write a very rudimentary VCS wrapper script to sync all the repos in tandem and you have already solved a lot of pain points.

Then you lose the capability to atomically make a commit that crosses repos. I'm not sure any forge allows that, except Gerrit might with its topics feature (I haven't gotten the opportunity to try it).


You could also use git submodules in an overarching separate repo, if you want to lock down a set of versions. It doesn't even have to affect the submodule repos in any way. That would simplify branches in the single repos and enable teams to work independently on each repo. Then you only deploy from the overarching repo's main branch for example, where you have to create PRs for merging into the main branch and get it reviewed and approved.


That's not a nice workflow from pipelines/CI point of view.

Let's take for example a service 'foobar' that depends on in-house library 'libfoo'. Now you need to add a feature to foobar that needs some changes to libfoo at the same time (and for extra fun, let's say those changes will break some other users of libfoo). Of course, during development you want to run pipelines for both libfoo and foobar.

In such a 'super module' system, it gets pretty annoying to push changes for testing in CI when every change to either libfoo or foobar needs to be followed by a commit to the super repo.

In a monorepo that's just another Tuesday.


> In such a 'super module' system, it gets pretty annoying to push changes for testing in CI when every change to either libfoo or foobar needs to be followed by a commit to the super repo.

Again, tooling issue. CI can easily pull required changeset across multiple repos. We are in a subthread under "monorepo plus millions of dollars of custom tool engineering" vs "stock polyrepo"


> Every time I've seen monorepos compared with polyrepos it's always "monorepo plus millions of dollars of custom tool engineering" vs "stock polyrepo"

Not quite - it's "vs stock polyrepo with millions of dollars of engineering effort in manually doing what the monorepo tooling does".


> Why can't we add millions of dollars of tool engineering on top of polyrepos

I don't think the "stock polyrepo" characterization is apt. Organizations using polyrepos already do invest that kind of money. Unfortunately, this effort is not visible because it's spread out across repos and every team does their own thing. So then people erroneously conclude that monorepos are much more expensive. Like the GP said:

> Polyrepos are a thousand species living and dying, some thriving, some never to be known, most completely in the dark.


Hey, do you think Gitlab should do anything except run after the next trend and develop shitty non-solutions for it? Why, that could improve Gitlab. We cannot have that!


Monorepo monoliths make it hard to experiment. Getting something as benign as a later version of .NET becomes a mammoth task requiring the architecture team, and everything stays old. Want to use a reasonable tool? No chance.


I don't see how it immediately follows from monorepo usage that its parts cannot have separate runtimes and dependency versions. Perhaps the monorepo tooling is still that bad, idk, but there seems to be no inherent reason for it.


I mean monoliths specifically. If your monorepo is just storing many repos in different folders and aims to keep all of that in lockstep, it is a bit different.


But I think you're the first person to introduce the concept of a monolith to the conversation. How you structure your repo is an orthogonal question to how you break up your deployments, and this conversation is about the former not the latter.

A monolith that's broken up into 20 libraries in their own repos also prevents experimentation with new runtimes just as much as the monorepo version does.


Monorepo very often means Bazel for tooling (RBE and caching tests), and that means one WORKSPACE with common versions of libs.

Monorepo also means a team 'vetting' new third-party libs, a team telling you your CI takes too long, and a team telling you to upgrade your lib within 23 minutes because there's a security issue in the Korean language support...


Monorepo doesn't mean any of those things, nor does a polyrepo setup prevent any of them, except for Bazel.

It sounds like you worked in a dysfunctional organization that happened to use a monorepo. Their dysfunctions are not inherent in the monorepo model and they would have found other ways to be dysfunctional if not those.


At my current $dayjob, there is a backend that is split into ~11 git repos which results in a single feature being split among 4-5 merge requests and it's very annoying. We're about to begin evaluating monorepos to group them all (among other projects). What would the alternative to a monorepo be in this case, knowing that we can't bundle the repos together?


Is a monorepo the answer, or is the real problem that you just have a bad repo split?

I can't answer that question, and there are reasons to go monorepo anyway. If your problem is a bad polyrepo split, going monorepo is the obvious answer, but it isn't the only one. Monorepo and polyrepo each have very significant problems unique to that setup (see the article for monorepo problems). You have to choose which set of problems to live with and mitigate them as best you can.


"11 repos with 4-5 merge requests" doesn't sound like Google-level, so I don't see why a monorepo wouldn't work without much work.


The general rule is that things should be versioned together that change together. Separate repositories should be thought of similarly to separately versioned libraries. Dependencies between repositories should have stable interfaces. Design decisions that are likely to change should be encapsulated within a module, so that these decisions are hidden from other modules (a seminal paper about that is [0]). These considerations should guide any split into separate repositories.

[0] https://wstomv.win.tue.nl/edu/2ip30/references/criteria_for_...


Yup, at work I have a few projects split across several repos in like four languages. A completely new feature implemented across the whole stack involves PRs in up to 8 different repos. Potentially more.

To be totally honest, yes this is an unbelievable pain in the ass, but I much prefer the strict isolation. Having worked with (much, much) smaller monorepos, I find the temptation to put code anywhere it fits too much, and things quickly get sloppy. With isolated repos, my brain much more clearly understands the boundaries and separation of concerns.

Then again, this results in a lot of code duplication that is not trivial to resolve. Submodules help to a degree, but when the codebase is this diverse, you're gonna have to copy some code somewhere.

I view it sort of like the split between inheritance and composition. You can either inherit code from the entire monorepo, or build projects from component submodules plugged together. I much prefer the latter solution, but clearly the former works for some people.


At mine we ended up with two very comparable webapp products due to an acquisition.

One is built as a monorepo and we have a shared dev server where each user can run their own copies in a home directory.

The other is built as a collection of Docker containers that devs run locally. Nobody from the monorepo team likes dealing with it. Resyncing requires a much more elaborate Git process than a single fetch and pull. A simple task can spawn five merge requests to digest. We put loads of extra effort into just making sure the QA team is in the same place as devs.

If nothing else, there's huge simplification from "I can access your copy of the code base and see the same error you're seeing" without trying to screenshare or remote-desktop.


I think you might just have a badly architected backend. get rid of your microservices first and then we'll see how you're feeling


As an aside, I've found IntelliJ very helpful in this situation: it can load many repos into one project, and then doing commits/pushes/branches etc. across various repos at the same time just works the way I want without much thinking about it.


Do these 11 repos end up in separate binaries?

Because it sounds like you just need flag based feature releases.


It ends up with 6 deployables that are coupled together (let's say micro-services). There are surely better ways to structure the project, but our CI/CD pipeline doesn't allow us to do so, and it is not handled by our team anyway. I haven't seen any good way to make my life easier for merges, tech reviews, deployments, etc…


Use git subtree - first to concatenate the minor repos into one major repo, and then subtree split from that point forward to publish subtree folders back to the minor repos if needed (e.g. open source projects to GitHub). Works for us with about 8 minor repos; it eliminated submodule pain entirely. Only the delivery lead even has to know the minor repos exist.


I have already briefly looked at git-subtree. From what I can gather, it doesn't help much with my use-case. You still need to manually pull from each subtree and push branches individually to each project. The end result is still 4-5 merge requests to handle on Gitlab for a single new feature.

I might have missed something.


I believe dustingetz is suggesting making a monorepo for the code itself, but copying the subdirectories of the main repo into subrepos to solve your CI issues.

This means that developers have a monorepo for day to day work, but the CI/CD issues are isolated in their own separate repos, and can be handled separately.

Dunno if that's 100% of what they mean but it seems to be a solution to what you describe in another message ("our CI/CD pipeline doesn't allow us to do so and it is not handled by our team anyway")


Only 11 repos? I am at 76 repos for one backend. lol It's madness.


There's no language-agnostic orchestration system that is both easy to adopt and has the core features that make a monorepo pleasant to use.

Bazel is complex and isn't the easiest to pick up for many (though to Google's credit the documentation is getting better). Buck isn't any better in this regard. Pants seems easiest out of all the big ones I've seen, but it's also a bit quirky, though much easier to get started with in my experience. NX is all over the place in my experience.

It also doesn't help that, until recently, most of these monorepo systems didn't have good web ecosystem support, and even those that do don't handle every build case you want, which means you have to extend them in some way and maintain that extension.

It also doesn't help that most CI systems don't have good defaults for these tools and can be hard to setup properly to take advantage of their advantages (like shared cross machine caching).

As an aside, the best monorepo tooling I have ever used was Rush[0] from Microsoft. If you are working in a frontend / node monorepo or considering it, do take a look. It works great and really makes working in a monorepo extremely uniform and consistent. It does mean doing things 'the rush way' but the trade off is worth it.

[0]: https://rushjs.io


It's worth noting that most monorepos won't reach the same size as repositories from Google, Uber, or other tech giants. Some companies introduce new services every day, but for others, the number of services remains steady.

If a company has up to 100 services, there won't be VCS scale problems, LSP will be able to fit the tags of the entire codebase in a laptop's memory, and it is probably _almost_ fine to run all tests on CI.

TL;DR not every company will/should/plan to be the size of Google.


I do think the 'run all tests on CI' part is not that fine, it bites a lot earlier than the others do. Git is totally fine for a few hundred engineers and 100ish services (assuming nobody does anything really bad to it, but then it fails for 10 engineers anyway), but running all tests rapidly becomes an issue even with tens of engineers.

That is mitigated a lot by a really good caching system (and even more by full remote build execution) but most times you basically end up needing a 'big iron' build system to get that, at which point it should be able to run the changed subset of tests accurately for you anyway.


There are also so many types of slow tests in web systems. Any kind of e2e test like Cypress or Playwright can easily take a minute. Integration tests that render components and potentially even access a DB take many times longer than a basic unit test. It doesn't take very many of the slow group to really start slowing your system down. At that point, what matters is how much money you're willing to pay to scale your build agents, either vertically or (more likely) horizontally.


Well no, it's not just build agent size: if you have 10 tests that take 3-4 minutes each, you're not going to go any faster than the slowest of them (plus the time to build them, which is also typically bad for those kinds of tests, although a bigger build agent may be faster there). Having a system that can avoid running a test for many PRs, because it can prove the test is not affected, means in those cases you don't have to wait for it to run at all.

Although, time is money, so often scaling build agents may be cheaper than paying for the engineering time to redo your build system...
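The "prove it's not affected" part is just a reverse-dependency walk. A minimal sketch, assuming you already have a module dependency graph (names here are made up, echoing the libfoo/foobar example upthread):

```python
def affected_tests(changed_modules, reverse_deps, tests):
    # Walk reverse dependencies from the changed modules; any test
    # not reachable from a change provably doesn't need to run.
    seen, stack = set(), list(changed_modules)
    while stack:
        mod = stack.pop()
        if mod in seen:
            continue
        seen.add(mod)
        stack.extend(reverse_deps.get(mod, ()))
    return sorted(t for t in tests if t in seen)

# module -> modules that depend on it
rdeps = {"libfoo": ["foobar", "libfoo_test"],
         "foobar": ["foobar_e2e"]}
tests = {"libfoo_test", "foobar_e2e", "unrelated_test"}

# A libfoo change must run its own tests and foobar's e2e suite...
assert affected_tests(["libfoo"], rdeps, tests) == ["foobar_e2e", "libfoo_test"]
# ...but an unrelated change runs nothing from this set.
assert affected_tests(["unrelated"], rdeps, tests) == []
```

Build systems like Bazel get this graph for free from their build rules; without one, maintaining the graph accurately is the real cost.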


I have hundreds of tests that take 15-30 minutes each. These tend to be whole-system tests, so there is no useful way to say a change won't touch them (75% will). Despite an extensive unit test suite (which runs first), these tests catch a large number of real production bugs, and most of them are things a quicker test couldn't catch.

Which is to say that trying to avoid running tests isn't the right answer. Make them as fast as you can, but be prepared to pay the price: either a lot of parallel build systems, or lower quality.


It's a bit of a tangent and I agree with your point, but wanted to note that for one project our e2e tests went from ~40 min to less than 10, just by moving from Cypress to Playwright. You can go pretty far with Playwright and a couple of cheap runners.


I appreciate the point, but I've heard this kind of thing several times before - last time around was hype about how Cypress would have exactly this effect (spoiler: it did not live up to the hype). I don't believe the new framework du jour will save you from this kind of thing, it's about how you write & maintain the tests.


I wish I had hard evidence to show because my normal instinct would be similar to yours, but in this case I'm a total Playwright convert.

Part of it might be that Playwright makes it much easier to write and organize complex tests. But for that specific project, it was as close to a 1 to 1 conversion as you get, the speedup came without significant architectural changes.

The original reason for switching was flaky tests in CI that were taking way too much effort to fix over time, likely due to oddities in Cypress' command queue. After the switch, and in new projects using Playwright, I haven't had to deal with any intermittent flakiness.


Or spend on time building test selection systems…


I think that discussions in this area get muddied by people using different definitions of “rapidly”. There are (at least) two kinds of speed WRT tests being run for a large code base.

First, there is “rapidly” as pertains to the speed of running tests during development of a change. This is “did I screw up in an obvious way” error checking, and also often “are the tests that I wrote as part of this change passing” error checking. “Rapid” in this area should target low single digits of minutes as the maximum allowed time, preferably much less. This type of validation doesn’t need to run all tests—or even run a full determinator pass to determine what tests to run; a cache, approximation, or sampling can be used instead. In some environments, tests can be run in the development environment rather than in CI for added speed.

Then there is “rapidly” as pertains to the speed of running tests before deployment. This is after the developer of a change thinks their code is pretty much done, unless they missed something—this pass checks for “something”. Full determinator runs or full builds are necessary here. Speed should usually be achieved through parallelism and, depending on the urgency of release needs, by spending money scaling out CI jobs across many cores.

Now the hot take: in nearly every professional software development context it is fine if “rapidly” for the pre-deployment category of tests is denominated in multiple hours.

Yes, really.

Obviously, make it faster than that if you can, but if you have to trade away “did I miss something” coverage, don’t. Hours are fine, I promise. You can work on something else or pick up the next story while you wait—and skip the “but context switching!” line; stop feverishly checking whether your build is green and work on the next thing for 90min regardless.

“But what if the slow build fails and I have to keep coming back and fixing stuff with a 2+ hour wait time each fix cycle? My precious sprint velocity predictability!”—you never had predictability; you paid that cost in fixing broken releases that made it out because you didn’t run all the tests. Really, just go work on something else while the big build runs, and tell your PM to chill out (a common organizational failure uncovered here is that PMs are held accountable for late releases but not for severe breakage caused by them pushing devs to release too early and spend less time on testing).

“But flakes!”—fix the flakes. If your organization draws a hard “all tests run on every build and spurious failures are p0 bugs for the responsible team” line, then this problem goes away very quickly—weeks, and not many of them. Shame and PagerDuty are powerful motivators.

“But what if production is down?” Have an artifact-based revert system to turn back the clock on everything, so you don’t need to wait hours to validate a forward fix or cherry-picked partial revert. Yes, even data migrations.

Hours is fine, really. I promise.


You are of course entitled to your opinion, and I do appreciate going against the grain, but having worked in an “hours” environment and a “minutes” environment I couldn’t disagree more. The minutes job is so much more pleasant to work with in nearly every way. And ironically ended up being higher quality because you couldn’t lean on a giant integration test suite as a crutch. Automated business metric based canary rollbacks, sophisticated feature flagging and gating systems, contract tests, etc. and these run in production, so are accurate where integration tests often aren’t in a complicated service topology.

There are also categories of work that are so miserable with long deployment times that they just don’t get done at all in those environments. Things like improving telemetry, tracing, observability. Things like performance debugging, where lower envs aren’t representative.

I would personally never go back, for a system of moderate or more distributive complexity (ie > 10 services, 10 total data stores )


All very fair points! I think it is perhaps much more situational than I made it out to be, and that functioning in an “hours” environment is only possible as described if some organizational patterns are in place to make it work.


Yeah, I realized as I wrote that out that my personal conclusions probably don't apply in a monoservice-type architecture. If you have a mono (or few) service architecture with a single (or few) DB, it is actually feasible to have integration tests that are worth the runtime. The bigger and more distributed you get, the more the costs of integration tests go up (velocity, fragility, maintenance, the burden of mirroring production config), and the equation doesn't pencil out anymore. Probably other scenarios where I'm wrong also.


My company has been moving towards having monorepos per language stack. Decent compromise


That sounds worse than either option. At that point put it all in one repo with a directory for each language.


And then at some point your Rust people write a Python module in Rust via pyo3, and it has to be integrated into Python build system and Python type checkers, but also needs local rust crates as build dependencies and local python packages as runtime dependencies... hm.


Coupling and cohesion likely have nothing to do with the language.


This will start to become a problem if the stacks need to communicate with each other using versioned protocols.


Why can't you just use versioning in your external-to-the-monorepo APIs and use HEAD within the monorepo? Nothing about combining some projects into a monorepo forces you into dropping everything else we know about software release cycles.


The point is that it is more work.


More work than what? More work than sharing HEAD in the monorepo, certainly. But it's definitely not more work than versioning across multiple repos because it's literally the same thing. When you're exporting code from a monorepo you follow all the same patterns you would from a small single library repo.


Maybe I miss the point here, but it seems to me that versioning the protocols is the specific solution to maintaining interop between different implementations.


And for small teams, what we want/need is the "all deps" monorepo.

I want to link the other repos I depend on, but those repos can be read-only. And then all the tools work without extra friction.

P.S.: This could be another wish for jj!



