We manage a codebase that is well over a million lines of code and has a history dating back more than 5 years.
One of our answers to this problem is extreme amounts of standardization. We might have 1M LOC in platform services alone, but it is spread across 50+ types and each looks almost identical. Everything uses the same persistence mechanism, migration technique, error handling, configuration provider, etc. Dependency injection + reflection + standardization (interfaces/abstract types) is where you get some really powerful leverage for keeping things organized and sane. Ultimately we have ~8 "flavors" of thing that developers usually need to worry about.
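To make the dependency injection + reflection + standardization idea concrete, here is a minimal sketch in Python. It is not our actual code: `PlatformService`, `ConfigProvider`, `build_services` and the two example services are hypothetical names, and real services would also share persistence and migration hooks in the same way.

```python
import inspect
from abc import ABC, abstractmethod


class ConfigProvider:
    """Hypothetical shared configuration provider used by every service."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=None):
        return self._values.get(key, default)


class PlatformService(ABC):
    """Hypothetical base type: every service has the same shape."""

    def __init__(self, config: ConfigProvider):
        self.config = config

    @abstractmethod
    def handle(self, request: dict) -> dict:
        ...

    def run(self, request: dict) -> dict:
        # Error handling lives in one place instead of in every service.
        try:
            return {"ok": True, "result": self.handle(request)}
        except Exception as exc:
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


class InvoiceService(PlatformService):
    def handle(self, request):
        rate = self.config.get("tax_rate", 0.0)
        return {"total": round(request["amount"] * (1 + rate), 2)}


class GreetingService(PlatformService):
    def handle(self, request):
        return {"message": f"Hello, {request['name']}"}


def build_services(dependencies: dict) -> dict:
    """Discover direct subclasses via reflection and inject constructor
    arguments by looking up each parameter's type annotation."""
    services = {}
    for cls in PlatformService.__subclasses__():
        if inspect.isabstract(cls):
            continue
        params = inspect.signature(cls.__init__).parameters
        kwargs = {
            name: dependencies[p.annotation]
            for name, p in params.items()
            if name != "self"
        }
        services[cls.__name__] = cls(**kwargs)
    return services


services = build_services({ConfigProvider: ConfigProvider({"tax_rate": 0.1})})
print(services["InvoiceService"].run({"amount": 100}))
```

Adding a new service in this sketch means writing one subclass with one method; wiring, configuration and error handling come for free. Note that `__subclasses__()` only sees direct subclasses, so a deeper hierarchy would need a recursive walk.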
Our end game answer is to get away from the code altogether. We are starting to view code as glue between what would ideally be configuration-based implementations and the nasty real world which must be mutated in icky ways. So, instead of writing code for a module every time you need to implement it, make it once and in a generic way, have it take a configuration object, and then expose a web UI around configuring that thing. Then, all that code is reduced to JSON being passed around. When you are dealing with pure data, you can get away with the most ridiculous things. Cloning objects, versioning, validations, relational queries, et al. become trivial. If you have 1 stable domain model throughout that is 3NF or better, you can use SQL to do basically everything.
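As a rough illustration of why pure data makes cloning, versioning and validation cheap, here is a small Python sketch. Everything in it is an assumption for the sake of the example: the `SCHEMA` fields, the `NotificationModule` class and the helper names are made up, and a real system would likely use a proper schema language rather than a dict of types.

```python
import copy
import hashlib
import json

# Hypothetical schema: the handful of things that differ between implementations.
SCHEMA = {"name": str, "retries": int, "recipients": list}


def validate(config: dict) -> list:
    """Return a list of problems; an empty list means the config is valid."""
    errors = [f"missing field: {k}" for k in SCHEMA if k not in config]
    errors += [
        f"{k}: expected {t.__name__}"
        for k, t in SCHEMA.items()
        if k in config and not isinstance(config[k], t)
    ]
    errors += [f"unknown field: {k}" for k in config if k not in SCHEMA]
    return errors


def version_of(config: dict) -> str:
    """Content-addressed version: identical configs always get the same id."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def clone(config: dict, **overrides) -> dict:
    """Cloning is a deep copy plus a few changed top-level fields."""
    new = copy.deepcopy(config)
    new.update(overrides)
    return new


class NotificationModule:
    """Written once, generically; behavior comes entirely from the config."""

    def __init__(self, config: dict):
        problems = validate(config)
        if problems:
            raise ValueError("; ".join(problems))
        self.config = config

    def plan(self, event: str) -> list:
        name = self.config["name"]
        return [f"{name}: notify {r} about {event}" for r in self.config["recipients"]]


base = json.loads('{"name": "billing", "retries": 3, "recipients": ["ops@example.com"]}')
shipping = clone(base, name="shipping")
print(version_of(base), version_of(shipping))
print(NotificationModule(shipping).plan("late order"))
```

The module itself never changes here; a second "implementation" is just a second JSON document, and its version id falls out of hashing the canonical form.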
Edit: One more thing I would note is that a big part of why we are able to support this codebase is because we have adopted a sort of "hive mind" developer mindset, where everyone tries to role play this ideal of a developer who would best be suited for the task. We acknowledge that our codebase is not a place for much "fun" and the best analogy I could come up with is it's like doing something in a nuclear power plant control room. You just gotta do it by the book every time, and then you get to go home to a safe and happy community. It's not like we employ volunteers.
Sounds like an ideology lock-in. Let's hope you never get a problem which does not fit your current architecture well, or else you'll end up spending weeks or even months solving an otherwise trivial problem.
I've worked with "configuration-based implementations" and in my experience they are hard to work with (no debugging, incomplete documentation and implementation, little flexibility), require a staggering amount of infrastructure, are hard to test and will approach a programming language over time.
I agree with the concern, but we have had a very long time to refine our architecture. Some would call it an ideology lock-in, I would say we solved our problem domain in a deep and meaningful way and would prefer to stick with these proven approaches. Our entire codebase was rewritten approximately 4 times before we got to the point of being confident enough to push forward with a data-driven/configuration approach.
When you are writing the same business logic hundreds of times and only 10-20 discrete things are different between each implementation, it starts to make a hell of a lot of sense to expose those things as parameters to be configured. It's simple economies of scale at this point for us. Despite our small size, we are trying to get out of a "move fast & break things" startup mindset into a more stable "let's take this to 1k customers now" mindset (we provide a B2B application in a small market, so 1k is a huge target).
For us, our company doesn't become profitable until we can scale our operations by 5-10x without any more headcount. The only thing we could come up with that would allow for this is configuration-driven techniques in which entire customer implementations can be cloned as simple JSON contracts for purposes of bootstrapping the next customer. Developers are removed from most of the product implementation process, and can focus more on core product value, which is now levered hundreds of times over due to being exposed as a configuration contract.
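A hedged sketch of what "cloning a customer implementation as a JSON contract" could look like in Python. The contract layout (`branding`, `workflows`), the customer names and the helpers `deep_merge` and `changed_paths` are all invented for illustration; the only point is that bootstrapping the next customer becomes a merge of a small override document onto an existing one.

```python
import copy
import json


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def changed_paths(old: dict, new: dict, prefix: str = "") -> list:
    """List dotted paths whose values differ, so a reviewer sees only what changed."""
    paths = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            paths += changed_paths(a, b, path + ".")
        elif a != b:
            paths.append(path)
    return paths


# Hypothetical contract for an existing customer.
customer_a = json.loads("""
{
  "customer": "acme",
  "branding": {"logo": "acme.png", "color": "#003366"},
  "workflows": {"approval": {"steps": 2, "timeout_hours": 48}}
}
""")

# The next customer is the previous one plus a handful of differences.
customer_b = deep_merge(customer_a, {
    "customer": "globex",
    "branding": {"logo": "globex.png"},
    "workflows": {"approval": {"steps": 3}},
})

print(changed_paths(customer_a, customer_b))
# ['branding.logo', 'customer', 'workflows.approval.steps']
```

Nothing in this flow requires a developer: the override document is small enough to be produced by a web form, and the diff gives a reviewable summary of how the new customer departs from the template.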
I am NOT arguing that one should set out to build a configuration-driven system from day one. That would probably be the biggest mistake you could make. You have to already have a mostly-functional product that people already want to buy/use before you can even consider this approach. Even then, you should probably expand your target market and inject a few more use cases & rewrites before you jump over that chasm. Having a squeaky-clean domain model that addresses all potential use cases is the bare minimum prerequisite, IMO.
How was the culture of the “hive mind” developed and maintained in the organisation? I can imagine there are challenges you’ve faced to keep it working.
Start small and grow carefully. Not every developer is a good fit for this type of approach and the amount of discipline we require.
We actually started looking at an approach where new hires would come in on a 6-12 month contract basis. The whole idea would be that there would be no hard feelings either way at the end if it didn't work out. If both sides felt like this was a good fit, we would explore longer-term options with more benefits.
The way we do software is unconventional. We are in a very constrained environment from a security perspective. No containers, nothing can be in the cloud, all data must live on the same physical host, software delivery is tricky, etc. These constraints make the work we do somewhat unappealing to a certain crowd of developers who seek to maximize their exposure to shiny new things.
Put differently, we use boring old technologies (with a few exceptions) and set expectations that we are going to continue to use those indefinitely. Any hopes of "mixing things up" should be reserved for future endeavors on our roadmap and personal side projects (which we encourage). I don't think any of this is unreasonable or unrealistic. We are in the business of selling software to other businesses in a sensitive market. We are not making DLC for AAA videogames.