The longer I work, the more NIH I get. There are two reasons for this:
1. The ease of creating a library these days is resulting in a proliferation of utterly crap libraries, so bringing on a new library is more and more of a liability, both to future reusability of code and to security.
2. As I get better at programming, I more and more often realize that I can write a better version of the library, or just one that better fits my needs.
These combine to create a situation where I'm less and less likely to want to import something.
1. The proliferation of libraries enables devs to get very, very picky about their libraries - the slightest wart or lack of understanding will cause them to label a library "utterly crap". But anything they create to replace it will run afoul of another programmer's picky tastes even worse! (And to whatever degree they may have a point, exactly 0 devs will be previously familiar with the warts and misfeatures of this new code.)
2. As I get better at project management, I realize that my fellow devs and I are constantly underestimating the ongoing maintenance burdens of new code. It's hard enough getting us to make sufficiently pessimistic time estimates for the initial implementation that PMs don't have to smile and multiply by large integers - never mind accounting for the next few years of adding additional debug logic, logging, fixing edge cases, supporting legacy decisions, adding a full set of fuzzing and regression tests, etc.
I wonder if we're regressing toward the same mean, or away from it...
> 1. The proliferation of libraries enables devs to get very, very picky about their libraries - the slightest wart or lack of understanding will cause them to label a library "utterly crap". But anything they create to replace it will run afoul of another programmer's picky tastes even worse! (And to whatever degree they may have a point, exactly 0 devs will be previously familiar with the warts and misfeatures of this new code.)
Ah yes, the "it's all subjective" argument. The thing is, it's not subjective. Sure, it may be hard to objectively measure the qualities of a library, but that doesn't mean we should give up and assume there's no underlying objective truth.
My "picky tastes" are for libraries that are battle-tested, performant, and scalable. If that runs afoul of someone else's tastes, frankly, I don't care.
> 2. As I get better at project management, I realize that my fellow devs and I are constantly underestimating the ongoing maintenance burdens of new code. It's hard enough getting us to make sufficiently pessimistic time estimates for the initial implementation that PMs don't have to smile and multiply by large integers - never mind accounting for the next few years of adding additional debug logic, logging, fixing edge cases, supporting legacy decisions, adding a full set of fuzzing and regression tests, etc.
I definitely agree that the ongoing maintenance burden of new code is consistently underestimated, but it doesn't follow that we should prefer third-party libraries to our own code. A new third-party library is still new code, and it's rarely a safe assumption that someone else will maintain that code forever. Further, general-purpose third-party libraries aren't tailored to our needs, so they often contain more code than we would write. The result? We often end up with a larger maintenance burden by using a library. And finally, even if someone else maintains your library, that doesn't mean they will maintain the integration point between your code and theirs. Keeping up with changes in a library that doesn't value backward compatibility highly can be just as costly as maintaining a library yourself.
> Ah yes, the "it's all subjective" argument. The thing is, it's not subjective.
Some things are, some things aren't - and even some of the subjective objections can be reasonable. If your entire dev team is a bunch of crusty C programmers with no exception handling experience, introducing a new library that throws exceptions has drawbacks, no matter how reasonably they might be used. They're simply not in the right mindset for it. If it's just 1 member of your dev team, however, perhaps it's time for them to learn to deal with exception handling if the library is otherwise good.
Then I see devs getting very passionate about vanilla XML vs JSON, CamelCase vs snake_case, and making kneejerk reactions to well documented edge cases that their replacement implementations would also have (minus the documentation.)
Your tastes are a bit better, but even there I can think of scenarios where all three aren't vital to me - easily fungible dev-only debug tools come to mind.
> Sure, it may be hard to objectively measure the qualities of a library, but that doesn't mean we should give up and assume there's no underlying objective truth.
Agreed. But I find that the best way to measure is to have at least one person try - or have previous experience with - using the library.
Obscenely convoluted dependency chains that can't be simply checked into VCS? Consistently unstable? Insecure by design and defaults? Constant public interface churn? I agree that there are reasons to kill a library with fire - sometimes before you decide to integrate it in the first place, sometimes after the fact when you've realized it was a mistake.
> The result? We often end up with a larger maintenance burden by using a library.
I see this rather rarely, which does make me think we've spent time on opposite ends of the NIH spectrum. It does happen though, and those are libraries that I will still avoid.
> ... but it doesn't follow from that that we should prefer third-party libraries to our own code.
Agreed. The big tragedy of Open Source is that after its initial wave of success it got flooded with freeloaders. If you'll allow me to repeat myself: the one flaw of the GPL is that it did not offer provisions to prohibit distribution in binary form. "You want free? Better you know how to unpack the tarball and build the project from scratch... all by yourself (no build scripts beyond a Makefile permitted, thank you very much)".
Of course, such overzealous GPL would need to also make provisions to accept dual licensing with nice, commercial, royalty seeking licences. That way, everybody gets what they value the most: You want freedom, you pay with sweat; you just want access, you pay in cash.
And what are those goals? Provide free labor to corporate interests? Give away free products to cheap people that cannot care to recognize for one second the personal effort invested by the producers?
Creators are going to create, they cannot help it. I do not say that you cannot provide altruistic value to others, but my opinion is that you should care about your own people first. And the big mistake of programmers as a profession is that we do not see other fellow programmers as our own people.
I upvoted your comment and the GP's, as you both make good points.
It does not have to be one or the other, though. The obvious solution to have the best of both camps is to find an open source library with a (mostly) sane architecture and either support it actively or fork it (depending on the bullshit-to-effective ratio in the code and user base).
If you end up forking it, you may end up outcompeting the original if you band together with like minded devs/users from the previous incarnation... instead of keeping it behind your org's firewall.
#2, I feel, is the trap. As beings with a limited amount of time on this planet, the test should not be about absolutes, but about whether what we would build is better given the time we would have to devote.
And this isn't that strange a phenomenon when you think about it. Construction workers choose hammers rather than forging their own, we buy food from supermarkets instead of growing our own, etc. One of the principal elements of humanity is our ability to stand on the shoulders of giants, and that shouldn't be lost.
> #2, I feel, is the trap. As beings with a limited amount of time on this planet, the test should not be about absolutes, but about whether what we would build is better given the time we would have to devote.
I don't think it's really about the time you have to devote. Integrating a library can take longer than writing the code yourself, and that doesn't change with respect to the time you have.
> And this isn't that strange a phenomenon when you think about it. Construction workers choose hammers rather than forging their own, we buy food from supermarkets instead of growing our own, etc. One of the principal elements of humanity is our ability to stand on the shoulders of giants, and that shouldn't be lost.
This fact isn't lost on me. My point is that there are actually relatively few giants who have solved an average arbitrarily-chosen problem, and even fewer who have shared their work. If you have 10+ years of programming experience, it shouldn't be hard to find an area where you are the giant.
This is not just a defense of NIH but also a good rule of thumb for how to do NIH "right". The more narrowly you define the problem the more likely that the DIY approach will be better in both time and quality.
There's a high end pizza place in my city. They buy their flour, but they grow their own arugula. They didn't build their own building from scratch, but they did build their own brick ovens from scratch.
This is something I've come to appreciate over the years as well. The more libraries a project has, the harder it is to maintain, despite the libraries supposedly being used to ease maintenance.
This goes triple if the libraries in question are commercial closed source things. One company decides to stop supporting that product or the company goes under or gets bought out and suddenly your project is anchored to something that may or may not stay working as the OS changes underneath it.
Every library provides a benefit, especially up front in reducing development time, but every library also brings liabilities. So you need to strike a balance between getting enough utility out of it and not loading your project down with so many liabilities that it will be impossible to maintain or port.
The maintenance overhead of keeping your own code working is usually far less than the maintenance overhead of chasing API changes in the libraries or discovering new bugs that crop up in minor version updates. Even if you go and pull the libraries directly into your project (so you're fixed on specific versions), you still have problems with having a far more complicated build process that fails far more often thanks to problems with the library builds. Or worse, you have version conflicts with installed libraries.
In short: keep your list of dependencies as short as possible, but no shorter.
#1 has a big influence on my argument. I think it was the Python and NodeJS library ecosystems that made it super simple for people to publish libraries. There are a lot of trivial libraries out there, many of them of questionable quality. Sometimes using a library ends up being more work than writing the bits of code myself.
It's not like I'm going to decide to write a new 20k line library for my project, but if something is only a few hundred lines of code it's definitely easier to just rewrite.
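For concreteness, here's the sort of small utility I mean - a batching helper sketched from scratch in Python rather than pulled from a package (the name `chunked` and its behavior are my own choices here, not any particular library's API):

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # trailing partial chunk
        yield batch

# Example: list(chunked(range(7), 3)) -> [[0, 1, 2], [3, 4, 5], [6]]
```

A dozen lines like this carry no transitive dependencies, no version churn, and no license review - which is the whole trade being discussed.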
I find myself trending the same way, but for another reason. I've gone back and forth on this over my programming career, and now, whenever I revisit an old project, I find that the more "NIH" I wrote it, the more likely it is that I can load it up in a current version of the compiler / IDE, hit "Build and Run", and it will Just Work. Even if I included the external library in the repository, the library still contains extra code I don't need, which may fail to build.
"As I get better at programming, I more and more often realize that I can write a better version of the library, or just one that better fits my needs."
That's a good sign. As you get better and more experienced as a developer, you start to realise the benefit of thinking things through before launching into writing code or reaching for an existing library. Sometimes you want to do the first, and other times you want to do the second. There isn't a straightforward "one size fits all" rule to say which approach is best.
> 95% isn't "only". That's pretty darn high. I'd wish that in-house code covered 95% of requirements more often.
> If we're talking about open-source (I can't see why not), you can just fork the library in question and adjust it to fulfill the missing 5%.
The percentage isn't relevant if working around the design of the library to fulfill the missing 5% takes 30x as long as just writing the code yourself. Additionally, libraries often do a lot of things you don't need, so you get a bunch of bloat along with the stuff you want.
> Obviously noone sane would pick third party libraries based on how fancy some web page is. That's a strawman argument.
We can argue about the sanity of people, but the fact remains that people on teams I've worked on have chosen libraries based on the fanciness of their webpage.
> The percentage isn't relevant if working around the design of the library to fulfill the missing 5% takes 30x as long as just writing the code yourself.
Of course, but one should be aware of a tendency to overestimate the effort required for tweaking third party code (understanding somebody else's code is always more difficult), and underestimate the effort required for writing the thing from scratch.
It's closely related to the infamous "Big Rewrite" problem.
> the fact remains that people on teams I've worked on have chosen libraries based on the fanciness of their webpage
I believe you, but I find it difficult to believe these would be the same people who tend to introspect on their profession a lot, eg. by following software engineering blogs, so the author is kind of preaching to the choir in my view.
> Of course, but one should be aware of a tendency to overestimate the effort required for tweaking third party code (understanding somebody else's code is always more difficult), and underestimate the effort required for writing the thing from scratch.
It's not just the cost of tweaking, it's the cost of keeping it up-to-date, too. I agree with the general gist of what you're saying, just wanted to point out there's more to it.
> I believe you, but I find it difficult to believe these would be the same people who tend to introspect on their profession a lot, eg. by following software engineering blogs, so the author is kind of preaching to the choir in my view.
On the contrary, I think people who don't introspect about what they're doing are more likely to read tech blogs. How else would they find out what the latest fad libraries are so they can treat them as best practices? :)
I agree that understanding someone else's code is typically easier than writing that code from scratch. But that's just part of the equation. Another part is how much of that code you have to understand / write. Understanding 1 MLOC framework with all its quirks may be much harder than writing 1 thousand lines of code implementing a feature that you want.
So if you need only 5% of functionality of a huge general-purpose library, it may be still actually faster to write that 5% functionality by yourself, than to understand, tweak and later maintain the library.
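To make the 5% point concrete: if all you actually need is simple fixed-delay retries, a sketch like the following can stand in for a whole general-purpose retry package (the function name, signature, and defaults here are my own illustration, not any specific library's API):

```python
import time
from typing import Callable, Tuple, Type, TypeVar

R = TypeVar("R")

def retry(fn: Callable[[], R],
          attempts: int = 3,
          delay: float = 0.1,
          errors: Tuple[Type[BaseException], ...] = (Exception,)) -> R:
    """Call `fn`, retrying on `errors` up to `attempts` times,
    sleeping `delay` seconds between attempts; re-raises the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except errors:
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")
```

Fifteen lines you fully understand, versus learning and tracking a framework's backoff strategies, jitter options, and deprecation cycle you'll never use.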
> If we're talking about open-source (I can't see why not), you can just fork the library in question and adjust it to fulfill the missing 5%.
Depending on the situation, this can mean having to understand lots of gory details of the library's implementation. It is often easier to write your own implementation from scratch than to understand all of those details and subtle interactions.
Typically "open source" just means that the source code is available under an OSI-compatible license, not that there is also documentation that guides you from barely understanding the library to understanding every implementation detail necessary to make internal changes.
> If we're talking about open-source (I can't see why not), you can just fork the library in question and adjust it to fulfill the missing 5%.
You're not wrong, but you are assuming some things:
1. That the library is written in a language suitable for the task. E.g. if you're doing real-time programming you probably don't want Visual Basic and if you're doing security programming or text processing you probably don't want C.
2. That the library is written well enough to read. You'd be amazed how much truly bad code is out there. It runs, but … it's not good. OpenSSL is an example.
3. That the library is architected in such a fashion that the 5% functionality can be added without major rework. The functionality may be small, but adding it might require quite a lot of architectural change and refactoring.
One cause of "invented here syndrome" is the proliferation of crappy developers. I've seen it many times in the companies I worked for (as a senior dev / architect): the code required by some web framework or library is quite clean and well structured, while the code at the core, the business logic, is a mess.
Tell a junior to mid-level developer to write any significant piece of an application and a host of bugs and NullPointerExceptions will follow. Finding a library for everything, leaving developers only the simple task of gluing them together, seems like the only way of ensuring a minimum level of code quality.
Invented here and not invented here are equal signatures of crappy developers, in my opinion. I'll admit there's some skill involved, sometimes, in the engineering efforts of the NIH-minded developer. But I've also seen NIH developers insist on building things like CI servers, company wikis, and message queues - and none of them have been even tangentially better than the alternatives. Both mindsets speak to immaturity and lack of judgement, but man, I've seen way more of NIH than IH, and in a lot more business-destructive senses.
If all they are allowed to do is glue code together, they will never learn and will never get better at anything other than... gluing code together. Sounds like a bad idea.
If gluing code together is sufficient to solve the business problem (and 99.9% of the time it is) then it's probably the most efficient thing to be doing, and absolutely what they should be focusing on.
Maybe part of the issue is that I personally never work as full-time employee. I'm more like consultant, hired on per-project basis, so training less experienced developers is neither in my job description nor in my schedule.
I would tend to consider training up of any permanent staff on the team to be part of my job. Perhaps not on basic technology, but certainly on what we're doing. That way, when I leave, there's still knowledge in house.
This may seem to limit business opportunities, but I consider it part of a job well done.
On the other hand, I've just been at a place where "invented here syndrome" masked crappy developers to the detriment of the business. One tester spent 3 days tracking down a simple python dependency bug - but in the meantime he was productive because Travis is so easy to set up, and his Ruby tests worked.
Travis is definitely more appealing to me than maintaining my own Jenkins instance. But I can't help but feel that we would have caught this guy's crappiness much earlier if he had to set up his own Jenkins job instead of a travis.yml file.
I take a more selfish approach on this debate. I nearly always write my own code to do something, even when a library already exists... just so I can understand the problem well. Then, when I finally decide to use a specific library, I understand why they're doing something in a specific way more clearly.
Maybe not the most advisable route for commercial work, but one that keeps me satisfied.
You're going to need to understand how the system works regardless. So you can spend the time learning how the internals of someone else's library works, or you can learn how to solve the problem and write your own. In my experience, there is rarely a difference between the two. But when I write my own code, I know how to make it integrate with all my other code better.
I do the same thing all the time, and I'm trying to break this habit since it might be holding me back. I'm trying to realize that I can learn a lot from someone else's work: by diving in, learning the ins and outs, and even reading the library's source code, I can learn far more than by experimenting around trying to reproduce similar functionality myself. Reading code is far more tedious, though, than being able to creatively experiment with something. The creative learning is important; I'm just trying to apply it to something other than re-inventing a wheel that already works well.
Naturally, like all engineering, it's a balancing act with tradeoffs. We are responsible for all the code that runs in our application, but we are also responsible for shipping. It doesn't make sense in this day and age to write every line myself, but at the same time it is irresponsible to include buggy dependencies.
One of the first things I do when asked to solve a problem or implement feature X is to check around to see if there is an existing solution. It's very likely that I will find something that is at least close to what I want, and from there I evaluate the library and ask these questions:
1) how complex is the problem that this library solves? How difficult would it be to implement it in my app?
2) How close is this library to what I want? Is it easy to integrate or do I have to jump through a lot of hoops?
3) How bloated is the library? Do I have to include a bunch of stuff in the production build that I'm not using? (not always a problem depending on how the language imports libraries)
4) Is it well written? Is it actively maintained? Github stars and issue backlogs don't say everything but can provide good heuristics.
5) Do I expect that I will need to customize or optimize the underlying behavior in the future?
Ultimately, I think the biggest question here is "Is the library working for me, or am I working for it?" If the latter, maybe it's worth considering writing it yourself.
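For question 3, one cheap heuristic is to count a candidate's declared dependencies before committing to it. A minimal sketch using Python's standard `importlib.metadata`, assuming the candidate is already installed in a scratch virtualenv:

```python
from importlib import metadata

def direct_requirements(dist_name: str) -> list:
    """Return the declared (direct) requirements of an installed
    distribution, or [] if it isn't installed or declares none."""
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []

# Example: print how heavy each candidate is before choosing.
for candidate in ["requests", "urllib3"]:
    reqs = direct_requirements(candidate)
    print(f"{candidate}: {len(reqs)} declared requirement(s)")
```

This only counts direct requirements, not the full transitive closure, but it's often enough to flag the "35 dependencies for one feature" cases early.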
It also depends on the language and ecosystem for me, not to mention individual libraries. I'm working with javascript primarily right now, and there are a lot of npm libraries that are so small it makes more sense to just copy/paste the code into a utility (after doing due diligence on it, of course) and iterate on top of it. But it doesn't make sense to rewrite, say, jQuery or React.
In addition to requiring that the code you include be of high enough quality, it is also important to consider external dependencies. If a library requires 35 external dependencies, it is likely to be a maintenance burden in the future, so you'd better avoid it. Otherwise you may end up in the situation where library A upgrades to the version 2.0 API of library C while library B continues to use version 1.3, and now you suddenly have a compatibility problem. Other factors can be whether the library is hard to build, or is only for Windows while your product targets Windows, Mac, and Linux.
The real question is what will create the least maintenance burden in the future.
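The compatibility problem described above is the classic diamond dependency; with hypothetical library names, the unsatisfiable shape looks like this:

```text
your-app
├── library-a 2.0   (requires library-c >= 2.0)
└── library-b 1.3   (requires library-c < 2.0)
                     ^ no single version of library-c satisfies both
```

Every dependency you add is another edge that can eventually participate in a diamond like this.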
I think a related, contributing problem is an inability to create and describe abstractions.
Time and again, I see projects where the developers are unable to do anything beyond one line after another of calls into low-level library primitives, or maybe filling up a bunch of directories named after design patterns that plug into a framework.
Beyond that, though, there seems to be a widespread inability to recognize repetition and then refactor it out, or to make layers that separate domain/business logic from low-level details.
Also, the fairy tale of "self documenting code" is widespread, vs say "literate programming", and even if many of your in-house staff made their own libraries, the interfaces would be hard to understand. It's amazing how the act of trying to document an interface forces you to keep it clean and orthogonal.
Mixing developers with different "styles" (inline vs abstraction) is going to be frustrating for both extremes.
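As a small illustration of the kind of refactoring meant above (all names and data shapes are hypothetical, chosen for the example):

```python
# Before: domain logic interleaved with low-level string/number fiddling.
def report_before(rows):
    out = []
    for r in rows:
        name = r["name"].strip().title()
        total = round(sum(r["amounts"]), 2)
        out.append(f"{name}: {total}")
    return out

# After: the repeated low-level details are named and pushed down a layer,
# so the domain logic reads at a single level of abstraction.
def normalize_name(raw: str) -> str:
    return raw.strip().title()

def order_total(amounts) -> float:
    return round(sum(amounts), 2)

def report_after(rows):
    return [f"{normalize_name(r['name'])}: {order_total(r['amounts'])}"
            for r in rows]
```

The behavior is identical; what changes is that `normalize_name` and `order_total` now document an interface, which is exactly the act that forces it to stay clean and orthogonal.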
I tend to agree with the author in his opinion, but much like "The Big Ball of Mud", you need to understand the forces that cause these patterns to emerge.
Iron law of open source: you either use a solution with extensive community support, or you forge your own path, assuming total responsibility for maintenance and support of any code that comes from off the beaten track (whether developed in-house or not).
For code that falls outside the company's core competency, is it any surprise why companies are more inclined to use an external library?
Certainly a big problem of mine, or at least it was in the past. Just as the not-invented-here people eventually learn that there is good code out there, people like me eventually learn that you need to code some modules yourself - and sometimes the 400-line if-else-for crap is exactly what is needed to get a step further and realize that the customer actually wanted another feature to begin with.
I came to this realization when I had to make too many concessions to avoid reinventing the wheel. Between the huge list of dependencies and having to change a major part of how I do business, I decided to trade one "good enough" for another.
My first exposure to anything programming related was some web publishing courses I took at a community college when I was a teenager. The classes were intended for people with a graphic design, rather than developer bent, so the classes were mostly focused on HTML, CSS, and web design-relevant Photoshop tricks.
I credit those classes with sparking my interest in programming, but they didn't actually cover it to any extent. Instead there was this attitude that you could just find a Javascript file or Perl CGI script that would do what you need, and when necessary you could shoehorn it in, despite having little to no understanding of the involved programming languages. GitHub didn't exist yet, so there wasn't a centralized, trustworthy source of free Javascript modules; instead you had to find them on sketchy ad-riddled sites and hope they did what they said they did. I remember there were even sites that would sell scripts, a concept that feels totally alien to me now in 2016.
tl;dr Invented here syndrome is a big thing for web designers who aren't developers
The library will seem to withstand a quick and dirty test of performance and scalability, and then collapse when near-release production load puts pressure on it. Then you will rewrite it, and, to distribute the crunch, you'll keep doing so from then on.
Do you think this guy could pass the hazing rituals of bay area companies? Reverse this binary tree for me, that'll tell me if you're a good programmer.