The Hidden Costs of Software Dependencies

Transcribed by Rugo Obi

When I think back about the things that made the biggest difference to my productivity as a programmer over the last 10 years, one of them is reducing my dependency on dependencies.

There is nothing wrong with software dependencies, but it's too easy to have too many of them.

It's too easy to call npm install one more time. I want to recreate a worst-of from my previous dependency graphs and show you the ways I've been bitten by this in the past.

Have you ever worked with another programmer, opened up their dependencies file (be that the package.json in JavaScript, the requirements.txt file in Python, or whatever), and seen an ungodly number of dependencies?

For example, if you look at the bottom right corner, there are 70 lines just in the regular dependencies.

Here there are a ton of React components, indicating perhaps that this programmer doesn't really know how to use React, or at least how to do anything complicated in it, and just glues these libraries together like LEGO.

This is a bad way of programming for a plethora of reasons I'll get into later. If that sounds a bit harsh it's because I have programmed like this in the past and I know exactly how it will come around to bite you.

An over-reliance on libraries makes your knowledge weak and short-lived.

All you know is how to call the given library but you don't know the fundamentals of the programming platform in question. And these fundamentals tend to stick around a lot longer and be more generally useful.

jQuery has been around since 2006, and until about two or three years ago it was immensely popular in the JavaScript world. This came at the expense of people actually learning how to write raw JavaScript.

For example, the most popular use-case of jQuery was to grab certain elements. Here are all the p elements, here is the container element.

By comparison, I can create my own very simple function in pure JavaScript using _ instead of $.

```javascript
// Returns every element in the document that matches the CSS selector.
function _(selector) {
  return document.querySelectorAll(selector)
}
```

This function simply passes the selector to document.querySelectorAll, which the browser provides.

Let's see this in action.

Call with p and there you go. And all the container elements.

Here is another damning example from my past of a completely unnecessary dependency.

I called in this library slack-notifier in order to send messages over a Slack webhook. You can see underneath the API that the gem provided.

Here is how you can implement it without any gem.

That, for me, is 100x simpler and also more instructive.
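The gem-free version shown on screen is not captured in this transcript. As a minimal sketch of the idea, assuming a Slack incoming webhook that accepts a JSON body with a "text" field, something like the following would do using only Ruby's standard library. The method names and the webhook URL are hypothetical.

```ruby
require "net/http"
require "json"
require "uri"

# Builds the POST request for an incoming webhook (nothing is sent yet).
# Returns the parsed URI together with the request object.
def build_slack_request(webhook_url, text)
  uri = URI.parse(webhook_url)
  request = Net::HTTP::Post.new(uri.request_uri, { "Content-Type" => "application/json" })
  request.body = JSON.generate(text: text)
  [uri, request]
end

# Sends the message over the network and returns the HTTP response.
def notify_slack(webhook_url, text)
  uri, request = build_slack_request(webhook_url, text)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end

# Usage (hypothetical URL):
# notify_slack("https://hooks.example.com/services/placeholder", "New sale!")
```

Keeping the request building separate from the sending is a design choice made here so the payload can be checked without touching the network.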

This approach can be taken for more complicated domain concepts.

For example, I had a state machine in my seller object, and I included that state machine via this gem in my Gemfile. (A gem is a Ruby library.) It seems quite advanced: it has this rich DSL for defining events, transitions, and after-transition callbacks. So that’s the use-case of the gem.

Despite how complex the underlying implementation was for the full state machine gem, I was able to create everything I needed for my codebase in about 78 lines.

The only thing that changed in the calling code was the inclusion of this mixin here. Otherwise, it is exactly the same.

This was quite educational because I got to learn how to implement this kind of really interesting DSL and think about the problem space of state machines.
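The 78-line implementation itself is not reproduced in this transcript. To give a feel for how small such a mixin can be, here is a rough sketch under assumptions: the module name, the `event` DSL, and the seller states and callbacks are all hypothetical, not the code from the screencast or the API of the state machine gem.

```ruby
# Hypothetical mixin: a tiny class-level DSL for states and events.
module SimpleStateMachine
  class InvalidTransition < StandardError; end

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Sets (or reads back) the state new objects start in.
    def initial_state(name = nil)
      @initial_state = name if name
      @initial_state
    end

    # Defines an instance method `name!` that moves from one of `from`
    # to `to`, then calls the optional `after` callback method.
    def event(name, from:, to:, after: nil)
      define_method("#{name}!") do
        unless Array(from).include?(state)
          raise InvalidTransition, "cannot #{name} from #{state}"
        end
        @state = to
        send(after) if after
        self
      end
    end
  end

  # Current state, defaulting to the class's initial state.
  def state
    @state ||= self.class.initial_state
  end
end

# Hypothetical model using the mixin.
class Seller
  include SimpleStateMachine

  attr_reader :notifications

  initial_state :pending
  event :approve, from: :pending, to: :approved, after: :notify_approved
  event :suspend, from: [:pending, :approved], to: :suspended

  def notify_approved
    (@notifications ||= []) << "approved"
  end
end
```

With this in place, `Seller.new.approve!` moves the object from `:pending` to `:approved` and runs the callback, while calling `approve!` on a suspended seller raises `SimpleStateMachine::InvalidTransition`.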

If we go into the state machine gem, you can see that there is an absolutely huge amount of code here. One example file alone is 270 lines. Can you imagine the difficulty of modifying that for my needs, compared to modifying my 70-liner or whatever it was over here?

I'd much prefer to be in that situation.

Some dependencies are so large that people can and do write books about them. When you've included them, or more than one or two of them, in your codebase, you massively increase the amount of RAM (mental RAM) somebody needs to understand your codebase. And your job descriptions are going to turn into buzzword bingo.

So for example, here I have Rails, jQuery-rails, and jQuery UI rails. Meaning that somebody working in this codebase had to know jQuery at some point.

Similarly, I brought in the Neat CSS framework and the Bourbon library that provided some helpers.

God knows why I brought it in through Ruby instead of directly through CSS, but that added more stuff that somebody needed to know.

In fact, previously I had a different CSS framework, the Foundation framework.

Then I have a test framework. It's actually very, very good, but I don't think it offers anything over the Rails built-in. Again, I read a whole book to understand it.

And then Spree, another huge library. It was very useful at the time but it required a lot of knowledge about that library in order to be productive.

This ends up sucking for onboarding new programmers because they can’t be productive until they know this stuff. And when it's a legacy codebase, they don't want to learn it because often you’re using legacy tools. For example, imagine I still had jQuery in here. What JavaScript programmer would want to use that today?

When you install a library or plugin, you never expect dependency hell. Yet often as the weeks turn to months to years, this is what you end up having to deal with.

In my case I had to deal with this nine times. All of these gems, just in Ruby land, caused me dependency hell at some point.

You can see that I dealt with it by forking the repo and then deploying from my fork instead of the original. I have since deleted all of these dependencies as part of my crusade against having too many of them.

So in order to demonstrate what was going wrong, I'm going to use this gem, payday, as an example. I have a minimal Gemfile here that reproduces the original problem.

So if I try to execute this with Bundler (that's the dependency manager in Ruby), it's going to grab those libraries and then try to find compatible sub-dependencies. You can see from the readout there, the bit in the Gemfile column, that payday was resolved to 1.0.0 beta 6, which depends on money 3.6, whereas spree_core depends on money 5.1.1.

There is no resolving that.
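To see why no resolution exists, it helps to check the two requirements against each other. The sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The exact constraint strings and the list of candidate versions are assumptions standing in for what the real gemspecs declared.

```ruby
require "rubygems" # provides Gem::Version and Gem::Requirement

# Hypothetical constraints, standing in for the two gemspecs.
payday_wants = Gem::Requirement.new("~> 3.6")    # >= 3.6 and < 4.0
spree_wants  = Gem::Requirement.new("~> 5.1.1")  # >= 5.1.1 and < 5.2

# Hypothetical list of published money versions.
candidates = %w[3.6.0 3.7.1 5.1.1 5.1.9].map { |v| Gem::Version.new(v) }

# A version is usable only if it satisfies every requirement at once.
usable = candidates.select do |version|
  payday_wants.satisfied_by?(version) && spree_wants.satisfied_by?(version)
end

puts usable.empty? ? "no compatible money version" : usable.join(", ")
```

Because the two ranges never overlap, `usable` comes back empty, which is the situation a dependency resolver reports as a conflict.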

The way I dealt with this was to fork the repo and create my own gemspec with the correct version numbers.

Of course, that isn't enough. You have to look at the differences between money 5.1 and 3.6, and then modify all the code in this payday gem that uses the old version of money so that it uses the newer version.

Libraries and plugins tend to beget more libraries and plugins.

For example, in the early stages of Oxbridge Notes, I used the Spree gem which is basically an open source e-commerce site. And they also fostered their own plugin ecosystem where I could get a sitemap generator as well as authentication, social logins, PayPal integrations etc.

So I jumped on board, thinking "oh I can just slot this stuff in and it would be really easy".

Of course, there is no huge time saving with any of these, because there is already a sitemap generator gem in the Ruby world, and I could also do it myself. As for spree-auth-devise, there is also the devise gem, or you could just manage authentication yourself in 30 lines of Rails.

For social OAuth, probably 30 lines. For PayPal Express, you could just use the PayPal API.

In the end, what happened was that it became very difficult for me to remove Spree because I became dependent on all of these and they had to be extricated.

Compare that to a situation where I’d used just Spree itself and had custom code for the rest. Then the migration away from Spree when I outgrew it would have been a lot easier.

This whole problem tends to get amplified in the JavaScript world. I don't really know why, but it does.

Here I have just the dev-dependencies from a codebase of someone I know, and I've sorted all the dependencies by the overarching ecosystem of plugin.

Here we have semantic-release. Here we have a bunch of Babel stuff, nearly 10 plugins. Here we have some ESLint stuff, an absolute ton of them, etc. There's Jest, there’s stylelint, Serverless, TypeScript.

Now can you imagine how difficult it is going to be to maintain all these and how much busy work that is going to create throughout the lifespan of your codebase?

I really question whether it is worth having so many of these plugins to hyper-optimize something, instead of having just one or two for each, or just the core default that you hope will work okay...

How do we get there?

Part of the reason is that once you have one library in place, it eases the road for the next one.

For example I remember having jQuery in my website and then wanting some sort of auto-complete solution both on the front-end and back-end.

When I compared the options open to me, the ones that relied on jQuery and jQuery UI looked a lot better out of the box and so seemed to save me work. Therefore I went with them. But at the same time this left me more tightly coupled to jQuery, because now I had multiple reasons to keep using it.

A nightmare situation with regards to dependencies is when they’re abandoned completely or deprecated.

One case where that happened to me was with CoffeeScript Rails.

So CoffeeScript was this language that transpiled into JavaScript that was popular in the Ruby world a couple of years ago.

Here is an example of what it looked like.

However, eventually it lost favor, so I had to rewrite the code as regular JavaScript.

Thankfully there were some converters. I still feel annoyed about the effort I put into writing CoffeeScript and learning how it worked.

Returning to the Gemfile here, there was another gem, Paranoia. I never actually wanted Paranoia; I was forced to take it in because I initially used Spree, Paranoia was a dependency of Spree, and I was scared of removing it.

So here's basically what it does.

If you start off with a product in the database and you grab it, then call destroy (which normally deletes a record from the database in Rails), this time it just sets the deleted_at column of that particular product rather than removing the row.

The idea is to soft delete items so they stay around for database referential integrity or accounting purposes.
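The soft-delete behaviour described above can be sketched without any gem. This is a plain-Ruby illustration with an in-memory list standing in for the database table. The class and method names other than `destroy` and `deleted_at` are hypothetical, and this is not how Paranoia itself is implemented.

```ruby
# Hypothetical in-memory stand-in for a database-backed model.
class Product
  attr_reader :name, :deleted_at

  @records = []

  class << self
    # Adds a record to the in-memory "table".
    def create(name)
      product = new(name)
      @records << product
      product
    end

    # Default lookup hides soft-deleted records.
    def all
      @records.reject(&:deleted?)
    end

    # Escape hatch that includes soft-deleted records too.
    def with_deleted
      @records.dup
    end
  end

  def initialize(name)
    @name = name
    @deleted_at = nil
  end

  # Marks the record as deleted instead of removing it.
  def destroy
    @deleted_at ||= Time.now
    self
  end

  def deleted?
    !@deleted_at.nil?
  end
end
```

After `destroy`, the record still exists and carries a timestamp, so anything that references it keeps working, but it no longer shows up in the default lookup.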

Of course, when I visit its GitHub page, it's not recommended anymore.

The last one I wanted to show you here was Paperclip. This got removed because Rails offered similar functionality eventually.

So it was for uploading binary assets like images and PDFs. Because my website deals with digital documents I was very heavily coupled to this particular gem and it was extremely painful to extricate.

There’s a tension between the maintainers of libraries and those using them, in the sense that the maintainers want to change things over time and do what in their opinion will improve them, whereas the people who have had those libraries inside their codebase for the last 10 years value stability a lot more.

So for example, I use Rails. I'm happy with it despite it being a large library, the benefits outweigh the costs. But my God, some of the upgrades were painful.

For example, here is one of the many many things that changed. You can see at the bottom here there is this conditions key mapping to some where clause SQL and they changed this in Rails 3 I believe, to have this DSL that gets executed, this weird DSL that gets executed in a kind of deferred way using something like a closure.
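The difference between a plain option and one that is "executed in a deferred way" can be shown in plain Ruby, without Rails. In this sketch a counter stands in for anything that changes over time, such as the clock. The variable names and the condition string are hypothetical, and this only illustrates closures in general, not how Rails implements its DSL.

```ruby
# A counter stands in for anything that changes over time.
counter = 0
tick = -> { counter += 1 }

# Eager: the string is built once, at the moment the option is defined.
eager_conditions = "version > #{tick.call}"

# Deferred: the lambda body runs every time the condition is requested.
deferred_conditions = -> { "version > #{tick.call}" }

first  = deferred_conditions.call
second = deferred_conditions.call

puts eager_conditions # stays fixed at its definition-time value
puts first
puts second           # differs from first because the body ran again
```

The eager string keeps whatever value it had when the line ran, while each call to the lambda re-evaluates its body, which is what makes the closure form behave differently from a static option.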

Now here is an example of one of the commits where I had to make this change. You can see there are a bunch of things as well as has_many that were modified. For example the scope method, the default_scope method and so on.

Now changing syntax is, you know, not that hard, but what is difficult is avoiding bugs, especially when (as was the case with me) not all of these methods had unit tests.

Another issue I’ve encountered is that when you include a gem or a library, it's sort of like adding a big red button within your code base. And sooner or later you or someone else working within that codebase will be tempted to press that button or to use some function call that perhaps they shouldn't.

For example, over in Oxbridge Notes I have this class called CommissionAmount. This corresponds to a database record that contains the commission I charge in a particular geographic location at a particular point in time.

I validate the location of this, which I call store because I have multiple stores in different countries, and ensure that the store attribute is present and that it’s included within a list of allowed values.

Here we have the bit from the country gem: ISO3166::Country.all.map(&:alpha2).map(&:downcase)

This essentially says that the country code should be one of any country code in the world.

Unfortunately I'm only active in six different countries. The UK, US, Australia, Canada, New Zealand and Ireland. Yet, I'm allowing any country in the world here.

In fact, this led to some problems in production because I wasn't expecting there to be commission amounts for these random countries.

What I should have had here is a reference to my own homegrown config bit.
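A homegrown config of that kind might look something like this. It is a sketch under assumptions: the constant name, the helper method, and the exact two-letter codes (ISO 3166 alpha-2, downcased to match the gem call above) are hypothetical and not taken from the Oxbridge Notes codebase.

```ruby
# Hypothetical config: the only stores the business is active in.
# Codes are assumed to be downcased ISO 3166 alpha-2 values.
ACTIVE_STORES = %w[gb us au ca nz ie].freeze

# Plain-Ruby stand-in for the model validation: the store must be
# present and must be one of the stores actually in use.
def valid_store?(store)
  !store.nil? && ACTIVE_STORES.include?(store)
end

puts valid_store?("ie") # an active store
puts valid_store?("fr") # a real country, but not one of the stores
```

In a Rails model the same list could be passed to the built-in inclusion validation, for example `validates :store, presence: true, inclusion: { in: ACTIVE_STORES }`, so that a record for an unexpected country is rejected before it reaches the database.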

You might very well be curious about how many libraries I have in Oxbridge Notes, a 10 year-old codebase.

The answer is, more than I'd like to have.

Part of this is due to an accumulation of cruft that was never really economical to remove.

So let's first look at the Gemfile. I'm going to do a regex for start of line, then either a comment or white space (as many as possible, including zero), and then the word gem.

So if I hit that and then I press n, you can see on the right-hand side, 52. So 52 gems.
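The same count can be reproduced outside the editor. This is a sketch of the regex as described (start of line, any run of comment characters or white space, then the word gem). The method name and the sample Gemfile contents are hypothetical.

```ruby
# Counts lines that declare a gem, including commented-out ones,
# mirroring the search described above.
def count_gem_lines(gemfile_text)
  gemfile_text.each_line.count { |line| line.match?(/\A[#\s]*gem\b/) }
end

# Hypothetical Gemfile contents for illustration.
sample = <<~GEMFILE
  source "https://rubygems.org"

  gem "rails"
  # gem "paperclip"
    gem "pg"
  group :test do
    gem "minitest"
  end
  # gems below are for deployment
GEMFILE

puts count_gem_lines(sample)
# For a real project: count_gem_lines(File.read("Gemfile"))
```

Note that, like the editor search, this counts commented-out gem lines too. The `\b` word boundary keeps words such as "gems" or "gemspec" from being counted.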

For reference, a fresh install of Rails has about 20.

Let's do the same thing for the JavaScript libraries.

Here you can see in the bottom right corner that there are 16 regular dependencies and then there are 15 dev dependencies. That’s 31 in total, up from perhaps 5 in a fresh install of Rails.

That's all I have to say about dependencies for today.

See you next time.

Semicolon & Sons

Timeless Programming Paradigms

Episode #4
