Episode #4

Continuous Integration Testing: Basics + What to Test

A key factor in reducing my coding time for Oxbridge Notes down to a few hours per month was adding comprehensive integration tests. Today I demonstrate how these tests work using the test-browser's NO HEADLESS mode, which lets you actually see the browser executing your tests. Next I show how to write such tests using tools like factories (touching on how I test tracking code). Following that, I show how to set up a continuous integration server (using Docker containers), and how to run your CI tests locally to verify they work before pushing them to the cloud. Lastly, I finish with a discussion about what we should test, given a limited testing budget.

July 05, 2020

Show Notes

No notes available for this episode.

Screencast.txt

Transcribed by Rugo Obi

Another major force multiplier in reducing my coding time for Oxbridge Notes down to a few hours a month was creating thorough integration tests.

To start off slow, let me run a sample test.

I'm running this in NO_HEADLESS mode, so you can actually see the browser execute the tests. It went very quickly, and you saw how the browser filled in all the form fields and navigated from page to page.

It’s essentially simulating a real user. Normally, you would run such a test in HEADLESS mode, so that it doesn't get in the way of your ability to code, and distract you.

Furthermore, you'd probably be running it on a continuous integration server, which I'll cover later in this episode.

You can see on the screen that a bunch of tests passed.

For example, Creating A Seller does not create an invalid user when an empty form is submitted.

And a Normal seller with cookie notice tracking can apply to itself, etc. etc.

If it seems to you like more tests have passed here than we actually saw in the browser, then that's correct.

That's because the HEADLESS variable is only for my JS tests, not for my HTML-only tests. I'll go more into that distinction later in the episode.
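The switch itself isn't shown in the episode. A minimal sketch of how it might work, assuming the suite uses Capybara with its built-in Selenium Chrome drivers (the helper name js_driver_name is made up, and the way NO_HEADLESS is read is an assumption):

```ruby
# Hypothetical helper: choose which browser driver the JS tests should use.
# :selenium_chrome and :selenium_chrome_headless are driver names that
# Capybara registers out of the box. NO_HEADLESS is the variable name used
# in the episode; how it is actually read there is an assumption.
def js_driver_name(env = ENV)
  env["NO_HEADLESS"] ? :selenium_chrome : :selenium_chrome_headless
end

# In a spec helper you might then write (requires the capybara gem):
#   Capybara.javascript_driver = js_driver_name
```

HTML-only tests would stay on Capybara's default :rack_test driver, which never opens a browser window, so the variable has no visible effect on them.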

Let's go over to the code and see what all of this corresponds to. I'm going to open up a particular spec; this create_seller_ spec. Yeah, there we go.

Now, inside this file, the first thing you'll see is this before block.

This gets executed before every single test in this file and all it does is use some factories to create some necessary database context.

For example, a certain commission_amount should exist in order to display, on that page, what commission each seller gets. There should also be some discipline_names, the disciplines that I’m selling, in order to populate the form where the seller chooses what discipline they'd like to sell on the website.

There's also this email which the prospective seller is going to type into that form. I share this throughout all the examples here.

Now let's get down to the test itself.

It’s js: true, which essentially means that we’re going to use a JavaScript driver as opposed to a purely HTML driver.

What actually happens in the test?

We visit some particular URL, the new_seller_url. This is a named URL which turns into the actual URL. We accept some advanced tracking; this is defined over here. It just means "click on I accept" in the JavaScript thing.

Then we fill_in_seller_form_fields. This is also a custom piece of code because I share this throughout multiple tests, and it gets modified here with some extra fields.

I fill in seller_description with the text SMART Study.

I attach a file to the seller_doc id, and that file is found in Rails.root plus some particular fixtures - the robots.txt. I use that particular file because it’s extremely small and the test will run more quickly.

I select the current year from seller_exam_year, and so on and so forth.

Once all that's done, I start running my sort of test assertions to check that everything worked.

The first thing, the main thing I expect is that a Seller has been created in the database, and I check here that the last seller’s email is equal to the email which I provided up above: "seller@example.com".

Then I check that the h1.title on the final page has the content thanks for applying, i.e. that the seller gets moved on to the next page.

And then I do some sort of JavaScript testing. I evaluate some custom JavaScript and I check that the trackingsMade object has googleConversion: true.

What I’m essentially doing here is checking that my tracking code works. I find this is very fragile, so therefore I test this in my integration tests.

All of this is really nice I find. It reads almost like English, especially the stuff for visiting URLs, filling in form fields, attaching files. It's one of the easiest things you could possibly code, and it takes very little time to do it once you've learned the basic syntax.

And don’t be fooled into thinking that this is a Ruby or Rails specific thing. I've used an almost equivalent library in PHP's Laravel; it’s called Dusk. And there’s something very similar in JavaScript and Python too. I believe this exists in every programming language out there.

Before I had these tests in place, working with my code base was a bit like playing whack-a-mole. I'd change some feature, and then randomly, or seemingly randomly, in some far away part of the code, something would break. I’d find out about that two days later and then be up in the middle of the night trying to fix it.

Then one winter about five or six years ago, when I was in Rio, I spent every single morning adding tests to my existing code base.

Let's see how many of them I added in total using ripgrep.

So I’m going to ripgrep for the start of the line, then a space or possibly multiple spaces (I don’t care how many), then the word ‘it’, and I’m just gonna run that little search within spec/requests.

That looks about right, you can see that all these tests are occurring. For example, it shows a checkout upsell. It adds bundle to cart and removes single products from cart.

Now let’s see how many of them I have in total.

I'm going to rerun that command with the --stats flag. And you can see there that I have 159 matches. 159 of these integration tests. I love having these.
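The same count can be approximated without ripgrep. Here's a rough sketch in plain Ruby, assuming the specs are .rb files under the directory you pass in (the constant and method names are made up for illustration):

```ruby
# Count RSpec examples by looking for lines that start with whitespace
# followed by the word "it", mirroring the search described above.
EXAMPLE_LINE = /^\s+it\b/

def count_examples(dir)
  Dir.glob(File.join(dir, "**", "*.rb")).sum do |path|
    File.foreach(path).count { |line| line.match?(EXAMPLE_LINE) }
  end
end

# Usage, from the project root:
#   puts count_examples("spec/requests")
```

Like the ripgrep search, this counts lines of text rather than parsed examples, so it is only an estimate.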

Now that I have them in place, I have so much confidence in my code. And I very rarely get surprise bugs in production for existing features. New stuff always has a few kinks to iron out. That’s a given in software development, I think. Basically, this allows me to sleep at night.

Before I ever deploy to production, I run this entire suite of tests and ensure that they are passing.

I do this on a continuous integration server, because I don't want to slow down my machine.

You can see here that the tests succeeded for the last couple of commits, and they failed here.

I’m very happy that I detected this failure at the testing stage rather than in production where the failures could have potentially caused havoc.

These tests get run once for every commit I push to GitHub. That way, I never miss one.

Let's look at how an individual test run works.

First you can see here that the whole thing, building the environment and running the tests, took 7 minutes 49 seconds. This was once in the region of 30 minutes when I had even fewer tests, but I’ve gotten better at writing fast tests in the interim. The tools have also improved.

What basically happens here is, we start off by spinning up an environment. The important part here is that there is a CircleCI Ruby container with the particular Ruby version I want, Node.js for running Webpack, and the browsers for that sort of integration testing stuff you saw, especially the JavaScript stuff.

Then I also pull in my other major dependencies: Postgres, Elasticsearch and Redis.

I use the real-life versions of these in my tests because I want them to have parity with my actual production code.

Some people would fake these things, but I think that’s bad practice for your integration tests.

Next, I check out the code. This essentially is pulling the code from GitHub into the testing container. There’s some debugging stuff here that’s not important.

The next important bit here is bundle install. Bundler is the Ruby package manager, and this is basically installing all the Ruby libraries I need. You’ll have the equivalent in every other language.

This can take a long time to download, so I cache these by taking a hash of the lock file. I do the same for installing and caching the JavaScript packages. Then I migrate my database, then I start running the tests. There's a couple of JavaScript unit tests, that’s not important here. The most important bit is this: rspec test.
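The idea behind keying a cache on the lock file can be sketched in a few lines of Ruby. This is an illustration of the principle, not CircleCI's own mechanism; the prefix and the key layout are assumptions:

```ruby
require "digest"

# Build a cache key that changes only when the lock file's contents change.
# The prefix and the key layout are illustrative, not CircleCI's own format.
def cache_key(prefix, lockfile_path)
  "#{prefix}-#{Digest::SHA256.file(lockfile_path).hexdigest}"
end

# Usage:
#   cache_key("gems-v1", "Gemfile.lock")
```

As long as the lock file is byte-for-byte identical, the key is identical, so the previously installed libraries can be reused instead of downloaded again.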

Here's the very bottom. You can see all the tests that get run. There’s 316 examples. That is more than you would have seen previously because I'm also including my unit tests here. And you can see all these tests passed. That’s very cool.

For completeness' sake, I want to show you how I configured that continuous integration server.

So I'm going to open up my CircleCI config.yml file. And you can see here that I include this Docker image that you saw previously, with Ruby, Node and the browsers.

Interestingly here there are some environment variables. For example, I tell it where Postgres is and I tell it what environment to use in my Rails app. The test environment rather than the production environment for example.

Then I pull in the Postgres container and set the Postgres user and database name and password. That has to be synced of course, with my database configuration file in Ruby-land.

As you can see, it’s done over here. Then I pull in Elasticsearch and also Redis. Next, I check out the code. That's built into CircleCI, so I don’t have to define that.

Then I run a custom command to print the Chrome driver version. Then I install my package manager. This is to set up the cache with the checksum of my Gemfile.lock file. Then I install all of my gems if they've not been cached, and ditto for my JavaScript dependencies with yarn.

Then I wait for the database to be ready. I migrate that database, and then I start running my JavaScript unit tests. And finally, my Ruby or rspec tests.
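The "wait for the database" step usually boils down to polling a port until something answers. The tool used in the episode isn't named in the transcript; as a stand-in, here is a minimal Ruby sketch of the polling idea (the method name and the default timings are made up):

```ruby
require "socket"

# Poll a TCP port until it accepts connections or the timeout runs out.
# Returns true once a connection succeeds, false otherwise.
def wait_for_port(host, port, timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      Socket.tcp(host, port, connect_timeout: 1) { return true }
    rescue SystemCallError, SocketError, IOError
      # Not accepting connections yet; fall through and retry.
    end
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Usage (5432 is the default Postgres port):
#   abort "database never came up" unless wait_for_port("localhost", 5432)
```

An open port only shows that the server process is listening, not that it is ready for queries, so treat this as a first gate before running migrations.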

That's it.

Creating that config file can be a lot of hassle if you're not familiar with Docker or with the .yml file format, which is quite strict. And the development process would be very annoying if you had to constantly push to that remote continuous integration server to check whether the file was valid. But luckily, you can validate it locally with this command, to check if you happen to have any syntax errors there.
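For the pure syntax part, a minimal sketch with Ruby's standard YAML library could look like this. It is not the CircleCI tooling and it won't catch schema mistakes; yaml_syntax_error is a made-up name:

```ruby
require "yaml"

# Return nil when the text parses as YAML, or the parser's error message.
# This only checks syntax; it knows nothing about CircleCI's config schema.
def yaml_syntax_error(text)
  YAML.parse(text)
  nil
rescue Psych::SyntaxError => e
  e.message
end

# Usage:
#   error = yaml_syntax_error(File.read(".circleci/config.yml"))
#   puts(error || "syntax OK")
```

The point is just that a strict format is cheap to check on your own machine before anything gets pushed.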

The second tip is that you can also run all those tests locally on a localized Docker instance. This assumes that you have Docker already running in your machine.

Be warned, this is often much more time consuming than running the tests on the remote server. But the plus side is, you don't have to make a ton of commits with messages like "try fixing CircleCI - Attempt 8", "Attempt 9".

Let me end this episode by talking about what to test.

Tests have cost. There's an initial upfront payment, in that you have to first write that test, and then down the line there’s a bunch of maintenance costs.

While more tests will lead to, or will usually lead to, more robust code, the price you pay for that extra modicum of reliability may not be worth it in your case.

I'm going to share my personal rules about what to test.

The first is anything that's going to cause you economic harm. For example, I want to make sure that all my revenue-generating flows keep working. I don't want to lose a day's worth of revenue. Most businesses don’t either.

And another thing that’s a little controversial here is, I literally test against the PayPal window that gets popped up. This is controversial in that I'm testing someone else’s code, a third party vendor.

But the advantage is that I get to issue refunds against the sandbox PayPal API and transfer payments to sellers and all that kind of thing. And it gives me a hell of a lot of confidence that everything is working as it should.

The next thing that’s very important to test is anything that can have legal consequences if you get wrong.

In the case of most websites, you certainly want to be testing that your unsubscribe links work, lest you fall foul of the CAN-SPAM Act.

If you’re a software vendor, you're gonna wanna make sure that your uptime is whatever you’ve promised in your contracts.

And if you’re selling to consumers, like I am, you’re gonna wanna check that your VAT is calculated correctly because you can do some serious jail time for mistakes in that department.

The next important thing for me to test is public landing pages. In the modern internet, it costs a lot of money to acquire users, to get them to your website for that very first time.

It could be 30 bucks, it could be 100 bucks, it could be more, therefore you want to leave a very good first impression.

In my case, I do things like check that my various static pages work, check that all my kind of sample pages work, my product pages, that kind of thing.

The next thing you want to test is any heavily-used core features. This is basically an application of the 80/20 principle. For example, you want to check that your login continues working, all that sort of stuff. This is going to depend a lot on your particular use case.

The next thing to test is anything that, if broken, could cause irreversible damage to the business.

This won't necessarily be a browser test, but it also might be. In the case of my code, I could tell you one thing that can cause irreversible damage for sure. I'm going to give you an example here:

So authors upload these subjects that they sell on my website. For example, here's Chinese Law on sale. Let’s go into the management page here, you can see that there are a bunch of files which the author uploads and they can add more files here.

This is the corresponding product on sale in the frontend, random customers will buy that.

Now, I also allow sellers on the website to delete any product they’ve uploaded. But you can see a problem here. What if some customer, like this one here, has already bought those Chinese Law notes but the seller has since deleted them? I still need to honor that contract to the purchaser of those notes.

Therefore I can delete everything else that that seller has uploaded (if you look here, all these other subjects that they have), but I can’t delete those Chinese Law notes. I have to keep them present in order to be able to honor those downloads. Therefore I test against this particular error case within my code.
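The rule being protected can be sketched with a tiny in-memory model. Everything here (the Product struct, the method name, the second title) is hypothetical and only illustrates the behaviour described, not the site's real code:

```ruby
# Hypothetical in-memory model of the rule: a seller may delete their
# products, but anything a customer has already bought must be kept.
Product = Struct.new(:title, :purchase_count) do
  def purchased?
    purchase_count > 0
  end
end

# Returns the products that must survive a "delete everything" request.
def delete_all_products(products)
  products.select(&:purchased?)
end

products = [
  Product.new("Chinese Law", 1),   # already bought by a customer
  Product.new("Contract Law", 0),  # never bought (made-up title)
]
remaining = delete_all_products(products)
puts remaining.map(&:title).inspect # prints ["Chinese Law"]
```

A test for this case asserts that the purchased product is still downloadable after the seller's delete request has gone through.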

The final thing I find important to test is whenever there are systems that go across language boundaries.

This is true integration testing. For example, I actually test that webpack works. I check that polyfills work.

I found in the past when updating webpack that sometimes, due to me not understanding the massive changes in configuration they have between updates, things like the polyfills that I use to support IE 11 and other weird browsers stopped working.

Therefore I do little checks to ensure that the polyfill continues being present within my generated code. I also check for things like auto-prefixing of CSS.
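A check like that can be as simple as searching the generated files for strings that should be there. Here's a rough sketch; the sample bundle contents and the markers (such as "core-js" and the vendor prefixes) are assumptions for illustration, not the actual checks used on the site:

```ruby
# Return the markers that are absent from the generated asset source.
def missing_markers(source, markers)
  markers.reject { |marker| source.include?(marker) }
end

# Made-up stand-ins for real build output and for the strings worth checking.
js_bundle  = "/* core-js */ Array.prototype.includes = function () {};"
css_bundle = ".box{display:-webkit-box;display:flex}"

puts missing_markers(js_bundle, ["core-js"]).inspect           # prints []
puts missing_markers(css_bundle, ["-webkit-", "-ms-"]).inspect # prints ["-ms-"]
```

In a real suite the sources would be read from the compiled output directory, and the test would fail whenever the list of missing markers is not empty.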

Alright, that's about it. I’m going to continue this next week with some pro-tips about how to do integration testing well.

See you next time.