Episode #5

Integration Testing Best Practices Part I

In this episode I talk about my personal best practices when doing acceptance testing in web development. Firstly, how to reduce brittleness by using stable test IDs to interface with your tests. Next I discuss why you should use non-JS test drivers where possible, for speed. Then I talk about the benefits of making your integration tests fail on ANY JS exception, even one only tangentially related to the system under test. Lastly I explain why I like to automatically capture screenshots of any test failures.

July 12, 2020

Show Notes

No notes available for this episode.

Screencast.txt

Transcribed by Rugo Obi

Tip 1: Use Stable IDs To Interface With Your Tests

Now that we’ve covered the basics in the last episode, I'm going to share more pro-tips I've built up over the years.

The first one will deal with reducing brittleness.

One of the main complaints people make about tests like these is that they are brittle.

This means the tests often fail even though the code remains correct. These false positives create work for you and erode your and your team’s trust in the test suite, with the eventual result that you run these tests less often and your code starts to degrade.

Let's make this concrete with an example.

Here I have a Password Reset spec. I'll talk briefly about how it works. We first create a user record.

Then we visit the reset_password_url in the browser and fill in the user_email form field with that user's email. Then we click_button on "Reset my Password".

Next, we check the last_email sent by the system, which would be the password_reset_email, and we extract the password_link, visit that password_link, enter in a new_password, confirm it, and update. Then eventually we log in using that new password and confirm that it works.
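
For anyone following along without the video, here's a rough sketch of what that spec looks like in Capybara and RSpec. The field name and button text come from the video, but helpers like last_email, extract_password_link, and log_in are illustrative stand-ins for whatever your app provides:

    RSpec.describe 'Password reset' do
      include ActiveJob::TestHelper # provides perform_enqueued_jobs

      it 'lets a user reset their password' do
        user = create(:user) # FactoryBot, assumed

        visit reset_password_url
        fill_in 'user_email', with: user.email

        # Run the mailer job inline so the email exists when we check for it.
        perform_enqueued_jobs do
          click_button 'Reset my Password'
        end

        password_link = extract_password_link(last_email) # hypothetical helpers

        visit password_link
        fill_in 'New password', with: 'new-secret-123'
        fill_in 'Confirm password', with: 'new-secret-123'
        click_button 'Update'

        log_in(user.email, 'new-secret-123') # hypothetical helper
        expect(page).to have_content 'Signed in successfully' # flash text will vary
      end
    end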

Something I’ve glossed over previously was this perform_enqueued_jobs bit.

Why do I do this?

To understand, let's take a look at the code that's responsible for resetting the password.

Here we go: UserEmail.reset_password_instructions. This basically just means what it says, that there's a user email for resetting password instructions. Then comes .deliver_later, and here's the important bit: this email gets sent asynchronously. It gets added to a queue and runs a little bit later.

Why do we do this? Because we don't want to leave the user waiting after they fill in the reset-password form. Sending an email can take a few seconds, and it can also fail. We don't want the request to fail, so we return quickly and send that email in a background job.
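
In code, that line looks something like the following sketch; the argument is illustrative:

    # Inside the action that handles the reset-password form: the mail is
    # queued and delivered later by a background job, so the request can
    # return to the user immediately.
    UserEmail.reset_password_instructions(user).deliver_later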

That's all well and good in production. However, within the test environment there's an issue. If the email is sent asynchronously, the check for last_email will execute milliseconds later, before the background job has had time to run, i.e. before the email gets sent. If the email hasn't been sent yet, last_email is going to be nil, and therefore the test will fail.

We avoid this by making that delivery synchronous in the test, so the email has actually been sent by the time we check for the last email. Let's run that test real quick to confirm that it passes. I’m doing that from Vim.
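
Sketched against the spec above, with last_email standing in for whatever helper returns the most recently delivered mail:

    # Without the perform_enqueued_jobs wrapper, the check races the job:
    click_button 'Reset my Password'
    last_email # => nil, because the mailer job hasn't been performed yet

    # Wrapping the click in perform_enqueued_jobs (ActiveJob::TestHelper),
    # as in the sketch above, runs the job inline, so last_email returns
    # the password-reset email.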

And you can see it's green, the test passes.

So what does this test correspond to within the web app? Here you can see a bog-standard Reset Password form. Somebody enters their email here and then clicks “Reset my Password”, nothing too special.

Now, let me get to the point about brittleness. The button here says “Reset my Password”. Imagine that I wanted to change that text to “Reset Password”, or that someone else on my team wanted to do that. Doing that should not change whether the test fails or passes.

So, let me actually go to the form field and change that text. Now it says Reset Password. Let me run that test again…. and RED, it fails.

If you read the error, it says: Unable to find button “Reset my Password” that is not disabled. What does it look like within the browser? Everything's working; all that has changed is the button text.

Now what I want to do is find a way for the test to continue passing, even when that text changes.

The way I do that is by adding an unchanging test interface. What I like to use is HTML ids. I never use these ids for CSS styling, and I also have a rule to never, or at least very rarely, change them, so they stay stable over a very long period of time.

I'm going to add one here. I'm going to call it something like reset_password_button. Next I'm going to modify the test to use that instead of the text.
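
In Capybara terms, the change looks roughly like this; the id name is simply the one I chose, and any stable identifier works:

    # Before: coupled to the visible button text, which copy changes can break.
    click_button 'Reset my Password'

    # After: the button carries id="reset_password_button" in the template,
    # and the test clicks it via that stable id instead.
    find('#reset_password_button').click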

So… Let me run that test again. And it works. Excellent. That’s going to be less brittle.

And just to prove the point, let me radically change that text, check it out in the browser, and then run the test again. And it continues to pass.

This is a much better way of doing things, far less brittle, far less likely to break over the long term.

Tip 2: Use Non-JS Test-Drivers Where Possible For Speed

The way I see it, integration tests come in two varieties: those that require JavaScript drivers for rendering, and those that do not.

Tests that can render with pure HTML, and HTML alone, are much, much faster, because they get to interface more directly with the back-end server.

By contrast, tests with JavaScript are usually far far slower.

The test you're looking at right now runs with rack-test, an HTML-only driver. Let's see what its speed is.

You can see here that the whole thing ran in 1.04 seconds, let's say one second, and that the files took, let's say, 0.7 seconds to load. So, 1 minus 0.7.

Now let's run that same test again with the JavaScript driver. I can do that by adding js: true as a tag here.
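
With Capybara and RSpec, that's just a metadata tag on the example, roughly:

    # Runs with Capybara.javascript_driver (e.g. Selenium Chrome)
    # instead of the default rack-test driver.
    it 'lets a user reset their password', js: true do
      # ... same steps as before ...
    end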

You can see this ran in 4.43 seconds, of which, let's say, 0.6s was load time.

Let's do a bit of arithmetic here. The rack-test run comes to about 0.3s of actual test time, and the JavaScript run comes to about 3.7s.

Now let's divide the two in the Vim expression register. And you can see that the JavaScript test takes 12 times as long as the rack-test test.

Therefore, it's a good idea to use these HTML-only-non-JavaScript tests as much as possible, because your test suite will run more quickly, and you'll get feedback more quickly.

Tip 3: Make Your Integration Tests Fail On Any JS Exception

One of the worst things about the modern web is how everyone's JavaScript seems to be broken. Even on the sites of major companies, a button often stops working, or a form often stops submitting.

And when you go to look at the JavaScript console, you see something like this.

Something I noticed when adding integration tests to my website is that there's a very quick, very cheap way to check for these kinds of errors.

And here I have about 32 lines of code, a little module I include into my integration tests, that basically causes a JavaScript integration test to fail whenever there are JavaScript errors.

Roughly how this works is, I define some constants with some errors I generally always want to ignore. For example, ones caused by PayPal.

Yes, they have a whole section of errors; that's how bad PayPal are. And there are some other random errors, like one from Rails and one from Google Analytics. I can’t remember what this one is, but it's certainly not mine.

Then I have this piece of code that gets executed after every single test, if that test is a JavaScript test.

What it does is get all the errors from the browser logs, then filter out the errors to ignore, based on the constants I defined up above and also on some custom ignores that might vary from test to test.

Then, if any errors remain, it checks whether each one is SEVERE, which is what an exception shows up as, for example. If it's severe, it causes the test to fail; otherwise it prints a warning to the screen.
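
In sketch form, the module looks something like this. It assumes the Selenium Chrome driver, which exposes the browser console through its logs API; the module name, constants, and ignore list are illustrative, not the exact 32 lines from the video:

    module FailOnJavascriptErrors
      # Substrings of errors we always want to ignore, e.g. third-party noise.
      IGNORED_ERRORS = [
        'paypal.com',        # PayPal's console errors, not ours
        'google-analytics',  # analytics noise
      ].freeze

      def fail_on_js_errors!(extra_ignores: [])
        # Selenium Chrome exposes the JavaScript console via the browser logs.
        entries = page.driver.browser.manage.logs.get(:browser)

        relevant = entries.reject do |entry|
          (IGNORED_ERRORS + extra_ignores).any? { |ignored| entry.message.include?(ignored) }
        end

        relevant.each do |entry|
          if entry.level == 'SEVERE'
            # An uncaught exception shows up as SEVERE: fail the test.
            raise "JavaScript error: #{entry.message}"
          else
            warn "JavaScript console warning: #{entry.message}"
          end
        end
      end
    end

    RSpec.configure do |config|
      config.include FailOnJavascriptErrors
      # Only JS-driven tests have a browser console to inspect.
      config.after(:each, js: true) { fail_on_js_errors! }
    end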

So let's look at a test that's currently working just fine.

I'm going to run this one real quick. And you can see it's all green. This means that a seller is able to apply with cookie notice tracking.

Now, let me add a JavaScript error.

This corresponds to the new-seller page, which is part of that test flow. I'm going to paste in a little bit of code here that throws an error: typical behavior of an SPA in 2020. I'm going to save, and now run the test again. Bear with me.

Now the test fails and you can see that JavaScript error printed there. Let's go over to the browser for a sec, and refresh that page.

This is the page where I added the JavaScript error. I'm going to open up the JavaScript console here. And you can see that error is printed.

This is an example of intentionally adding brittleness to an integration test. Normally you don't want brittleness, but in this case the brittleness is helpful. It helps me to ferret out JavaScript errors that might otherwise go unnoticed.

The idea is that I don't directly integration test everything, and for the stuff I don't directly test, the presence of a JavaScript exception is probably a strong indicator that one of these things is broken.

So by adding this automatic failure on JavaScript exceptions, I get a sort of smoke test for things that could be wrong with my code.

Now something you might notice is that this will only execute in the tests that use a JavaScript driver, and it won't execute in the normal ones, since they don't execute JavaScript at all.

However, you can, from time to time, run your entire test suite with a different default driver.

So instead of having rack-test here, I could use my JavaScript driver, Selenium Chrome, and then run my entire test suite.

If I do that once in a while, for example, before major releases, I'm able to ferret out all these JavaScript errors, just by running my integration tests.
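
One way to wire that up in Capybara; the environment variable is just an illustration:

    # Default to the fast HTML-only driver, but allow the whole suite to be
    # forced onto the JavaScript driver, e.g. before a major release:
    #   FORCE_JS=1 bundle exec rspec
    Capybara.default_driver =
      ENV['FORCE_JS'] ? :selenium_chrome_headless : :rack_test
    Capybara.javascript_driver = :selenium_chrome_headless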

Tip 4: Capture Screenshots Of Test Failures

Given that your JavaScript tests can take a rather long time to run, and that they usually run headless, without a browser you can look at, it's really useful to have a way to visualize what the error is, automatically, after each test failure.

Let me give you an example here. So here's the test where a seller of notes can apply to sell notes on my website.

And if you look down here towards the end of the test, I look for the h1.title, and it should have "Thanks for applying" as its text.

Basically after someone fills in the form to apply to be the seller, I redirect them to a page that says "Thanks for applying".

Now, in the right hand pane, you can see that particular h1. I'm going to make an error here on purpose by deleting that line, saving, and then running this test again.

We should expect it to fail, to be red. And, as expected, it is. Unable to find css “h1.title”.

Now, one way to debug this, the old-school way, is to laboriously fill out this form with all sorts of valid values, and then get to the next page and see what's going on.

However, I have a system in place where I automatically take a screenshot whenever there is a JavaScript test error.

Let's view that particular screenshot right now.

Those screenshots get saved to my /tmp/screenshots folder. So what we want to do next is to just view that.

Here we see a test_failure_creating_a_seller_spec. That's the name of the spec that just failed. So this seems about right.

Now, I'm going to run fc and wrap the previous command, feeding its output to imgcat, which will allow me to view the image on the command line.

I'm going to run that; it'll take a second. And here you go: you can see an image of the failing page, and you'll notice that there's a paragraph here but no h1. That alerts me to what has gone wrong.

So I can open Vim back up again by foregrounding it, go to the Thanks-for-applying page, and put back the h1 that was missing.

Now I run that test again. And you'll see that it will pass.

So how is this done? In some development environments, Laravel Dusk I believe, this is provided out of the box, whereas in others you have to add it yourself.

So, the first thing to be aware of is the window size of the browser you're using. Initially I had the window size set quite small, and therefore the screenshots wouldn't capture all the text on my screen. This ended up being confusing.
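
For example, with a headless Chrome driver registration you can pass a larger window size; the numbers here are arbitrary:

    # Register the JS driver with a window big enough that failure screenshots
    # capture the whole page rather than a small corner of it.
    Capybara.register_driver :selenium_chrome_headless do |app|
      options = Selenium::WebDriver::Chrome::Options.new
      options.add_argument('--headless')
      options.add_argument('--window-size=1400,1400')
      Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
    end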

Then you have some actual code for creating those screenshots. The first thing I do is clear the screenshots from previous runs, otherwise they might be confusing. And elsewhere, I create the screenshot itself. That happens after JavaScript tests, if there's a failure.

Then I name the screenshot according to the file path of that particular test, and I save it in the folder you saw previously, first ensuring that the folder exists; that's necessary on CircleCI, for example.
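
Put together, the hook looks roughly like this; the directory and naming scheme follow what I described, but the exact code will differ:

    require 'fileutils'

    SCREENSHOT_DIR = '/tmp/screenshots'.freeze

    RSpec.configure do |config|
      # Clear out screenshots from previous runs so they can't be confused
      # with failures from this run.
      config.before(:suite) { FileUtils.rm_rf(SCREENSHOT_DIR) }

      config.after(:each, js: true) do |example|
        next unless example.exception

        # Name the file after the failing spec's path, e.g.
        # test_failure_spec_features_creating_a_seller_spec_rb.png
        slug = example.metadata[:file_path].gsub(/[^\w]+/, '_')

        # The folder may not exist yet, e.g. on CircleCI.
        FileUtils.mkdir_p(SCREENSHOT_DIR)

        path = File.join(SCREENSHOT_DIR, "test_failure#{slug}.png")
        page.save_screenshot(path)
        puts "Saved failure screenshot to #{path}" # reminder in the test output
      end
    end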

Note that saving screenshots is not available in all drivers. It is however available on the JavaScript driver.

Lastly, I print a note to stdout saying the screenshot was taken, as a sort of reminder to myself or anyone else on the team.

That’s all I’ve got for now.

Tune in next week for more tips on integration testing.