In this episode I talk about my personal best practices when doing acceptance testing in web-development. Firstly, how to reduce brittleness by using stable test-IDs to interface with your tests. Next I discuss why you should use non-JS test-drivers where possible for speed. Then I talk about the benefits of making your integration tests fail on ANY JS exception -- even when only tangentially related to the system under test. Lastly I give you the reasoning behind why I like to automatically capture screenshots of any test failures.
July 12, 2020
Transcribed by Rugo Obi
Tip1: Use Stable IDs To Interface With Your Tests
Now that we’ve covered the basics in the last episode, I'm going to share more pro-tips I've built up over the years.
The first one will deal with reducing brittleness.
One of the main complaints people make about tests like these is that they are brittle.
This means the tests often fail even though the code remains correct. These false positives create work for you and erode both your own and your team’s trust in the test suite, with the eventual result that you run these tests less often and your code starts to degrade.
Let's make this concrete with an example.
Here I have a
Password Reset spec. I'll talk briefly about how it works. We first create a user record.
Then we visit the
reset_password_url in the browser, we fill in the
user_email form fields with that user's email. Then we click
Reset my Password.
Next, we check the
last_email sent by the system, which would be the
password_reset_email, and we extract the
password_link, visit that
password_link, enter in a
new_password, confirm it, and update. Then eventually we log in using that new password and confirm that it works.
Something I’ve glossed over previously was this
Why do I do this?
To understand, let's take a look at the code that's responsible for resetting the password.
Here we go,
UserEmail.reset_password_instructions. This basically just means what it says, that there's a user email for resetting password instructions...
.deliver_later - here's the important bit. This email gets sent asynchronously. It gets added to a queue and runs a little bit later.
Why do we do this? Because we don't want to leave the user waiting after they fill in the reset password form. Sending an email can take a few seconds, and it can also fail. We don't want the request to fail, therefore we return quickly and send that email in a background job.
That's all well and good in production. However, within the test environment, there's an issue. If the email is sent asynchronously, then the check for the last email gets executed milliseconds later, before the background job has had time to run, i.e. before the email gets sent. If the email hasn't been sent yet, the last email is going to be
nil, therefore the test will fail.
We avoid this by making that synchronous. Therefore the email will actually have been sent by the time we check for the last email. Let's run that test real quick to confirm that it passes. I’m doing that from Vim.
And you can see it's green, the test passes.
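To make the timing problem concrete, here is a minimal plain-Ruby sketch. It is a toy model, not the Rails or ActiveJob API: `ToyMailer`, `run_queued_jobs` and the `synchronous:` flag are hypothetical names invented for illustration. It only shows why the last email is nil when delivery is queued and present when delivery is synchronous. In a real Rails app the switch is usually made through the ActiveJob configuration (the `:inline` queue adapter is one option), though the episode does not say which mechanism is used.

```ruby
# Toy stand-in for a mailer with a background queue (hypothetical names,
# not the Rails or ActiveJob API).
class ToyMailer
  def initialize(synchronous:)
    @synchronous = synchronous
    @queue = []       # jobs waiting to run "a little bit later"
    @deliveries = []  # emails that have actually been sent
  end

  # Mimics deliver_later: send immediately in synchronous mode,
  # otherwise put the job on the queue.
  def deliver_later(email)
    job = -> { @deliveries << email }
    if @synchronous
      job.call
    else
      @queue << job
    end
  end

  # Mimics the background worker draining the queue.
  def run_queued_jobs
    @queue.shift.call until @queue.empty?
  end

  def last_email
    @deliveries.last
  end
end

async_mailer = ToyMailer.new(synchronous: false)
async_mailer.deliver_later("password_reset_email")
p async_mailer.last_email # prints nil, the job has not run yet

sync_mailer = ToyMailer.new(synchronous: true)
sync_mailer.deliver_later("password_reset_email")
p sync_mailer.last_email  # prints "password_reset_email"
```

The asynchronous mailer only has a last email once `run_queued_jobs` has been called, which is exactly the race the test would otherwise lose.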
So what does this test correspond to within the web app? Here you can see a bog-standard, Reset Password form. Somebody enters their email here and then clicks “Reset my Password”, nothing too special.
Now, let me get to the point about brittleness. The button text here is “Reset my Password”. Imagine that I, or someone else on my team, wanted to change that button to “Reset Password”. Doing that should not change whether the test passes or fails.
So, let me actually go to the form field and change that text. Now it says
Reset Password. Let me run that test again…. and RED, it fails.
If you read the error, it says,
Unable to find button “Reset my Password” that is not disabled. What does it look like within the browser? Everything's working, all that has changed is the button text.
Now what I want to do is find a way for the test to continue passing, even when that text changes.
The way I do that is by adding an unchanging test interface. What I like to use is HTML ids, and I never use these ids for CSS styling. I also have a rule never to change them, or at least to change them only very rarely, so they stay stable over a very long period of time.
I'm going to add one here. I'm going to call it something like
reset_password_button. Next I'm going to modify the test to use that instead of the text.
So… Let me run that test again. And it works. Excellent. That’s going to be less brittle.
And just to prove the point, let me radically change that text, check it out in the browser, and then run the test again. And it continues to pass.
This is a much better way of doing things, far less brittle, far less likely to break over the long term.
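The difference between the two lookups can be sketched in plain Ruby. This is a toy page model, not Capybara: `Button`, `find_by_text` and `find_by_id` are hypothetical helpers made up for the example. In an actual Capybara spec, `click_button` accepts an id as well as the visible text, so the change in the test is typically a one-line swap from the text to the id.

```ruby
# Toy page model (hypothetical, not Capybara): a button has an id and visible text.
Button = Struct.new(:id, :text)

# Brittle lookup: depends on the exact copy shown to the user.
def find_by_text(buttons, text)
  buttons.find { |b| b.text == text } or raise "Unable to find button #{text.inspect}"
end

# Stable lookup: depends only on an id that is never used for styling.
def find_by_id(buttons, id)
  buttons.find { |b| b.id == id } or raise "Unable to find button with id #{id.inspect}"
end

# The same page before and after someone edits the button copy.
page_before = [Button.new("reset_password_button", "Reset my Password")]
page_after  = [Button.new("reset_password_button", "Reset Password")]

find_by_text(page_before, "Reset my Password")   # found
find_by_id(page_after, "reset_password_button")  # still found after the copy change

begin
  find_by_text(page_after, "Reset my Password")
rescue RuntimeError => e
  puts e.message # the text-based lookup breaks as soon as the copy changes
end
```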
Tip2: Use Non-JS Test-drivers Where Possible For Speed
Tests that can render with pure HTML, HTML alone, are much, much faster. They get to interface more directly with the back-end server.
The test you're looking at right now runs with rack-test, an HTML-only driver. Let's see what its speed is.
You can see here that the whole thing ran in 1.04 seconds, let's say one second, and that the files took, let's say, 0.7 seconds to load. So, 1 minus 0.7.
js: true as a tag here.
You can see this ran in 4.43 seconds, of which, let's say, 0.6s was load time.
Let's do a bit of arithmetic here. The rack-test run comes to about 0.3s of actual test time and the JavaScript run comes to about 3.8s, so the JavaScript driver is more than ten times slower.
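For anyone who wants to redo that subtraction, here is a small sketch that takes the figures quoted above at face value (1.04s total with 0.7s load for rack-test, 4.43s total with 0.6s load for the JavaScript run). The constant and method names are made up for the example, and the results are only as precise as those rounded readings.

```ruby
# Timings quoted in the episode, in seconds: total run time and file load time.
RACK_TEST_RUN = { total: 1.04, load: 0.7 }.freeze
JS_DRIVER_RUN = { total: 4.43, load: 0.6 }.freeze

# Time spent actually running the test, excluding file loading.
def test_time(run)
  (run[:total] - run[:load]).round(2)
end

rack = test_time(RACK_TEST_RUN)
js   = test_time(JS_DRIVER_RUN)

puts "rack-test: #{rack}s"
puts "JS driver: #{js}s"
puts "slowdown:  #{(js / rack).round(1)}x"
```

Subtracting the load time matters because it is roughly the same for both runs, so comparing the raw totals would understate how much slower the JavaScript driver is.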
Tip3: Make Your Integration Tests Fail On Any JS Exception
Something I noticed when adding integration tests to my website is that there is a very quick, very cheap way to check for these kinds of errors.
Roughly how this works is, I define some constants with some errors I generally always want to ignore. For example, ones caused by PayPal.
Yes, they have a whole section of errors, that's how bad PayPal are, and some other random errors like one from Rails, and one from Google Analytics. I can’t remember what this is but it's certainly not mine.
What it does is it gets all the errors from the browser logs, it filters out errors to ignore based on the stuff I defined up above, and also some custom stuff that might vary from test to test.
Then, if the errors are present, essentially it checks if the error was
SEVERE, which for example is an exception. And if it's severe, it causes the test to fail. Otherwise it prints a warning to the screen.
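The filtering logic just described can be sketched in plain Ruby as follows. This is not the code from the episode: `LogEntry`, `ERRORS_TO_IGNORE`, `JavascriptError` and `check_js_errors` are hypothetical names, and the ignore patterns are placeholders. In a real suite the entries would come from the browser logs exposed by the driver, whose entries carry a level and a message in the same way.

```ruby
# Hypothetical stand-in for a browser log entry (a level plus a message).
LogEntry = Struct.new(:level, :message)

# Errors to always ignore. Illustrative patterns only, not the author's real list.
ERRORS_TO_IGNORE = [/paypal/i, /google-analytics/i].freeze

class JavascriptError < StandardError; end

# Drops ignored entries, raises on the first SEVERE one, warns about the rest.
# extra_ignores holds per-test patterns on top of the global list.
def check_js_errors(entries, extra_ignores: [])
  ignores = ERRORS_TO_IGNORE + extra_ignores
  relevant = entries.reject do |entry|
    ignores.any? { |pattern| entry.message.match?(pattern) }
  end

  relevant.each do |entry|
    raise JavascriptError, entry.message if entry.level == "SEVERE"

    warn "JS warning: #{entry.message}"
  end
  nil
end

logs = [
  LogEntry.new("SEVERE", "https://www.paypal.com/sdk.js: something broke"),
  LogEntry.new("WARNING", "Deprecated API used"),
  LogEntry.new("SEVERE", "Uncaught Error: boom")
]

begin
  check_js_errors(logs)
rescue JavascriptError => e
  puts "Test would fail: #{e.message}"
end
```

Running a check like this after every JavaScript-driven test is what turns an unrelated exception on the page into a red test.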
So let's look at a test that's currently working just fine.
I'm going to run this one real quick. And you can see it's all green. This means that a seller is able to apply with cookie notice tracking.
This corresponds to the new seller page which is part of that test flow. I'm going to paste in a little bit of code here,
Typical Behavior of an SPA in 2020. I'm going to save, and now run the test again. Bear with me.
However, there is a possibility of running your entire test suite, from time to time, with a different default driver.
Tip4: Capture Screenshots Of Test Failures
Let me give you an example here. So here's the test where a seller of notes can apply to sell notes on my website.
And if you look down here towards the end of the test. I look for the
h1.title, and it should have, "Thanks for applying" as its text.
Basically, after someone fills in the form to apply to be a seller, I redirect them to a page that says "Thanks for applying".
Now, in the right hand pane, you can see that particular
h1. I'm going to make an error here on purpose by deleting that line, saving, and then running this test again.
We should expect it to fail, to be red. And, as expected, it is.
Unable to find css “h1.title”.
Now, one way to debug this, the kind of old school way is to laboriously fill out this form with all sorts of valid fields, and then get to the next page and see what's going on.
Let's view that particular screenshot right now.
Those screenshots get saved to my
/tmp/screenshots folder. So what we want to do next is to just view that.
Here we see a
test_failure_creating_a_seller_spec. That's the name of the spec that just failed. So this seems about right.
Now, I'm going to run
fc, and create a wrapper around this. I’m going to feed the output of that command to
imgcat, which will allow me to view the image on the command line.
I'm going to run that, it'll take a second. And here you go. You can see an image of the failing page and you'll notice here that there's a paragraph here and no h1. And this will alert me to what has gone wrong.
So I can open Vim back up again by foregrounding it, and then I'll go to the
Thanks for applying page and see that the h1 was missing.
Now I run that test again. And you'll see that it will pass.
So how is this done? In some development environments, I believe in Laravel Dusk, this is provided out of the box. Whereas in others, you have to add it yourself.
So, the first thing to be aware of is the window size of the browser you're using. Initially I had set this window size quite small, and therefore I wouldn't capture all the text on my screen. This ended up being confusing.
Then I name the screenshot according to the file path of that particular test. I save it in the folder you saw previously, after ensuring that the folder exists. This is necessary on CircleCI, for example. Then I save that screenshot.
Lastly, I print that the screenshot was taken to
stdout as a sort of reminder to myself or anyone else on the team.
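A rough sketch of those steps in plain Ruby is shown below. It is an illustration under assumptions, not the code from the episode: `screenshot_path` and `capture_failure` are hypothetical helpers, the `.png` extension and the spec path are guesses, and the real screenshot call is replaced by a block. With Capybara the block would typically call `page.save_screenshot(path)`.

```ruby
require "fileutils"
require "tmpdir"

# Builds the screenshot path from the spec's file path, for example
# spec/features/creating_a_seller_spec.rb
#   -> <dir>/test_failure_creating_a_seller_spec.png
# (the .png extension is an assumption)
def screenshot_path(spec_file, dir)
  name = File.basename(spec_file, ".rb")
  File.join(dir, "test_failure_#{name}.png")
end

# Ensures the folder exists, lets the caller take the screenshot,
# then prints a reminder to stdout.
def capture_failure(spec_file, dir)
  FileUtils.mkdir_p(dir) # the folder may not exist yet, e.g. on CI
  path = screenshot_path(spec_file, dir)
  yield path if block_given? # the real screenshot call goes here
  puts "Screenshot saved to #{path}"
  path
end

# Demo in a temporary directory so nothing is left behind.
Dir.mktmpdir do |tmp|
  dir = File.join(tmp, "screenshots")
  capture_failure("spec/features/creating_a_seller_spec.rb", dir) do |path|
    File.write(path, "") # placeholder for the real screenshot
  end
end
```

In an RSpec suite, a helper like this would usually be called from an `after` hook, only when the example has failed.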
That’s all I’ve got for now.
Tune in next week for more tips on integration testing.