This post is a quick series of tips and lessons learned from spending three excruciating weeks bug bashing on a software project. While I sincerely hope that you never have to do bug bashing, I think you might find these tips useful when working on bugs. So stay a while and listen …
What is your program supposed to do when attempting X? What are its required inputs? Its outputs? Are there any validation constraints? Can you do this and that logged in as well as logged out? […]
If your project is weak on the specs side of things, you might want to forgo testing for now. Simply put, testing relies on you finding crashes, yes, but unintended behavior as well. How can you assert that your project is behaving adequately without a spec?
If you continue with testing when you don’t have a spec, you are at great risk of making everyone on your team lose time:
- You will be unsure of what happens during testing.
- This means that you will not open issues where you should have, and you may open issues where you shouldn’t.
- Your Project Manager will have to establish priorities based on those erroneous issues, or worse, ship a release with a bug that is clearly in the system but goes unnoticed.
- Your colleagues will pick up tasks and waste time trying to understand the situation, only to close the issue as “won’t fix.” Or worse, the “bug” will be “fixed” and correct behavior will change, with project members, or worse, users, noticing only when it’s too late.
Whenever you are in a situation where it’s unclear what your program should do, stop everything and sit with key people: domain experts, Product Owner, and other colleagues with useful context, and start drafting a spec. It will save you time in the long run. Only then can you start testing.
Unless your project is severely deficient in coverage, you should have a decent test suite protecting (at least) your happy path. You can do a quick pass for good measure, but effective bug testing time should then be spent attempting something else, like covering edge cases.
Don’t be afraid to take some time to analyze the tests already in place. Can you peek at the existing tests and find a blind spot in a specific class when the input is XYZ? Or maybe your visual tester is working at a 1080p resolution. What happens if you set your screen to 1280x1024? This is valuable information that will save you time and spare you the heartbreak of hunting for bugs where there are none.
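To make the idea concrete, here is a minimal sketch of what a blind spot can look like in practice. The `remove_item` helper and the cart shape are hypothetical, invented for illustration; the point is that a suite covering only the happy path says nothing about the inputs nobody tried:

```python
# Hypothetical sketch: a cart helper whose happy path is covered,
# but whose delete path has a blind spot nobody tested.

def remove_item(cart: dict, item_id: str) -> None:
    """Remove an item from the cart by id."""
    cart.pop(item_id)  # blind spot: raises KeyError if the id is already gone

# The existing suite only exercises the happy path:
cart = {"sku-1": 1}
remove_item(cart, "sku-1")
assert cart == {}

# An edge-case pass finds the blind spot, e.g. a double-click on the
# delete button sending the same id twice:
try:
    remove_item(cart, "sku-1")
    print("second delete: no crash")
except KeyError:
    print("second delete: CRASH, open an issue")
```

Reading the existing tests first tells you that only the first half of this scenario is covered, which is exactly where your testing time is best spent.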
Lastly, don’t hesitate to attempt silly things, like roleplaying an elderly user living with Parkinson’s disease, caps lock on, and triple clicking everything in the interface. Anything that gets you closer to what a real user would do is beneficial when testing.
Testing is a cognitive process that can really take a toll on your brain.
You are actively trying to break something, thinking about combinations, pathways, and inputs, while at the same time staying highly focused on the system’s reactions.
If possible, alternate between testing and another task. Set a timer if you need. I don’t recommend doing more than one hour of real, mindful testing. Stand up and take a walk or go make coffee, but whatever you do, DON’T test for 8 hours straight.
Easy one to get us going. Have you ever opened an issue on the tracker and attempted to work on a bug with very scarce information on how/why/when/what happened? Rhetorical question. Of course you have!
And I’m pretty sure that you despised that experience, or at the very least you wasted time trying to find the steps to reproduce what happened. So, when the time finally comes for you to create an issue on the tracker, you decide that you want to lead by example. Here’s a helpful list of things that you should include in every issue. Your team—and your future self—will thank you!
This may sound obvious, but how many times have you encountered a title like “the cart crashes”? What does it tell you except that it crashed? Nothing, that’s right. You can supplement this title with two broad types of information bits: contextual and mechanical.
Contextual bits are pieces of information about the state of your testing environment at the moment of the crash. For instance, say you are testing a cart system on a shopping website; contextual information might be the items present in your cart at the moment of failure. Or maybe you had just hit the delete button on an item when it crashed. If you have multiple environments, it is wise to indicate which environment this happened in, either explicitly or with tags.
Mechanical bits are any information that describes parts and interactions of your system. In that same example, you might have some logging in place indicating that the caching system failed to save your last item, or that the delete button, somehow, is sending the action to the wrong environment.
It would make sense, then, to have a title like:
- The cart crashes when attempting to delete the item on staging with item id XYZ.
- The cart crashes when double-clicking on a delete button.
- The delete button in the cart does not follow the global ENV variable and hits staging instead.
- The cart crashes when Redis returns an error.
This makes it easy to scroll the backlog and get a sense of the project’s state without having to open every issue in order to understand.
🎶Video killed the radio star! Video makes you a bug bashing star! Oh, oh oh oh…🎶
There are so many benefits to having a recording of the actual bug; you wouldn’t believe it until you get to experience it. Beyond the obvious steps to reproduce, you can also extract some hidden information useful for resolving the issue:
- You are certain that the steps to reproduce are complete, that the reporter didn’t forget something.
- You can see things that the reporter may not have noticed during the bug.
- You can see things that the reporter did not deem necessary for the bug to happen, yet they were. In the shopping cart example, the reporter might have failed to mention they already had 3 items in the cart when they attempted to add a new one.
- You can get a broader view of the system: the resolution, the browser, the open applications, or anything else that could interfere with the proper functioning of your project.
It is also easier than ever to record your own screen for this purpose. On macOS, you can use the built-in Screenshot app to record all or part of your screen. On Windows you have the Xbox Game Bar, which, despite its name, serves more purposes than recording gameplay. Linux folks using GNOME can use the built-in GNOME recorder. For other Linux users it may depend on your setup, but there are a few FOSS options available, like OBS Studio and SimpleScreenRecorder.
State. Everything. You. Know. Period.
Steps to reproduce, environment, git revision number, browser, log window output, thoughts on what happened (even if it may be wrong, just write it down), test card number used, test user, etc. … You get the point.
Even if it sounds like too much, the last thing you thought of jotting down might turn out to be the key to solving the problem. You just never know!
You may have realized it by now, but as a bug fixer you want and NEED as much information as possible to reproduce the problem. Think carefully about what resources you might have used or received in the process:
If you uploaded a picture, attach it.
If you wrote content on a rich text editor, attach it.
If you got a dump file, attach it.
If you received a malformed pdf from the server, attach it.
If you used a specific artifact, attach it.
Whatever it is, you know what to do. And repeat after me: Attach it.
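Pulled together, all of the above can be baked into a small issue template. This is only a sketch; the section names and placeholders are suggestions, to be adapted to your tracker and project:

```markdown
## Title
The cart crashes when double-clicking on a delete button

## Environment
staging · <browser and version> · <git revision>

## Steps to reproduce
1. Add any item to the cart
2. Double-click the item’s delete button

## Expected / actual behavior
Expected: the item is removed once. Actual: the cart page crashes.

## Attachments
Screen recording, log window output, test user, uploaded files, …
```

A template like this makes the “state everything you know” habit the path of least resistance for every reporter.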
“I found a bug in the release, but all checks in the CI are green. How is that possible?” Because you have a coverage blind spot, that’s why!
Once you have identified and fixed the bug, you need to create a test for it. Why? Apart from the very obvious “so it won’t happen again” answer, with the appropriate coverage you can rest assured that no other modification of the same code will reintroduce the bug. Having a test for it also simplifies the job of your QA testers and lets them focus their efforts on other parts of the program.
A bug is never truly fixed unless it is proven fixed. Simple as that.
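As a sketch of what that proof can look like, here is a regression test for a hypothetical double-delete crash in a cart helper. The `remove_item` function and the issue it references are invented for illustration:

```python
# Hypothetical sketch: after fixing a double-delete crash, pin the
# fix down with a regression test so it cannot silently come back.

def remove_item(cart: dict, item_id: str) -> None:
    """Remove an item by id; deleting an absent id is now a no-op."""
    cart.pop(item_id, None)  # the fix: tolerate an already-removed id


def test_double_delete_regression():
    """Deleting the same item twice must not crash (regression test)."""
    cart = {"sku-1": 1}
    remove_item(cart, "sku-1")
    remove_item(cart, "sku-1")  # used to raise KeyError
    assert cart == {}
```

The test name and docstring point straight at the original bug, so anyone who breaks it later knows exactly which behavior they just regressed.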
A pull request is kind of like a sales pitch if you really think about it. And a PR titled “fixes #213” with no description and 15 files changed is a bad way to sell your work to your colleagues. Just like when you created an issue in the tracker, you want to craft this PR the way you would like to see it if you were the reviewer.
So take the extra 5 minutes to:
- Craft a meaningful title: “Fixes XYZ when Attempting ABC on DEF #213.”
- Write a summary of your changes. “This bug happened because we could pass null to this function and it trickled down until a subscriber didn’t expect that value, so I rewrote it to […].”
- Record an interaction with your program that confirms the bug has been fixed. This increases confidence in the code if your colleagues are not able to synthesize all the changes immediately.
- Write a basic list of steps for your colleagues that want to test your PR and assert that the fix is working.
Do the extra due diligence.
I’ll be very blunt: Simply reading a PR’s code change is rarely sufficient to assert that a bugfix actually fixes the bug. Test the changes! This might not be what you want to hear, but bear with me for a minute or two.
So the PR is up, all checks are green, your colleague created a test for the bug, and the code looks like it makes sense. Why bother checking further, you may ask?
I like writing lists, so here’s a list of good reasons to do the extra work and make sure everything is truly fixed:
- “All checks are green.” All checks WERE green when the bug was discovered, weren’t they?
- “Your colleague created a test.” Good. But you know as well as I do that tests can be unintentionally set up to pass. Or maybe the test doesn’t cover every aspect of the bug. Or the bug was effectively fixed but it created another one inadvertently and it slipped through the cracks again.
- “The code looks like it makes sense.” Well it looked like it made sense when the bug got merged initially. 🙁
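To illustrate the second point, here is a hypothetical regression test that is green no matter what. The `remove_item` helper is invented for illustration; the pattern of a try/except swallowing the very failure the test was meant to catch is sadly real:

```python
# Hypothetical sketch: a regression test that passes for the wrong
# reason, because it never actually asserts anything.

def remove_item(cart: dict, item_id: str) -> None:
    """Remove an item from the cart by id."""
    cart.pop(item_id)  # the bug is still here: KeyError on a second delete


def test_double_delete_does_not_crash():
    cart = {"sku-1": 1}
    try:
        remove_item(cart, "sku-1")
        remove_item(cart, "sku-1")
    except KeyError:
        pass  # oops: swallowing the very crash we meant to detect
    # no assertion at all: this test is green whether the bug is fixed or not
```

Only actually exercising the fix by hand, or reading the test as critically as the fix itself, catches this kind of false green.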
As daunting as it may seem, remember that you are doing this now to save yourself a world of pain later. You might just prevent a rollout that would require an urgent fix on a Saturday morning.
This is the last piece of advice that I can give you today: note down anything worth improving that you notice during testing:
- We should put some behavior driven testing in place.
- It doesn’t crash but I noticed that class A is missing tests on methods M and N.
- We should create an issue template to encourage good bug reporting.
- Our releases are very buggy lately, should we consider a feature freeze?
- Can we hire QA engineers to help us?
- We should spawn a test environment for any outstanding PR so that key actors can test themselves to acknowledge the issue has been resolved.
- Most of our bugs stem from user stories not being fleshed out enough and missing acceptance criteria around edge cases.
Don’t be afraid to bring these talking points to your colleagues and promote discussion of these issues. After all, no one is happy when bugs are aplenty, so adapting continuously prevents bugs proactively and keeps the team from wearing down.
If you’re feeling like you and your team need to improve, but are unsure where to start, Test Double can get you started. Don’t be afraid to reach out. :)