I've been trying to get better about writing more complete tests, but I often feel I'm being somewhat redundant. Sometimes the tests I'm writing feel like they're just verifying that the language works as anticipated (if I pass in a parameter, is it accessible in the local scope? Does this math function that accepts integer types reject non-integer inputs?), or that the framework works (if I pass in a string prop to render as a text node, does React indeed render this text node?).
In a dynamic language extra checks for proper types make sense, but in a type-safe language I'm more than happy to trust my compiler for that. As a result, my unit tests feel sparse. It feels more natural to write integration tests that encompass multiple parts of the application in an end-to-end or round-trip manner.
How do you decide what's worthy of a unit test and what isn't?
Top comments (15)
What's a unit test? First answer: what's a unit?
A unit should be a discrete element of your code that exhibits a behaviour that is important to the consumer. Often we mean the 'logic' of the application - something internal. But it could be a library API. The unit has behaviour; that behaviour is its surface.
What we don't want to do is test the internals of the unit; if a function is not exposed by the library, we should not test it unless we have a really, really good reason. Why?
Because tightly coupled tests create fragility. We rely on the unit tests to tell us that the behaviour of the code has not changed. If we're testing a layer below the level of behaviour that we care about - say the internals of our library - then we cannot refactor the code easily. Tests will break whenever we refactor the internals of the library; we will find it hard to say what behaviour we care about. We will feel like the tests are trapping us rather than liberating us.
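To sketch what that looks like in practice (a made-up Rust example; the slugify function and its private helper are purely illustrative, not from any real codebase), you pin down the public behaviour and leave the internals untested so they stay free to change:

```rust
// A small made-up library: only `slugify` is public; the helper is an
// internal detail we are free to rewrite at any time.
pub fn slugify(title: &str) -> String {
    title
        .split_whitespace()
        .map(normalize_word) // internal helper, not part of the surface
        .collect::<Vec<_>>()
        .join("-")
}

fn normalize_word(word: &str) -> String {
    word.chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .collect::<String>()
        .to_lowercase()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Test the behaviour the consumer relies on...
    #[test]
    fn slugifies_a_title() {
        assert_eq!(slugify("Hello, World!"), "hello-world");
    }

    // ...but deliberately write no test for `normalize_word`, so the
    // suite stays green when the internals are refactored.
}
```

If normalize_word is later split up or rewritten, nothing breaks as long as slugify still behaves the same - which is exactly the freedom we want.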
I favour TDD, so I see tests as being a good test of the 'feel' of an API - if it's hard/horrible to call in the tests, then it's probably hard/horrible for real people.
M'learned colleague and friend @quii gave a rather good talk on all of this:
youtube.com/watch?v=Kwtit8ZEK7U
As to coverage... it's a weak metric. Focus more on what a wise man once told Kent Beck:
Ah, this is quite helpful.
This succinctly summarizes what I've been struggling with, but I think the problem was I had the whole concept a little wrong. I was conflating "a discrete element" with "a function". This makes a lot of sense, thank you! I'll definitely check out the talk.
You're welcome. I don't think I express it very well - haven't quite found the right metaphor yet. It's something to do with shapes and how they're constructed. You're a clever chap - I'm sure you'll think of something.
Unit tests are designed to test the developer's code logic. You always aim to cover 100% of your code, but there are limits to what should be tested.
I tend not to test logic handled by a framework. For example: Django and its auto-generated admin view. Unless you extend the code with your own implementation, you can assume the team behind it already provides "flawless" logic.
You should also evaluate the needs and aims of your project. If you work for a startup, wasting too many resources on useless tests is bad. If you work on a project that is used daily by 100k+ users, then having something robust, reliable, and fully tested is mandatory, and the budget to support it is correspondingly large. There is a point where more tests won't add any value, and you should know that threshold.
I prefer doing unit tests well before doing any integration tests. Integration tests are black-box tests to make sure your use cases are fine, but if the logic behind them is buggy, things will start to go sour.
Great answer, thanks for your perspective! It makes sense that integration tests need your code logic to be sound already. I think part of my issue is that I'm not working for anyone at all; these are just hobby projects, and perhaps striving for full test coverage isn't a good use of time when I'm just trying to produce something working.
Even if you are working on a relatively small or personal project, testing for 100% coverage/mutation is fine, I think. I do this on my personal projects, and obviously I test core language features like you do (testing whether my methods throw a TypeError if they get something other than a Number, for example). But overall, it is never wasted effort.
The only case where I feel I'm losing time is when I dive into some complex logic, immediately cover it with tests, only to see that the API doesn't feel good and then change it again. This is frustrating, but I would never have come to the conclusion that the API needed reworking if I was not sure my methods worked correctly.
Other times, I will completely ignore tests when starting a library, up to the point where I feel comfortable with the API. Sometimes it feels better that way, but I do not have a fixed procedure...
Interesting point. I've definitely had issues with locking down an API and can see how that leads to wasted work.
As with most things, then, "it depends".
Like all things, it really depends.
For testing an application, I would definitely start with integration/UI testing, and when you've got good integration coverage you can start to beef up unit testing of individual logic components. If you're adding automated testing later in a project's lifecycle, it's probably best to start with integration testing as well.
For writing a library, I find it best to do strict TDD, unit-testing each method before integration testing the whole thing. This helps you make sure the small components each do exactly what you intend, without necessarily mandating how they should all be used and work together. It's much harder to know how a library will be used compared to a framework, so you really need good integrity at the micro-component level.
Another thing I've come to like is focusing on inputs/outputs. Mock only when absolutely necessary, which is usually when testing a method with side effects (which might actually be a code smell). If each component is well tested, then you can safely use the actual component in other tests, rather than needing to mock it.
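A rough sketch of what I mean (parse_score and total_score are invented purely for illustration): both functions are pure, so each test is just an input paired with an expected output, and the higher-level test uses the real lower-level function instead of a mock.

```rust
// Invented example: no side effects, so no mocking is needed anywhere.
pub fn parse_score(raw: &str) -> Option<u32> {
    raw.trim().parse().ok()
}

pub fn total_score(raws: &[&str]) -> u32 {
    // Uses the real `parse_score`; since that function is already
    // well tested, there is no reason to mock it here.
    raws.iter().filter_map(|r| parse_score(r)).sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_a_valid_score() {
        assert_eq!(parse_score(" 42 "), Some(42));
    }

    #[test]
    fn ignores_garbage_when_totalling() {
        assert_eq!(total_score(&["1", "oops", "2"]), 3);
    }
}
```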
This makes a lot of sense - thus far I haven't attempted a library; these have all been applications, which may explain the difficulty in knowing where to start. I like the rule of thumb of applying mocking judiciously - I do strive for small pure functions where possible, and this is a good litmus test to see how you're doing.
Thanks!
Every piece of software is tested. The most common form of testing is for the person who wrote it to give it a spin to see if it works. Any time I make a change I could spend time (more and more time as my program grows) checking every corner. Instead, unit tests and integration tests can give me confidence that the program still behaves the way I intend it to in a few seconds or a couple minutes.
In a small, one-person project you might decide testing with your eyeballs is good enough. As the size of your project grows, consider writing tests that mirror the steps you would take to check your code manually. When you change something, run the tests to check that your change didn't affect something it's not supposed to.
On a large project there's no way to test with your eyeballs without missing something. In that case more and more tests are needed. If your application makes money the need for confidence in it working as expected grows.
Test first is the only way I know how to get really good test coverage that really demonstrates my software will do what I think it's supposed to do. If I write tests after finishing my work, I'll be tempted to say the feature is too small or too simple or not worth testing. I'm probably never coming back to fill in missing tests, so I should write them first for everything.
So what to test? I test that for a normal input (pick one you'd use if you checked it manually) I can get a normal output. If there are side effects (HTTP requests, database records, output on the screen), check that they seem normal. Next, if there's more than one path through the code under test, make sure each path has one test with normal inputs for that path. Finally, think of common failures like HTTP problems, user input with typos or emoji, or a precondition not being met. Don't stress about covering every possible failure; add these as they happen and you handle them. Over time your tests will check for problems you forgot about.
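To make that concrete, here's a small made-up sketch (shipping_cost is a hypothetical function, not from any real project): one test per normal path, plus one for a common failure.

```rust
// Hypothetical function used only to illustrate the strategy above.
pub fn shipping_cost(weight_kg: f64) -> Result<f64, String> {
    // Common failure: a precondition not met.
    if weight_kg <= 0.0 {
        return Err("weight must be positive".to_string());
    }
    // Two normal paths: light parcels ship flat-rate, heavy ones per kilo.
    if weight_kg <= 2.0 {
        Ok(5.0)
    } else {
        Ok(weight_kg * 3.0)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn light_parcel_ships_flat_rate() {
        assert_eq!(shipping_cost(1.5), Ok(5.0));
    }

    #[test]
    fn heavy_parcel_ships_per_kilo() {
        assert_eq!(shipping_cost(10.0), Ok(30.0));
    }

    #[test]
    fn rejects_nonsense_weight() {
        assert!(shipping_cost(-1.0).is_err());
    }
}
```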
The end goal is that you run your test suite and it checks everything you would spend all afternoon looking over, in a matter of seconds. Then you make a change, run your tests, and trust that they will fail if your program behaves incorrectly.
I think test-first sounds like a good exercise; stepping through the process myself might uncover a lot of these answers. Thinking about it as self-automation makes a lot of sense! Thank you for the thorough response.
I used exercism.io to practice test-first or test-driven design. They probably have programming challenges in a language you're interested in, and they pair you with a mentor.
A lot depends on why you write tests. My top reason is that I know when to stop writing code. When the tests pass, I'm done writing code.
Other reasons: documenting the code behavior so I can quickly get up to speed without reading code. And finally I want to be able to change code without breaking existing behavior.
If you write tests first and make sure they fail before writing any code, then you can stop writing tests when your tests fully document the behavior required. Hopefully the first few tests you write inspire all sorts of other behavior questions.
But ultimately I think of tests as a contract between me and the user of my code. I will guarantee what's in the tests and nothing else. It is an excellent way of getting everyone on the team to dig into all the edge cases and detailed behavior.
My tests say that my function can add positive integers and give only examples of adding positive integers? I know to not assume the function can handle negative numbers and that it may even throw unexpected errors with negative numbers or non-integers. My tests don't give the largest integers that it can safely add? I don't make assumptions about what that limit is. If you think of tests as documentation about behavior, it's a lot easier to focus on and decide which tests are important.
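For example (a sketch; add_counts is invented just to show the idea), these tests promise correct sums of small positive integers and nothing more:

```rust
// Invented example: the tests are the contract. They only ever show
// small positive integers, so callers should not assume anything about
// negatives, overflow, or the largest values this can handle.
pub fn add_counts(a: u32, b: u32) -> u32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_small_positive_integers() {
        assert_eq!(add_counts(2, 3), 5);
    }

    #[test]
    fn adding_zero_changes_nothing() {
        assert_eq!(add_counts(7, 0), 7);
    }
}
```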
This makes a lot of sense!
Framing tests as documentation makes it a lot clearer, thank you.
Thanks for your answer!
I think you're right, my mental picture of testing is likely blurry. Something like a FromStr implementation on a specific struct is easy, or really any pure function where I know what I should get out given a specific input. Most of my tests are of this variety, trying a bunch of "edge case-y" strings, and I get stuck outside of that. It makes sense that this is at heart an architectural issue; this is good food for thought.
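For instance, something like this (a rough sketch; the Point struct is made up) is the kind of test that feels straightforward to me:

```rust
use std::str::FromStr;

// Made-up struct just to illustrate the kind of test I mean.
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl FromStr for Point {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (x, y) = s
            .split_once(',')
            .ok_or_else(|| format!("expected 'x,y', got '{s}'"))?;
        Ok(Point {
            x: x.trim().parse().map_err(|_| format!("bad x: '{x}'"))?,
            y: y.trim().parse().map_err(|_| format!("bad y: '{y}'"))?,
        })
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_a_normal_pair() {
        assert_eq!("3, -4".parse::<Point>(), Ok(Point { x: 3, y: -4 }));
    }

    // The "edge case-y" strings: given an input, I know what to expect.
    #[test]
    fn rejects_missing_comma() {
        assert!("34".parse::<Point>().is_err());
    }

    #[test]
    fn rejects_non_numeric_parts() {
        assert!("a,b".parse::<Point>().is_err());
    }
}
```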