The big TDD misunderstanding is that most people consider TDD a testing practice.
The article doesn’t talk about TDD; it gives the reader some tips on how to write tests. That’s not TDD.
I'm fully aware of the idea that TDD is a "design practice" but I find it to be completely wrongheaded.
The principle that tests which couple to low level code give you feedback about tightly coupled code is true, but it does so because low level/unit tests couple too tightly to your code - i.e. because they too are bad code!
Have you ever refactored working code into working code and had a slew of tests fail anyway? That's the child of test driven design.
High level/integration TDD doesn't give "feedback" on your design; it just tells you whether your code matches the spec. This is actually more useful. It then lets you refactor bad code with a safety harness and gives you failures that actually mean failure, not "changed code".
I keep wishing the idea of test driven design would die. Writing tests which break on working code is an inordinately uneconomical way to detect design issues compared to developing an eye for them and fixing them under a test harness that has no opinion on your design.
So, yes, this - high level test driven development - is TDD, and moreover it has a better cost/benefit trade-off than test driven design.
I think many people realise this, thus the spike and stabilise pattern. But yes, integration and functional tests are both higher value in and of themselves, and lower risk in terms of rework, so ought to be a priority. For pieces of logic with many edge cases and iterations, mix in some targeted property-based testing and you’re usually in a good place.
Part of test-driven design is using the tests to drive out a sensible and easy to use interface for the system under test, and to make it testable from the get-go (not too much non-determinism, threading issues, whatever it is). It's well known that you should likely _delete these tests_ once you've written higher level ones that are more testing behaviour than implementation! But the best and quickest way to get to having high quality _behaviour_ tests is to start by using "implementation tests" to make sure you have an easily testable system, and then go from there.
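For example, here's a minimal sketch (all names made up) of the kind of interface feedback I mean: writing the test first pushes non-determinism out of the unit.

```python
# A sketch of "driving out the interface": the test is what forces `now` into
# the signature instead of the function calling datetime.now() internally.
from datetime import datetime

def greeting(now: datetime) -> str:
    return "Good morning" if now.hour < 12 else "Good afternoon"

def test_greeting_depends_only_on_the_supplied_time():
    assert greeting(datetime(2024, 1, 1, 9, 0)) == "Good morning"
    assert greeting(datetime(2024, 1, 1, 15, 0)) == "Good afternoon"
```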
>It's well known that you should likely _delete these tests_ once you've written higher level ones that are more testing behaviour than implementation!
Building tests only to throw them away is the design equivalent of burning stacks of $10 notes to stay warm.
As a process it works. It's just 2x easier to write behavioral tests first and thrash out a good design later under its harness.
It mystifies me that doubling the SLOC of your code by adding low level tests only to trash them later became seen as a best practice. It's so incredibly wasteful.
> As a process it works. It's just 2x easier to write behavioral tests first and thrash out a good design later under its harness.
I think this “2x easier” only applies to developers who deeply understand how to design software. A very poorly designed implementation can still pass the high level tests, while also being hard to reason about (typically poor data structures) and debug, having excessive requirements for test setup and tear down due to lots of assumed state, and be hard to change, and might have no modularity at all, meaning that the tests cover tens of thousands of lines (but only the happy path, really).
Code like this can still be valuable of course, since it satisfies the requirements and produces business value, however I’d say that it runs a high risk of being marked for a complete rewrite, likely by someone who also doesn’t really know how to design software. (Organizations that don’t know what well designed software looks like tend not to hire people who are good at it.)
"Test driven design" in the wrong hands will also lead to a poorly designed non modular implementation in less skilled hands.
I've seen plenty of horrible unit test driven developed code with a mess of unnecessary mocks.
So no, this isn't about skill.
"Test driven design" doesnt provide effective safety rails to prevent bad design from happening. It just causes more pain to those who use it as such. Experience is what is supposed to tell you how to react to that pain.
In the hands of junior developers test driven design is more like test driven self flagellation in that respect: an exercise in unnecessary shame and humiliation.
Moreover, since those tests, with their clusterfuck of mocks, can't operate as a reliable safety harness (they fail when implementation code changes, not in the presence of bugs), test driven design actively inhibits iterative exploration towards good design.
These tests have the effect of locking in bad design because keeping tightly coupled low level tests green and refactoring is twice as much work as just refactoring without this type of test.
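For illustration, the kind of test I mean (a made-up sketch using unittest.mock): it pins down how the code works rather than what it achieves, so any internal restructuring turns it red without indicating a bug.

```python
from unittest.mock import MagicMock

def send_invoice(order, repo, mailer):
    invoice = repo.create_invoice(order)
    mailer.send(invoice)
    return invoice

def test_send_invoice_interactions():
    repo, mailer = MagicMock(), MagicMock()
    send_invoice({"id": 1}, repo, mailer)
    # Asserts *how* the job is done, not *what* is achieved; splitting or
    # renaming the collaborators breaks this test with no behaviour change.
    repo.create_invoice.assert_called_once_with({"id": 1})
    mailer.send.assert_called_once_with(repo.create_invoice.return_value)
```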
> I've seen plenty of horrible unit test driven developed code with a mess of unnecessary mocks.
Mocks are an anti-pattern. They are a tool that either by design or unfortunate happenstance allows and encourages poor separation of concerns, thereby eliminating the single largest benefit of TDD: clean designs.
> … TDD is a "design practice" but I find it to be completely wrongheaded.
> The principle that tests which couple to low level code give you feedback about tightly coupled code is true, but it does so because low level/unit tests couple too tightly to your code - i.e. because they too are bad code!
But now you’re asserting:
> "Test driven design" in the wrong hands will also lead to a poorly designed non modular implementation in less skilled hands.
Which feels like it contradicts your earlier assertion that TDD produces low-level unit tests. In other words, for there to be a “unit test” there must be a boundary around the “unit”, and if the code created by following TDD doesn’t even have module-sized units, then is that really TDD anymore?
Edit: Or are you asserting that TDD doesn’t provide any direction at all about what kind of testing to do? If so, then what does it direct us to do?
>"Test driven design" in the wrong hands will also lead to a poorly designed non modular implementation in less skilled hands.
>Which feels like it contradicts your earlier assertion that TDD produces low-level unit tests.
No, it doesn't contradict that at all. Test driven design, whether done optimally or suboptimally, produces low level unit tests.
Whether the "feedback" from those tests is taken into account determines whether you get bad design or not.
Either way I do not consider it a good practice. The person I was replying to was suggesting that it was a practice more suited to people with a lack of experience. I don't think that is true.
>Or are you asserting that TDD doesn’t provide any direction at all about what kind of testing to do?
I'm saying that test driven design provides weak direction about design and it is not uncommon for test driven design to still produce bad designs because that weak direction is not followed by people with less experience.
Thus I don't think it's a practice whose effectiveness is moderated by experience level. It's just a bad idea either way.
> Whether the "feedback" from those tests is taken into account determines whether you get bad design or not.
Which to me was kind of the whole point of TDD in the first place; to let the ease and/or difficulty of testing become feedback that informs the design overall, leading to code that requires less set up to test, fewer dependencies to mock, etc.
I also agree that a lot of devs ignore that feedback, and that just telling someone to “do TDD” is pointless unless you first make sure they know they need to strive for little to no test setup, few or no mocks, etc.
Overall I get the sense that a sizable number of programmers accept a mentality of “I’m told programming is hard, this feels hard so I must be doing it right”. It’s a mentality of helplessness, of lack of agency, as if there is nothing more they can do to make things easier. Thus they churn out overly complex, difficult code.
>Which to me was kind of the whole point of TDD in the first place; to let the ease and/or difficulty of testing become feedback that informs the design overall
Yes and that is precisely what I was arguing against throughout this thread.
For me, (integration) test driven development is about creating:
* A signal to let me know if my feature is working and easy access to debugging information if it is not.
* A body of high quality tests.
It is 0% about design, except insofar as the tests give me a safety harness for refactoring or experimenting with design changes.
Don't agree, though I think it's more subtle than "throw away the tests" - more "evolve them to a larger scope".
I find this particularly with web services, especially when the services are some form of stateless calculator. I'll usually start with tests that focus on the function at the native programming language level. Those help me get the function(s) working correctly. The code and tests co-evolve.
Once I get the logic working, I'll add on the HTTP handling. There's no domain logic in there, but there is still logic (e.g. mapping from json to native types, authentication, ...). Things can go wrong there too. At this point I'll migrate the original tests to use the web service. Doing so means I get more reassurance for each test run: not only that the domain logic works, but that the translation in & out works correctly too.
At that point there's no point leaving the original tests in place. They're just covering a subset of the E2E tests so provide no extra assurance.
I'm therefore with TFA in leaning towards E2E testing because I get more bang for the buck. There are still places where I'll keep native language tests, for example if there's particularly gnarly logic that I want extra reassurance on, or E2E testing is too slow. But they tend to be the exception, not the rule.
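To make the migration concrete, a minimal sketch assuming FastAPI and pytest (the endpoint, function and figures are all made up):

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

def quote_premium(age: int, smoker: bool) -> float:
    """The 'stateless calculator' - pure domain logic, developed test-first."""
    return 100.0 + age * 2.5 + (50.0 if smoker else 0.0)

app = FastAPI()

@app.get("/premium")
def premium(age: int, smoker: bool = False) -> dict:
    # Thin HTTP layer: parsing and serialisation only, no domain logic.
    return {"premium": quote_premium(age, smoker)}

# Early test, written against the native function while the logic evolves:
def test_smoker_pays_surcharge_native():
    assert quote_premium(40, smoker=True) == quote_premium(40, smoker=False) + 50.0

# The same check migrated to exercise the web layer once the logic stabilises;
# at that point the native-level test above can be retired.
def test_smoker_pays_surcharge_http():
    client = TestClient(app)
    base = client.get("/premium", params={"age": 40, "smoker": False}).json()["premium"]
    smoker = client.get("/premium", params={"age": 40, "smoker": True}).json()["premium"]
    assert smoker == base + 50.0
```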
> At that point there's no point leaving the original tests in place. They're just covering a subset of the E2E tests so provide no extra assurance.
They give you feedback when something fails, by better localising where it failed. I agree that E2E tests provide better assurance, but tests are not only there to provide assurance, they are also there to assist you in development.
Starting low level and evolving to a larger scope is still unnecessary work.
It's still cheaper to start off building a Playwright/calls-a-REST-API test against your web app than to build a low level unit test and "evolve" it into a Playwright test.
I agree that low level unit tests are faster and more appropriate if you are surrounding complex logic with a simple and stable API (e.g. testing a parser), but it's better to work your way down to that level when it makes sense, not to start there and work your way up.
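Roughly the kind of starting point I mean, sketched with Playwright's sync Python API (the URL and selectors are hypothetical):

```python
from playwright.sync_api import sync_playwright

def test_quote_is_displayed():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:8000/quote")   # hypothetical app under test
        page.fill("#age", "40")                    # hypothetical selectors
        page.click("text=Get quote")
        assert "Premium:" in page.text_content("#result")
        browser.close()
```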
That’s not my experience. In the early stages, it’s often not clear what the interface or logic should be - even at the external behaviour level. Hence the reason tests and code evolve together. Doing that at native code level means I can focus on one thing: the domain logic. I use FastAPI plus pytest for most of these projects. The net cost of migrating a domain-only test to use the web API is small. Doing that once the underlying API has stabilised is less effort than starting with a web test.
I don't think I've ever worked on any project where they hadn't yet decided whether they wanted a command line app, a website or an Android app before I started. That part is usually set in stone.
Sometimes lower level requirements are decided before higher level requirements.
I find that this often causes pretty bad requirements churn - when you actually get the customer to think about the UI or get them to look at one then inevitably the domain model gets adjusted in response. This is the essence of why BDD/example driven specification works.
What exactly is it wasting? Is your screen going to run out of ink? Even in the physical construction world, people often build as much or more scaffolding as the thing they're actually building, and that takes time and effort to put up and take down, but it's worthwhile.
Sure, maybe you can do everything you would do via TDD in your head instead. But it's likely to be slower and more error-prone. You've got a computer there, you might as well use it; "thinking aloud" by writing out your possible API designs and playing with them in code tends to be quicker and more effective.
Time. Writing and maintaining low level unit tests takes time. That time is an investment. That investment does not pay off.
Doing test driven development with high level integration tests also takes time. That investment pays dividends though. Those tests provide safety.
>Sure, maybe you can do everything you would do via TDD in your head instead. But it's likely to be slower and more error-prone.
It's actually much quicker and safer if you can change designs under the hood and you don't have to change any of the tests, because they validate all the behavior.
Quicker and safer = you can do more iterations on the design in the available time = a better design in the end.
The refactoring step of red, green, refactor is where the design magic happens. If the refactoring turns tests red again that inhibits refactoring.
> It's well known that you should likely _delete these tests_ once you've written higher level ones that are more testing behaviour than implementation!
Is it? I don't think I've ever seen that mentioned.
I think there can be some value to using TDD in some situations but as soon as people get dogmatic about it, the value is lost.
The economic arguments are hard to make. Sure, writing the code initially might cost $X and writing tests might cost $1.5X but how can we conclude that the net present value (NPV) of writing the tests is necessarily negative - this plainly depends on the context.
I don't even like TDD much, but I think that this missed the point:
> Have you ever refactored working code into working code and had a slew of tests fail anyway?
Yes - and that is intended. The "refactor of working code into working code" often changes some assumptions that were made during implementation.
Those tests are not there to give "feedback on your design", they are there to ensure that the implementation does what you thought it should do when you wrote your code. Yes, that means that when you refactor your code, quite a few tests will have to be changed to match the new code.
But the number of times this has happened and highlighted issues in the refactor is definitely not negligible. The cost of not having these tests (which would translate into bugs) would certainly have surpassed the cost of keeping those tests around.
If we’re talking “what you thought it should do” and not “how you thought it should do it” this is all fine. If requirements change tests should change. I think the objection is more to changing implementation details and having to rewrite twice as much code, when your functional tests (which test things that actually make you money) never changed.
Maybe, but I think the point is that it's probably very easy to get into this situation, and not many people talk about it or point out how to avoid it.
I’m still not following what the issue is. If you refactor some code and it changes the behaviour of the code, and a test of the expected behaviour fails, then you have one of two problems:
1. You had a bug you didn’t know about and your test was invalid (in which case the test is useless! Fix the issue then you fix the test…)
or
2. You had no bug and you just introduced a new one, in which case the test has done its job and alerted you to the problem so you can fix your mistake.
What is the exact problem?
Now if this is an issue with changing the behaviour of the system, that’s not a refactor. In that case, your tests are testing old behaviour, and yes, they are going to have to be changed.
The point is that you're not changing the interface to the system, but you're changing implementation details that don't affect the interface semantics. TDD does lead you to a sort of coupling to implementation details, which results in breaking a lot of unit tests if you change those implementation details. What this yields is hesitancy to undertake positive refactorings, because you have to either update all of those tests or just delete them altogether - and if you delete them, were those tests really useful to begin with? The point is that it's apparently wasted work and possibly an active impediment to positive change, and I haven't seen much discussion around avoiding this outcome, or what to do about it.
There has been discussion about this more than a decade ago by people like Dan North and Liz Keogh. I think it’s widely accepted that strict TDD can reduce agility when projects face a lot of uncertainty and flux (both at the requirements and implementation levels). I will maintain that functional and integration tests are more effective than low-level unit tests in most cases, because they’re more likely to test things customers care about directly, and are less volatile than implementation-level specifics. But there’s no free lunch, all we’re ever trying to do is get value for our investment of time and reduce what risks we can. Sometimes you’ll work on projects where you build low level capabilities that are very valuable, and the actual requirements vary wildly as stakeholders navigate uncertainty. In those cases you’re glad to have solid foundations even if everything above is quite wobbly. Time, change and uncertainty are part of your domain and you have to reason about them the same as everything else.
> I will maintain that functional and integration tests are more effective than low-level unit tests in most cases
Right, that's pretty much the only advice I've seen that makes sense. The only possible issue is that these tests may have a broader state space so you may not be able to exhaustively test all cases.
Absolutely right. If you’re lucky, those are areas where you can capture the complexity in some sort of policy or calculator class and use property based testing to cover as much as possible - that’s a level of unit testing I’m definitely on board with. Sometimes it’s enough to just trust that your functional tests react appropriately to different _types_ of output from those classes (mocked) without having to drive every possible case (as you might have seen done in tabular test cases). For example I have an app that tests various ways of fetching and visualising data, and one output is via k-means clustering. I test that the right number of clusters gets displayed but I would never test the correctness of the actual clustering at that level. Treat complexity the same way you treat external dependencies, as something to be contained carefully.
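For illustration, a sketch of that kind of targeted property-based test using Hypothesis, against a made-up "policy/calculator" function:

```python
from hypothesis import given, strategies as st

def bucket_discount(order_total: float) -> float:
    """Made-up pricing policy: the kind of edge-case-heavy logic worth unit testing."""
    if order_total < 100:
        return 0.0
    if order_total < 500:
        return order_total * 0.05
    return order_total * 0.10

@given(st.floats(min_value=0, max_value=1_000_000))
def test_discount_is_bounded(total):
    discount = bucket_discount(total)
    # The property: a discount is never negative and never exceeds the order total.
    assert 0 <= discount <= total
```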
Why does testing behavior matter? I don’t care if my tests exhaustively test each if branch of the code to make sure that they call the correct function when entering that if branch. That’s inane.
I care about whether the code is correct. A more concrete example: say I’m testing a float to string function. I don’t care how it converts the floating point binary value 1.23 into the string representation of “1.23”. All I care about is that it correctly turns that binary value into the correct string. I also care about the edge cases. Does 0.1E-20 correctly use scientific notation? What about rounding behavior? Is this converter intended to represent binary numbers with perfect precision, or is precision loss ok?
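Concretely, the sort of behaviour-level assertions I mean (a toy sketch; `float_to_str` here is just a stand-in built on Python's repr, not a real converter):

```python
def float_to_str(x: float) -> str:
    # Stand-in implementation: Python's shortest-repr formatting.
    return repr(x)

def test_simple_value():
    assert float_to_str(1.23) == "1.23"          # the output we care about
    assert float(float_to_str(1.23)) == 1.23     # and it round-trips exactly

def test_tiny_value_uses_scientific_notation():
    assert float_to_str(0.1e-20) == "1e-21"      # edge case: scientific notation
```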
If your tests simply check that you call the log function and the power function x times, your tests are crap. And this is what I believe the parent commenter was talking about. All too often, tests are written to fulfill arbitrary code coverage requirements or to obsequiously adhere to a paradigm like TDD. These are bad tests, because they’ll break when you refactor code.
One last example, I recently wrote a code syntax highlighter. I had dozens of test cases that essentially tested the system end to end and made sure if I parsed a code block, I ended up with a tree of styles that looked a certain way. I recently had to refactor it to accommodate some new rules, and it was painless and easy. I could try stuff out, run my tests, and very quickly validate that my changes did not break prior correct behavior. This is probably the best value of testing that I’ve ever received so far in my coding career.
"Have you ever reconsidered your path up the cliff face and had to reposition a slew of pitons? This means your pitons are too tightly coupled to the route!"
> Have you ever refactored working code into working code and had a slew of tests fail anyway? That's the child of test driven design.
I had this problem, when either testing too much implementation, or relying too much on implementation to write tests. If, on the other hand, I test only the required assumptions, I'd get lower line/branch coverage, but my tests wouldn't break while changing implementation.
My take on this - TDD works well when you fully control the model, and when you test not the implementation but the minimal required assumptions.
I don't think that's TDD's fault, that's writing a crappy test's fault.
If you keep it small and focussed, don't include setup that isn't necessary and relevant, only exercise the thing which is actually under test, and only make assertions about the thing you actually care about (e.g. that the response contains the key 'total_amount' with the value '123', not that the entire response body is x), then that's much less likely to happen.
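A toy contrast (the response shape is made up) between a focused assertion and an over-specified one:

```python
def get_order_summary() -> dict:
    # Stand-in for calling the real endpoint.
    return {"id": 17, "created_at": "2024-01-01T00:00:00Z", "total_amount": 123}

def test_total_amount_focused():
    # Asserts only the thing under test; survives unrelated additions to the response.
    assert get_order_summary()["total_amount"] == 123

def test_total_amount_over_specified():
    # Asserts the entire body; breaks whenever any incidental field changes.
    assert get_order_summary() == {
        "id": 17,
        "created_at": "2024-01-01T00:00:00Z",
        "total_amount": 123,
    }
```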
Not sure why I’m getting downvoted so badly, because by its very nature refactoring shouldn’t change the functionality of the system. If you have functional unit tests that are failing, then something has changed and your refactor has changed the behaviour of the system!
It is very common for unit tests to be white-box testing, and thus to depend significantly on internal details of a class.
Say, when unit testing a list class, a test might call the add function and then assert that the length field has changed appropriately.
Then, if you change the list to calculate length on demand instead of keeping a length field, your test will now fail even though the behavior has not actually changed.
This is a somewhat silly example, but it is very common for unit tests to depend on implementation details. And note that this is not about private VS public methods/fields. The line between implementation details and public API is fuzzy and depends on the larger use of the unit within the system.
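In code, that silly example looks something like this (made-up class):

```python
class MyList:
    def __init__(self):
        self._items = []
        self.length = 0            # cached length: an implementation detail

    def add(self, item):
        self._items.append(item)
        self.length += 1

def test_add_updates_length():
    lst = MyList()
    lst.add("a")
    # Passes today; fails if `length` later becomes computed on demand,
    # even though nothing observable about adding an item has changed.
    assert lst.length == 1
```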
Checking length is now a function call and not a cached variable — a change in call signature and runtime performance.
Consumers of your list class are going to have to update their code (eg, that checks the list length) and your test successfully notified you of that breaking API change.
Then any code change is a breaking API change and the term API is meaningless. If the compiler replaces a conditional jump + a move with a conditional move, it has now changed the total length of my code and affected its performance, and now users will have to adjust their code accordingly.
The API of a piece of code is a convention, sometimes compiler enforced, typically not entirely. If that convention is broken, it's good that tests fail. If changes outside that convention break tests, then it's pure overhead to repair those tests.
As a side note, the length check is not necessarily no longer cached just because the variable is no longer visible to that test. Perhaps the custom list implementation was replaced with a wrapper around java.ArrayList, so the length field is no longer accessible.
I mean I think it's fair to assume that TEST-Driven-Development has something to do with testing. That being said, Kent Beck recently (https://tidyfirst.substack.com/p/tdd-outcomes) raised a point saying TDD doesn't have to be just an X technique, which I wholeheartedly agree with.
Well, it's about testing exactly to the degree that it focuses on writing and running tests.
Which means it's absolutely, entirely about them.
People can claim it's about requirements all they want. The entire thing revolves around the tests, and there's absolutely no consideration of the requirements except for the part where you map them into tests. If you try to create a requirements framework, you'll notice that there is much more to requirements than testing whether they are met.
As I remember the discourse about TDD, originally it was described as a testing practice, and later people started proposing to change the last D from "development" to "design".
Yeah it’s kind of unfortunate because they make a very good argument about defining a thing better, and in the title use a wrong definition of an adjacent term.