The Legacy Code Challenge

A viewer asks:

Have you done any shows on how you can start automated testing on an existing project? Much of what I do is on an existing, complex, single-page app, and there isn’t a single test. We do use JSLint for the main builds, but it’s really just for code-style checks. Have any thoughts on how you might approach that scenario?

Retrofitting tests onto existing code is a tough problem and an important skill. It’s high time we tackled it.

If you’d rather see a video demonstration, we cover it in The Lab starting with episode #6. For details, skip to the bottom of this essay.

The Challenge

There’s a chicken-and-egg problem with adding tests to existing code. The majority of your tests should be fine-grained unit tests; they run faster and are less likely to break than other forms of tests. Unfortunately, these sorts of tests need to poke into your codebase in order to set up dependencies and validate state. Unless your code was written with testability in mind (it wasn’t!), you can’t test it.

So you need to refactor. The problem is that, in a complex codebase, refactoring is dangerous. Side effects lurk behind every function. Twists of logic wait to trip you up. In short, if you refactor, you’re likely to break something without realizing it.

So you need tests. But to test, you need to refactor. But to refactor, you need tests. Etc., etc., argh.

The Solution

To break the chicken-and-egg dilemma, we need some way of refactoring without breaking code. One option is to just run manual tests after every change, but that is slow and error-prone. Refactorings in existing code have to be done extremely carefully, which means you’re taking small steps. Really small steps. As in “run the tests every 30 seconds” small. You don’t want to do that manually, believe me.

What we need is the simplicity of manual tests and the speed and convenience of automated tests. In other words, pinning tests.

A “pinning test” is not a good test. It doesn’t try to be. It’s just the simplest, fastest-running test you can write that will allow you to refactor your code. It’s often an end-to-end test, but it could also look at your log files, monkey-patch a core library function, or do something similarly ridiculous.

The key here is that the pinning test lets you do your dozens-of-tiny-refactorings loop really quickly. As the code improves, you add high-quality unit tests. Once the tests are good enough, you get rid of the pinning test. Lather, rinse, repeat.

The Strategy

Overall, the key is to think small. Don’t try to fix everything all at once. Choose something small, preferably something you needed to work on anyway, and fix that. Then fix the next thing, and the next thing. If you keep fixing the areas you’re actually working in, then the parts of the system you touch most will improve most quickly.

With that in mind, here’s the general strategy I use when adding tests to code:

  1. Start with manual or automated end-to-end testing. You’ll need to keep some form of end-to-end testing until the entire codebase has good fine-grained tests, which will probably be… hmm, multiply by ½τ, commute the one, don’t forget the e and i… ah, forever. Yep, definitely forever.

  2. Choose some small part of the system that gets changed frequently or has a lot of bugs. Preferably one that already has some changes or bug fixes planned. Introduce some basic build automation so you can lint your code and run your tests easily.

  3. Write pinning tests for a subset of that part of the system. Keep them small and don’t worry about code quality more than you have to.

  4. Using the pinning tests as a safety net, refactor the code and add unit tests. Avoid big rewrites, work incrementally, and don’t be afraid to revert or revise your changes. Refactoring existing code to make it testable is an art that takes a lot of practice to do well, and even then there’s a lot of trial and error.

  5. Delete the pinning tests when you’re done.

  6. Repeat steps 3-5, mixing them with regular development work, until you can stop running end-to-end tests on that part of the system. You’ll know it’s time when the end-to-end tests stop catching bugs.

  7. Repeat steps 2-6 until you no longer have any manual tests and you have a minimal number of automated end-to-end tests. That will take a while, and may never be done entirely, but you should see noticeable improvements after a few months of dedicated effort.

(And next time, use test-driven development from the beginning. You’ll save yourself a headache.)

Doing it Live

Pinning tests are one of many techniques you’ll use when retrofitting tests to legacy code. They let you make changes safely. Deciding what to change, though, is a bit of an art. So I’m going to do it live! Starting with the March special, I’ll head to The Lab and record an exhibition of testing legacy code. Unrehearsed, of course, so you can see all the real-world pain, anguish, and triumphs.

  • See tests get added to a real open source project!

  • Marvel at the depths of despair to which I can sink!

  • Be amazed that it works out in the end! (erm... I hope.)

Ahem.

Anyway, I need your help choosing a project. (Update: it’s done! Thanks for your help.) I want to use a real-world, open-source project. The best project:

  • uses front-end JavaScript
  • is recognizable and widely used
  • has interesting, photogenic behavior
  • includes code only a mother could love
  • has no or minimal tests

Put your suggestions in the comments! I look forward to hearing what you think.
