The Long Journey To Clean Tests

It doesn’t take much searching to find a blog post or Twitter account preaching the benefits of TDD. One aspect of practicing this discipline that I think is overlooked is its use in a legacy system.

I believe many developers are working in a legacy system, without tests, and are striving to make that legacy system better.

A conversation between a developer in that position and a TDD-advocating idealist may go something like this:

Me: “These tests are difficult to write and they take a long time to run.”

Idealist: “Sounds like your code was written poorly. Your unit tests should not depend on outside dependencies like the database and web service calls. It should only test the business rules.”

Me: “I’d agree with that, but the functionality I need to test is not written that way.”

Idealist: “That should tell you something. You need to refactor to decouple your business rules from those dependencies.”

Me: “We’re making progress on that, but it’s not so easy. Some of our business logic is coupled with the database. We’ll need to rewrite all of our queries. To inject the dependencies, we would need to change our API. By changing the API we need to make a lot of changes to the client code. A LOT of changes to the client code.”

Idealist: “You should rewrite the functionality as a separate class or method. That way you can test it. When you are sure it’s working you can gradually make replacements to the client code by switching to the new class/method.”

Me: “We’re trying that, but it takes quite a bit of development to create those new objects. For example, we are writing the persistence logic as a repository and injecting it. How do you recommend we go about testing the client code? Most of the client code does not have automated tests either. Some of the client code is used by other teams, and they are used to it working correctly. It’s hard to justify the manual testing for functionality that currently works. And did I mention it’s a lot of client code?”

Idealist: “You have to start somewhere, or else you will be testing manually forever. You should write an automated test for a piece of the client code. Once it passes, change the code slightly toward what you are hoping the API looks like. If the test fails, make a modification so that it passes, but only just enough to make it pass. Keep refactoring while keeping the tests green. Do this enough times and the functionality will be decoupled.”

Me: “We’re trying, but the tests are difficult to write, and they take a long time to run.”

The point is that the journey to having clean, fast-running tests may begin with writing slow and ugly tests. These tests are slow because they are testing tightly coupled code. The slow, coupled tests still play a vital role: making sure all the current functionality still works and providing the freedom to refactor.
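As a rough sketch of what such a slow, ugly test can look like, here is a characterization test in Python. Everything in it is hypothetical: the order_lines table, the legacy_order_total function and the 10% discount rule are invented for illustration, and an in-memory SQLite database stands in for the real one. The test only records what the code does today.

```python
import sqlite3


# Hypothetical legacy function: the business rule (10% off orders over 100)
# is tangled up with the SQL that loads the order lines.
def legacy_order_total(conn: sqlite3.Connection, order_id: int) -> float:
    rows = conn.execute(
        "SELECT price, quantity FROM order_lines WHERE order_id = ?", (order_id,)
    ).fetchall()
    total = sum(price * quantity for price, quantity in rows)
    if total > 100:
        total *= 0.9
    return round(total, 2)


def make_test_database() -> sqlite3.Connection:
    """Build a throwaway database; in a real legacy system this is the slow part."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_lines (order_id INTEGER, price REAL, quantity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO order_lines VALUES (?, ?, ?)",
        [(1, 60.0, 2), (2, 10.0, 3)],
    )
    return conn


def test_legacy_order_total_keeps_current_behavior() -> None:
    conn = make_test_database()
    try:
        # Expected values describe current behavior, not necessarily correct behavior.
        assert legacy_order_total(conn, 1) == 108.0  # 120 with the discount applied
        assert legacy_order_total(conn, 2) == 30.0   # below the discount threshold
    finally:
        conn.close()


if __name__ == "__main__":
    test_legacy_order_total_keeps_current_behavior()
```

With a test like this in place, the discount rule can be moved out of the query code one small step at a time, rerunning the test after each step.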

TDD evangelists would agree that the freedom to refactor is key. When you don’t have it, the continuous improvement loop is broken. The problem is that, in a legacy system, getting that feedback loop in place to begin with can be quite the struggle.

3 Ups and 3 Downs of Test Driven Development

Below are three good things I like about Test Driven Development and three annoyances. I have tried to avoid any benefit or annoyance that relates to unit testing in general. We all know there are many of those. The pros and cons here relate specifically to following the strict guidelines of TDD.

3 Ups

It takes out the decision of “how do I start?”

Sometimes you don’t know the best place to start when you’re given a project or feature to develop. By following Test Driven Development, there is one less decision you need to make. Your task is to write a test, not an entire piece of software. Just by following this discipline, we are already breaking our software into manageable chunks.

Write a test, then write another test. Then another.

Lets you think in terms of how it will be used rather than how it should work

I’m sure it has happened to all of us at some point. We build out a feature to the point where it can handle everything we asked it to do, but what remains is a little bit ugly. We’re left with a method signature that takes a handful of objects and gives away too much about the implementation details.

This can be avoided. It starts with pretending like every time you write a function, it is going to magically do the thing it is describing. What would this function look like? When you start this way, you are only going to add complexity when it is warranted.

TDD drives this process by starting with the calling code first.
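A small sketch of what "calling code first" can look like in Python. The shipping_cost function and its pricing numbers are made up for this example; the point is only that the test is the first caller, so the signature is shaped by how the function reads at the call site.

```python
# Step 1: the test is written first, so it is the first caller of the API.
# At this point shipping_cost does not exist yet and the test fails.
def test_shipping_cost_reads_like_the_requirement() -> None:
    assert shipping_cost(weight_kg=2.0, express=False) == 7.0
    assert shipping_cost(weight_kg=2.0, express=True) == 12.0


# Step 2: only now is the function written, with the signature the test asked for.
# Hypothetical pricing: 5.0 flat fee, 1.0 per kg, 5.0 extra for express.
def shipping_cost(weight_kg: float, express: bool) -> float:
    base = 5.0 + 1.0 * weight_kg
    return base + 5.0 if express else base


if __name__ == "__main__":
    test_shipping_cost_reads_like_the_requirement()
```

Nothing about how the price is computed leaks into the call: the caller passes a weight and a flag, and complexity is added behind that signature only when a new test demands it.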

It prevents your project from getting away from you

We all do it. We write production code and get in a groove. We don’t want to break our train of thought by switching contexts to write the calling code. We assure ourselves that we will remember to go back and write some tests later to make sure the code is working.

The rules of TDD are there to prevent you from writing too much code before it’s tested. Well, they actually prevent you from writing any code before it’s tested. When you write too much code, it’s rare that you will remember all the tests you thought you needed to write.

3 Downs

Adhering to “The simplest thing that will possibly work.”

Following the rules of TDD requires writing code that does the simplest thing that will possibly work. This is a tough one to follow. The simplest thing that can possibly work usually involves intentionally adding code you know in the long run will NOT work. It often gets you into trouble because the only way to catch the bug you just made is to make sure you have enough tests, but when can you ever be sure?

It requires the thinking “I know that passed test is wrong, I need to come up with a new test that proves it’s wrong”. You are in an awkward state in the development process here. If you take a break from your work for a second, are you going to remember you did the simplest thing that could possibly work on purpose?

I understand the rule: it is there to prevent your mind from getting way ahead. It prevents you from skipping steps. However, adding bugs so that they can be snuffed out later and removed is a slightly risky process.
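To make the awkward state concrete, here is a sketch using a leap year check, an example chosen for this illustration. Each version is the simplest code that passes the checks written so far, and each is knowingly wrong until the next check exposes it.

```python
from typing import Callable

LeapYearFn = Callable[[int], bool]


# Check 1 asks only about 2024, so the simplest passing code is a constant.
def check_2024_is_leap(is_leap_year: LeapYearFn) -> None:
    assert is_leap_year(2024)


def is_leap_year_v1(year: int) -> bool:
    return True  # deliberately wrong for most years


# Check 2 is written specifically to prove the constant wrong.
def check_2023_is_not_leap(is_leap_year: LeapYearFn) -> None:
    assert not is_leap_year(2023)


def is_leap_year_v2(year: int) -> bool:
    return year % 4 == 0  # passes checks 1 and 2, still wrong for 1900


# Check 3 exposes the remaining shortcut.
def check_1900_is_not_leap(is_leap_year: LeapYearFn) -> None:
    assert not is_leap_year(1900)


def is_leap_year_v3(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def passes(check: Callable[[LeapYearFn], None], implementation: LeapYearFn) -> bool:
    """Run one check against one implementation and report the result."""
    try:
        check(implementation)
        return True
    except AssertionError:
        return False


if __name__ == "__main__":
    checks = [check_2024_is_leap, check_2023_is_not_leap, check_1900_is_not_leap]
    for version in (is_leap_year_v1, is_leap_year_v2, is_leap_year_v3):
        print(version.__name__, [passes(check, version) for check in checks])
```

If you stop after the second version, every check you have is green and the code is still wrong. Nothing in the test suite records that the shortcut was taken on purpose.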

The assumption of fast running tests

It is not often that you begin using TDD when you happen to be starting a greenfield project. More commonly, you know it’s a practice that has well known benefits and want to apply it to your current work.

The difficult part here is applying TDD to code that isn’t well built for unit testing. Even if it’s new functionality you are writing, you may rely on libraries that depend on database or web service calls that you can’t do anything about.

This is very commonly overlooked. Advocates of TDD work under the assumption that Red – Green – Refactor is a fast feedback loop. If you are tied to legacy code that is not fit to run frequently, your efficiency may take a hit.
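When the slow call can at least be passed in as a parameter, the loop can be kept fast for the code you do control. The sketch below assumes exactly that, which is not always possible in legacy code. The names slow_rate_lookup and convert are hypothetical, and the sleep stands in for a web service call.

```python
import time
from typing import Callable


def slow_rate_lookup(currency: str) -> float:
    """Stand-in for a library call that reaches a web service (hypothetical)."""
    time.sleep(2)  # simulates network latency
    return 1.1


def convert(
    amount: float,
    currency: str,
    rate_lookup: Callable[[str], float] = slow_rate_lookup,
) -> float:
    # Production callers get the slow default; tests can pass a fast substitute.
    return round(amount * rate_lookup(currency), 2)


def test_convert_applies_the_rate() -> None:
    # The lambda replaces the slow lookup, so this test runs without the delay.
    assert convert(10.0, "EUR", rate_lookup=lambda currency: 1.5) == 15.0


if __name__ == "__main__":
    test_convert_applies_the_rate()
```

This only helps at the seams you are allowed to change. Code that calls the slow library directly, deep inside a method you cannot touch yet, still pays the full cost on every run.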

Is testable code always cleaner?

When following TDD, the test dictates the API. This avoids writing new functionality and then getting stuck trying to create a test after the fact. The general belief is that testable code is quality code, so the restriction to developing in this order yields well-designed software.

What this tends to lead to is many injected dependencies. This is normally a good practice, but as an absolute requirement it can lead to unmaintainable software. Quite often you have to convince yourself that “it will be cleaner in the end”, but is it always?

We also know that we are not perfect as developers. We are going to take wrong turns that TDD won’t be able to correct. Much like deliberately adding the buggy code, when we know something doesn’t feel right, we shouldn’t continue blindly.

Layered Architecture

I’m going to give my take on a very common layered architecture diagram like the one below. I don’t think this diagram is a perfect example of what layers belong where, but I’ll save that to the end. I think there’s value in giving some details on each one without being too picky on how the diagram needs to look.

There are many resources out there for figuring out what belongs in each layer. I am going to focus on why each layer exists and how to determine what belongs in it.

Presentation Layer

What it is

All aspects of the code that the user or component client sees or interacts with. This could be a web site, a mobile application, a desktop application, a console application or whatever technology that comes next.

Why it is its own layer

The way you present your application should not change your application. The application should be able to work regardless of the mechanism you choose to interact with it. This is important for a few reasons.

  1. Technologies are going to change over time and you are going to want to change with them. You should be able to tinker with the presentation without having to change anything in the application. In fact, you should be able to completely replace your front end design if you so choose.
  2. You may want to move your code to a native mobile application or a console app. If this were to happen, you should not need to rewrite any of the same application logic.

 

Helping Answer “Does this belong in the Presentation Layer?”

  1. Does the code rely on a certain I/O technology (web, mobile, desktop, etc.)? If so, it belongs here.
  2. This layer shouldn’t directly do anything business related. All that work should be delegated to the application layer classes.
  3. You can still have business object names in your classes, so long as they are not “doing stuff” business related. The “stuff” should be functionality for presenting data.
  4. Any designer or marketer that has influence over the “look and feel” of your application should be limited to this portion of the code base. If a change to “look and feel” causes you to rework some business logic, you probably have some code in the wrong area of the system.
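As a rough illustration of the checklist above, here are two presenters sitting on top of the same application function. AccountSummary, get_account_summary and the two render functions are all hypothetical names invented for this sketch.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountSummary:
    owner: str
    balance: float


# Application layer (hypothetical): knows nothing about how the result is shown.
def get_account_summary(owner: str) -> AccountSummary:
    return AccountSummary(owner=owner, balance=42.5)


# Presentation layer, console flavor: formatting only, no business decisions.
def render_console(summary: AccountSummary) -> str:
    return f"{summary.owner}: {summary.balance:.2f}"


# Presentation layer, web flavor: same data, different I/O technology.
def render_html(summary: AccountSummary) -> str:
    return f"<p><strong>{html.escape(summary.owner)}</strong>: {summary.balance:.2f}</p>"


if __name__ == "__main__":
    summary = get_account_summary("Ann")
    print(render_console(summary))
    print(render_html(summary))
```

Both presenters mention a business object by name, but neither one "does stuff" with it beyond formatting. Swapping one for the other requires no change to get_account_summary.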

 

Application Layer

What it is

This is everything that makes the application useful outside of the user interface. This is also where the business logic mixes with the technical details, but only if the technical details can be reused in different presentation layers.

Some diagrams may also refer to this layer as the “use cases” for the business objects. This also makes sense, as we don’t want our business objects to be coupled to each other. The “use cases” are going to refer to entire sequences of steps abstracted into method calls for the presentation layer.

Why it is its own layer

This layer is going to contain everything that needs to be coupled to technology, so it’s really what is left over from the presentation layer and the business layer. If something can be decoupled from use cases or technical details, it should be a part of the business layer.

An example of how we distinguish the above and below layers.

  1. Presentation Layer: Android music application.
    • Useful for android users.
  2. Application Layer: Music Application Web Service with various methods: GetAllMusic, GetAllPlaylists, GetFavoriteMusicByUser, etc.
    • Useful for any presenter that needs to call a web service.
  3. Business Layer: Music, Playlist and Security libraries containing classes and business rules.
    • Useful for any music application we want to create
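The music example above could be sketched in Python roughly as follows. The method names are snake_case versions of the ones in the list. The class shapes, the constructor arguments and the text-only presenter are assumptions made for illustration, and the in-memory data stands in for whatever storage the real service would use.

```python
from dataclasses import dataclass, field


# Business layer: plain entities and rules, no technology decisions.
@dataclass(frozen=True)
class Track:
    title: str
    seconds: int


@dataclass
class Playlist:
    name: str
    tracks: list[Track] = field(default_factory=list)

    def total_seconds(self) -> int:
        return sum(track.seconds for track in self.tracks)


# Application layer: use cases that any presenter can call.
class MusicService:
    def __init__(self, playlists: list[Playlist], favorites: dict[str, list[Track]]) -> None:
        self._playlists = playlists
        self._favorites = favorites

    def get_all_playlists(self) -> list[Playlist]:
        return list(self._playlists)

    def get_favorite_music_by_user(self, user: str) -> list[Track]:
        return list(self._favorites.get(user, []))


# Presentation layer: one possible front end, here just lines of text.
def render_playlists(service: MusicService) -> list[str]:
    return [
        f"{playlist.name} ({playlist.total_seconds() // 60} min)"
        for playlist in service.get_all_playlists()
    ]


if __name__ == "__main__":
    song = Track("Song A", 200)
    service = MusicService(
        playlists=[Playlist("Road Trip", [song, Track("Song B", 160)])],
        favorites={"ann": [song]},
    )
    print(render_playlists(service))
```

Track and Playlist could be reused by any music application, MusicService by any presenter, and render_playlists only by this particular front end.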

 

Helping Answer “Does this belong in the Application Layer?”

  1. If this is a core business rule, it should belong in the business layer if possible.
  2. Service or Controller classes typically belong in this layer. These are often performing a sequence of steps related to different business entities.
  3. All of the code that makes an application “work” behind the scenes.

 

Business Layer

What it is

The business entities in their simplest form. These are the building blocks used by the application layer to create an application.

Why it is its own layer

The application layer is going to need to change frequently. As more requirements, refactoring and optimizations occur, your core business logic should remain the same. This layer will be shielded from those technical decisions. This layer will be flexible and will avoid most of the complex architecture decisions.

Helping Answer “Does this belong in the Business Layer?”

  1. If you had to show a business user how the product works in the code, this is your BEST chance to show them some code they may actually understand
  2. Business functionality in this layer should be less likely to change than business functionality in the application layer.
  3. A client using code from this layer should be able to build any application they can think of. No technical decisions will have been made for them.
  4. Consider coding as if this were to be created as a third party library.
    1. The Application layer works out of the box, but is inflexible
    2. The Business layer is the “do it yourself” kit
  5. These are much smaller and more specific libraries. Your application classes and methods are likely going to reference many business libraries. Your business libraries should avoid that.

 

Data Access Layer

What it is

This layer has one job, and it is to contain the classes and functions that persist the business entities in the layer above.

Why it is its own layer

  1. How we save things should not be a concern to the business
  2. Just like the presentation layer, how you save your application should not change your application
  3. Like the presentation layer, it could contain business object names, but it should not make business decisions

 

Helping Answer “Does this belong in the Data Access Layer?”

  1. Ideally there should be no business logic in this layer, but it is sometimes unavoidable for performance reasons. If performance is not an issue, the logic should not be added here.
  2. Product managers, sales, marketing or other business stakeholders should not be influencing any decisions made in this area of the code base.
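A minimal sketch of a data access class along these lines, using Python's built-in sqlite3 module. The Track entity, the tracks table and the SqliteTrackRepository name are assumptions for this example. The class mentions a business object by name but only saves and loads it.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


# Business entity (hypothetical) that the layer above wants persisted.
@dataclass(frozen=True)
class Track:
    title: str
    seconds: int


class SqliteTrackRepository:
    """Persists Track entities; makes storage decisions, not business decisions."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS tracks "
            "(title TEXT PRIMARY KEY, seconds INTEGER NOT NULL)"
        )

    def save(self, track: Track) -> None:
        # INSERT OR REPLACE is SQLite syntax: a second save of a title overwrites the row.
        self._conn.execute(
            "INSERT OR REPLACE INTO tracks (title, seconds) VALUES (?, ?)",
            (track.title, track.seconds),
        )
        self._conn.commit()

    def find(self, title: str) -> Optional[Track]:
        row = self._conn.execute(
            "SELECT title, seconds FROM tracks WHERE title = ?", (title,)
        ).fetchone()
        return Track(*row) if row else None


if __name__ == "__main__":
    repository = SqliteTrackRepository(sqlite3.connect(":memory:"))
    repository.save(Track("Song A", 200))
    print(repository.find("Song A"))
```

A rule such as "a track must be shorter than ten minutes" would not belong in this class. The repository stores whatever valid entity it is handed.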

 

Final Thoughts

What I don’t like about this chart is that the data access layer is listed at the bottom. I see it more at the application layer level. Persisting data is something you are choosing to do as part of the application. However, you’ll often want to reference the data access layer from the business entities. It works so long as it’s flexible enough to change later.

The toughest decisions to make are between the application and business layers. There are going to be times where you have “core business logic” that can’t be decoupled from technical details. The boundary between application and business will be something that evolves as your application evolves.

A key part of all of this is using interfaces to keep logic separated. It is impossible to keep layers decoupled without them. Perhaps that is something to discuss in detail in a future post.
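As a small preview of that idea, here is one way an interface can sit between two layers in Python, using typing.Protocol. FavoritesStore, InMemoryFavoritesStore and FavoritesService are hypothetical names for this sketch, not part of any particular framework.

```python
from typing import Protocol


class FavoritesStore(Protocol):
    """The interface the application layer depends on; it names no storage technology."""

    def titles_for(self, user: str) -> list[str]: ...


class InMemoryFavoritesStore:
    """One implementation; a database-backed one could replace it without other changes."""

    def __init__(self, data: dict[str, list[str]]) -> None:
        self._data = data

    def titles_for(self, user: str) -> list[str]:
        return list(self._data.get(user, []))


class FavoritesService:
    """Application-layer class that only knows about the interface."""

    def __init__(self, store: FavoritesStore) -> None:
        self._store = store

    def favorite_count(self, user: str) -> int:
        return len(self._store.titles_for(user))


if __name__ == "__main__":
    service = FavoritesService(InMemoryFavoritesStore({"ann": ["Song A", "Song B"]}))
    print(service.favorite_count("ann"))
```

FavoritesService never imports a concrete store, so the persistence choice can change later, and a test can hand it the in-memory version to stay fast.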

 

My First Post to No One

I’m here to talk Software Architecture. Specifically, the journey of truly visualizing what good architecture looks like and understanding why it is what the industry would consider “good”. I’m no expert, but in my 5 years of being a Software Web Developer, I’ve spent a lot of it trying to figure out the hidden formula. The hidden formula of knowing when you are on the wrong course and how to correct it. How to prevent applications from turning into the “big ball of mud”. There are some good resources on the topic, but too often they give you the answer and skip over the thought process.

Range of topics we’re going to attempt to cover:

  • Actual Real World Problems
    • Such as handling conflicting requirements, determining the better of two implementations or resorting to bad practice because no other solution seems to exist.
  • The Best Resources on Clean Architecture
    • Because it’s important to sift through the fluff and get to the substance.
  • Algorithms, Processes and Strategies
    • We are all seeking a repeatable process that produces the desired result. It’s something I’ve been seeking for quality software architecture. Wish me luck in finding it.
  • Common problems and helpful tricks on how to get better at our profession
    • Because in any subset of the industry, these are helpful. And any time a good solution is found, it is worth documenting.

 

We’ll try to get specific but stay simple. I’m going to try to stay on topic, but no promises.