[Book Study] Clean Architecture — Part 1

Zong Yuan
11 min read · Oct 3, 2021

At my previous company, I worked on a web application that we started from scratch with just 4 developers. As the team grew to more than 30 developers, I realized that our overall performance did not increase as rapidly as the head count did. At the same time, as the system became more and more complicated, its quality dropped, and less refactoring was done than in the earlier stages of development. I also noticed that after every deployment my team members were stressed if they had shipped critical features or CRs, even though they had tested them before deploying.
Larger companies (Google, Facebook…) have more developers working on much more complicated projects, yet this kind of issue doesn’t seem to happen to them, so we must have been missing something important in our development. Therefore, I decided to improve myself in two directions: code structure and testing (TDD).
Then I found the book “Clean Architecture” by Robert C. Martin, which answered my questions and addressed my problem. I finished the book and decided to read it again, but this time I want to write down my summary (leaving out some of the details the book uses to explain or reinforce the author’s ideas) along with my personal thoughts, for future reference.
I will split the book into a series of articles. Please feel free to correct me or give any feedback.

This article covers Chapters 1 to 6.

Foreword

When reviewing PRs, I sometimes just felt that the code structure was not right and would be hard to maintain in the future. However, that was mostly instinct, and I lacked the theory to back those feelings up.

Making software work is easy; making it right is not. But how do we define “right” software?

  1. Easy to maintain
  2. Changes are simple and rapid
  3. Defects are few

In other words, developers can always maintain the software with minimal effort while it keeps working correctly and stays as flexible as possible.

Even if the software is working fine (it had better be, otherwise the boss will come to you), that doesn’t mean it is flexible enough to change. If a change takes a lot of effort even when its scope is small, or if changing one part keeps breaking other parts, then the software’s flexibility is poor.

Chapter 1 — What is design and architecture?

The goal of software architecture is to minimize the human resources required to build and maintain the required system.

Design Quality Measurement

Design quality is measured by the effort required to meet the needs of the customer. If the effort stays low throughout the lifetime of the software, the design is good. If the effort grows with each new release, the design is bad.

Defects can appear when a developer working on feature A accidentally breaks the logic of feature B. If this kind of defect keeps happening, I think that counts as bad design too.

Case Study

The book provides an interesting case study here. As the development team grows (and so does the money spent on salaries), the team’s productivity drops instead.

So, what went wrong?

  1. “Rush to market first, we can clean it up later.” However, that never happens, because once you get to market, the market pressure and the never-ending new features keep coming. Your rushed, messy code becomes even messier, and in the end it turns into tech debt that drags your team’s productivity toward zero.
  2. “Messier code means faster implementation in the short term.” The experiment done by Jason Gorman showed that adopting TDD actually shortened the time he needed to complete the code. In other words, when you decide to write messier code to rush for a deadline, you are actually driving yourself toward the cliff.

I tried adopting TDD in my LeetCode challenges, and it worked amazingly well. I remember that when I completed the code, I doubted whether I had missed something, because it all went so smoothly. Another benefit of TDD is that it makes sure you really understand the requirement before you start coding.

However, as Jason Gorman mentions, TDD doesn’t cost extra time for people who are already experienced with it; if you are unfamiliar with TDD, expect a learning curve.

The real evil is developers’ overconfidence. Developers believe they can clean it up later, and they believe they can work faster without architecture and design. Therefore, even if they decide to start over from scratch, the same issue will happen again as long as they remain overconfident.

To design a good software architecture, you need to know what a good architecture looks like, and that is what this book is about.

There is a famous Chinese saying: “To do a job well, you must first sharpen your tools.” I think it fits software development: to deliver good software, you first need a good architecture design (and, of course, good development tools as well).

Chapter 2 — A Tale of Two Values

Behavior and structure are the two values of software. Programmers are hired not only to make the machine behave as expected; they should also make sure the system’s structure/architecture stays flexible enough to take on new features.

The difficulty of making a change for a new feature should be proportional to the scope of the change, not to its shape. If the system architecture prefers one shape over another, then every new feature of a different shape will be harder and harder to fit into the structure.

In other words, when a stakeholder requests a new feature, the developer shouldn’t have to rework existing business logic or code just to make the new feature (the shape of the change) fit into the current system. If the code structure is good enough, the new feature should plug into the current system easily.

If we categorize the values of a system by urgency and importance, we can say that the behavior of software is urgent but not always important, while the structure of software is important but not urgent.

And now we can arrange our priorities of tasks according to their urgency and importance:

  1. Urgent and important
  2. Not urgent but important
  3. Not important but urgent
  4. Not urgent and not important

As we can see from the list above, the structure of software occupies the 1st and 2nd positions, while the behavior of the system occupies the 1st and 3rd. The problem is that developers often treat the 3rd position (behavior that is urgent but not important) as the top priority, which leads to a messy structure in the end.

Business managers, or whoever raises new features and changes, are not always aware of the importance of system structure; their job is to raise new features. We, as software developers, should take responsibility for keeping the system structure healthy while working on those features.

Just remember: if architecture comes last, the system will become ever more costly to develop, and eventually change will become practically impossible for part or all of the system. It is the software development team’s duty to fight hard enough for the architecture.

I have always believed that it is my job to understand the requirement and propose a good solution, not just to follow instructions and implement the change directly. Don’t be a code machine, be a developer.
I remember once reading a funny quote: “Imagine that the developer who is going to take over your code is easily angered and knows where you live. If you don’t do it properly, you might get punched the next time you walk out of your house.”

Programming Paradigms

The book introduces 3 programming paradigms in the following section. Programming paradigms are ways of programming; they tell you which programming structures to use and when to use them.

From my point of view, the book makes two points in this section. First, it traces how software development progressed from unrestricted code to today’s structured code. Second, it explains why restricting the way we write code makes software better.

Chapter 3 — Paradigm Overview

  1. Structured Programming
    - Removed goto, added if/then/else/do/while/until.
    - Imposes discipline on direct transfer of control
  2. Object-Oriented Programming
    - Removed use of function pointer
    - Imposes discipline on indirect transfer of control
  3. Functional Programming
    - Removed modification of variable values
    - Imposes discipline upon assignment

Note that each of these paradigms removes capabilities from the programmer; none of them adds new capabilities. In other words, these paradigms tell us what not to do more than they tell us what to do.

But what is the relationship between these paradigms and code architecture? As summarized from the book:

  1. Structured Programming
    - Used as algorithmic foundation of modules
    - Function
  2. Object-Oriented Programming
    - Polymorphism is used as the mechanism to cross architectural boundaries
    - Separation of Components
  3. Functional Programming
    - Discipline on location of and access to data
    - Data Management

At this point, I didn’t really understand what the book was trying to explain. However, I got the idea once I had finished the next 3 chapters and come back to this one.

Chapter 4 — Structured Programming

In short, structured programming argues that goto should be considered harmful.

Dijkstra (Edsger Wybe Dijkstra) discovered that certain uses of the goto statement prevent modules from being decomposed recursively into smaller and smaller units, which rules out the divide-and-conquer approach. He also found that the “good” uses of goto correspond to simple selection (if/then/else) and iteration (do/while) control structures. Modules built only from these control structures can be recursively subdivided into smaller, provable units.

C provides a goto statement, which was one of my favorites when I started learning programming. I remember my teacher once told us not to use goto in the future, but never explained why. For a while I thought it was because goto makes code reviews harder.

I guess this is why higher-level programming languages like Java don’t offer this statement at all.

The benefit of subdividing a program into smaller, provable units is that it makes your program testable.

For example,

if (a == 0) {
    // Logic A
} else {
    // Logic B
}

You can test logic A and logic B separately to make sure your program works correctly in each situation.
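As a minimal sketch of what that looks like (written in Python; `describe` is a hypothetical function that simply has the same two-branch shape as the snippet above), each branch gets its own small test:

```python
def describe(a: int) -> str:
    """Hypothetical function with the same two-branch shape as the snippet above."""
    if a == 0:
        return "zero"      # Logic A
    else:
        return "non-zero"  # Logic B


def test_logic_a() -> None:
    # Exercises only the `a == 0` branch.
    assert describe(0) == "zero"


def test_logic_b() -> None:
    # Exercises only the `else` branch.
    assert describe(5) == "non-zero"
    assert describe(-1) == "non-zero"


if __name__ == "__main__":
    test_logic_a()
    test_logic_b()
    print("both branches pass")
```

Because each unit is small, a failing test points directly at the branch that is wrong.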

The book quotes an interesting remark from Dijkstra.

“Testing shows the presence, not the absence, of bugs.”

This means that when we test a program, we are actually trying to prove that its functions are incorrect. If we fail to prove incorrectness, we consider the functions/program correct enough for our purposes.

The ability to create falsifiable units of programming is what makes structured programming valuable today. And this is why functional decomposition remains one of our best practices.

Chapter 5 — Object-Oriented Programming

The author discusses the 3 major concepts of object-oriented programming (encapsulation, inheritance, and polymorphism) one by one to explain what OO is and what makes it great from the perspective of software architecture.

Encapsulation

Encapsulation is not something new that OO brought; a lower-level language like C can already do perfect encapsulation. Besides that, many OO languages don’t enforce encapsulation; programmers have to be well-behaved to achieve it.

Inheritance

As with encapsulation, inheritance is something programmers were able to do before OO (the author gives an example in C, which I will skip here). Even though OO makes inheritance easier to achieve, inheritance still cannot be considered OO’s greatest contribution.

Polymorphism

Polymorphism was available before OO too. Languages like C use function pointers to achieve polymorphic behavior. However, using pointers to functions is dangerous: it relies on manual conventions, and a mistake can cause bugs that are hard to track down and eliminate.

POSIX I/O is a good example of polymorphism in C: the same read and write calls work on whatever device sits behind the file descriptor.

OO languages remove the need for explicit function pointers and make polymorphism trivial, which is more convenient and much safer than the old C way.

And with OO polymorphism, it becomes easier to use a plugin architecture to loosen the dependencies between the core business logic and low-level details (such as IO devices).

If you remove the dependency of the core logic on those details, then when the details have to change, the core logic remains untouched.
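Here is a rough sketch of that plugin idea in Python. All names (`Output`, `ConsoleOutput`, `MemoryOutput`, `greet`) are made up for illustration and are not taken from the book; the core logic only knows the small interface, and each IO device is a plugin that satisfies it.

```python
from typing import List, Protocol


class Output(Protocol):
    """What the core logic needs from any output device (hypothetical interface)."""

    def write(self, text: str) -> None: ...


class ConsoleOutput:
    """Low-level detail: writes to the terminal."""

    def write(self, text: str) -> None:
        print(text)


class MemoryOutput:
    """Another low-level detail: collects lines in a list (handy in tests)."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def write(self, text: str) -> None:
        self.lines.append(text)


def greet(name: str, out: Output) -> None:
    """Core logic: knows only the Output interface, not any concrete device."""
    out.write(f"Hello, {name}!")


if __name__ == "__main__":
    greet("Ada", ConsoleOutput())   # plug in the console
    buffer = MemoryOutput()
    greet("Ada", buffer)            # plug in a different device, same core logic
```

Swapping the device never requires touching `greet`, which is the point of keeping the core logic independent of the details.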

Dependency Inversion

Once polymorphism is introduced, the source code dependency no longer has to follow the flow of control.

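In the book’s diagram, the source code dependency points in the opposite direction to the flow of control: the caller invokes the implementation at run time, but in the source code both of them depend on an interface. This is called dependency inversion.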

With this, OO languages provide a powerful feature: “any source code dependency, no matter where it is, can be inverted”. In other words, programmers working in OO languages have absolute control over the direction of the source code dependencies in the system. And this is what OO is really all about from a software architect’s point of view.

What is the benefit of dependency inversion? Let’s consider a system built from 3 parts: UI, Business Logic, and Database. Its code dependencies look like this.

UI depends on Business Logic, Business Logic depends on Database

Here the code dependencies follow the flow of control: the UI triggers the business logic, and the business logic accesses data from the database.

Now imagine the database is outdated and we decide to replace it with another type of database. Instead of changing only the database code, we have to change a lot of the business logic at the same time, since it depends on the database.

With dependency inversion, we can rearrange the system’s code dependencies as below.

UI and Database both depend on Business Logic

With the code dependencies arranged like this, the business logic won’t be affected if we need to replace the UI or the database. This is why absolute control over every source code dependency in the system is important to software architecture.
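To make the inverted arrangement concrete, here is a small Python sketch under my own assumptions: the interface `OrderRepository`, the rule `is_vip`, and both database classes are hypothetical names, not from the book. The business logic owns the interface, and each database module depends on it by implementing it.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


# --- Business logic: owns the interface it needs (hypothetical names) ---
class OrderRepository(ABC):
    @abstractmethod
    def total_cents_for(self, customer_id: str) -> int: ...


def is_vip(customer_id: str, repo: OrderRepository) -> bool:
    """Business rule: depends only on the abstraction above."""
    return repo.total_cents_for(customer_id) >= 100_000


# --- Database details: depend on (implement) the business logic's interface ---
class DictOrderRepository(OrderRepository):
    """Stands in for the 'old' database."""

    def __init__(self, orders: Dict[str, List[int]]) -> None:
        self._orders = orders

    def total_cents_for(self, customer_id: str) -> int:
        return sum(self._orders.get(customer_id, []))


class RowOrderRepository(OrderRepository):
    """Stands in for the 'new' database with a different storage layout."""

    def __init__(self, rows: List[tuple]) -> None:
        self._rows = rows  # (customer_id, amount_in_cents) pairs

    def total_cents_for(self, customer_id: str) -> int:
        return sum(amount for cid, amount in self._rows if cid == customer_id)
```

Replacing `DictOrderRepository` with `RowOrderRepository` changes nothing in `is_vip`; only the code that wires the pieces together needs to know which one is in use.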

Chapter 6 — Functional Programming

The key idea of functional programming is that variables are immutable; in other words, variables in functional programming do not vary.

Why is immutability important to architecture?
All the concurrency problems (race conditions, deadlocks, concurrent updates…) are caused by mutable variables. If the system’s variables are fully immutable, these problems cannot happen at all.

In other words, the larger the immutable part of the system becomes, the more robust the system is.

Since immutability demands more storage and processor speed, it is difficult (often impractical) to make the whole system immutable.

Segregating the system into immutable and mutable components is one of the most common compromises for gaining robustness through immutability.

Keep in mind that when you segregate a system into mutable and immutable components, you should push as much code as possible into the immutable components and keep as little as possible in the mutable ones.
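One possible way to picture this split, as a sketch only: the `Account`, `deposit`, and `AccountStore` names below are my own illustration, not the book’s. The rules live in immutable values and pure functions, and a single thin class is the only place where state is reassigned.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Account:
    """Immutable value: any 'change' produces a new Account."""
    owner: str
    balance: int  # in cents


def deposit(account: Account, amount: int) -> Account:
    """Pure function: nothing is modified, a new value is returned."""
    if amount <= 0:
        raise ValueError("deposit must be positive")
    return replace(account, balance=account.balance + amount)


class AccountStore:
    """The small mutable component: the only place where state is reassigned."""

    def __init__(self) -> None:
        self._accounts: Dict[str, Account] = {}

    def save(self, account: Account) -> None:
        self._accounts[account.owner] = account

    def load(self, owner: str) -> Account:
        return self._accounts[owner]
```

Everything that needs careful reasoning (validation, arithmetic) sits in `deposit`, which cannot corrupt shared state; `AccountStore` stays small enough to protect with a lock or a transaction if needed.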

Event Sourcing

The book provides a simple example of event sourcing. Imagine a banking application that maintains the account balances of its customers; those balances are mutated whenever deposit and withdrawal transactions are executed.

Normally we would keep the balance in a column of a database table. However, when a lot of transactions happen at the same time, a simple mistake can make the balance inaccurate.

Now imagine that we don’t store the balance in the database at all. Every time we need an account’s balance, we compute it from the account’s transactions. This way there are no mutable variables, and there is less chance of ending up with an inaccurate balance.

Of course, taken literally this is crazy: with thousands of transactions per minute, recomputing balances would burn all your computing resources. The point here is just that immutability provides better robustness. The book also offers a more practical take on implementing event sourcing.

And this is the idea behind event sourcing. Event sourcing is a strategy in which we store the transactions but not the state, and apply all the transactions to work out the state whenever it is required.
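The banking example can be sketched in a few lines of Python. This is a toy illustration with hypothetical names (`Transaction`, `balance_of`), not a production design: the log is append-only, existing entries are never updated or deleted, and the balance is derived on demand.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Transaction:
    """One immutable event in the log."""
    account: str
    amount: int  # cents; positive = deposit, negative = withdrawal


def balance_of(account: str, log: Iterable[Transaction]) -> int:
    """Derive the balance by replaying every transaction for the account."""
    return sum(t.amount for t in log if t.account == account)


# Append-only log: entries are only ever added, never changed or removed.
log: List[Transaction] = [
    Transaction("alice", 10_000),
    Transaction("bob", 2_500),
    Transaction("alice", -3_000),
]
log.append(Transaction("alice", 500))

print(balance_of("alice", log))  # 7500
```

A real system would avoid replaying the full history on every read, for instance by periodically saving a computed balance and replaying only the transactions after it.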

As mentioned in the book, source code control systems work in exactly this way.

Afterword

The first few chapters of the book explain the importance of code structure. Understanding why the three paradigms were created/discovered in the first place really gives us some idea of what good code structure looks like, and of why we need architecture to restrict how we write our code.

The next article will cover the design principles (Chapters 7–14), which describe the principles to follow when designing our system architecture.
