This tutorial is drawn from my book The Art of Clean Code (NoStarch 2022):
The Art of Clean Code
Most software developers waste thousands of hours working with overly complex code. The eight core principles in The Art of Clean Code will teach you how to write clear, maintainable code without compromising functionality. The book’s guiding principle is simplicity: reduce and simplify, then reinvest energy in the important parts to save you countless hours and ease the often onerous task of code maintenance.
- Concentrate on the important stuff with the 80/20 principle — focus on the 20% of your code that matters most
- Avoid coding in isolation: create a minimum viable product to get early feedback
- Write code cleanly and simply to eliminate clutter
- Avoid premature optimization that risks over-complicating code
- Balance your goals, capacity, and feedback to achieve the productive state of Flow
- Apply the Do One Thing Well philosophy to vastly improve functionality
- Design efficient user interfaces with the Less is More principle
- Tie your new skills together into one unifying principle: Focus
The Python-based The Art of Clean Code is suitable for programmers at any level, with ideas presented in a language-agnostic manner.
Write Clean & Simple Code
Story: I learned to focus on writing clean code the hard way.
One of my research projects during my time as a doctoral researcher in distributed systems was to code a distributed graph processing system from scratch.
The system allowed you to run graph algorithms such as computing the shortest path on a large map in a distributed environment to speed up computation among multiple machines.
If you’ve ever written a distributed application where two processes that reside on different computers interact with each other via messages, you know that the complexity can quickly become overwhelming.
My code base had thousands of lines, and bugs were popping up frequently. I didn’t make any progress for weeks at a time, which was very frustrating.
In theory, the concepts I developed sounded great and convincing. But practice got me!
Finally, after a month or so working full-time on the code base without seeing any encouraging progress, I decided to radically simplify the code base.
- I started to use libraries instead of coding functions myself.
- I removed large code blocks of premature optimizations (see later).
- I removed code blocks that I had commented out for a possible later use.
- I refactored variable and function names. I structured the code in logical units and classes.
And, after a week or so, not only was my code more readable and understandable by other researchers, it was also more efficient and less buggy. I managed to make progress again and my frustration quickly morphed into enthusiasm: clean code had rescued my research project!
Complexity: In the previous chapters, you’ve learned how harmful complexity is for any code project in the real world.
Complexity kills your productivity, motivation, and time. Because most of us haven’t learned to speak in source code from an early age, it can quickly overwhelm our cognitive abilities.
The more code you have, the more overwhelming it becomes. But even short code snippets and algorithms can be complicated.
The following one-liner code snippet from our book Python One-Liners is a great example of a piece of source code that is short and concise but still complex!
    # Quicksort algorithm to sort a list of integers
    unsorted = [33, 2, 3, 45, 6, 54, 33]
    q = lambda l: q([x for x in l[1:] if x <= l[0]]) + [l[0]] + q([x for x in l if x > l[0]]) if l else []
    print(q(unsorted))
    # [2, 3, 6, 33, 33, 45, 54]
You can find an explanation of this code snippet in our book Python One-Liners or online at https://blog.finxter.com/python-one-line-quicksort/.
Complexity comes from many directions when working with source code. It slows down our understanding of the code.
And it increases the number of bugs in our code. Both slow understanding and more bugs increase the project costs and the number of people hours required to finish it.
Robert C. Martin, author of the book Clean Code, argues that the more difficult it is to read and understand code, the higher the costs to write code as well:
“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.” — Robert C. Martin
This relationship is visualized in Figure 5-1.
The x axis corresponds to the number of lines written in a given code project. The y axis corresponds to the time to write one additional line of code.
In general, the more code you’ve already written in one project, the more time it takes to write an additional line of code.
Why is that? Say you’ve written n lines of code and you add line n+1. Adding this line may affect potentially all previously written lines.
- It may have a small performance penalty which impacts the overall project.
- It may use a variable that is defined at another place.
- It may introduce a bug (with probability c), and to find that bug, you must search the whole project. Your expected cost per line of code is therefore c * T(n), where the search time T(n) increases steadily with the project size n.
- It may force you to write additional lines of code to ensure backward compatibility.
There are many more reasons, but you get the point: the more code you’ve written, the more the additional complexity slows down your progress.
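To make the cost argument concrete, here is a minimal sketch of the expected bug cost c * T(n). It assumes, purely for illustration, that the search time grows linearly (T(n) = n) and that each new line introduces a bug with probability c = 0.01. Neither number comes from a measurement, and the function names are hypothetical.

```python
# Illustrative cost model only: T(n) = n and c = 0.01 are assumptions.
def search_time(n):
    """Hypothetical T(n): time to search a project of n lines for a bug."""
    return n


def expected_bug_cost(n, c=0.01):
    """Expected debugging cost of adding one line to a project of n lines."""
    return c * search_time(n)


def total_expected_cost(total_lines, c=0.01):
    """Sum of the expected per-line costs while growing from 0 to total_lines."""
    return sum(expected_bug_cost(n, c) for n in range(total_lines))


for size in (100, 1000, 10000):
    print(size, total_expected_cost(size))
```

Under these assumptions, a project that is ten times larger accumulates roughly a hundred times the total expected debugging cost, which is why each additional line gets more expensive as the project grows.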
But Figure 5-1 also shows the difference between writing dirty versus clean code. If writing dirty code didn’t bring any benefit, nobody would do it!
There’s a very real benefit of writing dirty code: it’s less time-consuming in the short term and for small code projects. If you cram all the functionality in a 100-line code script, you don’t need to invest a lot of time thinking and structuring your project.
But as you add more and more code, the monolithic code file grows from 100 to 1000 lines and at a certain point, it’ll be much less efficient compared to a more thoughtful approach where you structure the code logically in different modules, classes, or files.
Rule of thumb: always try to write thoughtful, clean code, because the additional costs of thinking, refactoring, and restructuring will pay back many times over for any non-trivial project. Besides, writing clean code is just the right thing to do. The philosophy of carefully crafting your programming art will carry you further in life.
You don’t always know the second-order consequences of your code. Think of the spacecraft on a mission towards Venus in 1962, where a tiny bug (an omission of a hyphen in the source code) caused NASA engineers to issue a self-destruct command, which resulted in the loss of a rocket worth more than $18 million at the time.
To mitigate all of those problems, there’s a simple solution: write simpler code.
Simple code is less error-prone, less crowded, easier to grasp, and easier to maintain.
It is more fun to read and write.
In many cases, it’s more efficient and takes less space.
It also facilitates scaling your project because people won’t be scared off by the complexity of the project.
If new coders peek into your code project to see whether they want to contribute, they need to believe that they can understand it. With simple code, everything in your project will get simpler.
You’ll make faster progress, get more support, spend less time debugging, be more motivated, and have more fun in the process.
So, let’s learn how to write clean and simple code, shall we?
Clean code is elegant and pleasing to read. It is focused in the sense that each function, class, and module focuses on one idea. A function transfer_funds(A, B) in your banking application does just that: it transfers funds from account A to account B. It doesn’t check the credit of the sender A; for this, there’s another function, check_credit(A). Simple, focused, and easy to understand.
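The split between the two functions could look like the following minimal sketch. The in-memory balances dictionary and the extra amount parameter are assumptions made for illustration; the text only names transfer_funds(A, B) and check_credit(A).

```python
# Hypothetical in-memory accounts; a real banking system would use a database.
balances = {"A": 100, "B": 20}


def check_credit(account, amount):
    """Return True if the account holds at least the given amount."""
    return balances[account] >= amount


def transfer_funds(sender, receiver, amount):
    """Move the amount from sender to receiver. Nothing else."""
    balances[sender] -= amount
    balances[receiver] += amount


# The caller combines the two focused functions.
if check_credit("A", 30):
    transfer_funds("A", "B", 30)

print(balances)  # {'A': 70, 'B': 50}
```

Because each function does one thing, you can change how credit is checked without touching the transfer logic.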
How do you get simple and clean code? By spending time and effort to edit and revise the code. This is called refactoring and it must be a scheduled and crucial element of your software development process.
Let’s dive into some principles to write clean code. Revisit them from time to time; they’ll become meaningful sooner or later if you’re involved in some real-world projects.
Principles to Write Clean Code
Next, you’re going to learn a number of principles that’ll help you write cleaner code.
Principle 1: You Ain’t Going to Need It
The principle suggests that you should never implement code if you only expect that you’re going to need its provided functionality someday in the future—because you ain’t gonna need it! Instead, write code only if you’re 100% sure that you need it. Code for today’s needs and not tomorrow’s.
It helps to think from first principles: the simplest and cleanest code is the empty file. It doesn’t have any bugs and it’s easy to understand. Now, go from there: what do you need to add to that? In Chapter 4, you learned about the minimum viable product. If you minimize the number of features you pursue, you’ll harvest cleaner and simpler code than you could ever attain through refactoring methods or all other principles combined.
As you know by now, leaving out features is not only useful if they’re unnecessary. Leaving them out even makes sense if they provide relatively little value compared to other features you could implement instead. Opportunity costs are seldom measured, but they are often very significant. The mere fact that a feature provides some benefit doesn’t justify its implementation. You have to really need the feature before you even consider implementing it. Reap the low-hanging fruit first before you reach higher!
Principle 2: The Principle of Least Surprise
This principle is one of the golden rules of effective application and user experience design. If you open the Google search engine, the cursor will already be focused in the search input field so that you can start typing your search keyword right away without needing to click into the input field. Not surprising at all, but a great example of the principle of least surprise. Clean code also leverages this design principle. Say you write a currency converter that converts the user’s input from USD to RMB. You store the user input in a variable. Which variable name is better suited, user_input or var_x? The principle of least surprise answers this question for you!
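As a minimal sketch of the converter described above: every name says what it holds, and the function only does what its name promises. The exchange rate and the function name are hypothetical and chosen for illustration only.

```python
# Hypothetical exchange rate, for illustration only.
USD_TO_RMB_RATE = 7.0


def convert_usd_to_rmb(amount_usd):
    """Return the RMB value of a USD amount. No printing, no hidden state."""
    return amount_usd * USD_TO_RMB_RATE


user_input = "100"  # stands in for what the user typed
amount_rmb = convert_usd_to_rmb(float(user_input))
print(amount_rmb)  # 700.0
```

A reader who sees user_input knows immediately where the value comes from; var_x would force them to trace the code to find out.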
Principle 3: Don’t Repeat Yourself
Don’t Repeat Yourself (DRY) is a widely recognized principle: code that partially repeats itself, or that’s even copied and pasted from your own code, is a sign of bad coding style. A negative example is the following Python code that prints the same string five times to the shell:
    print('hello world')
    print('hello world')
    print('hello world')
    print('hello world')
    print('hello world')
The code repeats itself so the principle suggests that there will be a better way of writing it. And there is!
    for i in range(5):
        print('hello world')
The code is much shorter but semantically equivalent. There’s no redundancy in the code.
The principle also shows you when to create a function and when it isn’t required to do so. Say, you need to convert miles into kilometers in multiple instances in your code (see Listing 5-1).
    miles = 100
    kilometers = miles * 1.60934
    # ...
    # BAD EXAMPLE
    distance = 20 * 1.60934
    # ...
    print(kilometers)
    print(distance)
    '''
    OUTPUT:
    160.934
    32.1868
    '''
Listing 5-1: Convert miles to kilometers twice.
The principle Don’t Repeat Yourself suggests that it would be better to write a function miles_to_km(miles) once, rather than performing the same conversion explicitly in the code multiple times (see Listing 5-2).
    def miles_to_km(miles):
        return miles * 1.60934

    miles = 100
    kilometers = miles_to_km(miles)
    # ...
    distance = miles_to_km(20)
    # ...
    print(kilometers)
    print(distance)
    '''
    OUTPUT:
    160.934
    32.1868
    '''
Listing 5-2: Using a function to convert miles to kilometers.
This way, the code is easier to maintain: you can easily increase the precision of the conversion afterwards without searching the code for all instances where you used the imprecise conversion methodology.
Also, it’s easier to understand for human readers of your code. There’s no doubt about the purpose of the function call miles_to_km(20), while you may have to think harder about the purpose of the computation 20 * 1.60934.
The principle Don’t Repeat Yourself is often abbreviated as DRY and violations of it as WET: We Enjoy Typing, Write Everything Twice, and Waste Everyone’s Time.
Principle 4: Code For People Not Machines
The main purpose of source code is to define what machines should do and how to do it. Yet, if this were the only criterion, you’d use a low-level machine language such as assembly to accomplish this goal because it’s the most expressive and most powerful language.
The purpose of high-level programming languages such as Python is to help people write better code and do it more quickly. Our next principle for clean code is to constantly remind yourself that you’re writing code for other people and not for machines. If your code will have any impact in the real world, it’ll be read multiple times by you or a programmer that takes your place if you stop working on the code base.
Always assume that your source code will be read by other people. What can you do to make their job easier? Or, to put it more plainly: what can you do to mitigate the negative emotions they’ll experience against the original programmer of the code base they’re working on?
Code for people not machines!
What does this mean in practice? There are many implications. First of all, use meaningful variable names. Listing 5-3 shows a negative example without meaningful variable names.
    # BAD
    xxx = 10000
    yyy = 0.1
    zzz = 10
    for iii in range(zzz):
        print(xxx * (1 + yyy)**iii)
Listing 5-3: Example of writing code for machines.
Take a guess: what does the code compute?
Let’s have a look at the semantically equivalent code in Listing 5-4 that uses meaningful variable names.
    # GOOD
    investments = 10000
    yearly_return = 0.1
    years = 10
    for year in range(years):
        print(investments * (1 + yearly_return)**year)
Listing 5-4: Using meaningful variable names.
The variable names indicate that you calculate the value of an initial investment of 10,000 compounded over 10 years assuming an annual return of 10%.
The principle of writing code for people has many more applications. It also applies to indentation, whitespace, comments, and line lengths. Clean code radically optimizes for human readability. As Martin Fowler, international expert on software engineering and author of the popular book Refactoring, argues:
“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”
Principle 5: Stand on the Shoulders of Giants
There’s no value in reinventing the wheel. Programming is a decades-old industry, and the best coders in the world have given us a great legacy: a collective database of millions of fine-tuned and well-tested algorithms and code functions.
Accessing the collective wisdom of millions of programmers is as simple as using a one-liner import statement. You’d be crazy not to use this superpower in your own projects.
Besides being easy to use, using library code is likely to improve the efficiency of your code because functions that have been used by thousands of coders tend to be much more optimized than your own code functions.
Furthermore, library calls are easier to understand and take less space in your code project.
For example, if you need a clustering algorithm to visualize clusters of customers, you can either implement it yourself or stand on the shoulders of giants and import a clustering algorithm from an external library and pass your data into it.
The latter is far more time efficient: you’ll take much less time to implement the same functionality with fewer bugs, less space, and more performant code. Libraries are one of the top reasons why master coders can be 10,000 times more productive than average coders.
Here’s the two-liner that imports the KMeans class from the scikit-learn Python library rather than reinventing the wheel:
    from sklearn.cluster import KMeans
    # X holds your data: an array-like of shape (n_samples, n_features)
    kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
If you wanted to implement the KMeans algorithm yourself, it would take you a few hours and 50 lines of code, and it would clutter your code base so that all future code would become harder to implement.
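The same idea applies on a much smaller scale with Python’s standard library. The following sketch is a different example from the clustering one, with made-up data: it compares a hand-written counting loop with a single call to collections.Counter.

```python
from collections import Counter

words = ["spam", "egg", "spam", "ham", "spam", "egg"]

# Hand-written version: count occurrences with an explicit loop.
counts_by_hand = {}
for word in words:
    counts_by_hand[word] = counts_by_hand.get(word, 0) + 1

# Library version: one call to a well-tested standard-library class.
counts = Counter(words)

print(counts.most_common(1))  # [('spam', 3)]
```

Both versions produce the same counts, but the library call is shorter, and extras such as most_common() come for free.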
Principle 6: Use the Right Names
Your decisions on how to name your functions, function arguments, objects, methods, and variables reveal whether you’re a beginner, intermediate, or expert coder. How?
In any programming language, there are many naming conventions that are used by all experienced coders.
If you violate them, it immediately tells the reader of your code base that you’ve not had a lot of experience with practical code projects. The more such “tells” exist in your code, the less seriously a reader will take it.
There are a lot of explicit and implicit rules governing the correct naming of your code elements. These rules may even differ from programming language to programming language.
For example, you’ll use camelCaseNaming for variables in the Java programming language while you’ll use underscore_naming in Python.
If you start using camel case in Python, everyone will immediately see that you’re a Python beginner. While you may not like this, it’s not really a big problem to be perceived as a beginner; everyone has been one at one point in time. Far worse is that other coders will be negatively surprised when reading your code.
Instead of thinking about what the code does, they start thinking about how your code is written. You know the principle of least surprise: there’s no value in surprising other coders by choosing unconventional variable names.
So, let’s dive into a list of naming rules of thumb you can consider when writing source code. This will speed up your ability to learn how to choose clean names.
However, the best way to learn is to study the code of people who are better than you. Read a lot of programming tutorials, join the Stack Overflow community, and check out the GitHub code of open-source projects.
- Choose descriptive names. Say you create a function to convert currencies from USD to EUR in Python. Call it usd_to_eur(amount) rather than f(x).
- Choose unambiguous names. You may think that dollar_to_euro(amount) would be a good name as well for the previously discussed function. While it is better than f(x), it’s worse than usd_to_eur(amount) because it introduces an unnecessary degree of ambiguity. Do you mean US, Canadian, or Australian dollars? If you’re in the US, the answer may be obvious to you. But an Australian coder may not know that the code is written in the US and may assume a different output. Minimize these confusions!
- Use pronounceable names. Most coders subconsciously read code by pronouncing it in their mind. If they cannot do this subconsciously because a variable name is unpronounceable, the problem of deciphering the variable name takes their precious attention. They have to actively think about possible ways to resolve the unexpected naming. For example, the variable name cstmr_lst may be descriptive and unambiguous, but it’s not pronounceable. Choosing the variable name customer_list is well worth the additional space in your code!
- Use named constants, not magic numbers. In your code, you may use the magic number 0.9 multiple times as a factor to convert a sum in USD to a sum in EUR. However, the reader of your code, including your future self who rereads it, has to think about the purpose of this number. It’s not self-explanatory. A far better way of handling this “magic number” 0.9 is to store it in a variable CONVERSION_RATE = 0.9 and use it as a factor in your conversion computations. For example, you may then calculate your income in EUR as income_euro = CONVERSION_RATE * income_usd. This way, there’s no magic number in your code and it becomes more readable.
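The four rules of thumb above can be combined in one short sketch. The rate 0.9 is the illustrative number from the text, not a real exchange rate. The constant name USD_TO_EUR_RATE is a variation on CONVERSION_RATE that applies the “unambiguous names” rule to the constant as well, and the income figures are made up.

```python
# Illustrative rate from the text; real exchange rates change daily.
USD_TO_EUR_RATE = 0.9


def usd_to_eur(amount_usd):
    """Convert an amount in US dollars to euros."""
    return USD_TO_EUR_RATE * amount_usd


# Descriptive, pronounceable names instead of something like cstmr_lst.
customer_incomes_usd = [1000, 2500]
customer_incomes_eur = [usd_to_eur(income) for income in customer_incomes_usd]
print(customer_incomes_eur)
```

If the rate changes, you update a single line, and every computation that uses the constant picks up the new value.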
These are only some of the naming conventions. Again, to pick the conventions up, it’s best to Google them once (for example, “Python Naming Conventions”) and study GitHub code projects from experts in your field.
Principle 7: Single-Responsibility Principle
The single responsibility principle means that every function has one main task. A function should be small and do only one thing. It is better to have many small functions than one big function doing everything at the same time. The reason is simple: the encapsulation of functionality reduces overall complexity in your code.
As a rule of thumb: every class and every function should have only one reason to change.
If there are multiple reasons to change, multiple programmers would like to change the same class at the same time. You’ve mixed too many responsibilities in your class and now it becomes messy and cluttered.
Let’s consider a small example using Python code that may run on an ebook reader to model and manage the reading experience of a user (see Listing 5-5).
    class Book:
        def __init__(self):
            self.title = "Python One-Liners"
            self.publisher = "NoStarch"
            self.author = "Mayer"
            self.current_page = 0

        def get_title(self):
            return self.title

        def get_author(self):
            return self.author

        def get_publisher(self):
            return self.publisher

        def next_page(self):
            self.current_page += 1
            return self.current_page

        def print_page(self):
            print(f"... Page Content {self.current_page} ...")

    python_one_liners = Book()
    print(python_one_liners.get_publisher())
    # NoStarch
    python_one_liners.print_page()
    # ... Page Content 0 ...
    python_one_liners.next_page()
    python_one_liners.print_page()
    # ... Page Content 1 ...
Listing 5-5: Modeling the book class in violation of the single responsibility principle. The book class is responsible for both data modeling and data representation, so it has two responsibilities.
The code in Listing 5-5 defines a class Book with four attributes: title, author, publisher, and current page number.
You define getter methods for the attributes, as well as some minimal functionality to move to the next page.
The function next_page() may be called each time the user presses a button on the reading device. Another function print_page() is responsible for printing the current page to the reading device.
This is only given as a stub and it’ll be more complicated in the real world. While the code looks clean and simple, it violates the single responsibility principle: the class Book is responsible for modeling the data such as the book content, but it is also responsible for printing the book to the device. You have multiple reasons to change.
You may want to change the modeling of the book’s data, for example, by using a database instead of a file-based input/output method. But you may also want to change the representation of the modeled data, for example, by using another book formatting scheme on other types of screens.
Modeling and printing are two different functions encapsulated in a single class. Let’s change this in Listing 5-6!
    class Book:
        def __init__(self):
            self.title = "Python One-Liners"
            self.publisher = "NoStarch"
            self.author = "Mayer"
            self.current_page = 0

        def get_title(self):
            return self.title

        def get_author(self):
            return self.author

        def get_publisher(self):
            return self.publisher

        def get_page(self):
            return self.current_page

        def next_page(self):
            self.current_page += 1

    class Printer:
        def print_page(self, book):
            print(f"... Page Content {book.get_page()} ...")

    python_one_liners = Book()
    printer = Printer()
    printer.print_page(python_one_liners)
    # ... Page Content 0 ...
    python_one_liners.next_page()
    printer.print_page(python_one_liners)
    # ... Page Content 1 ...
Listing 5-6: Adhering to the single responsibility principle. The book class is responsible for data modeling and the printing class is responsible for data representation.
The code in Listing 5-6 accomplishes the same task but it satisfies the single responsibility principle. You create both a book and a printer class.
The book class represents book meta information and the current page number.
The printer class prints the book to the device. You pass the book for which you want to print the current page into the method Printer.print_page().
This way, data modeling and data representation are decoupled and the code becomes easier to maintain.
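One way to see the benefit of this decoupling is to add a second representation without touching the book class. The sketch below uses a trimmed-down stand-in for the Book class from Listing 5-6. The HtmlPrinter class and its render_page() method are hypothetical additions for illustration, not part of the original listings.

```python
class Book:
    """Minimal stand-in for the Book class from Listing 5-6."""

    def __init__(self):
        self.current_page = 0

    def get_page(self):
        return self.current_page

    def next_page(self):
        self.current_page += 1


class Printer:
    """Plain-text representation, as in Listing 5-6."""

    def print_page(self, book):
        print(f"... Page Content {book.get_page()} ...")


class HtmlPrinter:
    """Hypothetical second representation; Book stays untouched."""

    def render_page(self, book):
        return f"<p>Page Content {book.get_page()}</p>"


book = Book()
book.next_page()
Printer().print_page(book)              # ... Page Content 1 ...
print(HtmlPrinter().render_page(book))  # <p>Page Content 1</p>
```

A change in how pages are displayed now affects only the printer classes, while a change in how the book data is stored affects only Book.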