Coding 102: Writing code other people can read

In the 1980s, personal computers became common in the workplace. With the release of the Apple II in 1977 and the IBM PC four years later, a tool once reserved for scientists and the military proliferated in accounting offices, universities, and hospitals. You no longer had to be an engineer to use a computer.

Something similar has happened with programming over the past decade. In everyday offices, classrooms, and laboratories, code is now being written and maintained by people who don’t think of themselves as programmers.

The good news is that it’s easier than ever to write code that works. There’s no shortage of beginner-friendly programming tutorials, and high-level languages today are more readable than ever. So, if someone at work hands you a Python script to run over some data, between Stack Overflow and Codecademy you can probably figure out how to get it to work.

The bad news is that after you write the code you just learned to write, someone is going to have to read it (that someone might be you in six months). To a beginner, getting a working solution is the finish line. To a developer, it’s only the starting line. And since more of us are writing and collaborating on code than ever before, it’s becoming crucial to write better code.

There’s a lot to learn about basic programming beyond “making it work.” For professional developers, writing maintainable code is fundamental to the job. But these practices aren’t usually discussed in “Coding 101”. Beginner tutorials usually focus on syntax.

So we’ll assume you already know how to pull together a working solution — it’s time to level up to Coding 102: writing code with and for other humans.

We’ll keep our examples short and simple and write them in pseudocode (which should seem familiar if you’ve written in a popular language). This post is aimed at people who’ve spent time writing code, but don’t write it professionally full-time. It helps if you are familiar with variables, functions, and if/else statements.

There’s an adage in programming that maintaining code is 75% of its total cost. Once you write the code, other people (including future you) will invest two to three times as much time, on average, in reading, updating, and fixing it. This doesn’t mean your code was badly written or buggy! Requirements evolve, other parts of the code change, and sometimes… well, yeah, sometimes there are bugs.

At one of my first (non-programming) jobs, I wrote a script that switched between browser tabs with user accounts and pasted in discount codes. At the time, this task was done by hand for hundreds of users each month.

I hacked the script together in a few hours, and, to my manager’s delight, it pasted hundreds of codes in a few minutes. But the following week, the script went haywire and applied the code to 300 wrong accounts. It took me two hours to code the script, but eight hours to undo the damage it caused and fix my convoluted code. That's the maintenance cost.

Good design “reduces the cost of future changes.” Let’s say my company changed the layout of the discount code page. How long would it take someone to update my script to work with the new layout? If another developer offered to help, how long would it take them to get up to speed on my work? That depends on how well the code was designed.

In order to make code easy to maintain, update, and fix, it should be modular, easy to follow, and easy to reason about. Writing “clean” code (as it’s often called) will help you think through the problem. If other people you work with can embrace these practices, you’ll spend less time struggling and more time collaborating productively.

Now that we understand why maintainable, clean code matters, let’s talk about how to write it.

Aim to write small functions that do one thing. Let’s say you’re writing a function that takes a list of widget prices and returns some text with the average of those prices. For example, you give it [13, 28, 17, 143, 184, 72.3] and it should return “The average price for 6 widgets is $76.22.”

A first attempt at this code might be a function called get_price_average(data) that:

Takes some data
Gets the length of the number list
Gets the sum of the prices
Divides the sum by the number of widgets to get the average
Rounds the average
Assembles all the stats into a string of text and returns it

Depending on the language, some of these operations might be built in. You put these steps in one function, and it works. This is a fine, typical first version.
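
As a rough illustration, here is what that all-in-one first version might look like in Python. The exact wording of the returned text and the two-decimal rounding are taken from the example above; everything else is just one plausible way to write it.

```python
def get_price_average(data):
    """Do everything in one place: count, sum, average, round, and format."""
    number_of_widgets = len(data)
    total = sum(data)
    average = round(total / number_of_widgets, 2)
    return f"The average price for {number_of_widgets} widgets is ${average:.2f}."


print(get_price_average([13, 28, 17, 143, 184, 72.3]))
```

Every step lives in the same function body, so any change means rereading all of it.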

On the plus side, the code is cohesive: everything we’re doing related to the stats text is in one place. So if you need to change anything having to do with this text, you know where to look.

The problem is that if you do need to make a change (add a stat, clarify which numbers should be averaged, or change the decimal points on the average), you’ll need to read and understand the whole function. And adding the change will just balloon it further.

The solution is to break our oversized get_price_average(), which does many things, into several small functions that each do one thing. A good guideline is:

A function should be small, do just one thing, and ideally shouldn’t impact anything outside the function.

Let’s try to see this by example. Here’s how we might break the function up.

get_widget_price_stats(data): we’re renaming our wrapper function, get_price_average(). It now takes some data, extracts the relevant stat, and returns text with that stat. We’ll talk about names shortly!
get_max(list_of_numbers): gets the largest number in a list. Depending on the language you use, this might be part of the standard library.
get_min(list_of_numbers): gets the smallest number in a list. Depending on the language you use, this might be part of the standard library.
get_sum(list_of_numbers): gets the sum in a list of numbers.
get_average(list_of_numbers, decimal_points): takes a list of numbers and returns the average, rounded to decimal_points.
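
A minimal Python sketch of this breakdown might look like the following. In Python, max() and min() are built in, so get_max() and get_min() are left out; the two-decimal rounding and the wording of the text are assumptions carried over from the earlier example.

```python
def get_sum(list_of_numbers):
    """Return the sum of a list of numbers."""
    return sum(list_of_numbers)


def get_average(list_of_numbers, decimal_points):
    """Return the average of a list of numbers, rounded to decimal_points."""
    return round(get_sum(list_of_numbers) / len(list_of_numbers), decimal_points)


def get_widget_price_stats(data):
    """Wrapper: pick the stat we need and turn it into text."""
    average = get_average(data, 2)
    return f"The average price for {len(data)} widgets is ${average:.2f}."
```

Each helper can be read, tested, and reused on its own, and the wrapper reads almost like a sentence.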

Did you find some of those function descriptions obvious? For example, it might be pretty clear that get_max(list_of_numbers) gets the largest number in a list. Good! That’s by design. When you write small functions that have clear single responsibilities, you can give them names that clearly convey what they do. Then, your future readers don’t have to read all the code in the function to understand what it does.

How does this design respond to changes? Let’s say instead of the average, you need to return the standard deviation. In the initial version, we’d need to find the section of the code that calculates the average and overwrite it with a standard deviation calculation. In the small-function version, we could create a new function, get_standard_deviation(list), then go into get_widget_price_stats() and replace the call to get_average() with a call to get_standard_deviation().

To recap, the functions we break out should be small, self-contained, and do one thing. This makes it much easier to read and change your code. This practice alone will improve your code significantly.
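
As a rough Python sketch of that swap: the helper below leans on statistics.pstdev from the standard library, which computes the population standard deviation. Whether population or sample deviation is wanted is an assumption here, as are the decimal_points parameter and the wording of the returned text.

```python
import statistics


def get_standard_deviation(list_of_numbers, decimal_points):
    # Assumption: population standard deviation.
    # statistics.stdev would give the sample standard deviation instead.
    return round(statistics.pstdev(list_of_numbers), decimal_points)


def get_widget_price_stats(data):
    # The only change in the wrapper: this call replaces get_average().
    standard_deviation = get_standard_deviation(data, 2)
    return (
        f"The standard deviation of prices for {len(data)} widgets "
        f"is ${standard_deviation:.2f}."
    )
```

The wrapper changes in one line, and nothing else in the code needs to know about it.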

There’s a running joke that “there are two hard things in computer science: cache invalidation and naming things.” We’re going to leave cache invalidation for another post and focus on naming things.

A related idea is that “good code is self-documenting.” In essence, this means that good code conveys what it does. Names are a big part of this: a good name is precise, easy to understand, and gives the code context without having to read every line.

Here are some good naming practices:

Describe intent: Try to describe the variable or function you’re naming as specifically as you can. For example, the function we looked at above was called get_price_average. However, since the string we’re returning is actually an average of widget prices, we could name the function more precisely: get_widget_price_average(). Later, we wanted it to serve as a wrapper function that returned some stat, so we renamed it to get_widget_price_stats(). From this function, we extracted a general average function that takes a list of any numbers and returns the average. Try to avoid general names like date, slice, part, numbers. Instead, depending on context, date might become created_at, slice might become outdated_prices, part might become part_to_be_fixed, and numbers might become current_row.
Be consistent: Whichever convention you decide to use, stick with it. If the codebase uses get_ for functions that return values, each function should be get_[stat]: get_average, get_max, etc. If you name a function that returns standard deviation standard_deviation, another dev will wonder how it’s different from all the get_[stat] functions.
Give the right amount of information: A variable called first_name conveys a good amount of information. The alternative, name, is too vague (first? last? middle? full?), and n or fn is downright indecipherable. These are examples of under-explaining; it’s not clear what the variable does. It’s also possible to over-explain: user_first_name doesn’t offer any additional information, since user is fairly generic, and it’s not clear what the alternative is (does the app have bots with first names?). Similarly, first_name_for_user_record or get_price_average_by_dividing_prices_by_quantity offer redundant information. Deciding on the right level of specificity is a skill you’ll build over time. It requires you to step back and think deeply about what the variable is doing. This’ll make your code better in the long run and make you a better developer with it.

Speaking of how you shouldn’t call your variable fn, here’s a list of naming practices to avoid:

Don’t include type: Focus on the purpose of the variable, not its implementation. Common names that break this rule are array, number/num, list, word, string. The type is usually clear from the code, so you don’t need to add “function” or the type to the name, as in get_average_function or array_of_prices. As a general rule, your name shouldn’t include the type of the variable (including the type is known as “Hungarian notation”). If you find yourself writing admission_numbers, keep looking for a better name (daily_admissions).
Don’t use abbreviations: Is det “detail”, “detrimental”, “deterministic”, “Detroit”, or something else? Our earlier fn falls into this category.
Don’t use one-letter names: These are abbreviations on steroids—they’re completely incomprehensible! Instead, describe what the variable’s purpose is. There are a few uncommon exceptions to this:
1. Loops, where it’s common to refer to the index as i.
2. Scientific and mathematical conventions, as long as you’re certain they’ll be clear to other developers.
3. Domain-specific conventions (for example, in JavaScript, webpage events are often referred to as e instead of event).

These are exceptions—99% of your variables should have descriptive, multi-letter names!
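
To make these naming guidelines concrete, here is a small before-and-after sketch in Python. Both functions do exactly the same thing; the data shape (a list of dictionaries with a created_at date) and all of the names are made up for illustration.

```python
from datetime import date


# Harder to read: an abbreviation, a type used as a name, and one-letter names.
def proc(lst, d):
    return [p for p in lst if p["created_at"] < d]


# Easier to read: the names describe intent, so the body explains itself.
def get_outdated_prices(prices, cutoff_date):
    return [price for price in prices if price["created_at"] < cutoff_date]


prices = [
    {"amount": 5.00, "created_at": date(2020, 1, 1)},
    {"amount": 7.50, "created_at": date(2023, 1, 1)},
]
outdated_prices = get_outdated_prices(prices, date(2021, 1, 1))
```

The second version can be understood from its signature alone, which is the point of a self-documenting name.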

The following problematic function takes some input, transforms it with seasonal_multiplier, adds a mystical 14, and returns result.

function get_result(list, seasonal_multiplier) {
	int result = list;
	result = list.sum / list.length;
	result = result * seasonal_multiplier;
	result = result + 14;
	
	return result;
}

Putting aside the opaque variable names, result represents four different things during the course of this function. It starts out as the input, then becomes input per participant, then input per participant adjusted for the seasonal multiplier, and is then finally returned with a magical 14 tacked on. It’s doing too much work! Most languages allow you to assign a new value to an existing variable. In practice, this is usually not a great idea. A variable should represent one thing and shouldn’t change its identity during its lifespan. When you look at a variable name, you should have a good sense of what its purpose is.

There are two ways to clear up this code:

1. Be clear about what the changes are. A better way is to be explicit about what each transformation is, declaring a new variable at each step. This might seem like over-explaining at first, but it makes the function much easier to comprehend.

function get_result(list, seasonal_multiplier) {
	output_per_participant = list.sum / list.length;
	seasonal_output_per_participant = output_per_participant * seasonal_multiplier;
	adjusted_seasonal_output_per_participant = seasonal_output_per_participant + 14;
	
	return adjusted_seasonal_output_per_participant;
}

As a final step, I’d probably rename the function get_adjusted_seasonal_output_per_participant().

2. Combine the math into one line. You can avoid this problem altogether by combining all the steps into one calculation. The reader then has to follow a lot of math at once, but at least the intent of the variable is clear, and one variable has one meaning throughout the function.

function get_adjusted_seasonal_output_per_participant(list, seasonal_multiplier) {
	adjusted_seasonal_output_per_participant = list.sum / list.length * seasonal_multiplier + 14;
	return adjusted_seasonal_output_per_participant;
}
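
For readers who prefer something runnable, the first approach might be sketched in Python like this. The parameter name outputs is an assumption (it stands in for list in the pseudocode), and the 14 is carried over unexplained on purpose.

```python
def get_adjusted_seasonal_output_per_participant(outputs, seasonal_multiplier):
    # Each intermediate value gets its own name and never changes meaning.
    output_per_participant = sum(outputs) / len(outputs)
    seasonal_output_per_participant = output_per_participant * seasonal_multiplier
    # The 14 is still unexplained here, exactly as in the pseudocode.
    adjusted_seasonal_output_per_participant = seasonal_output_per_participant + 14
    return adjusted_seasonal_output_per_participant
```

No name is ever rebound, so a reader can stop at any line and know what each variable holds.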

You might also wonder what that 14 represents. That’s what we call a magic number.

A number used without any context is called a “magic number,” and it makes the code harder to comprehend. Here’s a bit of incomprehensible code:

float yearly_total = monthly_average * 12 + 67.3;

The 12 in monthly_average * 12 probably refers to the number of months in a year (we deduce this from the better-named yearly_total). But we shouldn’t have to deduce. And what the heck is 67.3? It might have made perfect sense to the code’s author, but we’re left to scratch our heads.

A better approach is to assign numbers to well-named variables, then use those in calculations.

float total_at_beginning_of_year = 67.3;
int months_per_year = 12;
float yearly_total = monthly_average * months_per_year + total_at_beginning_of_year;
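
In Python, the same idea is often expressed with module-level constants written in capitals. This is a sketch only: the function name get_yearly_total is hypothetical, and the meaning given to 67.3 is the one suggested by the variable name above.

```python
MONTHS_PER_YEAR = 12
TOTAL_AT_BEGINNING_OF_YEAR = 67.3


def get_yearly_total(monthly_average):
    # Every number in the calculation now has a name that explains it.
    return monthly_average * MONTHS_PER_YEAR + TOTAL_AT_BEGINNING_OF_YEAR
```

If the starting total ever changes, there is one obvious line to edit.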

Functions take inputs (referred to as “parameters” and “arguments”). In most languages, the order of the inputs matters. Let’s say you have a function that takes a fruit, a country, and a year, and tells you the average price for this fruit in that country and year.

function get_average_fruit_price(fruit, country, year) {
    // get and return the price
}

This seems good so far. To get a price, you can call get_average_fruit_price(“peach”, “Albania”, 2007). But what if we want to narrow the price further by adding an option to only count fruit sold industrially (for example, to juice companies)?

No problem! We can just add a new boolean parameter, function get_average_fruit_price(fruit, country, year, sold_to_industry). But remember, we’ll need to pass in the arguments in the right order, and that’s an easy thing to mess up. We’ll need to check the function definition every time we want to call it! If you mess up and write get_average_fruit_price(“peach”, 2007, “Albania”, false), year is now “Albania”!

A good general rule is to keep your input count to two to three at the most. This way, developers building on top of your code don’t have to constantly check for the correct order, and are less likely to make mistakes.

There are two ways to improve a function that takes too many inputs:

Break it up into smaller functions. Too many parameters could be an indicator that your function is doing too much.
Use an options object. If you can pass the parameters in an object (called a hash or a dictionary in some languages), you don’t have to worry about property order. This object is often called options and referred to as an “options object.” The downside to this approach is that it’s less explicit about the expected parameters. Here’s how this looks in practice:

function get_average_fruit_price(options) {
    // use options.fruit, options.country, options.year, options.sold_to_industry
}
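
Python offers a close relative of the options object: keyword-only arguments, which are a different technique that solves the same ordering problem. The sketch below assumes a tiny lookup table, SAMPLE_PRICES, whose prices are invented purely so the example runs.

```python
# Hypothetical sample data, invented for illustration only.
SAMPLE_PRICES = {
    ("peach", "Albania", 2007, False): 1.25,
    ("peach", "Albania", 2007, True): 0.80,
}


def get_average_fruit_price(*, fruit, country, year, sold_to_industry=False):
    # The bare * forces callers to name every argument, so order cannot matter.
    return SAMPLE_PRICES[(fruit, country, year, sold_to_industry)]


# Arguments can be given in any order because each one is named.
price = get_average_fruit_price(year=2007, country="Albania", fruit="peach")
```

Calling get_average_fruit_price("peach", 2007, "Albania", False) now raises a TypeError instead of silently treating "Albania" as the year, and the expected parameters stay visible in the function signature.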

There are a lot of great, thorough rules around code comments, but we’ll focus on the low-hanging fruit…

Use comments sparingly to explain code that’s not otherwise clear. If you can improve the code instead, do that! Here’s an example of a bad comment covering up for unclear code:

 int f = 75; // f is the temperature in Fahrenheit

Let the code speak for itself by renaming f to temp_in_fahrenheit!

You won’t always have time to improve the hard-to-comprehend code you see. You might also write code you’re not thrilled with due to time constraints. Consider whether the code will need to be “touched” (read or modified) by someone else. Be honest — you don’t want to pass on code that’s hard to maintain!

If the code isn’t likely to be modified by anyone else, you can leave a comment explaining what a particularly challenging bit of code does. Ideally, your function and variable names should already do this. But when the functions contain logic that’s hard to comprehend and the function name doesn’t help, leave a comment!
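
As one example of a comment that earns its place, consider the leap-year rule, where the logic is correct but not obvious at a glance. The function below is a hypothetical illustration in Python and is not taken from any code discussed in this post.

```python
def is_leap_year(year):
    # Century years are leap years only when divisible by 400
    # (1900 was not a leap year, 2000 was), so checking
    # "divisible by 4" alone is not enough.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

The name says what the function does; the comment explains why the condition looks the way it does, which no name could convey.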

The DRY principle stands for “don’t repeat yourself.” If you find yourself using the same code in multiple places, see if you can pull it out into a variable or a function.

Here, the company name is repeated in two strings:

string company_description = "Fruit Trucking & Shipping International ships fruit on our modern fleet of trucks and ships, anywhere in the world!";
string company_slogan = "Fruit Trucking & Shipping International: where the best fruit gets on the best ships";

Pulling the name into its own variable removes the duplication:

string company_name = "Fruit Trucking & Shipping International";
string company_description = company_name + " ships fruit on our modern fleet of trucks and ships, anywhere in the world!";
string company_slogan = company_name + ": where the best fruit gets on the best ships";

If the company ever changes its name (for example, if it decides to only ship apples…) we can now do this in one place. Conversely, if you don’t follow DRY, you’ll need to find all the places you have duplicated code and remember to make changes in each of them.

Here’s another example: let’s say all your input temperatures are in Fahrenheit, but you need your output values to be in Celsius. Every time you work with a temperature, you’ll need to calculate (F - 32) * 5/9. If you find out your organization now uses Kelvin, you’ll have to update this to (F - 32) * 5/9 + 273.15 in every place you reference a temperature.

A better solution is to pull this conversion into a function:

function fahrenheit_to_celsius(temp_in_fahrenheit) {
   // Parentheses matter: subtract 32 first, then scale by 5/9
   return (temp_in_fahrenheit - 32) * 5/9;
}

Now, to switch to Kelvin, you can update the calculation in one place. In this particular example, you could probably just create a new (small, single-purpose) function called fahrenheit_to_kelvin() and change all the calls to fahrenheit_to_celsius() into calls to fahrenheit_to_kelvin(). Now it’s very clear where you’re updating (you don’t have to look for mentions of “temperature” all throughout your code).
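
In Python, the two conversion functions might be sketched as follows. Having fahrenheit_to_kelvin() reuse fahrenheit_to_celsius() is a design choice made here for illustration, so the Fahrenheit formula is written only once.

```python
def fahrenheit_to_celsius(temp_in_fahrenheit):
    # Subtract 32 first, then scale by 5/9.
    return (temp_in_fahrenheit - 32) * 5 / 9


def fahrenheit_to_kelvin(temp_in_fahrenheit):
    # Kelvin is Celsius shifted by 273.15, so build on the function above.
    return fahrenheit_to_celsius(temp_in_fahrenheit) + 273.15
```

Water boils at 212 degrees Fahrenheit, so fahrenheit_to_celsius(212) gives 100.0, which makes a handy sanity check.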

It’s possible to overdo DRY. You don’t want to over-optimize too early, so a good alternative to DRY is WET, or “write everything twice”. If you need to write some code a third time, it’s probably a good idea to try and pull this out into its own function or variable. Over time, you’ll develop judgment for when to reuse the same piece of code over and over (like the example above) and when a few repeated lines are a part of separate processes.

Reading about good coding practices is like reading about working out. You have to follow the practices to reap the benefits!

Just because you learned about all the muscles doesn’t mean you need to exercise them all right away. Start small: the next time you code, try to be precise with your names. Then, try to write smaller functions. Then, reference this article to see what else you might have missed! Think of it like starting your fitness regimen with a few exercises a week.

Remember, we’re writing code for our future colleagues, and, importantly, our future selves. We don’t need to be dogmatic about the rules; we just need to focus on being as clear as we can.

You’ll quickly find that writing clean code involves making tradeoffs. For example, if you’re under time pressure writing a one-off script to parse some data and are confident nobody will see the script again, you can invest less time into writing clear code. But over time, thinking about code maintainability and readability will empower you to solve problems more effectively and will set you apart from many beginner developers.

By implementing some basic software dev principles, you’re leveling up from someone hacking through code to someone maintaining and building software. And that’s something to be proud of!

We’re all coders now

Prerequisites

Why writing better code matters

Coding 102

1. Write small functions

2. Pick names carefully

3. Avoid reassigning variables

4. Avoid magic numbers

5. Use few parameters

6. Write good comments

7. Don’t repeat yourself

Next steps
