Sunday, December 14, 2025

Programmers are strangely dogmatic about cryptic variable names

As a very long-time computer programmer, I have noticed an interesting psychological phenomenon: for some reason, beginner programmers absolutely love it when they can express a lot of functionality with as little code as possible. They also tend to love mainstream programming languages that allow them to do so.

As an example, if they are learning C, and they encounter the fact that you can implement the functionality of strcpy() with a very short one-liner (directly with C code, without using any standard library function), a one-liner that's quite cryptic compared to how it's done in an average programming language, they just love that stuff and get enamored with it.
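To make that concrete, here is a rough sketch of the kind of one-liner I mean, next to a spelled-out version that does the same thing. The function names copy_string_terse and copy_string_verbose are made up for this illustration (they are not standard library functions), and both assume the destination buffer is large enough and does not overlap the source.

```c
#include <stddef.h>

/* The terse idiom: the assignment inside the loop condition copies one
   character at a time and stops right after copying the terminating '\0'. */
char *copy_string_terse(char *d, const char *s)
{
    char *r = d;
    while ((*d++ = *s++))
        ;
    return r;
}

/* The same behavior, spelled out with descriptive names. */
char *copy_string_verbose(char *destination, const char *source)
{
    size_t char_index = 0;
    while (source[char_index] != '\0') {
        destination[char_index] = source[char_index];
        char_index++;
    }
    destination[char_index] = '\0'; /* copy the terminator as well */
    return destination;
}
```

Both functions return the destination pointer, like strcpy() does. The first one is the version beginners tend to fall in love with.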

This love of brevity in programming quite quickly becomes a habit for the vast majority of programmers, and most of them never "unlearn" said habit. One of the most common and ubiquitous language features where most programmers will apply this love of brevity is variable names. (Oftentimes they will do the same with other names in the program as well, such as function and class names, but variable names are the most common target for this, even when they keep those other names relatively clear.)

It becomes an instinct that's actually hard to get rid of (and I'm speaking from personal experience): that strong instinct of using single-letter variables, abbreviations and acronyms, no matter how cryptic and unreadable they may be to the casual reader. (Indeed, their most common defense of those names is "I can understand them perfectly well", without acknowledging that it may not be so clear to others reading the code. Or even to themselves five years down the line.)

Thus, they will use variable names like, for example, i instead of index, pn instead of player_name, ret instead of return_value, col instead of column (or even better, column_index), col instead of color, rc instead of reference_count, and so on and so forth. After all, why go through the trouble of writing "return_value" when you can just write "ret"? It's so much shorter and more convenient!

But the thing is, the more the code is littered with cryptic short variable names, the harder it becomes for someone else to read and understand. I have an enormous amount of experience with that, as I have had in the past to write an absolutely humongous amount of unit tests for a huge existing library. The thing about writing unit tests for existing code is that you really, really need to understand what the code is doing in order to write meaningful unit tests for it (especially when you are aiming for 100% code coverage).

And, thus, I have seen a huge amount of code that I have had to fully understand (in order to write unit tests for), and I have seen in intricate detail the vast difference in readability and understandability between code that uses cryptic variable and function names vs. code that uses clear readable names. Unsurprisingly, the latter helps quite a lot.

In fact, this is not some kind of niche and novel concept. The coding guidelines of many huge companies, like Google, Microsoft, Facebook and so on, have sections delineating precisely this. In other words, they strongly recommend using full English words in variable and function names rather than abbreviations and cryptic acronyms. One common principle is, relating to the latter: "If the acronym does not have an article in Wikipedia, just write it fully out."

One particular situation where I have noticed how much clear variable naming helps is in loop variables. Loop variables are the names that most programmers abbreviate the most heavily and the most often. Sometimes they go to outright unhealthy lengths to use loop variables that are as short as possible, preferably single-letter, even if that means using meaningless cryptic names like i, j and k.

I have, myself, noticed the importance of, and gotten into the habit of, naming loop variables after their use, i.e. what they represent and are being used for. For example, let's say you are iterating through an array of names using a loop, with the loop variable indexing said array. In that case I will name said variable e.g. name_index rather than just i (which is way too common). If the loop variable is counting something, I will usually name it something_count, or similar, rather than just i or n.

The longer the body of the loop is, and especially if there are several nested loops, the more important it becomes to name the loop variables clearly. It helps immensely in keeping track of and understanding the code when the loop variables directly name what they represent, especially alongside naming everything else clearly. For example, suppose you see this line of code:

pd[i].n = n[i];

That doesn't really tell us anything. Imagine, however, if we changed it to this:

player_data[player_index].name = names[player_index];

Is it longer? Sure. But is it also significantly clearer? Absolutely! Even without seeing any of the surrounding code we already get a very good idea of what's happening here, much unlike with the original version.
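For context, here is a minimal sketch of what the loop around that clearer line might look like. Everything besides the line itself is my own assumption for the sake of illustration: the PlayerData struct, its score field, the assign_player_names function and the player_count parameter are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical record type; only the name field comes from the line above. */
struct PlayerData {
    const char *name;
    int score;
};

/* Points each player's name at the corresponding entry of the names array.
   Assumes both arrays hold at least player_count elements. */
void assign_player_names(struct PlayerData *player_data,
                         const char *names[],
                         size_t player_count)
{
    for (size_t player_index = 0; player_index < player_count; ++player_index) {
        player_data[player_index].name = names[player_index];
    }
}
```

Note that the loop variable tells you what is being indexed without you having to look up what the arrays contain.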

Yet, try to convince the average experienced programmer, who is used to littering his code with short cryptic variable names, of this. You will invariably fail. In fact, for some reason the majority of computer programmers are strangely dogmatic about it. They are so dogmatic about it that if you were to make this argument in an online programming forum or discussion board, you would likely start a full-on flamewar. They will treat you like a stupid, arrogant person who has personally insulted them to the core. I'm not exaggerating.

The instinct to write short cryptic code, very much including the use of short cryptic variable names, sits very deep in the mind of the average programmer. It's a strange psychological phenomenon, really.

I have concocted a name for this kind of cryptic programming style: Brevity over clarity. 
