Thursday, July 31, 2025

Matt Parker (inadvertently) proves why algorithmic optimization is important

Many programmers in various fields, oftentimes even quite experienced programmers, have this notion and attitude that optimization is not really all that crucial in most situations. So what if a program takes, say, 2 seconds to run when it could run in 1 second? In 99.99% of cases that doesn't matter. The important thing is that it works and does what it's supposed to do.

Many will often quote Donald Knuth, who in a 1974 article wrote "premature optimization is the root of all evil" (completely misunderstanding what he actually meant), and interpret that as meaning that one should actually avoid program optimization like the plague, as if it were somehow some kind of disastrous detrimental practice (not that most of them could ever explain why. It just is, because.)

Some will also reference some (in)famous cases of absolutely horrendous code in very successful programs and games, the most famous of these cases being probably the video game Papers, Please by Lucas Pope, which source code apparently is so horrendous that it would make any professional programmer puke. Yet, the game is enormously popular and goes to show (at least according to these people) that the actual quality of the code doesn't matter, what matters is that it works. Who cares if the source code looks like absolute garbage and the game might take 5% of resources when it could take 2%? If it works, don't fix it! The game is great regardless.

Well, I would like to present a counter-argument. And this counter-example comes from the youtuber and maths popularizer Matt Parker, although inadvertently so.

For one of his videos, related to the game Wordle (where you have to guess a 5-letter word by entering guesses and it showing correct letters with colors) he wanted to find out if there are any groups of 5 words of length 5 that use unique letters, in other words 25 different letters in total.

To do this he wrote some absolutely horrendous Python code that read a comprehensive word list file and tried to find a combination of five such words.

It took his program an entire month to run!

Any experienced programmer, especially those who have experience on such algorithms, should have alarm bells ringing loudly at this point, as this sounds like something that could be quite trivially done in a few seconds, probably even in under a second. After all, the problem is extremely restricted in its constraints, which are laughably small: Just five-letter words (can discard every other word), which even in the largest English dictionaries should be just a few thousands of them, and checking if a word contains only unique letters (which ought to restrict the list of words to a small fraction, as all the other 5-letter words can be discarded from the search), and finding combinations of five such words that share no common letters.

And, indeed, the actual programmers among his viewers immediately took the challenge and quite quickly wrote programs in various languages that solved the problem in a fraction of a second.

So yes, indeed: His "I don't really care about optimization as long as it does what it's supposed to do" solution took one entire month to calculate something that could be solved in a fraction of a second. Even with an extremely sloppily written program using a slow language it should only take a few seconds.

This is not a question of "who cares if the program takes 10 seconds to calculate something that could be calculated in 5 seconds?"

This is a question of "calculating in one month something that could be calculated in less than a second."

Would you be willing to wait for one entire month for something that could be calculated in less than a second? Would you be using the "as long as it works" attitude in this case?

It's the perfect example of why algorithmic optimization can actually matter, even in unimportant personal hobby projects. It's the perfect example of something where "wasting" even a couple of days thinking about and optimizing the code would have saved him a month of waiting. It would have been actually useful in practice.

(And this is, in fact, what Donald Knuth meant in his paper. His unfortunate wording is being constantly misconstrued, especially since it's constantly being quote-mined and taken out of its context.)

No comments:

Post a Comment