Tuesday, December 23, 2025

I don't understand why Ramanujan summation is taken seriously

One of the most (in)famous "proofs" that the sum of all natural numbers is -1/12 uses as one of its steps a so-called Ramanujan summation to declare that the infinite sum

1 - 1 + 1 - 1 + 1 - 1 + ... = 1/2

I don't see anything that would justify this equality. The infinite sum is divergent, ie. it does not converge to any particular value no matter how far you advance in it. There is no justification for assigning the arbitrary value 1/2 to it. (There is no justification for just declaring that the result of the entire sum is the average of two consecutive partial sums. There's even less justification because what you get as partial results depends on how you group the terms.)
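(To spell out what that averaging refers to: the partial sums of the series are 1, 0, 1, 0, 1, 0, ... and the average of any two consecutive partial sums is (1+0)/2 = 1/2, which is the value being assigned to the whole sum.)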

How this kind of thing should normally be handled is like this:

1) Hypothesis: The infinite sum 1 - 1 + 1 - 1 + ... = 1/2

2) Counter-proof: We assume that the hypothesis is true and show that it leads to a contradiction. Namely: The assumption leads to the statement that the sum of all natural numbers, which is divergent, equals a particular finite value, which would imply that the sum is convergent. An infinite sum cannot be both divergent and convergent at the same time, thus it's a mathematical contradiction.

3) Thus: The hypothesis cannot be true.

In general, whenever a statement leads to a contradiction, that proves that the statement is false. In this case, we have proven by contradiction that the Ramanujan summation is incorrect.

But rather than declaring said summation incorrect (because it leads to a contradictory result), mathematicians have instead taken it as correct, and subsequently also the nonsensical statement that results from it.

It's incomprehensible. 

Saturday, December 20, 2025

Quake speedruns slightly marred for me

Some time ago I wrote a blog post about why I don't really watch speedruns anymore, even though I used to be a huge and avid fan 20 years ago. The (quite long) article can be found here. The short summary is: While 20 years ago I didn't mind and was in fact fascinated by extensive glitch abuse, over the years I have grown tired of it and consider it so boring that I don't even watch glitch-heavy speedruns anymore. And, unfortunately, nowadays "glitch-heavy speedruns" covers about 90% of them, if not more.

I mention in that post that Quake is one of the few games I still watch speedruns of, and the main reason is that, due to how glitch-free the game is, the speedruns are also almost completely glitch-free, and thus show pure, sheer within-the-game playing skill without many, if any, game-breaking glitches.

I also say this in that post, quote:

"One particularly bad example made me outright want to puke: In a speedrun of the game The Talos Principle, the runner at one point would go to the main menu, go to the game options, set the framerate limit to 30 frames per second, return to the game, perform a glitch (that could only be done with a low framerate), and afterwards go back to the options menu and set the framerate back to unlimited. This was so utterly far-removed from gameplay proper, and was just so utterly disgusting, that I just stopped watching the speedrun right then and there."

Well, you might guess where I'm going with this.

Indeed, framerate abuse has been introduced into Quake speedrunning. It's not an extremely recent addition, mind you (I believe it started being used several years ago). It's just that I have only noticed it now.

I probably did not notice because the framerate abuse is so subtle: a key can be bound to change the framerate cap, and thus it can be changed on the fly without having to go into any menus, and it doesn't interrupt the speedrun. The only visible indication of this is a message that appears on the top left of the screen announcing the settings change, which is very easy to miss when watching the speedrun. The framerate is also changed so rarely during speedruns that it's easy to miss for that reason as well.

The game supports capping the framerate in steps of 10, with the minimum being 10 fps, and the maximum 100 fps. And the framerate abuse swaps between those two framerates. 

Quite naturally, I don't like the idea much better here than with other games, like The Talos Principle mentioned above. Some details about it, however, make it slightly less bothersome, so it doesn't really make me want to quit watching Quake speedruns:

  1. As mentioned, it can be done on the fly rather than having to go into the settings menu, so it doesn't interrupt the speedrun itself. Not that this would be the main reason to dislike the technique, but still.
  2. As far as I understand, the technique cannot (so far) be used for any major skips and instead it can only be used in two very specific situations: To press buttons slightly earlier, and to reach the end slightly earlier.
  3. And that "slightly earlier" really means slightly: At most 0.1 seconds can be saved this way for each button that has to be pressed, plus once more at the end of the level (as a direct consequence of the framerate being 10 frames per second.) And even then, this is only when the button is right in front of the player (and is not eg. pressed from the side by barely grazing it.)

So in a typical level, where the speedrunner might have to press three buttons, about 0.4 seconds can be saved at most.

While I don't really like the technique in principle, it has so little effect on the speedruns that I consider this only an extremely minor annoyance. 

Tuesday, December 16, 2025

How an experienced programmer approaches Matt Parker's wordle problem

I earlier wrote a blog post about an example of where program optimization actually really matters and can make an absolutely enormous difference in how long you have to wait for a program to do something (even though the notion that program optimization is not all that important is way too widespread). In that case it was a question of Matt Parker's original Python program taking an entire month to calculate something that could be (quite trivially) calculated in just a few seconds, even in under a second with a bit of algorithmic optimization.

I would like to describe here how an experienced programmer approaches such a problem.

Said problem is:

Given a big English dictionary file (ie. a file containing hundreds of thousands of unique English words), find all combinations of five 5-letter words that combined use 25 unique letters of the alphabet.

Any experienced programmer would very quickly think of ways to do that in a few seconds at most, and categorically know that one month is way, way too slow. This even if we restrict ourselves to using Python.

Matt Parker is not an experienced programmer, and his approach to solving the problem with a Python program was extraordinarily naive and inefficient. Essentially, he read the entire dictionary into some kind of data container, all the several hundreds of thousands of words, and then went through every single combination of 5 words and checked if they matched the conditions: In other words, all five words are 5 letters long, and all the 25 letters are unique.

That is, of course, an extremely inefficient way of doing it and, as it so happens, it can be sped up by several orders of magnitude with extremely simple optimizations. These optimizations might sound trivial to most programmers, especially once spelled out, but they might not come to the mind of a beginner programmer.

First: By far the biggest and also the simplest optimization (that Matt clearly didn't think of): We are only interested in 5-letter words. That means we can discard all the other words from the get-go. It's that simple. When reading the dictionary file, if a word doesn't have 5 letters, just discard it and don't add it to the data container.

That simple step reduces the number of words from several hundred thousand to just a few thousand. That alone speeds up all the subsequent calculations by many orders of magnitude (far more than the reduction in word count itself suggests, because the number of 5-word combinations to check grows roughly as the fifth power of the word count).
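Just to illustrate the scale with round numbers (these are illustrative figures, not the exact counts from Matt's dictionary): Going from 300,000 words down to 10,000 words is only a 30-fold reduction in word count, but the number of possible 5-word combinations shrinks by roughly a factor of 30^5, which is about 24 million.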

Matt's biggest mistake was to take all the words in the dictionary file without discarding any of them. That, rather obviously, is completely unnecessary and extremely wasteful and inefficient. Just this optimization alone, even without anything else, would have probably reduced the runtime of his program from one month to just an hour or so, perhaps even less.

Second: We can actually discard even more words than that, further reducing the amount of data to process. In particular, if a word has a repeated letter, we can also discard it when reading the input, because such words would fail the conditions immediately and cannot be part of the answer. This will further reduce the number of words, probably to less than half of all 5-letter words in the dictionary.

This second optimization would have likely made Matt's program, even without any other changes, take just ten minutes, perhaps even less.
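Just to make those two filtering steps concrete, here is a minimal Python sketch of what the reading loop could look like (the file name words.txt is merely a placeholder for whatever dictionary file is being used, and the details are only illustrative):

words = []
with open("words.txt") as dictionary_file:
    for line in dictionary_file:
        word = line.strip().lower()
        # Keep only 5-letter words (first optimization) whose letters
        # are all different (second optimization).
        if len(word) == 5 and len(set(word)) == 5:
            words.append(word)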

Third: This is where actual algorithmic knowledge and thinking become more necessary.

Matt's program went through every single possible combination of 5 words in the input. This is unnecessary, and we can go through significantly fewer combinations than that, further reducing the runtime by an order of magnitude or two, perhaps even more.

This algorithmic trick is very common and very well known when dealing with exactly this kind of problem. And the idea is: If two words share any letters, you don't need to check any combinations containing those two words (because, rather obviously, all of those combinations will fail.) Just this idea alone allows skipping the vast, vast majority of possible combinations, speeding up the program enormously.

While the idea and principle is quite simple, it might not be immediately obvious to the beginner programmer how to actually implement it in code (and many beginner and even not-so-beginner programmers will often resort to really complicated and lengthy solutions to try to achieve this.) This is where programming and algorithmic expertise becomes very helpful, as the solution is much simpler than it might sound.

Explaining in great detail the simple algorithm to achieve this would require a bit of text, so I'll just summarize the general idea instead: Going through all the combinations of elements (words in this case) can be implemented as a recursive function which keeps track of which elements we are dealing with. The recursion can be stopped (and execution returned to the previous recursion level) when we detect two words with shared letters, thus skipping all the subsequent deeper recursions.
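To give a rough idea of what that looks like in practice, here is a minimal Python sketch of such a pruned recursive search (this is just my own illustration of the principle, not Matt's code; it assumes each word has been paired with the set of its letters, eg. words = [(word, frozenset(word)) for word in words]):

def find_combinations(words, needed=5, start=0, used_letters=frozenset(), chosen=()):
    # Yields every combination of `needed` words (from index `start` onwards)
    # whose letters don't overlap with each other or with `used_letters`.
    if needed == 0:
        yield chosen
        return
    for index in range(start, len(words)):
        word, letters = words[index]
        # Pruning: if this word shares a letter with the words already chosen,
        # skip it, and thereby skip every deeper recursion containing it.
        if used_letters & letters:
            continue
        yield from find_combinations(words, needed - 1, index + 1,
                                     used_letters | letters, chosen + (word,))

The start index ensures that each set of words is only considered once (in one order), and the early skip is exactly the pruning described above: an entire branch of deeper recursions is skipped whenever a shared letter is detected.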

Fourth: Comparing words.

Here, too, Matt used a very naive approach, where he would take the original words and do an elaborate quadratic comparison of their individual letters.

The thing is: When doing the comparison we don't need the letters of the words in their original order. That order is unnecessary because we only want to know if the words share letters, and the order of the letters within a word doesn't matter. Thus, we can rearrange the letters in each word, eg. alphabetically, in order to make the comparison simpler and faster. (The original word can be kept alongside the reordered one in order to print out the found words.)

But that's not even the most efficient way of doing it. Since no letters are repeated (as we have discarded all the words with repeated letters), we can just create a small bitmap of each word, with each bit representing one letter of the alphabet. Since the English alphabet consists of 26 letters, a 32-bit integer more than suffices for this. Thus, we can "convert" each 5-letter word into an integer (whose bits tell which letters are used in the word), and then we can use a bitwise "and" operation to compare two of them and see if they share any bits. In other words, rather than going through strings of letters, we are just comparing two integers with a bitwise-and operation. This is extraordinarily fast.
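As a rough sketch of the bitmask idea in Python (again just an illustration, with bit 0 standing for 'a', bit 1 for 'b', and so on):

def word_to_bitmask(word):
    # Build a 26-bit integer with one bit set per letter of the word.
    mask = 0
    for letter in word:
        mask |= 1 << (ord(letter) - ord('a'))
    return mask

# Two words share a letter exactly when (mask_a & mask_b) != 0.

In the recursive search sketched earlier, used_letters then simply becomes an integer starting at 0; the pruning test stays as used_letters & letters, and adding a word's letters stays as used_letters | letters.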

Even if we restricted ourselves to using Python, doing the four optimizations above would solve the problem in just a few seconds, perhaps even in less than a second. 
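Putting the sketches above together (again, just an illustration of the overall shape of the program, not a polished solution), the whole thing boils down to something like:

masked_words = [(word, word_to_bitmask(word)) for word in words]
for combination in find_combinations(masked_words, used_letters=0):
    print(combination)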

Sunday, December 14, 2025

Programmers are strangely dogmatic about cryptic variable names

As a very long-time computer programmer, I have noticed an interesting psychological phenomenon: For some reason beginner programmers absolutely love it when they can express a lot of functionality with as little code as possible. They also tend to love mainstream programming languages that allow them to do so.

As an example, if they are learning C, and they encounter the fact that you can implement the functionality of strcpy() with a very short one-liner (directly with C code, without using any standard library function), a one-liner that's quite cryptic compared to how it's done in an average programming language, they just love that stuff and get enamored with it.

This love of brevity in programming quite quickly becomes a habit for the vast majority of programmers, and most of them never "unlearn" said habit. One of the most common and ubiquitous places where most programmers will apply this love of brevity is variable names. (Oftentimes they will do the same with other names in the program as well, such as function and class names, but variable names are the most common target for this, even when they keep those other names relatively clear.)

It becomes an instinct that's actually hard to get rid of (and I'm speaking from personal experience): That strong instinct of using single-letter variables, abbreviations and acronyms, no matter how cryptic and unreadable they may be to the casual reader. (Indeed, their most common defense of those names is "I can understand them perfectly well", without acknowledging that it may not be so clear to others reading the code. Or even to themselves five years down the line.)

Thus, they will use variable names like, for example, i instead of index, pn instead of player_name, ret instead of return_value, col instead of column (or even better, column_index), col instead of color, rc instead of reference_count, and so on and so forth. After all, why go through the trouble of writing "return_value" when you can just write "ret"? It's so much shorter and more convenient!

But the thing is, the more the code is littered with cryptic short variable names, the harder it becomes to read and understand for someone reading (and trying to understand) the code. I have gotten an enormous amount of experience with that, as I have in the past had to write an absolutely humongous amount of unit tests for a huge existing library. The thing about writing unit tests for existing code is that you really, really need to understand what the code is doing in order to write meaningful unit tests for it (especially when you are aiming for 100% code coverage).

And, thus, I have seen a huge amount of code that I have had to fully understand (in order to write unit tests for), and I have seen in intricate detail the vast difference in readability and understandability between code that uses cryptic variable and function names vs. code that uses clear readable names. Unsurprisingly, the latter helps quite a lot.

In fact, this is not some kind of niche and novel concept. The coding guidelines of many huge companies, like Google, Microsoft, Facebook and so on, have sections delineating precisely this. In other words, they strongly recommend using full English words in variable and function names rather than abbreviations and cryptic acronyms. One common principle, relating to the latter, is: "If the acronym does not have an article in Wikipedia, just write it fully out."

One particular situation where I have noticed how much clear variable naming helps is loop variables. Loop variables are the thing that most programmers abbreviate the most, and most often. Sometimes they go to outright unhealthy lengths to use loop variables that are as short as possible, preferably single-letter, even if that means using meaningless cryptic names like i, j and k.

I have, myself, noticed the importance of, and gotten into the habit of, naming loop variables after their use, ie. what they represent and are being used for. For example, let's say you are iterating through an array of names using a loop, with the loop variable indexing said array. Thus, I will name said variable eg. name_index rather than just i (which is way too common.) If the loop variable is counting something, I will usually name it something_count, or similar, rather than just i or n.

The longer the body of the loop is, and especially if there are several nested loops, the more important it becomes to name the loop variables clearly. It helps immensely to keep track of and understand the code when the loop variables directly name what they represent, especially alongside naming everything else clearly. For example, suppose you see this line of code:

pd[i].n = n[i];

That doesn't really tell us anything. Imagine, however, if we changed it to this:

player_data[player_index].name = names[player_index];

Is it longer? Sure. But is it also significantly clearer? Absolutely! Even without seeing any of the surrounding code we already get a very good idea of what's happening here, much unlike with the original version.

Yet, try to convince the average experienced programmer, who is used to littering his code with short cryptic variable names, of this. You will invariably fail. In fact, for some reason the majority of computer programmers are strangely dogmatic about it. They are, in fact, so dogmatic about it that if you were to make this argument in an online programming forum or discussion board, you would likely be starting a full-on flamewar. They will treat you like a stupid arrogant person who has personally insulted them to the core. I'm not exaggerating.

The instinct to write short cryptic code, very much including the use of short cryptic variable names, sits very deep in the mind of the average programmer. It's a strange psychological phenomenon, really.

I have concocted a name for this kind of cryptic programming style: Brevity over clarity. 

Monday, December 8, 2025

I'm tired of "viral math problems" involving PEMDAS

In recent years (or perhaps the last decade or so) there has been a rather cyclic phenomenon of a "viral math problem" that somehow stumps people and reveals that they don't know how to calculate it. It seems that every few months the exact same problem (with perhaps just the numbers changed) makes the rounds. And it always makes me think: "Sigh, not this again. It's so tiresome."

And the "viral math problem" is a simple arithmetic expression which, however, has been made confusing and obfuscated not only by using unclear operator precedence but, moreover, by abusing the division symbol ÷ instead of using fractional notation. Pretty much invariably the "problem" involves having a division followed by a multiplication, which is what introduces the confusion. A typical version is something like:

12 ÷ 3(2+2) = ?

This "problem" is so tiresome because it deliberately uses the ÷ symbol to keep it all in one line instead of using the actual fractional notation (ie. a horizontal line with the numerator above it and the denominator below it) which would completely disambiguate the expression. And, of course, it deliberately has the division first and the multiplication after that, causing the confusion.

This is deliberately deceptive because, as mentioned, the normal fractional notation would completely disambiguate the expression: If the division of 12 by 3 is supposed to be calculated first, and the result then multiplied by (2+2), then the fraction would have 12 at the top, 3 at the bottom, and the (2+2) would follow the fraction (ie. be at the same level as the horizontal line of the fraction).

If, however, the 12 is supposed to be divided by the result of 3(2+2) then that entire latter expression would be in the denominator, ie. below the fraction line.

That clearly and uniquely disambiguates the notation. Something that "12 ÷ 3(2+2)" quite deliberately does not.
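(Worked out, the two readings give completely different answers: (12 ÷ 3) × (2+2) = 4 × 4 = 16, whereas 12 ÷ (3 × (2+2)) = 12 ÷ 12 = 1. Which of those is "correct" is precisely what the one-line notation fails to specify.)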

Many people would think: "What's the problem? It's quite simple: Follow so-called PEMDAS, where multiplication and division have the same precedence, and operators of the same precedence are evaluated from left to right. In other words, divide 12 by 3 first, then multiply the result by (2+2)."

Except that it's not that simple. It so happens that "PEMDAS" does not really deal with the "implied multiplication", ie. the symbolless product notation, such as when you write "2x + 3y", which has two implied products.

The fact is that there is no universal consensus on whether the implied product should have a higher precedence than explicit multiplication and division. And the reason for this is that in normal mathematical notation the distinction is unnecessary because you don't get these ambiguous situations, and that's because the ÷ symbol is not usually used to denote division alongside implied multiplication.

In other words, there is no universal consensus on whether "1 ÷ 2x" should be interpreted as "(1÷2)x" or "1 ÷ (2x)". People have found published physics and math papers that actually use the latter interpretation, so it's not completely unheard of.

The main problem is that this is deliberately mixing two different notations: Usually the mathematical notation that uses implied multiplication does not use ÷ for division, instead using the fraction notation. And usually the notation that does use ÷ does not use implied multiplication. These are two distinct notations (although not really "standardized" per se, which only adds to the confusion.)

Thus, the only correct answer to "how much is 12 ÷ 3(2+2)?" is: "It depends on your operator precedence agreement when it comes to the ÷ symbol and the implied multiplication." In other words, "tell me the precedence rules you want to use, and then I'll tell you the answer, because it depends on that."

(And, as mentioned, "PEMDAS" is not a valid answer to the question because, ironically, that's ambiguous too. Unless you take it literally and consider ÷ and implied multiplication to be at the same precedence level, and thus to be evaluated from left to right. But you would still want to clarify that that's what's meant.)

Also somewhat ironically, even if instead of implied multiplication we used the explicit multiplication symbol from the same notation set as the ÷ symbol, in other words, if the expression were:

12 ÷ 3×(2+2)

that would still be ambiguous because even here there is no 100% consensus.

The entire problem is just disingenuous and specifically designed to confuse and trick people, which is why I really dislike it and am tired of it.

An honest version of the problem would use parentheses to disambiguate. In other words, either:

(12÷3)×(2+2)

or

12 ÷ (3×(2+2))

Wednesday, November 12, 2025

Signs that a bodybuilder is not a "nattie" (ie. uses PEDs)

Most bodybuilders are very open about the fact that they use PEDs (ie. "performance enhancing drugs", like steroids, human growth hormone, and a multitude of others.)

However, there are likewise many who are not so open about it, particularly the ones who are social media "influencers" and, especially, the ones trying to sell you something. But even if they aren't trying to sell you products, many of them claim to be "natties" just for the clout and fame, just to get the admiration of people, showing what you can supposedly achieve if you just work hard enough for it.

Sometimes it can actually be quite difficult to just outwardly see if someone is a "natural" or is taking PEDs. For those bodybuilders who are extreme mass monsters, with biceps larger than a normal person's head, it's quite clear, as nobody can naturally get muscles that large.

For less extreme bodybuilders, however, it can be harder to discern. There are nevertheless some signs to look for.

Note that none of these are 100% certain proof, but they add to the evidence. The more of these that can be spotted, the less likely it is that the guy is a "natural", and the more likely it is that he is using PEDs.

1) The rather obvious one: Acne, particularly on the upper back, sometimes even the upper chest, shoulders and face. While some people naturally have acne in those areas even though they take nothing, it's a very telling sign.

2) Another quite obvious one: Extreme "mass monsters" (Schwarzenegger-sized and bigger) are almost certainly not natural. Reaching those sizes completely naturally requires extraordinary genetics (essentially, the body itself naturally produces the steroids that one would usually take externally.) While not impossible, it's extremely unlikely.

Those are the most commonly known ones. However, there are also lesser known things to look for:

3) Neck and shoulders, especially the neck. Steroids have a particularly strong effect on those two (there are physiological and biological reasons for this), and they don't grow nor thicken even nearly as much without them. Most steroid users will have really thick necks, even unnaturally so. Shoulders are also notoriously difficult to grow naturally. People not taking steroids will usually have more normal necks and shoulders. (However, this doesn't rule out other PEDs.)

4) Unnaturally thick veins. One side-effect of PEDs is that they enlarge veins, particularly those close to the skin. If you see a ripped bodybuilder with really thick veins visible, it's an almost certain sign of PEDs. Normally people don't have veins that thick. They tend to be quite narrow.

5) Being very muscular and very ripped. Being "ripped" is that look when your body fat is extremely low, so everything else under the skin, particularly muscles, can be seen in great detail. Bodybuilders do this to really show off muscle definition, particularly prior to competitions. But the thing is: It's extremely hard to gain and particularly maintain that much muscle naturally while having such a low body fat percentage. The only way to reach body fat percentage that low is to be in a calorie deficit, which inevitably eats away at the muscles no matter how much you train, unless you are on PEDs. This is a particularly clear sign if the guy is always extremely ripped, not just temporarily for a competition.

6) Gynecomastia: In other words, growth and "drooping" of the nipple area, which is usually a direct consequence of steroids. If the nipples are bulging or pointing almost directly down, it's an almost certain sign.

7) Distended belly, ie. "palumboism": If the bodybuilder is "ripped" and has a noticeably large belly, it's an almost certain sign of using human growth hormone and, probably, other substances. Extremely low fat percentage (which is what makes you look "ripped") and a large belly don't usually occur together naturally. Natties who are ripped will almost always have a very prominent hourglass shape with an extremely lean, even inwards belly area (because of the lack of fat), and will be able to produce a huge "vacuum" in that area.

Natties can also have big bellies, but in that case they are pretty much never "ripped". They will look like they have a high fat percentage everywhere, because they are exactly that, ie. fat. The abdominal muscles will not be visible, and the belly is usually just a big round ball. Muscles elsewhere will also not be clearly delineated and will be just big bulges covered in fat tissue. (That being said, this look is in no way a guarantee of a nattie. Many fat bodybuilders use PEDs.)

Thursday, October 16, 2025

Is the Equation Group a "white hat", "black hat" or "grey hat" team?

The "Equation Group" is an unofficial name (invented by Kaspersky Lab) for one of the most notable and advanced team of hackers and software (and possibly hardware) developers in the world. While never officially recognized (for rather obvious reasons), there's very strong and credible evidence that the members of this team are employees of (or at a minimum closely working for) the NSA, most likely as part of their Tailored Access Operations department (which actually is officially recognized.)

From what has been discovered of the work of this group, and also inferred from the Snowden leaks, it's extremely likely that the main purpose of this hacker group is to research and discover zero-day exploits in operating systems and all kinds of other software and hardware, and to develop programs and tools that use those exploits to hack into computers (and who knows what other tasks they carry out using that ability to hack into the computers of foreign governments and other organizations and people.)

As Kaspersky Lab and other researchers have found, these are not just some script-kiddies doing this for fun and fame. Code that has been attributed to them tends to be extremely sophisticated, uses very advanced techniques, and often contains zero-day exploits most likely found by the team themselves. From all that's known about them and their code, they are highly skilled and advanced hackers and software developers.

It is known that the NSA stockpiles these "zero-day exploits" that this team (and probably others) find, for their own uses, rather than disclose them to the software and hardware companies (such as Microsoft.)

There have been known cases of such zero-day exploits having been kept secret by the NSA for many years before they were found independently and patched, or discovered through one of their malware being examined (most famously Stuxnet). Or, at least in a few cases, through the NSA itself having been hacked!

Indeed, one would think that given the top-level competency, skill and professionalism of this team and the NSA in general, they would have some of the highest digital security in the world, making them pretty much impervious to being hacked themselves. Yet, that has turned out several times to not be the case.

Quite famously, Edward Snowden leaked a ton of top-secret NSA documents to the public. Curiously, Snowden was not some kind of NSA employee with a very high security clearance who had been working for the agency for decades when he decided to go rogue. No, he was just an external contractor who had been working in that capacity for quite a short amount of time, with no other affiliation with the NSA. Essentially, he was just an outsider, not a governmental worker, who had been given temporary access as an external contractor for some minor work. Yet, he had full access to top-secret documents of the NSA that he could freely copy for himself without any restrictions, and leak to the public.

That was because, at least back then in 2013, the security and safety measures at the NSA were astonishingly lax and poor. Even many private companies, even in 2013, had significantly stricter and stronger security measures than the NSA had. Indeed, Snowden had access to all those top-secret documents just because the sysadmins in charge of the NSA computers were lazy and granted everybody access to everything out of convenience. As incredible as it might sound, even if you were just a recently-hired external contractor doing a minor job for the NSA, you were granted full access to almost everything, pretty much without limits. And that's exactly how one of those temporary external contractors, Edward Snowden, got hold of those documents. It is exactly as incredible and crazy as it sounds.

Whether the NSA started implementing more security measures after the Snowden leaks is unknown, but apparently even if they did, it wasn't enough, because in 2016 another hacker group, who call themselves The Shadow Brokers, were able to hack the NSA's computers and steal much of the exploit software developed by the Equation Group. The latter might consist of some of the top hackers and developers in the world, but apparently even they were not immune to being hacked themselves. Or, at a minimum, the servers where their software was stored were not (which might actually not be their fault, depending on who within the NSA was tasked with developing and maintaining those servers. If it was the same admins that allowed Snowden to just access and copy the top-secret documents, who knows.)

Perhaps the most famous exploit software that they stole and leaked was one codenamed EternalBlue, which was an implementation of a zero-day exploit of Windows that allowed running code on any Windows computer remotely (by exploiting a bug that existed at the time in Windows' implementation of the SMB protocol.) It became famous because that code was used to create the infamous WannaCry ransomware, and later the (perhaps somewhat less famous) NotPetya, which caused even more damage.

There's evidence to show that the NSA had sat on (and probably used) the EternalBlue exploit for at least five years before it was stolen and leaked, which is what allowed Microsoft to become aware of the bug and patch it. If it hadn't been stolen, it would probably have gone unpatched for several more years.

Unsurprisingly, Microsoft issued severe criticism of this "stockpiling of zero-day exploits" by the NSA, as it keeps regular citizens vulnerable to exploits that have been found but are deliberately left undisclosed. The amount of damage caused by the several malware that used EternalBlue is estimated to be at least 1 billion dollars.

Anyway, given all of this, an interesting question arises: Can the Equation Group be classified as "white hat", "black hat" or "grey hat" hackers?

The term "white hat hacker" is used to describe a hacker who tries to find hacks, exploits and vulnerabilities in software and hardware with the full intent of disclosing them to the manufacturers as quickly as possible, and with zero intent of abusing those exploits himself. Usually he will inform the manufacturers well in advance before disclosing the vulnerability to the public, to give the manufacturers time to patch their systems. These hackers try to always remain within legal limits. Many "white hat" hackers are actually outright employed by companies to find vulnerabilities in their own systems, and thus are doing it with full permission (and even paid for it.)

The term "black hat hacker" is, rather obviously, used to describe the opposite: In other words, a hacker who tries to find these vulnerabilities in order to either exploit them himself, or to sell them in the hacker black market to others (useful zero-day exploits, especially those that allow full access to any computer system, are incredibly valuable in the black market, and could fetch a price of tens of thousands of dollars, or even more.)

The term "grey hat hacker" is a bit fuzzier, and the definition depends a bit on who you ask. One common definition is a hacker who has no intent to abuse the exploits he finds (nor sell them to anybody), but has no qualms about breaking the law in order to find them (for example illegally breaking into the computer system of a company, or even and individual person, in order to gain access to more information that could help find even more vulnerabilities.) Some "grey hat hackers" might have primarily good intentions and think of breaking the law (eg. by illegally breaking into computers) as justified for the greater good (ie. discovering and disclosing vulnerabilities.) Other such hackers might just do it for the thrill, even if they don't have any intention of actually abusing the vulnerabilities they find any further (other than eg. rummaging around in the servers of a company) but with no intent to disclose those vulnerabilities either. Maybe they do a bit like the NSA does, ie. "stockpiling" knowledge and vulnerabilities that they might discover.

"Grey hat" might also be used to describe a hacker who illegally exploits computer systems in order to achieve something that's deemed a good thing, even if doing so is illegal. For example, not to disclose the exploits itself, but to disclose some incriminating information about the company or person, such as evidence of a crime they have committed. A bit like modern-day "Robin Hoods" who go against the law in order to fight evil.

So, in light of all of this, is the Equation Group a "white hat", "black hat" or "grey hat" hacker team? I think arguments could be made for all three:

1) They are "white hat" hackers because what they are doing is not illegal, and they are doing it on behest of the government for national security, to combat foreign threats. By the very fact that it's not illegal, it's not against the law, they are "white hats". It's in essence no different from a company hiring a hacker to find vulnerabilities in their systems. Not disclosing the vulnerabilities is in essence no different from for example not disclosing the locations and activities of spies performing activities in foreign countries, which is acceptable for national security reasons.

2) They are "black hat" hackers because they illegally exploit systems and not disclosing vulnerabilities is unethical, irresponsible and puts people in danger (which might even be considered criminal negligence.) Their research of vulnerabilities is not done to help people, but quite explicitly to exploit those vulnerabilities. Just because they might not be prosecuted by the government doesn't mean they aren't actually breaking the law, it's merely the government looking the other way and excusing it as being "for national security" (just like their spies murdering enemies in foreign countries.) Being endorsed by the government doesn't make them any less of "black hat" hackers, it simply makes them "government black hat" hackers.

3) They are "grey hat" hackers because even if what they are doing might be technically illegal, or at least ethically questionable, they are doing it for a good goal: That of protecting their country from foreign threats. There is no evidence that they are using these exploits for abuse their own citizens and compatriots. They are using these exploits to protect their compatriots. Even if the government sometimes might use these hacks to abuse their own citizens, that's most likely not the fault of the hackers themselves who discovered these vulnerabilities. It's very possible that they don't even know what their software is being used for in great detail. They may well have good intentions behind their work, ie. help protect their own country.