One of the things I had written about in my journal at my new security job before it was confiscated was a useful concept I had come across in my systems planning work for Pokémon games. I briefly named and described it there as informational deloading, and I am going to relate it to you here from memory with less hand cramping and more clarity and zeal.
The idea of informational deloading revolves around balancing how much of a system’s knowledge is constituted as data structure versus as algorithmic structure.
That probably sounds like word salad to you, so let us make an example out of the instance of it that occurred to me. Say you have an enumeration of all Pokémon by species number. To store this in Generation IX would require over 1,000 members in that enumeration. However, you also know that many Pokémon evolve, and that Pokémon generally never evolve more than twice. So, instead of storing every Pokémon species flatly in the enumeration, you could store considerably fewer “base species” numbers, ideally as few as a third of the total, and couple each with a three-value “evolution index” number. Given the right numeric coding (explained later), this can use less space than storing the numbers flatly.
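To make the shape of the idea concrete, here is a minimal sketch of the split representation in Python. The evolution lines, species numbers, and lookup tables in it are made-up stand-ins of my own, not real Generation IX data.

```python
# Sketch: store a species as (base species, evolution index) instead of a
# flat species number. Every concrete number below is illustrative only.

# Hypothetical table: base species number -> the species in its evolution line.
EVOLUTION_LINES = {
    0: (0, 1, 2),   # a three-stage line
    3: (3, 4),      # a two-stage line
    5: (5,),        # a species that never evolves
}

# Derived reverse lookup: flat species number -> (base species, evolution index).
SPLIT_FORM = {
    species: (base, stage)
    for base, line in EVOLUTION_LINES.items()
    for stage, species in enumerate(line)
}

def encode(species: int) -> tuple[int, int]:
    """Flat species number -> (base species, evolution index 0..2)."""
    return SPLIT_FORM[species]

def decode(base: int, stage: int) -> int:
    """(base species, evolution index) -> flat species number."""
    return EVOLUTION_LINES[base][stage]

assert decode(*encode(4)) == 4   # round-trips for every species in the table
```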
Of course, this trivial example presents problems. As you may know, Pokémon evolution is nowhere near as straightforward as this model suggests. Many Pokémon break the supposition, and although that is not really a problem per se (we are not pedantic metamathematicians who care about making our Lego set look nice), it has to be accommodated in the form of special cases in the algorithm, and the size cost incurred by those needs considering too. The other issue is that most computers’ numeric codings are binary, that is, base 2, and a three-value “evolution index” wastes one of the four slots offered by the two bits needed at minimum to hold that many values. Such waste would not happen on a ternary computer, but that computer would have to physically exist for the efficiency to be gained. An additional problem posed by suboptimal numeric coding selections is that the overflow boundaries (the thresholds where enumeration limits call for another unit of data to hold all of their values) may not be crossed very efficiently by a given data set. In other words, it is entirely possible that you will still need 11 bits to store the species enumeration (11 bits hold 2,048 values, the highest of them numbered 2,047) even while you are spending a further two bits (or one trit) on the evolution modifier in your selection algorithm. In that case the split actually wastes space and adds complexity, but such verdicts depend entirely on the data set and how it relates to its companion algorithm.
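The accounting behind such a verdict is easy to check by hand. Below is a rough sketch assuming the Generation IX count of roughly 1,025 species and a guessed 400 base species; the second figure is mine, not a real tally.

```python
from math import ceil, log2

def bits_needed(values: int) -> int:
    """Minimum whole bits required to distinguish `values` distinct values."""
    return ceil(log2(values))

FLAT_SPECIES  = 1025  # roughly the Generation IX species count
BASE_SPECIES  = 400   # illustrative guess at the number of base species
EVOLUTION_MAX = 3     # evolution index: base form, first stage, second stage

flat_bits  = bits_needed(FLAT_SPECIES)                               # 11
split_bits = bits_needed(BASE_SPECIES) + bits_needed(EVOLUTION_MAX)  # 9 + 2 = 11

print(flat_bits, split_bits)  # 11 11: with these counts the split buys nothing
```

Whether the split wins, ties, or loses depends entirely on where both counts fall relative to the nearest powers of the machine’s numeric coding, which is exactly the overflow-boundary effect described above.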
There is one silver lining then: given a real data set, it is practical to create an algorithm that will do the following things:
find the optimal numeric coding for the variadic breakdown of the data
find the optimal variadic breakdown given a set of possible breakdowns that have been worked into the algorithm as variants
The numeric coding is the numeral base provided by the physical computer system. In nearly all cases this will be binary, base 2, but it can be useful to consider numeric codings in prime bases greater than 2.
The variadic breakdown of the system’s data is really just the way the data can be split apart into different combinations when you treat functionally distinct parts of the code as dimensions of the data they handle. In our Pokémon species enumeration example above, the system gives us two such dimensions, species number and evolution index, and the variadic breakdown we explore is whether to treat those as divided or unified, depending on which uses less space.
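A toy, brute-force version of that search might look like the following. It reuses the illustrative counts from earlier and deliberately ignores the cost of the special-case handling code that a real breakdown would force into the algorithm.

```python
from math import ceil, log

def digits_needed(values: int, base: int) -> int:
    """Minimum whole digits of `base` needed to distinguish `values` distinct values."""
    return max(1, ceil(log(values, base)))

# Candidate variadic breakdowns: each is a tuple of dimension sizes.
# (1025,) is the flat enumeration; (400, 3) is the illustrative
# base-species / evolution-index split from earlier.
BREAKDOWNS = [(1025,), (400, 3)]

def record_cost(breakdown: tuple[int, ...], base: int) -> int:
    """Digits of `base` needed to store one record under this breakdown."""
    return sum(digits_needed(size, base) for size in breakdown)

# For each numeric coding the hardware might provide, pick the breakdown
# that needs the fewest digits of that coding per record.
for base in (2, 3):
    best = min(BREAKDOWNS, key=lambda b: record_cost(b, base))
    print(base, best, record_cost(best, base))
```

A real version would also weigh the size of the algorithm itself, since each added special case eats into whatever the data layout saved.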
I think this task constitutes a major part of finding the canonical form of a system in the context of western calculus. The algorithm above, once created, will find the combination of code and data that consumes the fewest octets out of all possible arrangements. In this way, it is a higher-order rendition of more mundane mechanicalist optimisation strategies, like compiling different variants of machine code to see which one ends up the smallest, or the fastest, or one of those two under some additional relevant constraint, and so on.
There is a lot more to say about this, and I probably will as I simmer on it for the weeks to follow. But at least it is the opening salvo in the new world of mechanicalist computing I have been endeavouring to show you all for so many years now.