there is nothing that makes me feel closer to divinity than doing my Abstract and Discrete Mathematics coursework unaided
puzzling out the proofs for concepts so utterly fundamental to math by myself that it’s like if Genesis 1:3 was And God said, 'Let there be integer,' and there was integer
it's when you ask "okay but how does division work on a fundamental level, it's definitely physically intuitive yeah but what's the maths behind it, like multiplying is adding a number to itself x amount of times, dividing is unmultiplying by a number, but it's not subtracting a number multiple times, at least not one that's always present in the equation, what's going on here, how did this happen" and it all goes downhill from there
exactly, it's anti-multiplication – you find how many times you need to subtract out a number from a product to get another number, but it’s hard to compute comparatively due to that abstraction. Reason I reference this is that for a fundamental operation of math, computers absolutely suck at it on a basic level compared to multiplication
because the fundamental definition of divisibility is whether or not the chosen divisor—b—can be multiplied by any number within the set of numbers you are working with—c—to get the dividend—a.
The output of division is c. Therefore, the brute force way of dividing a number would be to iterate through the entire set of possible numbers and return the number that, when multiplied by what you are dividing, outputs the value you want to divide from—or, to have the multiplication table as a persistent hash map in memory, pre-computing all possible products.
It’s not implemented like this because that would be horrifically slow/bloated. See Low Level Learning’s 5 minute video computers suck at division (a painful discovery) to see how it’s implemented in modern processors—it’s very, very unintuitive.
or, to have the multiplication table as a persistent hash map in memory, pre-computing all possible products.
huh, yeah, i guess that's kinda how humans do long division?
See Low Level Learning’s 5 minute video
he's hopping right past the most interesting part, though. sure, doing math with byte logic is tricky, so you have to make approximations. so in order to divide by 9 the processor does some fixed point math and says to divide by 9 we're actually going to multiply by 2^33 / 9 which equals that long number. but how does the processor know what 2^33/9 equals? how's it doing that? sounds like it's very good at division, because it would take me a while to work out that value even if i started with trying to find 8589934592/9, y'know?
more to the point, how does it do 0b1000000000000000000000000000000000 / 0b1001 = 0b111000111000111000111000111001 so quickly?
You should understand that "computers very bad at division" is a relative term, on modern desktop x86 chips a single 64-bit division takes somewhere on the order of 20-40 clock cycles (which seems pretty fast when you think about how they're running at like 4 billion cycles per second, but these same chips can do integer multiplication in 2-3 cycles so division is painfully slow by comparison).
The reason multiplying by 2^33 / 9 is faster is because you don't have to actually compute what 2^33 is and then divide by 9 every single time - the compiler can compute that value the "slow" way a single time and bake that into the machine code, and then when the program is running it only has to deal with multiplication.
While we don't know exactly what techniques are being used in a modern Intel/AMD chip, as far as I know the current best approach is basically just long division.
EDIT: The reason long division is slow is that all the partial results depend on the results of previous steps. This means that your division circuit ends up being a really long line of comparator circuits chained end-to-end, and long circuits are bad because it takes a long time for the signal to actually reach the end. Multiplication is fast by comparison because you can compute all the partial products in parallel, and then add them together in a kind of tree shape. The end result is a circuit which is super wide, but the "propagation delay" (the time it takes until the last input signal reaches the end) is pretty low since there's no path which passes through more than a few adders.