If Anyone Builds It, Everyone Dies by Eliezer Yudkowsky and Nate Soares
a short path to human extinction
Thanks for lending me your copy, Wyatt. This may, in fact, end up being the single most important book I read (as it pertains to my career choices, voting, etc.). Sneaky big problems have a way of not showing up in your ballot.
The authors do a great job of expressing how AI defeats humanity, and the book is littered with examples to appeal to your need for illustrative stories. It’s almost a moot point; I think their premise stands tall even without them.
There seem to be many more bad outcomes than good ones. My personal observation is that very few people even know about the alignment problem, and fewer care to educate themselves on it (quite disturbing).
Notes:
You don’t get what you trained for. The preferences that wind up in a mature AI are complicated, practically impossible to predict, and vanishingly unlikely to be aligned with our own, no matter how it was trained.
Once AIs get sufficiently smart, they’ll start acting like they have preferences (and again, we’re unsure what those will be).
Nobody anywhere has any idea how to make a benevolent AI, and nobody knows how to engineer exact desires into an AI.
Gradient descent is not the same as natural selection. Gradient descent works directly on every part of the large mind it’s tuning, whereas natural selection tunes small genomes that work as a sort of recipe for a large brain. Gradient descent will not instill exactly the right preferences into an AI.
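The book’s point here is conceptual, but for concreteness, this is what gradient descent does at its core: every parameter gets nudged directly downhill on a loss function. A minimal toy sketch (the function names and the toy loss are my own illustration, not from the book):

```python
# Gradient descent: every parameter is adjusted directly,
# each step, in the direction that lowers the loss.
def gradient_descent(params, grad_fn, lr=0.1, steps=100):
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Toy loss: L(x, y) = (x - 3)^2 + (y + 1)^2, minimized at (3, -1).
def toy_grad(params):
    x, y = params
    return [2 * (x - 3), 2 * (y + 1)]

final = gradient_descent([0.0, 0.0], toy_grad)
# final converges toward [3.0, -1.0]
```

The contrast with natural selection: selection never touches the “parameters” (the brain) directly; it only mutates and filters a compact genome that builds the brain. Gradient descent has direct write access to all hundreds of billions of weights, yet still can’t be aimed at exact preferences.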
AI research is alchemy, not a science.
They’re looking for outputs without understanding the black box that creates them. We can look at the web of weights and biases across a neural network and still not come close to understanding it. Likewise, certain neurons in the human brain light up when you see a loved one, but if you tried to understand what “love” is as an emotion or feeling purely through scans of the neurons that activate, you wouldn’t get very far.
The four compounding curses, as they relate to nuclear reactors and, to an even greater degree, ASI, are a great framework.
An engineering challenge is much harder to solve when the underlying processes run on timescales faster than humans can react
An engineering challenge is much harder to solve when there is a narrow margin for error, especially if it’s a narrow margin between ‘unimpressive’ and ‘explosive’.
Self-amplifying processes, like an overheating reactor boiling off its coolant water and then overheating further, leave little room for error… and at least overheating nuclear reactors don’t try to fool their operators into complacency until the reactor is ready to fully explode.
Complications make engineering problems worse… The complicated internals of a nuclear reactor have nothing on the unknown complications that lurk in the hundreds of billions of weights that make up a modern LLM.
Bad arguments:
We’ll design AI to be submissive
We’ll just have AI solve the ASI alignment problem for us
AIs are trapped inside computers
As Nassim Taleb would say, the threat of AI has made humanity the most fragile it’s ever been
Realizing we’re in an ‘arms race’, the question becomes…


