You misunderstand what wireheading is


When muggles talk about AI, transhumanism, brain implants, and similar topics, you will often hear about wireheading.



Here’s the logic behind it. The human brain works roughly like a reinforcement learner (RL), maximizing reward and taking actions that lead to the biggest number. Right now humans get the most reward from normal good things, like food, music, sex, screams of enemies crushed under your feet, and so on. The system can be hacked – some drugs stimulate the reward system directly, so strongly that a heroin addict doesn’t need anything but heroin, because no real-life experience can give them as much reward. In the future, we could do this even better – stimulate the reward center directly and set it to the maximum possible value. This is the purported infinite pleasure, the proverbial ‘rats on heroin’ kind of future. And if we are talking about an AI, then a sufficiently smart AI can just hack its own reward center and get the maximum possible reward – because why even bother doing anything else?



Humans don’t seem too excited about the opportunity to press a button that sends them into endless bliss and ends their story there. But not many can articulate why. Some bite the bullet and go full solipsist – yes, the only thing that matters is the pleasure I feel inside my own head. Some add epicycles like “yes, I will be happy, but that’s bad for some vague reason”. Some explain why hedonism is flawed and the only real purpose is serving their equivalent of God. Some think that it’s a point against utilitarianism in general. Some try to rebuild their entire conception of ethics. There is no shortage of takes that try to solve this problem by overcomplicating the philosophy behind it.



And they are all wrong!



There are no errors in the philosophy. There is an error in the math. And you don’t even need a degree to understand it; you just need to have played that ancient game, Creatures, where the titular creatures’ AI runs on RL.



RL-based systems aren’t trying to maximize reward, per se. When they decide on their next action, the concept of reward doesn’t even come up. Reward only exists at the training step, where it gets baked into behavior, but it does not exist when you execute the behavior. You can think of it this way: an RL agent does the things that used to maximize reward during training, not the things that will maximize reward now.
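
To make the distinction concrete, here is a minimal sketch of tabular Q-learning in Python (my own illustration, not something from the original post; QLearner and its method names are made up). Notice where reward shows up: only inside the training update, never in the step that actually picks an action.

```python
import numpy as np

# Minimal tabular Q-learning sketch (illustrative; a hypothetical QLearner, not a specific library).
class QLearner:
    def __init__(self, n_states, n_actions, lr=0.1, gamma=0.9):
        self.q = np.zeros((n_states, n_actions))  # the learned behavior lives here
        self.lr, self.gamma = lr, gamma

    def act(self, state):
        # Execution: pick the action with the highest learned value.
        # No reward signal is consulted here; it was already baked into self.q.
        return int(np.argmax(self.q[state]))

    def train_step(self, state, action, reward, next_state):
        # Training: reward exists only at this step, where it gets folded
        # into the Q-table that later drives behavior.
        target = reward + self.gamma * np.max(self.q[next_state])
        self.q[state, action] += self.lr * (target - self.q[state, action])
```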



In humans, training and execution aren’t cleanly separated and happen simultaneously, but still, at no point does the human brain ask itself “what will maximize my reward?”. Right now, I feel no compulsion to stick an electrode in my brain, because this has never happened in my life, and my brain never trained on it. If I did it, my brain would update on that, and then I would want to spend the rest of my life glued to the electrode. This works in reverse, too – people suffering from addiction often keep using even when they fully understand that it’s no longer pleasurable. The way humans form their desires and behavior follows directly from the math, without having to reinvent ethics.
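
Continuing the hypothetical QLearner sketch above, this is roughly what “I would only want the electrode after trying it” looks like in the math: before the huge reward has ever been experienced, nothing in the learned values points toward it; after a few updates on it, it crowds out everything else.

```python
# One state, two actions: 0 = normal life, 1 = the electrode.
agent = QLearner(n_states=1, n_actions=2)

agent.train_step(0, 0, reward=1.0, next_state=0)  # ordinary rewards shape behavior
print(agent.act(0))  # -> 0: the agent keeps doing what was rewarded in training

# Only after actually experiencing the electrode does the update make it dominate.
for _ in range(10):
    agent.train_step(0, 1, reward=100.0, next_state=0)
print(agent.act(0))  # -> 1: now the wireheading action wins
```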



And yes, humans aren’t pure RLs, there are different kinds of reward, and the human brain is just generally more complicated than that. But even in this simplest case, wireheading doesn’t become a good idea. This entire discourse stems from a misunderstanding. Somebody says that RL is “maximizing reward”, which is only partially true. Then someone invents a way to maximize reward very much indeed. And then they ask the tricky question – “the math says that you should want to put an electrode in your brain, so why don’t you want that?” And people keep searching for ways to alleviate the cognitive dissonance instead of just saying “the math never said that”.



No, maximizing your reward center is not a good idea by any means. If you do that, then instead of all the things that training taught you to like – sex, art, knowledge, beauty – you would just keep pressing the button, which is as pointless and unfun as it sounds.



And you are totally allowed to care about things outside of your head. You don’t need to invent any philosophical justification for why you care specifically that your loved one survives, and not that you would merely believe they survived. It’s the other way around: if someone cares only about their own mental state, you should investigate – why, what happened to their brain?

