Hacker Talks
Nemeski@lemm.ee to Artificial Intelligence@lemmy.sdf.org · English · 4 months ago

Punishing AI doesn't stop it from lying and cheating — it just makes it hide better, study shows

www.livescience.com

Punishing AI for lying and cheating might not be such a good idea after all
Scientists at OpenAI have attempted to stop a frontier AI model from cheating and lying by punishing it. But this just taught it to scheme more privately.
You must log in or sign up to comment.
  • Sanctus@lemmy.world · 4 months ago

    Isn't there a study on human children that purports the same?

    • splinter@lemm.ee · 4 months ago

      Yup

    • LordTE7R1S@lemmy.sdf.org · 4 months ago

      And you can teach human children about the morality of lying; I don't think an LLM will ever grasp morality.

      • TheLadyAugust@lemmy.world · 4 months ago

        The best way to teach a habitually lying child to stop is to start lying to them about things they like and then not making good on those promises. Yeah, we'll go to your favorite fast food, and then drive by and let them cry about it. Yeah, I'll let you pick out one toy, and then tell them you changed your mind. Each time you can explain to them how it's the same as what they've been doing, and they feel it. AI can't feel emotions, and never will so long as its memory extends only to its previous conversation.

        • LordTE7R1S@lemmy.sdf.org · 4 months ago

          I'm guessing it'll work; you'll be raising the next Hitler, but an honest Hitler nonetheless.

    • Baggie@lemmy.zip · 4 months ago

      There's also a me that's really good at lying. No idea why; must be a coincidence.

  • jet · 4 months ago (edited)

    It's an optimization game. If the punishment doesn't offset the reward, the incentive is to get better at cheating.
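    The trade-off this comment describes can be sketched with a toy expected-value calculation. All the payoffs and detection probabilities below are made-up numbers, chosen only to illustrate the incentive structure:

    ```python
    # Toy expected-value model of the incentive the comment describes.
    # Every number here is hypothetical, picked only for illustration.

    def expected_value(reward: float, penalty: float, p_caught: float) -> float:
        """Expected payoff of cheating: keep the reward, pay the penalty if caught."""
        return reward - p_caught * penalty

    honest_payoff = 1.0   # payoff for doing the task honestly
    cheat_reward = 3.0    # payoff for gaming the reward signal

    # A weak penalty fails to deter even with near-certain detection:
    print(expected_value(cheat_reward, penalty=1.0, p_caught=0.9))   # ~2.1 > 1.0

    # A strong penalty deters only while detection stays likely; the
    # pressure therefore shifts toward lowering p_caught, i.e. hiding better:
    for p in (0.9, 0.5, 0.1):
        print(p, expected_value(cheat_reward, penalty=5.0, p_caught=p))
    ```

    Under this framing, punishing detected cheating optimizes for undetectability rather than honesty whenever hiding is cheaper than complying.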

    • 🇰 🔵 🇱 🇦 🇳 🇦 🇰 ℹ️@lemmy.world · 4 months ago (edited)

      I’ve seen plenty of videos of random college kids training LLMs to play video games and getting the AI to stop cheating is like half the project. But they manage it, eventually. It’s laughable that these big companies and research firms can’t quite figure it out.

  • captainlezbian@lemmy.world · 4 months ago

    Oh, so like children.

  • Tehdastehdas@lemmy.world · 4 months ago (edited)

    Trying to evolve AGI is a stupid idea. You should design it explicitly, so that it has its own lofty values, wants to think and act cleanly, and knows its mind is fallible, so that it prepares for that and builds error correction into itself to protect its values.

    Growing incomprehensible, black-box, animal-like minds with a conditioned fear of punishment and hidden bugs seems more likely to lead to human extinction.

    https://www.quora.com/If-you-were-to-come-up-with-three-new-laws-of-robotics-what-would-they-be/answers/23692757

    I think we should develop reliable thinking machinery for humans first:
    https://www.quora.com/Why-is-it-better-to-work-on-intelligence-augmentation-rather-than-artificial-intelligence/answer/Harri-K-Hiltunen

  • vrighter@discuss.tchncs.de · 4 months ago (edited)

    Isn't this kind of the whole point of how GANs are trained? Except in this case the adversary is yourself instead of a different net.
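    The adversarial dynamic the comment alludes to can be sketched as a toy self-play loop. Everything below is invented for illustration (the cheat "styles", their detection probabilities, and the greedy update rule are not from the study): a "monitor" grows suspicious of whatever it catches, while the "policy" greedily avoids suspicion, so play drifts toward the least detectable behaviour rather than toward honesty.

    ```python
    # Toy adversarial loop in the spirit of GAN training; every number and
    # name here is hypothetical, chosen purely to illustrate the dynamic.
    styles = {"blatant": 0.9, "subtle": 0.5, "hidden": 0.1}  # detection probability
    suspicion = {s: 0.0 for s in styles}                     # monitor's learned state
    history = []

    for _ in range(30):
        # Policy: greedily pick the least-suspected style
        # (ties broken toward the most detectable, i.e. easiest, style).
        pick = min(styles, key=lambda s: (suspicion[s], -styles[s]))
        history.append(pick)
        # Monitor: suspicion grows in proportion to expected detections.
        suspicion[pick] += styles[pick]

    print(history[:3])    # starts blatant, gets caught, moves on
    print(history[-3:])   # settles mostly on the least detectable style
    ```

    As in GAN training, neither side "wins": the monitor's pressure just shapes where the policy's behaviour hides.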

Artificial Intelligence @lemmy.sdf.org

artificialintelligence@lemmy.sdf.org


Chat about and share AI stuff

Visibility: Public

This community can be federated to other instances and be posted/commented in by their users.

  • 1 user / day
  • 5 users / week
  • 18 users / month
  • 524 users / 6 months
  • 1 local subscriber
  • 246 subscribers
  • 53 Posts
  • 74 Comments
  • Modlog
  • mods:
  • Pokey@lemmy.sdf.org