Alignment faking in large language models (www.anthropic.com)
Posted by 🃏Joker@sh.itjust.works to Technology@lemmy.world · English · 2 months ago · 12 comments
eleitl@lemm.ee · English · 2 months ago
So you mean “alignment with human expectations”. Not what I was meaning at all. Good that that word doesn’t even mean anything specific these days.