_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
   URI   Representation Engineering (2024)
       
       
        mock-possum wrote 16 min ago:
        That last experiment, where the LLM with its honesty vector increased
        is tasked with judging whether a user asking an example question has
        honest intentions, is interesting. It looks like it doesn’t quite
        grasp the ask, and is instead just equivocating about the definition of
        ‘honest.’
        
         I wonder how a response with the ‘thoroughness’ vector turned up
         might have answered in that case - would it have pointed out that
         it’s impossible to know intention from words, because people can lie,
         but it’s possible to at least guess - and that even then, judging the
         honesty of intention could be interpreted several different ways?
       
        k__ wrote 3 hours 31 min ago:
        Somehow hidden state reminds me of DNA.
       
       