_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   WebSim, WorldSim and the Summer of Simulative AI
       
       
        grfhjyffbnh wrote 6 hours 13 min ago:
        > exploring the latent space of multiverses adjacent to ours
        
        Poe’s law strikes again. Are they serious? Is it satire? Pretty
        hilarious.
       
        smusamashah wrote 12 hours 21 min ago:
        [1] is the website of the project discussed in the article
        
   URI  [1]: https://websim.ai/
       
        mlb_hn wrote 14 hours 31 min ago:
        nice overview of progress over time. are there quant metrics for the
        sim capabilities or is it mostly vibes?
       
          ClassicRob wrote 14 hours 0 min ago:
          Cofounder of Websim here. Right now it's not clear that there's any
          eval for a language model's simulation capabilities. Internally,
          we've (vibe) tested Llama 3, Command R+, WizardLM 8x22b, Mistral
          Large (first version of Websim came out of a Mistral hackathon) and
          GPT-4 Turbo and found them all lacking, due to either meh website
          outputs or mode collapse from reinforcement learning (lack of
          creativity and flexibility). That also may be a "skill issue" thing
          because our system prompt is very much optimized for Claude 3's
          "mind." We'll release functionality in the next week or two that lets
          users update the system prompt, in which case this may be less of an
          issue.
          
          Claude 3 has a much broader latent space, and seems to "enjoy"
          imagining things. It hasn’t been banged into too specific an
          assistant shape, and doesn’t suffer the same degree of “mode
          collapse” [1]. Even Sonnet produces mind-blowingly good outputs
          ([2]). Haiku is capable of producing full websites with insightful
          and creative content, even if it isn't as capable as Sonnet/Opus.
          For example, I found Curio, an esolang where every line of code is
          a living, sentient being with its own unique personality, memories,
          and goals, mostly by browsing around with Haiku ([3]). That said,
          Haiku tends to perform better when it is few-shot prompted with
          outputs from Sonnet or Opus earlier in the "browser history."
          
   URI    [1]: https://lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-m...
   URI    [2]: https://x.com/RobertHaisfield/status/1774579381132050696
   URI    [3]: https://x.com/RobertHaisfield/status/1782586807261233620
       
        swyx wrote 14 hours 39 min ago:
        author here! I absolutely enjoyed interviewing Joscha Bach, who was
        gracious enough to give 30 minutes of his time with zero prep and no
        idea who I was. I am also in a unique position to report on the rise
        of both WorldSim and WebSim, as I literally saw them both happen up
        close.
        questions welcome!
        
        if you liked the ChatGPT Virtual Machine story from 2022 [1], you
        will like this.
        
        if you enjoy behind-the-scenes content, i livestreamed the making of
        the video, audio, and essay last night with a few people on
        twitter/youtube [2]. comments and tough love welcome!
        
   URI  [1]: https://news.ycombinator.com/item?id=33847479
   URI  [2]: https://x.com/swyx/status/1784110650777854148
       
          fjkdlsjflkds wrote 12 hours 52 min ago:
          A quick comment: The idea seems interesting/entertaining, but the
          requirement to login with a Google account will make some people
          (like me) simply not even try it.
       
            ClassicRob wrote 12 hours 22 min ago:
            Login with Google was just the quickest way for us to get auth
            working; we'll roll out more ways to sign in soon. Thanks for the
            feedback!
       
       
   DIR <- back to front page