Hacker News on Gopher (inofficial)

COMMENT PAGE FOR:
WebSim, WorldSim and the Summer of Simulative AI

grfhjyffbnh wrote 6 hours 13 min ago:
> exploring the latent space of multiverses adjacent to ours

Poe's law strikes again. Are they serious? Is it satire? Pretty hilarious.

smusamashah wrote 12 hours 21 min ago:
[1] is the project's website being discussed in the article

[1]: https://websim.ai/

mlb_hn wrote 14 hours 31 min ago:
nice overview of progress over time. are there quant metrics for the sim capabilities or is it mostly vibes?

ClassicRob wrote 14 hours 0 min ago:
Cofounder of Websim here. Right now it's not clear that there's any eval for a language model's simulation capabilities. Internally, we've (vibe) tested Llama 3, Command R+, WizardLM 8x22b, Mistral Large (first version of Websim came out of a Mistral hackathon) and GPT-4 Turbo and found them all lacking, due to either meh website outputs or mode collapse from reinforcement learning (lack of creativity and flexibility).

That also may be a "skill issue" thing because our system prompt is very much optimized for Claude 3's "mind." We'll release functionality in the next week or two that lets users update the system prompt, in which case this may be less of an issue.

Claude 3 has a much broader latent space, and seems to "enjoy" imagining things. It hasn't been banged into too specific of an assistant shape, and doesn't suffer the same degree of "mode collapse" [1].

Even Sonnet produces mindblowingly good outputs ( [2] ). Haiku is capable of producing full websites with insightful and creative content, even if it isn't as capable as Sonnet/Opus.
For example, I found Curio, an esolang where every line of code is a living, sentient being with its own unique personality, memories, and goals, mostly by browsing around with Haiku ( [3] ). That said, Haiku tends to perform better when it is few-shot prompted with outputs from Sonnet or Opus earlier in the "browser history."

[1]: https://lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-m...
[2]: https://x.com/RobertHaisfield/status/1774579381132050696
[3]: https://x.com/RobertHaisfield/status/1782586807261233620

swyx wrote 14 hours 39 min ago:
author here! I absolutely enjoyed interviewing Joscha Bach, who was gracious enough to give 30 mins of his time with zero prep and no idea who I was. I also am in a unique position to report on the rise of both WorldSim and WebSim as I literally saw them both happen up close. questions welcome!

if you liked the ChatGPT Virtual Machine story from 2022: [1] you will like this.

if you enjoy behind the scenes, i live streamed the making of the video, audio, and essay last night with a few people on twitter/youtube [2]

comments and tough love welcome!

[1]: https://news.ycombinator.com/item?id=33847479
[2]: https://x.com/swyx/status/1784110650777854148

fjkdlsjflkds wrote 12 hours 52 min ago:
A quick comment: The idea seems interesting/entertaining, but the requirement to log in with a Google account will make some people (like me) simply not even try it.

ClassicRob wrote 12 hours 22 min ago:
Login with Google was just the quickest thing we could do to get auth; we'll roll out more ways to sign in soon. Thanks for the feedback!