 _______                __                  _______
|   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
|       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
|___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|

on Gopher (unofficial)
URI Visit Hacker News on the Web

COMMENT PAGE FOR:
URI GPT-5

Razengan wrote 9 min ago:
I asked ChatGPT 5 about the main differences between 4 and 5, and it said:

"I couldn't find any credible, up-to-date details on a model officially named 'GPT-5' or formal comparisons to 'GPT-4o.' It's possible that GPT-5, if it exists, hasn't been announced publicly or covered in verifiable sources … GPT-5 as of August 8, 2025 has no formal release announcement"

Reassuring.

miroljub wrote 21 min ago:
Now it's a perfect time for DeepSeek to finally release R2.

alexnewman wrote 53 min ago:
What's the bullish case that it's actually a big deal? Not trying to be a neg, but it seems pretty incremental at first glance.

energy123 wrote 51 min ago:
We'd need visibility on compute costs. If it's 30% cheaper than o3 but slightly better, that's a large improvement in just 4 months.

energy123 wrote 54 min ago:
> "If you're on Plus or Team, you can also manually select the GPT-5-Thinking model from the model picker with a usage limit of up to 200 messages per week."

And what's the reasoning effort parameter set to?

ritzaco wrote 57 min ago:
OK, this [0] sounds very, uh, bold to me? Surely this is going to break a ton of workflows, seemingly with nearly no notice? I'm assuming 'launches' equates with 'fully rolls out' or something, but it's not that clear to me.

When GPT-5 launches, several older models will be retired, including:
- GPT-4o
- GPT-4.1
- GPT-4.5
- GPT-4.1-mini
- o4-mini
- o4-mini-high
- o3
- o3-pro

If you open a conversation that used one of these models, ChatGPT will automatically switch it to the closest GPT-5 equivalent.
Chats with 4o, 4.1, 4.5, 4.1-mini, o4-mini, or o4-mini-high will open in GPT-5, chats with o3 will open in GPT-5-Thinking, and chats with o3-Pro will open in GPT-5-Pro (available only on Pro and Team).

[0] URI
[1]: https://help.openai.com/en/articles/11909943-gpt-5-in-chatgpt

alexmorley wrote 48 min ago:
> For Free and Plus users, these changes take effect immediately. Pro, Team, and Enterprise users will also see the changes at launch but will have access to older models through legacy model settings.

So only for free/plus users (for now). I do wonder how long they will take to deprecate these models via the API, though...

BoorishBears wrote 18 min ago:
So they confirmed what we've all been speculating: this is a cost-saving update. Smaller base models + more RL. Technically better at the verticals that are making money, but worse on subjective preference.

They'll probably try to prompt-engineer back in some of the "vibes", hence the personalities. But also maybe they decided people spending $20 a month to hammer 4o all day as a friend (no judgement, really) are OK to tick off for now... and judging by Reddit, they are very ticked off.

weird-eye-issue wrote 31 min ago:
I'm not worried about when they will deprecate them, but I am worried about when they will be removed. 3.5 Turbo has been deprecated for a long time but is still running.

raincole wrote 49 min ago:
> For Free and Plus users, these changes take effect immediately. Pro, Team, and Enterprise users will also see the changes at launch but will have access to older models through legacy model settings.

It's in the very next paragraph...

artursapek wrote 53 min ago:
Yeah, I was surprised how fast they rugged 4. I guess they want to concentrate their hardware on 5.

hoppp wrote 42 min ago:
If it costs the same compute to run, then there is no point running worse models.

boringg wrote 40 min ago:
That's assuming all else holds on the model, which isn't always clear.

monster_truck wrote 58 min ago:
I'm extremely whelmed.
I cancelled my subscription.

junon wrote 1 hour 4 min ago:
Anecdotal review: been using it all morning. Had to switch back to 4. 5 has all of the problems that 2/3 had with ignoring any context, flagrantly ignoring the 'spirit' of my requests, and talking to me like I'm a little baby. Not to mention almost all of my prompts result in a several-minute wait with "thinking longer about the answer".

getcrunk wrote 47 min ago:
Yeah, I see this a lot with Gemini since 2.5. Very stubborn and 'opinionated'. I think most models will tend this way (to consolidate more control over how we 'think' and what we believe).

jsumrall wrote 1 hour 42 min ago:
It seems 'GPT-5 Pro' is not available via the API.

Applejinx wrote 1 hour 50 min ago:
I am very puzzled that I cannot search for the word 'blueberry' in this HN discussion. Is my browser broken, or is the subject inappropriate to raise in this community?

nodesocket wrote 2 hours 36 min ago:
Why do I have access to GPT-5 on only some of my devices? All logged into my Plus account. My iPad ChatGPT shows 5, but my iPhone ChatGPT only allows 4o.

withinboredom wrote 1 hour 32 min ago:
The rollout is probably not user-specific, but device-specific. Classic rookie mistake.

nodesocket wrote 17 min ago:
Ya, strange rollout. My browser session, which I use by far the most with ChatGPT, is also still stuck on 4o.

primaprashant wrote 2 hours 46 min ago:
Created a summary of comments from this thread about 15 hours after it had been posted, when it had 1983 comments, using gpt-5-high and gemini-2.5-pro with a prompt similar to simonw's [1]. Used a Python script [2] that I wrote to generate the summary.

- gpt-5-high summary: [1]
- gemini-2.5-pro summary: [2]

[1]: [3]
[2]: URI
[1]: https://gist.github.com/primaprashant/1775eb97537362b049d643ea...
[2]: https://gist.github.com/primaprashant/4d22df9735a1541263c67115...
[3]: https://news.ycombinator.com/item?id=43477622
[4]: https://gist.github.com/primaprashant/f181ed685ae563fd06c49d3d...
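For reference, the kind of script described above can be sketched in a few lines. This is a hypothetical reconstruction, not the commenter's actual script (that is linked in the gists), and the prompt wording and function names are invented:

```python
# Sketch of a thread-summarization prompt builder (hypothetical; the
# commenter's actual script is linked in the gists above).
def build_summary_prompt(comments):
    """Flatten (author, text) pairs into one prompt for an LLM API call."""
    body = "\n\n".join(f"[{author}] {text}" for author, text in comments)
    return (
        "Summarize the key themes and disagreements in the following "
        "Hacker News comments:\n\n" + body
    )

thread = [("alice", "GPT-5 feels incremental."),
          ("bob", "The pricing is the real story.")]
prompt = build_summary_prompt(thread)
# The resulting string would then be sent via an API client; the network
# call is omitted here to keep the sketch self-contained.
print(prompt.splitlines()[0])
```

The real script would fetch the comments first (e.g. from the HN API) and then make the model call with this prompt.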
jiggawatts wrote 2 hours 24 min ago:
Wow, the 2.5 Pro summary is far better; it reads like coherent English instead of a list of bullet points.

primaprashant wrote 2 hours 16 min ago:
Yes, agreed. Context length might be playing a factor, as the total number of prompt tokens is >120k. Performance of LLMs generally degrades at higher context lengths.

mustaphah wrote 2 hours 19 min ago:
Someone should start a Gemini-powered blog that distills the top HN posts into concise summaries.

mustaphah wrote 2 hours 26 min ago:
Why not use the ChatGPT interface instead of the API to save credits? Pass the cookies.

primaprashant wrote 2 hours 14 min ago:
Only have access to GPT-5 through the API for now. The amount of tokens used (>130k) is higher than the limit of ChatGPT (128k), so it wouldn't really work well.

froh42 wrote 3 hours 14 min ago:
Wow, I just got GPT-5. Tried to continue the discussion of my 3D print problems with it (which I started with 4o). In comparison, GPT-5 is an entitled prick trying to gaslight me into following what it wants. Can I have 4o back?

withinboredom wrote 57 min ago:
If we're going to be forced to trust a new model, might as well evaluate other companies as well to make a decision before my plan renews.

lynx97 wrote 3 hours 26 min ago:
Not impressed. gpt-5-nano gives noticeably worse results than o4-mini does. gpt-5 and gpt-5-mini are both behind the verification wall, and can stay there if they like.

reportgunner wrote 3 hours 50 min ago:
First OpenAI video I've ever seen; the people in it all seem incompetent for some reason, like a grotesque version of Apple employees from Temu or something.

nacholibrev wrote 3 hours 50 min ago:
I've tried it in Cursor and I didn't like it. claude-4-sonnet gives me far better results. It's also a lot slower than the Claude and Google models. In general, GPT models don't work well for me for both coding and general questions.
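The context-limit arithmetic above can be checked with a rough rule of thumb of about four characters per token for English text. This is an approximation, not the model's actual tokenizer:

```python
# Rough token estimate to decide whether a thread fits in a context window.
# The 4-chars-per-token ratio is a common English-text approximation, not
# the tokenizer OpenAI actually uses, so treat the result as a guess.
def fits_in_context(text: str, limit_tokens: int = 128_000) -> bool:
    estimated_tokens = len(text) // 4
    return estimated_tokens <= limit_tokens

# A ~520k-character thread estimates to ~130k tokens, over the 128k limit,
# which matches the ">130k tokens" situation described in the comment.
big_thread = "x" * 520_000
print(fits_in_context(big_thread))  # False
```

For a precise count you would run the text through the real tokenizer instead of this heuristic.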
energy123 wrote 1 hour 46 min ago:
On livebench.ai, GPT-5 is the best model overall and the second best for agentic coding. But for the Coding benchmarks, it's ranked something like 20th. Quite interesting. I'm finding it exceptional for non-trivial summarization tasks.

tennisflyi wrote 4 hours 7 min ago:
How/where do I see my chat history!?

kiitos wrote 4 hours 37 min ago:
Absolutely miserable results as an agent in my IDE.

fergie wrote 5 hours 0 min ago:
Anecdote: it can now speak in various Scots dialects; for example, it can convincingly create a passage in the style of Irvine Welsh. It can also speak Doric (Aberdonian). Before, it came nowhere close.

tw1984 wrote 5 hours 7 min ago:
Just wondering whether Altman is still going to promote his AGI/ASI-coming-in-12-months story.

RobinL wrote 6 hours 2 min ago:
Hypothesis: to the average user this will feel like a much greater jump in capability than to the average HNer, because most users were not using the model selector. So it'll be more successful than the benchmarks suggest.

danjc wrote 6 hours 26 min ago:
> describe gpt 5 in one word

> incremental

tapland wrote 7 hours 16 min ago:
Ugh. Could they have their experts make a website that doesn't crash Safari on my iPhone SE? :)

saddat wrote 7 hours 44 min ago:
If Grok, Claude, and ChatGPT seemingly all still scale, yet their performance feels similar, could this mean that the technology path is narrow, with little differentiation left?

zone411 wrote 7 hours 54 min ago:
It is the new leader on my Short Story Creative Writing benchmark:
URI [1]: https://github.com/lechmazur/writing/

throwpoaster wrote 8 hours 11 min ago:
I've been working on an electrochemistry project with several models, but mostly o3-pro. GPT-5 refused to continue the conversation because it was worried about potential weapons applications, so we gave the business to the other models. Disappointing.
zastai0day wrote 8 hours 32 min ago:
People all over the world are talking about GPT-5; the competition is so intense that every major tech company is racing to develop its own advanced AI models.

alenguo wrote 8 hours 40 min ago:
I've already used it.

lutusp wrote 8 hours 42 min ago:
I have a canonical test for chatbots -- I ask them who I am. I'm sufficiently unknown in modern times that it's a fair test. Just ask, "Who is Paul Lutus?"

ChatGPT 5's reply is mostly made up -- about 80% is pure invention. I'm described as having written books and articles whose titles I don't even recognize, or having accomplished things at odds with what was once called reality.

But things are slowly improving. In past ChatGPT versions I was described as having been dead for a decade.

I'm waiting for the day when, instead of hallucinating, a chatbot will reply, "I have no idea." I propose a new technical litmus test -- chatbots should be judged based on what they won't say.

kkukshtel wrote 8 hours 51 min ago:
Something that's really hitting me is something brought up in this piece: [1]

When a model comes out, I usually think about it in terms of my own use. This is largely agentic tooling, and I mostly use Claude Code. All the hallucination and eval talk doesn't really catch me, because I feel like I'm getting value out of these tools today.

However, this model is not _for_ me in the way models normally are. This is for the 800m or whatever people that open up ChatGPT every day and type stuff in. All of them have been stuck on GPT-4o, unbeknownst to them. They had no idea SOTA was far beyond that. They probably don't even know that there is a "model" at all. But for all these people, they just got a MAJOR upgrade. It will probably feel like turning the lights on for these people, who have been using a subpar model for the past year.

That said, I'm also giving GPT-5 a run in Codex and it's doing a pretty good job!
URI [1]: https://www.interconnects.ai/p/gpt-5-and-bending-the-arc-of-pr...

m3kw9 wrote 7 hours 7 min ago:
Free users will get the gpt5 nano.

techpineapple wrote 8 hours 41 min ago:
I'm curious what this means. Maybe I'm stupid, but I read through the sample GPT-4 vs. GPT-5 answers and I largely couldn't tell the difference, and sometimes preferred the GPT-4 answer. But what are the average 800 million people using this for such that the average user will be able to see a difference? Maybe I'm a far-below-average user? But I can't tell the difference between models in casual use. Unless you're talking performance; apparently GPT-5 is much faster.

MagicMoonlight wrote 1 hour 40 min ago:
4o would start writing immediately, without thinking. So if the first thing it wrote was "The world is flat because…" then it would continue to write as if the world is flat. It makes it very stupid, but very compliant. If you're mentally ill, it will go along with whatever delusions you have, without any objection.

deathflute wrote 9 hours 18 min ago:
Lots of debate here about the best model. The best model is the one which creates the most value for you; this is typically a function of your skill in using the model for tasks that matter to you. Always was. Always will be.

obloid wrote 9 hours 27 min ago:
So far GPT-5 has not been able to pass my personal "Turing test", which has been unsuccessful for the past several years, starting through various versions of DALL-E up to the latest model. I want it to create an image of Santa Claus pulling the sleigh, with a reindeer in the sleigh holding the reins, driving the sleigh. No matter how I modify the prompt, it is still unable to create this image that my daughter requested a few years ago. This is an image that is easily imagined and drawn by a small child, yet the most advanced AI models still can't produce it.
I think this is a good example that these models are unable to "imagine" something that falls outside the realm of their training data.

energy123 wrote 6 hours 19 min ago:
Is GPT-5 not just routing this request to a 4o/other tool call?

ramzyo wrote 9 hours 20 min ago:
Is this what you mean?
URI [1]: https://chatgpt.com/share/6895632c-fb58-800e-b287-b7a98ad64d...

obloid wrote 9 hours 9 min ago:
Interesting. Yes, that's basically what I've been going for, but none of my prompts ever gave a satisfactory response. Plus I noticed you just copy/pasted from my initial comment and it worked. Weird. After my last post I was eventually able to get it to work by uploading an example image of Santa pulling the sleigh and telling it to use the image as an example, but I couldn't get it by text prompt alone. I guess I need to work on my prompt skills!
URI [1]: https://chatgpt.com/share/689564d1-90c8-8007-b10c-8058c149...

simultsop wrote 9 hours 10 min ago:
That was smooth.

beardedwizard wrote 10 hours 39 min ago:
I asked it how to run the image and expose a port. It was just terrible in Cursor: thought a Dockerfile wasn't in the repo, called no tools, then hallucinated a novel on Dockerfile best practices.

cellis wrote 10 hours 44 min ago:
My favorite thing to ask is ASCII art:

                              _ _
 _ __   ___  _ __ ___   __ _  __| (_) ___
| '_ \ / _ \| '_ ` _ \ / _` |/ _` | |/ __|
| | | | (_) | | | | | | (_| | (_| | | (__
|_| |_|\___/|_| |_| |_|\__,_|\__,_|_|\___|

What does this say?

GPT-5: When read normally without the ASCII art spacing, it's the stylized text for:

                              _ _
 _ __   ___  _ __ ___   __ _  __| (_) ___
| '_ \ / _ \| '_ ` _ \ / _` |/ _` | |/ __|
| | | | (_) | | | | | | (_| | (_| | | (__
|_| |_|\___/|_| |_| |_|\__,_|\__,_|_|\___|

Which is the ASCII art for: rust - the default "Rust" welcome banner in ASCII style.
zombiwoof wrote 10 hours 51 min ago:
Given that most human intelligence isn't that smart, AGI doesn't seem hard.

anshumankmr wrote 10 hours 55 min ago:
I miss the model picker… is that just me?

sbinnee wrote 11 hours 0 min ago:
I didn't know that OpenAI added what they call an organization verification process for API calls for some models. While I haven't noticed this change at work using OpenAI models, when I wanted to try GPT-5 on my personal laptop, I came across this obnoxious verification issue. It seems it's all because users can get thinking traces from API calls, and OpenAI wants to prevent other companies from distilling their models. Although I don't think OpenAI will be threatened by a single user from Korea, I don't want to go through this process for many reasons. But who knows, this kind of verification process may become the norm, and users will have no way to use frontier models. "If you want to use the most advanced AI models, verify yourself so that we can track you down when something bad happens." Is that what they are saying?

piskov wrote 10 hours 56 min ago:
It started with the o-models in the API.

Aeolun wrote 11 hours 7 min ago:
I'm just sitting here hoping that their lowered prices will force Anthropic to follow suit xD

felixfurtak wrote 11 hours 35 min ago:
It's still terrible at Wordle. This is one of my benchmarks.

Telemakhos wrote 12 hours 8 min ago:
I am thoroughly unimpressed by GPT-5. It still can't compose iambic trimeters in ancient Greek with a proper penthemimeral cæsura, and it insists on providing totally incorrect scansion of the flawed lines it does compose. I corrected its metrical sins twice, which sent it into "thinking" mode until it finally returned a "Reasoning failed" error. There is no intelligence here: it's still just giving plausible output. That's why it can't metrically scan its own lines or put a cæsura in the right place.
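For what it's worth, a Wordle benchmark mostly tests whether a model can apply the game's feedback rules consistently; the scoring rule itself is tiny. A toy sketch of those rules (not the commenter's actual harness):

```python
# Toy Wordle feedback scorer: G = right letter/right spot, Y = right
# letter/wrong spot, . = absent. Duplicates are handled the way the game
# does it: greens consume letters first, then yellows take what's left.
from collections import Counter

def wordle_feedback(guess: str, answer: str) -> str:
    result = ["."] * len(guess)
    remaining = Counter(answer)
    # First pass: exact matches consume letters from the answer.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
            remaining[g] -= 1
    # Second pass: misplaced letters, limited by what's left.
    for i, g in enumerate(guess):
        if result[i] == "." and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1
    return "".join(result)

print(wordle_feedback("crane", "crate"))  # GGG.G
print(wordle_feedback("eerie", "crane"))  # ..Y.G
```

The benchmark question is whether the model, given this feedback over several turns, converges on the answer the way a human player does.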
tim333 wrote 3 hours 22 min ago:
I too can't compose iambic trimeters in ancient Greek, but am normally regarded as of average+ intelligence. I think it's a bit of an unfair test, as that sort of thing is based on the rhythm of spoken speech, and GPT-5 doesn't really deal with audio in a deep way.

Telemakhos wrote 3 hours 13 min ago:
Most classicists today can't actually speak Latin or Greek, especially observing vowel quantities and rhythm properly, but you'd be hard pressed to find one who can't scan poetry with pen and paper. It's a very simple application of rules to written characters on a page, but it is application, and AI still doesn't apply concepts well.

xhevahir wrote 6 hours 43 min ago:
I can't tell whether you're serious or not. Your criterion for an "impressive" AI tool is that it be able to write and scan poetry in ancient Greek?

Telemakhos wrote 3 hours 23 min ago:
AI looks like it understands things because it generates text that sounds plausible. Poetry requires the application of certain rules to that text, and the rules for Latin and Greek poetry are very simple and well understood. Scansion is especially easy once you understand the concept, and you actually can, as someone else suggested, train a child to scan poetry by applying these rules. An LLM will spit out what looks like poetry, but will violate certain rules. It will generate some hexameters but fail harder on trimeter, presumably because it is trained on more hexametric data (epic poetry: think Homer) than trimetric (iambic and tragedy, where it's mixed with other meters). It is trained on text containing the rules for poetry too, so it can regurgitate rules like defining a penthemimeral cæsura. But LLMs do not understand those rules and thus cannot apply them as a child could. That makes ancient poetry a great way to show how far LLMs are from actually performing simple, rules-based analysis, and how badly they hide that lack of understanding by BS-ing.
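To illustrate the "rules applied to written characters" point: one core scansion rule classifies a syllable as heavy if its vowel is long or the syllable is closed by a consonant. A deliberately oversimplified sketch on pre-syllabified input (the notation and helper are invented for illustration; real scansion needs many more rules, such as elision, diphthongs, and muta cum liquida):

```python
# Oversimplified syllable-weight rule: heavy if the syllable contains a
# marked long vowel ("_" after the vowel stands in for a macron here) or
# ends in a consonant (a closed syllable). This is a teaching sketch of
# one rule, not a working scansion tool.
VOWELS = set("aeiouy")

def syllable_weight(syllable: str) -> str:
    if "_" in syllable:             # marked long vowel
        return "heavy"
    if syllable[-1] not in VOWELS:  # closed syllable
        return "heavy"
    return "light"

# "arma" split as ar-ma: "ar" is closed (heavy), "ma" is open and short.
print([syllable_weight(s) for s in ["ar", "ma"]])  # ['heavy', 'light']
```

The point of the comment stands either way: the rule is mechanical enough to write down in a dozen lines, which is exactly why failing to apply it is telling.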
BoorishBears wrote 1 hour 55 min ago:
This is not a useful diversion; it's like arguing about whether a submarine swims. LLMs are simple: it doesn't take much more than high school math to explain their building blocks. What's interesting is that they can remix tasks they've been trained on very flexibly, creating new combinations they weren't directly trained on; compare this to earlier, smaller models like T5 that had a few set prefixes per task. They have underlying flaws. Your example is more about the limitations of tokens than "understanding", for example. But those don't keep them from being useful.

sibeliuss wrote 7 hours 47 min ago:
Pure failure:

"You've given:
Moon in the 10th house (from the natal Ascendant)
Venus in the 1st house (from the natal Ascendant)

Step-by-step:
From the natal Ascendant's perspective: Moon = 10th house, Venus = 1st house.
Set Moon as the 1st house (Chandra Lagna): the natal 10th house becomes the 1st house in the Chandra chart. Therefore, the natal 1st house is 3rd house from the Moon:
10th → 1st (Moon)
11th → 2nd
12th → 3rd (which is the natal 1st)
Locate Venus from the Moon's perspective: since Venus is in the natal 1st, and the natal 1st is 3rd from the Moon, Venus is in the 3rd house from Chandra Lagna.

Answer: From Chandra Lagna, Venus is in the 3rd house."

ipnon wrote 8 hours 29 min ago:
This is a great test because it's something you could teach an elementary school kid in an hour.

Davidzheng wrote 8 hours 20 min ago:
Is this a joke?

Telemakhos wrote 3 hours 18 min ago:
No, it's easy if the kid already knows the alphabet. Latin scansion was standard grade school material up until the twentieth century. Greek less so, but the rules for it are very clear-cut and well understood. An LLM will regurgitate the rules to you in any language you want, but it cannot actually apply the rules properly.
Davidzheng wrote 3 hours 10 min ago:
Is ancient Greek similar enough to modern-day Greek that an elementary school kid could learn to compose anything not boilerplate in an hour? Also, do you know that if you fed the same training material you need to train the kid in an hour into the LLM, it couldn't do it?

taylorlapeyre wrote 10 hours 54 min ago:
It once again completely fails on an extremely simple test: look at a screenshot of sheet music, and tell me what the notes are. Producing a MIDI file for it (unsurprisingly) was far beyond its capabilities. [1] This is not anywhere remotely close to general intelligence.
URI [1]: https://chatgpt.com/share/68954c9e-2f70-8000-99b9-b4abd69d1a...

adrianh wrote 5 hours 22 min ago:
Interpreting sheet music images is very complex, and I'm not surprised general-purpose LLMs totally fail at it. It's orders of magnitude harder than text OCR, due to the two-dimensionality. For much better results, use a custom-trained model like the one at Soundslice:
URI [1]: https://www.soundslice.com/sheet-music-scanner/

gnulinux wrote 12 hours 17 min ago:
My first impressions: not impressed at all. I tried using this for my daily tasks today, and for writing it was very poor. For this task o3 was much better. I'm not planning on using this model in the upcoming days; I'll keep using Gemini 2.5 Pro, Claude Sonnet, and o3.

mkoubaa wrote 12 hours 23 min ago:
HyPeRbOlIc SiNgUlArItY

epistemovault wrote 12 hours 34 min ago:
If AGI really arrives, will it run the world, or just binge Netflix and complain about being tired like the rest of us?

throw03172019 wrote 12 hours 41 min ago:
Has anyone figured out how to not be forced to use GPT-5 in ChatGPT?

Jordan-117 wrote 12 hours 37 min ago:
They said they deprecated all their older models.
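The deprecation discussed here maps old chats onto GPT-5 variants, per the help-page text quoted near the top of the thread. As a lookup table (a sketch of the described mapping, not OpenAI's code):

```python
# The chat remapping described on OpenAI's help page, as quoted earlier in
# the thread, written as a plain lookup. A sketch, not OpenAI's code.
GPT5_EQUIVALENT = {
    "GPT-4o": "GPT-5",
    "GPT-4.1": "GPT-5",
    "GPT-4.5": "GPT-5",
    "GPT-4.1-mini": "GPT-5",
    "o4-mini": "GPT-5",
    "o4-mini-high": "GPT-5",
    "o3": "GPT-5-Thinking",
    "o3-pro": "GPT-5-Pro",  # available only on Pro and Team
}

def closest_gpt5(model: str) -> str:
    """Return the GPT-5 model an old chat would reopen in."""
    return GPT5_EQUIVALENT.get(model, "GPT-5")

print(closest_gpt5("o3"))  # GPT-5-Thinking
```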
w10-1 wrote 12 hours 50 min ago:
> a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent

I'd love to see the factors considered in the algorithm for system-1 vs. system-2 thinking. Is "complexity" the factor that says "hard problem"? Because it's often not the complexity that makes it hard.

adammarples wrote 12 hours 51 min ago:
Which is bigger, 9.9 or 9.11? Well, it insta-failed my first test question.

gtirloni wrote 13 hours 25 min ago:
If they ever wanted to IPO, maybe now is not the best time.

zone411 wrote 13 hours 36 min ago:
GPT-5 set a new record on my Confabulations on Provided Texts benchmark:
URI [1]: https://github.com/lechmazur/confabulations/

metzpapa wrote 13 hours 17 min ago:
For how much I've seen it pushed that this model has lower hallucination rates, it's quite odd that every actual test I've seen says the opposite.

meribold wrote 13 hours 39 min ago:
Sad to see GPT-4.5 gone. It knew things. More than any other model I'm aware of.

mrits wrote 13 hours 37 min ago:
I can't imagine anyone leaving this comment besides GPT-4.5.

jdelman wrote 13 hours 48 min ago:
Whenever OpenAI releases a new ChatGPT feature or model, it's always a crapshoot when you'll actually be able to use it. The headlines - both from tech media coverage and OpenAI itself - always read "now available", but then I go to ChatGPT (and I'm a paid Pro user) and it's not available yet. As an engineer I understand rollouts, but maybe don't say it's generally available when it's not?

replwoacause wrote 5 hours 26 min ago:
Weird. I got it immediately. I actually found out about it when I opened the app and saw it and thought "oh, a new model just dropped, better go check YT for the video", which had just been uploaded. And I'm just a Plus user.

andrelaszlo wrote 11 hours 42 min ago:
I asked GPT about it:

> You are using the newest model OpenAI offers to the public (GPT-4o).
There is no "GPT-5" model accessible yet, despite the splashy headlines.

h4ch1 wrote 7 hours 30 min ago:
I can use it with the GitHub Copilot Pro plan.

zone411 wrote 13 hours 50 min ago:
On the Extended NYT Connections benchmark, GPT-5 Medium Reasoning scores close to o3 Medium Reasoning, and GPT-5 Mini Medium Reasoning scores close to o4-Mini Medium Reasoning:
URI [1]: https://github.com/lechmazur/nyt-connections/

UrineSqueegee wrote 13 hours 54 min ago:
Pretty underwhelming results so far for me.

revskill wrote 14 hours 25 min ago:
How do people actually work without AI models???

psyclobe wrote 14 hours 30 min ago:
Claude Opus 4 has changed my workflow; never going back.

SV_BubbleTime wrote 6 hours 55 min ago:
It would have been very difficult to convince me 6 months ago that I would be happy to pay $100 for an AI service. Here we are.

joshmlewis wrote 14 hours 31 min ago:
It's a really good model from my testing so far. You can see the difference in how it tries to use tools to the greatest extent when answering a question, especially compared to 4.1 and o3. In this example it used 6! tool calls in the first response to try and collect as much info as possible.
URI [1]: https://promptslice.com/share/b-2ap_rfjeJgIQsG

mustaphah wrote 3 hours 10 min ago:
Is there any value in using XML elements to guide the model instead of simple text (e.g., "Recommendation criteria:")?

flexagoon wrote 2 hours 52 min ago:
XML tags generally help models understand prompts better. That's how most official system prompts are written and what the Anthropic prompting guide says.

Zone3513 wrote 13 hours 58 min ago:
That movie doesn't even exist. There is no Thunder Run from 2025.

joshmlewis wrote 13 hours 57 min ago:
The data is made up; the point is to see how models respond to the same input/scenario. You're able to create whatever tools you want and import real data, or it'll generate fake tool responses for you based on the prompt and tool definition.
Disclaimer: I made PromptSlice for creating and comparing prompts, tools, and models.

hollownobody wrote 14 hours 11 min ago:
720 tool calls? Amazing!

joshmlewis wrote 14 hours 2 min ago:
Where'd you get 720 from?

brian626 wrote 13 hours 56 min ago:
Math pun… 6! = factorial(6) = 720

joshmlewis wrote 13 hours 54 min ago:
Whoosh, it went right over my head.

terhechte wrote 13 hours 58 min ago:
The _6!_

hahahacorn wrote 14 hours 36 min ago:
Anecdotally, as someone who operates in a very large legacy codebase, I am very impressed by GPT-5's agentic abilities so far. I've given it the same tasks I've given Claude and previous iterations via the Codex CLI, and instead of getting lost due to the massive scope of the problem, it correctly identifies the large scope, breaks it down into its constituent parts, creates the correct plan, and begins executing. I am wildly impressed. I do not believe that the 0.x% increases in benchmarks tell the story of this release at all.

bn-l wrote 4 hours 16 min ago:
What are you using to run it?

gwd wrote 14 hours 28 min ago:
I'm a solo founder. I fed it a fairly large "context doc" for the core technology of my company, the current state of things, and the business strategy, mostly generated with the help of Claude 4, and asked it what it thought. It came back with a massive list of detailed ambiguities and inconsistencies -- very direct and detailed. The only praise was the first sentence of the feedback: "The core idea is sound and well-differentiated." It's got quite a different feel so far.

6ai wrote 14 hours 38 min ago:
Shall we say … ASI is here???

ElijahLynn wrote 14 hours 38 min ago:
OpenAI is the new Google.

TechDebtDevin wrote 14 hours 52 min ago:
Gemini Flash is about 100x better at using my browser than GPT-5, lmfao.

gigatexal wrote 14 hours 54 min ago:
I, for one, am totally here for the autocomplete revolution. Hundreds of billions of dollars spent to make autocomplete better. Cool.
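On the XML-tags question a few comments up: the pattern is simply wrapping prompt sections in named tags so the model can tell instructions from data. A minimal sketch (the tag names here are invented for illustration, not taken from any official guide):

```python
# Minimal XML-tagged prompt builder. The specific tag names are invented;
# the general pattern (named tags delimiting prompt sections) is what
# prompting guides recommend over bare headings like "Criteria:".
def xml_prompt(criteria: str, candidates: list) -> str:
    items = "\n".join(f"  <item>{c}</item>" for c in candidates)
    return (
        "<recommendation_criteria>\n"
        f"{criteria}\n"
        "</recommendation_criteria>\n"
        "<candidates>\n"
        f"{items}\n"
        "</candidates>"
    )

p = xml_prompt("Prefer sci-fi released after 2000.", ["Arrival", "Heat"])
print(p)
```

Tags make the section boundaries unambiguous even when the enclosed text itself contains colons, headings, or lists.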
mafro wrote 14 hours 58 min ago:
One reason for this release is surely to fix their mess of product line-up naming. How many people are going to understand (or remember) the difference between:

- GPT-4o
- GPT-4.1
- o3
- o4
- ...

Anthropic and Google have much better-named products for the market.

semiinfinitely wrote 15 hours 5 min ago:
I'm just glad that I don't have to switch between models any more. For me that's a huge ease-of-use improvement.

agnosticmantis wrote 15 hours 11 min ago:
Unless the whole presentation was generated using sora-gpt-5 or something, this was very underwhelming. We know for a fact the slides/charts were generated using an LLM, so the hypothesis is not totally unfounded. /s

andrewinardeer wrote 15 hours 17 min ago:
Every release of every SOTA model is the same:

"It's like having a bunch of experts at your fingertips"
"Our most capable model ever"
"Complex reasoning and chain of thought"