_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (unofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Large language models, small labor market effects [pdf]
       
       
        credit_guy wrote 18 hours 33 min ago:
        In the early days of computers most scientists kept using slide rules.
       
        kazinator wrote 18 hours 51 min ago:
        Economists who write LaTeX docs are scary, even with AI help.
       
        trod1234 wrote 21 hours 44 min ago:
        We seriously live in the world of Anathem now, where apparently most
        people need a specialized expert to cut through plausible-sounding
        generated misinformation.
        
        This is the second similar study I've seen today on HN that seems in
        part generated by AI, lacks rigorous methodology, and draws unfounded
        conclusions, seemingly to fuel a narrative.
        
        The study fails to account for a number of elements which nullify the
        conclusions as a whole.
        
        AI chatbot tasks are by their nature communication tasks involving a
        third party (the customer). When the chatbot fails to direct the
        conversation, or loops coercively (a task computers really can't do
        well), customers get enraged because it results in crazy-making
        behavior. The chatbot in such cases imposes a time cost, with all the
        elements needed to call it torture: isolation, cognitive dissonance,
        coercion with perceived or real loss, and lack of agency. There is
        little if any differentiation between the tasks measured. Emotions
        Kill [1].
        
        This results in outcomes where there is no change, or even higher
        demand for workers, just to calm that person down, and this holds
        regardless of occupation. In other words, the CSR becomes the punching
        bag for verbal hostility, receiving calls or communications from
        irrationally enraged customers after the AI has had its first chance
        to wind them up.
        
        It is a stochastic environment, and very few conclusions can actually
        be supported because they seem to follow reasoning along a null
        hypothesis.
        
        The surveys use Denmark as an example (being part of the EU), but it's
        unclear whether they properly account for company policies against
        submitting certain private data to a US-based LLM, given the risks
        related to GDPR. They say the surveys were sent directly to workers
        who are already employed, but the study measures neither displaced
        workers nor overall job reductions, which historically is how such
        integrations are adopted, misleading the non-domain-expert reader.
        
        The paper does not appear to be sound, and given that it relies solely
        on a difference-in-differences (DiD) approach without specifying
        alternatives, it may be pushing a pre-fabricated narrative that AI
        won't disrupt the workforce when the study doesn't actually support
        that in any meaningful way.
        
        This isn't how you do good science. Overgeneralizing is a fallacy, and
        while some computation is being done to limit that, it doesn't touch
        on what you don't know, because what you don't know hasn't been
        quantified (i.e. the streetlight effect) [1].
        
        To understand this, layman and expert alike must always pay attention
        to what they don't know. The video below touches on some of the issues
        without requiring technical expertise: [1] [Talk] Survival Heuristics:
        My Favorite Techniques for Avoiding Intelligence Traps - SANS CTI
        Summit 2018
        
   URI  [1]: https://www.youtube.com/watch?v=kNv2PlqmsAc
       
        Legend2440 wrote 1 day ago:
        Seems premature, like measuring the economic impact of the internet in
        1985.
        
        LLMs are more tech demo than product right now, and it could take many
        years for their full impact to become apparent.
       
          layoric wrote 23 hours 3 min ago:
          A big difference here is the sheer scale of investment. In 1985, the
          internet was running on the dreams of a few. The sheer depth of
          investment in "AI" currently is hard to fathom, and being injected
          into everything regardless of what customers want.
       
            no_wizard wrote 19 hours 57 min ago:
            While ARPANET itself reportedly cost somewhere between 10 and 20
            million USD, which is relatively cheap, the precursor research
            that allowed the internet to take off - directed more at general
            computing and advanced computer networks, plus the
            telecommunications investments - cost many billions of dollars.
            The biggest difference is that it was mostly public money.
            
            That said, private companies are pumping a lot of money into this
            space, but technological progress is a peaks-and-valleys
            situation. I imagine most of the money will ultimately move the
            needle very little, spent on things whose value will look dubious
            in hindsight.
       
            Legend2440 wrote 20 hours 28 min ago:
            That's because the tech industry has more money than god and
            nothing better to do with it.
            
            Microsoft alone has half a trillion dollars in assets, and
            Apple/Google/Meta/Amazon are in similar financial positions.
            Spending a few tens of billions on datacenters is, as crazy as it
            sounds, nothing to them.
       
          amarcheschi wrote 1 day ago:
          I wouldn't call "premature" when llm companies ceos have been
          proposing ai agents for replacing workers - and similar things that I
          find debatable - in about the 2nd half of the twenties. I mean, a
          cold shower might eventually happen for a lot of Ai based companies
       
            dehrmann wrote 22 hours 29 min ago:
            The most recent example is the Anthropic CEO:
            
            > I think we will be there in three to six months, where AI is
            writing 90% of the code. And then, in 12 months, we may be in a
            world where AI is writing essentially all of the code [1]
            
            This seems either wildly optimistic or comes with a giant
            asterisk: the AI will write it by token prediction, and then a
            human will have to double-check and refine it.
            
   URI      [1]: https://www.businessinsider.com/anthropic-ceo-ai-90-percen...
       
              pcwalton wrote 14 hours 48 min ago:
              That statement seems extremely hyperbolic. Just today I tried
              automating some very pedestrian code for a system that wasn't
              particularly well-documented and ChatGPT 4o hallucinated the
              entire API. It was deeply frustrating and wasted more of my time
              than it would have taken to just slog through the documentation.
              
              I won't deny that LLMs can be useful--I still use them--but in my
              experience an LLM's success rate in writing working code is
              somewhere around 50%. That leads to a productivity boost that,
              while not negative, isn't anywhere near the wild numbers that are
              bandied about.
       
              latentsea wrote 18 hours 38 min ago:
              This quote feels to me more along the lines of "no computer will
              ever need more than 640kb of RAM" in terms of historical
              accuracy. Like whoops, nope.
       
              Legend2440 wrote 20 hours 45 min ago:
              I wouldn't really pay attention to what CEOs say, it's their job
              to sell things and drum up investment.
              
              No one really knows exactly how AI will play out. There's a lot
              of motivated reasoning going on, both from hypesters and cynics.
       
                bluefirebrand wrote 16 hours 13 min ago:
                > I wouldn't really pay attention to what CEOs say, it's their
                job to sell things and drum up investment
                
                We don't have a choice but to pay attention to what CEOs say,
                they are the ones that might lead our companies off of cliffs
                if we let them
       
                  auggierose wrote 13 hours 6 min ago:
                  It's not my company, it can go off the cliff for all I care.
       
              agumonkey wrote 21 hours 58 min ago:
              I anticipate a control issue, where agents can produce code
              faster than people can analyze it, and beyond applications with
              small visible surfaces, nobody will be able to check what is
              going on.
              
              I've seen people who have trouble manipulating boolean tables of
              3 variables in their head trying to generate complete web
              applications. It will work for linear duties (input ->
              processing -> storage), but I highly doubt they will be able to
              understand anything with 2nd-order effects.
       
                tylerrobinson wrote 6 hours 49 min ago:
                > people with trouble manipulating boolean tables of 3
                variables in their head
                
                To be fair, 3 booleans (2^3=8) is more working memory than most
                people are productive with. Way more if they’re nullable :)
       
                  agumonkey wrote 4 hours 20 min ago:
                  Really feels like a race to the bottom to me. You used to
                  have a trade that made you try to master those 8
                  configurations and now you're acting like the ignorant
                  manager asking "agents" to deal with it.
       
              amarcheschi wrote 22 hours 8 min ago:
              I'm honestly slightly appalled by what we might miss by not
              reading the docs and just letting AI code. I'm attending a
              course where we have to analyze medical datasets using up to
              ~200 GB of RAM. Calculations can take some time. A simple skim
              through the library docs (or even asking the chatbot) can tell
              you that one of the longest calls can be approximated, taking
              about a third of the time it takes with another solver. And yet
              none of my colleagues thought of either looking at the docs or
              asking the chatbot. Because it was working. And of course the
              chatbot was using the "standard" solver, which you probably
              don't need for prototyping.
              
              Again: we had parts of one of three datasets split across ~40
              files, and we had to manipulate and save them before doing
              anything else. A colleague asked ChatGPT to write the code to do
              it, and it was single-threaded and not feasible. I hopped onto
              htop and, upon seeing it was using only one core, suggested she
              ask ChatGPT to make the conversion run on different files in
              different threads, and we basically went from absolutely slow to
              quite fast. But that presupposes that the person using the code
              knows what's going on, why, what is not going on, and when it is
              possible to do something different. Using it without asking
              yourself more about the context is a terrible use imho, but it's
              absolutely the direction I see we're headed in, and I'm not a
              fan of it.
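              
              A minimal sketch of the kind of change described above, assuming
              the per-file work is independent and CPU-bound; the directory
              name, file format and convert() body are hypothetical, and
              worker processes stand in for the threads mentioned so the work
              can actually use multiple cores:
              
              # Hypothetical sketch: handle each of the ~40 files in its
              # own worker process instead of one single-threaded loop.
              from concurrent.futures import ProcessPoolExecutor
              from pathlib import Path
              
              import pandas as pd
              
              def convert(path: Path) -> Path:
                  # Placeholder per-file step: read, reshape, write out.
                  df = pd.read_parquet(path)
                  out = path.with_name(path.stem + ".converted.parquet")
                  df.to_parquet(out)
                  return out
              
              if __name__ == "__main__":
                  files = sorted(Path("parts").glob("*.parquet"))
                  # One worker process per core by default.
                  with ProcessPoolExecutor() as pool:
                      for done in pool.map(convert, files):
                          print("finished", done)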
       
            frankfrank13 wrote 1 day ago:
            > cold shower might eventually happen for a lot of Ai based
            companies
            
            undoubtedly.
            
            The economic impact of some actually useful tools (Cursor, Claude)
            is propping up hundreds of billions of dollars in funding for,
            idk, "AI for ___" or "replace your ___ with our AI tool".
       
        meta_ai_x wrote 1 day ago:
        It's incredibly hard to model complex non-linear systems. So, while I
        applaud the researchers for providing some data points, these things
        provide ZERO value for current or future decision making.
        
        Chatbots were absolute garbage before ChatGPT, while post-ChatGPT
        everything changed. So there is going to be a tipping-point event in
        labor market effects, and past single-variable "data analysis" will
        not provide anything to predict the event or its effects.
       
        jaxtracks wrote 1 day ago:
        Interesting study! Far too early in the adoption lifecycle for any
        conclusions, I think, especially given that the data is from Denmark,
        which tends to have a far less hype-driven business culture than the
        US, going by my bit of experience working in both. Anecdotally, I've
        seen a couple of AI-driven hiring freezes in the States (some from LLM
        integrations I've built) that I'm fairly sure will be reversed when
        management gets a more realistic sense of capabilities, and my general
        sense is that the Danes I've worked with would be far less likely to
        overestimate the value of these tools.
       
          sottol wrote 1 day ago:
          I agree on the "far too early" part. But imo we can probably say more
          about the impact in a year though, not 5-10 years. But it does show
          that some of the randomized-controlled-trials that showed large
          labor-force impact and productivity gains are probably only
          applicable to a small sub-section of the work-force.
          
          It also looks like the second survey was sent out in June 2024 - so
          the data is 10 months old at this point, another reason why this it
          might be early.
          
          That said, the latest round of models are the first I've started
          using more extensively.
          
          The paper does address the fact that Denmark is not the US, but
          supposedly not that different:
          
          "First, Danish workers have been at the forefront of Generative AI
          adoption, with
          take-up rates comparable to those in the United States (Bick, Blandin
          and Deming, 2025;
          Humlum and Vestergaard, 2025; RISJ, 2024).
          
          Second, Denmark’s labor market is highly flexible, with low hiring
          and firing costs
          and decentralized wage bargaining—similar to that of the
          U.S.—which allows firms and
          workers to adjust hours and earnings in response to technological
          change (Botero et al.,
          2004; Dahl, Le Maire and Munch, 2013). In particular, most workers in
          our sample engage
          in annual negotiations with their employers, providing regular
          opportunities to adjust
          earnings and hours in response to AI chatbot adoption during the
          study period."
       
        mediaman wrote 1 day ago:
        Great read. One of the interesting insights from it is how difficult
        good application of AI is.
        
        A lot of companies are just "deploying a chatbot" and some of the
        results from this study show that this doesn't work very well. My
        experience is similar: deploying simple chatbots to the enterprise
        doesn't do a lot.
        
        For things to get better, two things are required, neither of which
        is easy:
        
        - Integration into existing systems. You have to build data lakes or
        similar systems that allow the AI to use data and information broadly
        across an enterprise. For example, for an AI tool to be useful in
        accounting, it's going to need high quality data access to the
        company's POs, issued invoices, receivers, GL data, vendor invoices,
        and so on. But many systems are old, have dodgy or nonexistent APIs,
        and data is held in various bureaucratic fiefdoms. This work is hard
        and doesn't scale that well.
        
        - Knowledge of specific workflows. It's better when these tools are
        built with specific workflows in mind that are designed around specific
        people's jobs. This can start looking less like pure AI and more like a
        mix of traditional software with some AI capabilities. My experience is
        that I sell software as "AI solutions," but often I feel a lot of the
        value created is because it's replacing bad processes (either terrible
        older software, or attempting to do collaborative work via
        spreadsheet), and the AI tastefully sprinkled throughout may not be the
        primary value driver.
        
        Knowledge of specific workflows also requires really good product
        design. High empathy, ability to understand what's not being said,
        ability to understand how to create an overall process value stream
        from many different people's narrower viewpoints, etc. This is also
        hard.
        
        Moreover, this is deceptive because for some types of work (coding,
        ideating around marketing copy) you really don't need much scaffolding
        at all, because the capabilities are latent in the AI, and layering
        stuff on top mostly gets in the way.
        
        My experience is that this type of work is a narrow slice of the total
        amount of work to be done, though, which is why I'd agree with the
        overall direction this study suggests: creating actual, measurable,
        major economic value with AI is going to be a long-term slog, and
        we'll probably gradually stop calling it AI in the process as we
        acclimate to it and it starts being used as a tool within software
        processes.
       
          bwfan123 wrote 1 hour 31 min ago:
          An analogy I find useful is the search engine (Google).
          
          Yeah, the search engine improved the productivity of almost
          everyone, but it didn't change any workflows.
       
          no_wizard wrote 20 hours 4 min ago:
          I work on an application that uses AI to index and evaluate a given
          corpus of knowledge (papers, knowledge bases, etc.), and it has been
          a huge help here. I know that's because we are dealing with what is
          effectively structured data that can be well classified once
          identified, and we have relatively straightforward ways of doing
          identification. The real magic is when the finely tuned AI starts to
          correctly stitch together pieces of information that previously
          didn't appear to be related; that is the secret sauce, beyond simply
          indexing for search.
          
          Code is similar - programming languages have rules that are well
          known; couple that with proper identification and pattern matching,
          and that's how you get these generated prototypes [0] done via
          so-called 'vibe coding' (not the biggest fan of the term, but I
          digress).
          
          I think these are early signs that this generation of LLMs, at
          least, is likely to augment many existing roles rather than strictly
          replace them. Productivity will increase by a good margin once the
          tools are well understood and scoped to the task.
          
          [0]: They really are prototypes. You will eventually hit walls by
          having an LLM generate the code without understanding the code.
       
          stego-tech wrote 21 hours 45 min ago:
          > Integration into existing systems.
          
          Integration alone isn't enough. Organizations let their data go
          stale, because keeping it updated is a political task instead of a
          technical one. Feeding an AI stale data effectively renders it
          useless, because it doesn't have the presence of mind to ask for
          assistance when it encounters an issue, or to ask colleagues if this
          process is still correct even though the expected data doesn't "fit".
          
          Automations - including AI - require clean, up-to-date data in order
          to function effectively. Orgs who slap in a chatbot and call it a day
          don't understand the assignment.
       
          treis wrote 23 hours 48 min ago:
          LLM chatbots are a step forward for customer support. Well, ours
          started hallucinating a support phone number that, while a real
          number, is not our number. Lots of people started calling, which was
          a bad time for everyone - especially the person whose number it
          actually is. So maybe two steps forward and occasionally one back.
       
            TRiG_Ireland wrote 18 hours 44 min ago:
            As a customer, LLM chatbots are fifteen steps backwards and have
            approximately zero upsides. I hate them with a deep and abiding
            passion.
       
              stingraycharles wrote 13 hours 21 min ago:
              Not compared to the previous iteration of chatbots. But compared
              to human operators, definitely.
       
          AlexCoventry wrote 1 day ago:
          I think when the costs and latencies of reasoning models like o1-pro,
          o3 and o4-mini-high come down, chatbots are going to be much more
          effective for technical support. They're quite reliable and
          knowledgeable, in my experience.
       
          aerhardt wrote 1 day ago:
          > how difficult good application of AI is.
          
          The only interesting application I've identified thus far in my
          domain in Enterprise IT (I don't do consumer-facing stuff like
          chatbots) is in replacing tasks that previously would've been done
          with NLP: mainly extraction, synthesis, and classification. I am
          currently working on a long-neglected dataset that needs a massive
          remodel; in the past that would've taken a lot of manual
          intervention and a mix of different NLP models to whip into shape,
          but with LLMs we might be able to pull it off with far fewer
          resources.
          
          Mind you at the scale of the customer I am currently working with,
          this task also would've never been done in the first place - so it's
          not replacing anyone.
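          
          To make that concrete, here's a minimal sketch of that kind of
          substitution, assuming the OpenAI Python SDK; the model name,
          label set and classify() helper are hypothetical, not anything
          from the actual project:
          
          # Hypothetical sketch: an LLM standing in for a task-specific
          # NLP classifier. Labels and model choice are illustrative.
          from openai import OpenAI
          
          client = OpenAI()  # assumes OPENAI_API_KEY is set
          
          LABELS = ["billing", "shipping", "product_quality", "other"]
          
          def classify(record_text: str) -> str:
              resp = client.chat.completions.create(
                  model="gpt-4o-mini",  # placeholder model name
                  temperature=0,
                  messages=[
                      {"role": "system",
                       "content": "Reply with exactly one label from: "
                                  + ", ".join(LABELS)},
                      {"role": "user", "content": record_text},
                  ],
              )
              answer = resp.choices[0].message.content.strip()
              # Guard against output outside the expected label set.
              return answer if answer in LABELS else "other"
          
          print(classify("Package arrived two weeks late, box crushed."))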
          
          > This can start looking less like pure AI and more like a mix of
          traditional software with some AI capabilities
          
          Yes, the other use case I'm seeing is in peppering already existing
          workflow integrations with a bit of LLM magic here and there. But
          why would I re-work a workflow that's already implemented, well
          understood and totally reliable in Zapier, n8n or Python?
          
          > Knowledge of specific workflows also requires really good product
          design. High empathy, ability to understand what's not being said,
          ability to understand how to create an overall process value stream
          from many different peoples' narrower viewpoints, etc. This is also
          hard.
          
          > My experience is that this type of work is a narrow slice of the
          total amount of work to be done
          
          Reading you, I get the sense we are on the same page on a lot of
          things, and I am pretty sure that if we worked together we'd get
          along fine. I've been struggling a bit with the LLM delulus as of
          late, so it's a breath of fresh air to read people out there who get
          it.
       
            PaulHoule wrote 22 hours 53 min ago:
            As I see it three letter organizations have been using frameworks
            like Apache UIMA to build information extraction pipelines that are
            manual at worst and hybrid at best.  Before BERT the models we had
            for this sucked,  only useful for certain things,  and usually
            requiring training sets of 20,000 or so examples.
            
            Today the range of things for which the models are tolerable to
            "great" has greatly expanded. In arXiv papers you tend to see
            people getting tepid results with 500 examples; I get better
            results with 5000 examples and diminishing returns past 15k.
            
            For a lot of people it begins and ends with "prompt engineering"
            of commercial decoder models, and evaluation isn't even an
            afterthought. For information extraction, classification and such,
            though, you often get good results with encoder models (e.g. BERT)
            put together with serious eval, calibration and model selection.
            The system still looks like the old systems if your problem is
            hard and has to be done in a scalable way, but sometimes you can
            make something that "just works" without trying too hard, keeping
            your train/eval data in a spreadsheet.
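            
            A minimal sketch of that encoder route, assuming the Hugging
            Face transformers and datasets libraries; the CSV name, column
            names and base model are placeholders, not anything from the
            comment:
            
            # Hypothetical sketch: fine-tune a BERT-style encoder as a
            # classifier, with labeled examples kept in a simple CSV.
            import pandas as pd
            from datasets import Dataset
            from transformers import (AutoModelForSequenceClassification,
                                      AutoTokenizer, Trainer,
                                      TrainingArguments)
            
            df = pd.read_csv("train_eval.csv")  # columns: text, label
            # Integer-encode the label column for training.
            df["label"] = df["label"].astype("category").cat.codes
            ds = Dataset.from_pandas(df).train_test_split(test_size=0.2)
            
            tok = AutoTokenizer.from_pretrained("bert-base-uncased")
            ds = ds.map(lambda b: tok(b["text"], truncation=True,
                                      padding="max_length", max_length=256),
                        batched=True)
            
            model = AutoModelForSequenceClassification.from_pretrained(
                "bert-base-uncased", num_labels=int(df["label"].nunique()))
            
            trainer = Trainer(
                model=model,
                args=TrainingArguments(output_dir="clf",
                                       num_train_epochs=3,
                                       per_device_train_batch_size=16),
                train_dataset=ds["train"],
                eval_dataset=ds["test"],
            )
            trainer.train()
            # Reports eval loss; add compute_metrics for accuracy etc.
            print(trainer.evaluate())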
       
          ladeez wrote 1 day ago:
          The pivot to cloud had a decade warmup before HOWTO was normalized to
          existing standards.
          
          In the lead-up a lot of the same naysaying we see about AI was
          everywhere. AI can be compressed into less logic on a chip,
          bootstrapped from models. Requires less of the state management
          tooling software dev relies on now. We're slowly being trained to
          accept a downturn in software jobs. No need to generate the code
          that makes up an electrical state when we can just tune hardware to
          the state from an abstract model deterministically. Energy-based
          models are the futuuuuuure. [1] A lot of the same naysaying about
          Dungeons and Dragons and comic books in the past too. Life carried
          on.
          
          Functional illiterates fetishize semantics, come to view their
          special literacy as key to the future of humanity. Tale as old as
          time.
          
   URI    [1]: https://www.chipstrat.com/p/jensen-were-with-you-but-were-no...
       
       