_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                              on Gopher (unofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Artificial Writing and Automated Detection [pdf]
       
       
        xtiansimon wrote 4 hours 45 min ago:
         Slightly off-topic. A client uses Stripe for a small business
         website. We got an automated email saying a transaction was
         flagged as potentially fraudulent, and that we should investigate
         and possibly refund before a chargeback occurs. What? Is this a
         stolen card or what?
        
         So I asked the chatbot, and it listed the possible causes of a
         flagged transaction: a stolen card, plus a few other examples
         that amount to a mix of customer-determined service issues. But
         the bot says it’s definitely not a chargeback. What?
        
         So now I contact support. They say it’s a flag from the credit
         card issuing bank. Wait. What? Is this a fraudulent stolen card
         or not? Still no. It’s just a warning based on usage patterns.
         Why are you passing this slop to my client? If there is a
         pattern problem, the flag should go to the customer who
         authorizes the charge. Otherwise it’s a chargeback or a known
         stolen card.
        
         They say, well, you can contact the customer. What? If the
         pattern actually indicates a stolen card, which is listed as a
         possible cause of the flag without saying it is or isn’t, then
         the customer can just lie!
        
         Which is a long way of saying that this pattern matching for
         fraud or negative patterns suffers from idiocy, even in the
         simplest of contexts.
       
        vesterthacker wrote 17 hours 5 min ago:
        The paper Artificial Writing and Automated Detection by Brian Jabarian
        and Alex Imas examines the strange boundary that now divides human
        expression from mechanical imitation. Within their analysis one feels
        not only the logic of research but the deeper unease of our age, the
        question of whether language still belongs to those who think or only
        to those who simulate thought. They weigh false positives and false
        negatives, yet behind those terms lives an older struggle, the human
        desire to prove its own reality in a world of imitation.
        
        I read their work and sense the same anxiety in myself. When I write
        with care, when I choose words that carry rhythm and reason, I feel
        suspicion rather than understanding. Readers ask whether a machine has
        written the text. I lower my tone, I break the structure, I remove what
        once gave meaning to style, only to make the words appear more human.
        In doing so, I betray something essential, not in the language but in
        myself.
        
        The authors speak of false positives, of systems that mistake human
        writing for artificial output. But that error already spreads beyond
        algorithms. It enters conversation, education, and the smallest corners
        of daily life. A clear sentence now sounds inhuman; a careless one,
        sincere. Truth begins to look artificial, and confusion passes for
        honesty.
        
        I recall the warning of Charlotte Thomson Iserbyt in The Deliberate
        Dumbing Down of America. She foresaw a culture that would teach
        obedience in place of thought. That warning now feels less like
        prophecy and more like description.
        
        When people begin to distrust eloquence, when they scorn precision as
        vanity and mistake simplicity for virtue, they turn against their own
        mind. And when a society grows ashamed of clear language, it prepares
        its own silence. Not the silence of peace, but the silence of
        forgetfulness, the kind that falls when no one believes in the power of
        words any longer.
       
          Jacobee wrote 15 hours 11 min ago:
          I saw what you did:
          
          "yet behind those terms lives an older struggle, the human desire to
          prove its own reality in a world of imitation."
          
           ...each paragraph ends with this corny and tiresome ’50s
           mechanized ‘erudite’ baloney.
          
          --The Rod Serling Algo, aka, TTZ
       
        foxfired wrote 19 hours 3 min ago:
         There was a post just a few hours ago on the frontpage asking
         people not to use AI for writing [0]. I copied the content and
         pasted it into multiple "AI detection" tools. It scored anywhere
         from 0% up to 80%. This is not gonna cut it. As someone who has
         used LLMs to "improve" my writing: after a while, no matter the
         prompt, you will find the exact same patterns. "Here's the
         kicker" or "here is the most disturbing part": those expressions
         and many more come up no matter how you engineer the prompt. But
         here's the kicker: real people also use these expressions, just
         at a lesser rate.
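
         To see why the rate matters, here's a toy sketch of the kind of
         arithmetic a detector has to do with such phrases. The
         per-1,000-word rates below are invented for illustration; a real
         tool would have to estimate them from large corpora:

              import math

              # Invented, illustrative rates: occurrences per 1,000 words.
              RATES = {
                  "here's the kicker": {"human": 0.02, "llm": 0.40},
                  "delve":             {"human": 0.05, "llm": 0.90},
              }

              def log_likelihood_ratio(text: str) -> float:
                  # Positive means "more LLM-like", negative "more human-like".
                  t = text.lower()
                  score = 0.0
                  for phrase, rate in RATES.items():
                      if phrase in t:
                          score += math.log(rate["llm"] / rate["human"])
                  return score

         One or two hits are only weak evidence, which is part of why the
         tools disagree so wildly on a single short post.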
        
        Detection is not what is going to solve the problem. We need to go back
        and reevaluate why we are asking students to write in the first place.
        And how we can still achieve the goal of teaching even when these
        modern tools are one click away.
        
        [0]:
        
   URI  [1]: https://news.ycombinator.com/item?id=45722069
       
          TomasBM wrote 9 hours 44 min ago:
          I see what you did there.
          
           I think we'll still need ways to detect copy-pasted zero-shot
           content that's generated by LLMs, for the same reasons that
           teachers needed ways to detect plagiarism. Kids, students, and
           interns [1] "cheat" for various reasons [2], and we want to be
           able to detect lazy infractions early enough to correct their
           behavior.
          
          This leads to three outcomes:
          
          1. Those that never really meant to cheat will learn how to do things
          properly.
          
          2. Those that cheated out of laziness will begrudgingly need to weigh
          their options, at which point doing things properly may be less
          effort.
          
           3. Those that meant to cheat will have to invest (much) more
           effort, and run the risk of being kicked out if they're caught
           again.
          
          [1] But also employees, employers, government officials, etc.
          
          [2] There could be some relatively benign reasons. For example, they
          could: not know how to quote/reference others properly; think it's OK
          because "everyone does it" or they don't care about subjects that
          involve writing; do it "just this once" out of procrastination; and
          similar.
       
            dpoloncsak wrote 3 hours 17 min ago:
             The whole argument is "a written response to a question is no
             longer a valid form of testing for knowledge".
            
            We don't need better detection. We need better ways to measure
            one's grasp of a concept. When calculators were integrated into
             education, the focus shifted from working the problem out to
             choosing the correct formulas and using the calculator
             effectively. Sure, elementary classes will force you to 'show
             your work', but that's to build the foundation, I believe.
            
             We don't need to detect plagiarism if we're asking students
             for verbal answers, for example.
       
          mmooss wrote 11 hours 29 min ago:
          One major missing piece in using AIs is self-expression. The idea of
          writing is to express your own ideas, to put yourself on the page;
          someone writing for you, AI or biological, can't do that. There are
          far too many nuances and subtleties.
          
          I suspect many students write to pass the class, and AI can do that.
          Perhaps the problem is the incentives to write that way.
       
        binarymax wrote 22 hours 37 min ago:
         My two cents about this after working with some teachers: this
         is a cat and mouse game, and you're wasting your time trying to
         police essays students write on their own time.

         It is better to pivot and not care about the actual content of
         the essay, but instead seek alternative strategies to encourage
         learning, such as an oral presentation or a quiz on the
         material. In the laziest case, accept only hand-written output:
         even if it was generated, at least the student retained some
         knowledge by copying it.
       
          globalnode wrote 18 hours 25 min ago:
           Why do we even grade people? Just teach the content and be done
           with it. Sure, if a student wants to assess their knowledge to
           see how well they can answer questions, they can do that for
           kicks. If industry wants well-educated people, they should have
           supervised entrance quizzes or exams; the onus is on them. This
           obsession with catching cheaters is out of control.
       
            TomasBM wrote 9 hours 23 min ago:
            If you're asking this seriously:
            
            We need to grade people because that's the best way we have to
            determine (for one or more subjects) who's:
            
            1. capable enough, so that we can promote them to the next stage;
            
            2. improving or has potential for improvement, so that we can give
            them the tools or motivation to continue;
            
            3. underperforming, so that we can find out why and help them turn
            it around (or reduce the pressure);
            
            4. actually learning the content, and if not, why not.
            
            Thankfully, everyone knows this system is flawed, so most don't put
            too much weight on school grades. But overall, the grades are there
            to provide both an incentive for teachers and students to do
            better, and a way to compare performance.
       
              globalnode wrote 8 hours 56 min ago:
               All good points, and I was sort of coming at it from the
               point of view of catching cheaters. Of course cheaters skew
               the data, but they're ultimately hurting themselves. They
               won't pass a company's entrance tests, or will soon find
               themselves unemployed if they can't do the work. Yes, it's
               a problem, but I see a lot of effort being spent on trying
               to detect them. Is that effort proportional to the problem?
       
          NewsaHackO wrote 21 hours 25 min ago:
           I think the most realistic way is to do a flipped classroom
           where, from middle school onward, children are expected to be
           independent learners. Class time should be spent on application
           of skills and on evaluation.
       
          laptopdev wrote 22 hours 2 min ago:
          If computer usage hampers a child's socialization with the group he's
          learning with, maybe the simplest and most meaningful solution would
          be preventing children enrolled in language comprehension classes
           from having access to computers at home, particularly at core
           language and reasoning stages of development.
       
          nonethewiser wrote 22 hours 34 min ago:
           Do teachers prefer grading papers or something? This always
           seemed like the obvious answer, and there is no shortage of
           complaints. There is something making papers "sticky" that I do
           not understand. Education needs to be agile enough to change
           its assessment methods. It's getting to the point where we
           can't just blame LLMs anymore. Figure out how to assess
           learning outcomes instead of just insisting on methods that you
           assumed should work.
       
            burkaman wrote 20 hours 38 min ago:
            Oral exams and quizzes are hard for reasons unrelated to
            understanding the subject matter. Language barriers, public
            speaking anxiety, exam stress, etc. All things that students should
            hopefully learn how to overcome, but that's a lot to ask a teacher
            to deal with in addition to teaching history or whatever. With a
            paper, a student can choose their own working environment, choose a
            day and time when they are best able to focus, have a constructive
            discussion with the teacher if they're having trouble midway
            through the work, and spread their effort (if they want to) across
            more than an hour-long test or 5-minute oral exam. In an imaginary
            world where they couldn't cheat, a paper gives the teacher the best
            chance of evaluating whether a student understands the material.
            
            I don't think you're wrong necessarily, but there are good reasons
            that teachers like papers other than "we've always used them".
       
              mmooss wrote 11 hours 32 min ago:
              > Oral exams and quizzes are hard for reasons unrelated to
              understanding the subject matter. Language barriers, public
              speaking anxiety, exam stress, etc
              
              People have some different challenges writing papers and taking
              oral and written quizzes, but is one way or the other necessarily
              easier? For writing papers, think about language barriers,
              anxiety about writing ability, stress of writing papers, need for
              self-motivation and time management, etc.
       
            binarymax wrote 22 hours 25 min ago:
             Because, assuming it's done properly w/o cheating, it's a
             great learning tool. It's sometimes easy to forget that
             certain tasks are the way they are because they're supposed
             to teach. We don't structure teaching and learning around
             what the least painful thing is.
       
              otterley wrote 19 hours 20 min ago:
              How wide is the gap between “least painful thing” and “most
              effective thing”?
       
        andy99 wrote 22 hours 40 min ago:
        While it’s interesting work, so far my experience is that AI isn’t
        good enough (or most people aren’t good enough with AI) for detection
        to really be a concern, at least in “research” or any writing over
        a few sentences.
        
         If you think about the 2x2 of “Good” vs “By AI”, you only really
         care about the case where something is good work that an AI did,
         and then only when catching cheaters, as opposed to deriving
         some utility.
        
         If it’s bad, who cares if it’s AI or not? Most AI output is
         pretty obvious, thoughtless slop, and most people who use it
         aren’t paying attention to mask that. So I guess what I’m saying
         is that for most cases one could just set a quality bar and see
         if the work passes.
        
         I think maybe a difference AI brings is that in many cases
         people don’t really know how to understand or judge the quality
         of what they are reading, or are too lazy to, so they have
         substituted, as proxies for quality, the same structural cues
         that AI now uses. So if you’re used to saying “it’s well
         formatted, lots of bulleted lists, no spelling mistakes, good
         use of adjectives, must be good”, now you have to actually read
         it and think about it to know.
       
          vages wrote 22 hours 34 min ago:
           I personally would value a spam filter that filters out
           AI-generated content.
       
        Legend2440 wrote 22 hours 52 min ago:
         I suspect AI text detection has actually become easier, as
         today’s chatbots have been heavily fine-tuned toward a more
         distinctive style.

         For example, “delve” and the em-dash are both a result of the
         fine-tuning dataset, not the base LLM.
       
          AuthAuth wrote 21 hours 31 min ago:
           You are forgetting the human mind accounting for this and
           adding "write this like a kinda dumb high school student". I
           just did a little test between a Copilot essay and the same
           prompt with "write this like a kinda dumb high school
           student", and it reads like an essay I would have written.
       
            bryanrasmussen wrote 19 hours 16 min ago:
            In the brave world of the future you too will be able to get a C-
            with very little effort!
       
          haffi112 wrote 22 hours 7 min ago:
           That's where the humanizers come in. These are solutions that
           take LLM-generated text and make it sound human-written in
           order to avoid detection.
          
          The principle of training them is quite simple. Take an LLM and
          reward it for revising text so that it doesn't get detected.
          Reinforcement learning takes care of the rest for you.
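
           In sketch form, everything below is a made-up stand-in: a real
           setup would plug in an actual detector API, an embedding model,
           and a PPO-style trainer on top of this reward.

                # Toy reward for a "humanizer" policy: high when the
                # detector is fooled AND the revision keeps the meaning.
                def detector_score(text: str) -> float:
                    # Stand-in for a real AI-text detector (0=human, 1=AI).
                    markers = ("delve", "here's the kicker")
                    hits = sum(text.lower().count(m) for m in markers)
                    return min(1.0, 0.2 + 0.3 * hits)

                def similarity(a: str, b: str) -> float:
                    # Stand-in for embedding similarity: word overlap.
                    wa, wb = set(a.lower().split()), set(b.lower().split())
                    return len(wa & wb) / max(1, len(wa | wb))

                def reward(original: str, revised: str) -> float:
                    # Evasion gated by meaning preservation, so the policy
                    # can't just emit unrelated text that "looks human".
                    return (1.0 - detector_score(revised)) \
                        * similarity(original, revised)

           The detector becomes the training signal, which is why
           detection and humanizing end up locked in an arms race.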
       
        rawgabbit wrote 23 hours 19 min ago:
        Wow. Never heard of Pangram until now. Quote:
        
              Pangram maintains near-perfect accuracy across long and
              medium length texts. It achieves very low error rates even
              on shorter passages and ‘stubs.’
       
          alfalfasprout wrote 22 hours 34 min ago:
          I'm extremely skeptical of these claims. Especially when we're
          dealing with careful prompting to adjust tone/style.
       
            haffi112 wrote 22 hours 6 min ago:
             Even if it were close to near-perfect, that would still not
             be enough, given the negative impact of false-positive
             detections on students.
       
            zingababba wrote 22 hours 11 min ago:
             Mmmm yes. I probably will never be able to find it again,
             but someone recently tested a lot of these and found you
             could bypass them easily by changing a few words around.
       
              Mkengin wrote 20 hours 38 min ago:
              Or use RL to beat any AI detectors:
              
   URI        [1]: https://reddit.com/r/LocalLLaMA/comments/1lnrd1t/you_can...
       
       
   DIR <- back to front page