_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                              on Gopher (unofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Searchformer: Beyond A* – Better planning with transformers via search dynamics
       
       
        teleforce wrote 18 hours 5 min ago:
        Previous post and discussions on HN:
        
        Beyond A*: Better Planning with Transformers:
        
   URI  [1]: https://news.ycombinator.com/item?id=39479478
       
        nextaccountic wrote 19 hours 28 min ago:
        Due to the no free lunch theorem [0], any search algorithm that makes
         some problems faster will necessarily make other problems slower. What
         does the worst case for an algorithm like this look like?
        
        I think that part of the appeal of A* to me is that I can readily
         visualize why the algorithm fails on some pathological inputs.
        
        [0]
        
   URI  [1]: https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_opti...
       
          blamestross wrote 13 hours 15 min ago:
          Well you have to run the AI, which is significantly more costly than
          just running A*
          
          Also, the AI has a finite size, which means it has to be scaled up
           for bigger graphs. I doubt it would work at all for non-Euclidean 2D
          graphs.
          
          The exciting thing is that they made an AI do a toy problem, not that
          it was efficient.
          
          But people are dumb and tech companies have brain control over them.
          "AI good and solve all problems"
          
          It reminds me of the "slime mold solves mazes" thing. It was
          interesting and there were even things to learn from it, but it
          wasn't magic and couldn't give us better general tools.
       
          IanCal wrote 14 hours 54 min ago:
           The NFL theorem is frankly silly.
           
           It boils down to "you can't predict random numbers".
           
           I'm sure it's useful to have formalized, but it has no real impact on
           real-world situations, because most problems we care about have some
           kind of structure we can exploit.
       
            woopsn wrote 4 hours 41 min ago:
             Yes, significantly. But take, for example, the stock market, which
             is obviously of interest (to OAI in particular) but which cannot be
             extrapolated accurately over the medium-to-long term from
             historical data.
       
            nextaccountic wrote 14 hours 37 min ago:
             Well, when I asked "What does the worst case for an algorithm like
             this look like?" I fully expected the answer to be "in practice
            this never happens" (that is, there are bad cases but since we
            aren't drawing inputs at random, in practice we never hit them)
            
            But how likely is that?
       
          reaperman wrote 15 hours 10 min ago:
           The "No free lunch" theorem does have some prerequisites for it to
           actually apply.
          
          > There is no free lunch in search if and only if the distribution on
          objective functions is invariant under permutation of the space of
          candidate solutions.
       
          kevindamm wrote 19 hours 14 min ago:
          Well, it's still using A* and state exploration to validate the plans
           generated.  But it's generating plans nondeterministically, so in
           principle it could run for an unbounded number of steps.
          
          Why didn't they give more details about the Sokoban puzzles being
          solved?  The standard benchmark is the 90 puzzles in XSokoban and
          there's a Large Test Suite that has a few thousand but many of them
          haven't been solved by any search or planning system.  Even the 90
          puzzles in XSokoban were only solved a few years ago (2020?) by
           Festival (using FESS, a feature-space search on top of the usual
           state-space search).  Before that, roughly 20 years had passed with
           only 57 or so levels solved via search.
          
           I see that they measure trace lengths, but I would really have liked
           to see the number of XSokoban levels solved and how many states were
           examined, to line up with existing research.
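           
           To make the validation point concrete: checking a model-emitted plan
           is cheap next to searching for one.  A minimal sketch in Python (the
           maze and the plan string are made up, not the paper's code): execute
           the plan against the environment and compare its length to a BFS
           optimum.
           
           from collections import deque
           
           GRID = ["#####",
                   "#S..#",
                   "#.#.#",
                   "#..G#",
                   "#####"]
           MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
           
           def find(ch):
               return next((r, c) for r, row in enumerate(GRID)
                           for c, cell in enumerate(row) if cell == ch)
           
           def plan_is_valid(plan):
               # Execute the moves; fail on a wall, succeed only on the goal.
               r, c = find("S")
               for step in plan:
                   dr, dc = MOVES[step]
                   r, c = r + dr, c + dc
                   if GRID[r][c] == "#":
                       return False
               return (r, c) == find("G")
           
           def optimal_length():
               # Breadth-first search gives the optimal plan length to compare
               # against (the maze is tiny, so this is instant).
               start, goal = find("S"), find("G")
               queue, seen = deque([(start, 0)]), {start}
               while queue:
                   (r, c), dist = queue.popleft()
                   if (r, c) == goal:
                       return dist
                   for dr, dc in MOVES.values():
                       nr, nc = r + dr, c + dc
                       if GRID[nr][nc] != "#" and (nr, nc) not in seen:
                           seen.add((nr, nc))
                           queue.append(((nr, nc), dist + 1))
           
           plan = "DDRR"  # stand-in for a model-generated plan
           print(plan_is_valid(plan), len(plan) == optimal_length())  # True True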
       
        a_wild_dandan wrote 21 hours 32 min ago:
        Ah, I remember reading this paper! Essentially, they created synthetic
        data by solving search problems using A*. Trained on this data,
         transformers unsurprisingly learned to solve these problems. They then
         improved their synthetic data by solving a given problem many times
         with A* (breaking ties nondeterministically) and keeping only the run
         with the fewest search steps.  Transformers learned to be competitive
         with this improved search procedure too!
        
        Pretty remarkable stuff. Given the obsession over chat bots, folk often
        miss how revolutionary transformer sequence modeling is to...well, any
        sequence application that isn't a chat bot. Looking solely at speeding
        up scientific simulations by ~10x, it's a watershed moment for
        humanity. When you include the vast space of other applications, we're
        in for one wild ride,  y'all.
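         
         The "solve it many times, keep the cheapest run" trick is easy to
         picture.  A rough sketch of the idea in Python (toy maze; randomized
         tie-breaking stands in for the paper's nondeterministic A*; this is
         not their actual code or data format):
         
         import heapq, random
         
         GRID = ["#######",
                 "#S....#",
                 "#.###.#",
                 "#.....#",
                 "#.###.#",
                 "#....G#",
                 "#######"]
         
         def locate(ch):
             return next((r, c) for r, row in enumerate(GRID)
                         for c, x in enumerate(row) if x == ch)
         
         def astar(seed):
             # A* with a Manhattan-distance heuristic; ties on f are broken
             # randomly, so different seeds expand different numbers of nodes.
             rng = random.Random(seed)
             start, goal = locate("S"), locate("G")
             h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
             frontier = [(h(start), rng.random(), 0, start)]
             best_g, expansions = {start: 0}, 0
             while frontier:
                 _, _, g, (r, c) = heapq.heappop(frontier)
                 expansions += 1
                 if (r, c) == goal:
                     return expansions, g      # search steps, plan length
                 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                     nxt = (r + dr, c + dc)
                     if GRID[nxt[0]][nxt[1]] != "#" and g + 1 < best_g.get(nxt, 1e9):
                         best_g[nxt] = g + 1
                         heapq.heappush(frontier,
                                        (g + 1 + h(nxt), rng.random(), g + 1, nxt))
         
         # Expert-iteration flavour: solve the same task repeatedly and keep
         # only the run with the fewest search steps among the optimal plans.
         runs = [astar(seed) for seed in range(20)]
         shortest_plan = min(plan for _, plan in runs)
         kept = min(exp for exp, plan in runs if plan == shortest_plan)
         print("plan length:", shortest_plan, "search steps kept:", kept)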
       
          electricships wrote 13 hours 17 min ago:
          After AlphaZero, people tried to train various combinatorial solvers
          
           IIRC, they were unable to exceed the state of the art of SAT solvers
           (which admittedly is the result of decades of research)
          
          maybe we just needed bigger networks, or transformers, or just more
          training compute
       
            lou1306 wrote 8 hours 26 min ago:
            SAT appears to be rather impervious to statistical methods because
            it is very non-linear and non-local (besides, of course, being
             NP-complete). Flip one single literal anywhere and, poof, your
             formula might go from SAT to UNSAT, or vice versa, or you might
            enable a ton of simplifications, etc. So it might be very hard to
            train a network on such a riotous state space.
            
            Additionally, in practice SAT solvers are also supposed to provide
            witnesses (i.e., a satisfying assignment or an unsatisfiability
             core) to back their verdicts, and that may be an even taller order
            for ML.
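             
             A tiny made-up example of that brittleness (brute force, nothing
             like a real solver): flipping a single literal takes this formula
             from unsatisfiable to satisfiable.
             
             from itertools import product
             
             def satisfiable(cnf, n_vars):
                 # Brute force over assignments; a clause is a list of literals,
                 # where 2 means x2 and -2 means "not x2".
                 for bits in product([False, True], repeat=n_vars):
                     val = lambda lit: bits[abs(lit) - 1] == (lit > 0)
                     if all(any(val(lit) for lit in clause) for clause in cnf):
                         return True
                 return False
             
             # x1, (x1 -> x2), (x2 -> x3), (x3 -> not x1): a contradiction.
             cnf = [[1], [-1, 2], [-2, 3], [-3, -1]]
             print(satisfiable(cnf, 3))          # False
             
             # Flip one literal (x1 -> not x1 in the first clause): satisfiable.
             flipped = [[-1], [-1, 2], [-2, 3], [-3, -1]]
             print(satisfiable(flipped, 3))      # True (x1 = False works)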
       
              math_dandy wrote 6 hours 0 min ago:
               Since all of these combinatorial optimization problems are, in
               full generality, as hard as possible in the complexity-theoretic
               sense, there is little prospect of AI/ML helping to solve them
               in general. (?)
              
              But people hope that some subclass of "real-life" instances of
              one of these optimization problems has enough structure that
              AI/ML may be able to learn useful heuristics for it. (This is not
              my field. This is just my take-away from reading some
              AI/ML-for-CO papers.)
       
            davidguetta wrote 10 hours 16 min ago:
             Mm, there's one purely based on position analysis and not tree
             search that plays at GM level in chess.  From DeepMind.
       
            radomir_cernoch wrote 12 hours 33 min ago:
            I'd love to read more. Do you know any sources?
       
              algo_trader wrote 9 hours 41 min ago:
               Try "Reinforcement Learning for Combinatorial Optimization: A
               Survey",
               
               or a more recent and spirited discussion on Reddit:
              
   URI        [1]: https://old.reddit.com/r/reinforcementlearning/comments/...
       
                nickpsecurity wrote 5 hours 30 min ago:
                Link for those who need it:
                
   URI          [1]: https://arxiv.org/abs/2003.03600
       
          unraveller wrote 20 hours 23 min ago:
           What are those other applications? So now that everyone has SotA
           warehouse scheduling in their pocket, no one can be expected to do
           anything or go anywhere inefficiently?
       
          jadbox wrote 20 hours 24 min ago:
          Remarkable indeed, I've been toying with many smaller, unorthodox
          use-cases for LLMs and continue to be surprised.
       
            littlestymaar wrote 5 hours 19 min ago:
             It uses transformers, but it's not an LLM though.
       
            jasonjmcghee wrote 17 hours 40 min ago:
            Can you say more? Where can I read about some of these alternate
            use cases?
       
              ebri wrote 14 hours 26 min ago:
              yes I want to learn about this too! Hadn't thought about it like
              that.
       
            smcin wrote 19 hours 10 min ago:
             I want to see an optimal (2P) speedrun with Sonic the Hedgehog and
             Tails, as a fun proof of concept.  There are two players working
             together, and it has more degrees of freedom than a Sokoban puzzle.
       
        yeldarb wrote 1 day ago:
         > While Transformers have enabled tremendous progress in various
        application settings, such architectures still lag behind traditional
        symbolic planners for solving complex decision making tasks. In this
        work, we demonstrate how to train Transformers to solve complex
        planning tasks and present Searchformer, a Transformer model that
        optimally solves previously unseen Sokoban puzzles 93.7% of the time,
        while using up to 26.8% fewer search steps than standard A∗ search.
        Searchformer is an encoder-decoder Transformer model trained to predict
        the search dynamics of A∗. This model is then fine-tuned via expert
        iterations to perform fewer search steps than A∗ search while still
        generating an optimal plan. In our training method, A∗'s search
        dynamics are expressed as a token sequence outlining when task states
        are added and removed into the search tree during symbolic planning. In
        our ablation studies on maze navigation, we find that Searchformer
        significantly outperforms baselines that predict the optimal plan
        directly with a 5-10× smaller model size and a 10× smaller training
        dataset. We also demonstrate how Searchformer scales to larger and more
        complex decision making tasks like Sokoban with improved percentage of
        solved tasks and shortened search dynamics.
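         
         Loosely, "search dynamics as a token sequence" just means flattening a
         log of the solver's create/close events, followed by the plan, into
         one long sequence the model is trained to emit.  A sketch of the idea
         (the vocabulary and the abbreviated event log below are made up for
         illustration, not the paper's exact format):
         
         # Hypothetical, abbreviated A* log for an open 3x3 grid, (0,0) -> (2,2);
         # each entry is (event, x, y, cost-so-far, heuristic).
         events = [
             ("create", 0, 0, 0, 4), ("close", 0, 0, 0, 4),
             ("create", 1, 0, 1, 3), ("create", 0, 1, 1, 3),
             ("close", 1, 0, 1, 3),  ("create", 2, 0, 2, 2),
             ("close", 2, 0, 2, 2),  ("create", 2, 1, 3, 1),
             ("close", 2, 1, 3, 1),  ("create", 2, 2, 4, 0),
             ("close", 2, 2, 4, 0),
         ]
         plan = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
         
         def to_tokens(events, plan):
             # Flatten the search dynamics, then the plan, into one sequence.
             tokens = ["<trace>"]
             for kind, x, y, g, h in events:
                 tokens += [kind, f"x{x}", f"y{y}", f"c{g}", f"h{h}"]
             tokens += ["<plan>"]
             for x, y in plan:
                 tokens += [f"x{x}", f"y{y}"]
             return tokens + ["<eos>"]
         
         print(" ".join(to_tokens(events, plan)))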
        
         Neat; TIL about Sokoban puzzles. I remember playing Chip's Challenge
         on Windows 3.1 when I was a kid; it had a lot of levels like that.
       
          Modified3019 wrote 7 hours 25 min ago:
          It's definitely a fun little puzzle.
          
          I was introduced to it ages ago via "Wyvern" ( [1] ) where I'd
          basically sit around playing it and chatting.
          
           Wyvern was heavily influenced by Crossfire ( [2] ), which had a
           not-so-functional version of it. Crossfire itself was influenced by
           NetHack, which apparently also has Sokoban levels.
          
   URI    [1]: https://web.archive.org/web/20040225005408/http://www.caboch...
   URI    [2]: https://crossfire.real-time.com/
       
          passion__desire wrote 17 hours 51 min ago:
           Jonathan Blow is working on a Sokoban-inspired puzzle game. I am
           looking forward to its release.
       
          smcin wrote 19 hours 7 min ago:
          Sokoban puzzles have a discrete set of actions, typically 8 (move
          U/D/L/R ; drag U/D/L/R), on a discrete grid.
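             
             In the classic rules that's just the four directions, where
             walking into a box pushes it (there is no pull).  A minimal sketch
             of that transition rule in Python (made-up toy level, no deadlock
             checks):
             
             LEVEL = ["#####",
                      "#@$.#",   # '@' player, '$' box, '.' open square
                      "#####"]
             WALLS = {(r, c) for r, row in enumerate(LEVEL)
                      for c, ch in enumerate(row) if ch == "#"}
             BOXES = {(r, c) for r, row in enumerate(LEVEL)
                      for c, ch in enumerate(row) if ch == "$"}
             PLAYER = next((r, c) for r, row in enumerate(LEVEL)
                           for c, ch in enumerate(row) if ch == "@")
             DIRS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
             
             def step(player, boxes, move):
                 # Return the new (player, boxes) state, or None if illegal.
                 dr, dc = DIRS[move]
                 ahead = (player[0] + dr, player[1] + dc)
                 if ahead in WALLS:
                     return None
                 if ahead in boxes:          # moving into a box pushes it
                     beyond = (ahead[0] + dr, ahead[1] + dc)
                     if beyond in WALLS or beyond in boxes:
                         return None         # blocked; there is no pull
                     boxes = (boxes - {ahead}) | {beyond}
                 return ahead, boxes
             
             print(step(PLAYER, BOXES, "R"))  # pushes the box one square right
             print(step(PLAYER, BOXES, "L"))  # None: wall to the left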
       
            airstrike wrote 9 hours 53 min ago:
            more like push than drag tbf
       
       
   DIR <- back to front page