_______ __ _______ | | |.---.-..----.| |--..-----..----. | | |.-----..--.--.--..-----. | || _ || __|| < | -__|| _| | || -__|| | | ||__ --| |___|___||___._||____||__|__||_____||__| |__|____||_____||________||_____| on Gopher (inofficial) URI Visit Hacker News on the Web COMMENT PAGE FOR: URI Searchformer: Beyond A* â Better planning with transformers via search dynamics teleforce wrote 18 hours 5 min ago: Previous post and discussions on HN: Beyond A*: Better Planning with Transformers: URI [1]: https://news.ycombinator.com/item?id=39479478 nextaccountic wrote 19 hours 28 min ago: Due to the no free lunch theorem [0], any search algorithm that makes some problems faster will necessarily make other problems slower. How does the worst case for an algorithm like this look like? I think that part of the appeal of A* to me is that I can readily visualize why the algorithm failed at some pathological inputs. [0] URI [1]: https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_opti... blamestross wrote 13 hours 15 min ago: Well you have to run the AI, which is significantly more costly than just running A* Also, the AI has a finite size, which means it has to be scaled up for bigger graphs. I doubt it would work at all for non-euclidian 2d graphs. The exciting thing is that they made an AI do a toy problem, not that it was efficient. But people are dumb and tech companies have brain control over them. "AI good and solve all problems" It reminds me of the "slime mold solves mazes" thing. It was interesting and there were even things to learn from it, but it wasn't magic and couldn't give us better general tools. IanCal wrote 14 hours 54 min ago: NFL theorem is frankly silly. It boils down to "you can't predict random numbers". I'm sure it's useful to have formalized but it has no real impact on real world situations because most problems we care about have some kind of structure we can exploit. woopsn wrote 4 hours 41 min ago: Yes significantly. But take for example the stock market, which is obviously of interest, to OAI in particular, but which cannot over the medium-long term be extrapolated accurately from historical data. nextaccountic wrote 14 hours 37 min ago: Well when I asked "How does the worst case for an algorithm like this look like?" I fully expected the answer to be "in practice this never happens" (that is, there are bad cases but since we aren't drawing inputs at random, in practice we never hit them) But how likely is that? reaperman wrote 15 hours 10 min ago: The "No free lunch" theorem does have some prequisites for it to actually apply. > There is no free lunch in search if and only if the distribution on objective functions is invariant under permutation of the space of candidate solutions. kevindamm wrote 19 hours 14 min ago: Well, it's still using A* and state exploration to validate the plans generated. But, it's generating plans nondeterministically so it is possible it could run in O(\inf). Why didn't they give more details about the Sokoban puzzles being solved? The standard benchmark is the 90 puzzles in XSokoban and there's a Large Test Suite that has a few thousand but many of them haven't been solved by any search or planning system. Even the 90 puzzles in XSokoban were only solved a few years ago (2020?) by Festival (using FESS, a feature-space search in addition to domain-space, state-space). Before that it had been about 20 years of only 57 or so levels solved via search. I see that they measure trace lengths but I would have really liked to see # of XSokoban levels and how many states were examined, to line up with existing research. a_wild_dandan wrote 21 hours 32 min ago: Ah, I remember reading this paper! Essentially, they created synthetic data by solving search problems using A*. Trained on this data, transformers unsurprisingly learned to solve these problems. They then improved their synthetic data by repeatedly solving a given problem many times with A*, and keeping only the shortest step solution. Transformers learned to be competitive with this improved search heuristic too! Pretty remarkable stuff. Given the obsession over chat bots, folk often miss how revolutionary transformer sequence modeling is to...well, any sequence application that isn't a chat bot. Looking solely at speeding up scientific simulations by ~10x, it's a watershed moment for humanity. When you include the vast space of other applications, we're in for one wild ride, y'all. electricships wrote 13 hours 17 min ago: After AlphaZero, people tried to train various combinatorial solvers IIRC, they were unable to exceed Sota of SAT solvers (which admittedly are the results of decades of research) maybe we just needed bigger networks, or transformers, or just more training compute lou1306 wrote 8 hours 26 min ago: SAT appears to be rather impervious to statistical methods because it is very non-linear and non-local (besides, of course, being NP-complete). Flip one single literal anywhere and poof!, your formula mght go from SAT to UNSAT, or vice versa, or you might enable a ton of simplifications, etc. So it might be very hard to train a network on such a riotous state space. Additionally, in practice SAT solvers are also supposed to provide witnesses (i.e., a satisfying assignment or an unsatisfiability core) to back their verdicts, and that may be an ever taller order for ML. math_dandy wrote 6 hours 0 min ago: Since all of these combinatorial optimization problems are, in full generality, are as difficult as possible in the complexity theoretic sense, there is little prospect of AI/ML helping to solve these problems in general. (?) But people hope that some subclass of "real-life" instances of one of these optimization problems has enough structure that AI/ML may be able to learn useful heuristics for it. (This is not my field. This is just my take-away from reading some AI/ML-for-CO papers.) davidguetta wrote 10 hours 16 min ago: Mm theres one purely based on position analysis and not tree search that has GM level in chess. From deepmind radomir_cernoch wrote 12 hours 33 min ago: I'd love to read more. Do you know any sources? algo_trader wrote 9 hours 41 min ago: Try Reinforcement Learning for Combinatorial Optimization: A Survey or a more recent and spirited discussion on reddit.. URI [1]: https://old.reddit.com/r/reinforcementlearning/comments/... nickpsecurity wrote 5 hours 30 min ago: Link for those who need it: URI [1]: https://arxiv.org/abs/2003.03600 unraveller wrote 20 hours 23 min ago: what are those other applications? so now everyone has SoTA warehouse scheduling in their pockets no one can be expected to do anything or go anywhere inefficiently. jadbox wrote 20 hours 24 min ago: Remarkable indeed, I've been toying with many smaller, unorthodox use-cases for LLMs and continue to be surprised. littlestymaar wrote 5 hours 19 min ago: It uses transformers but it's not an llm though. jasonjmcghee wrote 17 hours 40 min ago: Can you say more? Where can I read about some of these alternate use cases? ebri wrote 14 hours 26 min ago: yes I want to learn about this too! Hadn't thought about it like that. smcin wrote 19 hours 10 min ago: I want to see an optimal (2P) speedrun with Sonic the Hedgehog and Tails. As a fun proof. There are two players working together; also it has more degrees-of-freedom than a Sokoban. yeldarb wrote 1 day ago: > While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks and present Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard Aâ search. Searchformer is an encoder-decoder Transformer model trained to predict the search dynamics of Aâ. This model is then fine-tuned via expert iterations to perform fewer search steps than Aâ search while still generating an optimal plan. In our training method, Aâ's search dynamics are expressed as a token sequence outlining when task states are added and removed into the search tree during symbolic planning. In our ablation studies on maze navigation, we find that Searchformer significantly outperforms baselines that predict the optimal plan directly with a 5-10Ã smaller model size and a 10Ã smaller training dataset. We also demonstrate how Searchformer scales to larger and more complex decision making tasks like Sokoban with improved percentage of solved tasks and shortened search dynamics. Neat; TIL about Sokoban puzzles. I remember playing Chip's Challenge on Windows 3.1 when I was a kid which had a lot of levels like that. Modified3019 wrote 7 hours 25 min ago: It's definitely a fun little puzzle. I was introduced to it ages ago via "Wyvern" ( [1] ) where I'd basically sit around playing it and chatting. Wyvern was heavily influenced by Crossfire ( [2] ) which had a not so functional version. Crossfire itself was influenced by nethack, which apparently also has sokoban levels. URI [1]: https://web.archive.org/web/20040225005408/http://www.caboch... URI [2]: https://crossfire.real-time.com/ passion__desire wrote 17 hours 51 min ago: Jonathan Blow is working on "Sokoban"-Inspired Puzzle Game. I am looking forward to its release. smcin wrote 19 hours 7 min ago: Sokoban puzzles have a discrete set of actions, typically 8 (move U/D/L/R ; drag U/D/L/R), on a discrete grid. airstrike wrote 9 hours 53 min ago: more like push than drag tbf DIR <- back to front page