Hacker News on Gopher (unofficial)

COMMENT PAGE FOR: The Math Behind GANs (2020)

ilzmastr wrote 7 hours 22 min ago:
If you ever wondered about the generalization to multiple classes, there is a reason that the GANs look totally different [1]. It turns out two classes is special. Better to add the classes as side information rather than try to make them part of the main objective.

[1]: https://proceedings.mlr.press/v137/kavalerov20a/kavalerov20a.p...

staticelf wrote 10 hours 52 min ago:
Reading an article like this makes me realize I am too stupid to ever build a foundation model from scratch.

woah wrote 7 hours 21 min ago:
The big zig-zaggy "E" is a for loop. That's all you really have to know.
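To make woah's point concrete: the expectation symbol in the GAN objective, E_{x ~ p_data}[log D(x)], is estimated in every training loop as a plain mean over a minibatch. A minimal Python sketch (the discriminator D here is an assumed stand-in for any callable returning a "realness" score in (0, 1), not code from the article):

    import math

    def expected_log_D(D, batch):
        # E_{x ~ p_data}[log D(x)]: the "zig-zaggy E" really is a for loop.
        total = 0.0
        for x in batch:
            total += math.log(D(x))  # D(x): assumed realness score in (0, 1)
        return total / len(batch)    # the sample mean approximates the expectation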
oersted wrote 9 hours 6 min ago:
Paper authors (and this post's author, apparently) like to throw in lots of scary-looking maths to signal that they are smart and that what they are doing has merit. The Reinforcement Learning field is particularly notorious for doing this, but it's all over ML. Often it is not on purpose: everyone is taught that this is the proper "formal" way to express these things, and that any other representation is not precise or appropriate in a scientific context.

In practice, when it comes down to code, even without higher-level libraries, it is surprisingly simple, concise and intuitive. Most of the math elements used have quite straightforward properties and utility, but of course if you combine them all into big expressions with lots of single-character variables, it's really hard for anyone to understand. You kind of need to learn to squint your eyes and see the basic building blocks that the maths represent, but that shouldn't be necessary if it weren't obfuscated like this.

catgary wrote 7 hours 42 min ago:
I'm going to push back on this a bit. I think a simpler explanation (or at least one that doesn't involve projecting one's own insecurities onto the authors) is that the people who write these papers are generally comfortable enough with mathematics that they don't believe anything has been obfuscated. ML is a mathematical science, and many people in ML were trained as physicists or mathematicians (I'm one of them). People write things this way because it makes symbolic manipulations easier and you can keep the full expression in your head; what you're proposing would actually make it significantly harder to verify results in papers.

Garlef wrote 7 hours 19 min ago:
Maybe. But my experience as a mathematician tells me another part of that story: certain fields are much more used to consuming (and producing) visual noise in their notation! Some fields even have superfluous parts in their definitions and keep them around out of tradition.

It's just as with code: not everyone values writing readable code highly. Some are fine with 200-line function bodies. And refactoring mathematics is even harder: there's no single codebase, and the old papers don't disappear.

catgary wrote 5 hours 19 min ago:
Maybe! I've found that people usually don't do extra work if they don't need to. The heavy notation in differential geometry, for example, can be awfully helpful when you're actually trying to do Lagrangian mechanics on a Riemannian manifold.

And superfluous bits of a definition might be kept around because going from the minimal definition to the one that is actually useful in practice can sometimes be non-trivial, so you'll just keep the "superfluous" definition in your head.

voidhorse wrote 7 hours 33 min ago:
Agreed. Also, fwiw, the mathematics involved in the paper is pretty simple as far as mathematical sophistication goes. Spend two to three months on one "higher level" maths course of your choosing and you'll be able to fully understand every equation in this paper relatively easily. Even a basic course in information theory coupled with some discrete maths should give you essentially all you need to comprehend the math in this post. The concepts being presented here are not mysterious, and much of this math is banal. Mathematical notation can seem foreboding, but once you grasp it, you'll see, like von Neumann said, that life is complicated but math is simple.

gcanyon wrote 5 hours 48 min ago:
> like von Neumann said, that life is complicated but math is simple

Maybe for von Neumann math was simple...

MattPalmer1086 wrote 8 hours 42 min ago:
Haha, I recognise this. I invented a fast search algorithm and worked with some academics to publish a paper on it last year. They threw all the complex math into the paper. I could not initially understand it at all, despite inventing the damn algorithm!

Having said that, picking it apart and taking a little time with it, it actually wasn't that hard, but it sure looked scary and incomprehensible at first!

hoppp wrote 10 hours 24 min ago:
It takes a while to get into; just like with everything, determination is key. Also, there are libraries that abstract away most if not all of the details, so you don't have to know everything.

staticelf wrote 10 hours 12 min ago:
That's the thing, it's too hard to learn, so I'd rather do something else with the limited time I have left.

gregorygoc wrote 9 hours 26 min ago:
I come from a country which had a strong Soviet influence, and in school we were basically taught that behind every hard formula lies an intuitive explanation, as otherwise there would be no way to come up with the formula in the first place. This statement is not true (there are counterexamples I encountered in my university studies), but I would say that intuition will get you very far. Einstein was able to come up with the special theory of relativity by just manipulating mental models, after all. Only when he tried to generalize it did he hit the limit of the claim I learned in school. That being said, after abandoning intuition, relying on pure mathematical reasoning drives you to the desired place, and from there you can usually reason about the theorem in an intuitive way again.

The math in this paper is not that hard to learn, you just need someone to present you the key idea.

aeonik wrote 6 hours 53 min ago:
I wasn't taught this, but came to the same conclusion after much struggle, and I think this mentality has served me very well. I hope anyone who is unsure will read your comment and at least try to follow it for a while.

reactordev wrote 10 hours 27 min ago:
Haha, I was just going to say the same. I was hoping, I guess naively, that this would explain the math, not just show me math. While I love a good figure, I like pseudocode just as much :)
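Since a couple of comments above ask for exactly this: the entire minimax objective behind the article reduces to a few loss lines per training step. A sketch in PyTorch, assuming G, D and the two optimizers already exist (it uses the non-saturating generator loss recommended in the original GAN paper; eps is just numerical padding inside the logs):

    import torch

    def gan_step(G, D, real, opt_G, opt_D, z_dim=64):
        # One update of the vanilla GAN game:
        #   min_G max_D  E[log D(x)] + E[log(1 - D(G(z)))]
        batch, eps = real.size(0), 1e-8

        # Discriminator step: ascend the objective by minimizing its negation.
        z = torch.randn(batch, z_dim)
        fake = G(z).detach()                      # block gradients into G here
        loss_D = -(torch.log(D(real) + eps).mean()
                   + torch.log(1 - D(fake) + eps).mean())
        opt_D.zero_grad(); loss_D.backward(); opt_D.step()

        # Generator step: non-saturating variant, maximize E[log D(G(z))].
        z = torch.randn(batch, z_dim)
        loss_G = -torch.log(D(G(z)) + eps).mean()
        opt_G.zero_grad(); loss_G.backward(); opt_G.step()

        return loss_D.item(), loss_G.item()

Run in a loop over minibatches, that is essentially the whole algorithm; the math in the article is mostly about why, at the optimal discriminator, this game amounts to minimizing a Jensen-Shannon divergence.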
colesantiago wrote 10 hours 54 min ago:
Aren't GANs, like, ancient? The last time I used a GAN was in 2015. Still interesting to see a post about GANs now and then.

mindcrime wrote 6 hours 10 min ago:
Turing machines are ancient as well.

programjames wrote 7 hours 1 min ago:
They're used as a small regularization term in image/audio decoders. But GANs have a different learning dynamic (Z6 rather than Z1 or Z2), which makes them pretty unstable to train unless you're using something like Bayesian neural networks, so they fell out of favor for the entire image generation process.

gchadwick wrote 8 hours 24 min ago:
Whilst it's maybe not worth studying them in detail, I'd say being aware of their existence and roughly how they work is still useful. Seeing the many varied ways people have done things with neural networks can be useful inspiration for your own ideas, and perhaps the ideas and techniques behind GANs will find a new life or a new purpose.

Yes, you can just concentrate on the latest models, but if you want a better grounding in the field, some understanding of the past is important. In particular, reusing ideas from the past in a new way and/or with better software/hardware/datasets is a common source of new developments!

aDyslecticCrow wrote 8 hours 28 min ago:
GAN is not an architecture, it's a training method. As the models themselves change underneath, GANs remain relevant. (Just as you still see "autoencoder" used as a term in newly published work, which is even older.) Though if you can rephrase the problem as diffusion, that seems to be preferred these days (less prone to mode collapse).

GANs are famously used for generative use cases, but they have wide uses for creating useful latent spaces with limited data, and show up in few-shot learning papers. (I'm actually not that up to speed on the state of the art in few-shot, so maybe they have something clever that replaces it.)

sylos wrote 9 hours 20 min ago:
The article is from 2020, so it was closer to relevant back then.

black_puppydog wrote 10 hours 19 min ago:
Yeah, the title needs (2020) added. GANs were fun though. :)

radarsat1 wrote 10 hours 48 min ago:
Whenever someone says this, I like to point out that they are very often used to train the VAE and VQVAE models that LDM models use. Diffusion is slowly encroaching on this territory with 1-step models, however, and there are now alternative methods to generate rich latent spaces and decoders too, so this is changing. But I'd say that up until last year most of the image generators still used an adversarial objective for the encoder-decoder training. This year, not sure.

pilooch wrote 9 hours 23 min ago:
Exactly. For real-time applications (VTO, simulators, ...), i.e. 60+ FPS, diffusion can't be used efficiently. The gap is still there afaik. One lead has been to distill DPMs into GANs; not sure this works for GANs that are small enough for real time.

lukeinator42 wrote 9 hours 28 min ago:
They're also used a lot for training current TTS and audio codec models to output speech that sounds realistic.

GaggiX wrote 10 hours 50 min ago:
Adversarial loss is still used in most image generators. Diffusion/autoregressive models work on a latent space (they don't have to, but it would be incredibly inefficient) created by an autoencoder, and these autoencoders are trained on several losses, usually L1/L2, LPIPS and adversarial.
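For readers wondering what that mixed objective looks like when wired up, a rough sketch of the weighted sum (every module name and weight below is an illustrative stand-in, not any particular model's real API):

    import torch

    def autoencoder_loss(enc, dec, disc, lpips, x,
                         w_rec=1.0, w_per=1.0, w_adv=0.1):
        # Combined training objective for a latent-space autoencoder:
        # pixel reconstruction + perceptual (LPIPS) + adversarial terms.
        x_hat = dec(enc(x))                              # reconstruct the input
        l_rec = (x - x_hat).abs().mean()                 # L1 pixel loss
        l_per = lpips(x, x_hat).mean()                   # perceptual distance
        l_adv = -torch.log(disc(x_hat) + 1e-8).mean()    # "fool the discriminator" term
        return w_rec * l_rec + w_per * l_per + w_adv * l_adv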