Hacker News on Gopher (unofficial)

COMMENT PAGE FOR: The Math Behind GANs (2020)

ilzmastr wrote 7 hours 22 min ago:
If you ever wondered about the generalization to multiple classes, there is a reason that the GANs look totally different [1]. It turns out two classes is special. Better to add the classes as side information rather than try to make them part of the main objective.

[1]: https://proceedings.mlr.press/v137/kavalerov20a/kavalerov20a.p...

staticelf wrote 10 hours 52 min ago:
Reading an article like this makes me realize I am too stupid to ever build a foundation model from scratch.

woah wrote 7 hours 21 min ago:
The big zig-zaggy "E" is a for loop. That's all you really have to know.
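To make woah's point concrete: the expectation symbol in the GAN objective, E_{x ~ p_data}[log D(x)], is estimated in every training loop as a plain mean over a minibatch. A minimal Python sketch (the discriminator D here is an assumed stand-in for any callable returning a "realness" score in (0, 1), not code from the article):

    import math

    def expected_log_D(D, batch):
        # E_{x ~ p_data}[log D(x)]: the "zig-zaggy E" really is a for loop.
        total = 0.0
        for x in batch:
            total += math.log(D(x))  # D(x): assumed realness score in (0, 1)
        return total / len(batch)    # the sample mean approximates the expectation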
oersted wrote 9 hours 6 min ago:
Paper authors (and this post's author, apparently) like to throw in lots of scary-looking maths to signal that they are smart and that what they are doing has merit. The Reinforcement Learning field is particularly notorious for doing this, but it's all over ML. Often it is not on purpose: everyone is taught that this is the proper "formal" way to express these things, and that any other representation is not precise or appropriate in a scientific context.

In practice, when it comes down to code, even without higher-level libraries, it is surprisingly simple, concise and intuitive. Most of the math elements used have quite straightforward properties and utility, but of course if you combine them all into big expressions with lots of single-character variables, it's really hard for anyone to understand. You kind of need to learn to squint your eyes and see the basic building blocks that the maths represent, but that shouldn't be necessary if it weren't obfuscated like this.

catgary wrote 7 hours 42 min ago:
I'm going to push back on this a bit. I think a simpler explanation (or at least one that doesn't involve projecting one's own insecurities onto the authors) is that the people who write these papers are generally comfortable enough with mathematics that they don't believe anything has been obfuscated. ML is a mathematical science, and many people in ML were trained as physicists or mathematicians (I'm one of them). People write things this way because it makes symbolic manipulations easier and you can keep the full expression in your head; what you're proposing would actually make it significantly harder to verify results in papers.

Garlef wrote 7 hours 19 min ago:
Maybe. But my experience as a mathematician tells me another part of that story: certain fields are much more used to consuming (and producing) visual noise in their notation! Some fields even have superfluous parts in their definitions and keep them around out of tradition.

It's just as with code: not everyone values writing readable code highly. Some are fine with 200-line function bodies. And refactoring mathematics is even harder: there's no single codebase, and the old papers don't disappear.

catgary wrote 5 hours 19 min ago:
Maybe! I've found that people usually don't do extra work if they don't need to. The heavy notation in differential geometry, for example, can be awfully helpful when you're actually trying to do Lagrangian mechanics on a Riemannian manifold.

And superfluous bits of a definition might be kept around because going from the minimal definition to the one that is actually useful in practice can sometimes be non-trivial, so you'll just keep the "superfluous" definition in your head.

voidhorse wrote 7 hours 33 min ago:
Agreed. Also, fwiw, the mathematics involved in the paper is pretty simple as far as mathematical sophistication goes. Spend two to three months on one "higher level" maths course of your choosing and you'll be able to fully understand every equation in this paper relatively easily. Even a basic course in information theory coupled with some discrete maths should give you essentially all you need to comprehend the math in this post. The concepts being presented here are not mysterious, and much of this math is banal. Mathematical notation can seem foreboding, but once you grasp it, you'll see, like von Neumann said, that life is complicated but math is simple.

gcanyon wrote 5 hours 48 min ago:
> like von Neumann said, that life is complicated but math is simple

Maybe for von Neumann math was simple...

MattPalmer1086 wrote 8 hours 42 min ago:
Haha, I recognise this. I invented a fast search algorithm and worked with some academics to publish a paper on it last year. They threw all the complex math into the paper. I could not initially understand it at all, despite inventing the damn algorithm!

Having said that, picking it apart and taking a little time with it, it actually wasn't that hard, but it sure looked scary and incomprehensible at first!

hoppp wrote 10 hours 24 min ago:
It takes a while to get into; just like with everything, determination is key. Also, there are libraries that abstract away most if not all of the details, so you don't have to know everything.

staticelf wrote 10 hours 12 min ago:
That's the thing, it's too hard to learn, so I'd rather do something else with the limited time I have left.

gregorygoc wrote 9 hours 26 min ago:
I come from a country which had a strong Soviet influence, and in school we were basically taught that behind every hard formula lies an intuitive explanation, as otherwise there would be no way to come up with the formula in the first place. This statement is not true (there are counterexamples I encountered in my university studies), but I would say that intuition will get you very far. Einstein was able to come up with the special theory of relativity by just manipulating mental models, after all. Only when he tried to generalize it did he hit the limit of the claim I learned in school. That being said, after abandoning intuition, relying on pure mathematical reasoning drives you to the desired place, and from there you can usually reason about the theorem in an intuitive way again.

The math in this paper is not that hard to learn, you just need someone to present you the key idea.

aeonik wrote 6 hours 53 min ago:
I wasn't taught this, but came to the same conclusion after much struggle, and I think this mentality has served me very well. I hope anyone who is unsure will read your comment and at least try to follow it for a while.

reactordev wrote 10 hours 27 min ago:
Haha, I was just going to say the same. I was hoping, I guess naively, that this would explain the math, not just show me math. While I love a good figure, I like pseudocode just as much :)
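Since a couple of comments above ask for exactly this: the entire minimax objective behind the article reduces to a few loss lines per training step. A sketch in PyTorch, assuming G, D and the two optimizers already exist (it uses the non-saturating generator loss recommended in the original GAN paper; eps is just numerical padding inside the logs):

    import torch

    def gan_step(G, D, real, opt_G, opt_D, z_dim=64):
        # One update of the vanilla GAN game:
        #   min_G max_D  E[log D(x)] + E[log(1 - D(G(z)))]
        batch, eps = real.size(0), 1e-8

        # Discriminator step: ascend the objective by minimizing its negation.
        z = torch.randn(batch, z_dim)
        fake = G(z).detach()                      # block gradients into G here
        loss_D = -(torch.log(D(real) + eps).mean()
                   + torch.log(1 - D(fake) + eps).mean())
        opt_D.zero_grad(); loss_D.backward(); opt_D.step()

        # Generator step: non-saturating variant, maximize E[log D(G(z))].
        z = torch.randn(batch, z_dim)
        loss_G = -torch.log(D(G(z)) + eps).mean()
        opt_G.zero_grad(); loss_G.backward(); opt_G.step()

        return loss_D.item(), loss_G.item()

Run in a loop over minibatches, that is essentially the whole algorithm; the math in the article is mostly about why, at the optimal discriminator, this game amounts to minimizing a Jensen-Shannon divergence.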
colesantiago wrote 10 hours 54 min ago:
Aren't GANs, like, ancient? The last time I used a GAN was in 2015. Still interesting to see a post about GANs now and then.

mindcrime wrote 6 hours 10 min ago:
Turing machines are ancient as well.

programjames wrote 7 hours 1 min ago:
They're used as a small regularization term in image/audio decoders. But GANs have a different learning dynamic (Z6 rather than Z1 or Z2), which makes them pretty unstable to train unless you're using something like Bayesian neural networks, so they fell out of favor for the entire image generation process.

gchadwick wrote 8 hours 24 min ago:
Whilst it's maybe not worth studying them in detail, I'd say being aware of their existence and roughly how they work is still useful. Seeing the many varied ways people have done things with neural networks can be useful inspiration for your own ideas, and perhaps the ideas and techniques behind GANs will find a new life or a new purpose.

Yes, you can just concentrate on the latest models, but if you want a better grounding in the field, some understanding of the past is important. In particular, reusing ideas from the past in a new way and/or with better software/hardware/datasets is a common source of new developments!

aDyslecticCrow wrote 8 hours 28 min ago:
GAN is not an architecture, it's a training method. As the models themselves change underneath, GANs remain relevant. (Just as you still see "autoencoder" used as a term in newly published work, which is even older.) Though if you can rephrase the problem as diffusion, that seems to be preferred these days (less prone to mode collapse).

GANs are famously used for generative use cases, but they have wide uses for creating useful latent spaces with limited data, and show up in few-shot learning papers. (I'm actually not that up to speed on the state of the art in few-shot, so maybe they have something clever that replaces it.)

sylos wrote 9 hours 20 min ago:
The article is from 2020, so it was closer to relevant back then.

black_puppydog wrote 10 hours 19 min ago:
Yeah, the title needs (2020) added. GANs were fun though. :)

radarsat1 wrote 10 hours 48 min ago:
Whenever someone says this, I like to point out that they are very often used to train the VAE and VQVAE models that LDM models use. Diffusion is slowly encroaching on this territory with 1-step models, however, and there are now alternative methods to generate rich latent spaces and decoders too, so this is changing. But I'd say that up until last year most of the image generators still used an adversarial objective for the encoder-decoder training. This year, not sure.

pilooch wrote 9 hours 23 min ago:
Exactly. For real-time applications (VTO, simulators, ...), i.e. 60+ FPS, diffusion can't be used efficiently. The gap is still there afaik. One lead has been to distill DPMs into GANs; not sure this works for GANs that are small enough for real time.

lukeinator42 wrote 9 hours 28 min ago:
They're also used a lot for training current TTS and audio codec models to output speech that sounds realistic.

GaggiX wrote 10 hours 50 min ago:
Adversarial loss is still used in most image generators. Diffusion/autoregressive models work on a latent space (they don't have to, but it would be incredibly inefficient) created by an autoencoder, and these autoencoders are trained on several losses, usually L1/L2, LPIPS and adversarial.
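For readers wondering what that mixed objective looks like when wired up, a rough sketch of the weighted sum (every module name and weight below is an illustrative stand-in, not any particular model's real API):

    import torch

    def autoencoder_loss(enc, dec, disc, lpips, x,
                         w_rec=1.0, w_per=1.0, w_adv=0.1):
        # Combined training objective for a latent-space autoencoder:
        # pixel reconstruction + perceptual (LPIPS) + adversarial terms.
        x_hat = dec(enc(x))                              # reconstruct the input
        l_rec = (x - x_hat).abs().mean()                 # L1 pixel loss
        l_per = lpips(x, x_hat).mean()                   # perceptual distance
        l_adv = -torch.log(disc(x_hat) + 1e-8).mean()    # "fool the discriminator" term
        return w_rec * l_rec + w_per * l_per + w_adv * l_adv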