       
       
       COMMENT PAGE FOR:
   URI   Pass-by-Value Overhead
       
       
        jklowden wrote 5 hours 40 min ago:
        There is no pass-by-value overhead.  There are only implementation
        decisions.
        
         Pass by value describes the semantics of a function call, not its
         implementation.  Passing a const reference in C++ is pass-by-value.
         If the user opts to pass "a copy" instead, nothing requires the
         compiler to actually copy the data.  The compiler is required only
         to supply the actual parameter as if it were copied.
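
         A minimal C++ sketch of that "as if copied" point (illustrative
         only; the struct and function names below are made up): when the
         callee's definition is visible and gets inlined, the compiler may
         drop the physical copy, as long as the observable behaviour is
         that of a copy.

             #include <array>
             #include <numeric>

             struct Big {                     // illustrative ~4 KiB payload
                 std::array<int, 1024> data;
             };

             // Pass-by-value semantics: 'b' behaves like a private copy.
             static long sum(Big b) {
                 return std::accumulate(b.data.begin(), b.data.end(), 0L);
             }

             long caller(const Big& input) {
                 // With sum() visible here, the compiler may inline it and
                 // read 'input' directly; no copy has to exist as long as
                 // the result is the same as if one had been made.
                 return sum(input);
             }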
       
          mattnewport wrote 5 hours 0 min ago:
           This might be true in the abstract, but it's not true of actual
           compilers dealing with real-world calling conventions. Absent
           inlining or whole-program optimization, calling conventions across
           translation units don't leave much room for flexibility.
          
          The semantics of pass by const reference are also not exactly the
          same as pass by value in C++. The compiler can't in general assume a
          const reference doesn't alias other arguments or global variables and
          so has to be more conservative with certain optimizations than with
          pass by value.
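
           A small C++ sketch of that aliasing difference (the function
           names are made up): with a const reference the parameter may
           alias the output buffer, so the compiler has to assume it can
           change mid-loop; a by-value copy cannot alias anything.

               #include <cstddef>

               // 'v' may alias 'out', so the compiler must be prepared to
               // reload it on every iteration in case a store changed it.
               void fill_ref(const int& v, int* out, std::size_t n) {
                   for (std::size_t i = 0; i < n; ++i)
                       out[i] = v;
               }

               // 'v' is a private copy and cannot alias 'out', so it can
               // stay in a register for the whole loop.
               void fill_val(int v, int* out, std::size_t n) {
                   for (std::size_t i = 0; i < n; ++i)
                       out[i] = v;
               }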
       
          duped wrote 5 hours 6 min ago:
          Unfortunately "the compiler is required to supply the actual
          parameter as if it was copied" is leaky with respect to the ABI and
          linker. In C and C++ you cannot fully abstract it.
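
           For illustration, a sketch of how the ABI pins this down,
           assuming the x86-64 System V calling convention (the types and
           sizes are made up): once the callee lives in another translation
           unit, the calling convention, not the as-if rule, decides how
           each argument travels.

               // shared header (illustrative)
               struct Small { int a, b; };      // 8 bytes: passed in a register
               struct Large { int data[64]; };  // 256 bytes: class MEMORY, the
                                                // caller copies it onto the stack

               int use_small(Small s);          // defined in another TU
               int use_large(Large l);          // defined in another TU

               int caller(const Small& s, const Large& l) {
                   // Without inlining or LTO the callee bodies are invisible,
                   // so the convention fixes how (and whether) each argument
                   // is copied before the call.
                   return use_small(s) + use_large(l);
               }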
       
        codedokode wrote 7 hours 53 min ago:
         I usually use ChatGPT for such microbenchmarks (of course I design
         them myself and use the LLM only as a dumb code generator), so I
         don't need to remember how to measure time with nanosecond
         precision. I still have to add workarounds to prevent the compiler
         from over-optimizing the code. It's amazing that when you get
         curious (for example, what is the fastest way to find an int in a
         small sorted array: linear search, binary search, or a branchless
         full scan?) you can get the answer in a couple of minutes instead
         of spending 20-30 minutes writing the code manually.
        
         By the way, the fastest was the branchless linear scan up to 32-64
         elements, as far as I remember.
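
         For reference, a common shape of that branchless scan (a sketch,
         not the commenter's code): instead of breaking out of the loop,
         count how many elements are smaller than the key, which compiles
         to flag/conditional-move style code rather than data-dependent
         branches.

             #include <cstddef>

             // Branchless lower bound over a small sorted array.
             std::size_t lower_bound_branchless(const int* a, std::size_t n,
                                                int key) {
                 std::size_t pos = 0;
                 for (std::size_t i = 0; i < n; ++i)
                     pos += (a[i] < key);   // 0 or 1, no branch per element
                 return pos;                // index of first element >= key
             }

         For a small fixed n the compiler will typically unroll or vectorize
         the loop, which is part of why it can beat binary search at these
         sizes.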
       
        anonymous908213 wrote 9 hours 15 min ago:
        > Don’t pass around data of size 4046-4080 bytes or 8161-8176 bytes,
        by value (at least not on an AMD Ryzen 3900X).
        
        What a fascinating CPU bug. I am quite curious as to how that came to
        pass.
       
          srcmax wrote 2 hours 56 min ago:
           That's called 4K aliasing: it occurs when you store to one memory
           location and then load from another memory location that is offset
           from the original by 4 KB.
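
           An illustrative-only sketch of that access pattern (how much it
           actually costs depends on the microarchitecture): each iteration
           stores to one address and immediately loads from an address
           exactly 4 KiB away, so the low 12 address bits match and the
           store-to-load conflict check can speculate a dependence that is
           not really there.

               #include <cstddef>
               #include <cstdint>
               #include <vector>

               std::uint64_t touch_4k_apart(std::size_t n) {
                   std::vector<std::uint8_t> buf(n + 4096);
                   std::uint64_t sum = 0;
                   for (std::size_t i = 0; i < n; ++i) {
                       buf[i] = static_cast<std::uint8_t>(i);  // store to buf + i
                       sum += buf[i + 4096];        // load from buf + i + 4096
                   }
                   return sum;
               }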
       
          jasonthorsness wrote 5 hours 47 min ago:
          Apparently some sizes are cursed!
          
           It would be great to repeat the author’s tests on other CPU
           models.
       
            TuxSH wrote 3 hours 39 min ago:
            I wonder what the page size is on his system (and what effective
            alignment his pointers have). If it's 4K, the sizes look really
            close to 0x1000 and 0x2000 - maybe crossing page boundaries?
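
             One way to check both, as a POSIX-only sketch:

                 #include <unistd.h>   // sysconf
                 #include <cstdint>
                 #include <cstdio>

                 int main() {
                     // Page size of the running system (4096 on typical
                     // x86-64 Linux).
                     long page = sysconf(_SC_PAGESIZE);
                     int probe = 0;
                     auto addr = reinterpret_cast<std::uintptr_t>(&probe);
                     std::printf("page size: %ld bytes\n", page);
                     std::printf("offset within page: %llu bytes\n",
                                 static_cast<unsigned long long>(
                                     addr % static_cast<std::uintptr_t>(page)));
                 }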
       
          sgarland wrote 6 hours 49 min ago:
          Me too, and I hope this article gets more traction.
       
       