gopher://codevoid.de/1/hn/comments

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (unofficial)
       
       
       COMMENT PAGE FOR:
   URI   Anything can be a message queue if you use it wrongly enough (2023)
       
       
        1718627440 wrote 2 hours 20 min ago:
        I don't really know how S3 works, but either it's a single system, in
        which case you reinvented the use case of local sockets, or, what is
        way more likely, it's already implemented on a TCP/IP network, in
        which case congratulations, you just invented IP over IP.
       
        DoneWithAllThat wrote 4 hours 39 min ago:
        Corollary: every message queue can be a database if you use it wrongly
        enough.
       
          rcleveng wrote 3 hours 38 min ago:
          Generalized a bit: everything can be a database if you use it wrongly
          enough.
       
        adamcharnock wrote 4 hours 47 min ago:
        I had a developer colleague a while back who was toying with an idea
        that would require emitting and consuming a _lot_ of messages. I think
        it was somewhere on the order of 10k-100k/second. He was looking at
        some pretty expensive solutions IIRC.
        
        I asked if the messages were all under 1.5 kB, he said yes. I asked if
        at-most-once delivery was ok, he said yes. So I proposed he just grab a
        router and fire messages through it as UDP packets, then use BGP/ECMP
        to balance the packets between receivers. Add some queues on the
        router, then just let the receivers pick up the packets as fast as they
        could. You'd need some kind of feedback to manage back pressure, but
        ¯\_(ツ)_/¯
        
        A fairly cheap way to achieve 1M+ messages per second.
        
        I never got the chance to flesh out the idea fully, but the simplicity
        of it tickled me. Maybe it would have worked, maybe not.
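        
        A rough sketch of the fire-and-forget idea above, assuming Python's
        standard socket module on loopback. The 8-byte sequence header, the
        1400-byte payload budget, and the function names are illustrative
        assumptions; routing, ECMP, and back pressure are out of scope here.

```python
import socket
import struct

# Assumed budget: keep header + payload under a typical 1500-byte Ethernet MTU.
MAX_PAYLOAD = 1400


def make_receiver(host="127.0.0.1"):
    """Bind a UDP socket on an OS-chosen port and return it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))  # port 0 lets the OS pick a free port
    sock.settimeout(1.0)
    return sock


def send_message(sock, addr, seq, payload):
    """Send one datagram: 8-byte big-endian sequence number, then payload."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload would not fit in one datagram")
    sock.sendto(struct.pack("!Q", seq) + payload, addr)


def receive_message(sock):
    """Receive one datagram and split it into (seq, payload)."""
    data, _ = sock.recvfrom(65535)
    (seq,) = struct.unpack("!Q", data[:8])
    return seq, data[8:]


def demo():
    receiver = make_receiver()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    addr = receiver.getsockname()
    received = []
    try:
        for seq in range(3):
            send_message(sender, addr, seq, b"event-%d" % seq)
        for _ in range(3):
            received.append(receive_message(receiver))
    except socket.timeout:
        pass  # a lost datagram just means fewer messages; nothing is retried
    finally:
        sender.close()
        receiver.close()
    return received
```

        Nothing here is retried: a dropped datagram is simply gone. The
        sequence number only lets a receiver notice gaps or reordering.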
       
          HeyLaughingBoy wrote 4 hours 30 min ago:
          Isn't a fundamental property of a queue that it's FIFO?
          
          UDP message delivery order is not guaranteed. Hell, UDP delivery
          itself is not guaranteed (although IME, messages don't usually get
          dropped unless they cross subnets).
       
            adamcharnock wrote 3 hours 58 min ago:
            > UDP message delivery order is not guaranteed
            
            My thinking was that ordering would be pretty unaffected when there
            is only a single hop. But yeah, we would have needed to test that
            under load.
       
            bdcravens wrote 4 hours 25 min ago:
            > UDP delivery itself is not guaranteed
            
            > I asked if at-most-once delivery was ok, he said yes.
            
            Use case satisfied.
       
              trelane wrote 3 hours 14 min ago:
              >> UDP delivery itself is not guaranteed
              
              >> I asked if at-most-once delivery was ok, he said yes.
              
              > Use case satisfied.
              
              No. [1] "Connectionless protocols such as UDP won't detect
              duplicate packets, because there's no information in, for
              example, the UDP header to identify a packet so that packets can
              be recognized as duplicates. The data from that packet will be
              indicated twice (or even more) to the application; it's the
              responsibility of the application to detect duplicates (perhaps
              by supplying enough information in its headers to do so) and
              process them appropriately, if necessary"
              
   URI        [1]: https://wiki.wireshark.org/DuplicatePackets
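              
              One way an application might do the duplicate detection the
              quote describes, sketched in Python. It assumes the sender
              puts a sequence number in each message (UDP itself carries
              none); the class name and window size are hypothetical.

```python
from collections import deque


class Deduplicator:
    """Remember the most recent sequence numbers and reject repeats."""

    def __init__(self, window=1024):
        self.window = window   # how many recent sequence numbers to remember
        self.seen = set()      # fast membership test
        self.order = deque()   # arrival order, used to evict the oldest entry

    def accept(self, seq):
        """Return True the first time seq is seen inside the window."""
        if seq in self.seen:
            return False
        self.seen.add(seq)
        self.order.append(seq)
        if len(self.order) > self.window:
            # Forget the oldest number so memory stays bounded.
            self.seen.discard(self.order.popleft())
        return True
```

              The bounded window is a trade-off: a duplicate that arrives
              after its number has been evicted will be accepted again.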
       
        dang wrote 4 hours 47 min ago:
        Discussed at the time:
        
        Anything can be a message queue if you use it wrongly enough - [1] -
        June 2023 (239 comments)
        
   URI  [1]: https://news.ycombinator.com/item?id=36186176
       
        rented_mule wrote 4 hours 52 min ago:
        In the 1990s, I was at a startup that had a need for a message queue.
        The only thing we found at the time was a product from TIBCO that was
        priced way-way-way out of our reach. IIRC, it didn't even run on PCs,
        only mainframes and minis. Microsoft Exchange Server (Microsoft's email
        server) had just been released at the time, and we decided to use it as
        a message queue.
        
        Message-submitting clients used SMTP libraries. Message-consuming
        clients used Exchange APIs. Consumers would only look at unread
        messages; they would mark messages as read when they started
        processing, and move them to a folder other than the Inbox if they
        succeeded. Many of the queues were multi-producer, but all queues were
        single-consumer (CPUs were pricey at the time - our servers were all
        Pentiums and Pentium Pros), which simplified things a lot.
        
        Need a new queue / topic? Add an email address. Need to inspect a
        queue? Load up an email client. An unexpected benefit was that we could
        easily put humans in the loop for handling certain queues (using HTML
        in the messages).
        
        It worked surprisingly well for the 5 years that the company was
        around. Latency was okay, but not great. Throughput was much better
        than we would have hoped for - Exchange was almost never the
        bottleneck.
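        
        The read/unread protocol described above can be modelled in a few
        lines of Python. This is an in-memory toy, not Exchange or SMTP
        code; the Mailbox and Message names and the "done" folder are
        assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity so two identical bodies stay distinct
class Message:
    body: str
    read: bool = False


@dataclass
class Mailbox:
    inbox: list = field(default_factory=list)
    done: list = field(default_factory=list)  # stands in for the non-Inbox folder

    def submit(self, body):
        """Producer side: deliver a new, unread message."""
        self.inbox.append(Message(body))

    def claim(self):
        """Consumer side: take the oldest unread message and mark it read."""
        for message in self.inbox:
            if not message.read:
                message.read = True
                return message
        return None

    def complete(self, message):
        """On success, move the message out of the Inbox."""
        self.inbox.remove(message)
        self.done.append(message)
```

        A message that was claimed but never completed stays in the inbox
        marked as read, which is what makes a stuck job easy to spot by
        eye. With a single consumer, claim() needs no locking.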
       
          1718627440 wrote 2 hours 33 min ago:
          Honestly that's not even abuse, this is what a mail delivery system
          truly is.
       
          supportengineer wrote 3 hours 58 min ago:
          I can assure you that various companies are still doing this.  It
          still works and for all the same reasons as you list.
       
        dwedge wrote 4 hours 59 min ago:
        I thought the "multiple anime personalities explaining things to each
        other" style of tech blogging was so 2018
       
          unmotivated-hmn wrote 4 hours 57 min ago:
          My first time seeing it. I was somewhat pleasantly confused.
       
        packetlost wrote 5 hours 19 min ago:
        I once had a coworker use GitLab + a git repo + webhooks to implement a
        queued event system. Some change (I think it was in Jenkins) would call
        a webhook which would append to some JSON array in a repo, commit it,
        which would itself trigger something else downstream. It was horrifying
        and glorious.
       
        devmor wrote 5 hours 24 min ago:
        This is utterly incredible and inspiring in the worst way. Mad
        engineering!
       
        spectraldrift wrote 5 hours 24 min ago:
        People often forget a message queue is just a simple, high-throughput
        state machine.
        
        It's tempting to roll your own by polling a database table, but that
        approach breaks down, sometimes even at fairly low traffic levels. Once
        you move beyond a simple cron job, you're suddenly fighting row locking
        and race conditions just to prevent significant duplicate processing;
        effectively reinventing a wheel, poorly (potentially 5 or 10 times in
        the same service).
        
        A service like SQS solves this with its state management. A message
        becomes 'invisible' while being processed. If it's not deleted within
        the configurable visibility timeout, it transitions back to available.
        That 'fetch next and mark invisible' state transition is the key, and
        it's precisely what's so difficult to implement correctly and
        performantly in a database every single time you need it.
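        
        The visibility-timeout state machine described above, as a
        single-process Python sketch. It is not the SQS API: the class,
        its method names, and the injectable clock are assumptions, and
        real SQS hands out receipt handles rather than reusing message IDs.

```python
import time
import uuid


class VisibilityQueue:
    """Messages are hidden while in flight and reappear if not deleted."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock
        # id -> [body, visible_at]; dict order doubles as arrival order.
        self._messages = {}

    def send(self, body):
        msg_id = uuid.uuid4().hex
        self._messages[msg_id] = [body, float("-inf")]  # visible immediately
        return msg_id

    def receive(self):
        """Fetch the next visible message and hide it for the timeout."""
        now = self.clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Acknowledge successful processing; the message is gone for good."""
        self._messages.pop(msg_id, None)
```

        A consumer that crashes never calls delete(), so its message
        becomes visible again once the timeout passes. That gives
        at-least-once delivery, so handlers should tolerate repeats.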
       
          groone wrote 5 hours 13 min ago:
          Message becomes invisible in a regular relational database when using
          `SELECT FOR UPDATE SKIP LOCKED`
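            
            A runnable stand-in for the claim step, using Python's sqlite3
            module. SQLite has no SKIP LOCKED, so this sketch swaps in
            BEGIN IMMEDIATE, which takes a database-wide write lock and
            serializes consumers instead of letting them skip each
            other's rows. The jobs table and its columns are hypothetical.

```python
import sqlite3

# In PostgreSQL the claim query could instead end with a row-level lock:
#   SELECT id, body FROM jobs WHERE claimed_by IS NULL
#   ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED


def open_queue(path=":memory:"):
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL, claimed_by TEXT)"
    )
    return conn


def enqueue(conn, body):
    conn.execute("INSERT INTO jobs (body) VALUES (?)", (body,))


def claim_next(conn, worker):
    """Atomically pick the oldest unclaimed job and mark it as taken."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, body FROM jobs WHERE claimed_by IS NULL "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET claimed_by = ? WHERE id = ?", (worker, row[0])
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

            The select and the update share one transaction, so two
            workers cannot claim the same row. There is no timeout that
            returns an abandoned job to the queue in this sketch.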
       
            spectraldrift wrote 3 hours 27 min ago:
            That's totally feasible, and works for small to medium traffic (SQS
            scales seamlessly from 1 message per year to millions per second).
            
            In practice, I've never seen this implemented correctly in the
            wild; most people don't seem to care enough to handle the
            transactions properly. Also, if you want additional
            features like DLQs or metrics on stuck message age, you'll end up
            with a lot more complexity just to get parity with a standard queue
            system.
            
            A common library could help with this though.
       
            kerblang wrote 4 hours 49 min ago:
            Overall it's completely feasible to build a message queue with
            RDBMS _because_ they have locking. You might end up doing extra
            work compared to some other products that make message queueing
            easy/fun/so-simple-caveman-etc.
            
            Now if SQS has some super-scalar mega-cluster capability where one
            instance can deliver 100 billion messages a day across the same
            group of consumers, ok, I'm impressed, because most MQ's can't,
            because... locking. Thus Kafka (which is not a message queue).
            
            I think the RDBMS MQ should be treated as the "No worse than this"
            standard - if my fancy new message queueing product is even harder
            to set up, it isn't worth your trouble. But SQS itself IS pretty
            easy to use.
       
        no_thank_you wrote 5 hours 30 min ago:
        The truly cursed thing in the article is this bit near the end (unless
        this is part of the satire):
        
        "Something amusing about this is that it is something that technically
        steps into the realm of things that my employer does. This creates a
        unique kind of conflict where I can't easily retain the intellectual
        property (IP) for this without getting it approved from my employer. It
        is a bit of the worst of both worlds where I'm doing it on my own time
        with my own equipment to create something that will be ultimately owned
        by my employer. This was a bit of a sour grape at first and I almost
        didn't implement this until the whole Air Canada debacle happened and I
        was very bored."
       
          mananaysiempre wrote 5 hours 23 min ago:
          Yes, I guess this is how we learn that Tailscale will lay claim to
          things you do on your own time using your own machine.
       
        stego-tech wrote 5 hours 42 min ago:
        This is beyond cursed and I love it.
       
        stephenlf wrote 6 hours 6 min ago:
        Remember when Amazon Video moved from serverless back to a monolith
        because they were using S3 for storing video streams for near realtime
        processing? This feels the same. Except Amazon Video is an actual
        company trying to build real software.
        
        Amazon Video's original blog post is gone, but here is a third party
        writeup.
        
   URI  [1]: https://medium.com/@hellomeenu1/why-amazon-prime-video-reverte...
       
          thrance wrote 5 hours 31 min ago:
          IIRC they were storing individual frames in S3 buckets and hitting
          their own internal lambda limits. Funny story tbh.
       
            mikepurvis wrote 5 hours 15 min ago:
            Has a lot of "orders from on high to dog food all the things"
            energy.
       
              breppp wrote 4 hours 12 min ago:
              My guess was "no real cost accounting for service usage
              internally, until one day zero interest ends and a VP changes
              that"
       
            LeifCarrotson wrote 5 hours 15 min ago:
            You remember correctly:
            
            > The main scaling bottleneck in the architecture was the
            orchestration management that was implemented using AWS Step
            Functions. Our service performed multiple state transitions for
            every second of the stream, so we quickly reached account limits.
            Besides that, AWS Step Functions charges users per state
            transition.
            
            > The second cost problem we discovered was about the way we were
            passing video frames (images) around different components. To
            reduce computationally expensive video conversion jobs, we built a
            microservice that splits videos into frames and temporarily uploads
            images to an Amazon Simple Storage Service (Amazon S3) bucket.
            Defect detectors (where each of them also runs as a separate
            microservice) then download images and processed it concurrently
            using AWS Lambda. However, the high number of Tier-1 calls to the
            S3 bucket was expensive.
            
            They were really deeply drinking the AWS serverless kool-aid if
            they thought the right way to stream video was multiple
            microservices accessing individual frames on S3...
       
              zoogeny wrote 3 hours 8 min ago:
              I haven't read the entire article, but just based on the snippets
              you posted it doesn't look like they were streaming video using
              this process. It sounds like they were doing defect detection.
              
              I would guess this was part of a process when new videos were
              uploaded and transcoded to different formats. Likely they were
              taking transcoded frames at some sample rate and uploading them
              to S3 where some workers were then analyzing the images to look
              for encoding artifacts.
              
              This would most likely be a one-time sanity check for new videos
              that have to go through some conversion pipelines. However, once
              converted to their final form I would suspect the video files are
              statically distributed using a CDN.
       
              wat10000 wrote 3 hours 48 min ago:
              Every time they order Chinese takeout, two thousand cars show up,
              each carrying one grain of rice.
       
              pythonaut_16 wrote 5 hours 7 min ago:
              It's more honesty than you see from most service providers,
              both dogfooding the approach and not handwaving the costs.
       
            moi2388 wrote 5 hours 22 min ago:
            That's hilarious
       
          lloydatkinson wrote 5 hours 44 min ago:
          They deleted their own post?
          
          It couldn't possibly be because AWS execs were pissed or
          anything... /s
       
            Simran-B wrote 5 hours 39 min ago:
            Archived blog post:
            
   URI      [1]: https://web.archive.org/web/20240719152109/https://www.pri...
       
        ranger_danger wrote 6 hours 43 min ago:
        sounds like "parasitic storage" and/or steganography
       
          Kye wrote 5 hours 55 min ago:
          One of my favorite dinosaurs
       
        metadat wrote 6 hours 43 min ago:
        Fiendishly outlandish idea, incredibly wrong that it should even be
        possible for the existence of Hoshino to even have ever been a thought,
        yet here we are.  I love it!
        
        On a related note, have you seen the prices at Whole Foods lately? $6
        for a packet of dehydrated miso soup.  This usually costs $2.50 served
        prepared at a sushi restaurant.  AWS network egress fees are similarly
        blasphemous.
        
        Shame on Amazon, lol.  Though it's really capitalism's fault, if you
        think it through all the way.
       
          BiteCode_dev wrote 5 hours 34 min ago:
          This is not a situation where you have zero alternatives. You have
          a ton of cheap hosting out there. Most people using AWS don't need the
          level of reliability and scaling it provides, they pay the price for
          nothing.
       
          sneak wrote 6 hours 26 min ago:
          Why is it Amazon's fault that people voluntarily choose to use
          Amazon?
          
          Even with the massive margins, cloud computing is far cheaper for
          most SMEs than hiring an FTE sysadmin and racking machines in a colo.
          
          The problem is that people forget to switch back to the old way when
          it's time.
       
            mrkeen wrote 1 hour 14 min ago:
            Using AWS was supposed to be the way to avoid the cost of an ops team.
            
            Now every developer also has to be DevOps, learning docker,
            kubernetes and CI systems instead of just focusing on development.
            
            Also we all still have ops teams.
       
            devmor wrote 5 hours 20 min ago:
            > Even with the massive margins, cloud computing is far cheaper for
            most SMEs than hiring an FTE sysadmin and racking machines in a
            colo.
            
            That very much depends on your use case and billing period. Most of
            my public web applications run in a colo in Atlanta on containers
            hosted by less than $2k in hardware and cached by Cloudflare. This
            replaced an AWS/Digitalocean combination that used to bill about
            $400/mo.
            
            Definitely worth it for me, but there are some workloads that
            aren't worth it and I stick with cloud services to handle.
            
            I would estimate that a significant number of services hosted on
            AWS are paid for by small businesses with less reliability and
            uptime requirements than I have.
       
            immibis wrote 5 hours 54 min ago:
            SMEs hire someone (an MSP) to manage their IT. They don't use AWS
            because AWS services are too low-level. AWS is chosen by people who
            should know better and mostly on the basis of marketing inertia.
            
            Edit: And by people with too much money, which was until recently
            most tech companies.
       
            shermantanktop wrote 6 hours 7 min ago:
            Another of my online lives is on guitar forums (TGP etc), populated
            by a diverse set of non-geek characters. An eternal question that
            comes up is "why are they charging so much for this guitar? The
            parts can't be that expensive. I bet I could just..."
            
            And the only viable answer is the ol' capitalist saw: they charge
            what buyers are willing to pay.
            
            That never quite satisfies people though.
       
              DougMerritt wrote 4 hours 6 min ago:
              Why aren't they satisfied with merely pondering strats made in US
              vs Mexico vs Japan vs Indonesia? Careful reviews of quality
              versus price (which of course varied over time) always showed
              more correlation with sometimes-unwarranted reputation than with
              reality.
       
                shermantanktop wrote 1 hour 4 min ago:
                Amongst this crowd, "my buddy said" is data.  Correlation
                analysis is not in the picture.
       
              ecshafer wrote 5 hours 51 min ago:
              Employing labor full time is incredibly expensive in the US. Once
              you include overhead, taxes, benefits, etc. you can easily be
              paying 2x wage for a worker. Not to mention buying the goods. So
              yeah the parts for the guitar might cost X, but then it costs Y
              to store them and Z for the space to assemble them then A to pay
              the workers and B to ship them and C to market. It adds up.
              Without jumping to the EVILS of "Capitalism" a business costs
              money to run. I can't imagine guitar manufacturer margins are
              anything close to tech's, probably <5%. Gemini tells me industry
              is around 3.8% so I don't think I am far off.
       
                BiteCode_dev wrote 5 hours 32 min ago:
                In this case, you would need to pay someone anyway. I never
                heard about an AWS account that didn't require at least one
                engineer in charge of it.
       
        IIAOPSW wrote 6 hours 44 min ago:
        Even HN comment sections?
       
          npteljes wrote 4 hours 53 min ago:
          Of course. A message queue is a database, plus software that handles
          it in a specific way to make it a message queue. So HN could
          basically be the database backend for that imaginary software that
          turns it into a message queue.
          
          I don't have fun examples with message queues, but I do remember some
          with filesystems - a popular target to connect cursed backends to.
          You can store data in Ping packets [1]. You can store data in the
          digits of Pi - achieving unbelievable compression [2]. You can store
          data in the metadata and other unused blocks of images - also known
          as steganography [3]. People wrote software to use Gmail emails as a
          file system [4].
          
          That's just off the top of my head, and it really shows that the
          sky's the limit with software.
          
          [1] [2] [3] [4]
          
   URI    [1]: https://github.com/yarrick/pingfs
   URI    [2]: https://github.com/ajeetdsouza/pifs
   URI    [3]: https://en.wikipedia.org/wiki/Steganographic_file_system
   URI    [4]: https://lwn.net/Articles/99933/
       
          tux3 wrote 6 hours 34 min ago:
          ACK
       
            unmotivated-hmn wrote 4 hours 55 min ago:
            Even HN comment sections?
       
              therein wrote 4 hours 46 min ago:
              At least once delivery.
       
            pwagland wrote 6 hours 2 min ago:
            Although latency is shockingly bad.
       
        pluto_modadic wrote 6 hours 45 min ago:
        muahahahaha, muahahaha!
       
          lstodd wrote 6 hours 38 min ago:
          so very true.
       
        redbell wrote 6 hours 46 min ago:
        On a totally unrelated topic, I once read a meme online that says: "If
        you ever felt useless, remember ueue in queue!"
       
       