_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Kafka is Fast – I'll use Postgres
       
       
        asah wrote 1 hour 0 min ago:
        "500 KB/s workload should not use Kafka" - yyyy!!!  indeed, I'm running
        5MBps logging system through a single node RDS instance costing
        <$1000/mon (plus 2x for failover). There's easily 4-10x headroom for
        growth by paying AWS more money and 3-5x+ savings by optimizing the
        data structure.
       
          EdwardDiego wrote 55 min ago:
          I've always said, don't even think about Kafka until you're into
          MiB/s territory.
          
           It's a complex piece of software that solves a complex problem, but
           there are many trade-offs, so only use it when you need to.
       
        dagss wrote 1 hour 22 min ago:
         I really believe this is the way: event log tables in SQL. I have been
         doing it a lot.
         
         A downside is the lack of client-side tooling. For many, using Kafka is
         worth it simply for the consumer-side libraries.
         
         If you just want to write an event handler function, there is a lot of
         boilerplate to manage around it (persisting read cursors, etc.).
         
         We introduced a company standard for one service pulling events from
         another service that fits well with events stored in SQL. [1] It is
         nowhere close to Kafka's maturity in client-side tooling, but it is an
         approach for how a library stack could be built on top, making this
         convenient and having the same library toolset support many storage
         engines. (On the server/storage side, Postgres is of course as mature
         as Kafka...)
        
   URI  [1]: https://github.com/vippsas/feedapi-spec
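
         A minimal sketch of the storage side of this approach (table and
         column names are hypothetical, not the feedapi-spec schema):
        
           -- an ordered event log; seq gives every event a sequence number
           CREATE TABLE events (
               topic      text        NOT NULL,
               seq        bigint      GENERATED ALWAYS AS IDENTITY,
               payload    jsonb       NOT NULL,
               created_at timestamptz NOT NULL DEFAULT now(),
               PRIMARY KEY (topic, seq)
           );
        
           -- one cursor row per (consumer, topic); the publisher itself
           -- stores no per-consumer state
           CREATE TABLE consumer_cursors (
               consumer text   NOT NULL,
               topic    text   NOT NULL,
               last_seq bigint NOT NULL DEFAULT 0,
               PRIMARY KEY (consumer, topic)
           );
        
           -- publishing is just an insert
           INSERT INTO events (topic, payload) VALUES ('orders', '{"id": 42}');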
       
          hyperbolablabla wrote 55 min ago:
          I for one really dislike Kafka and this looks like a great
          alternative
       
        lmm wrote 2 hours 23 min ago:
         If Kafka had come first, no one would ever pick Postgres. Yes, it
        offers a lot of fancy functionality. But most of that functionality is
        overengineered stuff you don't need, and/or causes more problems than
        it solves (e.g. transactions sound great until you have to deal with
        the deadlocks and realise they don't actually help you solve any
        business problems). Meanwhile with no true master-master HA in the base
        system you have to use a single point of failure server or a flaky (and
        probably expensive) third-party addon.
        
        Just use Kafka. Even if you don't need speed or scalability, it's
        reliable, resilient, simple and well-factored, and gives you far fewer
        opportunities to architect your system wrong and paint yourself into a
        corner than Postgres does.
       
        woile wrote 2 hours 39 min ago:
        There are a few things missing I think.
        
         I think Kafka makes it easy to create an event-driven architecture.
         This is particularly useful when you have many teams: they are properly
         isolated from each other.
         
         And with many teams comes another problem: there's no guarantee that
         queries are gonna be properly written, so Postgres' performance may
         be hindered.
        
        Given this, I think using Kafka in companies with many teams can be
        useful, even if the data they move is not insanely big.
       
        udave wrote 2 hours 45 min ago:
         I find the distinction between a queue and a pub-sub system quite
         poor. A pub-sub system is just a persistent queue at its core; the only
         distinction is that you have one queue per subscriber, hence multiple
         readers. Everything else stays the same. Ordering is expected to be
         strict in both cases. Durability is also baked into both systems. On
         the question of bounded vs. unbounded queues: don't message queues also
         spill to disk in order to prevent OOM scenarios?
       
        aussieguy1234 wrote 2 hours 51 min ago:
         I've found Kafka to be not particularly great with languages other than
         Java if the Confluent Schema Registry is involved.
         
         I had fun working with the schema registry from TypeScript.
       
        spectraldrift wrote 4 hours 0 min ago:
        > Should You Use Postgres? Most of the time - yes
        
        This made me wonder about a tangential statistic that would, in all
        likelihood, be impossible to derive:
        
        If we looked at all database systems running at any given time, what
        proportion does each technology represent (e.g., Postgres vs. MySQL vs.
        [your favorite DB])? You could try to measure this in a few ways: bytes
        written/read, total rows, dollars of revenue served, etc.
        
        It would be very challenging to land on a widely agreeable definition.
        We'd quickly get into the territory of what counts as a "database" and
        whether to include file systems, blockchains, or even paper. Still, it
        makes me wonder. I feel like such a question would be immensely
        interesting to answer.
        
        Because then we might have a better definition of "most of the time."
       
          abtinf wrote 3 hours 5 min ago:
          SQLite likely dominates all other databases combined on the metrics
          you mentioned, I would guess by at least an order of magnitude.
          
          Server side. Client side. iOS, iPad, Mac apps. Uses in every field.
          Uses in aerospace.
          
           Just think for a moment that literally every photo and video taken on
           every iPhone (and I would assume Android as well) ends up stored in a
           SQLite db, either directly or as a sizable amount of metadata.
       
        smoyer wrote 6 hours 22 min ago:
         Kafka is fast ... and MongoDB is web scale [1]. I completely agree
         that we shouldn't go chasing each new technical bauble, but we are
         also wasting breath on those that do.
        
   URI  [1]: https://youtu.be/b2F-DItXtZs?si=vrB-UxCHIgMYGKFt
       
        tarun_anand wrote 6 hours 30 min ago:
         Couldn't agree more. We have built and run an in-house PostgreSQL-based
         queue for several years. It can handle 5-10k msg/s in our production
        workloads.
       
        brikym wrote 8 hours 11 min ago:
         If you don't mind Redis then use Redis Streams. It gives you an
         event log without worrying about Postgres performance issues, and it
         has consumer groups.
       
          tele_ski wrote 7 hours 4 min ago:
           Been using Valkey streams recently and loving it. It took a bit to
           understand how to properly use it, but now that I've figured it out
           I'd highly recommend trying it. It's very easy to set up and get going
           and it just works.
       
        nchmy wrote 9 hours 38 min ago:
         Seems like instead of a hand-rolled, polling pub/sub, you could do CDC
         with a Go logical replication/CDC library. There are surely several.
         
         Or just use NATS for queues and pub/sub - dead simple, can be embedded
         in your Go app, and does much more than Kafka.
       
        bmcahren wrote 9 hours 54 min ago:
        A huge benefit of single-database operations at scale is point-in-time
        recovery for the entire system thereby not having to coordinate
        recovery points between data stores. Alternatively, you can treat your
        queue as volatile depending on the purpose.
       
        jeeybee wrote 10 hours 1 min ago:
        If you like the “use Postgres until it breaks” approach, there’s
        a middle ground between hand-rolling and running Kafka/Redis/Rabbit:
        PGQueuer.
        
        PGQueuer is a small Python library that turns Postgres into a durable
        job queue using the same primitives discussed here — `FOR UPDATE SKIP
        LOCKED` for safe concurrent dequeue and `LISTEN/NOTIFY` to wake workers
        without tight polling. It’s for background jobs (not a Kafka
        replacement), and it shines when your app already depends on Postgres.
        
        Nice-to-haves without extra infra: per-entrypoint concurrency limits,
        retries/backoff, scheduling (cron-like), graceful shutdown, simple CLI
        install/migrations. If/when you truly outgrow it, you can move to Kafka
        with a clearer picture of your needs.
        
        Repo: [1] Disclosure: I maintain PGQueuer.
        
   URI  [1]: https://github.com/janbjorge/pgqueuer
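
         For readers unfamiliar with the primitive: a minimal sketch of a
         `FOR UPDATE SKIP LOCKED` dequeue, independent of PGQueuer (the `jobs`
         table here is hypothetical, not PGQueuer's actual schema):
        
           -- claim one pending job; concurrent workers skip rows that
           -- another transaction has already locked
           WITH next_job AS (
               SELECT id
               FROM jobs
               WHERE status = 'pending'
               ORDER BY id
               LIMIT 1
               FOR UPDATE SKIP LOCKED
           )
           UPDATE jobs
           SET status = 'running', started_at = now()
           FROM next_job
           WHERE jobs.id = next_job.id
           RETURNING jobs.id, jobs.payload;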
       
        jackvanlightly wrote 10 hours 3 min ago:
        > A 500 KB/s workload should not use Kafka
        
         This is a simplistic take. Kafka isn't just about scale; like other
         messaging systems, it provides queue/streaming semantics for
         applications. Sure, you can roll your own queue on a database for small
         use cases, but it adds complexity to the lives of developers. You can
         offload the burden of running Kafka by choosing a Kafka-as-a-service
         vendor, but you can't offload the additional developer work that comes
         from using a database as a queue.
       
          enether wrote 9 hours 51 min ago:
           The question is the organizational overhead of adopting yet another
           specialized distributed system, which, by the way, is frequently
           about scalability at its core. Kafka's original paper emphasizes this
           ("We introduce Kafka, a distributed messaging system that we
           developed for collecting and delivering high volumes of log data with
           low latency.", "We made quite a few unconventional yet practical
           design choices in Kafka to make our system efficient and
           scalable.") [1]
           
           To be honest, there isn't a large burden in running Kafka when it's
           500 KB/s. The system is so underutilized there's nothing to cause
           issues with it. But regardless, the organizational burden persists.
           As the piece mentions, "Managed SaaS offerings trade off some of the
           organizational overhead for greater financial costs - but they still
           don't remove it all." Some of the burden continues to exist even if a
           vendor hosts the servers for you. The API needs to be adopted, the
           clients have many configs, concepts like consumer groups need to be
           understood, the vendor has its own UI, etc.
           
           The Kafka API isn't exactly the simplest. I wouldn't recommend people
           write the pub-sub-on-Postgres SQL themselves - a library should
           abstract it away. What complexity does a library with a simple API
           add? Regardless of whether that library is built on top of Postgres,
           Kafka or another system - precisely what complexity is added to the
           lives of developers?
           
           I really don't see any complexity existing at this minuscule scale,
           at either the app developer layer or the infra operator layer. But of
           course, I haven't run this in production, so I could be wrong.
          
   URI    [1]: https://notes.stephenholiday.com/Kafka.pdf
       
          cyanf wrote 10 hours 2 min ago:
          There are existing solutions for queues in Postgres, notably pgmq.
       
        rjurney wrote 11 hours 19 min ago:
        One bad message in a Kafka queue and guess what? The entire queue is
        down because it kills your workers over and over. To fix it? You have
        to resize the queue to zero, which means losing requests. This KILLS
         me. Jay Kreps says there is no reason it can't be fixed, but it never
         has been, and this infuriates me because it happens so often :)
       
          pram wrote 7 hours 42 min ago:
           You can modify a consumer group's offsets to any value, JFYI, so you
           really don't need to purge the topic. You can just start after the
           bad message.
       
        lisbbb wrote 11 hours 23 min ago:
        If you are doing high volume, there is no way that a SQL db is going to
        keep up.  I did a lot of work with Kafka but what we  constantly ran
        into was managing expectations--costs were higher, so the business
        needs to strongly justify why they need their big data toy, and joins
        are much harder, as well as data validation in real time.  It made for
        a frustrating experience most of the time--not due to the tech as much
        as dealing with people who don't understand the costs and benefits.
        
        On the major projects I worked on, we were "instructed" to use Kafka
        for, I guess, internal political reasons.  They already had Hadoop
        solutions that more or less worked, but the code was written by idiots
        in "Spark/Scala" (their favorite buzzword to act all high and mighty)
        and that code had zero tests (it was truly a "test in prod" situation
        there).  The Hadoop system was managed by people who would parcel out
        compute resources politically, as in, their friends got all they wanted
        while everyone else got basically none.  This was a major S&P company,
        Fortune 10, and the internal politics were abusive to say the least.
       
        mbo wrote 12 hours 29 min ago:
         This is an article in desperate need of some data visualizations. I do
        not think it does an effective job of communicating differences in
        performance.
       
        nyrikki wrote 13 hours 29 min ago:
        > The claim isn’t that Postgres is functionally equivalent to any of
        these specialized systems. The claim is that it handles 80%+ of their
        use cases with 20% of the development effort. (Pareto Principle)
        
         Lots of us who built systems when SQL was the only option know that
         doesn't hold over time.
         
         SSTable-backed systems have their applications, and I have never seen
         dedicated Kafka teams like we used to have with DBAs.
        
        We have the tools to make decisions based on real tradeoffs.
        
        I highly recommend people dig into the appropriate tools to select vs
        making pre-selected products fit an unknown problem domain.
        
         Tools are tactics, not strategies; tactics should be changeable with
         the strategic needs.
       
        jdboyd wrote 14 hours 16 min ago:
         While I appreciate the Postgres-for-everything point of view, and most
         of the time when I use other things they could fit in Postgres, there
         are a few areas that keep me using RabbitMQ, Redis, or something like
         Elastic.
        
        First, I frequently use Celery and Celery doesn't support using
        Postgres as a broker.  It seems like it should, but I guess no one has
        stepped up to write that. So, when I use Celery, I end up also using
        Redis or RabbitMQ.
        
        Second, if I need mqtt clients coming in from the internet at large, I
        don't feel comfortable exposing Postgres to that.  Also, I'd rather use
        the mqtt ecosystem of libraries rather than having all of those devices
        talk Postgres directly.
        
         Third, sometimes I want a size-constrained, memory-only database or a
         database that automatically expires untouched records, and for either
         of those I usually use Redis. I imagine it would be worth making a
         reusable set of stored procedures to accomplish the auto-expiring of
         unused records, but I haven't implemented it. I have no idea how to
         make Postgres be memory-only with a constrained memory size.
       
        Sparkyte wrote 14 hours 49 min ago:
         You can also use Redis as a queue if the data isn't too important to
         lose.
       
          joaohaas wrote 14 hours 43 min ago:
           Even if the data is important, you can enable AOF persistence and
           make sure the worker/consumer gets items by RPOPLPUSHing them to a
           working queue. This way you can easily requeue the data if the
           worker ever goes offline mid-process.
       
            Sparkyte wrote 12 hours 35 min ago:
            Very true.
       
        ryandvm wrote 14 hours 54 min ago:
        I think my only complaint about Kafka is the widespread
        misunderstanding that it is a suitable replacement for a work queue. I
         should not have to explain to an enterprise architect the distinction
         between a distributed work queue and an event streaming platform.
       
          lisbbb wrote 11 hours 14 min ago:
           It's not so much that they don't know as that they think Kafka is
           sexier, or, in my case, that it was mandated for everything because
           they were paying for the cluster. I solved one problem, very
           flexibly, in Elastic and they weren't even interested at all. It was
           Kafka or nothing. That's reality in a lot of companies.
       
        wagwang wrote 15 hours 2 min ago:
         Isn't listen/notify absurdly slow and lock-contentious?
       
        phendrenad2 wrote 15 hours 7 min ago:
        Since everyone is offering what they think the "camps" should be,
        here's another perspective. There are two camps: (A) Those who look at
        performance metrics ("96 cores to get 240MB/s is terrible") and assume
        that performance itself is enough to justify overruling any other
        concern (B) Those who look at all of the tradeoffs, including budget,
        maintenance, ease-of-use, etc.
        
        You see this a lot in the tech world. "Why would you use Python, Python
        is slow" (objectively true, but does it matter for your high-value SaaS
        that gets 20 logins per day?)
       
        psadri wrote 15 hours 12 min ago:
        A resource that would benefit the entire community is a set of ballpark
        figures for what kind of performance is "normal" given a particular
        hardware + data volume.  I know this is a hard problem because there is
        so much variation across workloads, but I think even order of magnitude
        ballparks would be useful.  For example, it could say things like:
        
        task: msg queue
        
        software: kafka
        
        hardware: m7i.xlarge (vCPUs: 4 Memory: 16 GiB)
        
        payload: 2kb / msg
        
        possible performance: ### - #### msgs / second
        
        etc…
        
         So many times I've found myself wondering: is this thing behaving
         within an order of magnitude of a correctly set-up version, so that I
         can decide whether I should leave it alone or spend more time on it?
       
        bleonard wrote 15 hours 27 min ago:
         I am excited about the Rails defaults where background jobs, cache,
         and sockets are all database-driven. For normal-sized projects that
         still need those things, it's a huge win in simplicity.
       
        heyitsdaad wrote 15 hours 30 min ago:
        If the only tool you know is a hammer, everything starts looking like a
        nail.
       
        8cvor6j844qw_d6 wrote 15 hours 31 min ago:
        > Should You Use Postgres?
        
        > Most of the time - yes. You should always default to Postgres until
        the constraints prove you wrong.
        
        Interesting.
        
         I've also been told by my seniors that I should go with PostgreSQL by
         default unless I have a good justification not to.
       
        sc68cal wrote 15 hours 32 min ago:
         > Postgres doesn’t seem to have any popular libraries for pub-sub
         use cases, so I had to write my own.
        
        Ok so instead of running Kafka, we're going to spend development cycles
        building our own?
       
          enether wrote 15 hours 27 min ago:
          It would be nice if a library like pgmq got built. Not sure what the
          demand for that is, but it feels like there may be a niche
       
        Copenjin wrote 15 hours 37 min ago:
         I'm not really convinced by the comment suggesting NOTIFY over the (at
         least in theory) inferior polling. I expect the global queue, if it's
         really global, to be only a temporary location for collecting
         notifications before sending them, not a bottleneck. I never did any
         benchmarks with PG or Oracle (which has a similar feature), but I
         expect that, depending on the polling frequency and the average amount
         of updates, either solution could be best depending on the
         circumstances.
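
         For concreteness, the mechanism under discussion looks roughly like
         this (channel, table, and function names are hypothetical): producers
         NOTIFY after inserting, consumers LISTEN and only poll when woken.
        
           -- producer side: wake listeners whenever a new event is inserted
           CREATE OR REPLACE FUNCTION notify_new_event() RETURNS trigger AS $$
           BEGIN
               PERFORM pg_notify('new_events', NEW.topic);
               RETURN NEW;
           END;
           $$ LANGUAGE plpgsql;
        
           CREATE TRIGGER events_notify
           AFTER INSERT ON events
           FOR EACH ROW EXECUTE FUNCTION notify_new_event();
        
           -- consumer side (in its own session):
           LISTEN new_events;
           -- block until a notification arrives, then run the usual
           -- polling query to fetch the new rows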
       
        ayongpm wrote 15 hours 50 min ago:
        Just dropping this here casually:
        
          sup {
              position: relative;
              top: -0.4em;
              line-height: 0;
              vertical-align: baseline;
          }
       
        munchbunny wrote 15 hours 51 min ago:
        My general opinion, off the cuff, from having worked at both small
        (hundreds of events per hour) and large (trillions of events per hour)
        scales for these sorts of problems:
        
        1. Do you really need a queue? (Alternative: periodic polling of a DB)
        
        2. What's your event volume and can it fit on one node for the
        foreseeable future, or even serverless compute (if not too expensive)?
        (Alternative: lightweight single-process web service, or several
        instances, on one node.)
        
        3. If it can't fit on one node, do you really need a distributed queue?
        (Alternative: good ol' load balancing and REST API's, maybe with async
        semantics and retry semantics)
        
        4. If you really do need a distributed queue, then you may as well use
        a distributed queue, such as Kafka. Even if you take on the complexity
        of managing a Kafka cluster, the programming and performance semantics
        are simpler to reason about than trying to shoehorn a distributed queue
        onto a SQL DB.
       
          raducu wrote 13 min ago:
          > 1. Do you really need a queue?
          
           I'm a Java dev, and maybe it's because my projects are big
           integrations, but I've always needed queue-like constructs, and
           polling from a DB was almost always a headache, especially with
           multiple consumers and publishers.
          
          Sure it can be done, and in many projects we do have cron-jobs on
          different pods -- not a global k8s cron-job, but legacy cron jobs and
          it works fine.
          
           Kafka does not YET support real queues (but I'm sure there's a
           high-profile KIP to add true queue-like behavior, per consumer group,
           with individual commits), and it does not support server-side
           filtering.
          
          But consumer groups and partitions have been such a blessing for me,
          it's very hard to overstate how useful they are with managing
          stateful apps.
       
          EdwardDiego wrote 1 hour 6 min ago:
           A semantic but important point: Kafka is not a queue, it's a
           distributed append-only log. I deal with so many people who think
           it's a super-scalable replacement for an MQ, and that's just the
           wrong way to think about it.
       
            codeflo wrote 52 min ago:
            Do you mean this in the sense that listeners don't remove messages,
            as one would expect from a queue data structure?
       
              nasretdinov wrote 41 min ago:
              Well, it's impractical to try to handle messages individually in
              Kafka, it's designed to acknowledge entire batches (since it's a
              distributed append-only log). You can still do that, but the
              performance will be no better than an SQL database
       
              jen20 wrote 44 min ago:
              That is the major difference - clients track their read offsets
              rather than the structure removing messages. There aren't really
              "listeners" in the sense of a pub-sub?
       
              EdwardDiego wrote 44 min ago:
               Exactly. There's no concept in Kafka (yet...) of "acking" or
               DLQs. Kafka is very good at what it does by being deliberately
               stupid: it knows nothing about your messages or who has consumed
               them and who hasn't.
              
              That was all deliberately pushed onto consumers to manage to
              achieve scale.
       
          jsolson wrote 2 hours 10 min ago:
          I agree with nearly everything except your point (1).
          
          Periodic polling is awkward on both sides: you add arbitrary latency
          _and_ increase database load proportional to the number of interested
          clients.
          
          Events, and ideally coalesced events, serve the same purpose as
          interrupts in a uniprocess (versus distributed) system, even if you
          don't want a proper queue. This at least lets you know _when_ to poll
          and lets you set and adjust policy on when / how much your software
          should give a shit at any given time.
       
          drdaeman wrote 6 hours 58 min ago:
          > Do you really need a queue? (Alternative: periodic polling of a DB)
          
          In my experience it’s not the reads, but the writes that are hard
          to scale up. Reading is cheap and can be sometimes done off a
          replica. Writing to a PostgreSQL at high sustained rate requires
           careful tuning and design. A stream of UPDATEs can be very painful,
           INSERTs aren't cheap, and even batched COPY blocks can be tricky.
       
          ozim wrote 10 hours 46 min ago:
          Periodic polling of a DB gets bad pretty quick, queues are much
          better even on small scale.
          
          But then distributed queue is most likely not needed until you hit
          really humongous scale.
       
            TexanFeller wrote 9 hours 13 min ago:
            Maybe in the past this was true, or if you’re using an inferior
            DB. I know first hand that a Postgres table can work great as a
            queue for many millions of events per day processed by thousands of
            workers polling for work from it concurrently. With more than a few
            hundred concurrent pollers you might want a service, or at least a
            centralized connection pool in front of it though.
       
              skunkworker wrote 8 hours 31 min ago:
              Millions of events per day is still in the small queue category
              in my book. Postgres LISTEN doesn't scale, and polling on hot
              databases can suddenly become more difficult, as you're having to
              throw away tuples regularly.
              
               10 messages/s is only ~864k/day. But in my testing (with
               Postgres 16) this doesn't scale that well when you need tens to
               hundreds of millions per day. Redis is much better than Postgres
               for that (for a simple queue), and beyond that Kafka is what I
               would choose if you're in the low hundreds of millions.
       
          javier2 wrote 10 hours 47 min ago:
           I don't disagree, and I am trying to argue for it myself; I have
           used Postgres as a "queue" or as the backlog of events to be sent
           (like the outbox pattern). But what if I have 4 services that need to
           know X happened to customer Y? I feel like it quickly becomes
           cumbersome with Postgres-based event delivery to make sure everyone
           gets the events they need. The posted link tries to address this at
           least.
       
            dagss wrote 32 min ago:
            The standard approach, which Kafka also uses beneath all the
            libraries hiding it from you, is:
            
            The publisher has a set of tables (topics and partitions) of
            events, ordered and with each event having an assigned event
            sequence number.
            
            Publisher stores no state for consumers in any way.
            
            Instead, each consumer keeps a cursor (a variable holding an event
            sequence number) indicating how far it has read for each event log
            table it is reading.
            
            Consumer can then advance (or rewind) its own cursor in whatever
            way it wishes. The publisher is oblivious to any consumer side
            state.
            
             This is the fundamental piece of how event log publishing works (as
             opposed to queues, which are something else entirely; the article
             talks about both use cases).
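
             A minimal sketch of that consumer-side loop (hypothetical events
             and consumer_cursors tables):
            
               -- read the next batch after our cursor
               SELECT e.seq, e.payload
               FROM events e
               WHERE e.topic = 'orders'
                 AND e.seq > (SELECT last_seq FROM consumer_cursors
                              WHERE consumer = 'billing' AND topic = 'orders')
               ORDER BY e.seq
               LIMIT 100;
            
               -- after processing the batch, advance (or rewind) our own
               -- cursor; the publisher never sees or stores this
               UPDATE consumer_cursors
               SET last_seq = 1234  -- highest seq just processed
               WHERE consumer = 'billing' AND topic = 'orders';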
       
            ThreatSystems wrote 9 hours 32 min ago:
            Call me dumb - I'll take it! But if we really are trying to keep it
            simple simple...
            
             Then you just query from the event_receiver_svcX side for events
             with published > datetime and event_receiver_svcX = FALSE. Once
             read, set it to TRUE.
             
             To mitigate too many active connections, have a polling/backoff
             strategy and place a proxy in front of the actual database to
             proactively throttle where needed.
            
             But the event table:
             
               columns: event_id | event_msg_src | event_msg |
                        event_msg_published | event_receiver_svc1 |
                        event_receiver_svc2 | event_receiver_svc3
             
               example: evt01 | svc1 | json_message_format | datetime |
                        TRUE | TRUE | FALSE
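
             A sketch of that poll as a single statement, marking rows read in
             the same step (column names as above; the table name is assumed):
            
               -- svc2 claims everything it has not yet seen and marks it read
               UPDATE events
               SET event_receiver_svc2 = TRUE
               WHERE event_receiver_svc2 = FALSE
                 AND event_msg_published > now() - interval '1 day'
               RETURNING event_id, event_msg_src, event_msg;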
       
          lumost wrote 13 hours 31 min ago:
          I suspect the common issue with small scale projects is that it's not
          atypical for the engineers involved to perform a joint optimization
          of "what will work well for this project", and "what will work well
          at my next project/job." Particularly in startups where the
          turnover/employer stability is poor - this is the optimal action for
          the engineers involved.
          
          Unless employees expect that their best rewards are from making their
          current project as simple and effective as possible - it is highly
          unlikely that the current project will be as simple as it could be.
       
            jghn wrote 7 hours 32 min ago:
            What I've found to be even more common than resume driven
            development has been people believing that they either have or will
            have "huge scale". But the problem is that their goal posts are off
            by a few orders of magnitude and they will never, ever have the
            sort of scale required for these types of tools.
       
              bcrosby95 wrote 3 hours 45 min ago:
              The problem is when discussing techniques everyone uses the same
              terms but no one actually defines them.
       
              Moto7451 wrote 5 hours 51 min ago:
               I had this very same argument today. It was claimed that a
               once-per-year data mapping process for unstructured data that we
               sell via our product would not scale. The best part is that if we
               somehow had ten of these to do, it would still take less than a
               year. Currently it takes a single person a
              few weeks and makes millions of dollars. This is the sort of
              fiddly work that you can find an Ontologist for and they’re
              happy to do it for the pay.
              
              I’m unsure what is unattractive about this but I guess anything
              can be a reason to spend a year playing with LLMs these days.
              
              I’ve had the same problem with compliance work (lightly
              regulated market) and suddenly the scaling complaints go away
              when the renewals stop happening.
       
            procaryote wrote 10 hours 27 min ago:
            This is something to catch in hiring and performance evaluation.
            Hire people who don't build things to pad their own CVs, tell them
            to stop if you failed, fire them if that failed
       
              lumost wrote 4 hours 57 min ago:
              Hiring irrational players, or forcing rational people to act
              outside of their own self-interest is not a winning strategy
              either.
              
              There is nothing wrong with building stuff, or career
              development. There is also nothing wrong with experimentation.
              You certainly would not want to incentivize the opposite behavior
              of never building anything unless it had 10 guarantors of revenue
              and technical soundness.
              
              If you need people to focus, then you need them to be
              incentivized to focus. Do they see growth potential? Are they
              compensated such that other employers are undesirable? Do they
              see the risk of failure?
       
          Capricorn2481 wrote 15 hours 25 min ago:
          > If it can't fit on one node, do you really need a distributed
          queue? (Alternative: good ol' load balancing and REST API's, maybe
          with async semantics and retry semantics)
          
          That sounds distributed to me, even if it wires different tech
          together to make it happen. Is there something about load balancing
          REST requests to different DB nodes that is less complicated than
          Kafka?
       
            munchbunny wrote 14 hours 27 min ago:
            > Is there something about load balancing REST requests to
            different DB nodes that is less complicated than Kafka?
            
            To be clear I wasn't talking about DB nodes, I was talking about
            skipping an explicit queue altogether.
            
            But let's say you were asking about load balancing REST requests to
            different backend servers:
            
             Yes, in the sense that "load-balanced REST microservice with retry
             logic" is such a common pattern that it is better understood by
             SWEs and SREs everywhere.
            
            No, in the sense that if you really did just need a distributed
            queue then your life would be simpler reusing a battle-tested
            implementation instead of reinventing that wheel.
       
          oulipo2 wrote 15 hours 44 min ago:
          I want to rewrite some of my setup, we're doing IoT, and I was
          planning on
          
          MQTT -> Redpanda (for message logs and replay, etc) ->
          Postgres/Timescaledb (for data) + S3 (for archive)
          
          (and possibly Flink/RisingWave/Arroyo somewhere in order to do some
          alerting/incrementally updated materialized views/ etc)
          
          this seems "simple enough" (but I don't have any experience with
          Redpanda) but is indeed one more moving part compared to MQTT ->
          Postgres (as a queue) -> Postgres/Timescaledb + S3
          
          Questions:
          
          1. my "fear" would be that if I use the same Postgres for the queue
          and for my business database, the "message ingestion" part could
          block the "business" part sometimes (locks, etc)? Also perhaps when I
          want to update the schema of my database and not "stop" the inflow of
          messages, not sure if this would be easy?
          
          2. also that since it would write messages in the queue and then
          delete them, there would be a lot of GC/Vacuuming to do, compared to
          my business database which is mostly append-only?
          
          3. and if I split the "Postgres queue" from "Postgres database" as
          two different processes, of course I have "one less tech to learn",
          but I still have to get used to pgmq, integrate it, etc, is that
          really much easier than adding Redpanda?
          
          4. I guess most Postgres queues are also "simple" and don't provide
          "fanout" for multiple things (eg I want to take one of my IoT
          message, clean it up, store it in my timescaledb, and also archive it
          to S3, and also run an alert detector on it, etc)
          
          What would be the recommendation?
       
            DelaneyM wrote 10 hours 32 min ago:
            My suggestion would be even simpler:
            
            MQTT -> Postgres (+ S3 for archive)
            
            > 1. my "fear" would be that if I use the same Postgres for the
            queue and for my business database...
            
            This is a feature, not a bug.  In this way you can pair the
            handling of the message with the business data changes which result
            in the same transaction.  This isn't quite "exactly-once" handling,
            but it's really really close!
            
            > 2. also that since it would write messages in the queue and then
            delete them, there would be a lot of GC/Vacuuming
            
            Generally it's best practice in this case to never delete messages
            from a SQL "queue", but toggle them in-place to consumed and
            periodically archive to a long-term storage table.  This provides
            in-context historical data which can be super helpful when you need
            to write a script to undo or mitigate bad code which resulted in
            data corruption.
            
            Alternatively when you need to roll back to a previous state, often
            this gives you a "poor woman's undo", by restoring a time-stamped
            backup, copying over messages which arrived since the restoration
            point, then letting the engine run forwards processing those
            messages.  (This is a simplification of course, not always directly
            possible, but data recovery is often a matter of mitigations and
            least-bad choices.)
            
            Basically, saving all your messages provides both efficiency and
            data recovery optionality.
            
            > 3...
            
            Legit concern, particularly if you're trying to design your service
            abstraction to match an eventual evolution of data platform.
            
            > 4. don't provide "fanout" for multiple things
            
            What they do provide is running multiple handling of a queue,
            wherein you might have n handlers (each with its own "handled_at"
            timestamp column in the DB), and different handles run at different
            priorities.  This doesn't allow for workflows (ie a cleanup step)
            but does allow different processes to run on the same queue with
            different privileges or priorities.  So the slow process (archive?)
            could run opportunistically or in batches, where time-sensitive
            issues (alerts, outlier detection, etc) can always run instantly. 
            Or archiving can be done by a process which lacks access to any
            user data to algorithmically enforce PCI boundaries.  Etc.
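
             A sketch of the per-handler columns (table and column names are
             hypothetical); each handler only ever touches its own timestamp
             column:
            
               -- time-sensitive handler (alerts), run frequently
               UPDATE messages
               SET alert_handled_at = now()
               WHERE alert_handled_at IS NULL
               RETURNING id, payload;
            
               -- slow handler (archiving), run opportunistically in batches
               UPDATE messages
               SET archive_handled_at = now()
               WHERE archive_handled_at IS NULL
               RETURNING id;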
       
              sarchertech wrote 1 hour 2 min ago:
              > This is a feature, not a bug. In this way you can pair the
              handling of the message with the business data changes which
              result in the same transaction.
              
              That’s a particularly nasty trap. Devs will start using this
              everywhere and it makes it very hard to move this beyond Postgres
              when you need to.
              
              I’d keep a small transactional outbox for when you really need
              it and encourage devs to use it only when absolutely necessary.
              
              I’m currently cleaning up an application that has reached the
              limit of vertical scaling with Postgres. A significant part of
              that is because it uses Postgres for every background work queue.
              Every insert into the queue is in a transaction—do you really
              want to rollback your change because a notification job
              couldn’t be enqueued? Probably not. But the ability is there
              and is so easy to do that it gets overused.
              
              Now I get to go back through hundreds of cases and try to
              determine whether the transactional insert was intentional or
              just someone not thinking.
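
               A sketch of keeping the outbox write an explicit, deliberate
               step rather than the default (hypothetical tables):
              
                 BEGIN;
                 -- the business change
                 UPDATE orders SET status = 'paid' WHERE id = 42;
                 -- only when the event must not exist without the change
                 -- (and vice versa) does it belong in this transaction
                 INSERT INTO outbox (topic, payload)
                 VALUES ('order_paid', '{"order_id": 42}');
                 COMMIT;
                 -- a separate relay process reads the outbox and publishes
                 -- downstream, so the hot path never waits on the broker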
       
            munchbunny wrote 14 hours 17 min ago:
            > I want to rewrite some of my setup, we're doing IoT, and I was
            planning on
            
            Is this some scripting to automate your home, or are you trying to
            build some multi-tenant thing that you can sell?
            
            If it's just scripting to automate your home, then you could
            probably get away with a single server and on-disk/in-memory
            queuing, maybe even sqlite, etc. Or you could use it as an
            opportunity to learn those technologies, but you don't really need
            them in your pipeline.
            
            It's amazing how much performance you can get as long as the
            problem can fit onto a single node's RAM/SSD.
       
            notepad0x90 wrote 14 hours 37 min ago:
            Another good item to consider:
            
            n) Do you really need S3? is it cheaper than NFS storage on a
            compute node with a large disk?
            
            There are many cases where S3 is absolutely cheaper though.
       
            singron wrote 14 hours 38 min ago:
            Re 1. Look up non-blocking migrations for postgres. You can
            generally do large schema migrations while only briefly taking
            exclusive locks. It's a common mistake to perform a blocking
            migration and lock up your database (e.g. using CREATE INDEX on an
            existing table instead of CREATE INDEX CONCURRENTLY).
            
            There are globally shared resources, but for the most part, locks
            are held on specific rows or tables. Unrelated transactions
            generally won't block on each other.
            
            Also running a Very High Availability cluster is non-trivial. It
            can take a minute to fail over to a replica, and a busy database
            can take a while to replay the WAL after a reboot before it's
            functional again. Most people are OK with a couple minutes of
            downtime for the occasional reboot though.
            
            I think this really depends on your scale. Are you doing <100
            messages/second? Definitely stick with postgres. Are you doing
            >100k messages/second? Think about Kafka/redpanda. If you were
            comfortable with postgres (or you will be since you are building
            the rest of your project with it), then you want to stick with
            postgres longer, but if you are barely using it and would struggle
            to diagnose an issue, then you won't benefit from consolidating.
            
            Postgres will also be more flexible. Kafka can only do partitions
            and consumer groups, so if your workload doesn't look like that
            (e.g. out of order processing), you might be fighting Kafka.
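
             For example, the difference is a single keyword (index and table
             names are hypothetical):
            
               -- takes a lock that blocks writes to the table for the
               -- duration of the index build
               CREATE INDEX events_topic_idx ON events (topic, seq);
            
               -- builds without blocking concurrent inserts/updates/deletes;
               -- cannot run inside a transaction block, and a failed build
               -- leaves an INVALID index that has to be dropped
               CREATE INDEX CONCURRENTLY events_topic_idx ON events (topic, seq);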
       
            singron wrote 15 hours 8 min ago:
            Re (2) there is a lot of vacuuming, but the table is small, and
            it's usually very fast and productive.
            
            You can run into issues with scheduled queues (e.g. run this job in
            5 minutes) since the tables will be bigger, you need an index, and
            you will create the garbage in the index at the point you are
            querying (jobs to run now). This is a spectacularly bad pattern for
            postgres at high volume.
       
            zozbot234 wrote 15 hours 22 min ago:
            > Also perhaps when I want to update the schema of my database and
            not "stop" the inflow of messages, not sure if this would be easy?
            
            Doesn't PostgreSQL have transactional schema updates as a key
            feature? AIUI, you shouldn't be having any data loss as a result of
            such changes.  It's also common to use views in order to simplify
            the management of such updates.
       
        jasonthorsness wrote 15 hours 58 min ago:
        Using a single DBMS for many purposes because it is so flexible and
        “already there” from an operations perspective is something I’ve
        seen over and over again. It usually goes wrong eventually with one
        workload/use screwing up others but maybe that’s fine and a normal
        part of scaling?
        
        I think a bigger issue is the DBMS themselves getting feature after
        feature and becoming bloated and unfocused. Add the thing to Postgres
        because it is convenient! At least Postgres has a decent plugin
        approach. But I think more use cases might be served by standalone
        products than by add-ons.
       
          quaunaut wrote 15 hours 53 min ago:
          It's a normal part of scaling because often bringing in the new
          technology introduces its own ways of causing the exact same
          problems. Often they're difficult to integrate into automated tests
          so folks mock them out, leading to issues. Or a configuration
          difference between prod/local introduces a problem.
          
          Your DB on the other hand is usually a well-understood part of your
          system, and while scaling issues like that can cause problems,
          they're often fairly easy to predict- just unfortunate on timing.
          This means that while they'll disrupt, they're usually solved
          quickly, which you can't always say for additional systems.
       
        losvedir wrote 16 hours 1 min ago:
        Maybe I missed it in the design here, but this pseudo-Kafka Postgres
        implementation doesn't really handle consumer groups very well. The
        great thing about Kafka consumer groups is it makes it easy to spread
        the load over several instances running your service. They'll all
        connect using the same group, and different partitions will be assigned
        to the different instances. As you scale up or down, the partition
        responsibilities will be updated accordingly.
        
        You need some sort of server-side logic to manage that, and the
        consumer heartbeats, and generation tracking, to make sure that only
        the "correct" instances can actually commit the new offsets.
        Distributed systems are hard, and Kafka goes through a lot of trouble
        to ensure that you don't fail to process a message.
       
          mrkeen wrote 15 hours 47 min ago:
          Right, the author's worldview is that Kafka is resume-driven
          development, used by people "for speed" (even though they are only
          pushing 500KB/s).
          
          Of course the implementation based off that is going to miss a bit.
       
        dzonga wrote 16 hours 13 min ago:
         What's not spoken about in the above article?
         
         Ease of use. In Ruby, if I want to use Kafka I can use Karafka, or
         Redis streams via the redis library. Likewise, if Kafka is too complex
         to run there are countless alternatives which work as well - hell,
         even 0mq with client libraries.
         
         Now with the Postgres version I have to write my own stuff, and I
         don't know where it's gonna lead me.
         
         Postgres is scalable, no one doubts that. But what people forget to
         mention is the ecosystem around certain tools.
       
          enether wrote 9 hours 43 min ago:
          That's true.
          
           There seem to be two planes of ease of use - the app layer (library)
           and the infra layer (hosting).
          
          The app layer for Postgres is still in development, so if you
          currently want to run pub-sub (Kafka) on it, it will be extra work to
          develop that abstraction.
          
          I hope somebody creates such a library. It's a one-time cost but then
          will make it easier for everybody.
       
          j45 wrote 16 hours 9 min ago:
           I'm not sure where it says you have to write your own stuff; there
           seem to be some queues with libraries. [1] There is at least a
           Python example here.
          
   URI    [1]: https://github.com/dhamaniasad/awesome-postgres
       
            dagss wrote 43 min ago:
             Work queues are easy.
             
             It is significantly more work for the client-side implementation of
             event log consumers, which the article also talks about - for
             instance, persisting the client-side cursors. And I have not seen
             widely used standard implementations of those. (I started one
             myself once but didn't finish.)
       
        justinhj wrote 16 hours 16 min ago:
        As engineers we should try to use the right tool for the job, which
        means thinking about the development team's strengths and weaknesses as
        well as differentiating factors your product should focus on. Often we
        are working in the cloud and it's much easier to use a queue or a log
        database service than manage a bunch of sql servers and custom logic.
        It can be more cost effective too once you factor in the development
        time and operational costs.
        
         The fact that there is no common library that implements the author's
         strategy is a good sign that there is not much demand for this.
       
        me551ah wrote 16 hours 21 min ago:
         Imagine if early humans had decided that hammers alone were enough -
         that there was no need for specialized tools like scissors, chisels,
         axes, wrenches, shovels, or sickles, and that a hammer and fingers
         would do.
         
         Use the tool that is appropriate for the job. It is trivial to write
         code to use these tools with LLMs these days, the software is mature
         enough to rarely cause problems, and tools built for a purpose will
         always be more performant.
       
        this_user wrote 16 hours 23 min ago:
        The real two camps seem to be:
        
        1) People constantly chasing the latest technology with no regard for
        whether it's appropriate for the situation.
        
        2) People constantly trying to shoehorn their favourite technology into
        everything with no regard for whether it's appropriate for the
        situation.
       
          PeterCorless wrote 16 hours 14 min ago:
          2) above is basically "Give a kid a hammer, and everything becomes a
          nail."
          
          The third camp:
          
          3) People who look at a task, then apply a tool appropriate for the
          task.
       
          j45 wrote 16 hours 16 min ago:
          Kafka is anything but new.  It does get shoehorned too.
          
           Postgres has also been around for a long time, and a lot of people
           don't know all it can do beyond what we normally think about with a
           database.
          
          Appropriateness is a nice way to look  at it as long as it’s clear
          whether or not it’s about personal preferences and interpretations
          and being righteous towards others with them.
          
          Customers rarely care about the backend or what it’s developed in,
          except maybe for developer products.  It’s a great way to waste
          time though.
       
        shikhar wrote 16 hours 24 min ago:
        Postgres is a way better fit than Kafka if you want a large number of
        durable streams. But a flexible OLTP database like PG is bound to
         require more resources, and polling loops (not even long polling!) are
         not a great answer for following live updates.
        
        Plug: If you need granular, durable streams in a serverless context,
        check out s2.dev
       
          dagss wrote 48 min ago:
          s2.dev looks cool... I jumped around the home page a bit and couldn't
          perfectly grasp what it is quickly though. But if it is about
          decoupling the Kafka approach and client side libraries from the use
          of Kafka specifically I am cheering for you.
          
          Could you see using the s2.dev protocol on top of services using SQL
          in the way of the article, assigning event sequence numbers, as a
          good fit? Or is s2 fundamentally the component that assigns event
          numbers?
          
          I feel like we tried to do something similar to you, but for SQL DBs,
          but am not sure:
          
   URI    [1]: https://github.com/vippsas/feedapi-spec
       
        misja111 wrote 16 hours 24 min ago:
        > One camp chases buzzwords .. the other common sense
        
         How is it common sense to try to re-implement Kafka in Postgres?
         You probably need something similar but simpler. Then implement
         that! But if you really need something like Kafka, then... use Kafka!
        
        IMO the author is now making the same mistake as some Kafka evangelists
        that try to implement a database in Kafka.
       
          enether wrote 15 hours 23 min ago:
          I’m making the example of a pub sub system. I’m most familiar
          with Kafka so drew parallels to it. I didn’t actually implement
          everything Kafka offers - just two simple pub sub like queries.
       
        CuriouslyC wrote 16 hours 25 min ago:
        If you don't need all the bells and whistles of Kafka, NATS Jetstream
        is usually the way to go.
       
        loftsy wrote 16 hours 26 min ago:
        I am about to start a project. I know I want an event sourced
        architecture. That is, the system is designed around a queue, all
        actors push/pull into the queue. This article gives me some pause.
        
        Performance isn't a big deal for me. I had assumed that Kafka would
        give me things like decoupling, retry, dead-lettering, logging, schema
        validation, schema versioning, exactly once processing.
        
        I like Postgres, and obviously I can write a queue ontop of it, but it
        seems like quite a lot of effort?
       
          munchbunny wrote 12 hours 56 min ago:
          > I had assumed that Kafka would give me things like decoupling,
          retry, dead-lettering, logging, schema validation, schema versioning,
          exactly once processing.
          
          If you don't need a lot of perf but you place a premium on ergonomics
          and correctness, this sounds more like you want a workflow engine?
          
   URI    [1]: https://github.com/meirwah/awesome-workflow-engines
       
            lisbbb wrote 11 hours 12 min ago:
            One thing I learned with Kafka and Cassandra is that you are locked
            in to a design pretty early on. Then the business changes their
            mind and it takes a great deal of re-work and then they're accusing
            you of being incompetent because they are used to SQL projects that
            have way more flexibility.
       
            loftsy wrote 11 hours 57 min ago:
            Perhaps I do. I know that I don't want a system defined as a graph
            in yaml. Or no code. These options are over engineered for my use
            case. I'm pretty comfortable building some docker containers and
            operating them and this is the approach I want to use.
            
            I'm checking out the list.
       
          whalesalad wrote 14 hours 34 min ago:
          If you build it right, the underlying storage engine for your event
          stream should be swappable for any other event stream tech. Could be
          SQLite, PSQL, Kafka, Kinesis, SQS, Rabbit, Redis ... really anything
          can serve this need. The right tool will appear once you dial in your
          architecture. Treat storage as a black box API that has "push", "pop"
          etc commands. When your initial engine falls over, switch to a new
          one and expose that same API.
          
          The bigger question to ask is: will this storage engine be used to
          persist and retain data forever (like a database) or will it be used
          more for temporary transit of data from one spot to another.
       
          rileymichael wrote 14 hours 50 min ago:
          if you need a durable log (which it sounds like you do if you're
          going with event sourcing) that has those features, i'd suggest
          apache pulsar. you effectively get streams with message queue
          semantics (per-message acks, retries, dlq, etc.) from one system. it
          supports many different 'subscription types', so you can use it for a
          bunch of different use cases. running it on your own is a bit of a
          beast though and there's really only one hosted provider in the game
          (streamnative)
          
          note that kafka has recently started investing into 'queues' in
          KIP-932, but they're still a long way off from implementing all of
          those features.
       
          singron wrote 15 hours 54 min ago:
          Kafka also doesn't give you all those things. E.g. there is no
          automatic dead-lettering, so a consumer that throws an exception will
          endlessly retry and block all progress on that partition. Kafka only
          stores bytes, so schema is up to you. Exactly-once is good, but there
          are some caveats (you have to use kafka transactions, which are
          significantly different than normal operation, and any external
          system may observe at-least-once semantics instead). Similar
          exactly-once semantics would also be trivial in an RDBMS (i.e.
          produce and consume in same transaction).
          
          If you plan on retaining your topics indefinitely, schema evolution
          can become painful since you can't update existing records. Changing
          the number of partitions in a topic is also painful, and choosing the
          number initially is a difficult choice. You might want to build your
          own infrastructure for rewriting a topic and directing new writes to
          the new topic without duplication.
          
          Kafka isn't really a replacement for a database or anything
          high-level like a ledger. It's really a replicated log, which is a
          low-level primitive that will take significant work to build into
          something else.
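          
          For what it's worth, the produce-and-consume-in-one-transaction
          version is roughly this much SQL (a sketch only, with hypothetical
          inbox/outbox tables):
          
            BEGIN;
            -- handle message 42: record the consumption and emit the result
            -- in one atomic step, so the work can never be half-done
            UPDATE inbox
               SET processed_at = now()
             WHERE id = 42 AND processed_at IS NULL;
            INSERT INTO outbox (payload)
            VALUES ('result of processing message 42');
            COMMIT;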
       
            loftsy wrote 11 hours 50 min ago:
            Very interesting.
            
            I need a durable queue but not indefinitely. Max a couple of hours.
            
            What I want is Google PubSub but open source so I can self host.
       
              stackskipton wrote 7 hours 22 min ago:
               At small size, Beanstalkd ( [1] ) can get you pretty far.
              
              Larger, RabbitMQ can handle some pretty good workloads.
              
   URI        [1]: https://beanstalkd.github.io/
       
            oulipo2 wrote 15 hours 44 min ago:
            I want to rewrite some of my setup, we're doing IoT, and I was
            planning on
            
            MQTT -> Redpanda (for message logs and replay, etc) ->
            Postgres/Timescaledb (for data) + S3 (for archive)
            
            (and possibly Flink/RisingWave/Arroyo somewhere in order to do some
            alerting/incrementally updated materialized views/ etc)
            
            this seems "simple enough" (but I don't have any experience with
            Redpanda) but is indeed one more moving part compared to MQTT ->
            Postgres (as a queue) -> Postgres/Timescaledb + S3
            
            Questions:
            
            1. my "fear" would be that if I use the same Postgres for the queue
            and for my business database, the "message ingestion" part could
            block the "business" part sometimes (locks, etc)? Also perhaps when
            I want to update the schema of my database and not "stop" the
            inflow of messages, not sure if this would be easy?
            
            2. also that since it would write messages in the queue and then
            delete them, there would be a lot of GC/Vacuuming to do, compared
            to my business database which is mostly append-only?
            
            3. and if I split the "Postgres queue" from "Postgres database" as
            two different processes, of course I have "one less tech to learn",
            but I still have to get used to pgmq, integrate it, etc, is that
            really much easier than adding Redpanda?
            
            4. I guess most Postgres queues are also "simple" and don't provide
            "fanout" for multiple things (eg I want to take one of my IoT
            message, clean it up, store it in my timescaledb, and also archive
            it to S3, and also run an alert detector on it, etc)
            
            What would be the recommendation?
       
          mrkeen wrote 15 hours 58 min ago:
          Event-sourcing != queue.
          
          Event-sourcing is when you buy something and get a receipt, you go
          stick it in a shoe-box for tax time.
          
          A queue is you get given receipts, and you look at them in the
          correct order before throwing each one away.
       
            loftsy wrote 11 hours 48 min ago:
            True.
            
            I think my system is sort of both. I want to put some events in a
            queue for a finite set of time, process them as a single
            consolidated set, and then drop them all from the queue.
       
          mkozlows wrote 16 hours 12 min ago:
          If what you want is a queue, Kafka might be overkill for your needs.
          It's a great tool, but it definitely has a lot of complexity relative
          to a straightforward queue system.
       
          j45 wrote 16 hours 18 min ago:
          It might look like a lot of effort, but if you follow a
          tutorial/YouTube video step by step you will be surprised.
          
          It’s mostly registering the Postgres database functions, which is
          a one-time job.
          
          There are also pre-made Postgres extensions that already run the
          queue.
          
          These days I would likely consider starting with self-hosted
          Supabase, which has Postgres ready to tweak.
       
        dangoodmanUT wrote 16 hours 34 min ago:
        96 cores to get 240MB/s is terrible. Redpanda can do this with like one
        or two cores
       
          enether wrote 9 hours 46 min ago:
          hehe, yeah it is. I could have probably got a GB/s out of that if I
          ran it properly - but it's at the scale where you expect it to be
          terrible due to the mismatch of workloads
       
          greenavocado wrote 16 hours 30 min ago:
          Redpanda might be good (I don't know) but I threw up a little in my
          mouth when I opened their website and saw "Build the Agentic Data
          Plane"
       
            umanwizard wrote 16 hours 14 min ago:
            The marketing website of every data-related startup sounds like
            that now. I agree it’s dumb, but you can safely ignore it.
       
        rudderdev wrote 16 hours 35 min ago:
        Discussion on the same topic "Postgres over Kafka" -
        
   URI  [1]: https://news.ycombinator.com/item?id=44445841
       
        uberduper wrote 16 hours 40 min ago:
        Has this person actually benchmarked kafka? The results they get with
        their 96 vcpu setup could be achieved with kafka on the 4 vcpu setup.
        Their results with PG are absurdly slow.
        
        If you don't need what kafka offers, don't use it. But don't pretend
        you're on to something with your custom 5k msg/s PG setup.
       
          ljm wrote 15 hours 31 min ago:
          I wonder if OP could have got different results if they implemented a
          different schema as opposed to mimicking Kafka's setup with the
          partitions, consumer offsets, etc.
          
          I might well be talking out of my arse but if you're going to
          implement pub/sub in Postgres, it'd be worth designing around its
          strengths and going back to basics on event sourcing.
       
          adamtulinius wrote 16 hours 2 min ago:
          I remember doing 900k writes/s (non-replicated) already back on kafka
          0.8 with a random physical server with an old fusionio drive (says
          something about how long ago this was :D).
          
          It's a fair point that if you already have a pgsql setup, and only
          need a few messages here and there, then pg is fine. But yeah, the 96
          vcpu setup is absurd.
       
          blenderob wrote 16 hours 6 min ago:
          > Has this person actually benchmarked kafka?
          
          Is anyone actually reading the full article, or just reacting to the
          first unimpressive numbers you can find and then jumping on the first
          dismissive comment you can find here?
          
          Benchmarking Kafka isn't the point here. The author isn't claiming
          that Postgres outperforms Kafka. The argument is that Postgres can
          handle modest messaging workloads well enough for teams that don't
          want the operational complexity of running Kafka.
          
          Yes, the throughput is astoundingly low for such a powerful CPU but
          that's precisely the point. Now you know how well or how badly
          Postgres performs on a beefy machine. You don't always need
          Kafka-level scale.
          The takeaway is that Postgres can be a practical choice if you
          already have it in place.
          
          So rather than dismissing it over the first unimpressive number you
          find, maybe respond to the actual substance of TFA. Where's the line
          where Postgres stops being "good enough"? That'll be something nice
          to talk about.
       
            uberduper wrote 15 hours 1 min ago:
            Then the author should have gone on to discuss not just the
            implementation they now have to maintain, but also all the client
            implementations they'll have to keep re-creating for their custom
            solution. Or they could talk about all the industry standard tools
            that work with kafka and not their custom implementation.
            
            Or they could have not mentioned kafka at all and just demonstrated
            their pub/sub implementation with PG. They could have not tried to
            make it about the buzzword resume driven engineering people vs.
            common sense folks such as himself.
       
            adamtulinius wrote 16 hours 1 min ago:
            The problem is benchmarking on the 96 vcpu server, because at that
            point the author seems to miss the point of Kafka. That's just a
            waste of money for that performance.
       
              blenderob wrote 15 hours 57 min ago:
              And if the OP hadn't done that, someone here would complain, why
              couldn't the OP use a larger CPU and test if Postgres performs
              better? Really, there is no way the OP can win here, can they?
              
              I'm glad the OP benchmarked on the 96 vCPU server. So now I know
              how well Postgres performs on a large CPU. Not very well. But if
              the OP had done their benchmark on a low CPU, I wouldn't have
              learned this.
       
                cheikhcheikh wrote 11 hours 36 min ago:
                you're missing the point. Postgres performs well on large CPU.
                Postgres as-used by OP does not and is a waste of money. It's
                great that he benchmarked for a larger CPU, that's not what
                people are disputing, they are disputing the ridiculous
                conclusion.
       
          jaimebuelta wrote 16 hours 10 min ago:
          I may be reading a bit extra, but my main take on this is: "in your
          app, you probably already have PostgreSQL. You don't need to set up
          an extra piece of infrastructure to cover your extra use case, just
          reuse the tool you already have"
          
          It's very common to start adding more and more infra for use cases
          that, while they could technically be covered better with new
          stuff, can be served by already existing infrastructure, at least
          until you have proof that you need to grow it.
       
          darth_avocado wrote 16 hours 10 min ago:
          The 96 vcpu setup with 24xlarge instance costs about $20k/month on
          AWS before discounts. And one thing you don’t want in a pub sub
          system is a single instance taking all the read/writes. You can run a
          sizeable Kafka cluster for that kind of money in AWS.
       
          ozgrakkurt wrote 16 hours 10 min ago:
          This is why benchmarks should be hardware limit based IMO. Like I am
          maxing IOPS/throughput of this ssd or maxing out the network card
          etc.
          
          CPU is more tricky but I’m sure it can be shown somehow
       
          roozbeh18 wrote 16 hours 21 min ago:
          Just checked my single node Kafka setup which currently handles
          695.27k e/s (average daily) into elasticsearch without breaking a
          sweat. kafka has been the only stable thing in this whole setup.
          
          zeek -> kafka -> logstash -> elastic
       
            apetrov wrote 15 hours 0 min ago:
            out of curiosity, what does your service do that it handles almost
            700K events/sec?
       
          joaohaas wrote 16 hours 25 min ago:
          Had the same thoughts, weird it didn't include Kafka numbers.
          
          Never used Kafka myself, but we extensively use Redis queues with
          some scripts to ensure persistency, and we hit throughputs much
          higher than those in equivalent prod machines.
          
          Same for Redis pubsubs, but those are just standard non-persistent
          pubsubs, so maybe that gives it an upper edge.
       
          PeterCorless wrote 16 hours 27 min ago:
          Exactly. Just yesterday someone posted how they can do 250k
          messages/second with Redpanda (Kafka-compatible implementation) on
          their laptop. [1] Getting even less than that throughput on 3x
          c7i.24xlarge — a total of 288 vCPUs – is bafflingly wasteful.
          
          Just because you can do something with Postgres doesn't mean you
          should.
          
          > 1. One camp chases buzzwords.
          
          > 2. The other camp chases common sense
          
          In this case, is "Postgres" just being used as a buzzword?
          
          [Disclosure: I work for Redpanda; we provide a Kafka-compatible
          service.]
          
   URI    [1]: https://www.youtube.com/watch?v=7CdM1WcuoLc
       
            cestith wrote 14 hours 4 min ago:
            Your name sounds familiar. I think you may be one of the people at
            RedPanda with whom I’ve corresponded. It’s been a few years
            though, so maybe not.
            
            A colleague and I (mostly him, but on my advice) worked up a set of
            patches to accept and emit JSON and YAML in the CLI tool. Our use
            case at the time was setting things up with a config management
            system using the already built tool RedPanda provides without
            dealing with unstructured text.
            
            We got a lot of good use out of RedPanda at that org. We’ve both
            moved on to a new employer, though, and the “no offering RedPanda
            as a service” spooked the company away from trying it without
            paying for the commercial package. Y’all assured a couple of us
            that our use case didn’t count as that, but upper management and
            legal opted to go with Kafka just in case.
       
            kermatt wrote 15 hours 28 min ago:
            To the issue of complexity, is Redpanda suitable as a "single node
            implementation" where a Kafka cluster is not needed due to data
            volume, but the Kafka message bus pattern is desired?
            
            AKA "Medium Data" ?
       
              cestith wrote 14 hours 11 min ago:
              Yes. I’ve run projects where it was used that way.
              
              It also scales to very large clusters.
       
            kragen wrote 16 hours 13 min ago:
            This sounded interesting to me, and it looks like the plan is to
            make Redpanda open-source at some point in the future, but there's
            no timeline:
            
   URI      [1]: https://github.com/redpanda-data/redpanda/tree/dev/license...
       
              PeterCorless wrote 16 hours 2 min ago:
              Correct. Redpanda is source-available.
              
              When you have C++ code, the number of external folks who want to
              — and who can effectively, actively contribute to the code —
              drops considerably. Our "cousins in code," ScyllaDB last year
              announced they were moving to source-available because of the
              lack of OSS contributors:
              
              > Moreover, we have been the single significant contributor of
              the source code. Our ecosystem tools have received a healthy
              amount of contributions, but not the core database. That makes
              sense. The ScyllaDB internal implementation is a C++,
              shard-per-core, future-promise code base that is extremely hard
              to understand and requires full-time devotion. Thus source-wise,
              in terms of the code, we operated as a full open-source-first
              project. However, in reality, we benefitted from this no more
              than as a source-available project.
              
              Source: [1] People still want to get free utility of the
               source-available code. Less commonly they want to be able to see the
              code to understand it and potentially troubleshoot it. Yet asking
              for active contribution is, for almost all, a bridge too far.
              
   URI        [1]: https://www.scylladb.com/2024/12/18/why-were-moving-to-a...
       
                zX41ZdbW wrote 13 hours 5 min ago:
                The statement is untrue. For example, ClickHouse is in C++, and
                it has thousands of contributors with hundreds of external
                contributors every month.
       
                  kragen wrote 31 min ago:
                  I think it's reasonably common for accepting external
                  contributions to an open-source project to be more trouble
                  than it's worth, just because most programmers aren't very
                  good.
       
                rplnt wrote 14 hours 54 min ago:
                You can be open source and not take contributions. This
                argument doesn't make sense to me. Just stop doing the
                expensive part and keep the license as is.
       
                  kragen wrote 14 hours 25 min ago:
                  I think the argument is that, if they expected to receive
                  high-quality contributions, then they'd be willing to take
                  the risk of competitors using their software to compete with
                  them, which an open-source license would allow.  It usually
                  doesn't work out that way; with a strong copyleft license,
                  your competitors are just doing free R&D improving your own
                  product, unless they can convince your customers that they
                  know more about the product than the guys who wrote it in the
                  first place.  But that's usually the fear.
                  
                  On the other hand, if they don't expect people outside their
                  company to know C++ well enough to contribute usefully, they
                  probably shouldn't expect people outside their company to be
                  able to compete with them either.
                  
                  Really, though, the reason to go open-source is because it
                  benefits your customers, not because you get contributions,
                  although you might.  (This logic is unconvincing if you fear
                  they'll stop being your customers, of course.)
       
                cyphar wrote 15 hours 37 min ago:
                You are obviously free to choose to use a proprietary license,
                that's fine -- but the primary purpose of free licenses has
                very little to do with contributing code back upstream.
                
                As a maintainer of several free software projects, there are
                lots of issues with how projects are structured and user
                expectations, but I struggle to see how proprietary licenses
                help with that issue (I can see -- though don't entirely buy --
                the argument that they help with certain business models, but
                that's a completely different topic). To be honest, I have no
                interest in actively seeking out proprietary software, but I'm
                certainly in the minority on that one.
       
                zozbot234 wrote 15 hours 43 min ago:
                Note that prior to its license change ScyllaDB was using AGPL. 
                This is a fully FLOSS license but may have been viewed
                nonetheless as somewhat unfriendly by potential outside
                contributors.  The ScyllaDB license change was really more
                about not wanting to expend development effort on maintaining
                multiple versions of the code (AGPL licensed and fully
                proprietary), so they went for sort of a split-the-difference
                approach where the fully proprietary version was in turn made
                source-available.
                
                (Notably, they're not arguing that open source reusers have
                been "unfair" to them and freeloaded on their effort, which was
                the key justification many others gave for relicensing their
                code under non-FLOSS terms.)
                
                In case anyone here is looking for a fully-FLOSS contender that
                they may want to perhaps contribute to, there's the interesting
                project YugabyteDB
                
   URI          [1]: https://github.com/yugabyte/yugabyte-db
       
                  cyphar wrote 15 hours 32 min ago:
                  I think AGPL/Proprietary license split and eventual move to
                  proprietary is just a slightly less overt way of the same
                  "freeloader" argument. The intention of the original license
                  was to make the software unpalatable to enterprises unless
                  you buy the proprietary license, and one "benefit" of the
                  move (at least for the bean counters) is that it stops even
                  AGPL-friendly enterprises from being able to use the software
                  freely.
                  
                  (Personally, I have no issues with the AGPL and Stallman
                  originally suggested this model to Qt IIRC, so I don't really
                  mind the original split, but that is the modern intent of the
                  strategy.)
       
                    kragen wrote 14 hours 35 min ago:
                    I think the intention of the original license was to make
                    the software unpalatable to SaaS vendors who want to keep
                    their changes proprietary, not unpalatable to enterprises
                    in general.
       
                kragen wrote 15 hours 59 min ago:
                Right, open source is generally of benefit to users, not to the
                author, and users do get some of that benefit from being able
                to see the source.  I wouldn't want to look at it myself,
                though, for legal reasons.
       
            mxey wrote 16 hours 14 min ago:
            Doesn’t Kafka/Redpanda have to fsync for every message?
       
              noselasd wrote 10 hours 0 min ago:
              No, it's for every batch.
       
              UltraSane wrote 15 hours 33 min ago:
              On enterprise grade storage writes go to NVRAM buffers before
              being flushed to persistent storage so this isn't much of a
              bottleneck.
       
                mxey wrote 14 hours 38 min ago:
                The context was somebody doing this on their laptop.
       
                  UltraSane wrote 11 hours 1 min ago:
                  I was expanding the context
       
              uberduper wrote 16 hours 11 min ago:
              I've never looked at redpanda, but kafka absolutely does not.
              Kafka uses mmapped files and the page cache to manage durable
              writes. You can configure it to fsync if you like.
       
                mxey wrote 16 hours 10 min ago:
                If I don’t actually want durable and consistent data, I could
                also turn off fsync in Postgres …
       
                  mrkeen wrote 15 hours 35 min ago:
                  The tradeoff here is that Kafka will still work perfectly if
                  one of its instances goes down.  (Or you take it down, for
                  upgrades, etc.)
                  
                  Can you lose one Postgres instance?
       
                    zozbot234 wrote 15 hours 18 min ago:
                    AIUI Postgres has high-availability out of the box, so it's
                    not a big deal to "lose" one as long as a secondary can
                    take over.
       
                      mxey wrote 14 hours 39 min ago:
                      Only replication is built-in, you need to add a cluster
                      manager like Patroni to make it highly-available.
       
              PeterCorless wrote 16 hours 11 min ago:
              Yes, for Redpanda. There's a blog about that:
              
              "The use of fsync is essential for ensuring data consistency and
              durability in a replicated system. The post highlights the common
              misconception that replication alone can eliminate the need for
              fsync and demonstrates that the loss of unsynchronized data on a
              single node still can cause global data loss in a replicated
              non-Byzantine system."
              
              However, for all that said, Redpanda is still blazingly fast.
              
   URI        [1]: https://www.redpanda.com/blog/why-fsync-is-needed-for-da...
       
                uberduper wrote 15 hours 22 min ago:
                I'm highly skeptical of the method employed to simulate
                unsync'd writes in that example.
                Using a non-clustered zookeeper and then just shutting it down,
                breaking the kafka controller and preventing any kafka cluster
                state management (not just preventing partition leader
                election) while manually corrupting the log file. Oof. Is it
                really _that_ hard to lose ack'd data from a kafka cluster that
                you had to go to such contrived and dubious lengths?
       
                  jackvanlightly wrote 10 hours 9 min ago:
                  We fixed that particular issue:
                  
   URI            [1]: https://jack-vanlightly.com/blog/2023/8/17/kafka-kip...
       
                  mxey wrote 14 hours 15 min ago:
                  I just read the post and didn’t find it contrived at all.
                  The point is to simulate a) network isolation and b) loss of
                  recent writes.
       
                  mxey wrote 14 hours 35 min ago:
                  > while manually corrupting the log file
                  
                  To be fair, since without fsync you don't have any ordering
                  guarantees for your writes, a crash has a good chance of
                  corrupting your data, not just losing recent writes.
                  
                   That's why in PostgreSQL it's feasible to disable [1] but not
                   to disable [2].
                  
   URI            [1]: https://www.postgresql.org/docs/18/runtime-config-wa...
   URI            [2]: https://www.postgresql.org/docs/18/runtime-config-wa...
       
                  kasey_junk wrote 15 hours 12 min ago:
                  Kafka no longer has Zookeeper dependency and RedPanda never
                  did (this is just an aside for those reading along, not a
                  rebuttal).
       
              kragen wrote 16 hours 12 min ago:
              Definitely not in the case of Kafka.  Even with SSD that would
              limit it to around 100kHz.  Batch commit allows Kafka (and
              Postgres) to amortize fsync overhead over many messages.
       
            j45 wrote 16 hours 25 min ago:
            Is it about what Kafka could get or what you need right now.
            
            Kafka is a full on steaming solution.
            
            Postgres isn’t a buzzword. It can be a capable placeholder until
            it’s outgrown. One can arrive at Kafka with a more informed run
            history from Postgres.
       
              kitd wrote 16 hours 18 min ago:
              > Kafka is a full on steaming solution.
              
              Freudian slip? ;)
       
                j45 wrote 16 hours 1 min ago:
                Haha, and a typo!
       
          010101010101 wrote 16 hours 30 min ago:
          > If you don't need what kafka offers, don't use it.
          
          This is literally the point the author is making.
       
            blenderob wrote 15 hours 52 min ago:
            >> If you don't need what kafka offers, don't use it.
            
            > This is literally the point the author is making.
            
            Exactly! I just don't understand why HN invariably always tends to
            bubble up the most dismissive comments to the top that don't even
            engage with the actual subject matter of the article!
       
            PeterCorless wrote 16 hours 20 min ago:
            But in this case, it is like saying "You don't need a fuel truck.
            You can transport 9,000 gallons of gasoline between cities by
            gathering 9,000 1-gallon milk jugs and filling each, then getting
            4,500 volunteers to each carry 2 gallons and walk the entire
            distance on foot."
            
            In this case, you do just need a single fuel truck. That's what it
            was built for. Avoiding using a design-for-purpose tool to achieve
            the same result actually is wasteful. You don't need 288 cores to
            achieve 243,000 messages/second. You can do that kind of throughput
            with a Kafka-compatible service on a laptop.
            
            [Disclosure: I work for Redpanda]
       
              ilkhan4 wrote 15 hours 52 min ago:
              I'll push the metaphor a bit: I think the point is that if you
              have a fleet of vehicles you want to fuel, go ahead and get a
              fuel truck and bite off on that expense. However, if you only
              have 1 or 2, a couple of jerry cans you probably already have + a
              pickup truck is probably sufficient.
       
              kragen wrote 16 hours 4 min ago:
              Getting a 288-core machine might be easier than setting up Kafka;
              I'm guessing that it would be a couple of weeks of work to learn
              enough to install Kafka the first time.  Installing Postgres is
              trivial.
       
                EdwardDiego wrote 49 min ago:
                Just use Strimzi if you're in a K8s world (disclosure used to
                work on Strimzi for RH, but I still think it's far better than
                Helm charts or fully self-managed, and far cheaper than fully
                managed).
       
                  kragen wrote 32 min ago:
                  Thanks! I didn't know about Strimzi!
       
                PeterCorless wrote 15 hours 51 min ago:
                The only thing that might take "weeks" is procrastination.
                Presuming absolutely no background other than general data
                engineering, a decent beginner online course in Kafka (or
                Redpanda) will run about 1-2 hours.
                
                You should be able to install within minutes.
       
                  kragen wrote 15 hours 35 min ago:
                  I mean, setting up Zookeeper, tweaking the kernel settings,
                  configuring the hardware, the kind of stuff mentioned in
                  guides like [1] and [2] .  Apparently you can do without
                  Zookeeper now, but that's another choice to make, possibly
                  doing careful experiments with both choices to see what's
                  better.  Much more discussion in [3] .
                  
                  None of this applies to Redpanda.
                  
   URI            [1]: https://medium.com/@ankurrana/things-nobody-will-tel...
   URI            [2]: https://dungeonengineering.com/the-kafkaesque-nightm...
   URI            [3]: https://news.ycombinator.com/item?id=37036291
       
                    PeterCorless wrote 15 hours 29 min ago:
                    True. Redpanda does not use Zookeeper.
                    
                    Yet to also be fair to the Kafka folks, Zookeeper is no
                    longer default and hasn't been since April 2025 with the
                    release of Apache Kafka 4.0:
                    
                    "Kafka 4.0's completed transition to KRaft eliminates
                    ZooKeeper (KIP-500), making clusters easier to operate at
                    any scale."
                    
                    Source:
                    
   URI              [1]: https://developer.confluent.io/newsletter/introduc...
       
                      EdwardDiego wrote 47 min ago:
                      Good on you for being fair in this discussion :)
       
                      kragen wrote 14 hours 34 min ago:
                      Right, I was talking about installing Kafka, not
                      installing Redpanda.  Redpanda may be perfectly fine
                      software, but bringing it up in that context is a bit
                      apples-and-oranges since it's not open-source:
                      
   URI                [1]: https://news.ycombinator.com/item?id=45748426
       
                brianmcc wrote 15 hours 53 min ago:
                "Lots of the team knows Postgres really well, nobody knows
                Kafka at all yet" is also an underrated factor in making
                choices. "Kafka was the ideal technical choice but we screwed
                up the implementation through well-intentioned inexperience"
                being an all too plausible outcome.
       
                  freedomben wrote 15 hours 21 min ago:
                  Indeed, I've seen this happen first hand where there was
                  really only one guy who really "knew" Kafka, and it was too
                  big of a job for just him.  In that case it was fine until he
                  left the company, and then it became a massive albatross and
                  a major pain point.  In another case, the eng team didn't
                  really have anyone who really "knew" Kafka but used a managed
                  service thinking it would be fine.  It was until it wasn't,
                  and switching away is not a light lift, nor is mass educating
                  the dev team.
                  
                  Kafka et al definitely have their place, but I think most
                  people would be much better off reaching for a simpler queue
                  system (or for some things, just using Postgres) unless you
                  really need the advanced features.
       
                    EdwardDiego wrote 48 min ago:
                    I'm wondering why there wasn't any push for the Kafka guy
                    to share his knowledge within his team, or to other teams?
       
            uberduper wrote 16 hours 20 min ago:
            It seems like their point was to criticize people for using new
            tech instead of hacking together unscalable solutions with their
            preferred database.
       
              EdwardDiego wrote 1 hour 1 min ago:
              Which is crazy, because Kafka is like olllld compared to
              competing tech like Pulsar and RedPanda. I'm trying to remember
              what year I started using v0.8, it was probably mid-late 2010s?
       
              blenderob wrote 15 hours 54 min ago:
              That wasn't their point. Instead of posting snarky comments,
              please review the site guidelines:
              
              "Please respond to the strongest plausible interpretation of what
              someone says, not a weaker one that's easier to criticize."
       
                lenkite wrote 15 hours 0 min ago:
                But honestly, isn't that the strongest plausible interpretation
                according to the "site guidelines" ? When one explicitly says
                that the one camp chases "buzzwords" and the other chases
                "common sense", how else are you supposed to interpret it ?
       
                  blenderob wrote 14 hours 35 min ago:
                  > how else are you supposed to interpret it?
                  
                  It's not so hard. You interpret it how it is written. Yes,
                  they say one camp chases buzzwords and another chases common
                  sense. Critique that if you want to. That's fine.
                  
                  But what's not written in the OP is some sort of claim that
                  Postgres performs better than Kafka. The opposite is written.
                  The OP acknowledges that Kafka is fast. Right there in the
                  title! What's written is OP's experiments and data that shows
                  Postgres is slow but can be practical for people who don't
                  need Kafka. Honestly I don't see anything bewildering about
                  it. But if you think they're wrong about Postgres being slow
                  but practical that's something nice to talk about. What's not
                  nice is to post snarky comments insinuating that the OP is
                  asking you to design unscalable solutions.
       
          loire280 wrote 16 hours 32 min ago:
          In fact, a properly-configured Kafka cluster on minimal hardware will
          saturate its network link before it hits CPU or disk bottlenecks.
       
            EdwardDiego wrote 57 min ago:
            Depends on how you configure the clients, ask me how I know that
            using a K8s pod id in a consumer group id is a really bad idea - or
            how setting batch size to 1 and linger to 0 is a really bad idea -
            the former blows up disk (all those unique consumer groups cause
            the backing topic to consume a lot of space, as the topic is by
            default only compacted) and the latter thrashes request handler CPU
            time.
       
            altcognito wrote 15 hours 12 min ago:
            This doesn't even make sense. How do you know what the network
            links or the other bottlenecks are like? There are a grandiose
            number of assumptions being made here.
       
              loire280 wrote 14 hours 19 min ago:
              There is a finite and relatively narrow range of ratios of CPU,
              memory, and network throughput in both modern cloud offerings and
              bare hardware configurations.
              
              Obviously it's possible to build, for example, a machine with 2
              cores, a 10Gbps network link, and a single HDD that would falsify
              my statement.
       
                altcognito wrote 6 hours 2 min ago:
                But the workload matters. Even the comment in the article
                doesn't completely make sense for me in that way -- if your
                workload is 50 operations per byte transferred versus 5000
                operations per byte transferred, there is a considerable
                difference in hardware requirements.
       
            UltraSane wrote 15 hours 32 min ago:
            A network link can be anything from 1Gbps to 800Gbps.
       
            theK wrote 16 hours 24 min ago:
            Isn't that true for everything on the cloud? I thought we are long
            into the era where your disk comes over the network there.
       
            j45 wrote 16 hours 24 min ago:
            But it can do so many processes a second I’ll be able to scale to
            the moon before I ever launch.
       
        ownagefool wrote 16 hours 40 min ago:
        The camps are wrong.
        
        There's poles.
        
        1. Is folks constantly adopting the new tech, whatever the motivation,
        and 
        2. I learned a thing and shall never learn anything else, ever.
        
        Of course nobody exists actually on either pole, but the closer you are
        to either, the less pragmatic you are likely to be.
       
          jppope wrote 4 hours 58 min ago:
          So 1. RDD 2. Curmudgeon and 3. People who rationally look at the
          problem and try to solve it in the best way possible (omitted in the
          article)
       
          binarymax wrote 16 hours 23 min ago:
          This is it right here. My foil is the Elasticsearch replacement
          because PG has inverted indices. The ergonomics and tunability of
          these in PG are terrible compared to ES.  Yes, it will search, but I
          wouldn’t want to be involved in constructing or maintaining that
          search.
       
          wosined wrote 16 hours 33 min ago:
          I am the third pole: 3. Everything we have currently sucks and what
          is new will suck for some hitherto unknown reason.
       
            antonvs wrote 14 hours 11 min ago:
            If you choose wisely, things should suck less overall as you move
            forward. That's kind of the overall goal, otherwise we'd all still
            be toggling raw machine code into machines using switches.
       
            ownagefool wrote 16 hours 25 min ago:
            Heh, me too.
            
            I think it's still just 2 poles.  However, I probably shouldn't
            have prescribed motivation to latter pole, as I purposely did not
            with the former.
            
            Pole 2 is simply never adopt anything new ever, for whatever the
            motivation.
       
        vbezhenar wrote 16 hours 40 min ago:
        How do you implement "unique monotonically-increasing offset number"?
        
        Naive approach with sequence (or serial type which uses sequence
        automatically) does not work. Transaction "one" gets number "123",
        transaction "two" gets number "124". Transaction "two" commits, now
        table contains "122", "124" rows and readers can start to process it.
        Then transaction "one" commits with its "123" number, but readers
        already past "124". And transaction "one" might never commit for
        various reasons (e.g. client just got power cut), so just waiting for
        "123" forever does not cut it.
        
        Notifications can help with this approach, but then you can't restart
        old readers (and you don't need monotonic numbers at all).
       
          dagss wrote 1 hour 27 min ago:
          The article describes using a dedicated table for the counter, one
          row per table, in the same transaction (so parallel writers to the
          same table wait for each other through a lock on that row).
          
          If you would rather have readers wait and keep writers parallel,
          there is a more complex scheme here:
          
   URI    [1]: https://blog.sequinstream.com/postgres-sequences-can-commit-...
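          
          To make the counter approach in the first paragraph concrete, a
          rough sketch (column names invented, not necessarily the article's
          exact schema):
          
            BEGIN;
            -- the row lock on this table's counter serializes writers, so a
            -- later offset can never become visible before an earlier one
            UPDATE log_counter
               SET last_offset = last_offset + 1
             WHERE table_name = 'events'
            RETURNING last_offset;
            
            INSERT INTO events (offset_id, payload)
            VALUES (/* the offset returned above */ 43, '{"msg": "hello"}');
            COMMIT;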
       
          hunterpayne wrote 8 hours 34 min ago:
          The "unique monotonically-increasing offset number" use case works
          just fine. "I need a unique sequence number in ascending order"
          doesn't (your problem). Why you need two queues to share the same
          sequence object is your problem, I think.
          
          Another way to speed it up is to grab unique numbers in batches
          instead of just getting them one at a time.  No idea why you want
          your numbers to be in absolute sequence.  That's hard in a
          distributed system.  Probably best to relax that constraint and find
          some other way to track individual pieces of data.  Or even better,
          find a way so you don't have to track individual rows in a
          distributed system.
       
          name_nick_sex_m wrote 9 hours 8 min ago:
          Funnily enough, I was just designing a queue exactly this way, thanks
          for catching this. (chat GPT meanwhile was assuring me the approach
          was airtight)
       
            1oooqooq wrote 7 hours 10 min ago:
            you're really trying to vibe architect?
       
          procaryote wrote 10 hours 24 min ago:
          In the article, they just don't and instead do "SELECT FOR UPDATE
          SKIP LOCKED" to make sure things get picked up once.
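          
          In case it helps, that claim query usually looks something like
          this (hypothetical jobs table, not necessarily the article's exact
          schema):
          
            -- each worker runs this on its own; SKIP LOCKED means a worker
            -- never blocks on rows another worker has already claimed
            WITH next_job AS (
                SELECT id
                  FROM jobs
                 WHERE status = 'pending'
                 ORDER BY id
                 LIMIT 1
                   FOR UPDATE SKIP LOCKED
            )
            UPDATE jobs
               SET status = 'running', claimed_at = now()
              FROM next_job
             WHERE jobs.id = next_job.id
            RETURNING jobs.id, jobs.payload;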
       
            dagss wrote 1 hour 15 min ago:
            The article speaks of two use cases, a work queue and a pub/sub
            event log. You are talking about the first; the comment you
            replied to is about the latter. You need event sequence numbering
            for the pub/sub event log.
            
            In a sense this is what Kafka IS architecturally: The component
            that assigns event sequence numbers.
       
          grogers wrote 10 hours 42 min ago:
          You can fill in a noop for sequence number 123 after a timeout. You
          also need to be able to kill old transactions so that the transaction
          which was assigned 123 isn't just chilling out (which would block
          writing the noop).
          
          Another approach which I used in the past was to assign sequence
          numbers after committing. Basically a separate process periodically
          scans the set of un-sequenced rows, applies any application defined
          ordering constraints, and writes in SNs to them. This can be
          surprisingly fast, like tens of thousands of rows per second. In my
          case, the ordering constraints were simple, basically that for a
          given key, increasing versions get increasing SNs. But I think you
          could have more complex constraints, although it might get tricky
          with batch boundaries.
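          
          Rough shape of that sequencer pass (made-up column names), assuming
          only a single sequencer process ever runs it:
          
            BEGIN;
            -- committed but not yet sequenced rows, in the application's
            -- required order
            SELECT id
              FROM events
             WHERE offset_id IS NULL
             ORDER BY key, version;
            
            -- then, for each returned id in that order:
            UPDATE events SET offset_id = nextval('event_offset_seq')
             WHERE id = $1;
            
            COMMIT;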
       
            vbezhenar wrote 10 hours 25 min ago:
            My approach is: select max(id), and commit with id=max(id)+1. If
            commit worked, then all good. If commit failed because of unique
            index violation, repeat the transaction from the beginning. I think
            it should work correctly with proper transaction isolation level.
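            
            Concretely, something like this, with a unique index on the
            offset column doing the conflict detection (sketch, names made
            up):
            
              -- retried by the application in a loop
              BEGIN;
              SELECT coalesce(max(offset_id), 0) + 1 FROM events;  -- e.g. 124
              INSERT INTO events (offset_id, payload) VALUES (124, '...');
              COMMIT;
              -- on unique_violation (SQLSTATE 23505): ROLLBACK and retry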
       
              grogers wrote 5 hours 51 min ago:
              That limits you to a few tens of TPS since everything is trying
              to write the same row which must happen serially. I wouldn't
              start out with that solution since it'll be painful to change to
              something more scalable later. Migrating to something better will
              probably involve more writes per txn during the migration, so it
              gets even worse before it gets better.
       
                dagss wrote 1 hour 12 min ago:
                The counter in another table used in the article also
                serializes all writers to the table. Probably better than the
                max() approach but still serial.
                
                There needs to be serialization happening somewhere, either by
                writers or readers waiting for their turn.
                
                What Kafka "is" in my view is simply the component that assigns
                sequential event numbers. So if you publish to Kafka, Kafka
                takes the same locks...
                
                 The way to increase throughput is to add more shards
                 (partitions) to a topic.
       
              name_nick_sex_m wrote 9 hours 9 min ago:
              Does the additional read query cause concern? Or mostly this is
              ok? 
              (i'm sure the answer depends on scale)
       
          munchbunny wrote 14 hours 8 min ago:
          I have this problem in the system I work on - the short nuance-less
          answer from my experience is that, once your scale gets large enough,
          you can't prevent ordering issues entirely and you have to build the
          resilience into the architecture and the framing of the problem. You
          often end up paying for consistency with latency.
       
            dagss wrote 1 hour 5 min ago:
            I think you may be talking past each other. In the approach taken
            in the article and the parent comment, if the event sequence number
            allocation of the writer races the reader cursor position in the
            wrong way, events will NEVER BE DELIVERED.
            
            So it is a much more serious issue at stake here than event
            ordering/consistency.
            
            As it happens, if you use event log tables in SQL "the Kafka way"
            you actually get a guarantee on event ordering too as a side
            effect, but that is not the primary goal.
            
            More detailed description of problem:
            
   URI      [1]: https://github.com/vippsas/mssql-changefeed/blob/main/MOTI...
       
          xnorswap wrote 16 hours 11 min ago:
          It's a tricky problem, I'd recommend reading DDIA, it covers this
          extensively: [1] You can generate distributed monotonic number
          sequences with a Lamport Clock. [2] The wikipedia entry doesn't
          describe it as well as that book does.
          
          It's not the end of the puzzle for distributed systems, but it gets
          you a long way there.
          
          See also Vector clocks. [3] Edit: I've found these slides, which are
          a good primer for solving the issue, page 70 onwards "logical time":
          
   URI    [1]: https://www.oreilly.com/library/view/designing-data-intensiv...
   URI    [2]: https://en.wikipedia.org/wiki/Lamport_timestamp
   URI    [3]: https://en.wikipedia.org/wiki/Vector_clock
   URI    [4]: https://ia904606.us.archive.org/32/items/distributed-systems...
       
          singron wrote 16 hours 16 min ago:
          The log_counter table tracks this. It's true that a naive solution
          using sequences does not work for exactly the reason you say.
       
          sigseg1v wrote 16 hours 19 min ago:
          What about a `DEFERRABLE INITIALLY DEFERRED` trigger that increments
          a sequence only on commit?
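          
          Something like the following (hypothetical names)? The deferred
          trigger fires when COMMIT is issued, which narrows the window the
          parent describes, but as far as I can tell it doesn't fully close
          it: the nextval() call and the actual commit can still interleave
          across transactions.
          
            CREATE FUNCTION assign_offset() RETURNS trigger AS $$
            BEGIN
                -- runs at COMMIT time because the trigger is deferred
                UPDATE events SET offset_id = nextval('event_offset_seq')
                 WHERE id = NEW.id;
                RETURN NULL;
            END;
            $$ LANGUAGE plpgsql;
            
            CREATE CONSTRAINT TRIGGER set_offset_on_commit
                AFTER INSERT ON events
                DEFERRABLE INITIALLY DEFERRED
                FOR EACH ROW EXECUTE FUNCTION assign_offset();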
       
          theK wrote 16 hours 26 min ago:
          > unique monotonically-increasing offset number
          
          Isn't it a bit of a white-whale thing to hope that such a number
          can solve all one's subscriber problems? AFAIK even with Kafka this
          isn't completely watertight.
       
        odie5533 wrote 16 hours 44 min ago:
        How fast is failover?
       
        johnyzee wrote 16 hours 47 min ago:
        Seems like you would at the very least need a fairly thick application
        layer on top of Postgres to make it look and act like a messaging
        system. At that point, seems like you have just built another messaging
        system.
        
        Unless you're a five man shop where everybody just agrees to use that
        one table, make sure to manage transactions right, cron job retention,
        YOLO clustering, etc. etc.
        
        Performance is probably last on the list of reasons to choose Kafka
        over Postgres.
       
          j45 wrote 16 hours 8 min ago:
          You expose the API on Postgres much like any other group of
          developers would, and call it a day.
          
          There are several implementations of queues, which increases the
          chance of accomplishing what one is after.
          
   URI    [1]: https://github.com/dhamaniasad/awesome-postgres
       
            dagss wrote 57 min ago:
            There's a lot of logic involved client side regarding managing read
            cursors and marking events as processed consumer side. Possibly
            also client side error queues and so on.
            
            I truly miss a good standard client side library following the
            Kafka-in-SQL philosophy. I started on in my previous job and we
            used it internally but it never got good enough that it would be
            widely used elsewhere, and now I work somewhere else...
            
            (PS: Talking about the pub/sub Kafka-like usecase, not the work
            queue FOR UPDATE usecase)
       
        jimbokun wrote 16 hours 51 min ago:
        For me the killer feature of Kafka was the ability to set the offset
        independently for each consumer.
        
        In my company most of our topics need to be consumed by more than one
        application/team, so this feature is a must have.  Also, the ability to
        move the offset backwards or forwards programmatically has been a life
        saver many times.
        
        Does Postgres support this functionality for their queues?
       
          altcognito wrote 16 hours 26 min ago:
          The article basically states unless you need a lot of throughput, you
          probably don't need Kafka. (my interpretation extends to say) You
          probably don't need offsets because you don't need multi-threaded
          support because you don't need multiple threads.
          
          I don't know what kind of native support PG has for queue management,
          the assumption here is that a basic "kill the task as you see it" is
          usually good enough and the simplicity of writing and running a
          script far outweighs the development, infrastructure and devops costs
          of Kafka.
          
          But obviously, whether you need stuff to happen in 15 seconds instead
          of 5 minutes, or 5 minutes instead of an hour is a business decision,
          along with understanding the growth pattern of the workload you
          happen to have.
       
            jimbokun wrote 12 hours 34 min ago:
            Well in my workplace we need all of those things.
       
            j45 wrote 16 hours 13 min ago:
            PG has several queue management extensions and I’m working my way
            through trying them out.
            
            Here is one: [1] Some others: [2] Most of my professional life I
            have considered Postgres folks to be pretty smart… while I by
            chance happened to go with MySQL and it became the rdbms I thought
            in by default.
            
            Learning Postgres in depth recently has been okay, not much
            different than learning the tweaks for MSSQL, Oracle or others.
            You just have to be willing to slow down a little for a bit and
            enjoy it instead of expecting to rush through everything.
            
   URI      [1]: https://pgmq.github.io/pgmq/
   URI      [2]: https://github.com/dhamaniasad/awesome-postgres
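            
            For a taste of what pgmq looks like in practice, roughly this --
            going from memory of the docs at [1], so treat the exact function
            signatures as approximate and double-check the link:
            
              # pgmq is driven through plain SQL functions.
              import json
              import psycopg2
              
              with psycopg2.connect("dbname=app") as conn, conn.cursor() as cur:
                  cur.execute("SELECT pgmq.create('email_jobs')")
                  cur.execute(
                      "SELECT * FROM pgmq.send('email_jobs', %s::jsonb)",
                      (json.dumps({"to": "user@example.com"}),),
                  )
                  
                  # Read one message; it stays invisible to other readers
                  # for 30 seconds unless deleted or archived first.
                  cur.execute(
                      "SELECT msg_id, message"
                      " FROM pgmq.read('email_jobs', 30, 1)"
                  )
                  for msg_id, message in cur.fetchall():
                      print(msg_id, message)
                      cur.execute(
                          "SELECT pgmq.delete('email_jobs', %s)", (msg_id,)
                      )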
       
              dagss wrote 22 min ago:
              pgmq looks cool, thanks for the link!
              
              But it looks like a queue, which is a fundamentally different
              data structure from an event log, and Kafka is an event log.
              
              They are very different usecases; work distribution vs pub/sub.
              
              The article talks about both usecases, assuming the reader is
              very familiar with the distinction.
       
          Jupe wrote 16 hours 35 min ago:
          Isn't it just a matter of having each consumer use their own offset?
          I mean if the queue table is sequentially or time-indexed, the
          consumer just provides a smaller/earlier key to accomplish the
          offset?
          (Maybe I'm missing something here?)
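          
          Roughly what I have in mind, as a sketch (table and column names
          invented; the events table is assumed to have a monotonically
          assigned log_offset column):
          
            # Each consumer keeps its own offset row, so consumers are fully
            # independent and a replay is just an UPDATE of that row.
            import psycopg2
            
            REWIND = """
            UPDATE consumer_cursors
               SET last_offset = %s   -- rewind or fast-forward at will
             WHERE consumer = %s
            """
            
            NEXT_BATCH = """
            SELECT e.log_offset, e.payload
              FROM events e
              JOIN consumer_cursors c ON c.consumer = %s
             WHERE e.log_offset > c.last_offset
             ORDER BY e.log_offset
             LIMIT 100
            """
            
            with psycopg2.connect("dbname=app") as conn, conn.cursor() as cur:
                cur.execute(REWIND, (41999, "analytics"))  # replay from 42000
                cur.execute(NEXT_BATCH, ("analytics",))
                for log_offset, payload in cur.fetchall():
                    print(log_offset, payload)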
       
            cortesoft wrote 4 hours 21 min ago:
            Kafka allows you to have a consumer group… you can have multiple
            workers processing messages in parallel, and if they all use the
            same group id, the messages will be sharded across all the workers
            using that key… so each message will only be handled by one
            worker using that key, and every message will be given to exactly
            one worker (with all the usual caveats of
            guaranteed-processed-exactly-once queues). Other consumers can use
            different group keys and they will also get every single message
            exactly once.
            
            So if you want an individual offset, then yes, the consumer could
            just maintain their own… however, if you want a group’s offset,
            you have to do something else.
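            
            In client terms it's just a constructor argument; a minimal
            kafka-python sketch (broker address, topic and group names are
            placeholders):
            
              from kafka import KafkaConsumer
              
              # Two processes running this with the same group_id split the
              # partitions between them; a different group_id gets its own
              # independent copy of the stream and its own stored offsets.
              consumer = KafkaConsumer(
                  "orders",
                  bootstrap_servers="localhost:9092",
                  group_id="billing",            # same id => share the work
                  auto_offset_reset="earliest",  # used when no stored offset
                  enable_auto_commit=True,
              )
              
              for msg in consumer:
                  print(msg.partition, msg.offset, msg.value)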
       
            jimbokun wrote 12 hours 36 min ago:
            Yes.
            
            Is a queuing system baked into Postgres?  Or there client libraries
            that make it look like one?
            
            And do these abstractions allow for arbitrarily moving the offset
            for each consumer independently?
            
            If you're writing your own queuing system using pg for persistence
            obviously you can architect it however you want.
       
            altcognito wrote 16 hours 25 min ago:
            Correct, offsets and sharding aren't magic. And partitions in Kafka
            are user defined, just like they would be for postgresql.
       
        honkostani wrote 16 hours 53 min ago:
        Resume-driven design is running into the desert of Moore's plateau,
        which punishes the use of ever more useless abstractions. Its
        practitioners get quieter, because their projects keep dying after the
        revolutionary tech is introduced and they jump ship.
       
        sneilan1 wrote 16 hours 57 min ago:
        I'm starting to like mongodb a lot more given the python library
        mongomock. I find it wonderful to create tests that run my queries
        against mongo in code before I deploy them. Yes, mongo has a lot of
        quirks and you have to know aws networking to set it up with your vpc
        so you don't get nailed with egress costs. And it's not the same query
        patterns and some queries are harder and you have maintain your own
        schemas. But the ability to test mongo code with mongomock w/o having
        to run your own mongo server is SO VALUABLE. And yes, there are edge
        cases with mongomock not supporting something but the library is open
        source and pretty easy to modify. And it fails loudly which is super
        helpful. So if something is not supported you'll know. Maybe you might
        find a real nasty feature that's hard to implement but then just use a
        repository pattern like you would for testing postgres code in your
        application. [1] Extrapolating from my personal usage of this library
        to others, I'm starting to think that mongodb's 25 billion dollar
        valuation is partially based on this open source package :)
        
   URI  [1]: https://github.com/mongomock/mongomock
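        
        A minimal example of what I mean (collection and field names are made
        up):
        
          # Tests run against an in-memory stand-in for MongoDB -- no server,
          # no network -- while exercising the same aggregation code paths.
          import mongomock
          
          def total_paid_per_user(orders):
              return list(orders.aggregate([
                  {"$match": {"status": "paid"}},
                  {"$group": {"_id": "$user_id",
                              "total": {"$sum": "$amount"}}},
              ]))
          
          def test_total_paid_per_user():
              client = mongomock.MongoClient()
              orders = client["shop"]["orders"]
              orders.insert_many([
                  {"user_id": 1, "status": "paid", "amount": 10},
                  {"user_id": 1, "status": "paid", "amount": 5},
                  {"user_id": 2, "status": "pending", "amount": 99},
              ])
              assert total_paid_per_user(orders) == [{"_id": 1, "total": 15}]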
       
          j45 wrote 16 hours 3 min ago:
          That might work for some.
          
          I prefer not to start with a nosql database and then undertake
          odysseys to make it into a relational database.
       
            sneilan1 wrote 11 hours 2 min ago:
            This is the way.
       
          philipallstar wrote 16 hours 35 min ago:
          You can also do this with sqlite, running an in-memory sqlite is
          lightning fast and I don't think there are any edge cases. Obviously
          doesn't work for everything, but when sqlite is possible, it's great!
       
            sneilan1 wrote 16 hours 34 min ago:
            True but if you wind up using parts of postgres that aren't
            supported by sqlite then it's harder to use sqlite. I agree
            however, if I was able to just use sqlite, I would do that instead.
            But I'm using a lot of postgres extensions & fields that don't have
            direct mappings to sqlite.
            
            Otherwise SQLITE :)
       
          candiddevmike wrote 16 hours 49 min ago:
          Curious why you think the risk of edge cases from mocking is a
          worthwhile trade off vs the relatively low complexity of setting up a
          container to test against?
       
            sneilan1 wrote 14 hours 4 min ago:
            The other unspoken aspect of this is with agentic coding, the
            ability to have the ai also test queries quickly is very valuable.
            In a non-agentic coding setup, mongomock would not be as useful.
       
            sneilan1 wrote 16 hours 39 min ago:
            Because I can read the mongomock library and understand exactly
            what it's doing. And mongo's aggregation pipelines are easier to
            model than sql queries in code. Sure, it's possible to run into an
            edge case but for a lot of general queries for filtering &
            aggregation, it's just fine.
       
          pphysch wrote 16 hours 49 min ago:
          Or just use devcontainers and have an actual Postgres DB to test
          against? I've even done this on a Chromebook. This is a solved
          problem.
       
            sneilan1 wrote 16 hours 47 min ago:
            True but then my tests take longer to run. I really like having
            very fast tests. And then my tests have to make local network calls
            to a postgres server. I like my tests isolated.
       
              pphysch wrote 15 hours 9 min ago:
              They are isolated, your devcontainer config can live in your
              source repo. And you're not gonna see significant latency from
              your loopback interface... If your test suite includes billions
              of queries you may want to reassess.
       
                sneilan1 wrote 11 hours 0 min ago:
                You know what, you have a very good point. I'll give this
                another shot. Maybe it can be fast enough and I can just
                isolate the orm queries to some kind of repository pattern so
                I'm not testing sql queries over and over.
       
        guywithahat wrote 16 hours 57 min ago:
        > One camp chases buzzwords
        
        > ...
        
        > The other camp chases common sense
        
        I don't really like these simplifications. Like one group obviously
        isn't just dumb, they're doing things for reasons you maybe don't
        understand. I don't know enough about data science to make a call, but
        I'm guessing there were reasons to use Kafka due to current hardware
        limits or scalability concerns, and while the issues may not be as
        present today that doesn't mean they used Kafka just because they heard
        a new word and wanted to repeat it.
       
          temporallobe wrote 16 hours 23 min ago:
          Agree with this sentiment - it’s easy to be judgmental about these
          things, but project-level issues and decisions can be very
          complicated and engineers often have little to no visibility into
          them. We’re using Kafka for a gigantic pipeline where IMO any
          reasonably modern database would suffice (and may even be superior),
          but our performance requirements are unclear. At some point in the
          distant future, we may have a significant surge in data quantity and
          speed, requiring greater throughput and (de)serialization speed, but
          I am not convinced that Kafka ultimately helps us there. I imagine
          this is a case where the program leadership was sold a solution which
          we are now obligated to use. This happens a LOT, and I have seen
          unnecessary and unused products cost companies millions over the
          years. For example, my team was doing analysis on replacing our
          existing Atlassian Data Center with other solutions, and in doing so,
          we discovered several underused/unused Atlassian plugins for which we
          are paying very high license fees. At some point, users over the
          years had requested some functionality for a specific workflow and
          the plugins were purchased. The people and projects went away or
          otherwise processes became OBE, but the plugins happily hummed along
          while the bills were paid.
       
          sumtechguy wrote 16 hours 52 min ago:
          Kafka and other message systems like it have their uses.  But
          sometimes all you need is a database.  Once you start doing realtime
          streaming and notifications and event-type things, a messaging
          system is good.  You can even back it up with a boring database.
          Would I start with Kafka?  Probably not.  I would start with a
          boring database, and then, if bashing on the db over and over asking
          'have you changed?' doesn't work well enough anymore, you put in a
          messaging system.
       
        agentultra wrote 16 hours 57 min ago:
        You have to be careful with the approach of using Postgres for
        everything. The way it locks tables and rows and the serialization
        levels it guarantees are not immediately obvious to a lot of folks and
        can become a serious bottle-neck for performance-sensitive workloads.
        
        I've been a happy Postgres user for several decades. Postgres can do a
        lot! But like anything, don't rely on maxims to do your engineering for
        you.
       
          skunkworker wrote 8 hours 28 min ago:
          I wish postgres would add a durable queue-like data structure. But
          trying to make a durable queue that can scale beyond what a simple
          redis instance can do starts to run into problems quickly.
          
          Also, LISTEN/NOTIFY do not scale, and they introduce locks in areas
          you aren't expecting -
          
   URI    [1]: https://news.ycombinator.com/item?id=44490510
       
            abtinf wrote 6 hours 48 min ago:
            SKIP LOCKED doesn't work for your use case?
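            
            The usual pattern being roughly this (a sketch; the jobs table,
            columns and psycopg2 usage are just for illustration):
            
              # Competing workers each claim a batch of unclaimed jobs; SKIP
              # LOCKED means nobody blocks on rows another worker already
              # holds.
              import psycopg2
              
              def claim_and_run(handle, dsn="dbname=app"):
                  with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
                      cur.execute(
                          """
                          SELECT id, payload
                            FROM jobs
                           WHERE status = 'queued'
                           ORDER BY id
                           LIMIT 10
                             FOR UPDATE SKIP LOCKED
                          """
                      )
                      for job_id, payload in cur.fetchall():
                          handle(payload)
                          cur.execute(
                              "UPDATE jobs SET status = 'done' WHERE id = %s",
                              (job_id,),
                          )
                  # leaving the with-block commits and releases the row locks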
       
          AtlasBarfed wrote 9 hours 0 min ago:
          Postgres is just fantastic software.
          
          But anytime you treat a database, or a queue, like a black box
          dumpster, problems will ensue.
       
            EdwardDiego wrote 52 min ago:
            Exactly. Or worse, you treat one as a straightforward black box
            swap in replacement for another. If you're looking to scale, you
            _will_ need to code to the idiosyncrasies of your chosen solution.
       
          javier2 wrote 10 hours 46 min ago:
          Postgres doesn't scale into oblivion, but it can take some serious
          chunks of data once you start batching and making sure every
          operation only touches a single row, with no transactions needed.
       
            AtlasBarfed wrote 8 hours 40 min ago:
            And then you are 99% of the way to Cassandra.
            
            Of course the other 99% is the remaining 1%.
       
              javier2 wrote 7 hours 19 min ago:
              Nearly true, but you don't need to run a Cassandra cluster to
              ship your 3k msg/sec, and you can take smaller locks if you have
              a small number of senders that delete sent messages and send in
              chunks.
       
          SoftTalker wrote 16 hours 17 min ago:
          This is true of any data storage. You have to understand the
          concurrency model and assumptions, and know where bottlenecks can
          happen. Even among relational databases there are significant
          differences.
       
          j45 wrote 16 hours 22 min ago:
          100%
          
          Postgres isn’t meant to be a guaranteed permanent replacement.
          
          It’s a common starting point for a simpler stack which retains a
          great deal of flexibility out of the box and increases velocity.
          
          Starting with Postgres lets the bottlenecks reveal themselves, and
          then optimize from there.
          
          Maybe that means a tweak to Postgres or its resources, or maybe it
          means considering a jump to Kafka.
       
          fud101 wrote 16 hours 24 min ago:
          When someone says "just use Postgres", are they using the same
          instance for their data as well as for the queue?
       
            Yeroc wrote 14 hours 18 min ago:
            You would typically want to use the same database instance for your
            queue as long as you can get away with it because then transaction
            handling is trivial.  As soon as you move the queue somewhere else
            you need to carefully think about how you'll deal with
            transactionality.
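            
            Concretely, the appeal is that the business write and the enqueue
            share one transaction; a sketch (table names invented):
            
              # Because the "queue" lives in the same database, the order row
              # and the queued event either both commit or neither does.
              import json
              import psycopg2
              
              with psycopg2.connect("dbname=app") as conn, conn.cursor() as cur:
                  cur.execute(
                      "INSERT INTO orders (customer_id, total_cents)"
                      " VALUES (%s, %s) RETURNING id",
                      (42, 1999),
                  )
                  order_id = cur.fetchone()[0]
                  cur.execute(
                      "INSERT INTO outbox_events (kind, payload)"
                      " VALUES (%s, %s)",
                      ("order_created", json.dumps({"order_id": order_id})),
                  )
              # exiting the with-block commits both inserts atomically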
       
            victorbjorklund wrote 15 hours 9 min ago:
            Yes, I often use PG for queues on the same instance. Most of the
            time you dont see any negative effects. For a new project with
            barely any users it doesn’t matter.
       
            marcosdumay wrote 16 hours 7 min ago:
            When people say "just use postgres" it's because their immediate
            need is so low that this doesn't matter.
            
            And the thing is, a server from 10 years ago running postgres (with
            a backup) is enough for most applications to handle thousands of
            simultaneous users. Without even going into the kinds of
            optimization you are talking about. Adding ops complexity  for the
            sake of scale on the exploratory phase of a product is a really bad
            idea when there's an alternative out there that can carry you until
            you have fit some market. (And for some markets, that's enough
            forever.)
       
            j45 wrote 16 hours 21 min ago:
            It can be a different database in the same server or a separate
            server.
            
            When you’re only doing hundreds or thousands of transactions to
            begin with, it doesn’t really have much impact out of the gate.
            
            Of course there will be someone who will pull out something that
            won’t work but such examples can likely be found for anything.
            
            We don’t need to fear simplification, it is easy to complicate
            later when the actual complexities reveal themselves.
       
          fukka42 wrote 16 hours 40 min ago:
          My strategy is to use postgres first. Get the idea off the ground and
          switch when postgres becomes the bottleneck.
          
          It often doesn't.
       
            jorge-d wrote 16 hours 24 min ago:
            Definitely, this is also one of the directions Rails is heading[1]:
            provide a basic setup most people can use out of the box.
            And if needed you can always plug in more "mature" solutions
            afterwards.
            
   URI      [1]: https://rubyonrails.org/2024/11/7/rails-8-no-paas-required
       
          sneilan1 wrote 16 hours 52 min ago:
          Yes, performance can be a big issue with postgres. And vertical
          scaling can really put a damper on things when you have a major
          traffic hit. Using it in place of Kafka misunderstands one of the
          great uses of Kafka, which is to help deal with traffic bursts. All
          of a sudden your Postgres server is overwhelmed where the Kafka
          server would have been fine.
       
            zenmac wrote 15 hours 21 min ago:
            >And vertical scaling can really put a damper on things when you
            have a major traffic hit.
            
            Wouldn't OrioleDB solve that issue though?
       
              sneilan1 wrote 15 hours 0 min ago:
              Not familiar with OrioleDB. I’ll look it up. May I ask how this
              helps? Just curious.
       
        cpursley wrote 16 hours 58 min ago:
        Related: [1] It's built on pgmq and not married to supabase (nearly
        everything is in the database).
        
        Postgres is enough.
        
   URI  [1]: https://www.pgflow.dev
       
        jjice wrote 17 hours 1 min ago:
        This is a well written addition to the list of articles I need to
        reference on occasion to keep myself from using something new.
        
        Postgres really is a startup's best friend most of the time. I'm
        building a new product that's going to deal with a good bit of
        reporting, and I began to look at OLAP DBs for it, but hesitated to
        leave PG. This kind of seals it for me (and of course the reference to
        the classic "Just Use Postgres for Everything" post helps) that I
        should Just Use
        
        On top of being easy to host and already being familiar with it, the
        resources out there for something like PG are near endless. Plus the
        team working on it is doing constant good work to make it even more
        impressive.
       
          j45 wrote 16 hours 6 min ago:
          It’s totally reasonable to start with fewer technologies to do more
          and then outgrow them.
       
            sanskarix wrote 1 hour 40 min ago:
            This mindset is criminally underrated in the startup/indie builder
            world. There's so much pressure to architect for scale you might
            never reach, or to use "industry standard" stacks that add enormous
            complexity.
            
            I've been heads-down building a scheduling tool, and the number of
            times I've had to talk myself out of over-engineering is
            embarrassing. "Should I use Kafka for event streaming?" No. "Do I
            need microservices?" Probably not. "Can Postgres handle this?"
            Almost certainly yes.
            
            The real skill is knowing when you've actually outgrown something
            vs. when you're just pattern-matching what Big Tech does. Most
            products never get to the scale where these distinctions
            matter—but they DO die from complexity-induced paralysis.
            
            What's been your experience with that inflection point where you
            actually needed to graduate to more complex tooling? How did you
            know it was time?
       
        zer00eyz wrote 17 hours 2 min ago:
        > Should You Use Postgres? Most of the time - yes. You should always
        default to Postgres until the constraints prove you wrong.
        
        Kafka, GraphQL... These are the two technologies where my first
        question is always this: does the person who championed/led this
        project still work here?
        
        The answer is almost always "no, they got a new job after we launched".
        
        Resume Architecture is a real thing. Meanwhile the people left behind
        have to deal with a monster...
       
          sitestable wrote 16 hours 34 min ago:
          The best architecture decision is the one that's still maintainable
          when the person who championed it leaves. Always pretend the person
          who maintains a project after you knows where you live and all that.
       
          bencyoung wrote 16 hours 39 min ago:
          Kafka is great tech, never sure why people have an issue with it.
          Would I use it all the time? No, but where it's useful, it's really
          useful, and opens up whole patterns that are hard to implement other
          ways
       
            bonesss wrote 16 hours 6 min ago:
            Kafka also provides early architectural scaffolding for multiple
            teams to build in parallel with predictable outcomes (in addition
            to the categorical answers to hard/error-prone patterns). It’s
            been adopted in principle by the services on, and is offered
            turn-key by, all the major cloud providers.
            
            Personally I’d expect some kind of internal interface to abstract
            away and develop reusable components for such an external
            dependency, which readily enables having relational data stores
            mirroring the broker's functionality.  Handy for testing and some
            specific local scenarios, and those database backed stores can
            easily pull from the main cluster(s) later to mirror data as
            needed.
       
            evantbyrne wrote 16 hours 27 min ago:
            Managed hosting is expensive to operate and self-managing Kafka is
            a job in and of itself. At my last employer they were spending six
            figures to run three low volume clusters before I did some work to
            get them off some enterprise features, which halved the cost, but
            it was still at least 5x the cost of running a mainstream queue.
            Don't use kafka if you just need queuing.
       
              j45 wrote 16 hours 4 min ago:
              Engaging with difficulty is a form of procrastination, and in
              some cases a way of avoiding actually shipping a product.
              
              Instead of having just one unknown thing to learn before
              launch... let's pick as many new-to-us things as possible; that
              will surely increase the chances of success.
       
              bencyoung wrote 16 hours 5 min ago:
              Cheapest MSK cluster is $100 a month and can easily run a dev/uat
              cluster with thousands of messages a second. They go up from
              there but we've made a lot of use of these and they are pretty
              useful
       
                evantbyrne wrote 15 hours 20 min ago:
                It's not the dev box with zero integrations/storage that's
                expensive. AWS was quoting us similar numbers for MSK. Part of
                the issue is that modern kafka has become synonymous with
                Confluent, and once you buy into those features, it is very
                difficult to go back. If you're already on AWS and just need
                queuing, start with SQS.
       
                singron wrote 15 hours 21 min ago:
                I've basically never had a problem with MSK brokers. The issue
                has usually been "why are we rebalancing?" and "why aren't we
                consuming?", i.e. client problems.
       
              CuriouslyC wrote 16 hours 23 min ago:
              I always push people to start with NATS jetstream unless I 100%
              know they won't be able to live without Kafka features. It's
              performant and low ops.
       
          Groxx wrote 16 hours 46 min ago:
          I've never hosted a GraphQL service, though I can see plenty of
          obvious room for problems:
          
          is there some reason GraphQL gets so much hate?  it always feels to
          me like it's mostly just a normal RPC system but with some incredibly
          useful features (pipelining, and super easy to not request data you
          don't need), with obvious perf issues in code and obvious room for
          perf abuse because it's easy to allow callers to do N+1 nonsense.
          
          so I can see why it's not popular to get stuck with for public APIs
          unless you have infinite money, it's relatively wide open for abuse,
          but private seems pretty useful because you can just smack the people
          abusing it.  or is it more due to specific frameworks being
          frustrating, or stuff like costly parsing and serialization and
          difficult validation?
       
            marcosdumay wrote 15 hours 56 min ago:
            Take a look on how to implement access control over GraphQL
            requests. It's useless for anything that isn't public data (at
            least public for your entire network).
            
            And yes, you don't want to use it for public APIs. But if you have
            private APIs that are so complex that you need a query language,
            and still want use those over web services, you are very likely
            doing something really wrong.
       
              Groxx wrote 15 hours 39 min ago:
              I'm honestly not seeing much here that isn't identical to almost
              all other general purpose RPC systems: [1] "check that the user
              matches the data they're requesting by comparing the context and
              request field by hand" is ultra common - there are some real
              benefits to having authorization baked into the language, but it
              seems very rare in practice (which is part of why it's often
              flawed, but following the overwhelming standard is hardly
              graphql's mistake imo).  I'd personally think capabilities are a
              better model for this, but that seems likely pretty easy to chain
              along via headers?
              
   URI        [1]: https://graphql.org/learn/authorization/
       
                marcosdumay wrote 8 hours 50 min ago:
                > identical to almost all other general purpose RPC systems
                
                The problem is that GraphQL doesn't behave like all other
                general purpose RPC systems. As a rule, authorization does not
                work on the same abstraction level as GraphQL.
                
                And that explanation you quoted is disingenuous, because
                GraphQL middleware and libraries don't usually export places
                where you can do anything by hand.
       
            twodave wrote 16 hours 21 min ago:
            As someone who works with GraphQL daily, many of the criticisms out
            there are from before the times of persisted queries, query cost
            limits, and composite schemas. It’s a very mature and useful
            technology. I agree with it maybe being less suitable for a public
            API, but less because of possible abuse and more because simple
            HTTP is a lot more widely known. It depends on the context, as in
            all things, of course.
       
              Groxx wrote 16 hours 4 min ago:
              yeah, I took one look at it and said "great, so add some cost
              tracking and kill requests before they exceed it" because like. 
              obviously.  it's similar to exposing a SQL endpoint: you need to
              build for that up front or the obvious results will happen.
              
              which I fully understand is more work than "it's super easy just
              X" which it gets presented as, but that's always the cost of
              super flexible things.    does graphql (or the ecosystem, as that's
              part of daily life of using it) make that substantially worse
              somehow? because I've dealt with people using protobuf to avoid
              graphql, then trying to reimplement parts of its features, and
              the resulting API is always an utter abomination.
       
          janwijbrand wrote 16 hours 49 min ago:
          "resume" as in "resumé" not as in "begin again or continue after a
          pause or interruption" - it took me longer than I care to admit to
          get that.
       
          forgetfulness wrote 16 hours 52 min ago:
          We’re all passing through our jobs, the value of the solutions
          remains in the hands of the shareholders, if you don’t try to
          squeeze some long-term value for your resume and long-term
          employability, you’re assuming a significant opportunity cost on
          their behalf
          
          They’ll be fine if you made something that works, even if it was a
          bit faddish, make sure you take care of yourself along the way (they
          won’t)
       
            candiddevmike wrote 16 hours 48 min ago:
            Attitudes like this are why management treats developers like
            children who constantly need to be kept on task, IMO.
       
              forgetfulness wrote 16 hours 41 min ago:
              Software is a line of work that has astounding amounts of
              autonomy, if you compare it to working in almost anything else.
              
              My point stands, company loyalty tallies up to very little when
              you’re looking for your next job; no interviewer will care much
              to hear of how you stood firm, and ignored the siren song of tech
              and practices that were more modern than the one you were handed
              down (the tech and practices they’re hiring for).
              
              The moment that reverses, I will start advising people not to
              skill up, as it will look bad in their resumes.
       
          darkstar_16 wrote 16 hours 56 min ago:
          GraphQL sure, but I'm not sure I'd put Kafka in the same bucket. It
          is a nice technology that has its uses in some cases where
          PostgreSQL would not work. It is also something a small team should
          not start with. Start with postgres and then move on to something
          else when the need arises.
       
          kvdveer wrote 16 hours 57 min ago:
          To be fair, this is true for all technologically interesting
          solutions, even when they use postgres. People championing novel
          solutions typically leave after the window for creativity has closed.
       
        qsort wrote 17 hours 2 min ago:
        I feel so seen lol. I work in data engineering and the first paragraph
        is me all the time. There are a lot of cool technologies (timeseries
        databases, vector databases, stuff like Synapse on Azure, "lakehouses"
        etc.) but they are mostly for edge cases.
        
        I'm not saying they're useless, but if I see something like that lying
        around, it's more likely that someone put it there based on vibes
        rather than an actual engineering need. Postgres is good enough for
        OpenAI, chances are it's good enough for you.
       
       
   DIR <- back to front page