_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                              on Gopher (unofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Slow deployment causes meetings (2015)
       
       
        austin-cheney wrote 10 min ago:
         While this is mostly correct, it’s also just as irrelevant.
        
         TL;DR: software performance, and thus human performance, is all that
         matters.
        
        Risk management/acceptance can be measured with numbers. In software
        this is actually far more straightforward than in many other careers,
        because software engineers can only accept risk within the restrictions
        of their known operating constraints and everything else is deferred.
        
        If you want to go faster you need to maximize the frequency of human
        iteration above absolutely everything else. If a person cannot iterate,
        such as waiting on permissions, they are blocked. If they are waiting
        on a build or screen refresh they are slowed. This can also be measured
        with numbers.
        
         If person A can iterate 100x faster than person B, correctness becomes
         irrelevant. Person B must maximize correctness because they are slow.
         To be faster and more correct, person A has extreme flexibility to
         learn, fail, and improve beyond what person B can deliver.
        
        Part of iterating faster AND reducing risk is fast test automation. If
         person A can execute 90+% test coverage in the time of 4 of their own
         iterations, then that test automation is still 25x faster than a
         single person B iteration, with a 90+% lower risk of regression.
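         
         A quick back-of-the-envelope check of that claim (the concrete
         iteration times below are assumptions; only the 100x ratio and the
         4-iteration test run come from the comment):
         
           # Assumed numbers: person B needs 600s per iteration, person A is
           # 100x faster, and a full test run costs A four of their iterations.
           b_iteration = 600.0
           a_iteration = b_iteration / 100
           test_run = 4 * a_iteration
           print(f"A's test run: {test_run:.0f}s; B's iteration: {b_iteration:.0f}s")
           print(f"Speedup: {b_iteration / test_run:.0f}x")  # -> 25x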
       
        andy_ppp wrote 44 min ago:
        The organisation will actively prevent you from trying to improve
         deployments, though; they will say things like “Jenkins shouldn’t be
        near production” or “we can’t possibly put things live without QA
        being involved” or “we need this time to make sure the quality of
        the software is high enough”. All with a straight face while having
        millions of production bugs and a product that barely meets any user
        requirements (if there are any).
        
        In the end fighting the bureaucracy is actually impossible in most
        organisations, especially if you’re not part of the 200 layers of
        management that create these meetings. I would sack everyone but
        programmers and maybe two designers and let everyone fight it out
         without any agile coaches and product owners and scrum masters and
        product experts.
        
        Slow deployment is a problem but it’s not the problem.
       
        sourceless wrote 3 hours 24 min ago:
        I think unfortunately the conclusion here is a bit backwards;
        de-risking deployments by improving testing and organisational
        properties is important, but is not the only approach that works.
        
        The author notes that there appears to be a fixed number of changes per
        deployment and that it is hard to increase - I think the 'Reversie
        Thinkie' here (as the author puts it) is actually to decrease the
        number of changes per deployment.
        
        The reason those meetings exist is because of risk! The more changes in
        a deployment, the higher the risk that one of them is going to
        introduce a bug or operational issue. By deploying small changes often,
         you get to deliver value much sooner and fail smaller.
        
        Combine this with techniques such as canarying and gradual rollout, and
        you enter a world where deployments are no longer flipping a switch and
        either breaking or not breaking - you get to turn outages into
        degradations.
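         
         As a rough sketch of the canary/gradual-rollout idea (the step
         percentages, timings and health_fraction() check below are made up,
         not taken from the article or DORA):
         
           import time
           
           STEPS = [1, 5, 25, 50, 100]   # percent of traffic on the new version
           
           def health_fraction() -> float:
               return 1.0                # placeholder: wire up to real monitoring
           
           def gradual_rollout(set_traffic, rollback) -> bool:
               for percent in STEPS:
                   set_traffic(percent)  # shift a slice of traffic to the canary
                   time.sleep(1)         # let metrics accumulate (illustrative)
                   if health_fraction() < 0.99:
                       rollback()        # degrade a slice, not everyone
                       return False
               return True               # fully rolled out
           
           gradual_rollout(lambda p: print(f"traffic -> {p}%"),
                           lambda: print("rolling back"))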
        
        This approach is corroborated by the DORA research[0], and covered well
        in Accelerate[1]. It also features centrally in The Phoenix Project[2]
        and its spiritual ancestor, The Goal[3].
        
    URI  [0]: https://dora.dev/
    URI  [1]: https://www.amazon.co.uk/Accelerate-Software-Performing-Techno...
    URI  [2]: https://www.amazon.co.uk/Phoenix-Project-Helping-Business-Anni...
    URI  [3]: https://www.amazon.co.uk/Goal-Process-Ongoing-Improvement/dp/0...
       
          ricardobeat wrote 17 min ago:
           > By deploying small changes often, you get to deliver value much
           sooner and fail smaller.
          
          Which increases the number of changes per deployment, feeding the
          overhead cycle.
          
          He is describing an emergent pattern here, not something that
          requires intentional culture change (like writing smaller changes).
          You’re not disagreeing but paraphrasing the article’s conclusion:
          
          > or the harder way, by increasing the number of changes per
          deployment (better tests, better monitoring, better isolation between
          elements, better social relationships on the team)
       
          ozim wrote 2 hours 8 min ago:
           I am really interested in organizations’ capacity for absorbing
           changes.
           
           I live in the B2B SaaS space, and as far as development goes we could
           release daily. But on the receiving side we get pushback. Of course
           there can be feature flags, but then that would create a “not yet
           enabled feature” backlog.
           
           In the end features are mostly consumed by people, and people need
           training on the changes.
       
          motorest wrote 2 hours 39 min ago:
          > The reason those meetings exist is because of risk! The more
          changes in a deployment, the higher the risk that one of them is
          going to introduce a bug or operational issue.
          
           Having worked on projects with full CD and also on projects that had
           biweekly releases gated by meetings with release engineers, I can
           state with full confidence that risk management is a correlated but
           indirect, secondary factor.
          
          The main factor is quite clearly how much time and resources an
          organization invests in automated testing. If an organization has the
          misfortune of having test engineers who lack the technical background
          to do automation, they risk never breaking free of these meetings.
          
          The reason why organizations need release meetings is that they lack
          the infrastructure to test deployments before and after rollouts, and
          they lack the infrastructure to roll back changes that fail once
           deployed. So they make up for this lack of investment by adding all these
          ad-hoc manual checks to compensate for lack of automated checks. If
          QA teams lack any technical skills, they will push for manual
          processes as self-preservation.
          
          To make matters worse, there is also the propensity to pretend that
          having to go through these meetings is a sign of excellence and best
          practices, because if you're paid to mitigate a problem obviously you
          have absolutely no incentive to fix it. If a bug leaks into
          production, that's a problem introduced by the developer that wasn't
          caught by QAs because reasons. If the organization has automated
           tests, it's hard not to catch it at the PR level.
          
          Meetings exist not because of risk, but because organizations employ
          a subset of roles that require risk to justify their existence and
           lack the skills to mitigate it. If a team organizes its efforts to add
          the bare minimum checks to verify a change runs and works once
          deployed, and can automatically roll back if it doesn't, you do not
          need meetings anymore.
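           
           A minimal sketch of that bare minimum, assuming a hypothetical
           deploy.sh and health endpoint (not any specific tool's API): deploy,
           verify, and roll back automatically instead of holding a meeting.
           
             import subprocess, sys, urllib.request
             
             def smoke_test(url: str) -> bool:
                 # Placeholder post-deploy check: the new version answers its
                 # health endpoint with a 200.
                 try:
                     with urllib.request.urlopen(url, timeout=5) as resp:
                         return resp.status == 200
                 except OSError:
                     return False
             
             subprocess.run(["./deploy.sh", "new-version"], check=True)
             if not smoke_test("https://example.internal/healthz"):
                 subprocess.run(["./deploy.sh", "previous-version"], check=True)
                 sys.exit("rollout failed the health check; rolled back")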
       
          tomxor wrote 2 hours 51 min ago:
          I tend to agree. Whenever I've removed artificial technical friction,
          or made a fundamental change to an approach, the processes that grew
          around them tend to evaporate, and not be replaced. I think many of
          these processes are a rational albeit non-technical response to
          making the best of a bad situation in the absence of a more
          fundamental solution.
          
          But that doesn't mean they are entirely harmless. I've come across
          some scenarios where the people driving decisions continued to reach
          for human processes as the solution rather than a workaround, for
          both new projects and projects designated specifically to remove
          existing inefficiencies. They either lacked the technical
          imagination, or were too stuck in the existing framing of the
          problem, and this is where people who do have that imagination need
          to speak up and point out that human processes need to be minimised
          with technical changes where possible. Not all human processes can be
          obviated through technical changes, but we don't want to spread
          ourselves thin on unnecessary ones.
       
        jojobas wrote 3 hours 41 min ago:
        Fast deployment causes incident war rooms.
       
          DougBTX wrote 2 hours 44 min ago:
          Maybe the opposite, slow rollbacks cause escalating incidents.
       
        qaq wrote 6 hours 18 min ago:
        A bit tangential but why is CloudFormation so slowww?
       
          motorest wrote 55 min ago:
          > A bit tangential but why is CloudFormation so slowww?
          
           It's not that CloudFormation is slow. It's that the whole concept of
           infrastructure as code is slow by nature.
          
          Each time you deploy a change to a state as a transaction, you need
          to assert preconditions and post-conditions at each step. If you have
           to roll out a set of changes that have any semblance of
           interdependence, you have no option other than to deploy them as
           sequential steps. Each step requires many network calls to apply
           changes, go through auth, and poll state, each one taking somewhere
           between 50 and 200 ms. That quickly adds up.
          
          If you deploy the same app on a different cloud provider with
          Terraform or Ansible, you get the same result. If you deploy the same
          changes manually you turn a few minutes into a day-long ordeal.
          
          The biggest problem with IaC is that it is so high-level and does so
          much under the hood that some people have no idea what changes they
          are actually applying or what they are doing. Then they complain it
          takes so long.
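           
           Illustrative arithmetic only (the resource counts and latencies are
           assumptions, not CloudFormation internals):
           
             resources = 40     # resources touched by the change set
             calls_each = 6     # apply + auth + state-polling round trips each
             latency_s = 0.15   # ~150 ms per API round trip
             total = resources * calls_each * latency_s
             print(f"~{total:.0f}s in sequential round trips alone")  # ~36s,
             # before any actual provisioning or stabilization waits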
       
          hk1337 wrote 3 hours 44 min ago:
           This is just anecdotal, but I have found that any time a network
           interface is involved, it can slow down the deployment. I had a case
           where I was deleting Lambdas in a VPC, connected to EFS, where the
           deployment itself was rather quick but it took ~20 minutes for
           CloudFormation to clean up and finish.
       
          Aeolun wrote 4 hours 17 min ago:
           The reason my boss tends to give is that it’s made by AWS, so it
          cannot possibly be bad. Also, it’s free. Which is never given as
          anything more than a tangentially related reason, but…
       
          justin_oaks wrote 5 hours 48 min ago:
          I figure it's because AWS can get away with it.
       
        lizzas wrote 6 hours 47 min ago:
         Microservices let you horizontally scale deployment frequency too.
       
          devjab wrote 3 hours 33 min ago:
          You can do this with a monolith architecture as others point out. It
          always comes down to governance. With monoliths you risk slowing
          yourself down in a huge mess of SOLID, DRY and other “clean code”
          nonsense which means nobody can change anything without it breaking
          something. Not because any of the OOP principles are wrong on face
          value, but because they are so extremely vague that nobody ever gets
          them right. It’s always hilarious to watch Uncle Bob dismiss any
          criticism with a “they misunderstood the principles” because
          he’s always completely right. Maybe the principles are just bad
          when so many people get them wrong? Anyway, microservices don’t
           protect you from poor governance; it just shows up as different
          problems. I would argue that it’s both extremely easy and common to
          build a bunch of micro services where nobody knows what effect a
          change has on others. It comes down to team management, and this is
          where our industry sucks the most in my experience. It’ll be better
          once the newer generations of “Team Topologies” enter, but
          it’ll be a struggle for decades to come if it’ll ever really end.
          Often it’s completely out of the hands of whatever digitalisation
          department you have because the organisation views any “IT” as a
          cost center and never requests things in a way that can be
          incorporated in any sort of SWE best practice process.
          
          One of the reasons I like Go as a general purpose language is that it
           often leads to code bases which are easy to change due to its simplicity
          by design. I’ve seen an online bank and a couple of landlord
          systems (sorry I can’t find the English word for asset and tenant
          management in a single platform) explode in growth. Largely because
          switching to Go has made it possible for them to actually deliver
           what the business needs. Meanwhile their competition remains stuck
          with unruly Java or C# code bases where they may be capable of
          rolling out buggy additions every half year if their organisation is
          lucky. Which has nothing to do with Go, Java or C# by the way, it has
          to do with old fashioned OOP architecture and design being way too
          easy to fuck up. In one shop I worked they had over a thousand C#
          interfaces which were never consumed by more than one class… Every
          single one of their tens of thousands of interfaces was in the same
          folder and namespace… good luck finding the one you need. You could
          do that with Go, or any language, but chances are you won’t do it
          if you’re not rolling with one of those older OOP clean code
          languages. Not doing it with especially C# is harder because
          abstraction by default is such an ingrained part of the culture
          around it.
          
          Personally I have a secret affection for Python shops because they
          are always fast to deliver and terrible in the code. Love it!
       
          punnerud wrote 4 hours 21 min ago:
           As long as every team managing the different APIs/services doesn’t
           have to be consulted for others to get access. Otherwise you get both
           the problems of distributed data and even more levels of complexity
           (more meetings than with a monolith).
       
            motorest wrote 2 hours 19 min ago:
            > As long as every team managing the different APIs/services
            don’t have to be consulted for others to get access.
            
            Worst-case scenario, those meetings take place only when a new
            consumer starts consuming a producer managed by an external team
            well outside your org.
            
            Once that rolls out, you don't need any meeting anymore beyond
            hypothetical SEVs.
       
          faizshah wrote 4 hours 26 min ago:
          It’s a monkey’s paw solution, now you have 15 kinda slow
          pipelines instead of 3 slow deployment pipelines. And you get to have
          the fun new problem of deployment planning and synchronizing feature
          deployments.
       
            motorest wrote 1 hour 45 min ago:
            > It’s a monkey’s paw solution, now you have 15 kinda slow
            pipelines instead of 3 slow deployment pipelines.
            
            Not a problem. In fact, they are a solution to a problem.
            
            > And you get to have the fun new problem of deployment planning
            and synchronizing feature deployments.
            
             Not a problem either. You don't need to synchronize anything if you're
            consuming changes that are already deployed and running. You also
            do not need to synchronize feature deployment if you know the very
            basics of your job. Worst case scenario, you have to move features
            behind a feature flag, which requires zero synchronization.
            
            This sort of discussion feels like people complaining about
             perceived problems they never bothered to think about, let alone
            tackle.
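             
             For what it's worth, the flag guard really is that small; a sketch
             with hypothetical flag and handler names:
             
               import os
               
               def old_checkout(req): return "old checkout"
               def new_checkout(req): return "new checkout"  # shipped dark
               
               def flag_enabled(name: str) -> bool:
                   # Toggled via config/env, so turning it on needs no
                   # coordinated deployment.
                   return os.environ.get(f"FEATURE_{name.upper()}", "off") == "on"
               
               def handle(req):
                   if flag_enabled("new_checkout"):
                       return new_checkout(req)
                   return old_checkout(req)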
       
          fulafel wrote 4 hours 33 min ago:
          I think this was the meme before moduliths[1][2] where people
          conflated the operational and code change aspects of microservices.
          But it's just additional incidental complexity that you should
          resist.
          
           IOW you can do as many deploys without microservices if you organize
           your monolithic app as independent modules, while keeping out the
           main disadvantages of microservices (infra/CI/CD/etc. complexity, and
           turning your app's function calls into an unreliable distributed
           system communication problem); see the sketch after the links below.
          
   URI    [1]: https://www.fearofoblivion.com/build-a-modular-monolith-firs...
   URI    [2]: https://ardalis.com/introducing-modular-monoliths-goldilocks...
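           
           A tiny sketch of that module-boundary idea (module names invented):
           one deployable, but each module exposes a small interface and the
           wiring stays an in-process call rather than a network hop.
           
             class Billing:
                 def charge(self, customer_id: str, cents: int) -> bool:
                     return True              # real logic stays inside the module
             
             class Orders:
                 def __init__(self, billing: Billing):
                     self._billing = billing  # depends on the interface only
                 
                 def place(self, customer_id: str, cents: int) -> str:
                     ok = self._billing.charge(customer_id, cents)
                     return "confirmed" if ok else "payment_failed"
             
             print(Orders(Billing()).place("c42", 1999))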
       
            motorest wrote 2 hours 25 min ago:
            > I think this was the meme before moduliths[1][2] where people
            conflated the operational and code change aspects of microservices.
            
            People conflate the operational and code change aspects of
            microservices just like people conflate that the sky is blue and
            water is wet. It's a statement of fact that doesn't go away with
            buzzwords.
            
            > IOW you can do as many deploys without microservices if you
            organize your monolithic app as independent modules, while keeping
            out the main disadvantages of the microservice (infra/cicd/etc
            complexity, and turning your app's function calls into a unreliable
            distributed system communication problem).
            
            This personal opinion is deep within "not even false" territory.
            You can also deploy as many times as you'd like with any monolith,
            regardless of what buzzwords you tack on that.
            
            What you're completely missing from your remark is the loosely
            coupled nature of running things on a separate service, how trivial
            it is to do blue-green deployments, and how you can do gradual
            rollouts that you absolutely cannot do with a patch to a monolith,
            no matter what buzzwords you tack on it. That is the whole point of
            mentioning microservices: you can do all that without a single
            meeting.
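             
             For the blue-green part, the cutover is conceptually an atomic
             pointer swap once the idle colour passes its checks; a toy sketch
             (the dict stands in for a real load balancer or DNS record):
             
               router = {"live": "blue"}
               
               def healthy(color: str) -> bool:
                   return True               # placeholder post-deploy checks
               
               def cut_over() -> str:
                   idle = "green" if router["live"] == "blue" else "blue"
                   if healthy(idle):         # verify before it takes traffic
                       router["live"] = idle # switching back is the rollback
                   return router["live"]
               
               print(cut_over())             # -> "green"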
       
            trog wrote 4 hours 9 min ago:
            An old monolithic PHP application I worked on for over a decade
            wasn't set up with independent modules and the average deploy
            probably took a couple seconds, because it was an svn up which only
            updated changed files.
            
             I frequently think about this when I watch my current workplace's
             Node application go through a huge build process, spitting out a
             70 MB artifact which is then copied multiple times around the
             entire universe as a whole chonk before finally ending up where it
             needs to be several tens of minutes later.
       
              withinboredom wrote 3 hours 57 min ago:
               It's the same watching how PHP applications get deployed these
               days: they go through this huge pipeline and take about the same
               amount of time to replace all the Docker containers.
       
              fulafel wrote 3 hours 59 min ago:
              Yeah, if something even simpler works, that's of course even
              better.
              
               I'd argue the difference between that PHP app and the Node app
               wasn't the lack of modularity; you could have a modulith with the
               same fast deploy.
               
               (But of course a modulith is just extra complexity too if you
               don't need it.)
       
          theptip wrote 6 hours 17 min ago:
          Not a silver bullet; you increase api versioning overhead between
          services for example.
       
            motorest wrote 4 hours 31 min ago:
            > Not a silver bullet; you increase api versioning overhead between
            services for example.
            
            That's actually a good thing. That ensures clients remain backwards
            compatible in case of a rollback. The only people who don't notice
             the need for API versioning are those who are oblivious to the
            outages they create.
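             
             A sketch of the idea with hypothetical routes: the old contract
             stays frozen, so a client built against /v1 keeps working if the
             server is rolled back to a build that predates /v2.
             
               def get_user_v1(user_id: str) -> dict:
                   return {"id": user_id, "name": "Ada"}   # frozen shape
               
               def get_user_v2(user_id: str) -> dict:
                   return {"id": user_id, "name": "Ada",
                           "created_at": "2015-11-24"}     # additive change only
               
               ROUTES = {"/v1/users": get_user_v1, "/v2/users": get_user_v2}
               print(ROUTES["/v1/users"]("u1"))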
       
            whateveracct wrote 5 hours 40 min ago:
            True but your API won't be changing that rapidly especially in a
            backwards-incompatible way.
       
              dhfuuvyvtt wrote 5 hours 8 min ago:
              What's that got to do with microservices?
              
               Edit: because you can avoid those things in a monolith.
       
        Sparkyte wrote 6 hours 56 min ago:
        Sounds like a process problem. 2024 development cycles should be able
         to handle multiple lanes of development and deployment. That's also
         why things moved to microservices: so you can deploy with minimal
         impact as long as you don't tightly couple your dependencies.
       
          m00x wrote 6 hours 53 min ago:
          You don't need microservices to do this. It's actually easier
          deploying a monolith with internal dependencies than deploying
          microservices that depend on each other.
       
            adrianpike wrote 5 hours 34 min ago:
            This is very accurate - microservices can be great as a forcing
            function to revisit your architectural boundaries, but if all you
            do is add a network hop and multiple components to update when you
            tweak a data model, all you'll get is headcount sprawl and deadlock
            to the moon.
            
            I'm a huge fan of migrating to microservices as a secondary outcome
            of revisiting your component boundaries, but just moving to
            separate repos & artifacts so we can all deploy independently is a
            recipe for pain.
       
        yarg wrote 7 hours 11 min ago:
        I had a boss who actually acknowledged that he was deliberately holding
        up my development process - this was a man who refused to allow me a
        four day working week.
       
        dang wrote 7 hours 14 min ago:
        Related:
        
        Slow Deployment Causes Meetings - [1] - Nov 2015 (26 comments)
        
   URI  [1]: https://news.ycombinator.com/item?id=10622834
       
       
   DIR <- back to front page