_______ __ _______
| | |.---.-..----.| |--..-----..----. | | |.-----..--.--.--..-----.
| || _ || __|| < | -__|| _| | || -__|| | | ||__ --|
|___|___||___._||____||__|__||_____||__| |__|____||_____||________||_____|
on Gopher (unofficial)
URI Visit Hacker News on the Web
COMMENT PAGE FOR:
URI Slow deployment causes meetings (2015)
tpoacher wrote 21 hours 8 min ago:
Meetings (used right) are a great tool, in the same sense that project
planners (used right) are a great tool.
But then there's Jira.
/s
braza wrote 1 day ago:
A marginally related point, but I do not know if others have faced the
following situation: I worked in a place with a CI pipeline run of ~25
minutes, with the unit/integration tests (3000+) taking 18 minutes.
When something happened in production we ended up adding more tests;
and of course when things went south, at least 50 minutes were
necessary to recover.
After a lot of consideration we decided to relax and simplify some
tests and focus on recovery (i.e. have the full thing run in less
than 5 minutes), combined with a canary deployment strategy (instead
of rolling updates).
At least for us it was a refreshing experience, but it sounded wrong
in some ways.
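A minimal sketch of that canary-plus-fast-recovery idea; the router
class, traffic fraction, and thresholds below are illustrative
assumptions, not the actual setup:

  import random

  # Hypothetical canary router: send a small fraction of traffic to
  # the new version and flip back to stable if its error rate is bad.
  CANARY_FRACTION = 0.05   # 5% of requests hit the canary
  ERROR_THRESHOLD = 0.02   # roll back above a 2% error rate
  MIN_SAMPLES = 200        # don't judge the canary on too few requests

  class CanaryController:
      def __init__(self):
          self.requests = 0
          self.errors = 0
          self.rolled_back = False

      def choose_version(self):
          if self.rolled_back:
              return "stable"
          return "canary" if random.random() < CANARY_FRACTION else "stable"

      def record(self, version, ok):
          if version != "canary" or self.rolled_back:
              return
          self.requests += 1
          if not ok:
              self.errors += 1
          if (self.requests >= MIN_SAMPLES
                  and self.errors / self.requests > ERROR_THRESHOLD):
              # Recovery is flipping traffic back: seconds, not 50 min.
              self.rolled_back = True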
wussboy wrote 22 hours 6 min ago:
I've often said that it is the speed of deployment that matters. If
it takes you 50 minutes to deploy, it takes you 50 minutes to fix a
problem. If it takes you 50 seconds to deploy, it takes you 50
seconds to fix a problem.
Of course all kinds of things are rolled up in that speed to deploy,
but almost all of them are good.
lifeisstillgood wrote 1 day ago:
I am trying to expound a concept I call "software literacy" - where
a business can be run via code just as much as today a company can be
run by English words (policy documents, emails etc).
This leads to a few corollaries - things like "If GPUs do the work
then coders are the new managers", or that we need whole-org-test-rigs
to be clear about the impacts of changes.
This seems directly related to this excellent article - to my mind if
all the decision makers are not looking at the code as the first class
object in a change process (as opposed to Jiras or project plans) then
not all decision makers are (software) literate - and this comes up a
lot in the threads here ("how do I discuss with non-technical
management") - the answer is you cannot - that management must be
changed. This is an enormous generational road block that I thought was
a problem thirty years ago but naively assumed would disappear as
coders grew up. Of course the problem is that to "run" a company
one does not need to code - so until not coding is something
embarrassing, like not writing is for a newspaper editor, we won't get
past it.
The main point is that we need companies that can be run with the new
set of self-reinforcing concepts - SOPs, testing; not meetings but
systems as communication.
I will try and rewrite this comment later - it needs work
fduran wrote 13 hours 43 min ago:
I've called that "Organization as Code" some years back :-)
gavmor wrote 17 hours 28 min ago:
You had me at "whole org test" harness. This is a very, very
interesting idea. Especially in conjunction with the concept of
corporation as "slow AI" that I don't hear referenced often enough.
I don't see why you call it "literacy," though. I think Maturana &
Varela's term "autopoiesis" more closely orbits the kernel, and I'll
bet Stafford Beer's Autopoietic Systems would contribute to a good
intellectual foundation.
At a certain point, though, I wonder if a purely software "business"
doesn't just look like... SaaS?
vegetablepotpie wrote 1 day ago:
I have personal experience with this in my professional career. Before
Christmas break I had a big change, and there was fear. My org
responded by increasing testing (regression testing, which increased
overhead). This increased the risk that changes on dev would break
changes on my branch (not in a code-merging way, but in a
complex-adaptive-system way).
I responded to this risk by calling a meeting. I presented our project
schedule and told my colleagues what to expect, i.e. if they drop code
style comments on the PRs, those will be deferred to a future PR (and
then ignored and never done).
What we needed is fine grained testing with better isolation between
components. The problem is that our management is at a high level;
they don't see meetings as a means to an end, they see meetings as a
worthy goal in and of themselves. More meetings means more
collaboration, means good. I'd love to see advice on how to lead
technical changes with non-technical management.
austin-cheney wrote 1 day ago:
While this is mostly correct, it's also just as irrelevant.
TL;DR: software performance, thus human performance, is all that
matters.
Risk management/acceptance can be measured with numbers. In software
this is actually far more straightforward than in many other careers,
because software engineers can only accept risk within the restrictions
of their known operating constraints and everything else is deferred.
If you want to go faster you need to maximize the frequency of human
iteration above absolutely everything else. If a person cannot iterate,
such as waiting on permissions, they are blocked. If they are waiting
on a build or screen refresh they are slowed. This can also be measured
with numbers.
If person A can iterate 100x faster than person B, correctness becomes
irrelevant. Person B must maximize upon correctness because they are
slow. To be faster and more correct person A has extreme flexibility to
learn, fail, and improve beyond what person B can deliver.
Part of iterating faster AND reducing risk is fast test automation. If
person A can execute 90+% test coverage in time of 4 human iterations
then that test automation is still 25x faster than one person B
iteration with a 90+% lower risk of regression.
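A quick check of that arithmetic, taking person B's iteration as 100
arbitrary time units:

  # Person A iterates 100x faster than person B (the premise above).
  b_iteration = 100.0
  a_iteration = b_iteration / 100
  # The automated test suite runs in 4 of person A's iterations.
  test_suite = 4 * a_iteration

  print(b_iteration / test_suite)  # 25.0 -> the suite is 25x faster
                                   # than a single B iteration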
andy_ppp wrote 1 day ago:
The organisation will actively prevent you from trying to improve
deployments though, they will say things like "Jenkins shouldn't be
near production" or "we can't possibly put things live without QA
being involved" or "we need this time to make sure the quality of
the software is high enough". All with a straight face while having
millions of production bugs and a product that barely meets any user
requirements (if there are any).
In the end fighting the bureaucracy is actually impossible in most
organisations, especially if you're not part of the 200 layers of
management that create these meetings. I would sack everyone but
programmers and maybe two designers and let everyone fight it out
without any agile coaches and product owners and scrum master and
product experts.
Slow deployment is a problem but it's not the problem.
xorcist wrote 20 hours 50 min ago:
> Jenkins shouldn't be near production
All of which sounds completely reasonable to me, in many situations.
Jenkins is the WordPress of software development. It's a gigantic
state loop that runs plugins with no privilege separation. Giving your
jenkins instance administrative credentials in production might very
well be equivalent to giving root keys to that lone guy in Romania
who authored that plugin you never audited. I can understand
perfectly why that might not be desirable to everyone.
.. which neatly leads on to
> we can't possibly put things live without QA being involved
If you deploy stuff in production that never passes QA, why do you
even have QA? To fix stuff later?
If they are not empowered they will never have the chance to do a
good job or have any pride in their work.
lifeisstillgood wrote 22 hours 37 min ago:
This is more or less Musk's approach at Twitter - and ignoring the
enormous baggage any discussion with Musk brings (if possible) - I
would love to see a real academic case study on the effects of that
to Twitter - there will be a lot to unpick but my bias is on your
side here.
gavmor wrote 23 hours 13 min ago:
> Jenkins shouldnât be near production
> we canât possibly put things live without QA being involved
> we need this time to make sure the quality of the software is high
enough
I've only developed software professionally since 2012, but in that
time not only have I never encountered such sentiments, but (and,
perhaps, because) it has always been a top priority of leadership to
emphatically insist on the very opposite: day one of any initiative
is Jenkins to production, often directly via trunk-based
development, and quality is every developer's responsibility.
At the IC level, there was no "fighting bureaucracy," although I
don't doubt leadership debated these things vigorously, from time to
time, especially as external partners and stakeholders were often
intimately involved.
> I would sack everyone but programmers and maybe two designers and
let everyone fight it out
That works for me! But it doesn't scale. We definitely have to keep
at least one product "owner" or "expert" or "manager" to enqueue
stakeholder priorities and, while this can be a "hat" that devs and
designers trade off, it's also a skill at which some individuals
uniquely excel.
All that being said, I don't want to come across as pearl-clutching,
shocked Pikachu face about this. I understand that many organizations
don't operate this way. The way I've helped firms make this change is
via the introduction of a single, experimental team of volunteers
dedicated to these practices, one protected (but not dictated to) by
a mandate from on high.
But, then again, this is California.
gleenn wrote 1 day ago:
You sound very defeatist about fighting bureaucracy. If you work at
an org with too much management, you can slowly push to move it in
the direction you hope for or leave. If you keep ending up at places
that seem impossible to change, perhaps you should ask more questions
about this during the interview. I've worked at many small companies
where there wasn't crazy bureaucracy because that's definitely what I
preferred. I also currently work at a megacorp and yes there is
difficulty, but being consistent and persuasive has led to many
things slowly heading in the right direction. Things take time. You
have to realize why people have made things some way and then find
convincing arguments to make things better. Sometimes places do just
suck so don't stick around. But being hopeless doesn't seem helpful.
mewpmewp2 wrote 6 hours 5 min ago:
My issue is that it just pays more. I have no similarly paying option
that would offer as good an experience in that sense.
sourceless wrote 1 day ago:
I think unfortunately the conclusion here is a bit backwards;
de-risking deployments by improving testing and organisational
properties is important, but is not the only approach that works.
The author notes that there appears to be a fixed number of changes per
deployment and that it is hard to increase - I think the 'Reversie
Thinkie' here (as the author puts it) is actually to decrease the
number of changes per deployment.
The reason those meetings exist is because of risk! The more changes in
a deployment, the higher the risk that one of them is going to
introduce a bug or operational issue. By deploying small changes often,
you get to deliver value much sooner and fail smaller.
Combine this with techniques such as canarying and gradual rollout, and
you enter a world where deployments are no longer flipping a switch and
either breaking or not breaking - you get to turn outages into
degradations.
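To make the compounding concrete, a small illustration with an
assumed (not sourced) 2% per-change failure rate, treating changes as
independent:

  # A deployment bundling n changes goes bad with p = 1 - 0.98^n.
  p_change = 0.02
  for n in (1, 5, 20, 50):
      print(n, round(1 - (1 - p_change) ** n, 3))
  # 1 -> 0.02, 5 -> 0.096, 20 -> 0.332, 50 -> 0.636:
  # a 50-change bundle fails more often than not.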
This approach is corroborated by the DORA research[0], and covered well
in Accelerate[1]. It also features centrally in The Phoenix Project[2]
and its spiritual ancestor, The Goal[3].
[0] [1] [2] [3]
URI [0]: https://dora.dev/
URI [1]: https://www.amazon.co.uk/Accelerate-Software-Performing-Techno...
URI [2]: https://www.amazon.co.uk/Phoenix-Project-Helping-Business-Anni...
URI [3]: https://www.amazon.co.uk/Goal-Process-Ongoing-Improvement/dp/0...
manvillej wrote 23 hours 37 min ago:
this isn't even a software thing. It's any production process. The
greater the amount of work-in-progress items, the longer those items
stay in progress, the greater the risk, the greater the amount of
work. Shrink the batch, shorten the release window.
It infuriates me that software engineering has had to rediscover
these facts when the Toyota production system was developed between
1948-1975 and knew all these things 50 years ago.
lifeisstillgood wrote 1 day ago:
So this seems quantifiable as well - there must be a number of
processes / components that a business is made up of, and those
presumably are also weighted (payment processing has weight 100, HR
holiday requests weight 5 etc).
I would conjecture that changing more than 2% of processes in any
given period is "too much" - but one can certainly adjust that.
And I suspect that this modifies based on area (ie the payment
processing code has a different team than the HR code) - so it would
be sensible to rotate releases (or possibly teams) - this period this
team is working on the hard stuff, but once that goes live the team
is rotated back out to tackle easier stuff - either payment
processing or HR
The same principle applies to attacking a trench, moving battalions
forward and combined arms operations.
Now that is of course a "management" problem - but one can easily
see how to automate a lot of it - and how other "sensory" inputs
are useful (ie which teams have committed code to these sensitive
modules recently).
One last point is it makes nonsense of "sprints" in Agile/Scrum -
we know you cannot sprint a whole marathon, so how do you prepare the
sprints for rotation?
gavmor wrote 22 hours 55 min ago:
There are no sprints in agile. ;)
On the contrary, per the Manifesto:
> Agile processes promote sustainable development.
> The sponsors, developers, and users should be able
> to maintain a constant pace indefinitely.
vasco wrote 1 day ago:
I agree entirely - I use the same references, I just think it's
bordering on sacrilege what you did to Mr. Goldratt. He has been
writing about flow and translating the Toyota Production System
principles and applying physics to business processes way before
someone decided to write The Phoenix Project.
I loved the Phoenix Project don't get me wrong, but compared to The
Goal it's like a cheaply produced adaptation of a "real" book so
that people in the IT industry don't get scared when they read about
production lines and run away saying "but I'm a PrOgrAmmEr, and
creATIVE woRK can't be OPtiMizEd like a FactOry".
So The Phoenix Project if anything is the spiritual successor to The
Goal, not the other way around.
grncdr wrote 1 day ago:
That's exactly what the GP wrote: The Goal is the spiritual
ancestor of The Phoenix Project.
vasco wrote 1 day ago:
Well now I can't tell if it was edited or if I just misread and
decided to correct my own mistake. I'll leave it be so I remember
next time, thanks.
sourceless wrote 1 day ago:
That's indeed how I wrote it, but I could have worded it
better. Very much agree that the insights in The Goal go far
beyond the scope of The Phoenix Project.
mrbluecoat wrote 1 day ago:
I totally read it as successor as well. Interesting how the
brain fills in what we expect to see :)
ricardobeat wrote 1 day ago:
> By deploying small changes often, you get to deliver value much
sooner and fail smaller.
Which increases the number of changes per deployment, feeding the
overhead cycle.
He is describing an emergent pattern here, not something that
requires intentional culture change (like writing smaller changes).
You're not disagreeing but paraphrasing the article's conclusion:
> or the harder way, by increasing the number of changes per
deployment (better tests, better monitoring, better isolation between
elements, better social relationships on the team)
sourceless wrote 1 day ago:
I am disagreeing with the conclusion of the article, and asserting
that more and smaller deployments are the better way to go.
ricardobeat wrote 23 hours 20 min ago:
You are not. The conclusion of the article is the same, you "need
to expand the far end of the hose" by increasing deployment rate
or making more, smaller changes. What was your interpretation?
sourceless wrote 19 hours 24 min ago:
My reading was that there were two paths the author highlights:
1) Increase deployment capacity (which I'm reading as
frequency, and I fully agree with)
2) Increase change capacity per deployment by making it less
likely that a set of changes will fail through tests,
monitoring, structural, and team changes
#2 is very much geared to "ship more changes in one deployment"
which is where my disagreement lies. I think you should still
do all those things, but that increasing the size of the bundle
is explicitly an anti-goal.
I think you're better off, as a rule of thumb, making fewer
changes per deployment if you want to reduce risk.
But -- that is my particular reading of it.
sciurus wrote 16 hours 19 min ago:
My reading is that the author posits there is a fixed amount
of change that can be safely made in a single deployment. The
solution is to make it possible to deploy more frequently.
This is hard, so organizations will often instead introduce
overhead that slows down changes. Engineers might be tempted
to blame the overhead and try to eliminate it, but that won't
be successful and may even backfire. They need to tackle the
underlying issue of deployment capacity instead.
ozim wrote 1 day ago:
I am really interested in organizations' capacity for soaking up
changes.
I live in the B2B SaaS space, and as far as development goes we could
release daily. But on the receiving side we get pushback. Of course
there can be feature flags, but then it would cause a "not enabled
feature backlog".
In the end features are mostly consumed by people and people need
training on the changes.
ajmurmann wrote 1 day ago:
I think that really depends on the product. I worked on an on-prem
data product for years and it was crucial to document all changes
well and give customers time to prepare. OTOH I also worked on a
home inspection app, and there users gave us pushback on training
because the app was seen as intuitive.
paulryanrogers wrote 1 day ago:
> ...there users gave us pushback on training because the app was
seen as intuitive
I would weep with joy to receive such feedback! Too often the
services I work on have long histories with accidental UIs, built
to address immediate needs over and over.
ajmurmann wrote 2 hours 53 min ago:
This was a greenfield app. For all I know by now accommodating
edge cases that almost never matter has made the thing
unusable.
motorest wrote 1 day ago:
> The reason those meetings exist is because of risk! The more
changes in a deployment, the higher the risk that one of them is
going to introduce a bug or operational issue.
Having worked on projects that were perfectly full CD and also
projects that had biweekly releases with meetings with release
engineers, I can state with full confidence that risk management is
correlated but an indirect and secondary factor.
The main factor is quite clearly how much time and resources an
organization invests in automated testing. If an organization has the
misfortune of having test engineers who lack the technical background
to do automation, they risk never breaking free of these meetings.
The reason why organizations need release meetings is that they lack
the infrastructure to test deployments before and after rollouts, and
they lack the infrastructure to roll back changes that fail once
deployed. So they make up this lack of investment by adding all these
ad-hoc manual checks to compensate for lack of automated checks. If
QA teams lack any technical skills, they will push for manual
processes as self-preservation.
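A sketch of the missing automation being described: deploy,
smoke-check the rollout, and roll back automatically on failure. The
health URL and the deploy/rollback callables are placeholders, not a
real pipeline's API:

  import time
  import urllib.request

  def healthy(url, timeout=5):
      # One post-rollout smoke check; a real pipeline runs a battery.
      try:
          with urllib.request.urlopen(url, timeout=timeout) as resp:
              return resp.status == 200
      except OSError:
          return False

  def deploy_with_rollback(deploy, rollback, health_url,
                           checks=5, wait_seconds=10):
      deploy()
      for _ in range(checks):
          time.sleep(wait_seconds)
          if not healthy(health_url):
              rollback()  # the pipeline undoes itself; no meeting needed
              return False
      return True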
To make matters worse, there is also the propensity to pretend that
having to go through these meetings is a sign of excellence and best
practices, because if you're paid to mitigate a problem obviously you
have absolutely no incentive to fix it. If a bug leaks into
production, that's a problem introduced by the developer that wasn't
caught by QAs because reasons. If the organization has automated
tests, it's even hard to not catch it at the PR level.
Meetings exist not because of risk, but because organizations employ
a subset of roles that require risk to justify their existence and
lack skills to mitigate it. If a team organizes its efforts to add
the bare minimum checks to verify a change runs and works once
deployed, and can automatically roll back if it doesn't, you do not
need meetings anymore.
gavmor wrote 22 hours 48 min ago:
> The main factor is quite clearly how much time and resources an
organization invests in automated testing.
For context, I think it's worth reflecting on Beck's background, eg
as the author of XP Explained. I suspect he's taking even TDD for
granted, and optimizing what's left. I think even the name of his
new blog, "Tidy First", is in reaction to a saturation, in his
milieu, of the imperative to "Test First".
vegetablepotpie wrote 1 day ago:
This is very well said and succinctly summarizes my frustrations
with QA. My experience has been that non-technical staff in
technical organizations create meetings to justify their existence.
I'm curious if you have advice on how to shift non-technical QA
towards adopting automated testing and fewer meetings.
phatskat wrote 18 hours 0 min ago:
We are in the early stages of something like this in my org. QA
has been writing tests in some form for a while, and it's
mostly been at a self-led level. We have a senior engineer
per-application responsible for tooling and guidance, and the QA
testers have been learning Java/script (depending on the
application, teams we don't interface with are writing theirs
in C# iirc). With the new year, we are starting a phased
initiative to ramp up all of QA to be Software Engineers in
Testing - each phase will teach and guide and impart the skills
needed to be fully sufficient to write automation tests in tandem
with engineers writing features.
It's an interesting and bold initiative imo, as I've often
worked at places that let QA do whatever felt best which is good
from the standpoint of letting them work within their comfort
zone, and it also means that testing will largely plateau. I
haven't seen a real push for automation _not_ come out of the
engineering department personally (because I'm the one pushing
it every time), though I know this place has at least done some
work with various automation systems in the past.
blackjack_ wrote 1 day ago:
Hi, senior SRE here who was a QA, then QA lead, then lead
automation / devops engineer.
QA engineers with little coding experience should be given simple
automation tasks with similar tests and documentation/people to
ask questions to, i.e. set up a pytest framework that has a few
automated test examples, and then have them write similar tests.
The automated tests are just TAC (tests as code) versions of the
manual test cases they should already write, so they should have
some idea of what they need to do, and then google / ChatGPT/
automation engineers should be able to help them start to
translate that to code.
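As an illustration, one such seeded pytest example that a manual test
case could be translated into; the URL and endpoint are hypothetical:

  # test_login.py -- example "test as code" seeded for QA to imitate.
  import pytest
  import requests

  BASE_URL = "https://staging.example.com"  # hypothetical environment

  @pytest.fixture
  def session():
      s = requests.Session()
      yield s
      s.close()

  def test_login_rejects_bad_password(session):
      # Manual test case, translated: wrong password -> 401, no cookie.
      resp = session.post(BASE_URL + "/login",
                          data={"user": "qa", "password": "wrong"})
      assert resp.status_code == 401
      assert "session" not in session.cookies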
People with growth mindsets and ambitions will grow from the
support and being given the chance to do the things, while some
small number will balk and not want anything to do with it. You
can lead a horse to water and all that.
sourceless wrote 1 day ago:
I think we may be violently agreeing - I certainly agree with
everything you have said here.
tomxor wrote 1 day ago:
I tend to agree. Whenever I've removed artificial technical friction,
or made a fundamental change to an approach, the processes that grew
around them tend to evaporate, and not be replaced. I think many of
these processes are a rational albeit non-technical response to
making the best of a bad situation in the absence of a more
fundamental solution.
But that doesn't mean they are entirely harmless. I've come across
some scenarios where the people driving decisions continued to reach
for human processes as the solution rather than a workaround, for
both new projects and projects designated specifically to remove
existing inefficiencies. They either lacked the technical
imagination, or were too stuck in the existing framing of the
problem, and this is where people who do have that imagination need
to speak up and point out that human processes need to be minimised
with technical changes where possible. Not all human processes can be
obviated through technical changes, but we don't want to spread
ourselves thin on unnecessary ones.
jojobas wrote 1 day ago:
Fast deployment causes incident war rooms.
wasmitnetzen wrote 20 hours 51 min ago:
Yeah, and slow ones as well.
wussboy wrote 21 hours 58 min ago:
That is the opposite of my experience. Slow deploys mean bigger
deploys mean more complexity going live mean more nervousness and
more testing mean more hesitation mean more chance that something
unforeseen happens mean errors that no one understands mean war rooms.
boxed wrote 1 day ago:
I was on a team that went from every 3 weeks to multiple times per
day. The number of incidents in production dropped drastically.
But much more important than that drop, was that when things went
wrong it was MUCH MUCH faster to find the problem. It was also much
safer and easier to roll back, since there were so few changes that
would be rolled back. No one wants to back off 3 weeks of work.
That's chaos.
Trasmatta wrote 1 day ago:
In my experience, there's very little correlation. I've been on
projects with 1 deployment every six weeks, and there were just as
many production incidents as projects with daily deployments.
DougBTX wrote 1 day ago:
Maybe the opposite, slow rollbacks cause escalating incidents.
qaq wrote 1 day ago:
A bit tangential but why is CloudFormation so slowww?
motorest wrote 1 day ago:
> A bit tangential but why is CloudFormation so slowww?
It's not that CloudFormation is slow. It's that the whole concept of
infrastructure-as-code is slow by nature.
Each time you deploy a change to a state as a transaction, you need
to assert preconditions and post-conditions at each step. If you have
to roll out a set of changes that have any semblance of
interdependence, you have no option other than to deploy each change
as sequential steps. Each step requires many network calls to apply
changes, go through auth, poll state, each one taking somewhere
between 50-200ms. That quickly adds up.
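A back-of-envelope model of how those calls add up, using the
50-200ms figure above; the call counts and stack size are assumptions
for illustration:

  # Sequential IaC rollout: each interdependent resource needs an
  # apply call plus repeated state polls before the next step starts.
  avg_call_seconds = 0.125          # midpoint of the 50-200ms range
  calls_per_resource = 2 + 30       # auth + apply, then ~30 status polls
  resources = 40                    # a modest interdependent stack

  total = resources * calls_per_resource * avg_call_seconds
  print(total)  # 160.0 seconds of pure round-trips, before any
                # actual provisioning time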
If you deploy the same app on a different cloud provider with
Terraform or Ansible, you get the same result. If you deploy the same
changes manually you turn a few minutes into a day-long ordeal.
The biggest problem with IaC is that it is so high-level and does so
much under the hood that some people have no idea what changes they
are actually applying or what they are doing. Then they complain it
takes so long.
mlhpdx wrote 19 hours 3 min ago:
FWIW, my approach to IaC has been to focus on the "I" with
CloudFormation: the networking, storage, IAM, and other AWS
primitives. This stuff doesn't change as often, and
safe/reliable deployments are more valuable than quick ones.
The behavioral parts (aka. application, stuff running in a VM of
some kind or something declarative like EventBridge rules or
StepFunctions) I keep separate and prioritize quick turns.
CodeDeploy can, for example, update code on EC2s in single-digit
seconds.
I'm building systems that are a little more integrated in AWS
than most folks, perhaps, which makes this approach a good fit. I
do dozens of deployments a day (not an exaggeration: 21 so far
today, on a light day), including a couple of infrastructure updates.
I think the secret here is not buying into meme-like
simplifications and instead deliberately designing an approach that
works for your goals.
Uehreka wrote 21 hours 6 min ago:
> It's that the whole concept of infrastructure-as-code is slow by
nature.
> If you deploy the same app on a different cloud provider with
Terraform or Ansible, you get the same result.
Nope, Terraform is way faster. Anyone who has switched between them
on the same project can attest to this.
Also, Terraform does not get into
"UPGRADE_ROLLBACK_FAILED"-style unrecoverable states nearly as
easily. This happens to me all the time with CloudFormation/CDK. So
my second question after "Why is CloudFormation so slow?" would
be "Why is CloudFormation more error-prone when it's also
slower?"
mlhpdx wrote 18 hours 50 min ago:
It very much depends on the project. TF has all sorts of slowness
and failure modes all its own.
maccard wrote 22 hours 44 min ago:
50-200ms per poll is one thing, but realistically we're talking
30+ seconds for the smallest of changes, even on new resources. Why
does it take so long to spin up an EC2 instance, when Fargate can
do it in seconds (assuming you're not rate limited by the API) and
Lambda can do it in milliseconds? Those machines are already
running; why does it take 3 minutes to deploy Ubuntu or Debian from
a blessed AMI?
ianburrell wrote 19 hours 16 min ago:
Fargate runs containers, Lambda runs functions. They use
Firecracker microVMs while EC2 uses full VMs. EC2 instances do a
lot more setup, use a bigger image, and run user setup. My guess is
Firecracker is designed for smaller VMs and can't support the EC2
features that people need.
qaq wrote 1 day ago:
Thing is Terraform is faster
hk1337 wrote 1 day ago:
This is just anecdotal but I have found anytime a network interface
is involved, it can slow down the deployment. I had a case where I
was deleting lambdas in a VPC, and connected to EFS, that the
deployment was rather quick but it took ~20 minutes for
CloudFormation to clean up and finish.
Aeolun wrote 1 day ago:
The reason my boss tends to give is that it's made by AWS, so it
cannot possibly be bad. Also, it's free. Which is never given as
anything more than a tangentially related reason, but...
Uehreka wrote 21 hours 12 min ago:
It... definitely isn't free. Have you ever looked at the
"Config" category of your AWS bill?
justin_oaks wrote 1 day ago:
I figure it's because AWS can get away with it.
shepherdjerred wrote 1 day ago:
AWS deploys using cfn internally
bobnamob wrote 1 hour 11 min ago:
This is only partly true.
The foundational services do not use cfn for the vast majority of
their deployments.
lizzas wrote 1 day ago:
Microservices let you horizontally scale deployment frequency too.
devjab wrote 1 day ago:
You can do this with a monolith architecture as others point out. It
always comes down to governance. With monoliths you risk slowing
yourself down in a huge mess of SOLID, DRY and other "clean code"
nonsense which means nobody can change anything without it breaking
something. Not because any of the OOP principles are wrong on face
value, but because they are so extremely vague that nobody ever gets
them right. It's always hilarious to watch Uncle Bob dismiss any
criticism with a "they misunderstood the principles" because
he's always completely right. Maybe the principles are just bad
when so many people get them wrong? Anyway, microservices don't
protect you from poor governance it just shows up as different
problems. I would argue that it's both extremely easy and common to
build a bunch of micro services where nobody knows what effect a
change has on others. It comes down to team management, and this is
where our industry sucks the most in my experience. It'll be better
once the newer generations of "Team Topologies" enter, but
it'll be a struggle for decades to come, if it'll ever really end.
Often it's completely out of the hands of whatever digitalisation
department you have, because the organisation views any "IT" as a
cost center and never requests things in a way that can be
incorporated in any sort of SWE best practice process.
One of the reasons I like Go as a general purpose language is that it
often leads to code bases which are easy to change by its simplicity
by design. I've seen an online bank and a couple of landlord
systems (sorry, I can't find the English word for asset and tenant
management in a single platform) explode in growth. Largely because
switching to Go has made it possible for them to actually deliver
what the business needs. Meanwhile their competition remains stuck
with unruly Java or C# code bases where they may be capable of
rolling out buggy additions every half year if their organisation is
lucky. Which has nothing to do with Go, Java or C# by the way, it has
to do with old fashioned OOP architecture and design being way too
easy to fuck up. In one shop I worked they had over a thousand C#
interfaces which were never consumed by more than one class... Every
single one of their tens of thousands of interfaces was in the same
folder and namespace... good luck finding the one you need. You could
do that with Go, or any language, but chances are you won't do it
if you're not rolling with one of those older OOP clean code
languages. Not doing it with especially C# is harder because
abstraction by default is such an ingrained part of the culture
around it.
Personally I have a secret affection for Python shops because they
are always fast to deliver and terrible in the code. Love it!
punnerud wrote 1 day ago:
As long as every team managing the different APIs/services doesn't
have to be consulted for others to get access.
You then get both the problems of distributed data and even more
levels of complexity (more meetings than with a monolith)
motorest wrote 1 day ago:
> As long as every team managing the different APIs/services
doesn't have to be consulted for others to get access.
Worst-case scenario, those meetings take place only when a new
consumer starts consuming a producer managed by an external team
well outside your org.
Once that rolls out, you don't need any meeting anymore beyond
hypothetical SEVs.
faizshah wrote 1 day ago:
It's a monkey's paw solution, now you have 15 kinda slow
pipelines instead of 3 slow deployment pipelines. And you get to have
the fun new problem of deployment planning and synchronizing feature
deployments.
motorest wrote 1 day ago:
> It's a monkey's paw solution, now you have 15 kinda slow
pipelines instead of 3 slow deployment pipelines.
Not a problem. In fact, they are a solution to a problem.
> And you get to have the fun new problem of deployment planning
and synchronizing feature deployments.
Not a problem either. You don't need to synchronize anything if you're
consuming changes that are already deployed and running. You also
do not need to synchronize feature deployments if you know the very
basics of your job. Worst case scenario, you have to move features
behind a feature flag, which requires zero synchronization, as the
sketch below shows.
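A minimal sketch of that flag gate; the flag name and pricing
functions are made up, and real systems would use a flag service
rather than an environment variable:

  import os

  def flag_enabled(name):
      # Toy flag check; even an environment variable separates
      # "deployed" from "released".
      return os.environ.get("FLAG_" + name, "off") == "on"

  def legacy_pricing(cart):
      return sum(cart)

  def new_pricing(cart):
      return round(sum(cart) * 0.95, 2)

  def checkout(cart):
      if flag_enabled("NEW_PRICING"):   # ship dark, enable whenever
          return new_pricing(cart)
      return legacy_pricing(cart)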
This sort of discussion feels like people complaining about
perceived problems they never bothered to think about, let alone
tackle.
fulafel wrote 1 day ago:
I think this was the meme before moduliths[1][2] where people
conflated the operational and code change aspects of microservices.
But it's just additional incidental complexity that you should
resist.
IOW you can do as many deploys without microservices if you organize
your monolithic app as independent modules, while keeping out the
main disadvantages of the microservice (infra/cicd/etc complexity,
and turning your app's function calls into a unreliable distributed
system communication problem). [1]
URI [1]: https://www.fearofoblivion.com/build-a-modular-monolith-firs...
URI [2]: https://ardalis.com/introducing-modular-monoliths-goldilocks...
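A toy sketch of that modular-monolith shape: modules in one
deployable unit, talking through narrow in-process interfaces instead
of the network (the module names are invented):

  from dataclasses import dataclass

  @dataclass
  class Invoice:
      customer_id: int
      amount_cents: int

  class BillingModule:
      # The module's narrow public surface; internals stay private.
      def create_invoice(self, customer_id, amount_cents):
          return Invoice(customer_id, amount_cents)

  class CrmModule:
      def __init__(self, billing):
          self.billing = billing  # plain function call, not a network hop

      def close_deal(self, customer_id, price_cents):
          return self.billing.create_invoice(customer_id, price_cents)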
motorest wrote 1 day ago:
> I think this was the meme before moduliths[1][2] where people
conflated the operational and code change aspects of microservices.
People conflate the operational and code change aspects of
microservices just like people conflate that the sky is blue and
water is wet. It's a statement of fact that doesn't go away with
buzzwords.
> IOW you can do as many deploys without microservices if you
organize your monolithic app as independent modules, while keeping
out the main disadvantages of the microservice (infra/cicd/etc
complexity, and turning your app's function calls into a unreliable
distributed system communication problem).
This personal opinion is deep within "not even false" territory.
You can also deploy as many times as you'd like with any monolith,
regardless of what buzzwords you tack on that.
What you're completely missing from your remark is the loosely
coupled nature of running things on a separate service, how trivial
it is to do blue-green deployments, and how you can do gradual
rollouts that you absolutely cannot do with a patch to a monolith,
no matter what buzzwords you tack on it. That is the whole point of
mentioning microservices: you can do all that without a single
meeting.
jmulho wrote 1 day ago:
Blue-green deployments is a buzzword no matter what color you
tack on it.
fulafel wrote 1 day ago:
I seem to have struck a nerve!
While there may be some things that can come for free with
microservices (and not moduliths), your mentioned ones don't
sound convincing. Blue-green deployments and gradual rollouts can
be done with a modulith, and I can't think of any reason they would be
harder than with microservices (part of your running instances
can run with a different version of module X). The coupling can
be just as loose as with microservices.
trog wrote 1 day ago:
An old monolithic PHP application I worked on for over a decade
wasn't set up with independent modules and the average deploy
probably took a couple seconds, because it was an svn up which only
updated changed files.
I frequently think about this when I watch my current workplace's
node application go through a huge build process, spitting out a
70mb artifact which is then copied multiple times around the entire
universe as a whole chonk before finally ending up where it needs
to be several tens of minutes later.
stickfigure wrote 13 hours 45 min ago:
> which only updated changed files
You pay for this with the inability to maintain instance state
(even caches) and a glacially slow runtime. It's a tradeoff.
trog wrote 8 hours 57 min ago:
Not sure what you mean about either of those two things? Never
had any issues with instance state in our primary production
environments, which were several instances of load balanced web
servers. No idea what you're referring to as "slow"?
withinboredom wrote 1 day ago:
Even watching how php applications get deployed these days, where
it goes through this huge thing and takes about the same amount
of time to replace all the docker containers.
trog wrote 21 hours 16 min ago:
I avoid Docker for precisely that reason! I have one system
running on Docker across our whole org - Stirling-PDF providing
some basic PDF services for internal use. Each time I update it
I have to watch it download 700mb of Docker stuff, instead of
just doing an in-place upgrade of a few files.
I get that there are advantages in shipping stuff like this.
But having seen PHP stuff work for decades with in-place
deploys and no build process I am just continually disappointed
with how much worse the experience has become.
withinboredom wrote 18 hours 10 min ago:
One approach I've seen work rather successfully is to have a
container that just contains the files to deploy, and another
one for the runtime. You only need to update the runtime
container ~ once a week or so (to get OS security updates),
and the files container is literally just a COPY command to a
volume.
I've only seen that in one place, ever. Most people just do
the insane 40 minute docker build -- though I've also seen
some that take over 4 hours...
trog wrote 8 hours 59 min ago:
That makes a lot of sense to me!
fulafel wrote 1 day ago:
Yeah, if something even simpler works, that's of course even
better.
I'd argue the difference between that PHP app and the Node app
wasn't the lack of modularity, you could have a modulith with the
same fast deploy.
(But of course a modulith is just extra complexity too, if you don't
need it)
theptip wrote 1 day ago:
Not a silver bullet; you increase api versioning overhead between
services for example.
motorest wrote 1 day ago:
> Not a silver bullet; you increase api versioning overhead between
services for example.
That's actually a good thing. That ensures clients remain backwards
compatible in case of a rollback. The only people who don't notice
the need for API versioning are those who are oblivious to the
outages they create.
whateveracct wrote 1 day ago:
True but your API won't be changing that rapidly especially in a
backwards-incompatible way.
dhfuuvyvtt wrote 1 day ago:
What's that got to do with microservices?
Edit: because you can avoid those things in a monolith.
Sparkyte wrote 1 day ago:
Sounds like a process problem. 2024 development cycles should be able
to handle multiple lanes of development and deployments. That's also
why things moved to microservices: so you can deploy with minimal
impact, as long as you don't tightly couple your dependencies.
m00x wrote 1 day ago:
You don't need microservices to do this. It's actually easier
deploying a monolith with internal dependencies than deploying
microservices that depend on each other.
Sparkyte wrote 20 hours 35 min ago:
I know microservices and monoliths are a heated topic. However,
breaking up complicated code to preserve user experience is
sometimes essential. You can have machines that contain many
services which interact with each other for performance if
needed. You would put them into pod groups while deploying to
Kubernetes and have them call their service inside of the pod. This
can increase performance and throughput.
adrianpike wrote 1 day ago:
This is very accurate - microservices can be great as a forcing
function to revisit your architectural boundaries, but if all you
do is add a network hop and multiple components to update when you
tweak a data model, all you'll get is headcount sprawl and deadlock
to the moon.
I'm a huge fan of migrating to microservices as a secondary outcome
of revisiting your component boundaries, but just moving to
separate repos & artifacts so we can all deploy independently is a
recipe for pain.
Sparkyte wrote 20 hours 33 min ago:
A network hop isn't needed if you're deploying your microservices
correctly. You can make pod groups inside of Kubernetes, and an
application that depends on another can call that lightweight
container contained in the same pod group. Pods inherently know the
others are there in their group, so the call happens without
traversing hardware.
jrs235 wrote 1 day ago:
and a recipe for "career" driven managers and directors to grow
department head count, budget oversight, and self importance.
yarg wrote 1 day ago:
I had a boss who actually acknowledged that he was deliberately holding
up my development process - this was a man who refused to allow me a
four day working week.
dang wrote 1 day ago:
Related:
Slow Deployment Causes Meetings - [1] - Nov 2015 (26 comments)
URI [1]: https://news.ycombinator.com/item?id=10622834
DIR <- back to front page