fawadkhaliq's comments

fawadkhaliq · on Aug 3, 2025

Couldn't agree more.

fawadkhaliq · on Nov 3, 2024

I agree that the broader software ecosystem has been slow to recognize the importance of Operational Safety. The CrowdStrike outage, while unfortunate, has indeed served as a wake-up call, elevating Operational Safety to a priority for software leaders and CIOs alike.

As you pointed out, the reliance on complex, mission-critical systems is only increasing, and cascading failures are an inherent risk we must address proactively. By learning from organizations like AWS that have successfully integrated Operational Safety into their practices, we can work towards a more resilient and reliable software ecosystem. Let's continue to advocate for making Operational Safety a foundational element in software operations across the industry.

fawadkhaliq · on Feb 21, 2024

How are users handling this change? With Buoyant cutting down on their investment, does Linkerd OSS have enough contributors to sustain it?

"As of Linkerd 2.15.0, the open source project no longer publishes stable releases. Instead, the vendor community around Linkerd is responsible for supported, stable releases."

ref: https://linkerd.io/releases/

fawadkhaliq · on Dec 7, 2023

Couldn't agree more. As someone who used to work at AWS, I've seen it from both sides. AWS has valid reasons (business and technical) for not taking responsibility for all the layers on top. The missing piece is the operational knowledge AWS possesses but platform teams elsewhere lack access to. That's one reason to bring in a “trusted broker” to bridge this gap.

fawadkhaliq · on Dec 7, 2023

k8s complexity is a challenge at scale, but its continued growth seems likely. Reasons include a strong community and support, continuous innovation that regularly adds new capabilities, and standardization across the various layers of the substrate, to name a few. It's not for everyone, though. Teams with simpler needs might find k8s overkill and opt out for valid reasons. Overall, the benefits and community support make it a go-to for many, despite the challenges.

fawadkhaliq · on Aug 15, 2023

What are the factors that lead to ClickOps not scaling? Company size? When should a company know that it’s time to move to IaC?

imahat · on Aug 15, 2023

2 factors imo: scope of permissions, and time spent on manual configuration/approvals as a company grows in headcount and in the number of resources it manages. ClickOps doesn't scale well because more employees requesting access to more resources means more context switches to grant/revoke those requests. At my last large fintech company, this led to lots of ad-hoc access requests or time spent tracking which file contained the permissions and reviewer for which resource, time that could've been used developing features instead.

As for when: if the devs in a company find themselves spending much of their time manually approving permission requests, or just broadly granting access without paying attention to scope, that's a good sign they're ready to move to IaC.