Hacker News | alex-mohr's comments

Process is useful for raising the quality floor of deliveries, for turning former unknowns into knowns, and for preventing misaligned behavior when culture alone becomes insufficient.

If you have a need for speed, a team that knows the space, and, crucially, a leader who can be trusted to depart from the usual process when that tradeoff better meets business needs, it can work really well. But it also comes with increased risk.


And you could write a similar blog post about why Google "failed" at AI productization (at least as of a year ago). For some of the same and some completely different reasons.

  - two competing orgs: Brain and DeepMind.

  - members of those orgs were promoted based on ...?  Whatever it was, it was something other than developing consumer or enterprise products, and definitely not cloud.

  - Nvidia is a Very Big Market Cap company based on selling AI accelerators.  Google sells USB Coral sticks.  And rents accelerators via Cloud.  But somehow those are not valued at Very Big Market Cap.

Of course, they're fixing some of those problems: Brain and DeepMind merged, and Gemini 2.5 Pro is a very credible frontier model. But it's also a cautionary tale about unfettered research focus insufficiently grounded in customer focus.


> But it's also a cautionary tale about unfettered research focus insufficiently grounded in customer focus.

I got the exact opposite takeaway: despite Amazon and Google being pioneers in related areas, both failed to capitalize on their headstarts and kickstart the modern AI revolution because they were hobbled by being grounded in customer focus.


The code in question reminds me a lot of my favorite Kubernetes bug:

  if request.authenticationData != nil {
    ok := validate(etc)
    if !ok {
      return authenticationFailure
    }
  }
Turns out the same meme spans decades.
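
To make the meme concrete, here is a minimal Go sketch (the names and types are mine, not the actual Kubernetes code; the real code differed, but the shape of the bug is the same) contrasting the fail-open structure above with a fail-closed fix:

```go
package main

import (
	"errors"
	"fmt"
)

// Request and validate are stand-ins of my own invention.
type Request struct {
	AuthenticationData *string // nil: the client sent no credentials at all
}

var errAuthenticationFailure = errors.New("authentication failure")

func validate(token string) bool {
	return token == "valid-token" // placeholder for real credential checking
}

// Fail-open shape: a request carrying *no* credentials skips the whole
// block and proceeds as if it had been authenticated.
func handleFailOpen(r Request) error {
	if r.AuthenticationData != nil {
		if !validate(*r.AuthenticationData) {
			return errAuthenticationFailure
		}
	}
	return nil // nil credentials fall through to success
}

// Fail-closed shape: absence of credentials is itself a failure.
func handleFailClosed(r Request) error {
	if r.AuthenticationData == nil || !validate(*r.AuthenticationData) {
		return errAuthenticationFailure
	}
	return nil
}

func main() {
	fmt.Println(handleFailOpen(Request{}) == nil)   // true: the bug
	fmt.Println(handleFailClosed(Request{}) == nil) // false: rejected
}
```

The fix is a one-character idea: treat missing credentials the same as bad credentials, rather than validating only when credentials happen to be present.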


This is a nice example of why one should parse, not validate. If every function that requires some kind of permission takes that permission as an argument, say (pseudocode)

  void doFoo(PermissionToDoFoo permission, ...){...}
and then, the only way to call it is through something like

  from request import getAuth, respond
  //  Maybe<AuthenticationData> getAuth(Request request)
  //  void respond(String response)
  from permissions import askForPermissionToDoFoo
  //  Maybe<PermissionToDoFoo> askForPermissionToDoFoo(AuthenticationData auth)

  response =
    try
      auth <- getAuth(request)
      permission <- askForPermissionToDoFoo(auth)
      doFoo(permission)
      "Success!"
    fail
      "Oopsie!"

  respond(response)
It becomes impossible to represent the invalid state of doing Foo without permission.
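
The same idea renders naturally in Go (this is my sketch, mirroring the pseudocode above; Go has no Maybe, so an error return stands in for the failure branch). The trick is that the permission type has only an unexported field, so code outside the package cannot forge one; the only way to get a PermissionToDoFoo is to pass the check:

```go
package main

import (
	"errors"
	"fmt"
)

type AuthenticationData struct{ User string }

// PermissionToDoFoo has only an unexported field, so code outside this
// package cannot construct one. The sole mint is AskForPermissionToDoFoo.
type PermissionToDoFoo struct{ granted bool }

var errDenied = errors.New("permission denied")

func AskForPermissionToDoFoo(auth AuthenticationData) (PermissionToDoFoo, error) {
	if auth.User != "alice" { // placeholder for a real policy check
		return PermissionToDoFoo{}, errDenied
	}
	return PermissionToDoFoo{granted: true}, nil
}

// DoFoo demands the permission value as proof; there is no code path
// from outside the package that reaches it without one.
func DoFoo(_ PermissionToDoFoo) {}

func handle(auth AuthenticationData) string {
	perm, err := AskForPermissionToDoFoo(auth)
	if err != nil {
		return "Oopsie!"
	}
	DoFoo(perm)
	return "Success!"
}

func main() {
	fmt.Println(handle(AuthenticationData{User: "alice"}))   // Success!
	fmt.Println(handle(AuthenticationData{User: "mallory"})) // Oopsie!
}
```

Calling DoFoo without first obtaining a permission simply doesn't compile, which is exactly the "parse, don't validate" payoff.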


This is also known as capability-based access control. It was implemented in Project Midori [1], Microsoft's managed microkernel OS that ultimately flopped.

[1] - https://en.wikipedia.org/wiki/Midori_(operating_system)


Where can I read about the bug? And what is the bug? If there is no authenticationData it is authenticated by default or what?


It was in the early days of Kubernetes and long since fixed. I don't recall the precise details, but it was likely the first official CVE we published: https://kubernetes.io/docs/reference/issues-security/officia...

Link to the patch fixing it: https://github.com/kubernetes/kubernetes/commit/7fef0a4f6a44...

Of course, we'd already fixed other issues like Kubelet listening on a secondary debug port with no authentication. Those problems stemmed from its origins as a make-it-possible hacker project and it took a while to pivot it to something usable in an enterprise.


I don't know where you can read about this, but you are on the right track.

If there is no authenticationData, then the `if !ok` branch never runs and the code continues execution as if it were authenticated.


The way software is built hasn't changed in decades.


> The way software is built hasn't changed in decades.

Correct. The only thing that has changed is the number of levels of abstraction.


As far as I could tell, its main goal was to have fun writing an OS. At that, it seems to have succeeded for a number of the people involved?

In terms of impact or business case, I'm missing what the end goal for the company or the execs involved was. It's not re-writing user-space components of AOSP, because that's all Java or Kotlin. Maybe it's a super-long-term, super-expensive effort to replace Linux underlying Android with Fuchsia? Or for ChromeOS? Again, that seems like a weird motivation to justify such a huge investment in both the team building it and a later migration effort to use it. But what else?


When I worked at $GOOG my manager left the team to work on Fuchsia and he described it as a "senior engineer retention project", but also the idea was to come up with a kernel that had a stable interface so that vendors could update their hardware more easily compared to linux.

Many things that Google did when I was there were simply hedges against other technologies, in case the time/opportunity arose. For example, they kept trying to pitch non-Intel hardware, at least partly so they could have more negotiation leverage over Intel. It's amazing how much wasted effort they created following bad ideas.


The problem with Fuchsia is it went from that to "We're taking all your headcount and rewriting your entire project on Fuchsia" and then started making deadline promises to upper management that it couldn't fulfill.

They seemed to have unlimited headcount to go rewrite the entire world to put on display assistant devices that had already shipped and succeeded with an existing software stack that Google then refused to evolve or maintain.

Fuchsia itself and the people who started it? Pretty nifty and smart. Fuchsia the project inside Google? Fuck.


  > the idea was to come up with a kernel that had a stable interface so that vendors could update their hardware more easily
Interesting... if that was a big goal, I wonder why they didn't go with some kind of adapter/meta-driver to the kernel that has a stable interface (and just maintain that).

Maybe that's not viable long-term, I guess...?


Clearly the next step after building your own CPU and SOC is to start Apple Foundry and become totally vertically integrated?


Also a myth for GCE.

From a technical perspective, App Engine and Compute Engine were built on top of internal infrastructure (borg), but did not expose borg directly. And there were a number of interesting mismatches between the semantics that customers expected of VMs and what borg offered to its containers that eventually resulted in dedicated borg clusters with different configs for cloud. And some retrospectives on whether building on borg was a better option than going bare metal directly.

Org-wise, the App Engine team was first and not part of the internal-focused Technical Infrastructure teams. GCS came next, and it too was not part of the canonical storage org. Then GCE, which was only possible because it was either written off or at least tolerated as an experiment by most, with a few key people providing behind-the-scenes support to make it happen -- especially in networking. It likely also helped that GAE was in SF and the rest of GCP in Seattle/Kirkland initially, so geo provided some insulation too.

The dominant perspective internally was that Google's technical infrastructure was its secret sauce, so why would they give it away to others? It took a long time to change that.

[Disclosure/source: I was on GCE and helped get it launched.]


If Netflix were working correctly and could handle the load, you'd absolutely be correct.

But it does seem the capacity of a hybrid system of Netflix servers plus P2P would be strictly greater than either alone? It's not an XOR.

And note that in this case of "live" streaming, it still has a few seconds of buffer, which gives a bandwidth-delay product of a few MB. That's plenty to have non-stale blocks and do torrent-style sharing.


If switching to a peer causes increased buffering (which it will, because you still have to wait for the peer to download from Netflix) then you will still have the original problem Netflix is suffering from.

If the solution to users complaining about buffering is to build a system with more inherent buffering then you are back at square one.

I think it might be helpful to look at Netflix's current system as already being a distributed video delivery system in which they control the best seeds. Adding more seeds may help, but if Netflix is underprovisioned from the start, you will have users who cannot access the streams.


Yes, the properties about scaling do hold even with near-real-time streams. [1]

The problems with using it as part of a distributed service have more to do with asymmetric connections: using all of the limited upload bandwidth causes downloads to slow. Along with firewalls.

But the biggest issue: privacy. If I'm part of the swarm, maybe that means I'm watching it?

[1]: Chainsaw: P2P streaming without trees, https://link.springer.com/chapter/10.1007/11558989_12


J. Shallit (2003). "What this country needs is an 18c piece" (PDF). Mathematical Intelligencer. 25 (2): 20–23.


What readers of your comment need is two HN thread references :)

What This Country Needs is an 18¢ Piece (2002) [pdf] - https://news.ycombinator.com/item?id=38665334 - Dec 2023 (272 comments)

What This Country Needs Is an 18¢ Piece [pdf] - https://news.ycombinator.com/item?id=14579635 - June 2017 (45 comments)
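
For anyone curious what's inside: the paper asks which extra denomination minimizes the average number of coins needed to make each amount from 1¢ to 99¢. A quick dynamic-programming sketch (my code, not Shallit's) shows the effect of adding an 18¢ coin:

```go
package main

import "fmt"

// totalCoins returns, via dynamic programming, the sum over every amount
// 1..99 cents of the minimum number of coins needed with the given
// denominations.
func totalCoins(denoms []int) int {
	const maxAmt = 99
	const inf = 1 << 30
	best := make([]int, maxAmt+1) // best[0] == 0
	total := 0
	for a := 1; a <= maxAmt; a++ {
		best[a] = inf
		for _, d := range denoms {
			if d <= a && best[a-d]+1 < best[a] {
				best[a] = best[a-d] + 1
			}
		}
		total += best[a]
	}
	return total
}

func main() {
	usual := totalCoins([]int{1, 5, 10, 25})
	with18 := totalCoins([]int{1, 5, 10, 18, 25})
	fmt.Printf("avg without 18c: %.2f coins\n", float64(usual)/99)
	fmt.Printf("avg with 18c:    %.2f coins\n", float64(with18)/99)
}
```

Even without reading the paper, the inequality is easy to see: 18¢ itself drops from five coins (10+5+1+1+1) to one.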


Willow appears closer to a "Protocol Construction Kit" than a protocol itself.

As a construction kit, it has value for people who want to make protocols where they'll control both ends, but don't have to re-implement basic table stakes.

