Used stubby at Google (mainly java), and was intimidated first, then saw the lig...

jeffbee · on Oct 30, 2020

Unfortch, gRPC brings none of these things. If you want delegated caller, originator, trace ID, or any other kind of baggage propagated down your RPC call graph, you are doing it yourself with metadata injectors and extractors at every service boundary.

gen220 · on Oct 30, 2020

Depending on your perspective, though, this can be seen as a positive thing: gRPC is extensible enough that all of this can be built on top.

I'm sure that in 10 years, there will be more concepts like "trace IDs" that we will consider minimally necessary for distributed service architectures, that don't exist today.

FWIW, writing the libs to do the metadata injection/extraction is pretty straightforward and transparent to application developers if they're done right.

llimllib · on Oct 30, 2020

Here's my real-world grpc request ID propagation middleware in go, it's extremely simple: https://gist.github.com/llimllib/d0840eaee14411a50201960615d...

this function gets called a couple hundred million times per week, and has never failed as far as I know

gen220 · on Oct 30, 2020

Yep! The internal libs at our company look very very similar.

You can also do fun middleware things like rate limiting and ACLs, but I haven't seen those in the wild.

If somebody has links to examples of those, please share. :)

jrockway · on Oct 30, 2020

I wrote this authorization code last night: https://github.com/jrockway/jsso2/blob/master/pkg/internalau...

Obviously it's quite natural to just add interceptors as you need them and no doubt there are hundreds of things like this across the Internet.

To some extent, I can't get over how much of a mess you can make by doing things like this. Because generated service definitions have a fixed schema (func (s Service) Method(context.Context, Request) (Reply, error)), you have to resort to hacks like propagating the current user through the context, instead of the easy-to-read and easy-to-test alternative of just passing in the parameters explicitly, as in func (s Service) Method(context.Context, types.Session, Request) (Reply, error). If I was going to spend time on infrastructure, that's the thing I'd fix first.

Some other frameworks do a little better here. We use GraphQL at work, and the methods are typically of the pattern:

    func AutogeneratedMethod(ctx context.Context, ...) {
       foo := MustGetFoo(ctx)
       bar := MustGetBar(ctx)
       return ActualThingThatDoesTheWork(ctx, foo, bar, ...)
    }

This makes testing the Actual Thing That Does The Work easier, and the reader of that method knows exactly what sort of state the result depends on (the most important goal in my opinion).

sneak · on Oct 31, 2020

I assume your Must functions panic. Is panic-ing in a handler considered okay? I was under the impression that using panics as exceptions in this way was un-idiomatic and thus frowned upon.

If it’s the common practice then I’ll happily jump on board, I miss exceptions. I’m just curious because I’m starting to do more web stuff with Go and had been treating handler recovery as something I should endeavor to never touch, as a final guardrail to keep a server process from exiting.

jrockway · on Oct 31, 2020

Generally I agree with you and tend to avoid writing or calling functions that can panic. I make a slight exception in this case because it will "never" hit the error case; you are "always" calling the RPC through an interceptor that adds the necessary value to the context, so your MustGet functions will "always" work. I use "quotes" around never and always because ... sometimes someone edits the code to delete this invariant, and it is very easy to do that. But, it explodes loudly when it panics, so at least you can go in and fix the problem without much effort. (It would be really insidious to return an error and not check it -- some code three functions deep would then blow up with a nil pointer exception; or to return a subtly-not-working default value that causes subtly broken behavior that's hard to detect. Panics are preferred because the program crashes at the exact line of code that's broken.)

For any case where the error would depend on runtime input, rather than compile-time structure, you should return and check an error. If hitting the error case means "we need to edit the code and release a new version", panic is a fine tool to surface that problem.

All in all, I'd basically be happy either way. If you make your GetFoo function return an error that all the handlers check, I'd approve that PR. If you judiciously panic when something that can "never" happen happens, I'd approve that PR. The obvious preference is to update the codegen to pass those parameters in explicitly, so that if the interceptor structure changes, the program will simply not compile. But, the design of things like net/http and grpc kind of precludes easily doing that, and I'm not sure it's worth the effort to write something on top of those that fixes the problem, when it's simple enough to do the unpacking manually and your integration tests test it for free anyway.

sneak · on Nov 1, 2020

That’s a wonderful explanation, thank you.

jrockway · on Oct 30, 2020

Yeah, I think you just need to invest a small amount of time to set up your clients and servers, and you reap the benefits for a long time.

I use https://github.com/jrockway/opinionated-server as the foundation of my server-side apps. I thought about monitoring, tracing, and logging once... and now get it for free in every subsequent project. I explode the app into my cluster, and traces are immediately available in Jaeger, I can read/analyze my logs, and it's all a pleasure to debug. I think everyone should just have their own thing like this, because it's definitely a pain to do it more than once, but a joy once you have it working.

(My thing specifically is missing some things I now need, though, like multiple loggers so you can turn on/off rpc tracing at runtime, support for auto-refreshing TLS certs from a volume mount now that everything is mTLS, etc. Will be added when I am less lazy ;)

While I'm here I'll also plug https://github.com/jrockway/json-logs for making the structured logs enjoyable to read. You can query them with jq, it understands all the popular JSON log formats, and I can't live without it. It's the only client-side app I've ever written that passes the "toothbrush test" -- I use it twice a day, every day. Recommended ;) (Someday I will write docs and an automated release process.)

jeffbee · on Oct 30, 2020

There are always two sides to extensibility. On the one hand, you have the opportunity to do it your way. On the other, you have to do it. The extreme of extensibility is always an empty file. You get to pick the language and the architecture and everything!

gravypod · on Oct 31, 2020

I've started implementing some of this at my job. We have an internal proto that describes the config of a gRPC service. We then have a library for all of our languages that turns that proto into a Channel that's instrumented with everything from middlewares to low level socket settings (keep alive, idle timeouts, retries, hedging, etc). Makes our deployments super easy as well since every service is configured to talk to every other service in almost a 100% identical way.

malkia · on Oct 30, 2020

I've only used grpc for C# talking to C++ and wanted to plug this in, saw other users posting about injectors - so gonna look into it :) - I really want to avoid yet another proxy/"service-mesh" process for something that could be done on the client side (I wish I could inject the http2 library to propagate this always with headers somehow, but it's not a single thing, also too low level for me - Tools/app developer pretty much)

random3 · on Oct 30, 2020

jeffbee, actually, you do and it works well, you don't have to do more than use a library. You get it out of the box in GCP. It works with OpenTelemetry / OpenCensus. Dapper was not embedded in Stubby either.