
Used Stubby at Google (mainly Java), and was intimidated at first, then saw the light: when almost everything talks the same way, not only do you get C++, Java, Python, Go and other languages speaking freely to each other, you also get extra benefits. For example, each RPC can carry as a "tag" (key/value?) the user/group it came from, and this can be used for budgeting:

For example - your internal backend A calls service B, which then calls some other service C. It's easy to log that A called B, but the fact that C was called because of what A asked is not. If that is propagated (through the tag), then C can report: well, I was called by B, but that was on A to pay.

Then Dapper, their distributed tracing, was helpful the few times I had to deal with oncall (actually them asking me to do it). And in general, it felt like you don't have to write any low-level socket code (which I love).



Unfortch, gRPC brings none of these things. If you want delegated caller, originator, trace ID, or any other kind of baggage propagated down your RPC call graph, you are doing it yourself with metadata injectors and extractors at every service boundary.


Depending on your perspective, though, this can be seen as a positive thing: gRPC is extensible enough that all of this can be built on top.

I'm sure that in 10 years, there will be more concepts like "trace IDs" that we will consider minimally necessary for distributed service architectures, that don't exist today.

FWIW, writing the libs to do the metadata injection/extraction is pretty straightforward and transparent to application developers if they're done right.


Here's my real-world gRPC request ID propagation middleware in Go; it's extremely simple: https://gist.github.com/llimllib/d0840eaee14411a50201960615d...

This function gets called a couple hundred million times per week, and has never failed as far as I know.


Yep! The internal libs at our company look very very similar.

You can also do fun middleware things like rate limiting and ACLs, but I haven't seen those in the wild.

If somebody has links to examples of those, please share. :)


I wrote this authorization code last night: https://github.com/jrockway/jsso2/blob/master/pkg/internalau...

Obviously it's quite natural to just add interceptors as you need them and no doubt there are hundreds of things like this across the Internet.

To some extent, I can't get over how much of a mess you can make by doing things like this. Because generated service definitions have a fixed schema (func (s Service) Method(context.Context, Request) (Reply, error)), you have to resort to hacks like propagating the current user through the context, instead of the easy-to-read and easy-to-test alternative of just passing in the parameters explicitly, as in func (s Service) Method(context.Context, types.Session, Request) (Reply, error). If I was going to spend time on infrastructure, that's the thing I'd fix first.

Some other frameworks do a little better here. We use GraphQL at work, and the methods are typically of the pattern:

    func AutogeneratedMethod(ctx context.Context, ...) {
       foo := MustGetFoo(ctx)
       bar := MustGetBar(ctx)
       return ActualThingThatDoesTheWork(ctx, foo, bar, ...)
    }
This makes testing the Actual Thing That Does The Work easier, and the reader of that method knows exactly what sort of state the result depends on (the most important goal in my opinion).


I assume your Must functions panic. Is panic-ing in a handler considered okay? I was under the impression that using panics as exceptions in this way was un-idiomatic and thus frowned upon.

If it’s the common practice then I’ll happily jump on board, I miss exceptions. I’m just curious because I’m starting to do more web stuff with Go and had been treating handler recovery as something I should endeavor to never touch, as a final guardrail to keep a server process from exiting.


Generally I agree with you and tend to avoid writing or calling functions that can panic. I make a slight exception in this case because it will "never" hit the error case; you are "always" calling the RPC through an interceptor that adds the necessary value to the context, so your MustGet functions will "always" work. I use "quotes" around never and always because ... sometimes someone edits the code to delete this invariant, and it is very easy to do that. But it explodes loudly when it panics, so at least you can go in and fix the problem without much effort. (It would be really insidious to return an error and not check it -- some code three functions deep would then blow up with a nil pointer dereference; or to return a subtly-not-working default value that causes subtly broken behavior that's hard to detect. Panics are preferred because the program crashes at the exact line of code that's broken.)

For any case where the error would depend on runtime input, rather than compile-time structure, you should return and check an error. If hitting the error case means "we need to edit the code and release a new version", panic is a fine tool to surface that problem.

All in all, I'd basically be happy either way. If you make your GetFoo function return an error that all the handlers check, I'd approve that PR. If you judiciously panic when something that can "never" happen happens, I'd approve that PR. The obvious preference is to update the codegen to pass those parameters in explicitly, so that if the interceptor structure changes, the program will simply not compile. But the design of things like net/http and grpc kind of precludes easily doing that, and I'm not sure it's worth the effort to write something on top of those that fixes the problem, when it's simple enough to do the unpacking manually and your integration tests test it for free anyway.


That’s a wonderful explanation, thank you.


Yeah, I think you just need to invest a small amount of time to set up your clients and servers, and you reap the benefits for a long time.

I use https://github.com/jrockway/opinionated-server as the foundation of my server-side apps. I thought about monitoring, tracing, and logging once... and now get it for free in every subsequent project. I explode the app into my cluster, and traces are immediately available in Jaeger, I can read/analyze my logs, and it's all a pleasure to debug. I think everyone should just have their own thing like this, because it's definitely a pain to do it more than once, but a joy once you have it working.

(My thing specifically is missing some things I now need, though, like multiple loggers so you can turn on/off rpc tracing at runtime, support for auto-refreshing TLS certs from a volume mount now that everything is mTLS, etc. Will be added when I am less lazy ;)

While I'm here I'll also plug https://github.com/jrockway/json-logs for making the structured logs enjoyable to read. You can query them with jq, it understands all the popular JSON log formats, and I can't live without it. It's the only client-side app I've ever written that passes the "toothbrush test" -- I use it twice a day, every day. Recommended ;) (Someday I will write docs and an automated release process.)


There are always two sides to extensibility. On the one hand, you have the opportunity to do it your way. On the other, you have to do it. The extreme of extensibility is always an empty file. You get to pick the language and the architecture and everything!


I've started implementing some of this at my job. We have an internal proto that describes the config of a gRPC service. We then have a library for all of our languages that turns that proto into a Channel that's instrumented with everything from middlewares to low level socket settings (keep alive, idle timeouts, retries, hedging, etc). Makes our deployments super easy as well since every service is configured to talk to every other service in almost a 100% identical way.


I've only used gRPC for C# talking to C++ and wanted to plug this in. Saw other users posting about injectors, so gonna look into it :) - I really want to avoid yet another proxy/"service-mesh" process for something that could be done on the client side (I wish I could inject the http2 library to propagate this always with headers somehow, but it's not a single thing, and also too low level for me - Tools/app developer pretty much).


jeffbee, actually, you do, and it works well: you don't have to do more than use a library. You get it out of the box in GCP. It works with OpenTelemetry / OpenCensus. Dapper was not embedded in Stubby either.



