
If I'm reading it right, the directory service today is a single host. That was very misleading after these statements (which suggested something closer to Netflix Eureka):

"Zookeeper provides a strongly consistent model; the directory service focuses on availability."

"Baker Street doesn't use Zookeeper or the other popular service discovery frameworks because we wanted a simple, highly available service, not a strongly consistent one."

Edit: Which is not to say that the project isn't interesting, just that some of the copy felt like a bait and switch. :)



I had the same reaction, and I have severe reservations about availability-focused service location. The potential for firing traffic at the wrong nodes and having it dropped on the floor is a real red flag for me. When a directory service fails outright because it can't guarantee consistency, an application can, if not trivially, at least reliably cache requests to be pushed later, once the health of the overall architecture can be established.
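To make the "cache requests and push them later" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `DeferredSender`, `lookup`, and `send` are hypothetical names, not part of Baker Street or ZooKeeper, and `lookup` is assumed to raise `LookupError` when the directory cannot give an answer.

```python
from collections import deque


class DeferredSender:
    """Queue requests while the directory is unavailable; replay them in order later."""

    def __init__(self, lookup, send):
        self.lookup = lookup    # hypothetical: returns an endpoint or raises LookupError
        self.send = send        # hypothetical: send(endpoint, request)
        self.pending = deque()  # requests not yet delivered, oldest first

    def submit(self, request):
        """Enqueue a request and try to drain the queue right away."""
        self.pending.append(request)
        return self.flush()

    def flush(self):
        """Deliver queued requests until the directory stops answering.

        Returns the number of requests delivered by this call.
        """
        sent = 0
        while self.pending:
            try:
                endpoint = self.lookup()
            except LookupError:
                break  # directory still down: keep the rest queued
            self.send(endpoint, self.pending[0])
            self.pending.popleft()  # drop only after send returned without raising
            sent += 1
        return sent
```

The sketch only works because the failure is visible: the directory says "I don't know" instead of handing back a wrong node, which is the distinction the comment above is drawing.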


In a distributed architecture it is very difficult to avoid the possibility you mention even with a strongly consistent store at the center of your service discovery mechanism. The consistency the store provides doesn't necessarily extend to the operational state of your system.

For example, your ZooKeeper nodes may all be consistent with each other, but given that a server can fail at any time, that information, while consistent, may still be stale. Likewise, if a client is caching connections outside of ZooKeeper's consensus mechanism, then these connections will also become stale in the face of changes.

Given these possibilities, there is always the potential for traffic to be dropped on the floor regardless of how consistent your store is, so ultimately what matters is how to minimize the probability of this occurring and whether your system can cope when it does.
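One way a client can cope is to treat whatever the directory returns as a hint rather than the truth, and fail over when an entry turns out to be stale. A rough sketch follows; `call_with_failover`, `fake_send`, and the addresses are made up for illustration and don't come from any particular discovery framework.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def call_with_failover(endpoints: Iterable[str], send: Callable[[str], T]) -> T:
    """Try each discovered endpoint in turn until one accepts the request."""
    last_error = None
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc  # stale or dead entry: move on to the next one
    raise ConnectionError("no reachable endpoint") from last_error


def fake_send(endpoint: str) -> str:
    """Stand-in transport: the first address simulates a stale directory entry."""
    if endpoint == "10.0.0.1:8080":
        raise ConnectionError("stale entry")
    return f"ok from {endpoint}"


result = call_with_failover(["10.0.0.1:8080", "10.0.0.2:8080"], fake_send)
```

This doesn't remove the window in which traffic hits a dead node; it just makes hitting one cheap to recover from, whichever kind of store sits behind the directory.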


We didn't intend to do a bait and switch. We mentioned this in the docs, but perhaps it was a little too buried. Our plan is to support multiple instances of the directory server for high availability. This is similar in principle to how systems like DNS or NSQ function.


Yep, I did eventually find that. Having to search for it was frustrating; so much of the copy is devoted to describing what Baker Street isn't (hey, doesn't use consensus!) and not what it is (uses a single node, TODO: master/slaves or chain replication or blah blah blah). And it's kind of an important point, because it changes this from "might give this a go for a less critical service" to "unusable in the short term."


It's a fair point, so we'll clarify this (and we're working on the replication bit too). Thanks!



