
> Usually people find that there are simpler/better ways to get the job done.

CRDTs can do one magic trick no other technology can do: they let users collaboratively edit without needing a centralized server. And this works between users, between devices or between processes.

It's just that, most of the time, that doesn't matter. Servers are cheap, and big companies frankly seem to love it when we're dependent on their infrastructure.

I want CRDTs for more "personal computing". I want my data to be shared between my devices seamlessly. And I want that to keep working even when I can't contact Google's computers. I also really like something about Git, which is the fact that my computers are first-class citizens in a repository. Unlike with Google Docs, if GitHub ever goes down, I don't lose any data and I can still get my work done and collaborate with my teammates.

All software should work like that. I'm fighting to make CRDTs work well so we can bring that future within reach.



To OP's point, which I secretly wished I had the courage to post when CRDTs came up a couple times last week: I've been seeing these posts, and your comment, for what feels like my entire 13 years on HN. (That can't be true. But I distinctly recall someone saying "if they can do text this fast they'll be able to do arbitrary JSON soon!" at least 5 years ago.)

At the beginning of that period I built and sold a startup based on a dead simple sync strategy, by myself, while I watched 3 funded competitors try to build sync, fail and go under. I had no funding and no CS education; I was a waiter.


Arbitrary JSON has been done plenty of times, and it works great. Automerge, Yjs and ShareDB all manage it.


I don't quite understand what's going on then, because I swear I just read in one of those threads last week that even the best text automerger still needs manual intervention to avoid creating nonsense... which, to be honest, sounded off, because I thought this whole thing was popularized by Google Docs (or Wave?) and it didn't have that issue.


Can you link the other comment? It's hard to know what you're talking about without that context.

In any case, you can't use a text "automerger" to collaboratively edit JSON text. (What's an "automerger"?)

To understand why, suppose we have an empty (JSON) list: []. Two users concurrently insert two items, "a" and "b". If those inserts get translated into string inserts at the same offset, you end up with a JSON string containing ["a""b"] - which is a syntax error!

The trick to making this work correctly is that the CRDT / OT system needs to understand that you're editing a list. Don't collaboratively edit the JSON string. Collaboratively edit a list then JSON.stringify the result.
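
Here's a rough sketch of the difference in Python. It is not a real CRDT: the merge functions are toy stand-ins I made up for illustration (merge_text_inserts and merge_list_inserts are hypothetical names, not from Automerge, Yjs or ShareDB), and ties are broken by argument order where a real system would use something like unique IDs. It only shows why merging at the string level breaks and merging at the list level doesn't:

```python
import json


def merge_text_inserts(base, inserts):
    """Apply concurrent (offset, text) inserts to a string.

    Offsets refer to the original string. Ties keep the given order,
    standing in for whatever tie-break a real CRDT/OT system would use.
    """
    out, shift = base, 0
    for offset, text in sorted(inserts, key=lambda op: op[0]):
        out = out[:offset + shift] + text + out[offset + shift:]
        shift += len(text)
    return out


def merge_list_inserts(base, inserts):
    """Apply concurrent (index, item) inserts to a list, same tie-break."""
    out, shift = list(base), 0
    for index, item in sorted(inserts, key=lambda op: op[0]):
        out.insert(index + shift, item)
        shift += 1
    return out


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Both users start from the same empty list and insert concurrently.
# String level: each user inserts their quoted item at offset 1 of "[]".
text_result = merge_text_inserts("[]", [(1, '"a"'), (1, '"b"')])

# List level: each user inserts their item at index 0 of [].
list_result = merge_list_inserts([], [(0, "a"), (0, "b")])
list_json = json.dumps(list_result)

print(text_result, is_valid_json(text_result))
print(list_json, is_valid_json(list_json))
```

The string-level merge has no idea a comma is needed between the two items, so its output fails to parse. The list-level merge only ever produces a list, and serializing a list always gives valid JSON.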


There's a possibility he's talking about what I wrote about CRDTs for syntax trees - https://writer.zohopublic.com/writer/published/grcwy5c699d67...

Related HN thread: https://news.ycombinator.com/item?id=29433896


> the best text automerger still needs manual intervention to avoid creating nonsense

AFAIK the best production-ready CRDT solution for text is Yjs [0], and in some specific cases the result of a merge might be something unexpected (one example [1]). However, there is a research-quality CRDT called Peritext [2] which is specifically designed to handle styled text in an intuitive way, so the merge is more predictable.

[0]: https://github.com/yjs/yjs

[1]: https://github.com/yjs/yjs/issues/382

[2]: https://www.inkandswitch.com/peritext/


I don't think this comment makes much sense: arbitrary Unicode text is very different from JSON. What application domain are you talking about?


> Servers are cheap, and big companies frankly seem to love it when we're dependent on their infrastructure.

I've worked on a few different sync engines for mobile apps over the years, and it's also a matter of user expectations in consumer apps. Users expect to seamlessly pick up where they left off on another device, and the absence of an intermediary server makes that difficult.


You can always have an intermediary server with CRDTs if you want. It's just another peer on the network.
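
A minimal sketch of what I mean, assuming a toy grow-only set as the CRDT (GSetReplica and sync are made-up names for illustration, not any library's API). The server runs exactly the same code as the devices; it just happens to be always online:

```python
class GSetReplica:
    """A replica of a grow-only set, one of the simplest state-based CRDTs."""

    def __init__(self, name):
        self.name = name
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Set union is commutative, associative and idempotent,
        # so replicas converge regardless of sync order.
        self.items |= other.items


def sync(a, b):
    """Exchange state in both directions between two replicas."""
    a.merge(b)
    b.merge(a)


laptop = GSetReplica("laptop")
phone = GSetReplica("phone")
server = GSetReplica("server")  # same class, no special role

laptop.add("note from laptop")
phone.add("note from phone")

# The devices never talk to each other directly; the server relays.
sync(laptop, server)
sync(phone, server)
sync(laptop, server)

print(laptop.items == phone.items == server.items)
```

If the server goes away, calling sync(laptop, phone) directly gets the two devices to the same state. Nothing in the merge logic knows or cares which peer is "the server".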


Yeah I'm aware - I got the impression from your comment you were advocating for a more pure P2P approach (not necessarily CRDT related).


I am. I guess that's something I really like about Git + GitHub - you get the best of both worlds. GitHub is fantastic, and you can use GitHub all day long. But you don't depend on it. If GitHub had an outage like the one Atlassian is having at the moment, there's no downtime. There's no risk of data loss. You can keep working, and if you need to you can always move to GitLab or self-host.

The downside is you sort of have the worst of both worlds in terms of complexity. You need a working, performant CRDT and a working centralized server. But I think it's a great model that we (as an industry) should explore more.


Yeah, I agree it's great from that perspective, but the business goals very rarely make it a priority given the added complexity. Even supporting offline-first style workflows is a step up: you can tolerate more downtime on the backend because your app doesn't become completely useless.



