
How is a graph database built under the hood?

I know that a decent RDBMS (simplified) will consist of the following:

- data in blocks organised with a block-size that the underlying filesystem likes

- a cache for the most frequently used blocks

- every index is a B-Tree with pointers to the blocks containing the tuples

Then there are column stores as well as row stores, and for compression you might have some dictionary encoding going on.

Now, what does a graph database look like under the hood, and what complexities are involved? How is the graph persisted?



Graph databases are built in a lot of different ways; for example, Neo4j's architecture is very different from something like an RDF triple store, or DataStax on top of Cassandra.

[Neo4j internals can be seen here](https://www.slideshare.net/thobe/an-overview-of-neo4j-intern...). It's a bit old, but I think it's mostly still accurate.

In graphs you have to persist nodes and edges, though you may partition nodes by label/category. In the case of Neo4j, properties are kept in a separate property store rather than in a set of columns.
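To make the "separate stores" idea concrete, here is a rough sketch of a node store and a property store built from fixed-size records, where a record ID maps straight to a byte offset. The layouts (`NODE`, `PROP`), the `RecordStore` class, and the 16-byte key/value fields are all made up for illustration; this is not Neo4j's actual on-disk format, and a `bytearray` stands in for the file.

```python
import struct

NO_ID = -1  # hypothetical sentinel meaning "no record"

# Hypothetical fixed-size layouts, little-endian, no padding.
NODE = struct.Struct("<qq")        # first_rel_id, first_prop_id
PROP = struct.Struct("<16s16sq")   # key, value, next_prop_id


class RecordStore:
    """Append-only store of fixed-size records (bytearray stands in for a file)."""

    def __init__(self, layout):
        self.layout = layout
        self.buf = bytearray()

    def append(self, *fields):
        # The record ID is just the record's position in the store.
        record_id = len(self.buf) // self.layout.size
        self.buf += self.layout.pack(*fields)
        return record_id

    def read(self, record_id):
        # Fixed-size records: offset = id * size, so no index lookup is needed.
        offset = record_id * self.layout.size
        return self.layout.unpack_from(self.buf, offset)


def node_properties(nodes, props, node_id):
    """Follow the node's property chain through the property store."""
    _first_rel, prop_id = nodes.read(node_id)
    result = {}
    while prop_id != NO_ID:
        key, value, prop_id = props.read(prop_id)
        # struct pads short byte strings with NULs; strip them again.
        result[key.rstrip(b"\0").decode()] = value.rstrip(b"\0").decode()
    return result


nodes = RecordStore(NODE)
props = RecordStore(PROP)
age = props.append(b"age", b"30", NO_ID)
name = props.append(b"name", b"Alice", age)   # chain: name -> age
alice = nodes.append(NO_ID, name)
print(node_properties(nodes, props, alice))
```

The point of the sketch is that a node record only holds a pointer to its first property record, and each property record points to the next one, so properties of different shapes can hang off any node without a fixed column set.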


Thanks, very helpful. I am just looking at it and will have a bit of a think about this later :)


A graph database is similar, except that it links connected entities through direct record IDs rather than through indexes. So instead of doing joins via index lookups, it follows record pointers during graph traversals.
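A minimal sketch of what "following record pointers" can look like, assuming each node record stores the ID of its first relationship and each relationship record stores the ID of the next one in the same node's chain. The `Graph`, `Node`, and `Rel` names are hypothetical, only outgoing relationships are tracked, and in-memory lists stand in for the record stores.

```python
from dataclasses import dataclass, field

NO_ID = -1  # hypothetical sentinel meaning "no record"


@dataclass
class Node:
    name: str
    first_rel: int = NO_ID   # ID of the first outgoing relationship record


@dataclass
class Rel:
    start: int
    end: int
    next_rel: int            # next relationship in the start node's chain


@dataclass
class Graph:
    nodes: list = field(default_factory=list)
    rels: list = field(default_factory=list)

    def add_node(self, name):
        self.nodes.append(Node(name))
        return len(self.nodes) - 1   # record ID = position in the store

    def add_rel(self, start, end):
        rel_id = len(self.rels)
        # Prepend the new relationship to the start node's chain.
        self.rels.append(Rel(start, end, self.nodes[start].first_rel))
        self.nodes[start].first_rel = rel_id
        return rel_id

    def neighbours(self, node_id):
        """Walk the relationship chain by record ID; no index is consulted."""
        rel_id = self.nodes[node_id].first_rel
        while rel_id != NO_ID:
            rel = self.rels[rel_id]
            yield rel.end
            rel_id = rel.next_rel


g = Graph()
alice, bob, carol = g.add_node("alice"), g.add_node("bob"), g.add_node("carol")
g.add_rel(alice, bob)
g.add_rel(alice, carol)
print([g.nodes[n].name for n in g.neighbours(alice)])
```

Each hop costs one direct record read per relationship of the current node, regardless of how many nodes or relationships exist overall, whereas a join through a B-Tree index pays a lookup that grows with the size of the index.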

The Graph Databases book (graphdatabases.com) has a chapter on the internal architecture of Neo4j.



