Graph databases are built in many different ways. Neo4j's architecture, for example, is very different from that of an RDF triple store or of DataStax on top of Cassandra.
In a graph you have to persist nodes and edges, though you may partition nodes by label/category. In Neo4j, properties live in a separate property store rather than in a set of columns.
A graph database is similar to a relational database, except that it links connected entities through direct record IDs rather than through indexes.
So instead of joining via index lookups, it follows record pointers during graph traversals.
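To make the pointer-following idea concrete, here is a toy sketch in Python. It is not Neo4j's actual on-disk format; the names (`ToyGraphStore`, `NodeRecord`, `RelRecord`, `NO_REC`) are made up for illustration, and plain lists stand in for fixed-size record files. The only assumption is that a record ID is an array position, so resolving a "pointer" is a direct offset lookup rather than an index search.

```python
from dataclasses import dataclass
from typing import Iterator, List

NO_REC = -1  # hypothetical sentinel meaning "no further record"


@dataclass
class NodeRecord:
    # ID of the first relationship record in this node's chain
    first_rel: int = NO_REC


@dataclass
class RelRecord:
    start: int
    end: int
    # Next relationship in the chain of the start node / end node
    next_for_start: int = NO_REC
    next_for_end: int = NO_REC


class ToyGraphStore:
    """Lists stand in for record files; list position == record ID."""

    def __init__(self) -> None:
        self.nodes: List[NodeRecord] = []
        self.rels: List[RelRecord] = []

    def add_node(self) -> int:
        self.nodes.append(NodeRecord())
        return len(self.nodes) - 1

    def add_rel(self, start: int, end: int) -> int:
        rel_id = len(self.rels)
        # Prepend the new record to both endpoint chains.
        self.rels.append(RelRecord(
            start, end,
            next_for_start=self.nodes[start].first_rel,
            next_for_end=self.nodes[end].first_rel,
        ))
        self.nodes[start].first_rel = rel_id
        self.nodes[end].first_rel = rel_id
        return rel_id

    def neighbors(self, node_id: int) -> Iterator[int]:
        # Traversal only chases record IDs; no index is consulted.
        rel_id = self.nodes[node_id].first_rel
        while rel_id != NO_REC:
            rel = self.rels[rel_id]
            if rel.start == node_id:
                yield rel.end
                rel_id = rel.next_for_start
            else:
                yield rel.start
                rel_id = rel.next_for_end


store = ToyGraphStore()
a, b, c = store.add_node(), store.add_node(), store.add_node()
store.add_rel(a, b)
store.add_rel(a, c)
store.add_rel(b, c)
print(sorted(store.neighbors(a)))
```

In this sketch the cost of expanding a node depends on how many relationships it has, not on the total size of the graph, which is the point of replacing index joins with record pointers.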
The Graph Databases book (graphdatabases.com) has a chapter on the internal architecture of Neo4j.
I know that a decent RDBMS (simplified) will consist of the following:
- data in blocks organised with a block-size that the underlying filesystem likes
- a cache for the most frequently used blocks
- every index is a B-Tree with pointers to the blocks containing the tuples
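The three pieces listed above can be sketched roughly as follows. This is a deliberately simplified Python model, not how any particular RDBMS is implemented: `ToyTable` and `BLOCK_SIZE` are hypothetical names, a sorted list searched with `bisect` stands in for the B-Tree, and the cache uses LRU eviction as a simple approximation of "most frequently used".

```python
import bisect
from collections import OrderedDict

BLOCK_SIZE = 4  # tuples per block (hypothetical; real systems use bytes)


class ToyTable:
    def __init__(self, rows, cache_blocks=2):
        # "Disk": rows of (key, value) chopped into fixed-size blocks.
        self.disk = [rows[i:i + BLOCK_SIZE]
                     for i in range(0, len(rows), BLOCK_SIZE)]
        # Index: sorted (key, block_no, slot) entries, standing in for
        # a B-Tree whose leaves point at the blocks holding the tuples.
        self.index = sorted(
            (key, block_no, slot)
            for block_no, block in enumerate(self.disk)
            for slot, (key, _value) in enumerate(block)
        )
        self.keys = [entry[0] for entry in self.index]
        self.cache = OrderedDict()  # block_no -> block, in LRU order
        self.cache_blocks = cache_blocks
        self.disk_reads = 0

    def _read_block(self, block_no):
        if block_no in self.cache:
            self.cache.move_to_end(block_no)  # mark as recently used
            return self.cache[block_no]
        self.disk_reads += 1  # cache miss: simulate a disk read
        block = self.disk[block_no]
        self.cache[block_no] = block
        if len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict least recently used
        return block

    def lookup(self, key):
        # Binary search in the index, then fetch the referenced block.
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            return None
        _key, block_no, slot = self.index[i]
        return self._read_block(block_no)[slot][1]


table = ToyTable([(k, f"row{k}") for k in (7, 3, 9, 1, 5, 8, 2, 6)])
print(table.lookup(9), table.lookup(1), table.disk_reads)
```

Both lookups land in the same block, so the second one is served from the cache without another simulated disk read.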
Then there are column stores as well as row stores, and for compression you might have some dictionary encoding going on.
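As a small illustration of the dictionary encoding mentioned here, the following Python sketch replaces each distinct value in a column with a small integer code. The function names are made up for this example, and real column stores layer further tricks (bit-packing, run-length encoding) on top of this.

```python
def dictionary_encode(column):
    """Return (dictionary, codes): each distinct value is stored once,
    and the column itself becomes a list of integer codes."""
    dictionary = []  # code -> value
    code_of = {}     # value -> code
    codes = []
    for value in column:
        code = code_of.get(value)
        if code is None:
            code = len(dictionary)
            code_of[value] = code
            dictionary.append(value)
        codes.append(code)
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


country = ["DE", "US", "DE", "FR", "US", "DE"]
dictionary, codes = dictionary_encode(country)
print(dictionary, codes)
```

The saving comes from columns with few distinct values: long strings are stored once in the dictionary, and the column shrinks to a sequence of small integers.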
Now, what does a graph database look like under the hood, and what are the complexities involved? How is the graph persisted?