Checkpoint only allows specifying that all packages as they existed on a particular data in CRAN be downloaded and utilized within a project. This is limiting in that an R developer may want to utilize very specific versions of packages that span multiple epochs.
I would really like to see R develop functionality akin to Maven or SBT, such that an R developer can explicitly specify the exact versions of all dependencies, which will then be installed at the first run.
I disagree with you on the statement that Neo4j does not work well in life sciences. I am a data scientist building large scale systems for mining genomic data, and we built a fairly critical piece of that infrastructure around Neo4j. I actually presented an overview of that work at GraphConnect this week:
Many meaningful lineages in life sciences can be hundreds to thousands of levels deep (our datasets are great examples). Neo4j is the only graph database I have evaluated that handles traversals across lineages of this depth while still achieving the performance scalability promised by maintaining index-free adjacency across which ever node in the cluster a traversal is sent to.
I am just going to point to our work at sparql.uniprot.org.
A graph database with 17 billion edges and 3 billion+ nodes.
Containing in its whole the NCBI tax and GO tax trees. That you can access for free over HTTP using standard SPARQL 1.1.
This does not run on a cluster but single nodes with Virtuoso 7.2.1.
I am not saying that Neo4J is a bad choice, I am just saying that it due to its lack of federation support it is an expensive choice for the life sciences. i.e. an economic argument over a technical one, and not even looking at 1 project a time but in general for the community. Neo4J and Cypher will never support federation in the way that SPARQL allows. This is because all this URI business in RDF is annoying when modelling your data but critical when merging datasets on demand between separate databases. e.g. joining ChEMBL & UniProt & MeSH & PubChem etc...
We in the life sciences rarely do graph traversals for graph traversal sake, but tend to join trees. e.g. intersect a branch of a taxonomic tree with a branch of the GO tree. There are cases where real graph traversals are being done (assembly&variation graphs).
OpenCypher is a great step forward. Now Neo4J needs a open public standard for serializing graphs to disk that can imported into Neo4J and other databases. RDF being supported by so many different databases allows us to support many more of our users (at UniProt) even if they don't use SPARQL or our choice of Graph database themselves.