Large graph: how to structure it for Grakn


Hi all,

I’m working on a fairly large graph (17M nodes), and I’d love to structure it as a 3M-node ontology plus 13M instance nodes. Is 3M nodes too many for the ontology?

The alternative is a 200-node ontology with more relationships than I have right now… but that wouldn’t be as handy.

I’d love to test and compare both approaches, but I won’t have time to benchmark queries until mid-June. Until then, what’s your take on this?

Thanks!


Hi @merqurio,

A 17 million node graph is perfectly fine, not large at all for Grakn. :wink:
13 million instance nodes are therefore also fine.

Now for the 3 million ontology nodes … :slight_smile:
a) In terms of size, it is fine. No one has ever asked this before, but the limit on ontology size in Grakn is 2^31 types.
b) In terms of writing it, though, I don’t think you will be writing this by hand. I assume you’re going to migrate an OWL ontology? Are you migrating SNOMED to Grakn?
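
To put a) in perspective, here is a quick back-of-the-envelope check in Python. It takes the 2^31 figure quoted above at face value; the helper name `ontology_headroom` is made up for illustration and is not part of any Grakn API.

```python
# Type limit as quoted in this post (2^31); an assumption taken from the thread.
TYPE_LIMIT = 2 ** 31


def ontology_headroom(num_types: int, limit: int = TYPE_LIMIT) -> float:
    """Return the fraction of the type limit that num_types would consume."""
    if num_types < 0:
        raise ValueError("num_types must be non-negative")
    return num_types / limit


planned_types = 3_000_000  # ontology size from the original question
fraction = ontology_headroom(planned_types)
print(f"{planned_types:,} types use {fraction:.4%} of the {TYPE_LIMIT:,} limit")
```

So a 3M-type ontology would use roughly 0.14% of that limit, which is why size alone is not the concern here.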

In any case, the numbers all seem okay to me. I’m happy to discuss further with you, Gabriel.

Let me know how I can help!



Great to hear that!

Yes, it is a medical term graph DB! It cross-references SNOMED, among others :grin:
I checked the repo of the Grakn implementation of SNOMED, and that made me want to migrate to something I could scale better.

I have most parts in SQL and N-Quads, but moving data around is fun :v:

One of my main worries is being able to work with Cassandra directly. I’m very used to running SQL queries on the data (mostly with Dask). I haven’t yet looked at the data stored in Cassandra, so I don’t know whether working on it with Spark, or with CQL directly, is feasible.

Best!