MongoDB as Storage?


Hi all.

Does GRAKN already support MongoDB as a storage backend? If so, can you please indicate the steps to follow to set up Mongo as the storage for GRAKN?



Hi @Pegazux, we don’t support MongoDB, since GRAKN.AI itself is “a database”. It is a (superior**) alternative to MongoDB, though. :slight_smile:

**if you have large, diverse and complex data.


@haikal it would be great to have a Python client lib, given that Python is the most wanted language in 2017 according to Stack Overflow and #3 for data scientists:


We do have a Python driver, @johnwoo1, but I imagine it may need some updating and maintenance. We’ll build a proper library (not just a wrapper) at some point. @miko or @felix can point you to where the Python driver is.

PS: miko/felix, we should put the Python driver in a repository on our GitHub.


Yes, I chatted with miko a few weeks ago about the driver. This one, I think:

It says “Quick and dirty scripts”, so I don’t know if it qualifies for serious prod use.

What’s your ‘ETA’ on this? 3 months? 6? 12? 24? Just asking; I know you probably have a lot to do.


We’ll make sure to add releasing and maintaining an official Python driver to our development requirements, @johnwoo1 - I promise! I’ll let you know when that happens. :slight_smile:


I’ve been playing with NetworkX (a well-known Python library for graphs) to load and store data from/to Grakn. I do not have anything shareable yet, but I am planning to share a package with the community. If there is interest in the community, we can create a focus group.


That’s awesome, @merqurio! Yes, it will definitely be useful to the community. Do share when you can, and we can also help polish things up!


Follow-on question: I have a MongoDB with about a terabyte of data that was migrated from a Neo4j database. Are you serious about size and performance? I too was thinking of using Mongo to back Grakn (I don’t need the images and detailed descriptive data until the bitter end; first I need to use feature details to figure out what the machine is looking at, i.e. solve from a low-level set of known features to the high-level object that has those features). It’s Graql I want. This DB powers a machine vision system and has a very serious load on it. Are you claiming performance numbers equivalent to Mongo, and if so, can you provide links to test cases and measurements? Thanks