Migrating an entire graph from one CSV table?


#1

Hi,

Is it possible to migrate to Grakn in one go from one large CSV table containing all the entities and relations that I’m interested in? I’ve been trying to do that in several ways using the templating language, but I keep getting errors…

Cheers,
Marcel


#2

Hi Marcel

Could you please post a few more details? Which release are you using? How many rows and columns does your large CSV table have? Maybe you could post your ontology and template too? We can probably help more if we have a better idea of what you are trying to achieve.

Thanks in advance!
Jo


#3

Hi Jo,

I think the gist of the problem can be captured with this example. Consider a (many-to-one) relation, e.g.:

Person | Company

John | Apple
Mark | Microsoft
Rick | Apple
George | Facebook

The goal would be to populate the following ontology in one go:

person isa entity-type
has-resource name;

company isa entity-type
has-resource name;

employment isa relation-type
has-role employee
has-role employer;

name isa resource-type datatype string;

employee isa role-type;
employer isa role-type;

The intuitive template should probably look something like this:

insert
$x isa person, has name <Person>;
$y isa company, has name <Company>;
(employee:$x, employer:$y) isa employment;

But this didn’t go through on the .5 version. I’m guessing this is because it tries to re-create companies that already exist, e.g. the second occurrence of Apple in the table above?

Cheers,
Marcel


#4

Hi Marcel

That’s correct, you are creating duplicates. I have run into this myself in a similar example, and it’s because we don’t have unique IDs. The best way to resolve this is to split your data into several CSV files and import them separately. Start with a CSV that lists each company exactly once, and migrate that to create the company entities. Then migrate the original file, using your template to inspect each row and build a relationship with the correct company entity.
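For what it’s worth, the splitting step can be done with a few lines of Python before running any migration. This is only a minimal sketch: it assumes the combined table is comma-separated with a header row of Person,Company (the example above is written with | separators), and the function names are made up for illustration.

```python
import csv
import io

def unique_column_values(rows, column):
    """Return the distinct non-empty values of `column`, in first-seen order."""
    seen = {}
    for row in rows:
        value = row[column].strip()
        if value:
            seen.setdefault(value, None)
    return list(seen)

def split_companies(combined_csv_text, column="Company"):
    """Build the text of a one-column CSV that lists each company once."""
    reader = csv.DictReader(io.StringIO(combined_csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column])  # header row, same name as in the source table
    for name in unique_column_values(reader, column):
        writer.writerow([name])
    return out.getvalue()

# Assumed comma-separated version of the example table.
combined = "Person,Company\nJohn,Apple\nMark,Microsoft\nRick,Apple\nGeorge,Facebook\n"
companies_csv = split_companies(combined)
print(companies_csv)
```

The resulting text can be saved as the company file and migrated first; the original table is then used unchanged for the second migration.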

I hope that makes sense but, if not, I’m writing an example up as we speak and will post a link here to illustrate, when it’s ready.

Best wishes,
Jo


#5

Hi Jo,

Yes, I see what you mean, that makes sense - thanks a lot!

Cheers,
Marcel


#6

Hi @Marcel,

Glad to see that you are using our stack!

You were missing some plays-role specifications in your ontology:

person isa entity-type
     has-resource name
     plays-role employee;

company isa entity-type
     has-resource name
     plays-role employer;

employment isa relation-type
     has-role employee
     has-role employer;

name isa resource-type datatype string;

employee isa role-type;
employer isa role-type;

Like Jo said, we cannot migrate multiple unique entities from one file at this time. I would suggest splitting up your data as follows:

Company CSV:

Company
Microsoft
Apple
Facebook

Which can be migrated using a very simple template:

insert $x isa company has name <Company>;

Now all the companies will already exist in your graph when you move on to the next migration:

match $c isa company has name <Company>;
insert 
     $p isa person has name <Person>;
     (employee: $p, employer: $c) isa employment;

We have done this because we want to ensure that all of the data in our graph is immutable. To insert a relation that is not between resources, as you are doing, you must first fetch references to the existing entities.
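If it helps to see the two-pass idea outside of the templating language, here is a rough Python stand-in. It is not the Grakn API: the dictionary plays the role of the graph, the ids are invented, and the CSV text assumes comma-separated files with the headers used above.

```python
import csv
import io

def migrate(companies_csv, employment_csv):
    """Two-pass load: companies first, then people and employments."""
    companies = {}    # company name -> stand-in entity id
    employments = []  # (person name, person id, company id)

    # Pass 1: one company entity per row of the de-duplicated company file.
    for row in csv.DictReader(io.StringIO(companies_csv)):
        name = row["Company"]
        if name not in companies:
            companies[name] = f"company-{len(companies) + 1}"

    # Pass 2: look up the existing company (the "match" step), then create
    # the person and the employment linking the two (the "insert" step).
    rows = csv.DictReader(io.StringIO(employment_csv))
    for index, row in enumerate(rows, start=1):
        company_id = companies[row["Company"]]  # KeyError if pass 1 missed it
        person_id = f"person-{index}"
        employments.append((row["Person"], person_id, company_id))

    return companies, employments

companies_csv = "Company\nMicrosoft\nApple\nFacebook\n"
employment_csv = "Person,Company\nJohn,Apple\nMark,Microsoft\nRick,Apple\nGeorge,Facebook\n"
companies, employments = migrate(companies_csv, employment_csv)
```

In this sketch John and Rick end up attached to the same Apple id, because the second pass only looks companies up and never creates them.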

Hope this helps!

Alexandra