CSV migration issues


I’m having two issues with CSV migration. The first is that key does not seem to work, and the second is that the -v, --verbose option causes issues during import.

First issue: even though I’m using key on almost every attribute, I’m still getting duplicate entities. I’m using this schema: https://github.com/BFergerson/grakn-speed-test/blob/master/grakn-schema.gql

Notice the attributes name, qualified_name, and commit_sha1 are all labeled as key. From my understanding of the key keyword, this should prevent two entities of the same type from sharing the same name, qualified_name, or commit_sha1. So inserting a second entity whose key values are all identical to an existing one should definitely not work.
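To make that expectation concrete, here is a minimal sketch in Python of the behavior I assumed key would give me. It is only a toy in-memory model of my understanding, not Grakn's implementation; the names TypeStore and KeyViolation are made up for illustration.

```python
class KeyViolation(Exception):
    """Raised when an insert would duplicate a key value (hypothetical name)."""


class TypeStore:
    """Toy model of one entity type whose key attributes must each be unique."""

    def __init__(self, key_attrs):
        self.key_attrs = tuple(key_attrs)
        # One set of already-used values per key attribute.
        self.seen = {attr: set() for attr in self.key_attrs}
        self.entities = []

    def insert(self, **attrs):
        # Check every key first so a rejected insert leaves no partial state.
        for attr in self.key_attrs:
            if attr not in attrs:
                raise KeyViolation(f"missing key attribute: {attr}")
            if attrs[attr] in self.seen[attr]:
                raise KeyViolation(f"duplicate {attr}: {attrs[attr]!r}")
        for attr in self.key_attrs:
            self.seen[attr].add(attrs[attr])
        self.entities.append(dict(attrs))


files = TypeStore(key_attrs=["name", "qualified_name"])
row = {
    "name": "kythe://github?lang=java?com/gitdetective/App.java",
    "qualified_name": "kythe://github?lang=java#App.java",
}
files.insert(**row)

# Simulates running the import a second time with the same data.
try:
    files.insert(**row)
    duplicate_rejected = False
except KeyViolation:
    duplicate_rejected = True
```

Under this model the second insert is rejected and the entity count stays at one, which is what I expected from the second run of the import.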

However, running this script twice: https://github.com/BFergerson/grakn-speed-test/blob/master/src/main/java/TestImport.java results in the following logs:

First run: https://gist.github.com/BFergerson/a8e4f0705a7657ddf24e52d0af26a40a#file-first-run
Second run: https://gist.github.com/BFergerson/5c934fe7dc3e55afc05473bef7e90a65#file-second-run

Notice all the counts increase. How is this possible when I’m using key?

Second issue: using the same script (https://github.com/BFergerson/grakn-speed-test/blob/master/src/main/java/TestImport.java), uncomment the line //"-v", //<< doesn’t work. This results in: https://gist.github.com/BFergerson/c02fd945069ee3df5bb948545541635d#file-verbose-run

Not sure what that’s about, but it seems erroneous, especially since simply commenting that option back out lets the script run fine.


Hi @BFergerson! Sorry for the issues with the Graql migrator. It’s been unreliable since our last release. We’re planning to rewrite it next month, but for now the workaround is to:

Output your migration queries to a graql file, and load the file normally.

The way to do this is by adding --no > somefilename.gql to the command in the shell. For example:
./graql migrate csv -i dataset.csv -t migration-script.gql -k grakn --no > somefilename.gql

You can then check in that file whether your queries look correct. Let me know if that helps!
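One way to check the generated file is to scan it for values that are inserted more than once. The following is a rough Python sketch, not part of the migrator; it assumes each insert clause sits on a single line and that attribute values are wrapped in straight double quotes. The function name repeated_insert_values is made up for illustration.

```python
import re
from collections import Counter


def repeated_insert_values(gql_text, attribute):
    """Return {value: count} for values of `attribute` inserted more than once.

    Only lines starting with `insert` are inspected, so values that merely
    appear in `match` clauses are ignored.
    """
    pattern = re.compile(r'\bhas\s+' + re.escape(attribute) + r'\s+"([^"]*)"')
    counts = Counter()
    for line in gql_text.splitlines():
        if line.lstrip().startswith("insert"):
            counts.update(pattern.findall(line))
    return {value: n for value, n in counts.items() if n > 1}


# Small made-up sample in the same shape as the migrator output.
sample = '''match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "App.java" has qualified_name "q#App.java";
match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "App.java" has qualified_name "q#App.java";
'''

repeats = repeated_insert_values(sample, "name")
```

To run it against the real output, read the file with open("somefilename.gql").read() and pass the text in place of sample. An empty result means no value of that attribute is inserted twice within the file.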


@haikal, apart from the --verbose option not working, the migrator seems to work fine. Regardless, I used the --no option and received the following output:

match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "kythe://github?lang=java?com/gitdetective/App.java" has qualified_name "kythe://github?lang=java#App.java";
(has_defines: $p0, is_defines: $f0) isa defines;
match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "kythe://github?lang=java?com/gitdetective/MyClass.java" has qualified_name "kythe://github?lang=java#MyClass.java";
(has_defines: $p0, is_defines: $f0) isa defines;

No surprise there. However, running the above queries twice still results in duplicate entities. I’m not as concerned with the --verbose option not working as I am with the key keyword not working.

Please see the below image (shows running the above query followed by an aggregate query twice):

The migrator doesn’t seem to have anything to do with duplicate entities being inserted when using the key keyword. Could you please address that issue?

What does "key" actually do?

Hi @BFergerson, sorry for my late reply! It’s been hectic. It seems like we have a bug with keys. I’ll start working on this soon!