CSV migration issues


#1

I’m having two issues with CSV migration. The first is that key does not seem to work, and the second is that the -v, --verbose option causes issues during import.

First issue, even though I’m using key on almost every attribute I’m still getting duplicate entities. I’m using the schema: https://github.com/BFergerson/grakn-speed-test/blob/master/grakn-schema.gql

Notice the attributes name, qualified_name, and commit_sha1 are all labeled as key. From my understanding of the key keyword, this should prevent two entities from having the same name, qualified_name, or commit_sha1. So inserting entities where all three are exactly the same should definitely not work.

However, running this script twice: https://github.com/BFergerson/grakn-speed-test/blob/master/src/main/java/TestImport.java results in the following logs:

First run: https://gist.github.com/BFergerson/a8e4f0705a7657ddf24e52d0af26a40a#file-first-run
Second run: https://gist.github.com/BFergerson/5c934fe7dc3e55afc05473bef7e90a65#file-second-run

Notice all the counts increase. How is this possible when I’m using key?

Second issue, same script: https://github.com/BFergerson/grakn-speed-test/blob/master/src/main/java/TestImport.java but now uncomment //"-v", //<< doesn’t work. This results in: https://gist.github.com/BFergerson/c02fd945069ee3df5bb948545541635d#file-verbose-run

Not sure what that’s about, but it seems erroneous, especially since simply commenting that option out again causes the script to run fine.


#2

Hi @BFergerson! Sorry for the issues with the Graql migrator. It’s been unreliable since our last release. We’re planning to rewrite it next month, but for now the workaround is to:

Output your migration queries to a graql file, and load the file normally.

The way to do this is by adding --no > somefilename.gql to the command in the shell. For example:
./graql migrate csv -i dataset.csv -t migration-script.gql -k grakn --no > somefilename.gql

You can then verify if your queries are looking correct from that file. Let me know if that helps!
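One rough way to check the generated file is to look for insert statements that appear more than once. The sketch below assumes one statement per line, as in the migrator output, and the function name duplicate_inserts is made up for illustration. It only inspects the file, so it says nothing about whether key is enforced once the queries are loaded.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grakn): print every insert statement
# that occurs more than once in a generated Graql file.
# Assumes one statement per line.
duplicate_inserts() {
    # Keep only lines that start with "insert", sort them so identical
    # lines are adjacent, then print each repeated line once.
    grep '^insert' "$1" | sort | uniq -d
}
```

Calling duplicate_inserts somefilename.gql prints nothing when no insert line is repeated within the file.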


#3

@haikal, besides the --verbose option not working, the migrator seems to work fine. Regardless, I used the --no option and received the following output:

match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "kythe://github?lang=java?com/gitdetective/App.java" has qualified_name "kythe://github?lang=java#App.java";
(has_defines: $p0, is_defines: $f0) isa defines;
match $p0 isa project has name "bfergerson/myproject";
insert $f0 isa file has name "kythe://github?lang=java?com/gitdetective/MyClass.java" has qualified_name "kythe://github?lang=java#MyClass.java";
(has_defines: $p0, is_defines: $f0) isa defines;

No surprise there. However, running the above query twice still results in duplicate entities. I’m not as concerned with the --verbose option not working as I am with the key keyword not working.

Please see the below image (shows running the above query followed by an aggregate query twice):

The migrator doesn’t seem to have anything to do with duplicate entities being inserted when using the key keyword. Could you please address that issue?


#4

Hi @BFergerson, sorry for my late reply! It’s been hectic. It seems like we have a bug with keys. I’ll start working on this soon!