Inconsistent batch import results


#1

While trying to import a large number of entities (~500k) using:
./graql console -b entities.gql

and a large number of relationships (~600k) between those entities using:
./graql console -b relationships.gql

I noticed Grakn was trying to create relationships between entities that shouldn’t have any relationship. Each entity is given a UUID attribute and somehow Grakn was mixing them up.

In an effort to figure out what’s going on, I created a little program which imports entities and then, right after, runs a match for each insert it performed. If it doesn’t find an entity, it fails. I started with ~500k entities and was eventually able to reproduce the issue with just 100. At that size it fails less consistently, but it still fails roughly 4 out of 5 times.
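
For readers who want the shape of such a check without cloning the repository, here is a minimal Java sketch of the insert-then-match idea. It is not the code from the linked project: `ImportVerifierSketch`, `findMissing` and `randomUuids` are hypothetical names, and a plain in-memory `HashSet` stands in for Grakn so the example runs on its own. Against a real server, the `insert` and `exists` callbacks would issue the Graql insert and match queries instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class ImportVerifierSketch {

    // Insert one entity per UUID, then look each one up again.
    // Returns the UUIDs that were inserted but not found afterwards.
    public static List<String> findMissing(List<String> uuids,
                                           Consumer<String> insert,
                                           Predicate<String> exists) {
        for (String uuid : uuids) {
            insert.accept(uuid);
        }
        List<String> missing = new ArrayList<>();
        for (String uuid : uuids) {
            if (!exists.test(uuid)) {
                missing.add(uuid);
            }
        }
        return missing;
    }

    // Generate the UUID attribute values for the test entities.
    public static List<String> randomUuids(int count) {
        List<String> uuids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            uuids.add(UUID.randomUUID().toString());
        }
        return uuids;
    }

    public static void main(String[] args) {
        // Assumption: this set is only a stand-in for the database.
        Set<String> store = new HashSet<>();
        List<String> uuids = randomUuids(100);
        List<String> missing = findMissing(uuids, store::add, store::contains);
        System.out.println("inserted=" + uuids.size() + " missing=" + missing.size());
    }
}
```

With the in-memory stand-in nothing can go missing, so this prints `inserted=100 missing=0`. A non-empty `missing` list against a real backend is the failure described above.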

I’m using Ubuntu 16.04 and Grakn 1.2.0.
I’ve uploaded a project that reproduces it here: https://github.com/BFergerson/grakn-import-verifier

I’m trying to look into this to find out what it could possibly be, but since it’s inconsistent I haven’t been able to. It seems to work best right after I restart my computer. Not exactly a pleasant solution, and even that isn’t 100% effective.

If you run this and it works the first time please try again after:
./grakn server stop && ./grakn server clean && ./grakn server start

Try at least 3 times. I haven’t been able to make it fail on command yet.


#2

Hi @BFergerson, thank you for reporting the issue.

I will try to reproduce it and let you know how it goes.


#3

@ganesh, were you able to recreate this issue? I wanted to make sure I was dealing with an actual bug and not just something wonky in my OS so I created Digital Ocean droplets and recorded the terminal as I ran the code I provided above using the script here: https://gist.github.com/BFergerson/80b8c82ef42bce2517a1ebbd2e0010a3

Doing so I was able to see it run correctly:
asciicast

And incorrectly:
asciicast

There was no change in code or script between the two recordings above. They were run on the same Digital Ocean image (Ubuntu 16.04.4 x64) with the size set to 4GB RAM and 2 vCPUs, hosted in the same data center (New York, region 3). The first two runs succeeded and the third failed. The recordings above are of the final two runs.

I really want to start integrating Grakn into more of my applications, but I’m currently stuck figuring out how to import data successfully and consistently. I would be more than willing to provide any logs or information required to recreate and diagnose this issue. If you have any clue where I should look to diagnose it further, please let me know. I’m pretty wicked with a debugger :).


#4

@BFergerson not yet unfortunately! We’re currently busy ironing out bootup issues as part of the next release.

We’re obviously aware of the issues people are having with the graql migrate tool. People have even had more success importing simply using the Java API…


#5

@ganesh, this isn’t the first time I’ve heard of the Graql migrate tool being subpar, but I’ve personally found it to be fairly stable. Granted, I’ve only now begun verifying the batch imports; previously I was more or less visually verifying the produced graphs.

Unfortunately, the Grakn Java loader API suffers the same problem as the migrate tool. This would make sense if they shared any code (not sure if they do).

Using the same code, schema, and entities script I created the following project to import data via the Java loader API: https://github.com/BFergerson/grakn-batch-import-verifier

Using the same Digital Ocean setup with the modified test script of: https://gist.github.com/BFergerson/6099693f0d5a4e549aa2c888edaea0ff

I was able to recreate the exact situation I described above. I ran the above script on 3 separate Digital Ocean droplets; the first two times it worked and the third time it failed.

Here it is running correctly (the second run):
asciicast

And it running incorrectly (the third run):
asciicast

I’ll be downloading and building Grakn locally soon to try to debug this issue myself but would appreciate any pointers in the right direction.


#7

My bad, I was specifically referring to the Java Graql API, not the Loader API. It is simpler and more robust.

I would advise you to use it instead of spending your time scratching your head over the batch loader, until we’re done working on the bootup issues and start working on the migrations.


#8

Same drill, this time using the Java Graql API. Specifically the code on the Advanced page.

The code: https://github.com/BFergerson/grakn-api-import-verifier
The script: https://gist.github.com/BFergerson/65e46cac156b2efbb5ab5351d3272da2

Ran it on 4 droplets. First 3 were successful and 4th wasn’t.

3rd run:
asciicast

4th run:
asciicast

At this point I just want to make sure it’s not something I’m doing incorrectly, so I don’t mind digging through the debugger for a while. I’m having trouble setting up Grakn in IntelliJ though, and there doesn’t seem to be a developer guide on getting it up and running locally.


#9

Thank you for clarifying that the problem is not specific to the migrator API. I will have a look at the code, test it out on our system and get back to you in a couple of days.

These screencasts are really helpful btw!


#10

@BFergerson I am curious, does the problem exist when you open the transaction using GraknTxType.WRITE instead of GraknTxType.BATCH when writing?


#11

@ganesh, many runs later… it fails in both WRITE and BATCH mode, so the transaction type doesn’t seem to matter.

I can’t get it to fail with 1 thread:

ExecutorService pool = Executors.newFixedThreadPool(1);

and I can’t get it not to fail with 100 threads:

ExecutorService pool = Executors.newFixedThreadPool(100);
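
To make the thread-count comparison concrete, here is a hedged Java sketch of a harness where the pool size is the only variable. It is an illustration, not the code from the linked repositories: `ThreadCountRepro` and `runOnce` are hypothetical names, and a `ConcurrentHashMap` stands in for the Grakn insert and match calls so the example is self-contained.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadCountRepro {

    // Insert `entities` UUIDs using a pool of `threads` workers, then
    // count how many of them cannot be found once all inserts finished.
    public static int runOnce(int threads, int entities) {
        // Assumption: this map is only a stand-in for the database.
        Map<String, Boolean> store = new ConcurrentHashMap<>();
        String[] uuids = new String[entities];
        for (int i = 0; i < entities; i++) {
            uuids[i] = UUID.randomUUID().toString();
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String uuid : uuids) {
            // Against a real server this task would run the insert query.
            pool.execute(() -> store.put(uuid, Boolean.TRUE));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("inserts did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }

        // Verification runs on a single thread, after every insert completed.
        int missing = 0;
        for (String uuid : uuids) {
            if (!store.containsKey(uuid)) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 10, 100}) {
            System.out.println("threads=" + threads
                    + " missing=" + runOnce(threads, 1000));
        }
    }
}
```

The thread-safe stand-in always reports `missing=0`. The point of the harness is that, once the two marked spots call a real backend, any non-zero count that appears only at higher thread counts points at concurrent writes rather than at the data or the queries.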

I would love to debug this locally. I’ve posted a question in the developers channel about how to debug Grakn locally.


#12

Hi @BFergerson, in case you missed my response in developers channel:

When we need to debug Engine without starting other processes, we usually write integration tests that use a mock or in-memory Cassandra. You can see an example in `GrpcServerIT`, where we use an `EngineContext`.

Otherwise, if you don’t want to write tests, you can `mvn package`, unzip the grakn-dist and run it with:

ENGINE_JAVAOPTS='-Xdebug -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005' ./grakn server start

This will start Grakn with remote debugging enabled, so that from IntelliJ you will be able to create a remote debugger to attach to the running instance of Grakn. You can then use breakpoints where needed in the code.

I hope this helps. Let us know otherwise.