Graql query frequency


#1

Currently I am trying to couple a Grakn knowledge graph to a ROS environment. From this ROS environment we receive attribute values, which I want to update in our Grakn database.

ROS works with a structure comparable to interrupts. When a new message is received, I start a Graql query to update the values in the database. To update the values, I first need to delete them and then add them again. For this interface I use Python, so our delete and add queries are sent from the Python API.

When I try this, I get some strange behaviour, which is probably caused by the rate at which I send the Graql queries. Depending on the order of operations, either the new values are not written and the old ones stay forever, or the old values are not deleted, so the new values end up as additional attributes.

Therefore I have some questions:

  • What is the maximum number of queries that can be applied in Graql per second? How long should I wait before entering the new query?
  • Is it possible to update values in one query, instead of using two queries?
  • Do you have experience in coupling Grakn to a ROS-like structure, and any tips to help me out?

I like working with Grakn and hope you can help me out!
Regards Fieke


#2

Hi @Fieke,

I’m truly excited to hear that someone is integrating Grakn with ROS! (I spent 5 years working with ROS between 2010 and 2014. I hope the platform and community have continued to grow!)

Because of that, I personally would like to see you succeed in integrating Grakn with ROS.

When I try this, I get some strange behaviour, which is probably caused by the rate at which I send the Graql queries. Depending on the order of operations, either the new values are not written and the old ones stay forever, or the old values are not deleted, so the new values end up as additional attributes.

That sounds like a bug (and one that we’ve resolved in 1.5). @joshua can you help look into it? @Fieke if you can share more details on the queries being run and the responses / error messages, that would be helpful.

Regarding write performance, there is a significant performance boost in the upcoming 1.5 release (which we are behind schedule on, so you can hopefully expect it in the coming days).

Regarding updating values, it’s not possible at the moment, as Grakn’s data model is designed around the idea that we treat every piece of information (i.e. “fact”) as “atomic”. We may find a way to simplify this or abstract it away from the user in the near future. For the moment, can you think of a way to model your knowledge graph such that you can do a “one-way” data insert to update the knowledge?
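One possible reading of the “one-way” insert idea is an append-only log: every incoming value is stored as a new timestamped observation, and the current value is simply the most recent one. The sketch below is a plain-Python stand-in for that idea, not a Grakn schema; `Observation` and `ObservationLog` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """One immutable fact: an attribute value seen for an object at a given time."""
    object_id: int
    attribute: str
    value: object
    timestamp: float


class ObservationLog:
    """Append-only store: values are never modified or deleted, only added."""

    def __init__(self) -> None:
        self._entries: List[Observation] = []

    def record(self, observation: Observation) -> None:
        # "Updating" is just inserting a newer observation.
        self._entries.append(observation)

    def latest(self, object_id: int, attribute: str) -> Optional[Observation]:
        # The current value is the observation with the highest timestamp.
        candidates = [
            o for o in self._entries
            if o.object_id == object_id and o.attribute == attribute
        ]
        return max(candidates, key=lambda o: o.timestamp, default=None)
```

With this shape, writers only ever insert, so there is no delete step that can race with an insert; the cost is that readers have to select the newest observation themselves.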

Regarding coupling Grakn with ROS, we have not done this yet, but if you could share what you’re trying to build with ROS and Grakn perhaps I can jog my memory on ROS and visualise how to align the two architectures.

Looking forward to hearing more about your development!


#3

Thank you for the fast reply and information. I appreciate all the help! I hope Grakn 1.5 can resolve the problem.
What do you mean by ‘atomic’? Is there a reference page where I can read more about this?


There are two processes that interact with the dataset:

  • Process A inserts a new object with some attributes or updates the object.
  • Process B updates the object with some overlapping attributes and some non-overlapping attributes.
  • Both processes are running in parallel and may occur fast after each other.

For deleting attributes I wrote the following Python function (item_list contains the items to delete):

def construct_deletepart_message(self, item_list):
        message_find = ''
        message_delete = ' delete'
        first = True
        for item in item_list:
            message_find += ', has {delete_item} $n via $r_{delete_item}'.format(delete_item=item)
            if first:
                first = False
            else:
                message_delete += ','
            message_delete += ' $r_{delete_item}'.format(delete_item=item)
            
        return message_find + ';', message_delete + ';'

Example process A:

  • If object is not already represented:
    insert $x isa object, has name 'test', has id_ref 0, has lastSeen 0.0, has location 0.6;
  • If object is already represented:
    • First delete the attributes that will be updated:
      delete_items_list = ['name', 'lastSeen']
      del_message_find, del_message_delete = self.construct_deletepart_message(delete_items_list)
      delete_message = 'match $x isa object, has id_ref 0' + del_message_find + del_message_delete
    • Secondly insert the new values of the attributes:
      match $x isa object, has id_ref 0; insert $x has lastSeen 0.5, has name 'test2';

Example process B:

  • When object is not yet present in Grakn: Wait until present (depends on process A).
  • Thereafter:
    • Delete the attributes that will be updated:
      delete_items_list = ['name', 'speed']
      del_message_find, del_message_delete = self.construct_deletepart_message(delete_items_list)
      delete_message = 'match $x isa object, has id_ref 0' + del_message_find + del_message_delete
    • Insert the new values of the attributes:
      insert_message = "match $x isa object, has id_ref 0; insert $x has name 'test3', has speed 10;"

I do not get a real error from Grakn, but I do not get the behaviour I am expecting.

The desired behaviour:

  • Process A inserts new objects when they are not already present, and updates the values of their attributes otherwise.
  • Process B adds / updates values of other attributes, while there could be some overlap with attributes of process A (the name for example).

I have seen that both processes can add attributes, but the system as a whole does not work as intended.
When I do not delete the attributes first, new attributes are added, but of course they keep accumulating. When I do try to delete the attributes first, the attribute values are not updated and stay at the first added value. Even running process A alone does not work as desired.


I hope this gives more insight into what I am trying to do and the problems I encounter. If more information is needed, I can of course look into it!


#4

To clarify a few points a bit further:
From what we’ve encountered, Grakn is best suited for applications where data is streamed into Grakn without modifying values too frequently. New data is represented as new instances and new relationships more often than deleting and recreating data, though it is possible to delete of course.

We don’t currently have support for direct modification of attributes, because under the hood, every time you write insert A has attribute X; and insert B has attribute X;, both A and B point to the same X attribute using a hidden relationship (that you’ve recovered using via in your delete queries!). This is what Haikal means by atomic data - it’s not meant to be modified in place. Modifying an attribute value for us is actually the process of adding a different relationship to Y and removing the relationship to X, as you are doing.
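The shared-attribute idea can be pictured with a small toy model. This is only an illustration of the explanation above, not how Grakn is implemented; `ToyGraph` and its methods are hypothetical names. Each ownership is an edge from an owner to an (attribute type, value) pair, and an "update" removes one edge and adds another without touching the value itself.

```python
class ToyGraph:
    """Toy stand-in for attribute ownership; NOT Grakn's actual implementation."""

    def __init__(self):
        # Each entry is one ownership edge: (owner, attribute_type, value).
        self.ownerships = set()

    def insert_has(self, owner, attr_type, value):
        self.ownerships.add((owner, attr_type, value))

    def delete_has(self, owner, attr_type, value):
        # Removes only this owner's edge; other owners of the value are unaffected.
        self.ownerships.discard((owner, attr_type, value))

    def values_of(self, owner, attr_type):
        return {v for (o, t, v) in self.ownerships if o == owner and t == attr_type}

    def owners_of(self, attr_type, value):
        return {o for (o, t, v) in self.ownerships if t == attr_type and v == value}

    def update(self, owner, attr_type, old_value, new_value):
        # "Modifying" a value = drop the edge to the old value, add an edge to the new one.
        self.delete_has(owner, attr_type, old_value)
        self.insert_has(owner, attr_type, new_value)
```

In this picture, if A and B both own the value X and A is "updated" to Y, B still owns X, which is why the value itself cannot simply be edited in place.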

The write speed upgrades Haikal mentioned are referring to concurrent inserts, and we haven’t tested deletes extensively as it stands. That being said, concurrent writes are more stable in Grakn 1.5 than before so you might have better luck with the unpredictable end states.

Here are some basic things I can advise trying:

  1. Ensure you commit and use a new transaction after each change - it’s expected not to see the new data in Process B until after Process A has committed its transaction with the new data
  2. Serialize data changes from process A and B to the same data object concepts. In other words, is it currently possible to have two messages happening from A at the same time to the same object? If yes, serializing should help. At this point you may hit limitations on Grakn’s write speed: if the round trip to Grakn takes 30 milliseconds but your messages are arriving every 10 ms, we might have to think of more complex solutions.
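One way to serialize changes from several callbacks is to push every update onto a single queue and let one worker apply them strictly one at a time. The sketch below assumes a hypothetical `apply_update` callable that performs the delete query, the insert query and their commits for one update; `SerializedWriter` is likewise a name made up for this example, and error handling is left out.

```python
import queue
import threading


class SerializedWriter:
    """Applies updates one at a time, in arrival order, on a single worker thread."""

    def __init__(self, apply_update):
        # apply_update is supplied by the caller (hypothetical: it would run the
        # delete and insert queries and commit them before returning).
        self._apply_update = apply_update
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, update):
        # Safe to call from any thread, e.g. from message callbacks.
        self._queue.put(update)

    def _run(self):
        while True:
            update = self._queue.get()
            try:
                if update is None:  # sentinel: stop the worker
                    return
                # The next update is only taken once this call has returned.
                self._apply_update(update)
            finally:
                self._queue.task_done()

    def close(self):
        # Updates queued before close() are still applied, then the worker exits.
        self._queue.put(None)
        self._worker.join()
```

Process A and process B would both call `submit`, so two changes to the same object can no longer overlap. If messages arrive faster than they can be applied, the queue simply grows, which is the write-speed limit mentioned in point 2.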

I’m quite interested in your comment that running Process A on its own fails. Can you confirm that it still fails if you apply points 1 and 2 above as well?


#5

Thank you for the fast reply. Your explanation makes the inner workings of Grakn much clearer.

I will test process A with your two suggested points and let you know whether it is working as expected or not.


#6

This morning I tested running only Process A, with a delete and insert query. Sending the new insert / delete queries was done more than a second after the last one, to make sure there were no parallel processes.

The first time, an insert query is run. This works:
insert $x isa object, has id_ref 0, has name 'test', has lastSeen 0.5;

delete_query:
match $x isa object, has id_ref 0, has name $n via $r_name, has lastSeen $n via $r_lastSeen; delete $r_name, $r_lastSeen;

insert_query:
match $x isa object, has id_ref 0; insert $x has name 'test2', has lastSeen 1;

Python code to apply the changes:

with self.client.session(keyspace=self.keyspace) as session:
	with session.transaction(grakn.TxType.WRITE) as write_transaction:
		write_transaction.query(delete_message)
		write_transaction.commit()

with self.client.session(keyspace=self.keyspace) as session:
	with session.transaction(grakn.TxType.WRITE) as write_transaction:
		write_transaction.query(insert_message)
		write_transaction.commit()

When I looked up my object in the visualiser, it contained multiple name and lastSeen attributes.
What am I doing wrong in this case when updating the attribute values?


#7

There’s an error here - if you run this as a match...; get...; query instead of a delete to debug, you’ll see it doesn’t return anything at all! The problem is that you’ve re-used the variable $n twice in the match portion. I’ve run your queries locally and if you rename one of the $n to something unused this process works :slight_smile:
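A variant of the query builder that gives each attribute its own value variable could look like the sketch below. The function name `build_delete_query` and the `$v_<name>` naming scheme are assumptions made for this example; the query shape follows the delete query posted above.

```python
def build_delete_query(id_ref, attribute_names):
    """Build a match/delete query that removes the listed attribute ownerships.

    Each attribute gets its own value variable ($v_<name>) and its own
    ownership variable ($r_<name>), so no variable is bound twice in the match.
    """
    has_parts = ''.join(
        ', has {name} $v_{name} via $r_{name}'.format(name=name)
        for name in attribute_names
    )
    delete_vars = ', '.join(
        '$r_{name}'.format(name=name) for name in attribute_names
    )
    return 'match $x isa object, has id_ref {id_ref}{has}; delete {vars};'.format(
        id_ref=id_ref, has=has_parts, vars=delete_vars
    )


# Usage: the same two attributes as in the delete query above.
delete_message = build_delete_query(0, ['name', 'lastSeen'])
```

For `['name', 'lastSeen']` this produces the original delete query with `$n` replaced by `$v_name` and `$v_lastSeen`.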

The other error that popped up when I was running your queries is that lastSeen must be a double, i.e. written as 1.0 instead of 1. This doesn’t cause an exception in 1.4.3, but the upcoming Grakn 1.5 is stricter about the required input, so there is no harm in preparing your code for it ahead of time!
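To avoid writing a double as an integer by accident, the insert query can be generated from Python values. This is a minimal sketch, not an official helper: `format_attribute_value`, `build_insert_query` and the `double_attributes` parameter are hypothetical names, and which attributes are doubles has to be taken from your own schema.

```python
def format_attribute_value(value, is_double=False):
    """Render a Python value as query text (sketch only)."""
    if is_double:
        # float() guarantees a decimal point: 1 -> '1.0', 0.5 -> '0.5'.
        # Very large or very small floats render in exponent form, which
        # this sketch does not handle.
        return repr(float(value))
    if isinstance(value, str):
        # No escaping of quotes inside the string is attempted here.
        return "'{}'".format(value)
    return str(value)


def build_insert_query(id_ref, attributes, double_attributes=()):
    """attributes is a list of (name, value) pairs, kept in the given order."""
    parts = ', '.join(
        'has {} {}'.format(
            name, format_attribute_value(value, name in double_attributes)
        )
        for name, value in attributes
    )
    return 'match $x isa object, has id_ref {}; insert $x {};'.format(id_ref, parts)


# Usage: lastSeen is declared as a double, so the integer 1 is written as 1.0.
insert_message = build_insert_query(
    0, [('name', 'test2'), ('lastSeen', 1)], double_attributes=('lastSeen',)
)
```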


#8

Thank you very much for the help. With the error fixed, it works!
It keeps up with our ROS update rate; at the moment we do not reach the query time limit.

I will continue my work and add some more rules to the system.
I am curious what Grakn 1.5 will bring us and am looking forward to it!


#9

Do let us know (in the forum, or in Slack for example) if there’s anything you’d like to show us, or even present to the wider user base - we love to hear how the community is using Grakn :smiley: