Minimize the use of keys related to index operations #4

WolfDan · 2019-01-27T01:27:14Z

Right now, an index operation on any data type can end up repeating the same key prefix, which is quite inefficient.

Take, for example, two records:

user_a = %{name: "Artorias", deaths: 50}
user_b = %{name: "Chosen Undeath", deaths: 50}

The index entries for the deaths property would look like this:

("user", "deaths", 50, user_a_node_uid) = ''
("user", "deaths", 50, user_b_node_uid) = ''

As you can see, the prefix (node_name, "deaths", 50) is repeated for every entry. That prefix is quite long by itself, since node_name, property_name and property_value are all stored as bit_string data. The proposal is to change the layout this way:

("user", "deaths", 50, random_id) = [user_a_node_uid, user_b_node_uid]

We use the random_id to "extend" the index: FDB limits the size of a value, so when a value reaches that limit we split the index onto a new key and keep adding uids there.

This way a single key prefix can hold a large number of node uids.
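A minimal sketch of the splitting idea, not the project's implementation. It assumes every node uid is a fixed-size 16-byte binary and that a value is simply the concatenation of its uids; the module name `IndexChunks` is hypothetical, and a running chunk number stands in for random_id. The 100,000-byte figure is FoundationDB's documented value size limit.

```elixir
defmodule IndexChunks do
  # FoundationDB rejects values larger than 100,000 bytes.
  @max_value_bytes 100_000
  # Assumption: every node uid is a fixed-size 16-byte binary.
  @uid_bytes 16

  # Number of uids that fit into a single value.
  def uids_per_value, do: div(@max_value_bytes, @uid_bytes)

  # Split a list of uids into {key, value} pairs, one pair per chunk.
  # The chunk number is appended to the prefix in place of random_id.
  def pack(prefix, uids) do
    uids
    |> Enum.chunk_every(uids_per_value())
    |> Enum.with_index()
    |> Enum.map(fn {chunk, id} ->
      {Tuple.insert_at(prefix, tuple_size(prefix), id), IO.iodata_to_binary(chunk)}
    end)
  end

  # Recover the list of uids stored in one value.
  def unpack(value) do
    for <<uid::binary-size(@uid_bytes) <- value>>, do: uid
  end
end
```

Under these assumptions one value holds 6,250 uids, so the prefix ("user", "deaths", 50) is stored once per 6,250 nodes instead of once per node.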

I don't think this will have any repercussions, since with any index you normally need to read all of its entries anyway in order to produce a query result.


WolfDan · 2019-02-02T21:39:31Z

I'm running some tests on this. There is a problem with this approach: it increases transaction conflicts.

The logical steps look like this:

1. Get the key range for the node_name, property_name and property_value.
2. If the range is empty, write a new index key.
3. If it has data, check the size of each value and select a key that has space left.
4. Add the uid to the value of that key and write it back.

The problem is in the last step: the first step reads the key range and the last step writes to one of those keys, so concurrent transactions that index the same value read and write the same keys and end up in a transaction conflict.
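The steps above can be sketched against a plain map that stands in for the database. This is an illustration only: `IndexInsert` and `add_uid/3` are hypothetical names, no real FDB calls are made, and a sequential number replaces random_id. Running it twice from the same snapshot shows why writers collide.

```elixir
defmodule IndexInsert do
  # FoundationDB's documented value size limit.
  @max_value_bytes 100_000

  # `store` is a map of key tuple => binary value (a stand-in for the database).
  # Returns {new_store, written_key}.
  def add_uid(store, prefix, uid) do
    # Step 1: read every key under the (node_name, property_name, property_value) prefix.
    range =
      store
      |> Enum.filter(fn {key, _value} -> drop_last(key) == prefix end)
      |> Enum.sort()

    has_room? = fn {_key, value} ->
      byte_size(value) + byte_size(uid) <= @max_value_bytes
    end

    # Steps 2 and 3: reuse a key that still has space, otherwise start a new one.
    key =
      case Enum.find(range, has_room?) do
        {key, _value} -> key
        nil -> Tuple.insert_at(prefix, tuple_size(prefix), length(range))
      end

    # Step 4: append the uid to the value stored under that key.
    {Map.update(store, key, uid, &(&1 <> uid)), key}
  end

  # Strip the trailing id to get back the index prefix of a key.
  defp drop_last(key), do: Tuple.delete_at(key, tuple_size(key) - 1)
end

# Two writers starting from the same snapshot pick the same key,
# which lies inside the range the other writer has read.
snapshot = %{}
{_store_a, key_a} = IndexInsert.add_uid(snapshot, {"user", "deaths", 50}, "uid_a")
{_store_b, key_b} = IndexInsert.add_uid(snapshot, {"user", "deaths", 50}, "uid_b")
IO.inspect(key_a == key_b)
```

In the original layout each writer only sets its own key without reading the range first, which is why the packed layout trades space for contention.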

I'll be doing more testing in order to fix these issues.

WolfDan · 2019-03-12T15:24:29Z

With the rewrite, a big part of the tuple is now converted into a Directory. That means this part is compressed into a single small integer and is easier to move or rename later on.
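As a rough illustration of the idea, here is a toy stand-in for a directory: a map from a path to a small integer prefix. It is not the FoundationDB directory layer API; `TinyDirectory` is a hypothetical name, and treating the node name and property name as the directory path is an assumption about which part of the tuple was moved.

```elixir
defmodule TinyDirectory do
  # `dirs` is a map of path => small integer prefix.
  # Opening a path returns its prefix, allocating one on first use.
  def open(dirs, path) do
    case Map.fetch(dirs, path) do
      {:ok, prefix} ->
        {dirs, prefix}

      :error ->
        prefix = map_size(dirs)
        {Map.put(dirs, path, prefix), prefix}
    end
  end

  # Renaming only changes the path; the prefix, and therefore every
  # index key stored under it, stays untouched.
  def rename(dirs, old_path, new_path) do
    {prefix, rest} = Map.pop!(dirs, old_path)
    Map.put(rest, new_path, prefix)
  end
end

{dirs, prefix} = TinyDirectory.open(%{}, ["user", "deaths"])

# The long ("user", "deaths") part of each key is replaced by the prefix.
old_key = {"user", "deaths", 50, "user_a_node_uid"}
new_key = {prefix, 50, "user_a_node_uid"}
IO.inspect({old_key, new_key})

_renamed = TinyDirectory.rename(dirs, ["user", "deaths"], ["player", "deaths"])
```

Each writer still sets only its own key, so this saves space without the read-then-write pattern that caused the conflicts.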

At the moment this is the best way to reduce space usage without causing too many transaction conflicts.

WolfDan added the "block" label (This feature could be important to the project but is blocked due given reason) on Feb 11, 2019
