Receive ArrayIndexOutOfBoundsException when inserting to type with full text index with multiple properties defined. #1435

Closed
alan-strickland-red opened this issue Jan 19, 2024 · 16 comments
Labels: bug, fixed

Comments

@alan-strickland-red

ArcadeDB Version:

ArcadeDB Server v23.12.1 (build df4e3d5/1704296563159/main)

OS and JDK Version:

Linux 5.10.0-26-amd64 - OpenJDK 64-Bit Server VM 11.0.21 (Temurin-11.0.21+9)

Expected behavior

No errors when inserting new records to a type that has a full text index containing multiple fields.

Actual behavior

For the most part this works as expected. However, every so often, when inserting into a document type, we start receiving ArrayIndexOutOfBoundsException errors.

Dropping and recreating the full text index resolves the error.

I don't seem to get the same issue with full text indexes on a single property. However, because it feels quite random, it may be that I just haven't come across it yet.

Steps to reproduce

I have not yet found a way to consistently reproduce the issue, but I will keep trying.

@lvca lvca self-assigned this Jan 19, 2024
@lvca
Contributor

lvca commented Jan 19, 2024

@alan-strickland-red are you inserting with the local or remote interface?
Also, are you executing operations in parallel or single thread?

@alan-strickland-red
Author

I am inserting values via the HTTP client from Node.js. I also tried the HTTP client in Studio, and it gives the same error.

> Also, are you executing operations in parallel or single thread?

Single thread, I believe: asyncWorkerThreads is set to 1 and typeDefaultBuckets is set to 1.

@lvca
Contributor

lvca commented Jan 19, 2024

I'd love to reproduce the issue. Do you have a stack trace of the exception to understand where it's thrown?

@alan-strickland-red
Author

@lvca Sorry for the delay in getting back to you, been working some short weeks recently and had other commitments.

This issue cropped up again this morning, on a little-used database.

I have attached a backup of the database and the command that is causing the error.

I can't find a stack trace anywhere: the error in the client does not include one, and I can't see the error in the logs.


Command causing the error:

INSERT INTO `authentication-token` CONTENT {"id":"613b7796-87f2-4bc6-8491-f11215995aff","issued":"2024-01-29T14:42:36.465Z","expires":"2024-01-29T14:57:36.466Z","ip":"::ffff:10.99.2.3","authenticated":false,"mfaCompleted":false,"revoked":false,"userAction":"","_sys":{"rev":1,"hash":"authentication-token!613b7796-87f2-4bc6-8491-f11215995aff","createdon":"2024-01-29T14:42:36.467Z","updatedon":"2024-01-29T14:42:36.467Z"},"$$id":"613b7796-87f2-4bc6-8491-f11215995aff","@type":"authentication-token"}

Database backup with error

@lvca
Contributor

lvca commented Jan 29, 2024

Reproduced, thanks!

@lvca lvca added the bug Something isn't working label Jan 29, 2024
@lvca lvca added this to the 24.1.1 milestone Jan 29, 2024
@alan-strickland-red
Author

Now that you have reproduced the issue you might not need this, but I noticed that if I create a database with a full text index and then restart the Docker container, the index type reported in Studio switches from FULL_TEXT to LSM_TREE.

Before the restart

                {
                    "name": "authentication-token[ip,userAction]",
                    "typeName": "authentication-token",
                    "type": "FULL_TEXT",
                    "unique": false,
                    "properties": [
                        "ip",
                        "userAction"
                    ],
                    "automatic": true
                }

After the restart

                {
                    "name": "authentication-token[ip,userAction]",
                    "typeName": "authentication-token",
                    "type": "LSM_TREE",
                    "unique": false,
                    "properties": [
                        "ip",
                        "userAction"
                    ],
                    "automatic": true
                }

@lvca
Contributor

lvca commented Jan 29, 2024

We have zero test cases with composite full text indexes. It was easy to reproduce your issue in a test case. I'll also look into the type issue after the restart.

@lvca
Contributor

lvca commented Jan 30, 2024

Composite indexes are not supported by the full text index: it always creates an underlying LSMTree index with exactly 1 string property:

underlyingIndex = new LSMTreeIndex(database, name, false, filePath, mode, new Type[] { Type.STRING }, pageSize, nullStrategy);

The creation should throw an exception, but instead it completely ignores the passed key types (assuming it's always one string).
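The mismatch described above can be illustrated with a small, self-contained sketch. It is not ArcadeDB's code: `KeyTypeSketch`, `FullTextIndexSketch`, `checkKeyTypes`, and `describeKeys` are hypothetical names made up for this example. The exact code path that raised the exception in 23.12.1 is not shown in this thread, so the positional lookup in `describeKeys` is only an assumption chosen to demonstrate the class of failure: two key values arriving at an index that holds a single key type. The `validate` flag contrasts the silent behaviour with a fail-fast check at creation time.

```java
import java.util.Arrays;

// Hypothetical stand-in for index key types; not ArcadeDB's Type enum.
enum KeyTypeSketch { STRING, INTEGER }

// Hypothetical sketch of a full-text index wrapper; not ArcadeDB's actual class.
class FullTextIndexSketch {
  // The underlying index is always built with exactly one STRING key type,
  // mirroring the constructor call quoted in the comment above.
  private final KeyTypeSketch[] underlyingKeyTypes = new KeyTypeSketch[] { KeyTypeSketch.STRING };

  FullTextIndexSketch(final KeyTypeSketch[] requestedKeyTypes, final boolean validate) {
    if (validate)
      checkKeyTypes(requestedKeyTypes);
    // Without validation, the requested key types are silently ignored.
  }

  // Fail-fast check: reject anything other than a single STRING property.
  static void checkKeyTypes(final KeyTypeSketch[] requestedKeyTypes) {
    if (requestedKeyTypes == null || requestedKeyTypes.length != 1
        || requestedKeyTypes[0] != KeyTypeSketch.STRING)
      throw new IllegalArgumentException(
          "Full-text index supports exactly one STRING property, got: "
              + Arrays.toString(requestedKeyTypes));
  }

  // Assumed failure mode: each key value is paired with the key type at the same position.
  String describeKeys(final Object[] keys) {
    final StringBuilder sb = new StringBuilder();
    for (int i = 0; i < keys.length; i++)
      // Throws ArrayIndexOutOfBoundsException as soon as i >= 1.
      sb.append(underlyingKeyTypes[i]).append('=').append(keys[i]).append(';');
    return sb.toString();
  }
}

class FullTextIndexSketchDemo {
  public static void main(final String[] args) {
    final KeyTypeSketch[] composite = { KeyTypeSketch.STRING, KeyTypeSketch.STRING };

    // Creation succeeds without validation, and the error surfaces later, at insert time.
    final FullTextIndexSketch unchecked = new FullTextIndexSketch(composite, false);
    try {
      unchecked.describeKeys(new Object[] { "::ffff:10.99.2.3", "" });
    } catch (final ArrayIndexOutOfBoundsException e) {
      System.out.println("Insert failed late: " + e);
    }

    // With validation, the unsupported definition is rejected up front.
    try {
      new FullTextIndexSketch(composite, true);
    } catch (final IllegalArgumentException e) {
      System.out.println("Creation rejected early: " + e.getMessage());
    }
  }
}
```

Rejecting the definition at creation time turns a late, hard-to-diagnose insert failure into an immediate error that names the unsupported configuration.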

As for the type showing LSM_TREE: that's because it is an LSM_TREE under the hood, but this should be hidden from the end user, who should see FULL_TEXT instead.
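As a rough illustration of hiding the storage-level type, here is a minimal sketch of a wrapper that reports its own user-facing kind rather than delegating that question to the index it wraps. All names (`IndexKindSketch`, `TypedIndexSketch`, `LsmTreeIndexSketch`, `FullTextTypedIndexSketch`) are hypothetical and do not reproduce ArcadeDB's index classes or its actual fix.

```java
// Hypothetical names throughout; this does not reproduce ArcadeDB's index API.
enum IndexKindSketch { LSM_TREE, FULL_TEXT }

interface TypedIndexSketch {
  IndexKindSketch getKind();
}

// Stand-in for the storage-level index.
class LsmTreeIndexSketch implements TypedIndexSketch {
  @Override
  public IndexKindSketch getKind() {
    return IndexKindSketch.LSM_TREE;
  }
}

// Stand-in for the full-text index that wraps the storage-level index.
class FullTextTypedIndexSketch implements TypedIndexSketch {
  private final TypedIndexSketch underlying = new LsmTreeIndexSketch();

  // Reports the user-facing kind instead of delegating to the wrapped index.
  @Override
  public IndexKindSketch getKind() {
    return IndexKindSketch.FULL_TEXT;
  }

  // Exposed only to show what a plain delegation would have reported.
  IndexKindSketch underlyingKind() {
    return underlying.getKind();
  }
}

class IndexKindSketchDemo {
  public static void main(final String[] args) {
    final FullTextTypedIndexSketch index = new FullTextTypedIndexSketch();
    System.out.println(index.getKind());        // prints FULL_TEXT
    System.out.println(index.underlyingKind()); // prints LSM_TREE
  }
}
```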

What's your use case where you have a composite full text index? Can you have 2 indexes?

@alan-strickland-red
Author

alan-strickland-red commented Jan 30, 2024

I think I originally did have multiple indexes. Then I went through a period of trying to reduce the overall disk/memory footprint of the database and realised that switching to one index reduced the disk usage.

Although maybe my recollection is wrong: if I try to create a type that has multiple full text indexes, I can't query using CONTAINSTEXT (I get an error), so perhaps that's why I changed it.

Multiple Indexes Example

CREATE DOCUMENT TYPE MultipleIndexes IF NOT EXISTS;
CREATE PROPERTY MultipleIndexes.String1 IF NOT EXISTS string;
CREATE PROPERTY MultipleIndexes.String2 IF NOT EXISTS string;
CREATE INDEX ON MultipleIndexes (String1) FULL_TEXT;
CREATE INDEX ON MultipleIndexes (String2) FULL_TEXT;

INSERT INTO MultipleIndexes CONTENT {String1: 'Joe', String2: 'Bloggs'};

SELECT * FROM MultipleIndexes WHERE `String1` CONTAINSTEXT 'Joe';
SELECT * FROM MultipleIndexes WHERE `String2` CONTAINSTEXT 'Bloggs';

Multiple Indexes Query Result

{
    "error": "Internal error",
    "detail": "Cannot execute index query with `String1` CONTAINSTEXT \u0027Joe\u0027",
    "exception": "java.lang.UnsupportedOperationException"
}

Single Index Example

CREATE DOCUMENT TYPE SingleIndex IF NOT EXISTS;
CREATE PROPERTY SingleIndex.String1 string;
CREATE PROPERTY SingleIndex.String2 string;
CREATE INDEX ON SingleIndex (String1, String2) FULL_TEXT;

INSERT INTO SingleIndex CONTENT {String1: 'Joe', String2: 'Bloggs'};

SELECT * FROM SingleIndex WHERE `String1` CONTAINSTEXT 'Joe';
SELECT * FROM SingleIndex WHERE `String2` CONTAINSTEXT 'Bloggs';

Single Index Query Result - before a restart of the server

{
    "user": "root",
    "version": "23.12.1 (build df4e3d56e9063a245b46bca14001e393dc7f8001/1704296563159/main)",
    "serverName": "ArcadeDB_0",
    "result": {
        "vertices": [],
        "edges": [],
        "records": [
            {
                "@rid": "#25:0",
                "@type": "SingleIndex",
                "@cat": "d",
                "String2": "Bloggs",
                "String1": "Joe"
            }
        ]
    },
    "explain": "+ DDL\n  create document type SingleIndex IF NOT EXISTS\n+ DDL\n  CREATE PROPERTY SingleIndex.String1 string\n+ DDL\n  CREATE PROPERTY SingleIndex.String2 string\n+ DDL\n  CREATE INDEX SingleIndex[String1,String2] ON SingleIndex (String1, String2) FULL_TEXT NULL_STRATEGY SKIP\n+ CREATE EMPTY RECORDS\n  1 record\n+ SET TYPE\n  SingleIndex\n+ UPDATE CONTENT\n  {\"String1\": \u0027Joe\u0027, \"String2\": \u0027Bloggs\u0027}\n+ SAVE RECORD\n+ FETCH FROM TYPE SingleIndex\n  + FETCH FROM BUCKET 25 (SingleIndex_0) ASC \u003d 1 RECORDS\n  + FETCH FROM BUCKET 26 (SingleIndex_1) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 27 (SingleIndex_2) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 28 (SingleIndex_3) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 29 (SingleIndex_4) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 30 (SingleIndex_5) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 31 (SingleIndex_6) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 32 (SingleIndex_7) ASC \u003d 0 RECORDS\n+ FILTER ITEMS WHERE \n  `String1` CONTAINSTEXT \u0027Joe\u0027\n+ CALCULATE PROJECTIONS\n  *\n+ FETCH FROM TYPE SingleIndex\n  + FETCH FROM BUCKET 25 (SingleIndex_0) ASC \u003d 1 RECORDS\n  + FETCH FROM BUCKET 26 (SingleIndex_1) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 27 (SingleIndex_2) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 28 (SingleIndex_3) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 29 (SingleIndex_4) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 30 (SingleIndex_5) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 31 (SingleIndex_6) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 32 (SingleIndex_7) ASC \u003d 0 RECORDS\n+ FILTER ITEMS WHERE \n  `String2` CONTAINSTEXT \u0027Bloggs\u0027\n+ CALCULATE PROJECTIONS\n  *"
}

lvca added a commit that referenced this issue Jan 30, 2024
@lvca
Contributor

lvca commented Jan 30, 2024

@alan-strickland-red I just pushed a check that returns an error if you try to create a composite full-text index.

Try to create 2 indexes, one per property.
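The one-index-per-property layout can be pictured with the conceptual model sketched here: each indexed property gets its own map from word to record ids, and a lookup names the property it searches. This is a simplification for illustration, not ArcadeDB's implementation, and it makes no claim about how a given server version answers CONTAINSTEXT. The class name, the method names, and the tokenization rule (lowercase, split on non-alphanumeric characters) are all assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual model only: one token -> record-id map per indexed property.
// Class and method names are hypothetical and do not mirror ArcadeDB's implementation.
class PerPropertyFullTextSketch {
  private final Map<String, Map<String, Set<String>>> indexes = new HashMap<>();

  // Registers a separate single-property index, one per call.
  void createIndex(final String property) {
    indexes.put(property, new HashMap<>());
  }

  // Tokenizes each indexed property value and stores the record id under every token.
  void insert(final String rid, final Map<String, String> record) {
    for (final Map.Entry<String, Map<String, Set<String>>> index : indexes.entrySet()) {
      final String value = record.get(index.getKey());
      if (value == null)
        continue; // properties without a value are skipped
      // Assumed tokenization: lowercase, split on runs of non-alphanumeric characters.
      for (final String token : value.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+"))
        if (!token.isEmpty())
          index.getValue().computeIfAbsent(token, k -> new TreeSet<>()).add(rid);
    }
  }

  // Looks up a single word in the index of one property.
  Set<String> containsText(final String property, final String word) {
    final Map<String, Set<String>> index = indexes.get(property);
    if (index == null)
      throw new IllegalArgumentException("No full-text index on property: " + property);
    return index.getOrDefault(word.toLowerCase(Locale.ROOT), Set.of());
  }
}

class PerPropertyFullTextSketchDemo {
  public static void main(final String[] args) {
    final PerPropertyFullTextSketch idx = new PerPropertyFullTextSketch();
    idx.createIndex("String1");
    idx.createIndex("String2");
    idx.insert("#1:0", Map.of("String1", "Joe", "String2", "Bloggs"));

    System.out.println(idx.containsText("String1", "Joe")); // prints [#1:0]
    System.out.println(idx.containsText("String2", "Joe")); // prints []
  }
}
```

Keeping the maps separate means each index only ever sees one string value per record, which is the shape the single-STRING-key underlying index expects.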

@lvca lvca added the fixed label Jan 30, 2024
@alan-strickland-red
Author

@lvca In the message above I noted that when I create 2 indexes, one per property, I get a different error: java.lang.UnsupportedOperationException.

@alan-strickland-red
Author

@lvca Did you get a chance to look at this?

For now I have removed the full text indexes from my database and am using ILIKE instead, but I would prefer to use full text if I can.

@lvca
Contributor

lvca commented Feb 5, 2024

You can use 2 different full text indexes if you like, one per property. The code has already been changed and now throws an exception if you try to create a full text index with multiple properties.

@lvca lvca closed this as completed Feb 5, 2024
@alan-strickland-red
Author

> You can use 2 different full text indexes if you like, one per property. The code has already been changed and now throws an exception if you try to create a full text index with multiple properties.

If I try to create a type that has 2 full text indexes, I can't query using CONTAINSTEXT; I get an error.

Multiple Indexes Example

CREATE DOCUMENT TYPE MultipleIndexes IF NOT EXISTS;
CREATE PROPERTY MultipleIndexes.String1 IF NOT EXISTS string;
CREATE PROPERTY MultipleIndexes.String2 IF NOT EXISTS string;
CREATE INDEX ON MultipleIndexes (String1) FULL_TEXT;
CREATE INDEX ON MultipleIndexes (String2) FULL_TEXT;

INSERT INTO MultipleIndexes CONTENT {String1: 'Joe', String2: 'Bloggs'};

SELECT * FROM MultipleIndexes WHERE `String1` CONTAINSTEXT 'Joe';
SELECT * FROM MultipleIndexes WHERE `String2` CONTAINSTEXT 'Bloggs';

Multiple Indexes Query Result

{
    "error": "Internal error",
    "detail": "Cannot execute index query with `String1` CONTAINSTEXT \u0027Joe\u0027",
    "exception": "java.lang.UnsupportedOperationException"
}

Single Index Example

Just for reference, a single index with multiple properties does work until the server is restarted.

CREATE DOCUMENT TYPE SingleIndex IF NOT EXISTS;
CREATE PROPERTY SingleIndex.String1 string;
CREATE PROPERTY SingleIndex.String2 string;
CREATE INDEX ON SingleIndex (String1, String2) FULL_TEXT;

INSERT INTO SingleIndex CONTENT {String1: 'Joe', String2: 'Bloggs'};

SELECT * FROM SingleIndex WHERE `String1` CONTAINSTEXT 'Joe';
SELECT * FROM SingleIndex WHERE `String2` CONTAINSTEXT 'Bloggs';

Single Index Query Result - before a restart of the server

{
    "user": "root",
    "version": "23.12.1 (build df4e3d56e9063a245b46bca14001e393dc7f8001/1704296563159/main)",
    "serverName": "ArcadeDB_0",
    "result": {
        "vertices": [],
        "edges": [],
        "records": [
            {
                "@rid": "#25:0",
                "@type": "SingleIndex",
                "@cat": "d",
                "String2": "Bloggs",
                "String1": "Joe"
            }
        ]
    },
    "explain": "+ DDL\n  create document type SingleIndex IF NOT EXISTS\n+ DDL\n  CREATE PROPERTY SingleIndex.String1 string\n+ DDL\n  CREATE PROPERTY SingleIndex.String2 string\n+ DDL\n  CREATE INDEX SingleIndex[String1,String2] ON SingleIndex (String1, String2) FULL_TEXT NULL_STRATEGY SKIP\n+ CREATE EMPTY RECORDS\n  1 record\n+ SET TYPE\n  SingleIndex\n+ UPDATE CONTENT\n  {\"String1\": \u0027Joe\u0027, \"String2\": \u0027Bloggs\u0027}\n+ SAVE RECORD\n+ FETCH FROM TYPE SingleIndex\n  + FETCH FROM BUCKET 25 (SingleIndex_0) ASC \u003d 1 RECORDS\n  + FETCH FROM BUCKET 26 (SingleIndex_1) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 27 (SingleIndex_2) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 28 (SingleIndex_3) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 29 (SingleIndex_4) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 30 (SingleIndex_5) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 31 (SingleIndex_6) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 32 (SingleIndex_7) ASC \u003d 0 RECORDS\n+ FILTER ITEMS WHERE \n  `String1` CONTAINSTEXT \u0027Joe\u0027\n+ CALCULATE PROJECTIONS\n  *\n+ FETCH FROM TYPE SingleIndex\n  + FETCH FROM BUCKET 25 (SingleIndex_0) ASC \u003d 1 RECORDS\n  + FETCH FROM BUCKET 26 (SingleIndex_1) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 27 (SingleIndex_2) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 28 (SingleIndex_3) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 29 (SingleIndex_4) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 30 (SingleIndex_5) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 31 (SingleIndex_6) ASC \u003d 0 RECORDS\n  + FETCH FROM BUCKET 32 (SingleIndex_7) ASC \u003d 0 RECORDS\n+ FILTER ITEMS WHERE \n  `String2` CONTAINSTEXT \u0027Bloggs\u0027\n+ CALCULATE PROJECTIONS\n  *"
}

@lvca
Contributor

lvca commented Feb 5, 2024

Reopening this issue because the reported index type is not correct. It's mostly esthetic, but still an issue.

@lvca lvca reopened this Feb 5, 2024
lvca added a commit that referenced this issue Feb 5, 2024
@lvca
Contributor

lvca commented Feb 5, 2024

Ok, I found a bigger problem: when the database is reopened, the full-text index is not 100% configured correctly. So the bug is more than "esthetic". Fixed now. Thanks for the report @alan-strickland-red

@lvca lvca closed this as completed Feb 5, 2024
lvca added a commit that referenced this issue Feb 8, 2024