Escape entity details queries #793
Conversation
packages/graph-explorer/src/modules/GraphViewer/exportedGraph.ts
.map(trimIfString)
.filter(isNotEmptyIfString)
.filter(isNotMaliciousIfSparql(connection.queryEngine))
.map(escapeIfPropertyGraphAndString(connection.queryEngine))
Nit: Does any escaping need to be done for SPARQL IRIs in case there are special characters? Or does it rely on the `_sparqlFetch`, which calls `encodeURIComponent`?
I am choosing not to do any manual encoding of the string. So if there are special characters in the IRI then they will be passed to the database and rejected there. The main worry is that a string would escape out of the bounds of an IRI value in the query:
SELECT ?s ?p ?o
WHERE {
<${subject}> ?p ?o
}
The `subject` between the `<` and `>` is the IRI. So as long as the IRI string does not contain a `>`, the database will receive the string as an IRI and reject any invalid IRIs.
I could not find a way to ensure the IRI string is properly encoded. If I were to encode the IRI myself, then the encoding function would re-encode any already properly encoded characters, rendering the value invalid. So I'm taking the path of least resistance and checking specifically for the one character that will certainly allow an injection attack.
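To make that concrete, here is a minimal sketch of what a guard like `isNotMaliciousIfSparql` could look like. The helper name comes from the diff above, but the body, the exact query engine value, and the usage line are assumptions, not the PR's actual implementation.

// Hypothetical sketch only: reject SPARQL ID values containing ">",
// which would let the value break out of the <...> delimiters in the query.
const isNotMaliciousIfSparql =
  (queryEngine: string) =>
  (value: unknown): boolean => {
    // Only SPARQL wraps IDs in <...>; other engines pass through unchanged.
    if (queryEngine !== "sparql" || typeof value !== "string") {
      return true;
    }
    return !value.includes(">");
  };

// Assumed usage, mirroring the pipeline above:
// ids.filter(isNotMaliciousIfSparql(connection.queryEngine))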
LGTM, just a couple of questions open for discussion, but non-blocking.
edges: z
  .array(z.union([z.string(), z.number()]))
  .transform(ids => ids.map(id => createEdgeId(id))),
vertices: z.array(z.union([z.string(), z.number()])),
Nit: have you explored the idea of moving the invalid id filtering logic into zod using `refine` and `z.string().url`? I'm guessing this would cause the file to be rejected if it contained anything strange, which would be different from the current logic which skips over invalid ids (maybe that would be a good thing?).
ChatGPT suggested this for conditional validation of the ids to be valid URLs depending on the `queryEngine`:
vertices: z.array(z.union([z.string(), z.number()]))
.refine((vertices, ctx) => {
const { queryEngine } = ctx.parent.connection;
if (queryEngine === "SPARQL") {
// Ensure that vertices are URLs if queryEngine is SPARQL
if (!vertices.every(v => typeof v === "string" && urlValidator.safeParse(v).success)) {
ctx.addIssue({
code: z.ZodIssueCode.custom,
message: "All vertices must be valid URLs when queryEngine is SPARQL.",
path: ["vertices"],
});
return false;
}
}
return true;
}),
First off, TIL I could do this in Zod (I was looking for this functionality):
const { queryEngine } = ctx.parent.connection;
I did consider using Zod for this validation, but I decided against it as there were too many validations that were connection dependent. So I went with an approach that treats the Zod schema as strictly the shape of the file, and then a separate validation and mapping step to convert the data to the shape expected by Graph Explorer.
I might change my mind about this when I explore adding graph data into the URL via query params. The shape of the data will be fairly similar to the shape of the file, so I think it would be a good fit. And I'm curious how either approach will scale as we add future versions of the export file format and must maintain backwards compatibility.
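As a rough illustration of that split (the names and exact shapes here are assumptions, not the PR's actual code), the Zod schema captures only the shape of the file, while the connection-dependent checks and mapping happen in a separate step that can skip over invalid ids:

import { z } from "zod";

// Hypothetical sketch: the schema describes only the shape of the exported file.
const exportedGraphSchema = z.object({
  vertices: z.array(z.union([z.string(), z.number()])),
  edges: z.array(z.union([z.string(), z.number()])),
});

type ExportedGraph = z.infer<typeof exportedGraphSchema>;

// Connection-dependent validation happens separately, so the same schema
// works for any query engine and invalid ids are simply filtered out.
function validateForConnection(data: ExportedGraph, queryEngine: string): ExportedGraph {
  if (queryEngine !== "sparql") {
    return data;
  }
  return {
    vertices: data.vertices.filter(
      v => typeof v === "string" && z.string().url().safeParse(v).success
    ),
    edges: data.edges,
  };
}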
I'm going to go ahead and merge this since I don't think any of the open discussions are risky enough to hold it back. However, I would like to continue the discussions after merging, which may result in changes in future PRs.
Description
This change helps reduce the chance of an injection attack through the saved graph file. In the process of importing the graph file we now do some basic sanity checks on the vertex and edge IDs, including checking that no `>` characters exist in SPARQL ID values.

Other changes
- Removed the `idType` that was no longer necessary

Validation
Related Issues
Check List
- This contribution is made under the terms of the project's license.
- Ran `pnpm checks` to ensure code compiles and meets standards.
- Ran `pnpm test` to check if all tests are passing.
- Updated `Changelog.md`.