Add beginning of RDF file writer. #165

Open. balhoff wants to merge 2 commits into base: master



balhoff commented Jun 27, 2023

I took a stab at implementing an RDF file writer. For now it handles only edges, not nodes; I don't think we want to have duplicate node metadata in different RDF datasets. @EvanDietzMorris I have not actually run this; could you let me know if I'm on the right track, and what else needs to be done to output some Turtle files in the ORION build?

balhoff requested a review from EvanDietzMorris on June 27, 2023 18:09
EvanDietzMorris (Contributor) commented

I'm not sure I understand the issue with nodes; we may want to chat about that. Looks like this is on the right track though.

I think the fastest/cleanest way would be to add a condition here for rdf:

if 'neo4j' in graph_spec.graph_output_format.lower():

It could read from the jsonl nodes and edges files that were produced in the previous merging step as a completed graph (graph_output_dir/NODES_FILENAME and EDGES_FILENAME) and write them out in rdf. It might be nice to just make a file conversion helper like kgx_file_converter.py has for jsonl to csv.
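To make the conversion-helper idea concrete, here is a rough sketch of what a jsonl-to-RDF edge converter might look like. It is not taken from this PR or from kgx_file_converter.py: the function names and the tiny prefix map are hypothetical, it assumes each edge line carries the KGX `subject`, `predicate`, and `object` fields as CURIEs, and it writes N-Triples (which is also valid Turtle) using only the standard library.

```python
import json

# Illustrative prefix map only (an assumption for this sketch); a real build
# would need a complete CURIE-to-IRI mapping for every prefix in the graph.
PREFIXES = {
    "biolink": "https://w3id.org/biolink/vocab/",
    "MONDO": "http://purl.obolibrary.org/obo/MONDO_",
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
}


def expand_curie(curie: str) -> str:
    """Expand a CURIE such as 'MONDO:0005148' into a full IRI."""
    prefix, _, local_id = curie.partition(":")
    if prefix not in PREFIXES:
        raise ValueError(f"No IRI mapping for prefix: {prefix!r}")
    return PREFIXES[prefix] + local_id


def edge_to_ntriple(edge: dict) -> str:
    """Render one KGX edge as a single N-Triples statement (edge properties are ignored)."""
    s, p, o = (expand_curie(edge[key]) for key in ("subject", "predicate", "object"))
    return f"<{s}> <{p}> <{o}> ."


def convert_edges_jsonl_to_ntriples(edges_path, output_path) -> int:
    """Stream a jsonl edges file into an N-Triples file; return the number of edges written."""
    count = 0
    with open(edges_path) as edges_in, open(output_path, "w") as rdf_out:
        for line in edges_in:
            if not line.strip():
                continue  # skip blank lines
            rdf_out.write(edge_to_ntriple(json.loads(line)) + "\n")
            count += 1
    return count
```

Streaming line by line keeps memory use flat for large graphs. Edge properties beyond the core triple are dropped here; carrying them over would need a modeling decision (reification or similar) that this sketch does not attempt.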

Then we could specify rdf as output format for a graph like here:

output_format: jsonl
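As a sketch of how an `rdf` value could be picked up, the helper below mirrors the substring check quoted earlier (`'neo4j' in graph_spec.graph_output_format.lower()`). The function name and the set of known formats are assumptions for illustration; `rdf` is the proposed value, not one the build is known to accept today.

```python
# Hypothetical list of recognized formats; 'rdf' is the proposed addition.
KNOWN_FORMATS = ("jsonl", "neo4j", "rdf")


def requested_output_formats(graph_output_format: str) -> set:
    """Return the known formats named anywhere in the output_format string.

    Uses the same case-insensitive substring test as the existing neo4j check,
    so it does not depend on how multiple formats are separated.
    """
    spec = graph_output_format.lower()
    return {fmt for fmt in KNOWN_FORMATS if fmt in spec}
```

A build step could then run the RDF conversion only when `"rdf" in requested_output_formats(graph_spec.graph_output_format)`.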

This approach has the downside that if rdf is the only output you care about, it's going to merge the sources and write them to kgx jsonl files first for no great reason. We could also incorporate the rdf output further upstream to avoid that but I haven't had time to think about how we might want to do that.

2 participants