Add ability to execute arbitrary data loading and ETL operations on graph input data #4

Closed
rlratzel opened this issue May 12, 2022 · 1 comment · Fixed by #8

@rlratzel

DGL users are responsible for loading and cleaning the graph input data used to create the DGL graph objects. Since loading the data needs to be done server-side, GaaS clients need to be able to execute calls to load files and clean/reorganize data for creating graph objects.

Options include:

  • Client API that can tell the server to import an arbitrary module, then run functions provided by that module. Users would write the necessary ETL code as a Python module that can be loaded and run.
  • Client API that allows users to send UDFs over in the form of code strings, which can then be eval'd server-side to read and update data.

Both of the above can be scoped specifically to graph creation in order to simplify the implementation. However, the ability to run arbitrary code could be generally useful for other things.
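
For discussion, here is a rough sketch of what the server-side half of each option might look like. The function names `call_module_function` and `run_udf_string` are hypothetical and not part of any existing GaaS API; the sketch only assumes the module is already importable on the server and that the code string defines the function to call.

```python
import importlib


def call_module_function(module_name, func_name, *args, **kwargs):
    """Option 1 (hypothetical helper): import a module by name on the
    server, then call one of its functions with the given arguments."""
    module = importlib.import_module(module_name)
    func = getattr(module, func_name)
    if not callable(func):
        raise TypeError(f"{module_name}.{func_name} is not callable")
    return func(*args, **kwargs)


def run_udf_string(source, func_name, *args, **kwargs):
    """Option 2 (hypothetical helper): execute a code string sent by the
    client, then call the function it defines."""
    namespace = {}
    # exec (rather than eval) is needed because the string contains a
    # function definition, not a single expression.
    exec(source, namespace)
    return namespace[func_name](*args, **kwargs)


# Example UDF a client might send as a string.
udf_source = (
    "def drop_missing(rows):\n"
    "    return [r for r in rows if r is not None]\n"
)
cleaned = run_udf_string(udf_source, "drop_missing", [1, None, 2])
```

Either helper runs client-chosen code with the server's privileges, so scoping it to graph creation would also limit what needs to be trusted.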

@rlratzel rlratzel added improvement Improves an existing functionality non-breaking Introduces a non-breaking change python labels May 12, 2022
@rlratzel rlratzel added this to the 22.06 milestone May 12, 2022
@rlratzel rlratzel self-assigned this May 12, 2022
@rlratzel

@BradReesWork also suggested the following in #7:

Suggestion: One of the start-up arguments should be a reference to a Python file that contains the data loading process.
The file must contain a function: def load_data() -> PG:
import sys
loader = __import__(sys.argv[1].replace('.py', ''))
PG = loader.load_data()
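
A slightly more defensive version of the same idea, as a sketch only: it also handles a loader file that lives outside the current directory and checks that `load_data` exists before calling it. The name `load_graph_from_file` is hypothetical, and the return value is simply whatever the user's `load_data()` returns (the object called PG above).

```python
import importlib
import sys
from pathlib import Path


def load_graph_from_file(loader_file):
    """Import the given Python file and return the result of its load_data()."""
    path = Path(loader_file).resolve()
    # Temporarily make the file's directory importable so the module can be
    # imported by its bare name (the file name without ".py").
    sys.path.insert(0, str(path.parent))
    try:
        loader = importlib.import_module(path.stem)
    finally:
        sys.path.pop(0)
    if not callable(getattr(loader, "load_data", None)):
        raise AttributeError(f"{path.name} must define a load_data() function")
    return loader.load_data()
```

At server start-up this could then be used as `PG = load_graph_from_file(sys.argv[1])`.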
