Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink design of object storage classes. #33

Open
gadamc opened this issue Oct 9, 2017 · 1 comment
Open

Rethink design of object storage classes. #33

gadamc opened this issue Oct 9, 2017 · 1 comment

Comments

@gadamc
Copy link
Collaborator

gadamc commented Oct 9, 2017

The current maze of if/else statements used to setup the various connections to the different CloudObjectStorage services is not so appealing or future-compatible.

https://github.com/ibm-watson-data-lab/ibmos2spark/blob/master/python/ibmos2spark/osconfig.py#L171

This pattern persists in they python, sparkR and Scala implementations (though not in sparklyR because we don't yet support COS there).

If any more options for COS are added, or services that use the same name, this will get even worse. We should refactor this code. One idea would be to subclass CloudObjectStorage and use a factory pattern to generate appropriate instances. In doing so, we may also be able to bring the 'softlayer' and 'bluemix' objects into that refactor.

Of course, we'll have to be careful about backwards compatibility. One option would be to change the interface to the library so as to avoid breaking anybody's code. For example

import ibmos2spark
conf = ibmos2spark.configure(sc, credentials, config_name, service_type = '', auth_method = '')
data_url = conf.url(bucket_or_container, object_name)
@bassel-zeidan
Copy link
Collaborator

bassel-zeidan commented Oct 9, 2017

I completely agree that it would be good to have some internal conceptual separation inside the class for different COS types. This actually might be very important when we have big chunk of code (logic) for each type. However, currently i think there is no need for that since the logic is quite small. It is just a couple of lines for each type. "At least that is my personal opinion".

However, it is good that you opened it to keep that in mind and please feel free to create PRs to demonstrate and realize your suggestion.

@gadamc gadamc changed the title Refactor CloudObjectStorage Refactor CloudObjectStorage or refactor full library. Oct 10, 2017
@gadamc gadamc changed the title Refactor CloudObjectStorage or refactor full library. Rethink design of object storage classes. Oct 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants