-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
558 Multiple disks issue on windows and latest rdflib release. #588
558 Multiple disks issue on windows and latest rdflib release. #588
Conversation
…changes on RDFLib when the user imports it by him/herself. Includes a little test to be performed once manually.
As said in the first commit message, the solution has the drawback of also reflecting the changes on RDFLib when the user imports it by him/herself. I guess this is not an issue since the only change is a bug being fixed. After testing that the solution works, I have removed the assert statements. I will leave the issue as a draft until tomorrow, maybe a better solution comes to my mind in the meantime. |
The previous commit had a mistake, and the problem turned out to be much trickier than expected. Actually I am not even sure that the fix in rdflib actually will solve the problem in newer versions, as location = 'file:///C:\\Users\\user\\Downloads\\ontology.ttl'
urlparse(location)
Note the leading The approach proposed is still valid though, if standard paths, i.e. |
…e ctypes library and a win32 API call. Hopefully fixes the security issue the CI is referring to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a shame I cannot easily test this because I have only a single drive in my dev laptop...
When I wrote my solution idea I was thinking more about calling the serialize always with "file://" no matter if on windows or linux. I am also fine with decorating the serialize method like you did. Why do you need the case distiction on the platform, though? I think calling graph.serialize("file:///home/user/something") should work fine on linux as well?
osp/__init__.py
Outdated
# Call the win32 API to get the volume ID. | ||
windows_func = ctypes.windll.kernel32.GetVolumeNameForVolumeMountPointW | ||
windows_func.argtypes = wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD | ||
lpszVolumeMountPoint = wintypes.LPCWSTR(f'{scheme}:\\') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks a bit complicated for me. What is this? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is the same thing that I though, and if you check the evolution of the code, it's what I intended to do but after trying several times I think that it is not really possible both to just wrap the serialize function and to do it this way.
The problem is the way that urllib is handling the windows paths inside the serialize
function. The actual path needs to be on the path
field of the ParseResult, and it does not happens with the windows path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the ctypes stuff, yeah it looks very complicated also for me, but it was the only way that I found of getting windows paths by volume (which work with urlparse) and also passing the CI. At first I thought of just querying windows itself via the command line, opening a subprocess that returns a the volume ID and capturing the output, but then bandit, the security vulnerability scanner complained about it being potentially insecure (which is true).
Thus, the solution was to query it in some other way. It turns out that python has no easy solution to call the win32 API to get this information. There are modules on pypy but I did not want to have an extra dependency just for this. ctypes on the other hand belongs to the standard library, and lets you call C functions from Python, which is precisely a way to call the win32 API to get the ID of the disk.
location = 'file:///C:\Users\user\Downloads\ontology.ttl' But this isn't a valid windows path? At least I always thought three slashes after file: is only possible linux. Two come from the protocol file:// and the third from the file system root, which is "/". On windows the file system root is C:\ and therefore no third slash. file:///some/linux/path |
I read somewhere that init.py files with |
I'll further investigate what you said about init files. |
After investigating after the init file I could not find major drawbacks. I just found a confirmation of the reason why it works (see section 5.2.1), but nothing significantly useful. So the only remarks that I will make is that the names created in |
Thanks for your effort. I'll test it a bit and then I'll approve! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my side it can be merged. Maybe we should add a unittest for the version comparison, but I'm not sure about that. What do you think?
osp/__init__.py
Outdated
return v < o | ||
|
||
# check remaining numbers of other_version | ||
return all(o <= 0 for o in other_version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since other_version is a iterator (because of map
), here it should only iterate over the items not covered in the loop. If there are no more, all
of an empty sequence is defined to be true.
osp/__init__.py
Outdated
if _osp_urlparse(kwargs['destination']).path.startswith('\\'): # The destination is a windows path. | ||
# Call the win32 API to get the volume ID. | ||
windows_func = _osp_ctypes.windll.kernel32.GetVolumeNameForVolumeMountPointW | ||
windows_func.argtypes = _osp_wintypes.LPCWSTR, _osp_wintypes.LPWSTR, _osp_wintypes.DWORD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any resource describing this. Or did you come up with this on your own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The win32 API documentation for GetVolumeNameForVolumeMountPointW
and the ctypes library documentation. I did know about ctypes, but not about this function in the Windows API, but it was mentioned online as the standard way to get the disk ID.
Yeah I will write a test for this, better safe than sorry. |
@urbanmatthias Now it's ready again. |
Among the solutions proposed in issue #558, I opted for the second. Once any part of osp-core is imported, the function
rdflib.Graph.serialize
is replaced by a decorated version of itself so that it always receivesfile://
at the beginning. I included a small assert statement at the beginning of each file in osp-core using therdflib.Graph.serialize
function and tried importing something from each of these files to check whether the patch is indeed being applied. Seems to be the case.