Load function not working #4

Open
weidinger-c opened this issue Jul 12, 2024 · 4 comments


weidinger-c commented Jul 12, 2024

Hi, I wanted to try out your save and load functions, but got this error with your load function:

save("/media/w/numpy_array_fast.npy", numpy_array)
load("/media/w/numpy_array_fast.npy")
Exception has occurred: ValueError
invalid literal for int() with base 10: '                                             '
  File "/workspaces/tile_db/tiledb/tiledb.py", line 29, in <genexpr>
    shape = tuple(int(num) for num in str(header[60:120], 'utf-8').replace(', }', '').replace('(', '').replace(')', '').split(','))
  File "/workspaces/tile_db/tiledb/tiledb.py", line 29, in load
    shape = tuple(int(num) for num in str(header[60:120], 'utf-8').replace(', }', '').replace('(', '').replace(')', '').split(','))
  File "/workspaces/tile_db/tiledb/tiledb.py", line 332, in import_las_file
    load("/media/w/numpy_array_fast.npy")
  File "/workspaces/tile_db/samples/create_db.py", line 25, in <module>
    db_accessor.import_las_file(las_filepath)
ValueError: invalid literal for int() with base 10: '      

I am using numpy 2.0.0, if that helps.

Thanks.


mahynski commented Aug 2, 2024

I just submitted a PR that fixed this error for me. If you don't want to wait, just replace line 26 with:

shape = tuple(int(num) for num in str(header[60:120], 'utf-8').strip().replace(', }', '').replace('(', '').replace(')', '').split(',') if num != '')
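
A minimal, standalone sketch of what that one-liner does, assuming the header slice for a 1-D array looks like "(10,), }" followed by space padding. The helper name `parse_shape` and the sample strings are hypothetical and not part of the library. Without the strip, the text after the trailing comma is a run of spaces, which is exactly what `int()` rejects in the traceback above.

```python
def parse_shape(field: str) -> tuple[int, ...]:
    """Parse a shape from header text such as "(10,), }" plus padding.

    `field` stands in for the decoded header slice; the exact slice
    contents are an assumption made for illustration.
    """
    # Drop the padding first, then the dict terminator and the parentheses.
    cleaned = field.strip().replace(", }", "").replace("(", "").replace(")", "")
    # A 1-D shape leaves a trailing comma, so skip empty tokens.
    return tuple(int(tok) for tok in cleaned.split(",") if tok.strip())


padded_1d = "(10,), }" + " " * 45   # 1-D shape followed by header padding
padded_2d = "(3, 4), }" + " " * 44  # 2-D shape followed by header padding
print(parse_shape(padded_1d))  # (10,)
print(parse_shape(padded_2d))  # (3, 4)
```

Filtering on `tok.strip()` rather than `num != ''` also tolerates whitespace-only tokens, so the result does not depend on the strip having removed every pad character.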

Author

weidinger-c commented Aug 5, 2024

Thanks, I tried your code, but unfortunately the loaded array is not the same as the original one. Could this be due to using "structured arrays" with different data types?

Here is a short snippet showing how I create my random test data:
image


mahynski commented Aug 5, 2024

Yes, that appears to be the case. I tried this out, and the issue comes from this line in the load() function:

descr = str(header[19:25], "utf-8").replace("'", "").replace(" ", "")

This logic extracts the array's type from a fixed slice of the header, which works when the type is something simple but not in your case. A more robust approach is needed there, akin to "numpy.lib.format.read_array_header_2_0". Also, the total size of the data is calculated later with

datasize = np.lib.format.descr_to_dtype(descr).itemsize

which doesn't seem to work when you have a structured array since it cannot easily parse out the number and size of the different types in your array. I do not see a simple fix, but it should be possible. For now, I think the code should work if you create 4 separate arrays and save them individually. Not as elegant, unfortunately.
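
One possible direction for the more robust header parsing mentioned above, as a sketch only: read the header length from the file instead of slicing fixed byte offsets, then evaluate the header dict as a Python literal. This uses only the standard library and follows the documented .npy layout (magic string, two version bytes, little-endian header length, then a dict literal). The function name `read_npy_header` and the hand-built sample header are hypothetical and not part of this project.

```python
import ast
import io
import struct

MAGIC = b"\x93NUMPY"


def read_npy_header(fp):
    """Parse the header of an open binary .npy stream.

    Returns (descr, fortran_order, shape) and leaves `fp` positioned at
    the first byte of array data.
    """
    if fp.read(6) != MAGIC:
        raise ValueError("not a .npy file")
    major, _minor = fp.read(2)
    if major == 1:
        (header_len,) = struct.unpack("<H", fp.read(2))  # 2-byte length
    elif major in (2, 3):
        (header_len,) = struct.unpack("<I", fp.read(4))  # 4-byte length
    else:
        raise ValueError(f"unsupported .npy version: {major}")
    encoding = "utf-8" if major == 3 else "latin1"
    # The header is a dict literal padded with spaces and a newline.
    header = ast.literal_eval(fp.read(header_len).decode(encoding))
    return header["descr"], header["fortran_order"], header["shape"]


# Hand-built version 1.0 header for a made-up structured array with two fields.
text = "{'descr': [('x', '<f8'), ('label', '|u1')], 'fortran_order': False, 'shape': (10,), }"
payload = (text + " " * 20 + "\n").encode("latin1")
stream = io.BytesIO(MAGIC + bytes([1, 0]) + struct.pack("<H", len(payload)) + payload)
descr, fortran_order, shape = read_npy_header(stream)
print(descr, fortran_order, shape)
```

For a structured array, `descr` comes back as a list of field tuples rather than a single string. As far as I know, that list can be passed to np.lib.format.descr_to_dtype, and the resulting dtype's itemsize then covers all fields, but I have not verified this against the load() function here.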

Author

Thanks for the reply. I guess I'll create a new issue for support of structured arrays. But it seems from your reply that this is not as simple as one would guess...
