Move sample data to a repository #62
Comments
I believe the data that we use in the example notebook is now contained within the Were you downloading through GitHub Desktop, by the way? I have found that to be very slow for some reason that is not directly related to repo size. It sometimes takes a long time even on a fast connection. |
The example notebook now only uses the data from the tests directory, which contains 8.7 MB of data. Of that, 4.9 MB is a ctf file, which we should maybe make smaller as it's not used in the example notebook. I will delete the example data directory in develop (I thought I had already done this tbh), which will cut out 36 MB (60-70% of the total size). It would be great to work towards having a library of example datasets, defined with consistent filenames and formats, to automatically pull into a notebook. |
This is still an issue. Cloning downloads 321.29 MiB of data. What's being downloaded? Does cloning include the whole history? Any ideas @merrygoat ? |
Yes, the hidden .git folder has all of the historical diffs - you should be able to check this by doing a shallow clone:
Where depth is the number of commits to fetch. You can use |
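For anyone who wants to script the check, here is a minimal sketch of a shallow clone driven from Python. It only wraps the standard `git clone --depth` option. The function names and the repository URL in the usage note are placeholders, not something taken from this project.

```python
import subprocess


def shallow_clone_command(repo_url, depth=1, dest=None):
    """Build the argument list for `git clone --depth <depth> <repo_url> [dest]`.

    Hypothetical helper: `repo_url` and `dest` are supplied by the caller.
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    cmd = ["git", "clone", "--depth", str(depth), repo_url]
    if dest is not None:
        cmd.append(dest)
    return cmd


def shallow_clone(repo_url, depth=1, dest=None):
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(repo_url, depth, dest), check=True)
```

Calling `shallow_clone("https://example.com/some/repo.git", depth=1)` (placeholder URL) fetches only the most recent commit, so comparing its download size with that of a full clone shows how much of the total is history.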
Publishing on PyPI is a better approach than fiddling with the git history, isn't it? |
I didn't think of it like that, but yes certainly. |
I had a look at finding big files and deleting them from the history. I found a decent guide (https://web.archive.org/web/20190207210108/http://stevelorek.com/how-to-shrink-a-git-repository.html) but it scares me. I will publish to PyPI for now; it's daft that I haven't done that yet. |
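Finding the big files is the harmless half of that job, since it only reads the history. As a rough sketch (not the method from the linked guide), the following lists the largest blobs in a repository by piping `git rev-list --objects --all` into `git cat-file --batch-check`. The function names are made up for illustration, and it assumes `git` is on the PATH.

```python
import subprocess


def parse_batch_check(output):
    """Parse 'type sha size path' lines; return (size, path) for blobs, largest first."""
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) == 4 and parts[0] == "blob":
            blobs.append((int(parts[2]), parts[3]))
    return sorted(blobs, reverse=True)


def largest_blobs(repo_dir=".", top=10):
    """Return the `top` largest blobs anywhere in the history of `repo_dir`."""
    # Every object reachable from any ref, as "sha path" lines.
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # Ask git for each object's type and size; %(rest) echoes the path back.
    sizes = subprocess.run(
        ["git", "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        cwd=repo_dir, input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(sizes)[:top]
```

This changes nothing in the repository. It only reports which paths account for the size, which is useful to know before deciding whether rewriting history is worth the risk.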
I installed DefDap on a different computer today and it took so long to download everything, primarily because the example data (which is needed) is relatively large. Could I suggest we move it to a repository such as Zenodo, and then have a command to download it in the example notebook?
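To make the suggestion concrete, a download command for the notebook could look something like the sketch below. It uses only the Python standard library. The function name, the URL and the checksum are all placeholders, since no data has been deposited anywhere yet.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def fetch_example_data(url, dest, sha256=None):
    """Download `url` to `dest` unless it is already there; optionally verify SHA-256.

    Hypothetical helper: `url` would point at wherever the data ends up hosted.
    """
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete file.
        tmp = dest.with_name(dest.name + ".part")
        with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(dest)
    if sha256 is not None:
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        if digest != sha256:
            raise ValueError(f"checksum mismatch for {dest}")
    return dest
```

A notebook cell would then call something like `fetch_example_data("https://example.org/example_data.zip", "example_data/example_data.zip")` (placeholder URL) before loading anything. The data is downloaded once and reused on later runs.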