Repository | Description | Update Schedule | Period |
---|---|---|---|
Google Drive | Latest dataset only | Twice daily at 00:00 and 12:00 | from 2024/06/13 |
Figshare | Past datasets | Daily until 2024/06/06, then monthly | from 2022/12/22 |
GitHub | Past datasets | As needed | from 2019/07/11 until 2022/12/22 |
- Excluded datasets with the data type "calculation" in the descriptor from the sample dataset and curve dataset. As of 2024/07/01 12:00:01 UTC+0900 (JST), there were 346 samples.
- Changed dataset file name prefix from "all" to "starrydata". For example, `all_curves.csv` is now `starrydata_curves.csv`.
- Changed the file extension of the paper dataset from JSON to CSV for availability.
- Reduced the columns in the paper dataset to only those necessary for citation, reducing the file size from 400MB to about 50MB.
- Added `project_names` and `created_at` to the paper dataset.
- The latest datasets are now uploaded to Google Drive.
- Fixed the character corruption that occurred when users opened `all_samples.csv` in certain applications, such as Excel, by adding a BOM (byte order mark).
- The upload schedule to Figshare has been changed from daily to monthly.
- Fixed the incorrect timestamp format in the dataset. For example, corrected "2024-05-17 00:00:01 JST+0900" to "2024-05-17 00:00:01 GMT+0900 (JST)".
- The values in the XY value list were originally strings enclosed in double quotation marks. These quotation marks were removed for easier analysis.
  - e.g. `["299.8597", "324.8683"]` -> `[299.8597, 324.8683]`
- Added `updated_at`, `created_at`, and `composition_details` to `all_samples.csv`.
- The dataset location was changed from this GitHub repository to Figshare.
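
Two of the changes above affect how the CSV files are read programmatically: the BOM added for Excel compatibility, and the XY value list switching from quoted strings to plain numbers. The following is a minimal Python sketch of one way to handle both, not an official loader. The helper names `read_bom_csv` and `parse_xy`, the column names `sample_id` and `x`, and the in-memory sample row are all hypothetical and used only for illustration; check the actual column names in the file you download.

```python
import csv
import io
import json


def read_bom_csv(raw: bytes) -> list[dict]:
    """Decode CSV bytes that may start with a UTF-8 BOM and return rows as dicts."""
    # "utf-8-sig" strips a leading BOM if present and otherwise behaves like UTF-8,
    # so the first column name is not polluted by the BOM character.
    text = raw.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text)))


def parse_xy(cell: str) -> list[float]:
    """Parse an XY value list, accepting both the old quoted and the new unquoted form."""
    # Old form: ["299.8597", "324.8683"]; new form: [299.8597, 324.8683].
    return [float(value) for value in json.loads(cell)]


# Hypothetical in-memory sample standing in for a downloaded file.
# The column names "sample_id" and "x" are assumptions, not the real schema.
sample = '\ufeffsample_id,x\n1,"[299.8597, 324.8683]"\n'.encode("utf-8")

rows = read_bom_csv(sample)
xs = parse_xy(rows[0]["x"])
```

Because `parse_xy` converts every element with `float`, the same code should work on older archives from before the quotation marks were removed as well as on current files.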