-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create generalised extract_data
function
#117
Conversation
Write new extract_data function that works across all columns for all datasets. Still need to write tests for it.
Move function `bind_databases` from traits.build (where it is `build_combine`. And also the helper function `util_df_to_list`.
* rename `database_create_combined_table` to `flatten_database` * update documentation * it is intentional that the two util functions are exported, but can be discussed
*fix bug in `flatten_database` - contributors cannot return many columns * rename file
- Wasn't separating context property `link_vals` that were concatenated on a single line
A start on tests for the new extract function. - Probably need a few more tests to make sure complicated context searches give correct output
…raitecoevo/austraits into generalise_extract_off_develop
continued edits to tests undercovered a problem when using multiple "extracts" in sequence... still working on that
…raitecoevo/austraits into generalise_extract_off_develop
Specific context tables might be empty (null) if a series of extract functions are used in tandem. Was fixing tests for austraits_lite5.0.0 and came across another issue... Unlike the old extract functions the new one doesn't work with vectors of values - line 123 in `extract_data`
The tests to do with extract_data are generally working, but there is a weird, oscillating bug with the order of tables in the database list that I can't work out. And also there is a test on line 198 (expect_equal(trait_subset$traits %>% dplyr::distinct(dataset_id) %>% nrow(), 5)) alternates between giving 5 & 8 as output. But bugs fixed: - works with empty tibbles - works with vectors of values to match
series of small errors causing tests to fail: - needed to reorder tables in some of the austraits_lite versions - recreated the austraits_lite for versions 3 & 4 that are empty, other than having the structure required to test compatibility - plot_data needed an `as.numeric` - the function `summarise_trait_means` was still being tests on v3.0 and indeed looking at the code, isn't going to summarise, because it is only searching for replicates within a single observation_id
The function `join_context_properties` was not being exported
…raitecoevo/austraits into generalise_extract_off_develop
…dding links in roxygen
…raitecoevo/austraits into generalise_extract_off_develop
…raitecoevo/austraits into generalise_extract_off_develop
…raitecoevo/austraits into generalise_extract_off_develop
Write tests for bind_databases (issue #114)
…raitecoevo/austraits into generalise_extract_off_develop
…raitecoevo/austraits into generalise_extract_off_develop
changed `plant_trait_name` to `trait_name` in the function `plot_trait_distribution_beeswarm` to generalise it to all traits.build databases. Towards Issue #111
…raitecoevo/austraits into generalise_extract_off_develop
…raitecoevo/austraits into generalise_extract_off_develop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work Lizzy! The extract_data function is a beast but one that we've tamed :D
It bugs me I don't understand how he back tick works but that is for another day!
Well done!!
- rename all with `convert_` instead of `util_` - add `convert_list_to_df1` from traits.build, so all 3 functions in one location
- implemented or tried to implement all suggestions in review - for things that didn't work I've included the suggestion but commented it out - I suspect many are minor syntax fixes that i am not working out.
- remove naming of individual databases from `bind_databases`. Once we use this version of the functions in traits.build we will have to add in the creation of the `dataset_id` column in `taxonomic_updates` from elsewhere. That is the only use of the database name and it introduces an error when use in generic situations - added lots of loops, more compact coding to `extract_data`
extract_data
function that makes it possible to subset a traits.build database using any column in any table.- The function works with single values or vectors of values
- Does not return an error for empty searches
- For context property values, searches across all context property categories simultaneously
bind_datasets
)- There is code in this function to rename the individual datasets before they are bound together - it is not clear if this is necessary, because the output is identical with and without this step, but it is being maintained for now, but is a bit cumbersome since databases to be bound aren't always datasets with a dataset_id. (Tried building all of austraits.build with and without this too and identical output.)