Updating LOIO visualization #47

Merged
merged 11 commits into from
Feb 6, 2024
348 changes: 197 additions & 151 deletions 3.evaluate_model/LOIO_evaluation.ipynb

Large diffs are not rendered by default.

13,181 changes: 13,181 additions & 0 deletions 3.evaluate_model/evaluations/LOIO_probas/LOIO_summary_ranks_allfeaturespaces.tsv

Large diffs are not rendered by default.


55 changes: 38 additions & 17 deletions 3.evaluate_model/scripts/nbconverted/LOIO_evaluation.py
@@ -59,10 +59,10 @@ def compute_avg_rank_and_pvalue(grouped_df):

 # Set I/O
 proba_dir = pathlib.Path("evaluations", "LOIO_probas")
-loio_file = pathlib.Path(proba_dir, "compiled_LOIO_probabilites_withshuffled.tsv")
+loio_file = pathlib.Path(proba_dir, "compiled_LOIO_probabilites.tsv")

-output_summary_file = pathlib.Path(proba_dir, "LOIO_summary_ranks_withshuffled.tsv")
-output_summary_phenotype_file = pathlib.Path(proba_dir, "LOIO_summary_ranks_perphenotype_withshuffled.tsv")
+output_summary_file = pathlib.Path(proba_dir, "LOIO_summary_ranks_allfeaturespaces.tsv")
+output_summary_phenotype_file = pathlib.Path(proba_dir, "LOIO_summary_ranks_perphenotype_allfeaturespaces.tsv")


 # In[4]:
@@ -92,21 +92,40 @@ def compute_avg_rank_and_pvalue(grouped_df):


 # Calculate average rank for each Metadata_DNA
+rank_groups = [
+    "Metadata_DNA",
+    "Model_Type",
+    "Mitocheck_Phenotypic_Class",
+    "Model_Feature_Type",
+    "Model_Balance_Type"
+]
+
+# Output data columns
+output_data_columns = [
+    "Average_Rank",
+    "Average_P_Value",
+    "Min_IQR_Rank",
+    "Max_IQR_Rank",
+    "Min_IQR_P_Value",
+    "Max_IQR_P_Value",
+    "Count"
+]
+
 avg_ranks = (
-    loio_df.groupby(["Metadata_DNA", "Model_type", "Mitocheck_Phenotypic_Class", "Model_Feature_Type"])
+    loio_df.groupby(rank_groups)
     .apply(compute_avg_rank_and_pvalue)
     .reset_index()
 )

-avg_ranks.columns = ["Metadata_DNA", "Model_type", "Mitocheck_Phenotypic_Class", "Model_Feature_Type", "Average_Scores"]
+avg_ranks.columns = rank_groups + ["Average_Scores"]

 loio_scores_df = (
     pd.concat([
         avg_ranks.drop(columns="Average_Scores"),
-        pd.DataFrame(avg_ranks.Average_Scores.tolist(), columns=[
-            "Average_Rank", "Average_P_Value", "Min_IQR_Rank", "Max_IQR_Rank",
-            "Min_IQR_P_Value", "Max_IQR_P_Value", "Count"
-        ])
+        pd.DataFrame(
+            avg_ranks.Average_Scores.tolist(),
+            columns=output_data_columns
+        )
     ], axis="columns")
 )

@@ -116,7 +135,7 @@ def compute_avg_rank_and_pvalue(grouped_df):
 loio_scores_df.head()


-# ## Get average ranks and p value of correct prediction
+# ## Get average ranks and p value per phenotype
 #
 # - Per model type (final vs. shuffled)
 # - Per Phenotype
@@ -127,22 +146,24 @@ def compute_avg_rank_and_pvalue(grouped_df):
 # In[7]:


-# Calculate average rank for each Metadata_DNA
+# Calculate average rank for each phenotype
+rank_groups.remove("Metadata_DNA") # Remove the per image to group on
+
 avg_ranks = (
-    loio_df.groupby(["Model_type", "Mitocheck_Phenotypic_Class", "Model_Feature_Type"])
+    loio_df.groupby(rank_groups)
     .apply(compute_avg_rank_and_pvalue)
     .reset_index()
 )

-avg_ranks.columns = ["Model_type", "Mitocheck_Phenotypic_Class", "Model_Feature_Type", "Average_Scores"]
+avg_ranks.columns = rank_groups + ["Average_Scores"]

 loio_scores_df = (
     pd.concat([
         avg_ranks.drop(columns="Average_Scores"),
-        pd.DataFrame(avg_ranks.Average_Scores.tolist(), columns=[
-            "Average_Rank", "Average_P_Value", "Min_IQR_Rank", "Max_IQR_Rank",
-            "Min_IQR_P_Value", "Max_IQR_P_Value", "Count"
-        ])
+        pd.DataFrame(
+            avg_ranks.Average_Scores.tolist(),
+            columns=output_data_columns
+        )
     ], axis="columns")
 )
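
For readers following the change in this script, the group-then-expand pattern can be reproduced on toy data. The sketch below is only an illustration under stated assumptions: `summarize_group` is a hypothetical stand-in for `compute_avg_rank_and_pvalue` (whose body is not shown in this diff), the `Rank` and `P_Value` column names are invented, and only three of the seven output columns are produced. Unlike the script, it selects the value columns before `.apply`, so the helper never touches the grouping keys.

```python
import pandas as pd


def summarize_group(group_df):
    # Hypothetical stand-in for compute_avg_rank_and_pvalue:
    # returns one tuple of summary values per group.
    return (
        group_df["Rank"].mean(),
        group_df["P_Value"].mean(),
        len(group_df),
    )


# Toy data; "Rank" and "P_Value" are assumed column names.
toy_df = pd.DataFrame({
    "Metadata_DNA": ["img1", "img1", "img2", "img2"],
    "Model_Type": ["final", "final", "final", "final"],
    "Rank": [1, 3, 2, 4],
    "P_Value": [0.25, 0.75, 0.125, 0.375],
})

# Shared lists keep the groupby keys and the column labels in sync
group_cols = ["Metadata_DNA", "Model_Type"]
score_cols = ["Average_Rank", "Average_P_Value", "Count"]

# One tuple per group, stored in a single unnamed column after reset_index()
avg = (
    toy_df.groupby(group_cols)[["Rank", "P_Value"]]
    .apply(summarize_group)
    .reset_index()
)
avg.columns = group_cols + ["Average_Scores"]

# Expand the tuples into one column per summary statistic
scores_df = pd.concat([
    avg.drop(columns="Average_Scores"),
    pd.DataFrame(avg["Average_Scores"].tolist(), columns=score_cols),
], axis="columns")
```

Defining the key list once, as this PR does with `rank_groups` and `output_data_columns`, means a new grouping column such as `Model_Balance_Type` only has to be added in one place.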

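
One possible follow-up on the per-phenotype cell: `list.remove` mutates `rank_groups` in place, so re-running that notebook cell without re-running the cell that defines the list raises `ValueError`, and re-running the per-image cell afterwards would silently group without `Metadata_DNA`. A minimal sketch of a non-mutating alternative is below; the name `phenotype_groups` is hypothetical and does not appear in the PR.

```python
rank_groups = [
    "Metadata_DNA",
    "Model_Type",
    "Mitocheck_Phenotypic_Class",
    "Model_Feature_Type",
    "Model_Balance_Type",
]

# Build a new list instead of calling rank_groups.remove("Metadata_DNA"),
# so the per-image keys stay intact if cells are re-run out of order.
phenotype_groups = [col for col in rank_groups if col != "Metadata_DNA"]
```

The per-phenotype `groupby` and the column assignment would then use `phenotype_groups` in place of `rank_groups`.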
608 changes: 376 additions & 232 deletions 7.figures/Figure4_LOIO_analysis.ipynb

Large diffs are not rendered by default.

Binary file modified 7.figures/figures/main_figure_4_loio.png
Member

This figure looks so much better and tells the story in a cleaner, more concise manner!

I only have one question:

Are these figures showing the result from the balanced or unbalanced model, with or without IC, and is this with all nuclei features (not a subset for AreaShape)?

This will probably be stated in the figure legend, but wanted to ask since I am curious what model I am looking at.

Member Author

Great questions! This is our standard model: balanced, with IC, and all features. I added clarification in 8618141.

Going to merge now, thanks again!
