Incorrect Decimal Precision in StatisticalResult.print_summary (No Style) #1638

Open
sbwiecko opened this issue Sep 20, 2024 · 2 comments

@sbwiecko
Contributor

The print_summary method in the lifelines.statistics.StatisticalResult class did not handle the decimals argument correctly when no explicit style was provided. The output table was displayed with the default precision of 2 decimals, even when a different value was specified.

  • Problem: the value of the 'decimals' argument was not propagated properly within the class.
  • Solution: set self.decimals so that the value is shared by all the methods of the class.
  • Impact: the requested decimal precision is respected in the output table, even when no style is specified.
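
A minimal sketch of how this kind of bug can arise, and how storing the value on the instance avoids it. The class ToyResult and its methods are hypothetical stand-ins written for illustration; this is not the lifelines source. It assumes the no-style path renders through a hook that takes no arguments, so that path can only see state stored on the instance:

```python
class ToyResult:
    """Hypothetical stand-in for a result object; not the lifelines implementation."""

    def __init__(self, test_statistic, p):
        self.test_statistic = test_statistic
        self.p = p
        self.decimals = 2  # default precision

    def _format(self, decimals):
        # Render both values with a fixed number of decimals.
        return f"{self.test_statistic:.{decimals}f} {self.p:.{decimals}f}"

    def _repr_default(self):
        # Argument-free rendering hook: it only sees instance state.
        return self._format(self.decimals)

    def summary_buggy(self, decimals=2, style=None):
        if style is not None:
            return self._format(decimals)  # explicit style: argument is used
        return self._repr_default()        # no style: argument is silently dropped

    def summary_fixed(self, decimals=2, style=None):
        self.decimals = decimals           # store the value so every method sees it
        if style is not None:
            return self._format(decimals)
        return self._repr_default()


result = ToyResult(test_statistic=3.837570, p=0.050116)
print(result.summary_buggy(decimals=4))                 # default precision is used
print(result.summary_buggy(decimals=4, style="ascii"))  # requested precision is used
print(result.summary_fixed(decimals=4))                 # requested precision is used
```

One consequence of this design in the sketch is that the instance remembers the last precision requested, so later argument-free renderings reuse it.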

For example, the issue can be reproduced with the results of logrank_test by calling print_summary without an explicit 'style' argument. The resulting table is incorrect: it shows a precision of 2 decimals (the default) instead of 10:

from lifelines import statistics as stats
from lifelines.datasets import load_rossi

rossi = load_rossi()

results = stats.logrank_test(
    durations_A=rossi.loc[rossi['fin']==0, 'week'],
    durations_B=rossi.loc[rossi['fin']==1,'week'],
    event_observed_A=rossi.loc[rossi['fin']==0, 'arrest'],
    event_observed_B=rossi.loc[rossi['fin']==1,'arrest'],
)

results.print_summary(decimals=10)
<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <tbody>
    <tr>
      <th>t_0</th>
      <td>-1</td>
    </tr>
    <tr>
      <th>null_distribution</th>
      <td>chi squared</td>
    </tr>
    <tr>
      <th>degrees_of_freedom</th>
      <td>1</td>
    </tr>
    <tr>
      <th>test_name</th>
      <td>logrank_test</td>
    </tr>
  </tbody>
</table>
</div><table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>test_statistic</th>
      <th>p</th>
      <th>-log2(p)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>3.84</td>
      <td>0.05</td>
      <td>4.32</td>
    </tr>
  </tbody>
</table>

This happened only when no explicit 'style' was provided; the following calls worked as expected:

results.print_summary(style='html', decimals=5)
...
  <tbody>
    <tr>
      <th>0</th>
      <td>3.83757</td>
      <td>0.05012</td>
      <td>4.31858</td>
    </tr>
  </tbody>

results.print_summary(style='ascii', decimals=4)
<lifelines.StatisticalResult: logrank_test>
               t_0 = -1
 null_distribution = chi squared
degrees_of_freedom = 1
         test_name = logrank_test

---
 test_statistic      p  -log2(p)
         3.8376 0.0501    4.3186

results.print_summary(style='latex', decimals=6)
\begin{tabular}{lrrr}
 & test_statistic & p & -log2(p) \\
0 & 3.837570 & 0.050116 & 4.318582 \\
\end{tabular}
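
As a quick consistency check, the three styled outputs above agree with one another and with the 2-decimal table. The sketch below formats the 6-decimal values from the LaTeX output at each precision; it uses only Python string formatting and does not call lifelines. Re-rounding already-rounded values can differ from rounding the unrounded ones in edge cases, so treat this as a sanity check rather than a reproduction:

```python
# Values copied from the 6-decimal LaTeX output above.
values = {"test_statistic": 3.837570, "p": 0.050116, "-log2(p)": 4.318582}


def format_row(row, decimals):
    """Format each value in the row with a fixed number of decimals."""
    return [f"{v:.{decimals}f}" for v in row.values()]


for decimals in (2, 4, 5):
    print(decimals, format_row(values, decimals))
```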

With the correction, the call results.print_summary(decimals=4) now results in the expected table:

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <tbody>
    <tr>
      <th>t_0</th>
      <td>-1</td>
    </tr>
    <tr>
      <th>null_distribution</th>
      <td>chi squared</td>
    </tr>
    <tr>
      <th>degrees_of_freedom</th>
      <td>1</td>
    </tr>
    <tr>
      <th>test_name</th>
      <td>logrank_test</td>
    </tr>
  </tbody>
</table>
</div><table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>test_statistic</th>
      <th>p</th>
      <th>-log2(p)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>3.8376</td>
      <td>0.0501</td>
      <td>4.3186</td>
    </tr>
  </tbody>
</table>

Other fitters and regression tables (Cox PH, Weibull, etc.) were not affected by this bug and continue to function as expected.

@sbwiecko
Contributor Author

PR #1635 solves the issue.

sbwiecko reopened this Oct 29, 2024

@CamDavidsonPilon
Owner

Shoot, you are right. I reverted the changes to make some edits and didn't test thoroughly.
