Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Individual Shifted Beta Geometric (sBG) model #133

Merged
merged 3 commits into from
Feb 27, 2023
Merged

Conversation

ricardoV94
Copy link
Contributor

@ricardoV94 ricardoV94 commented Jan 25, 2023

Closes #40
Closes #97

ToDO

  • Tests
  • Add more text/context to notebook
  • Docstrings

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@larryshamalama larryshamalama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments and questions for now! 🙂

pymc_marketing/clv/models/shifted_beta_geo.py Outdated Show resolved Hide resolved
pymc_marketing/clv/models/shifted_beta_geo.py Outdated Show resolved Hide resolved
pymc_marketing/clv/models/shifted_beta_geo.py Outdated Show resolved Hide resolved
@ricardoV94 ricardoV94 force-pushed the shifted_beta_geo branch 2 times, most recently from dc63495 to 4722d7d Compare January 30, 2023 16:05
@ColtAllen
Copy link
Collaborator

I dug up a link to the research paper because I'm not familiar with this model:

https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf

Are there plans to add any of the other expressions in that paper like aggregate retention rate? There's even a discounted lifetime value function.

Also, looking over the code I see a lot of redundancy in self.model, distribution_customer_churn_time() and _distribution_new_customer(). Is there a way these could be refactored to cut down on the repeated code lines?

@ricardoV94
Copy link
Contributor Author

ricardoV94 commented Feb 21, 2023

I dug up a link to the research paper because I'm not familiar with this model:

https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf

Are there plans to add any of the other expressions in that paper like aggregate retention rate? There's even a discounted lifetime value function.

Yeah we can add them. In this case it's trivial to do whatever summary statistics we want since we have all the variables in the trace, including individual parameters. Once we do the model per-cohort we will need to look at the close form solutions in those papers. @ColtAllen Do you mind opening an issue for that?

Also, looking over the code I see a lot of redundancy in self.model, distribution_customer_churn_time() and _distribution_new_customer(). Is there a way these could be refactored to cut down on the repeated code lines?

Hmm... maybe, but I think it's simple enough. Just defining a new PyMC model that we can do posterior predictive on with the old fit.

@ricardoV94 ricardoV94 marked this pull request as ready for review February 21, 2023 12:07
@ricardoV94
Copy link
Contributor Author

ricardoV94 commented Feb 21, 2023

Changed t_end to T, and added missing tests. Ready for a final review

@codecov
Copy link

codecov bot commented Feb 21, 2023

Codecov Report

Merging #133 (afb79f7) into main (d410c11) will increase coverage by 0.32%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
+ Coverage   94.12%   94.45%   +0.32%     
==========================================
  Files          17       18       +1     
  Lines         919      973      +54     
==========================================
+ Hits          865      919      +54     
  Misses         54       54              
Impacted Files Coverage Δ
pymc_marketing/clv/__init__.py 100.00% <ø> (ø)
pymc_marketing/clv/models/__init__.py 100.00% <100.00%> (ø)
pymc_marketing/clv/models/basic.py 98.52% <100.00%> (+0.41%) ⬆️
pymc_marketing/clv/models/beta_geo.py 98.85% <100.00%> (-0.08%) ⬇️
pymc_marketing/clv/models/gamma_gamma.py 98.00% <100.00%> (-0.15%) ⬇️
pymc_marketing/clv/models/shifted_beta_geo.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@juanitorduz
Copy link
Collaborator

Closes #4

ToDO

  • Tests
  • Add more text/context to notebook
  • Docstrings

I think the indicated issue is not correct (is pointing to MMM transformations)

@ricardoV94
Copy link
Contributor Author

Closes #4
ToDO

  • Tests
  • Add more text/context to notebook
  • Docstrings

I think the indicated issue is not correct (is pointing to MMM transformations)

Thanks, it used to be #40, but one of the edits killed that zero :)

@ricardoV94
Copy link
Contributor Author

Addressed comments, ready for re-review

Copy link
Collaborator

@juanitorduz juanitorduz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fully aware of the model details (I will have two), but overall the code looks nice to read. I let a small comment regarding a typo.

pymc_marketing/clv/models/shifted_beta_geo.py Outdated Show resolved Hide resolved
@juanitorduz
Copy link
Collaborator

I read the paper looks great! I could support adding some formulas missing from the paper in another PR! So I think if you correct the typo above we can merge this one 🙂

@ricardoV94
Copy link
Contributor Author

Typo fixed @juanitorduz

@ricardoV94
Copy link
Contributor Author

Found a typo in the notebook, fixing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLV enhancement New feature or request
Projects
None yet
4 participants