| layout | title |
|---|---|
| page | Résumé |
Python, SQL, Git, Jira, Azure DevOps, PyCharm, CLI
Natural Language Processing (NLP), Network Analysis/Graph Theory, Machine Learning, Statistics, API (Develop & Consume), Classification, Regression, Data Visualization, Unsupervised Learning, Web Scraping/Crawling, Code Review, Object Oriented Programming, Automation, Unit Testing
pandas, scikit-learn, networkx, pydantic, numpy, scrapy, matplotlib, seaborn, pre-commit, black, isort, flake8, mypy, requests, spacy, typer, pytest, beautifulsoup4, streamlit, tqdm, fastapi
04/2022—present
- Developed a proof of concept to convert tax instructions into a directed acyclic graph using named entity recognition and a key file.
- Using topological sorting within a directed acyclic graph, predicted hundreds of tax lines in parallel within the same generation.
- Through the use of graph algorithms, developed methods to quickly identify which lines in a tax form impacted other lines. This was used to determine where to focus attention when reviewing tax returns for issues that could have a large impact on individual clients.
- Published the subpackage, "ReturnAssistant," with the ability to process 60,000 prior year returns in ~200ms.
- Crawled the IRS website, extracted instructions for all forms and schedules, and preprocessed into training data.
- Developed and trained a relation extraction pipeline to automate the construction of a directed acyclic graph using a custom tokenizer, a fine-tuned named entity recognition component, and a custom-trained relation extraction component.
- Hand-picked for the generative artificial intelligence research and development team that was tasked with researching how emerging technology in the natural language processing space could be leveraged by the company.
- Developed a Python package to help navigate and validate the company tax engine using graph algorithms.
10/2021—04/2022
- Automated PRM ticket creation, saving 100 minutes per month.
- Introduced Gaussian Processes (Kriging) as a data generation technique to estimate values where observations were not recorded in a time series.
- Created and maintained a private Python package to streamline my team's work.
06/2019—10/2021
- Deployed a classification model via API to label cost accounting codes with 95% accuracy, shortening manual labeling time from 3 hours to seconds and widening the user base.
- Automated weekly JIRA tickets, time checks, and comment reminders using Python and the JIRA API, saving 100 minutes per week.
- Using a topic model, trained a named entity recognition pipeline on charge code descriptions to identify profitable subsets.
- Cut average costs per primary complete mastectomy from $9,000 to $5,500 using evidence provided via bootstrapping and difference testing.
- Deconstructed and rebuilt the COVID-19 CHIME model using internal data, providing an auto-updating dashboard to senior leadership to help with capacity and supply planning.
05/2017—05/2019
- Designed and automated standard analyses, updating PowerPoint presentations directly.
- Simulated and optimized complex rule-based algorithms for patient-provider matching.
- Led quarterly Python classes to teach and train associates.
- Implemented quality-improvement benchmarks for health systems using a combination of the Achievable Benchmarks of Care and mathematically derived metrics.
- Developed a Python module to generate and execute SQL, allowing the validation of data sets and simulation of changes.
08/2013—08/2015
06/2013—07/2013
08/2011—06/2013
- Mapping Tax Structures Via Natural Language Processing Generated Directed Acyclic Graphs (Patent Application #18/414,771)—HRB Innovations
- Bronze Pandas Tag Badge—Stack Overflow
- Bronze Python Tag Badge—Stack Overflow
- 66 Days of Data Shoutout—66 Days of Data
- A Generalization of the Goresky-Klapper Conjecture, Part I—Society for Industrial and Applied Mathematics (SIAM)
- Consulting Project Eagle Award—Cerner Corporation
- DataCon 2017 Data Competition Winner—Cerner Corporation
- [BUG] -- Arguments `enable` and `disable` not working as expected in `spacy.load`—spaCy
- [ENH] -- Option to exclude `model_extra` from `repr`—Pydantic
- BUG -- Requiring `name` argument in `StackAPI` makes "/users/{id}/network-activity" endpoint inaccessible—StackAPI
- Update index handling in `PandasAdapter`—Scikit-learn
- Most recent `scikit-learn` results in several failed unit tests—mlxtend
- Integrate scikit-learn's `set_output` method into `TransactionEncoder`—mlxtend
- ENH -- Replaced for-loops in :function:`rescale_layout` with numpy vectorized methods.—Networkx
- Typo fix in `Language.replace_listeners` docs—spaCy
- DOC Update MLPRegressor docs—Scikit-learn
- add `feature_names_in_` attribute to `FeatureUnion`—Scikit-learn
- synset pos parameter—spacy-wordnet
- Detecting Kidney Stone CPT Communities using the Louvain Method—Health Analytics Summit @ Health Catalyst (09/2021)
- Exploring Null Space—DataCon @ Cerner Corporation (09/2018)
- Automating Excel Report Generation with Python—DataCon @ Cerner Corporation (09/2016)