Daily Dose of Data Science is a publication on Substack that brings together intriguing frameworks, libraries, technologies, and tips that make the life cycle of a Data Science project effortless.
This repository is a collection of all the code snippets presented in my publication. To receive these tips in your inbox daily, subscribe to my Substack newsletter.
Run These Code Snippets on Your Local Machine
To download the tips listed here, you can clone this repo.
git clone https://github.com/ChawlaAvi/Daily-Dose-of-Data-Science
Pandas
Jupyter Tips
Python
Plotting
NumPy
Memory Optimization
Cool Tools
Run-time Optimization
Sklearn
Debugging
Missing Data
ML-AI News
Machine Learning
Statistics
Testing
Terminal
Documents

Pandas

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Analyze A Pandas DataFrame Without Code | 🔗 | 🔗 | |
| 70x Faster Pandas By Changing Just One Line of Code | 🔗 | 🔗 | |
| Reduce Memory Usage Of A Pandas DataFrame By 90% | 🔗 | 🔗 | Medium |
| Speed-up Pandas Apply 5x with NumPy | 🔗 | 🔗 | |
| A Lesser-Known Feature of Apply Method In Pandas | 🔗 | 🔗 | |
| Create Pandas DataFrame from Dataclass | 🔗 | 🔗 | |
| Run SQL in Jupyter To Analyze A Pandas DataFrame | 🔗 | 🔗 | |
| When You Should Not Use the head() Method In Pandas | 🔗 | 🔗 | |
| Three Lesser-known Tips For Reading a CSV File Using Pandas | 🔗 | 🔗 | |
| The Best File Format To Store A Pandas DataFrame | 🔗 | 🔗 | Medium |
| Lesser-Known Feature of the Merge Method in Pandas | 🔗 | 🔗 | |
| The Best Way to Use Apply() in Pandas | 🔗 | 🔗 | |
| A No-code Tool To Understand Your Data Quickly | 🔗 | 🔗 | |
| Display Progress Bar With Apply() in Pandas | 🔗 | 🔗 | |
| Supercharge value_counts() Method in Pandas With Sidetable | 🔗 | 🔗 | |
| Explore CSV Data Right From The Terminal | 🔗 | 🔗 | |
| Define the Correct DataType for Categorical Columns | 🔗 | 🔗 | Medium |
| Don't Create Conditional Columns in Pandas with Apply | 🔗 | 🔗 | |
| Write Your Own Flavor Of Pandas | 🔗 | 🔗 | |
| Create DataFrame Hassle-free By Using Clipboard | 🔗 | 🔗 | |
| Alter the Datatype of Multiple Columns at Once | 🔗 | 🔗 | |
| Why you should not dump DataFrames to a CSV | 🔗 | 🔗 | Medium |
| Why You Should Not Read CSVs with Pandas | 🔗 | 🔗 | Medium |
| Parallelize Pandas Apply() With Swifter | 🔗 | 🔗 | |
| A Hidden Feature of Describe Method In Pandas | 🔗 | 🔗 | |
| Enrich Your Notebook With Interactive Controls | 🔗 | 🔗 | |
| Data Analysis Using No-Code Pandas In Jupyter | 🔗 | 🔗 | |
| Create Pivot Tables, Aggregations and Plots Without Any Code | 🔗 | 🔗 | Medium |
| Parallelize Pandas with Pandarallel | 🔗 | 🔗 | Medium |
| Pretty Plotting With Pandas | 🔗 | 🔗 | |
| How to Read Multiple CSV Files Efficiently | 🔗 | 🔗 | Medium |
| Configure Sklearn To Output Pandas DataFrame | 🔗 | 🔗 | |
| Datatype For Handling Missing Valued Columns in Pandas | 🔗 | 🔗 | Medium |
| Vectorization Does Not Always Guarantee Better Performance | 🔗 | 🔗 | |

Jupyter Tips

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Stop Previewing Raw DataFrames. Instead, Use DataTables. | 🔗 | 🔗 | |
| Label Your Data With The Click Of A Button | 🔗 | 🔗 | |
| The Coolest Jupyter Notebook Hack | 🔗 | 🔗 | |
| View Documentation in Jupyter Notebook | 🔗 | 🔗 | |
| Get Notified When Jupyter Cell Has Executed | 🔗 | 🔗 | |
| Clear Cell Output In Jupyter Notebook During Run-time | 🔗 | 🔗 | |
| CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot | 🔗 | 🔗 | |
| Find Your Code Hiding In Some Jupyter Notebook With Ease | 🔗 | 🔗 | |
| Enrich Your Notebook With Interactive Controls | 🔗 | 🔗 | |
| Data Analysis Using No-Code Pandas In Jupyter | 🔗 | 🔗 | |
| Create Pivot Tables, Aggregations and Plots Without Any Code | 🔗 | 🔗 | Medium |
| Restart Notebook Without Losing Variables | 🔗 | 🔗 | Medium |
| Retrieve Previously Computed Output In Jupyter Notebook | 🔗 | 🔗 | Medium |
| Transfer Variables Between Jupyter Notebooks | 🔗 | 🔗 | Medium |

Python

| Title | Notebook | Substack | Article |
|---|---|---|---|
| A Single Line That Will Make Your Python Code Faster | 🔗 | 🔗 | |
| Make Dot Notation More Powerful in Python | 🔗 | 🔗 | |
| An Elegant Way To Perform Shutdown Tasks in Python | 🔗 | 🔗 | |
| What Are Class Methods and When To Use Them? | 🔗 | 🔗 | |
| Hide Attributes While Printing A Dataclass Object | 🔗 | 🔗 | |
| List : Tuple :: Set : ? | 🔗 | 🔗 | |
| \_\_post_init\_\_: Add Attributes To A Dataclass Post Initialization | 🔗 | 🔗 | |
| Simplify Your Functions With Partial Functions | 🔗 | 🔗 | |
| DotMap: A Better Alternative to Python Dictionary | 🔗 | 🔗 | |
| Prevent Wild Imports With \_\_all\_\_ in Python | 🔗 | 🔗 | |
| Performance Comparison of Python 3.11 and Python 3.10 | 🔗 | 🔗 | |
| Why 256 is 256 But 257 is not 257? | 🔗 | 🔗 | |
| Make a Class Object Behave Like a Function | 🔗 | 🔗 | |
| Lesser-known Feature of Pickle Files | 🔗 | 🔗 | |
| Specify Loops and Runs In %%timeit | 🔗 | 🔗 | |
| Don't Use time.time() To Measure Execution Time | 🔗 | 🔗 | |
| Import Your Python Package as a Module | 🔗 | 🔗 | |
| Fine-grained Error Tracking With Python 3.11 | 🔗 | 🔗 | |
| Run Python Project Directory As A Script | 🔗 | 🔗 | |
| Use Slotted Class To Improve Your Python Code | 🔗 | 🔗 | |
| Using Dictionaries In Place of If-conditions | 🔗 | 🔗 | |
| In Defense of Match-case Statements in Python | 🔗 | 🔗 | |

Plotting

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Perfplot: Measure, Visualize and Compare Run-time With Ease | 🔗 | 🔗 | |
| Prettify Word Clouds In Python | 🔗 | 🔗 | |
| Calendar Map As A Richer Alternative to Line Plot | 🔗 | 🔗 | |
| Density Plot As A Richer Alternative to Scatter Plot | 🔗 | 🔗 | Medium |
| Python One-Liner To Create Sketchy Hand-drawn Plots | 🔗 | 🔗 | |
| Create a Moving Bubbles Chart in Python | 🔗 | 🔗 | |
| Visualizing Google Search Trends of 2022 using Python | 🔗 | 🔗 | |
| Create A Racing Bar Chart In Python | 🔗 | 🔗 | |
| Elegantly Plot the Decision Boundary of a Classifier | 🔗 | 🔗 | |
| Dot Plot: A Potential Alternative to Bar Plot | 🔗 | 🔗 | Medium |
| Hexbin Plots As A Richer Alternative to Scatter Plots | 🔗 | 🔗 | Medium |
| Enrich Your Notebook With Interactive Controls | 🔗 | 🔗 | |
| Regression Plot Made Easy with Plotly | 🔗 | 🔗 | |
| Pretty Plotting With Pandas | 🔗 | 🔗 | |
| Polynomial Linear Regression Plot Made Easy With Seaborn | 🔗 | 🔗 | |
| Analyse Flow Data With Sankey Diagrams | 🔗 | 🔗 | |
| Waterfall Charts: A Better Alternative to Line/Bar Plot | 🔗 | 🔗 | Medium |

NumPy

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Speed-up NumPy 20x with Numexpr | 🔗 | 🔗 | |
| An Elegant Way To Perform Matrix Multiplication | 🔗 | 🔗 | |
| Difference Between Dot and Matmul in NumPy | 🔗 | 🔗 | |
| Don't Print NumPy Arrays! Use Lovely-NumPy Instead | 🔗 | 🔗 | |
| Polynomial Linear Regression with NumPy | 🔗 | 🔗 | |

Memory Optimization

| Title | Notebook | Substack | Article |
|---|---|---|---|
| 70x Faster Pandas By Changing Just One Line of Code | 🔗 | 🔗 | |
| Reduce Memory Usage Of A Pandas DataFrame By 90% | 🔗 | 🔗 | Medium |
| The Best File Format To Store A Pandas DataFrame | 🔗 | 🔗 | Medium |
| Define the Correct DataType for Categorical Columns | 🔗 | 🔗 | Medium |
| Datatype For Handling Missing Valued Columns in Pandas | 🔗 | 🔗 | Medium |
| Save Memory with Python Generators | 🔗 | 🔗 | |

Cool Tools

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Preview Your README File Locally In GitHub Style | 🔗 | 🔗 | |
| This GUI Tool Can Possibly Save You Hours Of Manual Work | 🔗 | 🔗 | |
| Stop Previewing Raw DataFrames. Instead, Use DataTables. | 🔗 | 🔗 | |
| Converting Python To LaTeX Has Possibly Never Been So Simple | 🔗 | 🔗 | |
| Label Your Data With The Click Of A Button | 🔗 | 🔗 | |
| Analyze A Pandas DataFrame Without Code | 🔗 | 🔗 | |
| A No-Code Online Tool To Explore and Understand Neural Networks | 🔗 | 🔗 | |
| Speed-up NumPy 20x with Numexpr | 🔗 | 🔗 | |
| Debugging Made Easy With PySnooper | 🔗 | 🔗 | |
| Deep Learning Network Debugging Made Easy | 🔗 | 🔗 | |
| CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot | 🔗 | 🔗 | |
| Find Unused Python Code With Ease | 🔗 | 🔗 | |
| Enrich Your Notebook With Interactive Controls | 🔗 | 🔗 | |
| Data Analysis Using No-Code Pandas In Jupyter | 🔗 | 🔗 | |
| Modify Python Code During Run-Time | 🔗 | 🔗 | Medium |
| Modify Function During Run-Time | 🔗 | 🔗 | Medium |
| Importing Modules Made Easy with Pyforest | 🔗 | 🔗 | |
| Create Pivot Tables, Aggregations and Plots Without Any Code | 🔗 | 🔗 | Medium |

Run-time Optimization

| Title | Notebook | Substack | Article |
|---|---|---|---|
| A Single Line That Will Make Your Python Code Faster | 🔗 | 🔗 | |
| Make Sklearn KMeans 20x times faster | 🔗 | 🔗 | |
| Speed-up NumPy 20x with Numexpr | 🔗 | 🔗 | |
| The Best File Format To Store A Pandas DataFrame | 🔗 | 🔗 | Medium |
| The Best Way to Use Apply() in Pandas | 🔗 | 🔗 | |
| Don't Create Conditional Columns in Pandas with Apply | 🔗 | 🔗 | |
| Why you should not dump DataFrames to a CSV | 🔗 | 🔗 | Medium |
| Parallelize Pandas Apply() With Swifter | 🔗 | 🔗 | |
| Parallelize Pandas with Pandarallel | 🔗 | 🔗 | Medium |
| How to Read Multiple CSV Files Efficiently | 🔗 | 🔗 | Medium |

Sklearn

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Sklearn One-liner to Generate Synthetic Data | 🔗 | 🔗 | |
| Skorch: Use Scikit-learn API on PyTorch Models | 🔗 | 🔗 | |
| Make Sklearn KMeans 20x times faster | 🔗 | 🔗 | |
| Build Baseline Models Effortlessly With Sklearn | 🔗 | 🔗 | |
| Polynomial Linear Regression with NumPy | 🔗 | 🔗 | |
| An Elegant Way to Import Metrics From Sklearn | 🔗 | 🔗 | |
| Feature Tracking Made Simple In Sklearn Transformers | 🔗 | 🔗 | |
| Configure Sklearn To Output Pandas DataFrame | 🔗 | 🔗 | |

Debugging

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Debugging Made Easy With PySnooper | 🔗 | 🔗 | |
| Don't use print() to debug your code. | 🔗 | 🔗 | Medium |
| Inspect Program Flow with IceCream | 🔗 | 🔗 | Medium |
| Lesser-known Feature of f-strings in Python | 🔗 | 🔗 | |

Missing Data

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Handle Missing Data With Missingno | 🔗 | 🔗 | |
| Datatype For Handling Missing Valued Columns in Pandas | 🔗 | 🔗 | |

ML-AI News

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Now You Can Use DALL·E With OpenAI API | 🔗 | 🔗 | |

Machine Learning

| Title | Notebook | Substack | Article |
|---|---|---|---|
| How to Encode Categorical Features With Many Categories? | 🔗 | 🔗 | |
| Why KMeans May Not Be The Apt Clustering Algorithm Always | 🔗 | 🔗 | |
| Skorch: Use Scikit-learn API on PyTorch Models | 🔗 | 🔗 | |
| A No-Code Online Tool To Explore and Understand Neural Networks | 🔗 | 🔗 | |
| Make Sklearn KMeans 20x times faster | 🔗 | 🔗 | |
| Deep Learning Network Debugging Made Easy | 🔗 | 🔗 | |
| Build Baseline Models Effortlessly With Sklearn | 🔗 | 🔗 | |
| Polynomial Linear Regression with NumPy | 🔗 | 🔗 | |

Statistics

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Pandas and NumPy Return Different Values for Standard Deviation. Why? | 🔗 | 🔗 | |
| Why Correlation (and Other Statistics) Can Be Misleading | 🔗 | 🔗 | |

Testing

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Generate Your Own Fake Data In Seconds | 🔗 | 🔗 | |

Terminal

| Title | Notebook | Substack | Article |
|---|---|---|---|
| Visualize Commit History of Git Repo With Beautiful Animations | 🔗 | 🔗 | |
| How Would You Identify Fuzzy Duplicates In A Data With Million Records? | 🔗 | 🔗 | |
| Automated Code Refactoring With Sourcery | 🔗 | 🔗 | Medium |
| Explore CSV Data Right From The Terminal | 🔗 | 🔗 | |

Documents

| Title | Document | Substack | Article |
|---|---|---|---|
| 10 Automated EDA Tools That Will Save You Hours Of (Tedious) Work | 🔗 | 🔗 | |
| 30 Python Libraries to (Hugely) Boost Your Data Science Productivity | 🔗 | 🔗 | |