MSBA Introduction to Python
This workshop is a joint collaboration between the Orange County R Users Group (OCRUG) and the UCI Paul Merage School of Business, Master of Science in Business Analytics (MSBA) program.
This workshop presents a basic introduction to Python, covering the fundamentals of Python programming and practical data science skills using the pandas Python library.
Date: July 30 & 31, 2020
Time: 5 PM to 8 PM, Pacific
Location: Online (Zoom)
- 3 hours each day, 2 days total
- Divided into 40 min sessions
- 15 min instruction
- 15 min practice in breakout rooms
- Teaching assistants will be assigned to breakout rooms
- 10 min review & questions
- 4 sessions per day (160 min for sessions + 10 min break + 10 min wrap-up), 8 sessions total across both days
The two-day workshop will be presented as 4 sessions each day, where each session is roughly divided into 15 minutes of instruction, 15 minutes of practice and exercises, and 10 minutes of review (~40 min per session).
The focus will be on Python as a language, drawing from the official Python documentation: https://docs.python.org/3/
- Session 1: Using Jupyter Notebooks
- What is a notebook, why are they useful?
- Jupyter interface
- Working with cells (creating, executing, cell types, etc.)
- Tips and best practices
- Session 2: Review of Python Fundamentals
- Importance of spacing
- Expressions and variables
- Math operations
- Data types (numbers, strings, boolean)
- Lists
- Session 3: Control Flows
- Conditional statements
- Loops
- Session 4: Functions
- What are they and why are they important
- Function syntax
- How to write your own functions
- Tips and best practices
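As a rough preview of the kind of material Sessions 2 through 4 cover, the following minimal sketch combines a list, a conditional statement, a loop, and a user-defined function. The data, the `letter_grade` name, and the grade cutoffs are made up for illustration and are not taken from the workshop materials.

```python
# Hypothetical example data: quiz scores for a small class
scores = [72, 88, 95, 61]


def letter_grade(score):
    """Return a letter grade for a numeric score (cutoffs are illustrative)."""
    # Indentation (spacing) defines which lines belong to each branch
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "F"


# Loop over the list and collect one grade per score
grades = []
for s in scores:
    grades.append(letter_grade(s))

# Math operations on a list: sum divided by count
average = sum(scores) / len(scores)

print(grades)
print(average)
```

Wrapping the grading rule in a function keeps the loop short and lets the same rule be reused elsewhere in a notebook.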
The focus will be on pandas as the entry point into data-science-specific tasks, drawing from the getting started tutorials: https://pandas.pydata.org/docs/getting_started/intro_tutorials/index.html
- Session 5: Introduction to pandas
- Why tabular data tables are useful for data science (compare to Excel)
- Series and DataFrames
- How to create Series and DataFrames
- How to read Series and DataFrames from files
- Session 6: Subsetting DataFrames
- Selecting columns
- Filtering rows
- The various ways of indexing DataFrames (by labels, slices, conditional expressions): loc and iloc
- Session 7: Reshaping and Merging DataFrames
- Wide vs. long formats and converting between the two: pivot and melt
- Grouped summaries: groupby
- Concatenating tables by column and row: concat
- Joining data tables: merge
- Session 8: Data Visualization with pandas
- Basic plotting from pandas: plot, scatter, box, hist, etc.
- Examples of more complex plots, coloring and grouping by variables
- Tuning plot parameters (sizes, colors, layouts)
- Saving plots (e.g., for use in presentations)
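The pandas operations named in Sessions 5 through 7 can be sketched on a tiny table as shown here. The `sales` and `regions` tables and their column names are hypothetical, invented only for this illustration. Plotting (Session 8) is left out because it additionally requires matplotlib to be installed.

```python
import pandas as pd

# Hypothetical sales data, made up for illustration
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 80, 95],
})

# Session 6: selecting columns and filtering rows
revenue = sales["revenue"]                    # a single column is a Series
high = sales[sales["revenue"] > 90]           # conditional (boolean) filter
first_row = sales.iloc[0]                     # iloc selects by integer position
store_b = sales.loc[sales["store"] == "B", ["quarter", "revenue"]]  # loc selects by label/condition

# Session 7: grouped summaries and reshaping
totals = sales.groupby("store")["revenue"].sum()
wide = sales.pivot(index="store", columns="quarter", values="revenue")   # long -> wide
long = wide.reset_index().melt(
    id_vars="store", var_name="quarter", value_name="revenue"
)                                                                        # wide -> long

# Session 7: joining and concatenating tables
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})
merged = sales.merge(regions, on="store", how="left")   # add a region column to each row
stacked = pd.concat([sales, sales], ignore_index=True)  # append rows of one table to another

print(totals)
print(wide)
print(merged)
```

Reading the same kind of table from a file would typically use `pd.read_csv` with a path to a CSV file in place of the inline dictionary.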