Initial problem
gshotwell committed Dec 2, 2023
1 parent cdeeecb commit 28cbc6f
Showing 19 changed files with 5,532 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .gitignore
@@ -158,3 +158,5 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

.DS_Store
48 changes: 47 additions & 1 deletion README.md
@@ -1 +1,47 @@
# python-data-app-challenge
# Data App Comparison

This repo illustrates some differences between various web application frameworks.
The purpose is to provide minimal, concrete examples of how to accomplish common development tasks in various Python web application frameworks, and to use those examples to help people learn their APIs.
The frameworks we have so far are:

- [Dash](https://plotly.com/dash/)

- [Panel](https://panel.holoviz.org/reference/index.html)

- [Shiny](https://shiny.posit.co/)

- [Streamlit](https://streamlit.io/)

## Running the examples

Navigate to the example folder and install the dependencies in a virtual environment with:

``` bash
pip install -r requirements.txt
```

Then start the app with the command for its framework:

| Framework | Command |
|-----------|------------------------|
| Dash | `python app.py` |
| Panel | `panel serve app.py` |
| Streamlit | `streamlit run app.py` |
| Shiny | `shiny run app.py` |

# Submitting a new problem

Please raise an issue to discuss and clarify the problem statement, and then submit a pull request with the problem statement in a README file.
Ideally problems should have the following qualities:

- Problems should be small and clear

- Successful apps should stand alone and not require external APIs or system setup

- Problems should focus on the capabilities of the web framework

- For inspiration see [7guis](https://eugenkiss.github.io/7guis/) or [TodoMVC](https://todomvc.com/)

# Submitting a new solution

We want only one solution per framework, but please submit PRs with either a solution for a new framework or improvements to an existing solution.
Your solution should focus on the framework's capabilities, and ideally have few dependencies.
For example, it's not a good idea to include a lot of JavaScript code in your Streamlit solution, because that will tell the reader more about how to do something in JavaScript than about what they can do in Streamlit.
26 changes: 26 additions & 0 deletions sampling-dashboard/README.md
@@ -0,0 +1,26 @@
# Problem description

This exercise illustrates the common problem of sampling from a dataset and interrogating that dataset with matplotlib plots.
You could imagine the sample being taken from a database or a larger-than-memory dataset, but in this case it's based on a small sample of the NYC Taxi data.

## Requirements

1. The application should have the following components:

- A proportion input which selects the proportion of the dataset to sample

- A log-scale input which selects whether the tip plot is on a log scale

- A plot showing the relationship between tips and prices

- A plot showing a histogram of prices

2. The app should use matplotlib plots (which can be found in `plots.py`)

3. The histogram plot should not rerender if the log-scale selector is changed

4. The sample should only be retaken if the proportion slider changes

5. Each time the proportion slider changes the app should take a new sample
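
Requirements 3 to 5 describe a dependency graph: the sample depends only on the proportion input, the histogram depends only on the sample, and the tip plot depends on both the sample and the log-scale input. The following is a minimal, framework-free sketch of that graph, not a solution for any of the listed frameworks. The `Dashboard` class, its method names, and the synthetic `ROWS` data are all hypothetical and stand in for the real inputs, plots, and taxi data.

```python
import random

# Hypothetical stand-in for the taxi data: (total_amount, tip_amount) pairs.
ROWS = [(10.0 + i, 1.0 + 0.1 * i) for i in range(1000)]


class Dashboard:
    """Counts how often each output is recomputed when an input changes."""

    def __init__(self, proportion=0.1, log_scale=False):
        self.proportion = proportion
        self.log_scale = log_scale
        self.counts = {"sample": 0, "scatter": 0, "histogram": 0}
        self._resample()

    def _resample(self):
        # Requirements 4 and 5: a new sample is drawn here and nowhere else.
        k = round(len(ROWS) * self.proportion)
        self.sample = random.sample(ROWS, k)
        self.counts["sample"] += 1
        # Both plots depend on the sample, so both are redrawn.
        self._render_scatter()
        self._render_histogram()

    def _render_scatter(self):
        # Depends on the sample and on the log-scale input.
        self.counts["scatter"] += 1

    def _render_histogram(self):
        # Requirement 3: depends on the sample only.
        self.counts["histogram"] += 1

    def set_proportion(self, proportion):
        self.proportion = proportion
        self._resample()

    def set_log_scale(self, log_scale):
        self.log_scale = log_scale
        # Only the tip plot is redrawn; the sample and histogram are untouched.
        self._render_scatter()
```

Calling `set_log_scale` leaves `sample` and the histogram count unchanged, while `set_proportion` draws a fresh sample and redraws both plots. A real solution would express the same dependencies with its framework's own reactivity or callback mechanism.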

#
86 changes: 86 additions & 0 deletions sampling-dashboard/dash/app.py
@@ -0,0 +1,86 @@
import dash
import dash_bootstrap_components as dbc
import pandas as pd
import plotly.express as px
from dash import Input, Output, dcc, html

app = dash.Dash(external_stylesheets=[dbc.themes.BOOTSTRAP])
# the style arguments for the sidebar. We use position:fixed and a fixed width
SIDEBAR_STYLE = {
    "position": "fixed",
    "top": 0,
    "left": 0,
    "bottom": 0,
    "width": "16rem",
    "padding": "2rem 1rem",
    "background-color": "#f8f9fa",
}

CONTENT_STYLE = {
    "margin-left": "18rem",
    "margin-right": "2rem",
    "padding": "2rem 1rem",
}

sidebar = html.Div(
    children=[
        dcc.Input(id="sample", type="number", min=0, max=1, value=0.1, step=0.01),
        html.Div("Plot scale"),
        dcc.RadioItems(["Linear", "Log"], id="scale"),
    ],
    style=SIDEBAR_STYLE,
)

content = html.Div(
    id="page-content",
    style=CONTENT_STYLE,
    children=[
        html.Div(id="max-value", style={"padding-top": "50px"}),
        dcc.Graph(id="scatter-plot"),
        dcc.Graph(id="histogram"),
        # Holds the sampled dataset so every plot reads the same sample
        dcc.Store(id="sampled-dataset"),
    ],
)

app.layout = html.Div([dcc.Location(id="url"), sidebar, content])


# Runs only when the proportion input changes, so the sample is retaken only then
@app.callback(Output("sampled-dataset", "data"), Input("sample", "value"))
def cache_dataset(sample):
    df = pd.read_csv("nyc-taxi.csv")
    df = df.sample(frac=sample)

    # To cache data in this way we need to serialize it to json
    json = df.to_json(date_format="iso", orient="split")
    return json


@app.callback(Output("max-value", "children"), Input("sampled-dataset", "data"))
def update_max_value(sampled_df):
    df = pd.read_json(sampled_df, orient="split")
    return f'First taxi id: {df["taxi_id"].iloc[0]}'


# Depends on both the sample and the scale selector
@app.callback(
    Output("scatter-plot", "figure"),
    Input("sampled-dataset", "data"),
    Input("scale", "value"),
)
def update_scatter(sampled_df, scale):
    df = pd.read_json(sampled_df, orient="split")
    scale = scale == "Log"
    fig = px.scatter(df, x="total_amount", y="tip_amount", log_x=scale, log_y=scale)
    fig.update_layout(transition_duration=500)
    return fig


# Depends on the sample only, so changing the scale does not rerender it
@app.callback(Output("histogram", "figure"), Input("sampled-dataset", "data"))
def update_histogram(sampled_df):
    df = pd.read_json(sampled_df, orient="split")
    fig = px.histogram(df, x="total_amount")
    fig.update_layout(transition_duration=500)
    return fig


if __name__ == "__main__":
    app.run_server(debug=True)
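
The `cache_dataset` callback in `app.py` passes the sample to the other callbacks as a JSON string held in `dcc.Store`, and each downstream callback rebuilds the DataFrame with `pd.read_json`. The round trip can be tried outside Dash with the sketch below. The helper names `to_store` and `from_store` and the three-row DataFrame are hypothetical and only stand in for the store and for `nyc-taxi.csv`. Wrapping the payload in `StringIO` is a change from the code in this commit, made on the assumption that recent pandas versions warn when `read_json` receives a literal JSON string.

```python
from io import StringIO

import pandas as pd

# Hypothetical three-row stand-in for nyc-taxi.csv.
df = pd.DataFrame(
    {
        "taxi_id": [101, 102, 103],
        "total_amount": [12.5, 8.75, 30.25],
        "tip_amount": [2.5, 1.25, 6.05],
    }
)


def to_store(frame):
    """Serialize a DataFrame with the same arguments cache_dataset uses."""
    return frame.to_json(date_format="iso", orient="split")


def from_store(payload):
    """Rebuild the DataFrame the way a downstream callback would."""
    # StringIO wraps the JSON text in a file-like object for read_json.
    return pd.read_json(StringIO(payload), orient="split")


# Sample two of the three rows, then round-trip them through JSON.
payload = to_store(df.sample(frac=2 / 3, random_state=0))
restored = from_store(payload)
```

Here `payload` is a plain string, which is what the store holds, and `restored` has the same columns as `df`. Each callback that reads the store pays the parsing cost again, which is the trade-off of sharing data between Dash callbacks this way.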