-
Notifications
You must be signed in to change notification settings - Fork 390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EXAMPLES] Create dashboard example #5488
base: develop
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Argilla Dashboard example | ||
|
||
## Description | ||
This is an example of a dashboard created with Argilla. The dashboard is a simple gradio app that | ||
allows the user to see the annotation datasets progress. Users can select the dataset they want to | ||
see and the app will display the progress of the dataset. | ||
|
||
## Running the app | ||
|
||
You must provide the `ARGILLA_API_URL` and `ARGILLA_API_KEY` environment variables to run the app. | ||
|
||
```bash | ||
ARGILLA_API_URL=<your-argilla-api> ARGILLA_API_KEY=<your-api-key> python app.py | ||
``` | ||
|
||
## Running on HF Spaces | ||
|
||
You can also run the app on HF Spaces. You can find the app [here](https://huggingface.co/spaces/frascuchon/argilla-progress). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,205 @@ | ||
# app dashboard from https://huggingface.co/spaces/davanstrien/argilla-progress/blob/main/app.py | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we could add a duplicate space button too.https://www.gradio.app/docs/gradio/duplicatebutton |
||
import os | ||
from typing import List | ||
|
||
import argilla as rg | ||
import gradio as gr | ||
import pandas as pd | ||
import plotly.colors as colors | ||
import plotly.graph_objects as go | ||
|
||
client = rg.Argilla( | ||
api_url=os.getenv("ARGILLA_API_URL"), | ||
api_key=os.getenv("ARGILLA_API_KEY"), | ||
) | ||
|
||
|
||
def get_progress(dataset: rg.Dataset) -> dict: | ||
dataset_progress = dataset.progress(with_users_distribution=True) | ||
|
||
total, completed = dataset_progress["total"], dataset_progress["completed"] | ||
progress = (completed / total) * 100 if total > 0 else 0 | ||
|
||
return { | ||
"total": total, | ||
"annotated": completed, | ||
"progress": progress, | ||
"users": { | ||
username: user_progress["completed"].get("submitted") | ||
for username, user_progress in dataset_progress["users"].items() | ||
}, | ||
} | ||
|
||
|
||
def create_gauge_chart(progress): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should we add a legend to this chart? It isn't too clear what the colors red vs blue mean. |
||
fig = go.Figure( | ||
go.Indicator( | ||
mode="gauge+number+delta", | ||
value=progress["progress"], | ||
title={"text": "Dataset Annotation Progress", "font": {"size": 24}}, | ||
delta={"reference": 100, "increasing": {"color": "RebeccaPurple"}}, | ||
number={"font": {"size": 40}, "valueformat": ".1f", "suffix": "%"}, | ||
gauge={ | ||
"axis": {"range": [None, 100], "tickwidth": 1, "tickcolor": "darkblue"}, | ||
"bar": {"color": "deepskyblue"}, | ||
"bgcolor": "white", | ||
"borderwidth": 2, | ||
"bordercolor": "gray", | ||
"steps": [ | ||
{"range": [0, progress["progress"]], "color": "royalblue"}, | ||
{"range": [progress["progress"], 100], "color": "lightgray"}, | ||
], | ||
"threshold": { | ||
"line": {"color": "red", "width": 4}, | ||
"thickness": 0.75, | ||
"value": 100, | ||
}, | ||
}, | ||
) | ||
) | ||
|
||
fig.update_layout( | ||
annotations=[ | ||
dict( | ||
text=( | ||
f"Total records: {progress['total']}<br>" | ||
f"Annotated: {progress['annotated']} ({progress['progress']:.1f}%)<br>" | ||
f"Remaining: {progress['total'] - progress['annotated']} ({100 - progress['progress']:.1f}%)" | ||
), | ||
# x=0.5, | ||
# y=-0.2, | ||
showarrow=False, | ||
xref="paper", | ||
yref="paper", | ||
font=dict(size=16), | ||
) | ||
], | ||
) | ||
|
||
fig.add_annotation( | ||
text=( | ||
f"Current Progress: {progress['progress']:.1f}% complete<br>" | ||
f"({progress['annotated']} out of {progress['total']} records annotated)" | ||
), | ||
xref="paper", | ||
yref="paper", | ||
x=0.5, | ||
y=1.1, | ||
showarrow=False, | ||
font=dict(size=18), | ||
align="center", | ||
) | ||
|
||
return fig | ||
|
||
|
||
def create_treemap(user_annotations, total_records): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @davidberenstein1957 Do you think we could simplify this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can maybe just add percentages to the table? I am not 100% sure if the treemap adds a lot of value compared to that. |
||
sorted_users = sorted(user_annotations.items(), key=lambda x: x[1], reverse=True) | ||
color_scale = colors.qualitative.Pastel + colors.qualitative.Set3 | ||
|
||
labels, parents, values, text, user_colors = [], [], [], [], [] | ||
|
||
for i, (user, contribution) in enumerate(sorted_users): | ||
percentage = (contribution / total_records) * 100 | ||
labels.append(user) | ||
parents.append("Annotations") | ||
values.append(contribution) | ||
text.append(f"{contribution} annotations<br>{percentage:.2f}%") | ||
user_colors.append(color_scale[i % len(color_scale)]) | ||
|
||
labels.append("Annotations") | ||
parents.append("") | ||
values.append(total_records) | ||
text.append(f"Total: {total_records} annotations") | ||
user_colors.append("#FFFFFF") | ||
|
||
fig = go.Figure( | ||
go.Treemap( | ||
labels=labels, | ||
parents=parents, | ||
values=values, | ||
text=text, | ||
textinfo="label+text", | ||
hoverinfo="label+text+value", | ||
marker=dict(colors=user_colors, line=dict(width=2)), | ||
) | ||
) | ||
|
||
fig.update_layout( | ||
title_text="User contributions to the total end dataset", | ||
height=500, | ||
margin=dict(l=10, r=10, t=50, b=10), | ||
paper_bgcolor="#F0F0F0", # Light gray background | ||
plot_bgcolor="#F0F0F0", # Light gray background | ||
) | ||
|
||
return fig | ||
|
||
|
||
def get_datasets(client: rg.Argilla) -> List[rg.Dataset]: | ||
return client.datasets.list() | ||
|
||
|
||
datasets = get_datasets(client) | ||
|
||
|
||
def update_dashboard(dataset_idx: int | None = None): | ||
if dataset_idx is None: | ||
return [None, None, None] | ||
|
||
dataset = datasets[dataset_idx] | ||
progress = get_progress(dataset) | ||
|
||
gauge_chart = create_gauge_chart(progress) | ||
treemap = create_treemap(progress["users"], progress["total"]) | ||
|
||
leaderboard_df = pd.DataFrame( | ||
list(progress["users"].items()), columns=["User", "Annotations"] | ||
) | ||
|
||
leaderboard_df = leaderboard_df.sort_values( | ||
"Annotations", ascending=False | ||
).reset_index(drop=True) | ||
|
||
return gauge_chart, treemap, leaderboard_df | ||
|
||
|
||
with gr.Blocks() as demo: | ||
gr.Markdown("# Argilla Dataset Dashboard") | ||
|
||
datasets_dropdown = gr.Dropdown(label="Select your dataset") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be nice to pre-select a dataset instead of entering on an empty screen. |
||
datasets_dropdown.choices = [ | ||
(dataset.name, idx) for idx, dataset in enumerate(datasets) | ||
] | ||
|
||
def set_selected_dataset(dataset_idx) -> None: | ||
global selected_dataset | ||
|
||
dataset = datasets[dataset_idx] | ||
selected_dataset = dataset | ||
|
||
with gr.Row(): | ||
gauge_output = gr.Plot(label="Overall Progress") | ||
treemap_output = gr.Plot(label="User contributions") | ||
|
||
with gr.Row(): | ||
leaderboard_output = gr.Dataframe( | ||
label="Leaderboard", headers=["User", "Annotations"] | ||
) | ||
|
||
demo.load( | ||
update_dashboard, | ||
inputs=[datasets_dropdown], | ||
outputs=[gauge_output, treemap_output, leaderboard_output], | ||
every=5, | ||
) | ||
|
||
datasets_dropdown.change( | ||
update_dashboard, | ||
inputs=[datasets_dropdown], | ||
outputs=[gauge_output, treemap_output, leaderboard_output], | ||
) | ||
|
||
|
||
if __name__ == "__main__": | ||
demo.launch() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
gradio | ||
argilla @ git+https://github.com/argilla-io/argilla.git@develop#subdirectory=argilla | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we should not forget to update this |
||
plotly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.