Allow Users to Customize Aggregation #206

Open
xehu opened this issue Apr 23, 2024 · 0 comments
Labels
enhancement New feature or request

xehu commented Apr 23, 2024

Currently, the system automatically "aggregates" features generated for a single chat/message up to the conversation and user levels by calculating summary statistics for each feature (mean, median, max, min, std):

https://github.com/Watts-Lab/team-process-map/blob/main/feature_engine/utils/calculate_conversation_level_features.py

However, aggregating every feature with every statistic yields thousands of features, which is far too many! Instead, we should make it possible for the user to specify what they want: for example, maybe they are only interested in the mean (not mean, median, max, min, AND std).

There are some design decisions here, but they are relatively simple ones; we need to decide how the user should specify which aggregations they want. Specifically, we want to think about:

  1. Which levels of aggregation does the user want? (Conversation and User are the options)
  2. Which columns (at the chat level) do they want aggregated?
  3. Which functions do they want to aggregate with (e.g., mean, std)?

Accordingly, we will want to think through how the user should specify these choices. Here is an example:

  aggregation:
    methods: ["mean", "std"]
    columns: ["column1", "column2"]
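To make the example concrete, here is a minimal sketch of how a config shaped like the one above could drive the aggregation. It uses plain Python rather than the package's own utilities, and everything in it is an assumption for illustration: the `aggregate` helper, the `AGG_FUNCS` table, the `conversation_num` grouping key, and the `{method}_{column}` naming of the output features are not taken from the existing code.

```python
import statistics
from collections import defaultdict


def _sample_std(values):
    """Sample standard deviation; NaN when a group has fewer than two values."""
    return statistics.stdev(values) if len(values) > 1 else float("nan")


# Hypothetical lookup from the method names in the config to functions.
AGG_FUNCS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "max": max,
    "min": min,
    "std": _sample_std,
}


def aggregate(rows, group_key, config):
    """Aggregate chat-level rows (dicts) by group_key, computing only
    the columns and methods requested in the config."""
    agg = config["aggregation"]

    # Collect the chat-level rows that belong to each group.
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_key]].append(row)

    result = {}
    for group, members in grouped.items():
        features = {}
        for col in agg["columns"]:
            values = [m[col] for m in members]
            for method in agg["methods"]:
                features[f"{method}_{col}"] = AGG_FUNCS[method](values)
        result[group] = features
    return result


# The example config from above, written as a Python dict.
config = {"aggregation": {"methods": ["mean", "std"], "columns": ["column1", "column2"]}}

# Made-up chat-level data; column3 is deliberately left out of the config.
chats = [
    {"conversation_num": "A", "column1": 1.0, "column2": 10.0, "column3": 5.0},
    {"conversation_num": "A", "column1": 3.0, "column2": 20.0, "column3": 6.0},
    {"conversation_num": "B", "column1": 2.0, "column2": 30.0, "column3": 7.0},
    {"conversation_num": "B", "column1": 4.0, "column2": 50.0, "column3": 8.0},
]

conv_features = aggregate(chats, "conversation_num", config)
```

With two columns and two methods requested, each conversation ends up with four aggregated features instead of one per column per statistic, and `column3` is not aggregated at all. The same helper could be called with a user-level grouping key to cover the second level of aggregation.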

There should also be an option to say they want no aggregations at all.
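One way the "no aggregations" option could be expressed is to treat a missing, null, or empty `aggregation` entry as "skip this step". The sketch below assumes that convention; the `planned_aggregations` helper is hypothetical, and a different default (for example, keeping today's aggregate-everything behavior when the key is missing) may be preferable for backward compatibility.

```python
def planned_aggregations(config):
    """Return the (column, method) pairs to compute.

    An empty list means 'do not aggregate'. In this sketch, a missing key,
    None, False, or an empty dict all switch aggregation off.
    """
    agg = config.get("aggregation")
    if not agg:
        return []
    return [
        (col, method)
        for col in agg.get("columns", [])
        for method in agg.get("methods", [])
    ]


# Aggregation switched off explicitly.
no_plan = planned_aggregations({"aggregation": None})

# Aggregation restricted to two columns and two methods.
small_plan = planned_aggregations(
    {"aggregation": {"methods": ["mean", "std"], "columns": ["column1", "column2"]}}
)
```

Downstream code can then check whether the plan is empty and skip the conversation- or user-level step entirely.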

Getting Started

  1. Modify the FeatureBuilder constructor so the user can pass in parameters for whether they want conversation- and user-level aggregations at all and, if so, which aggregations they want (which columns, which methods).
  2. Follow the logic through in the utilities where the aggregations take place.
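
As a starting point for step 1, the constructor change might look roughly like the sketch below. `FeatureBuilderSketch` is a stand-in class, not the real FeatureBuilder, and all six parameter names are assumptions chosen for illustration; the existing constructor arguments are omitted. Only the five statistics named in this issue are accepted.

```python
from typing import Optional, Sequence

# The statistics mentioned in this issue; used as the default set.
DEFAULT_METHODS = ("mean", "median", "max", "min", "std")


class FeatureBuilderSketch:
    """Stand-in showing only the proposed aggregation parameters."""

    def __init__(
        self,
        convo_aggregation: bool = True,
        convo_methods: Sequence[str] = DEFAULT_METHODS,
        convo_columns: Optional[Sequence[str]] = None,  # None = all columns (assumption)
        user_aggregation: bool = True,
        user_methods: Sequence[str] = DEFAULT_METHODS,
        user_columns: Optional[Sequence[str]] = None,  # None = all columns (assumption)
    ):
        # One plan per aggregation level; None means the level is skipped.
        self.plans = {
            "conversation": self._plan(convo_aggregation, convo_methods, convo_columns),
            "user": self._plan(user_aggregation, user_methods, user_columns),
        }

    @staticmethod
    def _plan(enabled, methods, columns):
        """Validate one level's settings and return them as a dict."""
        if not enabled:
            return None
        unknown = set(methods) - set(DEFAULT_METHODS)
        if unknown:
            raise ValueError(f"Unknown aggregation methods: {sorted(unknown)}")
        return {
            "methods": list(methods),
            "columns": None if columns is None else list(columns),
        }


# Conversation-level means of one column only; no user-level aggregation.
fb = FeatureBuilderSketch(
    convo_methods=["mean"],
    convo_columns=["column1"],
    user_aggregation=False,
)
```

Keeping the defaults equal to the current behavior (all methods, all columns, both levels) would let existing callers run unchanged, while the utilities in step 2 read `plans` to decide what to compute.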
@xehu xehu added the enhancement New feature or request label Apr 23, 2024
@xehu xehu added this to the Release V1 of Team Process Mapping Package milestone Apr 23, 2024
@xehu xehu linked a pull request Aug 7, 2024 that will close this issue
25 tasks
@xehu xehu modified the milestones: Release V1 of Team Process Mapping Package, Improve Package Functionality/Usability Aug 15, 2024