Workflow runs · stanford-crfm/helm

Showing runs from all workflows

4,139 workflow runs

Scenario tests Scenario tests #257: Scheduled

February 2, 2025 15:34

7m 15s main

Scenario tests Scenario tests #256: Scheduled

February 1, 2025 15:34

7m 20s main

Add Phi 3.5 models (#3306) Test #7948: Commit 228e0f1 pushed by yifanmai

February 1, 2025 04:01

9m 39s main

Add Mistral Small 3 model Test #7947: Pull request #3308 opened by yifanmai

February 1, 2025 01:24

10m 41s yifanmai/mistral-small-3

Add QwQ model on Together AI Test #7946: Pull request #3307 opened by yifanmai

February 1, 2025 01:13

9m 46s yifanmai/qwq

Add Phi 3.5 models Test #7945: Pull request #3306 synchronize by yifanmai

February 1, 2025 01:13

9m 46s yifanmai/phi-3.5

Add Phi 3.5 models Test #7944: Pull request #3306 opened by yifanmai

February 1, 2025 00:58

10m 7s yifanmai/phi-3.5

Add Deepseek-R1 model Test #7943: Pull request #3305 opened by yifanmai

February 1, 2025 00:18

12m 19s yifanmai/deepseek-r1

Add o3-mini model Test #7942: Pull request #3304 opened by yifanmai

January 31, 2025 23:08

9m 43s yifanmai/openai-o3

Add helpdesk call summarization scenario (#3303) Test #7941: Commit f6a9856 pushed by yifanmai

January 31, 2025 19:50

10m 32s main

Add helpdesk call summarization scenario Test #7940: Pull request #3303 opened by yifanmai

January 31, 2025 19:50

12m 20s yifanmai/helpdesk-call-summarization

Scenario tests Scenario tests #255: Scheduled

January 31, 2025 15:34

8m 31s main

Add Financial Phrasebank scenario Test #7939: Pull request #3302 synchronize by yifanmai

January 30, 2025 22:20

14m 5s yifanmai/financial-phrasebank

Add Financial Phrasebank scenario Test #7938: Pull request #3302 opened by yifanmai

January 30, 2025 22:19

9m 56s yifanmai/financial-phrasebank

Add support to redact model outputs Test #7937: Pull request #3301 synchronize by MiguelAFH

January 30, 2025 19:42

6m 52s redact-output

Add support to redact model outputs Test #7936: Pull request #3301 synchronize by MiguelAFH

January 30, 2025 19:37

4m 5s redact-output

Add support to redact model outputs Test #7935: Pull request #3301 opened by MiguelAFH

January 30, 2025 17:46

3m 36s redact-output

Scenario tests Scenario tests #254: Scheduled

January 30, 2025 15:34

7m 58s main

Add Spider 1.0 scenario Test #7934: Pull request #3300 opened by yifanmai

January 29, 2025 23:21

10m 16s yifanmai/spider

Add Legal Opinion Sentiment Classification scenario (#3286) Test #7933: Commit 59dcfb1 pushed by yifanmai

January 29, 2025 23:13

10m 24s main

Add BIRD SQL scenario (#3292) Test #7932: Commit 92e3ee1 pushed by yifanmai

January 29, 2025 21:56

10m 25s main

Update Unitxt tables benchmark schema (#3295) Test #7931: Commit 443f330 pushed by yifanmai

January 29, 2025 21:27

10m 12s main

Fix incorrect annotations for Omni-MATH and WildBench for empty outpu… Test #7930: Commit 7b640ed pushed by yifanmai

January 29, 2025 20:05

10m 15s main

Fix incorrect annotations for Omni-MATH and WildBench for empty outputs Test #7929: Pull request #3299 opened by yifanmai

January 29, 2025 19:12

11m 51s yifanmai/fix-empty-output-annotators

Scenario tests Scenario tests #253: Scheduled

January 29, 2025 15:35

8m 10s main
