Published Documentation

This repo contains the code and documentation for the AI Alliance Trust and Safety Evaluations Initiative, which defines a reference stack for AI model and system evaluation, with evaluations, benchmarks, and leaderboards.

See the website for a more detailed description of this initiative.

At this time, only the "skeleton" documentation is provided. We will be making initial commits of code, etc. soon.

Getting Involved

We welcome contributions as PRs. Please see our Alliance community repo for general information about contributing to any of our projects and initiatives. This section provides some specific details you need to know.

In particular, see the AI Alliance CONTRIBUTING instructions. You will need to agree with the AI Alliance Code of Conduct.

All code contributions are licensed under the Apache License, Version 2.0 (also in this repo as LICENSE.Apache-2.0).

All documentation contributions are licensed under the Creative Commons Attribution 4.0 International license (also in this repo as LICENSE.CC-BY-4.0).

All data contributions are licensed under the Community Data License Agreement - Permissive - Version 2.0 (also in this repo as LICENSE.CDLA-2.0).

We use the "Developer Certificate of Origin" (DCO).

Warning

Before you make any git commits with changes, understand what's required for DCO.

See the Alliance contributing guide section on DCO for details. In practical terms, supporting this requirement means you must use the -s flag with your git commit commands.
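
For example, a signed-off commit looks like the following (the commit message here is just a placeholder):

git commit -s -m 'Fix a typo in the README'
# The -s flag appends a sign-off trailer, taken from your git config, to the commit message:
#   Signed-off-by: Your Name <your.name@example.com>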

About the Code

Some code for this initiative will be kept in this repo, but other code will be kept in separate repos. This section will provide links to the relevant locations as they are added.

SafetyBAT Leaderboard

This leaderboard has a Hugging Face git repo that mirrors two other repos, including tse-ibm-benchbench.

To make updates to this code, use the following procedure.

Install support for large files:

git lfs install

Clone the tse-ibm-benchbench repo:

git clone git@github.com:The-AI-Alliance/tse-ibm-benchbench.git
cd tse-ibm-benchbench

Add the HF repo as an upstream repo. Note the use of the name hf-upstream:

git remote add hf-upstream git@hf.co:spaces/aialliance/safetybat
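
You can confirm that both remotes are configured with git remote -v. The output should look roughly like the following (the exact URLs depend on how you cloned):

git remote -v
# hf-upstream  git@hf.co:spaces/aialliance/safetybat (fetch)
# hf-upstream  git@hf.co:spaces/aialliance/safetybat (push)
# origin       git@github.com:The-AI-Alliance/tse-ibm-benchbench.git (fetch)
# origin       git@github.com:The-AI-Alliance/tse-ibm-benchbench.git (push)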

Fetch all the branches from both remotes:

git fetch --all --prune

Now you can work locally and push changes upstream. For example, suppose you want to compare the two main branches to see what might be out of date:

git checkout main    # make sure you are in the tse-* main branch.
git diff hf-upstream/main
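
Another way to compare them is to list the commits that exist on one main branch but not the other:

git log --oneline hf-upstream/main..main   # commits on the local main that are not yet on the Hugging Face main.
git log --oneline main..hf-upstream/main   # commits on the Hugging Face main that are not yet on the local main.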

To push the latest changes from tse-ibm-benchbench (for example, after you make edits there) upstream to the Hugging Face repo, do the following:

git checkout -B hf-main hf-upstream/main   # create (or reset) a local branch from the Hugging Face main.
git merge origin/main                      # merge the latest tse-ibm-benchbench changes.
git commit -s -m 'description' .           # commit any additional edits or conflict resolutions.
git push hf-upstream hf-main:main          # push the merged branch to the Hugging Face main.

(The -s flag is for signoff, required for DCO, discussed above.)

About the Documentation

About the GitHub Pages Website Published from this Repo

The website is published using GitHub Pages, where the pages are written in Markdown and served using Jekyll. We use the Just the Docs Jekyll theme.

See GITHUB_PAGES.md for more information.
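
If you want to preview the website locally before pushing changes, the typical Just the Docs workflow is to run Jekyll through Bundler from the directory containing the site's Gemfile (the docs directory is an assumption here; GITHUB_PAGES.md has the authoritative steps):

cd docs                      # or wherever the site's Gemfile lives.
bundle install               # install Jekyll, Just the Docs, and the other gems.
bundle exec jekyll serve     # serve the site locally, by default at http://127.0.0.1:4000.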

Note

As described above, all documentation is licensed under the Creative Commons Attribution 4.0 International license. See LICENSE.CC-BY-4.0.
