added language penalizer #114

Open
wants to merge 1 commit into
base: develop
23 changes: 22 additions & 1 deletion llm_eval/utils/metrics.py
@@ -1 +1,22 @@
# What code was I trying to put in here?
# probably.. THE LANGUAGE PENALIZER

from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

def language_penalizer(text: str, target_lang: str = 'ko') -> float:
    """
    Returns 1.0 if the detected language of the input text matches the target language,
    otherwise returns 0.0. If language detection fails (for example, the text is empty
    or contains no recognizable language features), the function also returns 0.0.

    Parameters:
    - text (str): The text whose language is to be detected.
    - target_lang (str): The target language code to compare against (default is 'ko' for Korean).

    Returns:
    - float: 1.0 if the detected language matches the target language; otherwise, 0.0.
    """
    try:
        return 1.0 if detect(text) == target_lang else 0.0
    except LangDetectException:
        # Detection failed; treat the text as not matching the target language.
        return 0.0
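
The PR does not show how the penalizer is consumed. One plausible use is to multiply an existing metric score by the 0/1 penalty so that answers in the wrong language score zero. The sketch below assumes that usage. `penalized_score` and `hangul_ratio_detector` are hypothetical names that do not appear in the PR. The detector is a simple stdlib stand-in for `langdetect.detect`, included so the example runs without third-party packages. It is not how `langdetect` works internally.

```python
from typing import Callable


def hangul_ratio_detector(text: str) -> str:
    """Hypothetical stand-in for langdetect.detect.

    Returns 'ko' when more than half of the alphabetic characters are
    Hangul syllables (U+AC00..U+D7A3), otherwise 'other'.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # Mirrors the "nothing to detect" failure case of a real detector.
        raise ValueError("no alphabetic characters to detect")
    hangul = sum(1 for ch in letters if "\uac00" <= ch <= "\ud7a3")
    return "ko" if hangul / len(letters) > 0.5 else "other"


def penalized_score(
    base_score: float,
    text: str,
    target_lang: str = "ko",
    detector: Callable[[str], str] = hangul_ratio_detector,
) -> float:
    """Scale a metric score by a 0/1 language penalty (assumed usage)."""
    try:
        penalty = 1.0 if detector(text) == target_lang else 0.0
    except ValueError:
        # Detection failed: treat the text as off-target.
        penalty = 0.0
    return base_score * penalty


if __name__ == "__main__":
    print(penalized_score(0.8, "안녕하세요, 반갑습니다."))   # Korean text keeps its score
    print(penalized_score(0.8, "Hello, nice to meet you."))  # English text is zeroed
    print(penalized_score(0.8, "12345"))                     # undetectable text is zeroed
```

Passing the detector as a parameter keeps the scoring logic testable without the language-detection dependency. To use the real library, one would pass `langdetect.detect` and catch its `LangDetectException` instead of `ValueError`.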