Diagnose and resolve field scores > 1 #9

Open
jamesdunham opened this issue Jun 27, 2023 · 1 comment

@jamesdunham
Member

@jmelot noticed that we observe some paper-field scores >> 1, which is unexpected. Field scores should be in [0, 1]. They're the average of available fasttext, tf-idf, and entity scores, where:

@jamesdunham
Member Author

  • There are 1,891,673 aberrant field scores out of 77,492,102,585
  • They occur across fields (e.g., in Arithmetic, Zoology)
  • Values range from 1.0000001192092896 to 1.1e+36
  • Offending scores can be found in jd1881_sandbox.fos_v2_scores_gt_1, which contains the result of:
  SELECT
    merged_id,
    name,
    field.score AS score
  FROM fields_of_study_v2.field_scores, UNNEST (fields) AS field
  LEFT JOIN fields_of_study_v2.field_meta ON field_id = field.id
  WHERE field.score > 1
  ORDER BY score DESC
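
As a rough aid for triage, here is a minimal Python sketch of two checks suggested by the numbers above. It is not the pipeline's code: `mean_of_available`, `to_float32`, `classify`, and the `tolerance` default are hypothetical names and values chosen for illustration. The first check shows that an average of available component scores can only exceed 1 if at least one component does. The second shows that the smallest reported value, 1.0000001192092896, is exactly the first float32 above 1.0, which may point to single-precision rounding for the near-1 cases. That is a hypothesis to verify, not a finding.

```python
import struct


def mean_of_available(scores):
    """Average the component scores that are present (None marks a missing one)."""
    present = [s for s in scores if s is not None]
    return sum(present) / len(present) if present else None


def to_float32(x):
    """Round a Python float to the nearest float32, returned as a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Smallest and largest aberrant values reported in this issue.
SMALLEST_REPORTED = 1.0000001192092896
LARGEST_REPORTED = 1.1e36

# First float32 strictly greater than 1.0 (1 plus the float32 machine epsilon).
FLOAT32_NEXT_AFTER_ONE = 1 + 2**-23


def classify(score, tolerance=1e-6):
    """Hypothetical triage rule; the tolerance is an assumption, not a project setting."""
    if score <= 1:
        return "ok"
    if score <= 1 + tolerance:
        return "possible rounding"
    return "out of range"


if __name__ == "__main__":
    print(mean_of_available([0.5, None, 1.0]))  # components in [0, 1]
    print(mean_of_available([0.9, 0.9, 1.5]))   # one component above 1
    print(SMALLEST_REPORTED == FLOAT32_NEXT_AFTER_ONE)
    print(classify(SMALLEST_REPORTED), classify(LARGEST_REPORTED))
```

If the near-1 values do turn out to be rounding, clamping them would be harmless, but values such as 1.1e+36 cannot come from rounding and would need to be traced back to an individual component score.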
