Getting ORC-WER breakdown #6
Comments
It is not implemented at the moment, but we plan to integrate it. Since you are interested, I will take care of implementing it. Do you have a suggestion for how it should be obtained? The ins/del/sub counts aren't unique: several alignments can reach the same minimal error count with different breakdowns.
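To illustrate the non-uniqueness, here is a minimal sketch in plain Python (not meeteval code; the function name `optimal_breakdowns` is made up for this example). It enumerates the (insertions, deletions, substitutions) counts of every minimum-cost word alignment between a reference and a hypothesis, assuming unit cost for each error type.

```python
from functools import lru_cache


def optimal_breakdowns(ref, hyp):
    """Return (distance, breakdowns), where breakdowns is the set of
    (insertions, deletions, substitutions) tuples over all
    minimum-cost alignments of ref and hyp (unit costs assumed)."""
    ref, hyp = tuple(ref), tuple(hyp)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Best cost and all count tuples for aligning ref[i:] with hyp[j:].
        if i == len(ref):  # only insertions remain
            return len(hyp) - j, frozenset({(len(hyp) - j, 0, 0)})
        if j == len(hyp):  # only deletions remain
            return len(ref) - i, frozenset({(0, len(ref) - i, 0)})

        candidates = []
        # Match (cost 0) or substitution (cost 1).
        cost, counts = solve(i + 1, j + 1)
        sub = int(ref[i] != hyp[j])
        candidates.append((cost + sub, {(a, b, c + sub) for a, b, c in counts}))
        # Deletion: a reference word has no counterpart.
        cost, counts = solve(i + 1, j)
        candidates.append((cost + 1, {(a, b + 1, c) for a, b, c in counts}))
        # Insertion: a hypothesis word has no counterpart.
        cost, counts = solve(i, j + 1)
        candidates.append((cost + 1, {(a + 1, b, c) for a, b, c in counts}))

        best = min(c for c, _ in candidates)
        merged = frozenset().union(*(s for c, s in candidates if c == best))
        return best, merged

    return solve(0, 0)


distance, breakdowns = optimal_breakdowns("a b".split(), "b c".split())
print(distance, sorted(breakdowns))
```

For the reference `a b` and the hypothesis `b c`, the minimal distance of 2 can be reached either with two substitutions or with one deletion plus one insertion, so any reported breakdown depends on which alignment the tool picks.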
One way would be to get the reference |
Thanks for the pointer, I forgot that kaldialign exists. The code will be much cleaner with kaldialign than a wrapper around Kaldi could be, and adding a pip dependency is simple.
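As a rough sketch of how a breakdown can be read off an alignment: as far as I know, `kaldialign.align(ref, hyp, eps)` returns a list of `(ref_word, hyp_word)` pairs in which a gap is marked by the `eps` placeholder. The alignment below is hardcoded in that assumed shape so the snippet runs without the dependency, and `count_errors` is a hypothetical helper, not part of kaldialign or meeteval.

```python
EPS = "*"  # gap placeholder, chosen so that it cannot collide with a real word


def count_errors(alignment, eps=EPS):
    """Count error types in a list of (ref_word, hyp_word) pairs."""
    counts = {"insertions": 0, "deletions": 0, "substitutions": 0, "correct": 0}
    for ref_word, hyp_word in alignment:
        if ref_word == eps:
            counts["insertions"] += 1      # hypothesis word without reference
        elif hyp_word == eps:
            counts["deletions"] += 1       # reference word without hypothesis
        elif ref_word != hyp_word:
            counts["substitutions"] += 1   # both present, but different
        else:
            counts["correct"] += 1
    return counts


# Assumed alignment of ref "a b c" against hyp "a s x c".
alignment = [("a", "a"), ("b", "s"), (EPS, "x"), ("c", "c")]
breakdown = count_errors(alignment)
print(breakdown)
```

The counts describe only this one alignment; another alignment with the same total cost may give a different split.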
The code now uses kaldialign to calculate insertions, deletions and substitutions, so the generated files now include this information:
.../meeteval/example_files$ python -m meeteval.wer orcwer -h 'hyp*.stm' -r 'ref*.stm'
Wrote: .../hyp_orcwer_per_reco.json
Wrote: .../hyp_orcwer.json
.../meeteval/example_files$ cat hyp_orcwer.json
{
  "errors": 18,
  "length": 184,
  "insertions": 0,
  "deletions": 14,
  "substitutions": 4,
  "error_rate": 0.09782608695652174
}
.../meeteval/example_files$ cat hyp_orcwer_per_reco.json
{
  "recordingA": {
    "errors": 4,
    "length": 124,
    "insertions": 0,
    "deletions": 0,
    "substitutions": 4,
    "error_rate": 0.03225806451612903,
    "assignment": [
      "Alice",
      "Alice",
      "Bob",
      "Bob",
      "Alice",
      "Alice",
      "Alice",
      "Alice"
    ]
  },
  "recordingB": {
    "errors": 14,
    "length": 60,
    "insertions": 0,
    "deletions": 14,
    "substitutions": 0,
    "error_rate": 0.23333333333333334,
    "assignment": [
      "Bob",
      "Bob",
      "Alice",
      "Alice"
    ]
  }
}
Let me know if you need more features, or if a modification would simplify integration into another framework.
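The overall numbers in hyp_orcwer.json are consistent with summing the per-recording counts and dividing the total errors by the total length. The sketch below is only a consistency check on the values shown in this thread, not necessarily how meeteval produces the file; `aggregate` is a hypothetical helper and the `assignment` lists are omitted for brevity.

```python
# Per-recording values copied from hyp_orcwer_per_reco.json in this thread.
per_reco = {
    "recordingA": {"errors": 4, "length": 124,
                   "insertions": 0, "deletions": 0, "substitutions": 4},
    "recordingB": {"errors": 14, "length": 60,
                   "insertions": 0, "deletions": 14, "substitutions": 0},
}

COUNT_KEYS = ("errors", "length", "insertions", "deletions", "substitutions")


def aggregate(per_recording):
    """Sum the count fields over recordings and recompute the error rate."""
    totals = {key: sum(reco[key] for reco in per_recording.values())
              for key in COUNT_KEYS}
    # The overall rate is total errors over total length,
    # not the mean of the per-recording rates.
    totals["error_rate"] = totals["errors"] / totals["length"]
    return totals


totals = aggregate(per_reco)
print(totals)

# In every entry, the error count equals ins + del + sub.
for reco in list(per_reco.values()) + [totals]:
    assert reco["errors"] == (reco["insertions"] + reco["deletions"]
                              + reco["substitutions"])
```

Averaging the two per-recording rates instead would weight the shorter recordingB as heavily as recordingA and give a different number.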
Is there a simple way to get the WER break-down into ins/del/sub when computing the ORC-WER?