Buggy behavior half way into projects (not respecting disfluency setting, missing punctuation). #146

Open
Arn-Thor opened this issue Feb 25, 2025 · 2 comments

@Arn-Thor

Since upgrading to version 0.6, I've observed the following two problems, both occurring halfway or more into an audio file project:

  1. The disfluencies setting is not respected. It seems to make no difference whether it is on or off: in the final portion of the text file, "uhms" and "uhs" are prolific. The problem arises occasionally (when observed, overlapping speech has been checked). See example 1 below.
  2. When there is no problem with disfluencies, the transcription starts omitting commas and full stops (when observed, overlapping speech has been off). See example 2 below.

Settings:
Language: English
Model: Precise
Mark pause: none
Speaker detection: manual
Overlapping speech: varies
Disfluencies: varies
Timestamps: off

System: Windows 11, Nvidia GeForce RTX 4080 SUPER, Intel i7-12700k, 64GB RAM
Input file: MP3

Example 1: So we could see some expenditure switching in the fiscal stimulus, uh, which would make, you know, the aggregate number and deliver a bit more. So, uh, okay. So we're, uh, coming up close to the end of the hour here. So, uh, last call for, um, any questions from the audience. If people want to throw, um, uh, throw other issues on the table, we'll do our best to, um, we'll do our best to address them.

Example 2: I'm really glad you mentioned that and China specifically because that's also how I feel I mean Europe has outperformed in the last two months since Trump since the Trump election Europe has surprised most people by doing better in most markets than Wall Street and the euro as you said has not gone to parity and below which almost everybody was predicting but actually has strengthened marginally against the dollar and until recently also against the CNH but for all the reasons that you've pushed on to me and actually I've got to concede in the short term that European run may at least be due for a pause if not for an end while we see as you said whether Europe shoots itself in the foot again or actually uses this or responds to this clear pivot point in history to do something more constructive and that but that leaves the rest of the world...
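To pin down where in a transcript either symptom starts, the two examples above can be checked mechanically. The following is a minimal sketch, not part of noScribe: it splits the exported text into fixed-size word windows and counts fillers and punctuation marks per window. The function name `window_stats`, the filler list, and the window size are all assumptions chosen for illustration.

```python
import re

# Assumed filler list for illustration; extend it as needed.
DISFLUENCIES = re.compile(r"\b(?:uh|um|uhm)\b", re.IGNORECASE)
# Sentence-level punctuation only; apostrophes are deliberately ignored.
PUNCTUATION = re.compile(r"[.,!?;:]")


def window_stats(text, window_words=50):
    """Count fillers and punctuation marks per fixed-size word window."""
    words = text.split()
    stats = []
    for start in range(0, len(words), window_words):
        window = words[start:start + window_words]
        chunk = " ".join(window)
        stats.append({
            "start_word": start,          # index of the first word in the window
            "words": len(window),         # the last window may be shorter
            "disfluencies": len(DISFLUENCIES.findall(chunk)),
            "punctuation": len(PUNCTUATION.findall(chunk)),
        })
    return stats


if __name__ == "__main__":
    sample = "So, uh, okay. We are done."
    for row in window_stats(sample, window_words=3):
        print(row)
```

A window whose punctuation count drops to zero, or whose filler count jumps, marks roughly where the behavior changes. That position can then be compared between runs with different settings or models.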

@kaixxx
Owner

kaixxx commented Feb 25, 2025

First: I cannot switch disfluencies strictly on or off. The setting is more like a suggestion for the AI. It works by presenting the AI with an example of what the transcription should look like. If this example contains 'um' and 'uh', the model tends to include them more. If I give no example, the model tends to leave disfluencies out.
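As a rough illustration of the mechanism described above, here is a minimal sketch that assumes a faster-whisper style API, where `WhisperModel.transcribe` accepts an `initial_prompt` text. Whether noScribe is wired up exactly like this is an assumption, and the example sentence, the helper names, and the `large-v2` default are hypothetical.

```python
from typing import Optional

# Hypothetical example text; the prompt noScribe really uses is not shown in this thread.
DISFLUENT_EXAMPLE = "Uh, so, um, I think that, uh, this is how it went."


def build_initial_prompt(include_disfluencies: bool) -> Optional[str]:
    """Return an example transcript containing fillers, or None for no example."""
    return DISFLUENT_EXAMPLE if include_disfluencies else None


def transcribe(audio_path: str, include_disfluencies: bool,
               model_name: str = "large-v2") -> str:
    """Transcribe a file, passing the example text as a stylistic hint."""
    # Imported lazily so the prompt helper works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name)
    segments, _info = model.transcribe(
        audio_path,
        language="en",
        # The prompt only nudges the model; it is not a hard switch.
        initial_prompt=build_initial_prompt(include_disfluencies),
    )
    return "".join(segment.text for segment in segments)
```

Because the example text is only a hint, the output can still contain or omit fillers regardless of the flag, which matches the inconsistent behavior reported in this issue.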

I wonder if the differences you observed between noScribe 0.5 and 0.6 have to do with the upgrade to a new, faster transcription model. You can install the older version 2 of the Whisper model into noScribe by following these instructions: https://github.com/kaixxx/noScribe/wiki/Add-custom-Whisper-models-for-transcription This model was the standard in noScribe up until version 0.5. Some consider this model to be better, although it is a little slower.

BTW: 'Overlapping speech' has nothing to do with this. This setting does not influence the transcription at all. It just determines how the text is arranged in the final document.

@Arn-Thor
Author

Thanks for the reply! It's useful to get a little more color on how it works. And I must first say I appreciate this piece of software immensely.

Understood on the disfluencies. That explains the inconsistent behavior; the model just gets confused and none of the prompts really matter at that point.

I rolled back to 0.5, and the same files now transcribe as expected, without errors. I'll jump back to 0.6 and try those instructions for a custom model. Thanks for pointing me in that direction!
