
[Suggestion] Add token counting to BaseLM #437

Merged: 1 commit, Feb 23, 2024

Conversation


@KCaverly KCaverly commented Feb 23, 2024

With a few projects under discussion, including Optimization Dry Runs (#397) and Token Budgeting, I thought we may want to add token counting functionality directly at the BaseLM level within the refactor.

This PR includes two small changes: an addition to the abstract methods on BaseLM, and a token counting implementation for LiteLLM.

class BaseLM(BaseModel, ABC):
    ...
    
    @abstractmethod
    def count_tokens(self, prompt: str) -> int:
        """Counts the number of tokens for a given prompt."""
        ...
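For context, any concrete backend would then be required to implement this method. Here is a minimal, self-contained sketch of what that contract looks like; the `WhitespaceLM` class and its whitespace-split "tokenizer" are hypothetical stand-ins for illustration, not part of this PR (the real BaseLM also extends BaseModel):

```python
from abc import ABC, abstractmethod


class BaseLM(ABC):
    """Simplified stand-in for the real BaseLM, which also extends BaseModel."""

    @abstractmethod
    def count_tokens(self, prompt: str) -> int:
        """Counts the number of tokens for a given prompt."""
        ...


class WhitespaceLM(BaseLM):
    """Hypothetical backend that approximates tokens by whitespace splitting."""

    def count_tokens(self, prompt: str) -> int:
        # A real backend would delegate to its model's tokenizer;
        # splitting on whitespace is only a placeholder.
        return len(prompt.split())


lm = WhitespaceLM()
print(lm.count_tokens("hello this is a small test"))  # 6 whitespace-delimited words
```

Note that a subclass omitting `count_tokens` would raise a `TypeError` on instantiation, which is exactly the enforcement the abstract method buys us.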

The good news is LiteLLM makes this super simple, and no additional attributes are needed to make this work.

from dspy.backends.lm.litellm import LiteLLM

lm = LiteLLM(model="gpt-3.5-turbo")

# returns 13, the number of tokens in a string.
lm.count_tokens("hello this is a small test")

If there is another place where token counting would be preferred, I am happy to move this functionality over; I just thought I would give us an easy starting point.
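As an illustration of the Token Budgeting use case mentioned above, a caller could use `count_tokens` to greedily trim context to fit a budget. This is a hypothetical sketch, not code from this PR; `StubLM` and `fit_to_budget` are invented names, with a whitespace-split placeholder standing in for a real tokenizer:

```python
class StubLM:
    """Hypothetical LM exposing the proposed count_tokens interface."""

    def count_tokens(self, prompt: str) -> int:
        # Placeholder tokenizer: one token per whitespace-delimited word.
        return len(prompt.split())


def fit_to_budget(lm: StubLM, chunks: list[str], budget: int) -> list[str]:
    """Greedily keep leading chunks whose combined token count fits the budget."""
    kept, used = [], 0
    for chunk in chunks:
        cost = lm.count_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


lm = StubLM()
chunks = ["first example here", "second longer example text", "third one"]
print(fit_to_budget(lm, chunks, budget=5))  # keeps only the first chunk (3 tokens)
```

Having `count_tokens` on BaseLM means helpers like this can stay backend-agnostic.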

@CyrusOfEden

@CyrusNuevoDia CyrusNuevoDia merged commit 1fb1f85 into stanfordnlp:backend-refactor Feb 23, 2024
1 check failed