
[Suggestion] Add token counting to BaseLM #437

Merged: 1 commit, Feb 23, 2024

Conversation


@KCaverly KCaverly commented Feb 23, 2024

With a few projects under discussion, including Optimization Dry Runs (#397) and Token Budgeting, I thought we may want to add token counting functionality directly at the BaseLM level within the refactor.

This PR includes two small changes: an addition to the abstract methods on BaseLM, and a token counting implementation for LiteLLM.

class BaseLM(BaseModel, ABC):
    ...
    
    @abstractmethod
    def count_tokens(self, prompt: str) -> int:
        """Counts the number of tokens for a given prompt."""
        ...
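For context, any concrete backend would then be required to implement this method. Here is a minimal, self-contained sketch of what that contract looks like; the `WhitespaceLM` class and its whitespace-split "tokenizer" are hypothetical stand-ins for illustration, not part of this PR (the real BaseLM also extends BaseModel):

```python
from abc import ABC, abstractmethod


class BaseLM(ABC):
    """Simplified stand-in for the real BaseLM, which also extends BaseModel."""

    @abstractmethod
    def count_tokens(self, prompt: str) -> int:
        """Counts the number of tokens for a given prompt."""
        ...


class WhitespaceLM(BaseLM):
    """Hypothetical backend that approximates tokens by whitespace splitting."""

    def count_tokens(self, prompt: str) -> int:
        # A real backend would delegate to its model's tokenizer;
        # splitting on whitespace is only a placeholder.
        return len(prompt.split())


lm = WhitespaceLM()
print(lm.count_tokens("hello this is a small test"))  # 6 whitespace-delimited words
```

Note that a subclass omitting `count_tokens` would raise a `TypeError` on instantiation, which is exactly the enforcement the abstract method buys us.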

The good news is LiteLLM makes this super simple, and no additional attributes are needed to make this work.

from dspy.backends.lm.litellm import LiteLLM

lm = LiteLLM(model="gpt-3.5-turbo")

# returns 13, the number of tokens in a string.
lm.count_tokens("hello this is a small test")

If there is another place where token counting would be preferred, I am happy to move this functionality over; I just thought I would give us an easy starting point.
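As an illustration of the Token Budgeting use case mentioned above, a caller could use `count_tokens` to greedily trim context to fit a budget. This is a hypothetical sketch, not code from this PR; `StubLM` and `fit_to_budget` are invented names, with a whitespace-split placeholder standing in for a real tokenizer:

```python
class StubLM:
    """Hypothetical LM exposing the proposed count_tokens interface."""

    def count_tokens(self, prompt: str) -> int:
        # Placeholder tokenizer: one token per whitespace-delimited word.
        return len(prompt.split())


def fit_to_budget(lm: StubLM, chunks: list[str], budget: int) -> list[str]:
    """Greedily keep leading chunks whose combined token count fits the budget."""
    kept, used = [], 0
    for chunk in chunks:
        cost = lm.count_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


lm = StubLM()
chunks = ["first example here", "second longer example text", "third one"]
print(fit_to_budget(lm, chunks, budget=5))  # keeps only the first chunk (3 tokens)
```

Having `count_tokens` on BaseLM means helpers like this can stay backend-agnostic.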

@CyrusOfEden

@CyrusNuevoDia CyrusNuevoDia merged commit 1fb1f85 into stanfordnlp:backend-refactor Feb 23, 2024
1 check failed