The most compatible PHP implementation of OpenAI's original Tiktoken.
Requires PHP 8.1+.
Install Tiktoken via the Composer package manager:
composer require rahul900day/tiktoken-php
| Model       | Supported |
|-------------|-----------|
| GPT-3       | ✅        |
| GPT-3.5 & 4 | ✅        |
| GPT-4o      | ❌        |
use Rahul900day\Tiktoken\Tiktoken;
$encoder = Tiktoken::getEncodingForModel('gpt-4');
$encoder->encode("hello world aaaaaaaaaaaa");
$encoder->decode([9906, 4435]);
use Rahul900day\Tiktoken\Tiktoken;
$encoder = Tiktoken::getEncodingForModel('gpt-4');
$encoder->encode('<|endoftext|>', allowedSpecial: 'all');
Tiktoken always caches the encoding files it downloads from the server.
By default, it stores them in the system's default temporary directory. You
can override the cache location by setting the TIKTOKEN_CACHE_DIR
environment variable.
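As a minimal sketch of the override described above, the variable can be set from PHP before the first encoding is requested. The directory name `tiktoken-cache` is a hypothetical choice, and it is an assumption that the library reads the variable from the process environment at the time an encoding is loaded. The sketch therefore also sets `$_ENV` and `$_SERVER` to cover the common lookup paths.

```php
<?php

// Hypothetical cache location; any directory writable by the PHP process works.
$cacheDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tiktoken-cache';

// Create the directory if it does not exist yet.
if (!is_dir($cacheDir) && !mkdir($cacheDir, 0775, true) && !is_dir($cacheDir)) {
    throw new RuntimeException("Unable to create cache directory: {$cacheDir}");
}

// Assumption: the library looks the variable up in the process environment.
// Populating $_ENV and $_SERVER too covers the other usual lookup paths.
putenv("TIKTOKEN_CACHE_DIR={$cacheDir}");
$_ENV['TIKTOKEN_CACHE_DIR'] = $cacheDir;
$_SERVER['TIKTOKEN_CACHE_DIR'] = $cacheDir;

// Encodings requested after this point should be cached under $cacheDir.
```

Exporting TIKTOKEN_CACHE_DIR in the shell or in the web server or process manager configuration achieves the same result without touching application code, which is usually preferable in production.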
use Rahul900day\Tiktoken\Encodings\OpenAiPublic\Cl100KBaseEncoding;
class Cl100KIm extends Cl100KBaseEncoding
{
    protected function getName(): string
    {
        return 'cl100k_im';
    }

    protected function getSpecialTokens(): array
    {
        return [
            ...parent::getSpecialTokens(),
            "<|im_start|>" => 100264,
            "<|im_end|>" => 100265,
        ];
    }
}
use Rahul900day\Tiktoken\Registry;
use Rahul900day\Tiktoken\Tiktoken;
Registry::registerCustomEncoding('cl100k_im', new Cl100KIm);
$encoding = Tiktoken::getEncoding('cl100k_im');
// Expect: [100264]
$encoding->encode("<|im_start|>", allowedSpecial: 'all');
Please see CHANGELOG for more information on what has changed recently.
Please review our security policy on how to report security vulnerabilities.
This package is released under the MIT License.