enkota / tokenizer-x

OpenAI token calculator

Calculates the OpenAI tokens for the given prompt

TokenizerX

TokenizerX is a Laravel package for counting OpenAI tokens in your applications. The latest update adds support for GPT-4 models.

TokenizerX calculates the tokens required for a given prompt before you send a request to the OpenAI REST API. This helps ensure that a request does not exceed the OpenAI API token limit, so that the API can generate accurate responses.

To access the OpenAI REST API, consider the Laravel package OpenAI PHP.

Supported OpenAI Models

  • gpt-4
  • gpt-3.5-turbo
  • text-davinci-003
  • text-davinci-002
  • text-davinci-001
  • text-curie-001
  • text-babbage-001
  • text-ada-001
  • davinci
  • curie
  • babbage
  • ada
  • code-davinci-002
  • code-davinci-001
  • code-cushman-002
  • code-cushman-001
  • davinci-codex
  • cushman-codex
  • text-davinci-edit-001
  • code-davinci-edit-001
  • text-embedding-ada-002
  • text-similarity-davinci-001
  • text-similarity-curie-001
  • text-similarity-babbage-001
  • text-similarity-ada-001
  • text-search-davinci-doc-001
  • text-search-curie-doc-001
  • text-search-babbage-doc-001
  • text-search-ada-doc-001
  • code-search-babbage-code-001
  • code-search-ada-code-001

Supported Encoding

  • r50k_base
  • p50k_base
  • p50k_edit
  • cl100k_base
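
Each model is tokenized with one of the encodings above. The lookup below is a minimal sketch of that relationship for a few of the supported models. The pairs follow the mapping published in OpenAI's tiktoken library; `MODEL_ENCODINGS` and `encodingForModel()` are hypothetical names used for illustration and are not part of the TokenizerX API, whose internal table may differ.

```php
<?php

declare(strict_types=1);

// Partial model-to-encoding table (assumption: taken from tiktoken's
// published mapping, not read from TokenizerX itself).
const MODEL_ENCODINGS = [
    'gpt-4'                  => 'cl100k_base',
    'gpt-3.5-turbo'          => 'cl100k_base',
    'text-embedding-ada-002' => 'cl100k_base',
    'text-davinci-003'       => 'p50k_base',
    'code-davinci-002'       => 'p50k_base',
    'text-davinci-edit-001'  => 'p50k_edit',
    'davinci'                => 'r50k_base',
];

/**
 * Hypothetical helper: returns the encoding name for a model,
 * or null when the model is not in the table above.
 */
function encodingForModel(string $model): ?string
{
    return MODEL_ENCODINGS[$model] ?? null;
}
```

Because the encodings differ, the same text can produce different token counts for different models, which is why the model name matters when counting.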

Installation

You can install the package via composer:

composer require rajentrivedi/tokenizer-x

Usage

By default, the package uses the gpt-3 model:

use Rajentrivedi\TokenizerX\TokenizerX;
TokenizerX::count("how are you?");

If you want the token count for a specific OpenAI model, pass the model name as the second argument. It must be one of the supported models listed above.

use Rajentrivedi\TokenizerX\TokenizerX;
TokenizerX::count("how are you?", "gpt-4");
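
Once you have a count, you can check it against the model's limit before sending the request. The following is a minimal sketch, assuming `TokenizerX::count()` returns an integer; `fitsTokenBudget()` is a hypothetical helper, and the limit of 8192 is an assumed value that you should confirm in the OpenAI documentation for your model.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: true when the prompt plus the tokens reserved
 * for the reply fit inside the model's token limit.
 */
function fitsTokenBudget(int $promptTokens, int $tokenLimit, int $reservedForReply = 0): bool
{
    if ($promptTokens < 0 || $tokenLimit < 0 || $reservedForReply < 0) {
        throw new InvalidArgumentException('Token counts must be non-negative.');
    }

    return $promptTokens + $reservedForReply <= $tokenLimit;
}

// In an application the count would come from the package, for example:
// $promptTokens = \Rajentrivedi\TokenizerX\TokenizerX::count($prompt, 'gpt-4');
$promptTokens = 120;   // placeholder standing in for the package result
$tokenLimit   = 8192;  // assumed limit, not taken from the package

echo fitsTokenBudget($promptTokens, $tokenLimit, 500)
    ? "Prompt fits\n"
    : "Prompt too long\n";
```

Reserving some tokens for the reply is a design choice: for models that share one limit between prompt and completion, a prompt that fills the whole limit leaves no room for a response.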

You can also read the text from a file:

TokenizerX::count(file_get_contents('path_to_file'));

Make sure the text of the file does not change while it is read programmatically; this can happen because of encoding differences. You can check the generated token IDs with the following:

TokenizerX::tokens(file_get_contents('path_to_file'));

This returns an array of the generated token IDs, which you can compare with the token IDs shown by the OpenAI Tokenizer.

You can also use the OpenAI Tokenizer to double-check the token counts the package generates.
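
If counts for a file differ from what you expect, one possible cause is the file's encoding. The sketch below shows one way to guard against that before counting, assuming the file is meant to be UTF-8. `prepareTextForCounting()` is a hypothetical helper, not part of the package; it strips a leading byte order mark and rejects input that is not valid UTF-8.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: cleans text read from disk before tokenizing.
 * Removes a leading UTF-8 BOM and throws if the bytes are not valid UTF-8.
 */
function prepareTextForCounting(string $raw): string
{
    // Some editors prepend a 3-byte UTF-8 byte order mark.
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        $raw = substr($raw, 3);
    }

    // An empty pattern with the /u modifier matches only valid UTF-8 subjects.
    if (preg_match('//u', $raw) !== 1) {
        throw new UnexpectedValueException('Input is not valid UTF-8.');
    }

    return $raw;
}

// Usage with the package (shown as a comment, since it needs the package installed):
// $text = prepareTextForCounting(file_get_contents('path_to_file'));
// $count = \Rajentrivedi\TokenizerX\TokenizerX::count($text);
```

Whether to strip the byte order mark depends on what you are comparing against; text pasted into a web tokenizer usually does not include it.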

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

⭐ Star the Repository ⭐

If you find this project useful or interesting, I kindly request you to give it a ⭐ star on GitHub. Your support will encourage and motivate me to continue improving and maintaining this project.

By starring the repository, you can show appreciation for the work put into developing this open-source project. It also helps to increase its visibility, making it more accessible to other developers and potentially attracting contributors.

To give a ⭐ star, simply click on the Star button at the top-right corner of the repository page.

Credits

License

TokenizerX is developed using

The MIT License (MIT). Please see License File for more information.
