[New Free Plugin] Tokenizer for GPT-3.5 & GPT-4

Hey fam, this is my first published plugin. It's super simple but useful: it counts the tokens in an inputted text.

Link: Tokenizer Plugin | Bubble

What does the Tokenizer do?
It accurately calculates the token count of a given text input using GPT-3 and GPT-4’s byte pair encoding (BPE) method. With seamless integration into your Bubble.io app’s workflows, this plugin helps you better understand text complexity and manage API usage when working with GPT-3 and GPT-4 powered applications.

Built using this package suggested by OpenAI - gpt-3-encoder - npm
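Not the plugin's actual BPE encode, but if you just want a quick ballpark without installing anything, OpenAI's rough rule of thumb of ~4 characters per token (for English text) can be sketched in plain JS:

```javascript
// Rough token estimate using the ~4 characters per token rule of thumb.
// This is NOT the plugin's BPE count -- just a cheap sanity-check figure.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens('Hello, Bubble world!')); // 20 chars -> ~5 tokens
```

For an exact count you still want the BPE method the plugin uses; the heuristic drifts for code, non-English text, and unusual punctuation.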

Example of use:

  • Inform users/your workflows when the text input exceeds a certain token count, prompting them/your workflows to shorten the text or break it into smaller chunks.
  • Estimate the size of an API call before sending it, helping you manage API usage, avoid unexpected costs, and prevent API errors.
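The first use case can be sketched as a simple guard, assuming you already have the token count (e.g. from the plugin's output). The 4096 limit here is just an example; use your model's actual context size:

```javascript
// Hypothetical pre-flight check before sending text to a GPT API.
// tokenCount would come from the plugin's 'Get token count' step.
const TOKEN_LIMIT = 4096; // example only -- check your model's real context window

function checkTokenBudget(tokenCount, limit = TOKEN_LIMIT) {
  if (tokenCount > limit) {
    // Caller should shorten the text or split it into chunks.
    return { ok: false, excess: tokenCount - limit };
  }
  return { ok: true, excess: 0 };
}

console.log(checkTokenBudget(5000)); // over budget by 904 tokens
```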

How to use:

  • Install the plugin.
  • In workflows, find ‘Get token count’.
  • The input takes your text.
  • The output is available in the following workflow step via the expression “Result of step ‘#’ (Get token count)'s token count”.

Putting more than 10k words into it can potentially time out the plugin, so test really large docs before going to production and have measures in place in case it fails. But for most use cases 10,000 words is enough!

The plugin page doesn’t have a direct link yet, so just search for it.

Hope someone finds it useful, but I built it for me so I’m happy regardless :grin:


I love it when someone has already done the thing I was looking for… will test it out now!


Great to hear, I hope it helps! It’s not super fast, but it’s been very useful over the 2 weeks I’ve used it.

Separate question… but since you’ve done the counting.

What do you use to split docs, if you’ve needed to do this for ‘chunking’?

This post is what you want I reckon - Langchain with vector database connected to Bubble - #46 by jeffbuze

The whole thread is good
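If you just need something quick in the meantime, a naive word-based split is easy to sketch (not token-aware; the thread above covers proper chunking strategies):

```javascript
// Naive word-based chunking -- a rough stand-in for token-aware splitting.
// maxWords is a hypothetical per-chunk budget you'd tune for your model.
function chunkByWords(text, maxWords) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(' '));
  }
  return chunks;
}

console.log(chunkByWords('one two three four five', 2));
// -> [ 'one two', 'three four', 'five' ]
```

It ignores sentence boundaries and token counts entirely, so treat it as a placeholder until you wire up something like the Langchain approach from the thread.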

Many thanks!!!