Hi all,
Bubble finally has a Hugging Face plugin!
This plugin opens a doorway to hundreds of AI models hosted on the Hugging Face platform. You can interact with these models in one of two ways: through the serverless Inference API or by means of a transformer. The Inference API option is generally faster than using a transformer, but not every model is available for each inference type.
Demo page is here:
https://paul-testing-4.bubbleapps.io/version-test/hugging_face
Editor is here:
Paul-testing-4 | Bubble Editor
Instructions are here:
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions
As an example, this image shows the response from a model using the “Object Detection” task.
If you decide to use this plugin, expect errors to begin with, because it may take some testing and experimenting with the options before you get things working. One of the biggest challenges I’ve had is that models have different requirements: of the models that accept a file, for example, some want it passed as a URL whilst others want the base64 data. Because of this, each action has a number of options that let you choose the format.
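For anyone curious what the URL versus base64 choice amounts to outside Bubble, here is a minimal Python sketch. It is not the plugin's code: the helper name `build_file_input` and the mode names `"url"` and `"base64"` are made up for illustration, and the plugin exposes this choice through its own action options.

```python
import base64


def build_file_input(mode: str, *, url: str = "", data: bytes = b"") -> str:
    """Return a file reference in the form a given model expects.

    Hypothetical helper: 'url' and 'base64' are illustrative mode names,
    not option names taken from the plugin.
    """
    if mode == "url":
        if not url.startswith(("http://", "https://")):
            raise ValueError("expected an http(s) URL")
        return url
    if mode == "base64":
        if not data:
            raise ValueError("expected non-empty file bytes")
        # Plain base64 text, without a "data:" URI prefix.
        return base64.b64encode(data).decode("ascii")
    raise ValueError(f"unknown mode: {mode!r}")


# The same file, described two different ways.
as_url = build_file_input("url", url="https://example.com/cat.png")
as_b64 = build_file_input("base64", data=b"\x89PNG")
```

If a model rejects one form, switching to the other is usually the first thing worth trying.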
Be sure to set up the error event, which is called “error has occurred”, and keep tabs on the “error” state for details. As always, if you run into any problems, just shout and I’ll help where I can.
I’ve created a small error section within the instructions page that I will probably expand over time.
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions?debug_mode=true&tab=5
Actions
There are over 25 actions, each responsible for running models under a specific task. Tasks are the categories Hugging Face uses to group models. I’ve listed the actions below to give a quick insight into their names and what they do.
- Chat Completion
  This action runs the “Chat Completion” task, which is a specific type of text generation that focuses on generating responses to user input within a conversational environment.
- Text Generation
  This action runs the “Text Generation” task, which is a general-purpose task that involves generating new text based on a given prompt. It can predict the next word, sentences or entire paragraphs.
- Text Classification
  This action runs the “Text Classification” task, which assigns a label or class to a given text. Depending on the functionality of the model, it can be used for cases such as sentiment analysis, natural language inference, assessing grammatical correctness etc.
- Audio Classification
  This action runs the “Audio Classification” task, which assigns a label or class to a given audio clip. Depending on the functionality of the model, it can be used for cases such as recognizing specific commands, detecting the emotion of a statement, identifying speakers etc.
- Audio To Audio
  This action runs the “Audio To Audio” task, which accepts an audio file and outputs one or more generated audio files. Example uses include speech enhancement, source separation, language conversion etc.
- Automatic Speech Recognition
  This action runs the “Automatic Speech Recognition” task, which accepts audio and transcribes it into text. This is also known as Speech to Text (STT).
- Fill Mask
  This action runs the “Fill Mask” task, which attempts to predict missing words in a sentence.
- Summarization
  This action runs the “Summarization” task, which attempts to produce a shorter version of the content whilst preserving important information.
- Question Answering
  This action runs the “Question Answering” task, which can retrieve answers to questions from a given text.
- Table Question Answering
  This action runs the “Table Question Answering” task, which answers a question about information within a given table or JSON data structure.
- Token Classification
  This action runs the “Token Classification” task, which can be used for parsing sentences, either grammatically or through Named Entity Recognition (NER), to understand keywords contained within text.
- Translation
  This action runs the “Translation” task, which can be used for converting text from one language to another.
- Sentence Similarity
  This action runs the “Sentence Similarity” task (also known as feature extraction when using transformers), which can be used for calculating the semantic similarity between one text and a list of other sentences.
- Text To Speech
  This action runs the “Text To Speech” task (TTS), which generates natural-sounding speech from text input.
- Image Classification
  This action runs the “Image Classification” task, which assigns a label or class to a given image along with a probability score.
- Object Detection
  This action runs the “Object Detection” task, which detects objects within an image and returns labels with corresponding bounding boxes and probability scores.
- Image Segmentation
  This action runs the “Image Segmentation” task, which detects segments within an image and returns labels with corresponding bounding boxes and probability scores.
- Image To Text
  This action runs the “Image To Text” task, which outputs text from a given image, commonly used for captioning or optical character recognition.
- Text To Image
  This action runs the “Text To Image” task, which creates an image from a text prompt.
- Image To Image
  This action runs the “Image To Image” task, which can transform a source image in a variety of ways. Depending on the functionality of the model, it can be used for cases such as super-resolution, image inpainting, colorization, inference etc.
- Visual Question Answering
  This action runs the “Visual Question Answering” task, which answers open-ended questions based on an image.
- Document Question Answering
  This action runs the “Document Question Answering” task, which answers questions on document images.
- Tabular Regression
  This action runs the “Tabular Regression” task, which predicts a numerical value given a set of attributes.
- Tabular Classification
  This action runs the “Tabular Classification” task, which classifies a target category (a group) based on a set of attributes.
- Zero Shot Classification
  This action runs the “Zero Shot Classification” task, which checks how well an input text fits into a set of labels you provide.
- Zero Shot Image Classification
  This action runs the “Zero Shot Image Classification” task, which checks how well an input image fits into a set of labels you provide.
- Depth Estimation
  This action runs the “Depth Estimation” task, which predicts the depth of objects within an image.
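To give a feel for what a task response looks like, here is a rough Python sketch that works with an Object Detection result (labels, bounding boxes and probability scores, as described in the list above). The field names `label`, `score` and `box` with `xmin`/`ymin`/`xmax`/`ymax` follow the shape Hugging Face documents for this task, but treat them as an assumption: the plugin may expose the data differently inside Bubble. The sample values and both helper functions are made up for illustration.

```python
from typing import List, TypedDict


class Box(TypedDict):
    xmin: int
    ymin: int
    xmax: int
    ymax: int


class Detection(TypedDict):
    label: str
    score: float
    box: Box


def confident_detections(
    detections: List[Detection], min_score: float = 0.5
) -> List[Detection]:
    """Keep detections at or above min_score, highest score first."""
    kept = [d for d in detections if d["score"] >= min_score]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


def box_area(box: Box) -> int:
    """Area of a bounding box in square pixels (0 if the box is inverted)."""
    width = max(0, box["xmax"] - box["xmin"])
    height = max(0, box["ymax"] - box["ymin"])
    return width * height


# Made-up sample response, for illustration only.
sample: List[Detection] = [
    {"label": "sofa", "score": 0.75,
     "box": {"xmin": 0, "ymin": 100, "xmax": 400, "ymax": 300}},
    {"label": "dog", "score": 0.31,
     "box": {"xmin": 50, "ymin": 60, "xmax": 90, "ymax": 120}},
    {"label": "cat", "score": 0.98,
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
]

best = confident_detections(sample)
```

Filtering on the score first is a simple way to avoid drawing boxes for weak guesses.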
The serverless Inference API requires an access token from Hugging Face; the associated actions can then invoke a model and return its data.
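As a rough idea of what an access-token request looks like under the hood, here is a Python sketch that builds (but does not send) a request. Several things here are assumptions and not taken from the plugin: the base URL follows the classic serverless endpoint pattern, which Hugging Face may have changed; `some-user/some-model` and `hf_xxx` are placeholders; and `build_inference_request` is a made-up helper name.

```python
import json
import urllib.request

# Assumption: classic serverless endpoint pattern; check the current
# Hugging Face documentation before relying on it.
API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(
    model_id: str, token: str, payload: dict
) -> urllib.request.Request:
    """Build a POST request that carries the access token as a Bearer header."""
    if not token:
        raise ValueError("an access token is required")
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder model id and token, for illustration only.
request = build_inference_request(
    "some-user/some-model", "hf_xxx", {"inputs": "Bubble is great"}
)

# Sending it would look like this (needs a real token and model):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

The plugin handles this step for you; the sketch only shows why the token is needed before any model can be invoked.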
Using a transformer downloads the model and runs it directly within the browser (no access token required), but a worker file must be uploaded to your root files directory to offload the computations and optimize performance.
If you need to set up the worker file, refer to these instructions:
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions?tab=6
I’m hoping the plugin will prove to be a valuable tool for all to use.
Paul