[PLUGIN] - Hugging Face

Hi all,

Bubble finally has a Hugging Face plugin!

This plugin opens a doorway to hundreds of AI models hosted on the Hugging Face platform. It lets you interact with these models in one of two ways: through the serverless Inference API or by means of a transformer. The Inference API option is generally quicker than using a transformer, but not every model is available for both inference types.

Demo page is here:
https://paul-testing-4.bubbleapps.io/version-test/hugging_face

Editor is here:
Paul-testing-4 | Bubble Editor

Instructions are here:
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions


As an example, this image shows the response from a model using the ‘Object Detection’ task.

If you decide to use this plugin, expect errors to begin with, because it may take some testing and playing around with the options before you get things working. One of the biggest challenges I’ve had is that models have different requirements: of the models that accept a file, for example, some want it passed as a URL whilst others want the base64 data. Because of this, each action has a number of options that let you choose.
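
To make the URL-versus-base64 point concrete, here is a minimal Python sketch of the two forms a file input can take. It is not the plugin's code; the function names and the `mode` values are made up for illustration, and the plugin exposes this choice through its action options instead.

```python
import base64


def as_base64(data: bytes) -> str:
    """Encode raw file bytes as a base64 string, for models that want the data inline."""
    return base64.b64encode(data).decode("ascii")


def prepare_file_input(mode: str, *, url: str = "", data: bytes = b"") -> str:
    """Return the file in the form a model expects: a URL or base64 text.

    `mode` is a hypothetical switch standing in for the plugin's per-action option.
    """
    if mode == "url":
        if not url:
            raise ValueError("mode 'url' needs a file URL")
        return url
    if mode == "base64":
        if not data:
            raise ValueError("mode 'base64' needs the file bytes")
        return as_base64(data)
    raise ValueError(f"unknown mode: {mode!r}")
```

If a model rejects one form, switching to the other is usually the first thing worth trying.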

Be sure to set up the error event, called “error has occurred”, and keep tabs on the “error” state for details. As always, if you run into any problems, just shout and I’ll help where I can.

I’ve created a small error section within the instructions page that I will probably expand on over time.
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions?debug_mode=true&tab=5


Actions

There are over 25 actions, each responsible for running models under a specific task. Tasks are the categories Hugging Face uses to group models. I’ve listed the actions below to give a quick insight into their names and what they do.

  • Chat Completion
    This action runs the ‘Chat Completion’ task which is a specific type of text generation that focuses on generating responses to user input within a conversational environment.

  • Text Generation
    This action runs the ‘Text Generation’ task which is a general purpose task that involves generating new text based on a given prompt. It can predict the next word, sentences or entire paragraphs.

  • Text Classification
    This action runs the ‘Text Classification’ task which assigns a label or class to a given text. Depending on the functionality of the model, it can be used for cases such as sentiment analysis, natural language inference, assessing grammatical correctness etc.

  • Audio Classification
    This action runs the ‘Audio Classification’ task which assigns a label or class to a given audio. Depending on the functionality of the model, it can be used for cases such as recognizing specific commands, the emotions of a statement, identifying speakers etc.

  • Audio To Audio
    This action runs the ‘Audio To Audio’ task which accepts an audio file and outputs one or more generated audio files. Example models handle speech enhancement, source separation, language conversion etc.

  • Automatic Speech Recognition
    This action runs the ‘Automatic Speech Recognition’ task which accepts audio and transcribes it into text. This is also known as Speech to Text (STT).

  • Fill Mask
    This action runs the ‘Fill Mask’ task which attempts to predict missing words in a sentence.

  • Summarization
    This action runs the ‘Summarization’ task which attempts to produce a shorter version of the content whilst preserving important information.

  • Question Answering
    This action runs the ‘Question Answering’ task which can retrieve answers to questions from a given text.

  • Table Question Answering
    This action runs the ‘Table Question Answering’ task which is the answering of a question about information within a given table or JSON data structure.

  • Token Classification
    This action runs the ‘Token Classification’ task which can be used for sentence parsing, either grammatical or Named Entity Recognition (NER) to understand keywords contained within text.

  • Translation
    This action runs the ‘Translation’ task which can be used for converting text from one language to another.

  • Sentence Similarity
    This action runs the ‘Sentence Similarity’ task (also known as feature extraction when using transformers) which can be used for calculating the semantic similarity between one text and a list of other sentences.

  • Text To Speech
    This action runs the ‘Text To Speech’ task (TTS) which generates natural-sounding speech from text input.

  • Image Classification
    This action runs the ‘Image Classification’ task which assigns a label or class to a given image along with a probability score.

  • Object Detection
    This action runs the ‘Object Detection’ task which detects objects within an image and returns labels with corresponding bounding boxes and probability scores.

  • Image Segmentation
    This action runs the ‘Image Segmentation’ task which detects segments within an image and returns labels with corresponding masks and probability scores.

  • Image To Text
    This action runs the ‘Image To Text’ task which outputs text from a given image, commonly used for captioning or optical character recognition.

  • Text To Image
    This action runs the ‘Text To Image’ task which creates an image from a text prompt.

  • Image To Image
    This action runs the ‘Image To Image’ task which can transform a source image in a variety of ways. Depending on the functionality of the model, it can be used for cases such as super-resolution, image inpainting, colorization, inference etc.

  • Visual Question Answering
    This action runs the ‘Visual Question Answering’ task which answers open-ended questions based on an image.

  • Document Question Answering
    This action runs the ‘Document Question Answering’ task which answers questions on document images.

  • Tabular Regression
    This action runs the ‘Tabular Regression’ task which predicts a numerical value given a set of attributes.

  • Tabular Classification
    This action runs the ‘Tabular Classification’ task which classifies a target category (a group) based on a set of attributes.

  • Zero Shot Classification
    This action runs the ‘Zero Shot Classification’ task which checks how well an input text fits into a set of labels you provide.

  • Zero Shot Image Classification
    This action runs the ‘Zero Shot Image Classification’ task which checks how well an input image fits into a set of labels you provide.

  • Depth Estimation
    This action runs the ‘Depth Estimation’ task which predicts the depth of objects within an image.
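
To give a feel for the kind of data the ‘Object Detection’ action above deals with, here is a rough Python sketch. It assumes the response shape commonly returned by Hugging Face object detection models (a list of items with `label`, `score` and a `box` holding `xmin`, `ymin`, `xmax`, `ymax`). The sample values and helper names are invented, and the plugin may expose the result differently inside Bubble.

```python
from typing import List, Tuple, TypedDict


class Box(TypedDict):
    xmin: int
    ymin: int
    xmax: int
    ymax: int


class Detection(TypedDict):
    label: str
    score: float
    box: Box


def confident_detections(detections: List[Detection], min_score: float = 0.9) -> List[Detection]:
    """Keep only detections whose probability score is at least min_score."""
    return [d for d in detections if d["score"] >= min_score]


def box_size(box: Box) -> Tuple[int, int]:
    """Return (width, height) of a bounding box in pixels."""
    return (box["xmax"] - box["xmin"], box["ymax"] - box["ymin"])


# Invented sample data in the assumed response shape.
SAMPLE: List[Detection] = [
    {"label": "cat", "score": 0.98, "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
    {"label": "sofa", "score": 0.42, "box": {"xmin": 0, "ymin": 150, "xmax": 300, "ymax": 400}},
]
```

Filtering on the score like this is a simple way to hide low-confidence boxes before drawing them over the image.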


The serverless Inference API requires an access token from Hugging Face. Once the token is provided, the associated actions can invoke a model and return its data.
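
For anyone curious what the token is used for, here is a hedged Python sketch of a token-authenticated request to a hosted model. The endpoint pattern in `API_BASE`, the model id and the token value are assumptions for illustration, so check the Hugging Face documentation for the current URL. The plugin builds its own requests, and this code only constructs a request without sending it.

```python
import json
import urllib.request

# Assumption: the classic serverless endpoint pattern. Verify against the
# current Hugging Face documentation before relying on it.
API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(model_id: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the access token."""
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # the Hugging Face access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical model id and placeholder token, for illustration only.
request = build_inference_request("some-org/some-model", "hf_xxx", {"inputs": "Hello"})
# Sending it would be: urllib.request.urlopen(request)
```

Without a valid token in the `Authorization` header the request is typically rejected, which is why the Inference API actions need the token set up first.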

Using a transformer downloads the models and runs them directly within the browser (no access token required), but it requires a worker file to be uploaded to your root files directory to offload the computations and optimize performance.

If you need to set up the worker file, refer to these instructions:
https://paul-testing.bubbleapps.io/version-test/hugging_face_instructions?tab=6

I’m hoping the plugin will prove to be a valuable tool for all to use.

Paul

Wow, what a plugin! :hugs:

Looking forward to having a go with this and seeing what can be built.
