[PLUGIN] PDF Textractor (Free & PRO) - The Ultimate AI-Ready Text & Metadata Extractor! 🤖

tiagoncpereira · April 24, 2026, 1:32pm

Hey Bubblers!

If you are building AI apps, RAG systems, or document analyzers, you know the pain of dealing with PDFs. Sending whole PDF files to OpenAI or Anthropic is slow, prone to errors, and burns through your API tokens rapidly.

I built PDF Textractor to solve this exact problem. It converts digital PDFs into raw, clean text server-side before you ever touch an AI API. And today, I’m launching two versions!

PDF Textractor (FREE): The leanest, fastest way to get text out of a PDF.

Pass a PDF URL → Get a single string of clean text.
100% Server-side and completely free forever.

PDF Textractor PRO (For Advanced Workflows): Designed to save you money on AI tokens and give you deep document insights.

Specific Page Extraction: Only need the summary on page 3 of a 50-page report? Just input “3”. Need specific ranges? Input “1-3, 7”. You extract (and pay AI for) only what you need!
Hidden Metadata: Instantly extract the Page Count, Title, Author, and Creation Date to auto-populate your database.
Bulletproof: Advanced error handling so your app never crashes on a bad file.
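
For anyone curious how a page input like “1-3, 7” could be turned into a list of pages, here is a minimal TypeScript sketch. It is not the plugin's actual code: the `parsePageSpec` name, the choice to silently drop pages beyond the document's page count, and the error messages are all assumptions made for illustration.

```typescript
// Hypothetical helper, not the plugin's real parser.
// Turns a spec such as "1-3, 7" into a sorted list of unique 1-based page numbers.
function parsePageSpec(spec: string, pageCount: number): number[] {
  const pages: number[] = [];
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page token: "${token}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    // Assumption: pages past the end of the document are dropped, not an error.
    for (let p = start; p <= Math.min(end, pageCount); p++) {
      pages.push(p);
    }
  }
  // Remove duplicates (e.g. "1-3, 2") and sort ascending.
  return pages
    .filter((p, i, all) => all.indexOf(p) === i)
    .sort((a, b) => a - b);
}

const selected = parsePageSpec("1-3, 7", 50);
console.log(selected);
```

Only the text of the selected pages would then be sent on to the AI prompt, which is where the token savings come from.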

Use Cases:

Feeding clean, targeted text to ChatGPT / Claude prompts.
Creating searchable text indexes in your database.
Setting up conditional workflows (e.g., “If PDF is > 10 pages, trigger a background task”).

Links:

Free Version
Pro Version

Check them out on the marketplace and let me know what you think! I’m already planning Phase 2 (Client-side extraction & OCR), so feedback is super welcome.

Happy building!

tiagoncpereira · April 30, 2026, 1:38pm

Hey Bubblers!

If you build AI apps, document parsers, or accounting tools, you know the struggle of extracting text from PDFs. Running massive PDFs on Bubble’s servers eats up your Workload Units (WUs) and often results in 14-second timeout errors. And if the PDF is a scanned invoice? You just get empty text.

Today, I’m thrilled to announce the V2 “God Mode” Update for PDF Textractor PRO!

We have completely rebuilt the architecture to bring Enterprise-level processing directly to your users’ browsers.

Here is what makes V2 a game-changer:

Zero-WU Client-Side Engine: We introduced a new visual element that does all the heavy lifting in the user’s browser. You can now extract 500-page documents in seconds without touching your Bubble server capacity.

Hybrid AI OCR (Tesseract.js): No more empty results! If the plugin detects that a page is a scanned image (like a photo of a receipt), it automatically fires up the OCR engine to read the pixels.

The “Force OCR” Superpower: Dealing with complex scientific papers with 2 columns and side-notes? Normal extractors merge the columns horizontally into a messy “word soup”. Just turn on Force OCR, and our AI will visually segment the blocks, keeping the reading order absolutely perfect!
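
The hybrid and Force OCR behaviour described above boils down to a per-page decision: use the embedded text if there is any, otherwise fall back to OCR. The TypeScript sketch below shows one way such a decision could be made. The post does not say how the plugin actually detects a scanned page, so the character-count heuristic, the `minChars` threshold of 20, and the `pagesNeedingOcr` name are assumptions.

```typescript
// Hypothetical shape for the text layer pulled from one PDF page.
interface PageText {
  pageNumber: number;
  text: string;
}

// Returns the page numbers that should go through OCR.
// Assumption: a page with almost no embedded text is treated as a scanned image.
function pagesNeedingOcr(
  pages: PageText[],
  forceOcr: boolean = false,
  minChars: number = 20
): number[] {
  return pages
    .filter((page) => forceOcr || page.text.trim().length < minChars)
    .map((page) => page.pageNumber);
}

const samplePages: PageText[] = [
  { pageNumber: 1, text: "A digital page with a normal embedded text layer." },
  { pageNumber: 2, text: "   " }, // scanned receipt: no usable text layer
];

console.log(pagesNeedingOcr(samplePages));       // hybrid mode
console.log(pagesNeedingOcr(samplePages, true)); // Force OCR on every page
```

The pages returned here would be the ones rendered to an image and handed to the OCR engine; everything else keeps its faster, embedded text.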

Live Progress UI: Because OCR can take a few seconds on large documents, the engine exposes a live progress_percent state. You can finally build those beautiful, Netflix-style loading bars while your users wait!

Best of both worlds: The plugin still includes the Server-Side action for your backend workflows and webhooks. You choose the right tool for the job!

Check out the update and let me know what you think below! Happy building!

tiagoncpereira · May 4, 2026, 2:10pm

Hey Bubble community!

If you are building an ERP, an accounting app, or an AI-wrapper that handles heavy document workflows, you know the struggle. Setting up recursive backend workflows just to extract text from 50 invoices or contracts is a massive headache, and it absolutely devours your server capacity (WUs).

Today, I’m thrilled to announce V3 of PDF Textractor (Pro), and it brings the exact feature you’ve been asking for: The Batch Processor!

We’ve introduced a brand new Client-Side action called Extract Multiple Documents. Instead of processing PDFs one by one, you can now pass an entire List of PDF URLs directly into a single workflow action. The plugin will automatically build a queue, process them sequentially in the user’s browser, and return a clean List of Texts!

What makes the V3 Batch Processor a game-changer?

Automatic Queue Management: Feed it 10, 20, or 50 PDFs at once. The plugin handles the queue automatically behind the scenes without freezing your user’s browser.
Global Progress Tracking: Say goodbye to blind uploads. We added new states (overall_progress_percent, current_file_index, total_files) so you can build beautiful, accurate progress bars that tell your users exactly where they are in the bulk process (e.g., “Processing file 5 of 20 - 45%”).
Zero Server Costs (Save WUs!): Because this massive bulk extraction happens 100% Client-Side, it uses the user’s browser power. Your Bubble WUs are completely safe!

How easy is it to use?

Trigger the Extract Multiple Documents action and pass a list of URLs (e.g., Search for Invoices’s File URL).
Bind a progress bar on your page to PDF Textractor’s overall_progress_percent.
When the new Batch Extraction Finished event triggers, save the extracted_texts_list to your database. The extracted texts are output in the exact same order as your input URLs! Done!
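
The queue, progress states, and ordering guarantee described above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the plugin's source: `extractMultiple`, the `Extractor` callback, and the choice to store an empty string for a failed file (so outputs stay aligned with inputs) are all hypothetical. The progress field names mirror the states mentioned in the post.

```typescript
// Hypothetical single-file extractor: takes a URL, resolves to the PDF's text.
type Extractor = (url: string) => Promise<string>;

interface BatchProgress {
  current_file_index: number;      // 1-based index of the file being processed
  total_files: number;
  overall_progress_percent: number;
}

// Processes the URLs one at a time, so results come back in input order.
async function extractMultiple(
  urls: string[],
  extractOne: Extractor,
  onProgress: (progress: BatchProgress) => void
): Promise<string[]> {
  const texts: string[] = [];
  for (let i = 0; i < urls.length; i++) {
    onProgress({
      current_file_index: i + 1,
      total_files: urls.length,
      overall_progress_percent: Math.round((i / urls.length) * 100),
    });
    try {
      texts.push(await extractOne(urls[i]));
    } catch (err) {
      // Assumption: a failed file yields "" so the list stays aligned with the input.
      texts.push("");
    }
  }
  onProgress({
    current_file_index: urls.length,
    total_files: urls.length,
    overall_progress_percent: 100,
  });
  return texts;
}
```

Awaiting each file before starting the next keeps memory use flat and makes the output order trivially match the input order, at the cost of not processing files in parallel.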

I built this specifically to solve the headaches of bulk document processing in Bubble. Let me know what you think below, and drop any questions or feedback!

Happy building!

tiagoncpereira · May 13, 2026, 1:32pm

Hey Bubblers!

One of the biggest complaints about Document Management apps in Bubble is that they look like 1990s file directories. Listing files as invoice_final_v2.pdf is boring and slows down users.

With PDF Textractor V4, we are introducing a way to make your apps look like a modern SaaS: Client-Side PDF Thumbnails!

How it works: Since we already have a powerful rendering engine for OCR, we’ve unlocked a new action that “takes a photo” of any PDF page.

Features:

Page-to-Image: Convert Page 1 (or any page) into a Base64 image string.
Blazing Fast: No external APIs, no server-side processing. It’s all done in the browser.
UI Overhaul: Perfect for creating “Document Galleries” or “File Previews” in your repeating groups.
Easy Integration: Just plug the output state into a standard Image element and you’re done!
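
If you ever need to use the Base64 output outside a Bubble Image element (for example in an HTML element), the string has to be turned into a data URL. The TypeScript sketch below shows the idea. Whether the plugin's output already includes the `data:` prefix, and whether the image is PNG, is not stated in the post, so `toImageSrc` handles both cases and the `image/png` default is an assumption.

```typescript
// Hypothetical helper: builds a value usable as an <img> src from a Base64 string.
function toImageSrc(base64: string, mimeType: string = "image/png"): string {
  const trimmed = base64.trim();
  // If the string is already a data URL, pass it through unchanged.
  if (trimmed.indexOf("data:") === 0) {
    return trimmed;
  }
  return `data:${mimeType};base64,${trimmed}`;
}

console.log(toImageSrc("iVBORw0KGgo="));
```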

Check out how much better your app looks when users can actually see what they are about to click on.

Available now in the marketplace! Let me know your thoughts!

tiagoncpereira · June 1, 2026, 1:33pm

Hey everyone!

We are back with an absolute game-changer for anyone building document-heavy SaaS platforms, CRMs, or invoicing tools in Bubble.

Welcome to PDF Textractor Pro V5 – The “Data Miner” Update!

The Problem: Up until now, extracting text from a PDF meant getting a massive, unformatted block of words. If you wanted to pull out specific data points like the client’s email, phone number, or total invoice amounts, you had to write insane custom Regex patterns inside Bubble or spend hard-earned cash sending thousands of tokens to the OpenAI API just to parse a simple receipt.

The V5 Solution: We built a local, lightning-fast regular expression engine directly inside the plugin. Now, you can extract structured data completely natively, with zero setup and ZERO AI API token costs!

Here is what’s new in V5:

Extract Emails Natively: Turn on a checkbox, and get a clean list of all unique email addresses scraped from the PDF.

Extract Phone Numbers: Automatically extracts international contact numbers and formats them into a usable list.

Extract Currency Values: Perfect for processing invoices! Instantly grabs financial figures ($85.00, €93.50, 500 EUR) straight into a Bubble list.

Automated Deduplication: If an email or phone number appears on every single page header or footer, our engine automatically deduplicates it. Your database only gets clean, unique data.
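
To give a feel for what regex-based extraction with deduplication looks like, here is a minimal TypeScript sketch covering emails and currency values. The patterns, the `mineData` name, and the lowercase normalisation of emails are assumptions for illustration; the plugin's real patterns are not shown in this post and are likely more thorough. Phone numbers are left out because international formats vary too much for a short example.

```typescript
// Illustrative patterns only, not the plugin's own.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// Matches a leading symbol ($85.00, €93.50) or a trailing code (500 EUR).
const CURRENCY_RE =
  /[$€£]\s?\d(?:[\d.,]*\d)?|\d(?:[\d.,]*\d)?\s?(?:USD|EUR|GBP)/g;

// Keeps the first occurrence of each value, preserving order.
function unique(values: string[]): string[] {
  return values.filter((value, index, all) => all.indexOf(value) === index);
}

interface MinedData {
  emails: string[];
  currencyValues: string[];
}

function mineData(text: string): MinedData {
  const emails = (text.match(EMAIL_RE) || []).map((e) => e.toLowerCase());
  const currencyValues = text.match(CURRENCY_RE) || [];
  return {
    emails: unique(emails),
    currencyValues: unique(currencyValues),
  };
}

const sampleText =
  "Contact: Bob@Example.com\nTotal $85.00\n" +
  "Footer: bob@example.com, page 2, €93.50, 500 EUR";
console.log(mineData(sampleText));
```

Lowercasing before deduplicating means the same address repeated in a header and a footer with different casing collapses to a single entry.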

How fast is it?

Because it runs natively within our client-side and server-side workflows, the data mining happens concurrently with the core text extraction. It takes mere milliseconds!

Check out the implementation guide in the plugin documentation, update your workflows to use the new “yes/no” boolean triggers, and start saving money on your AI parsing workflows today!

We’d love to hear your thoughts.
