Handling large text files (~16MB) in Bubble – looking for alternative approaches

Hello everyone

I’m working on a Bubble app where users upload large .txt files (~16MB, tens or hundreds of thousands of lines).

Goal:

  • Extract individual lines from the file

  • Match them against a list of texts stored in Bubble

  • Save or display only the matched results

Issue:

  • Bubble cannot reliably extract or loop through text files at this size (timeouts / memory limits)

  • I tried offloading processing to Xano, but even Xano struggles when reading and splitting the entire file in one request

  • The primary problem seems to be converting the full file into text or an array at once

Current direction:

  • Using Make.com or n8n as an intermediary

  • Bubble uploads the file

  • Automation tool splits the file into smaller chunks (200–500 lines)

  • Each chunk is sent to Xano for matching

  • Only filtered results are sent back to Bubble

This works in theory, but I’ve found chunking large text files reliably is trickier than expected.

Questions for the community:

  • Has anyone successfully handled large text files like this in Bubble?

  • Are there Bubble-native patterns or plugins I might be missing?

  • Has anyone used client-side streaming, external workers, or other backends for this?

  • Any ideas with Make / n8n / Xano that you’ve learned the hard way?

I’m happy to trade “pure Bubble” for stability and scalability — just looking for the most reliable production approach.

Thanks in advance 🙏

Here is the approach ChatGPT recommended; tab 2 explains it in more detail.

If you want the most stable setup, the answer is simple: use a small backend service / worker whose only job is to read the file incrementally.

Bubble uploads the file (Bubble already stores it in S3), then you pass the file URL to this worker. It reads the file line by line or in small byte chunks and sends 200–500-line batches to Xano (or does the matching itself). At no point does anything load the full file into memory.

Make and n8n can still be useful, but they are not truly streaming for this kind of workload. They typically download the whole file and convert it to text before chunking, which means you will eventually hit size, memory, or execution limits as files get larger. Note that n8n webhooks have a hard request size limit of around 16 MB, so with files this size you’re already at the ceiling.

So practically,

  • For maximum stability: Bubble upload → file URL → small backend worker → batched matching → Bubble results

  • If you use Make or n8n: treat them as orchestration tools and expect fragility
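The core of such a worker is a streaming line batcher. A minimal sketch (the function name, 500-line default, and test data are illustrative, not from any specific service): it consumes byte chunks as they arrive, carries the partial last line over to the next chunk, and yields fixed-size batches without ever holding the whole file in memory.

```javascript
// Sketch: consume an async iterable of Uint8Array chunks (e.g. a streamed
// HTTP response body) and yield batches of complete lines.
// Hypothetical helper; names and batch size are illustrative.
async function* batchLines(chunks, batchSize = 500) {
  const decoder = new TextDecoder("utf-8");
  let leftover = ""; // partial line carried across chunk boundaries
  let batch = [];
  for await (const chunk of chunks) {
    // stream: true keeps multi-byte characters split across chunks intact
    const text = leftover + decoder.decode(chunk, { stream: true });
    const lines = text.split("\n");
    leftover = lines.pop(); // last element may be an incomplete line
    for (const line of lines) {
      batch.push(line);
      if (batch.length === batchSize) {
        yield batch;
        batch = [];
      }
    }
  }
  const tail = leftover + decoder.decode(); // flush any remaining decoder state
  if (tail.length > 0) batch.push(tail);
  if (batch.length > 0) yield batch;
}
```

In a real worker you would feed it `response.body` from a `fetch` of the Bubble file URL and POST each yielded batch to Xano.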

Appreciate the response @code-escapee

What kind of backend service/worker can I use?

Can you recommend any for me?

Hi, would you mind sharing exactly the type of file and what you want to accomplish? Can also be via DM.

I am finalizing an n8n competitor with a very performant architecture that by design has no problem handling very large files, at least in theory. So it would be great to have this use case and see how we can support it.

Without going into detail, our engine is fully LLM-native, which means it can follow LLM instructions to build workflows and can include plain vanilla scripts, also written by the LLM, to process this.

The workflow could be that you post the file to the workflow engine API, and when done it outputs the result to a webhook or similar, chunked or not, or pushes it directly to Xano.

We ran all kinds of tests with advanced text processing, though nothing over 100,000 lines, which took about 1–2 seconds. So speed will depend a bit on what you need to extract and the number of rows.

Hi @sem

I sent you a message to your dm

Looking forward to your response

Thanks


Not to stop you from using @sem's method, but if you want a worker setup, Cloudflare Workers is what I find the easiest to set up.

Here’s an AI summary of how to deploy it for this situation:
High-level setup:

  • Bubble uploads the file and gets a URL

  • Bubble calls the worker with the file URL and an upload ID

  • The worker streams the file and batches about 200–500 lines

  • Each batch is sent to Xano (or processed in the worker)

  • Xano stores results and tracks progress

Things that matter in practice:

  • Don’t read the entire file into text. That’s what causes the failures.

  • Use an upload ID plus a batch index so retries don’t create duplicates.

  • Handle partial lines between chunks.

  • Add a simple shared secret so random calls don’t hit the worker.

Where to look for details:

  • Cloudflare Workers docs (basic worker + streaming fetch)

  • TextDecoder / streaming examples

  • On the Xano side, a unique constraint on upload ID + batch index is enough for idempotency.

Once you’re streaming the file instead of loading it all at once, the 16MB limit stops being the real problem.
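A sketch of what that Worker could look like, covering the points above. The endpoint paths, header name, and env bindings (`XANO_BATCH_URL`, `WORKER_SECRET`) are assumptions for illustration, not Cloudflare or Xano specifics:

```javascript
// Sketch of a Cloudflare Worker that streams a Bubble file URL and forwards
// line batches to Xano. Names and bindings are illustrative assumptions.

// Simple shared-secret check so random calls can't trigger processing.
function isAuthorized(request, secret) {
  return request.headers.get("x-worker-secret") === secret;
}

// Idempotency: every batch carries the upload ID plus a batch index, so a
// unique constraint on (upload_id, batch_index) in Xano absorbs retries.
function makeBatchPayload(uploadId, batchIndex, lines) {
  return { upload_id: uploadId, batch_index: batchIndex, lines };
}

async function handleRequest(request, env) {
  if (!isAuthorized(request, env.WORKER_SECRET)) {
    return new Response("forbidden", { status: 403 });
  }
  const { fileUrl, uploadId } = await request.json();
  const fileRes = await fetch(fileUrl); // response body is streamed, not buffered
  const decoder = new TextDecoder();
  let leftover = ""; // partial line between chunks
  let batch = [];
  let batchIndex = 0;

  const send = async (lines) => {
    await fetch(env.XANO_BATCH_URL, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(makeBatchPayload(uploadId, batchIndex++, lines)),
    });
  };

  for await (const chunk of fileRes.body) {
    const lines = (leftover + decoder.decode(chunk, { stream: true })).split("\n");
    leftover = lines.pop();
    for (const line of lines) {
      batch.push(line);
      if (batch.length === 500) { await send(batch); batch = []; }
    }
  }
  if (leftover) batch.push(leftover);
  if (batch.length) await send(batch);
  return Response.json({ uploadId, batches: batchIndex });
}
// In a real Worker you would wire this up with:
// export default { fetch: handleRequest };
```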

If the user uploads the file, why not have a client-side script (that would be the cheapest option) split the file and send it to a backend WF as a list?

@Jici

Here is the file_url so that you can see the volume of text I'm dealing with:

https://67b303bff66d7e64c7101947e95ee6fb.cdn.bubble.io/f1766014819732x524622417529375000/genome_Kelly_Hauhn_v5_Full_20250905150858.txt

@code-escapee

On what platform am I setting up Cloudflare?

I haven't used Cloudflare before, so I don't really know how to go about it.

You’d set this up directly on Cloudflare using Cloudflare Workers.
Create a free Cloudflare account → go to Workers & Pages → create a new Worker. You can do it entirely from their dashboard without any prior setup.

The Worker is just a small JS endpoint that Bubble calls with the file URL and an upload ID. It fetches the file, streams it, batches the lines, and sends them to your backend.

Happy to DM you a concrete example and links if that helps.

You can split the file in the browser and send chunks to backend workflows, and it's usually the cheapest route. The downside is reliability: the browser has to stay open, resuming is harder, and you still need to handle retries and deduping. It also means more logic and surface area on the client.

For large files where you want it to just run and finish, a small worker is usually more stable since all the chunking happens on server side.
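For completeness, a sketch of the client-side route (the function name, chunk size, and backend workflow endpoint are illustrative): read the file in the browser, split it into line chunks, and send each chunk to a Bubble backend workflow. This only works while the tab stays open.

```javascript
// Sketch: split raw file text into chunks of N lines for sending to a
// Bubble backend workflow. Names and sizes are illustrative.
// Note: a trailing newline produces an empty last line; filter it if needed.
function splitIntoChunks(text, linesPerChunk = 500) {
  const lines = text.split(/\r?\n/);
  const chunks = [];
  for (let i = 0; i < lines.length; i += linesPerChunk) {
    chunks.push(lines.slice(i, i + linesPerChunk));
  }
  return chunks;
}

// In the browser you would wire it up roughly like this (not run here):
// const text = await fileInput.files[0].text();
// for (const [i, chunk] of splitIntoChunks(text).entries()) {
//   await fetch("/api/1.1/wf/process_chunk", { // hypothetical backend WF
//     method: "POST",
//     headers: { "content-type": "application/json" },
//     body: JSON.stringify({ chunk_index: i, lines: chunk }),
//   });
// }
```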


Yes this is the downside. You need to wait for the client to have completed the process. However, I think for this case it’s a good solution.

I will do a test to see how much time it can take to process this file…

It takes less than a second to split the file, so that's not a big deal. After that, it's sending to the backend WF and processing the data that will take more time. There are 643,161 lines in this file, so this means 7 "Schedule on a list" actions.

Pull the data once, match it externally, and push the results back into Bubble. Otherwise, there’s too much WU.
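The external matching step can then be a simple set lookup: load the stored texts once, build a Set, and keep only the lines that appear in it. A sketch assuming exact matching (the function name and trim-based normalization are illustrative; adjust the key function for fuzzy or substring matching):

```javascript
// Sketch: match incoming lines against a list of stored texts using a Set.
// Exact matching after trimming is assumed; normalize further as needed.
function matchLines(lines, storedTexts) {
  const wanted = new Set(storedTexts.map((t) => t.trim()));
  return lines.filter((line) => wanted.has(line.trim()));
}
```

Set lookups are O(1) per line, so even hundreds of thousands of lines match in well under a second, and only the filtered results need to travel back to Bubble.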

I'm actually building a plugin for exactly this situation 😁

On what platform can I handle the matching?

It's a Bubble.io plugin. I'm gonna DM you.

Cloudflare Workers is the cheapest option, but the most straightforward answer is… in the cloud.

For matching you could use our vectordoc plugin. Vectordoc Search RAG Plugin | Bubble

Okay, I plan on checking Cloudflare later today.