Pretty sure this should be easy: Trouble adding PDF-to-Text response to Bubble database

Hello!

I’m trying to get the “body” data that results from an API call to the “PDF-to-Text” API from PDF.co loaded into my database (in a Bubble data field called “Offer_Text”) so it can be used in a summarizing call to ChatGPT as a first step in processing a bunch of different new outputs. All I need is for the “body” from the PDF-to-Text call to be added into that Offer_Text field. I’ve done it with another PDF-to-text API that had severe file size limits, but it’s just not working now. Here are screenshots of the workflow steps that aren’t working:

Step 1: Upload the PDF to PDF-to-Text:

Step 2: Use the result to create a ChatGPT summary (called Offer_Summary):

Step 3: Populate the Offer_Text and Offer_Summary fields with the results for later use:

As I said, I had this working with another PDF-to-text API that wasn’t useful because of file size limitations, but it doesn’t seem to work with PDF.co’s PDF-to-Text. ChatGPT just keeps summarizing the same thing over and over, and PDF-to-Text isn’t populating the “Offer_Text” field that should be coming from it.

Here’s a screenshot of the empty Offer_Text, the incorrect Offer_Summary field, and the correct url string field that’s uploading (or should be uploading) to PDF-to-Text:

I’ve initialized the call to the PDF-to-Text API successfully and it sends back text in the “body” section of the raw data. But Chat doesn’t seem to grab that, instead grabbing the same sample data from my initializing call and summarizing that over and over with slight changes.

I think there’s something obvious that I must be missing.

Any thoughts or ideas would be greatly appreciated. Thanks so much for your help!

I am not sure at which point things are not working. Is it upon the return of the PDF.CO api call? … Or when OAI returns a completion?

I believe it’s when the PDF.co API sends back the data. It looks like Chat isn’t seeing “the result of” that process and instead is re-summarizing what’s in the initializing parameter. The reason I know Chat sees the data is that it writes a summary that includes the other three parameters you can see in the workflow (Pain Points 1, 2, and 3). So Chat is seeing the other data in my Bubble DB, just not the result of the PDF-to-Text call.

(Thanks for following up so fast!)

One alteration: That first screenshot should have been of the upload to PDF.co as follows, not a repeat of the step 2 screenshot:

Sorry about the confusion.

Capture the result of the body in your DB to see what is returning. Create a log with a text field and place the body there. Don’t do anything with OAI yet. Just break the problem into pieces and check what happens at different points in the flow.
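One way to check that piece in isolation is to reproduce the request outside Bubble and look at the raw JSON. The sketch below is only an illustration: it assumes the endpoint `https://api.pdf.co/v1/pdf/convert/to/text` and an `x-api-key` header (both should be checked against the PDF.co documentation), and the function names are hypothetical. The `url`, `inline`, and `async` parameters are the ones discussed in this thread.

```python
import json
import urllib.request

# Assumed endpoint and auth header; verify both against the PDF.co docs.
PDF_TO_TEXT_ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/text"


def build_request(api_key: str, pdf_url: str) -> urllib.request.Request:
    """Build (but do not send) a PDF-to-Text request that asks for inline text."""
    payload = {
        "url": pdf_url,   # publicly reachable URL of the PDF
        "inline": True,   # ask for the text in the response itself
        "async": False,   # wait for the conversion to finish
    }
    return urllib.request.Request(
        PDF_TO_TEXT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def log_raw_response(request: urllib.request.Request) -> dict:
    """Send the request and print the raw JSON before anything else uses it."""
    with urllib.request.urlopen(request) as response:
        raw = json.loads(response.read().decode("utf-8"))
    print(json.dumps(raw, indent=2))
    return raw


# Usage (placeholders; replace with a real key and file URL):
#   raw = log_raw_response(build_request("YOUR_API_KEY", "https://example.com/sample.pdf"))
```

If the printed JSON contains the converted text, the conversion itself is working and the problem is in how the workflow reads the response.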

That’s just it, the result of the body is not being captured in my db, but it’s being generated. I’ve set the body of the PDF.co response to be “Offer_Text” in the DB and you can see in the above screenshot, it’s empty. When I run the step-by-step debugger, the result of the PDF.co call is “(Empty).” Here’s a screenshot of that:

However, all that text you see above the empty field is the live response from Chat that is re-summarizing the sample data from the initializing call with the addition of elements from the current, live call.

And just FYI, I’m pretty sure that the PDF.co call is going through since (1.) I was able to see response data in my initializing call, and (2.) the debugger shows a successful upload to PDF.co of the PDF to be converted. Here are screenshots of both of those:

(1.) The PDF.co initializing call with the raw data response showing text from the PDF successfully translated:

(2.) The debugger showing successful upload of the PDF document to PDF.co.

The resulting “body” data from PDF.co is just not getting to the DB. (I also know PDF.co is doing something because I’m being charged for the “credits” it takes every time I do a PDF conversion even though the data doesn’t come through to my DB.)

The pdf.co call may not have enough time to return the converted text. Run a test without any subsequent actions.

Good thought! I’ve tried looking later, and the data just never arrives in the database, even hours or days after the call.

It just needs to be available in the return …. Then you can choose what to do with it.

How big a pdf are you using?

The test PDF is about 1.3 MB. I’ve made sure PDF.co can handle much larger, though.

You’re right that the next step only needs the result of the PDF.co call (i.e., it doesn’t need to go to the Bubble database first). And in fact, the “result of step 1” is what Bubble should be sending to Chat in Step 2 (Step 1 being the PDF.co translation service. See the workflow screenshot below). But there’s nothing there. That’s the problem.

Here’s the screenshot of step 2 attempting to send the “results of step 1” to Chat:

In that, “(body) Offer_Text” is defined as the result of Step 1 being sent directly to Chat.

I don’t hear that a second test has been attempted.
  • Consider rebuilding the API call
  • Try another PDF
  • Check the logs
  • Do all of this on another page
  • Check for privacy rules

Privacy rules! Excellent idea! I’ve run into that before.

Such great ideas. But unfortunately I had already cleared any impeding privacy settings, and the problem still occurs. I also tested the other thoughts:

  • I redid the API call a bunch of ways and nothing worked.
  • Tried multiple longer, shorter, all-text, fewer-image, etc. PDFs with the same result.
  • The logs didn’t tell me anything new.
  • I’ve got this going on three different pages in my app.

It’s still a mystery.

One thing I noticed as I was restructuring the API call was that there’s a mismatch between the “inline” and “async” parameters. In the API call it’s “True” “False” as follows:

But in the Workflow it’s “false” “false” as follows:

I tried different combos of those. Got one error from Chat, one that returned the url rather than the body text, and one that returned an ID. But none of them sent any data to the “Offer_Text” field in the database.

Do you have any thoughts on how that could be related (or not)? Seems like there’s something preventing Bubble from placing the data in the DB (like a privacy rule, but not a privacy rule).
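To make the combinations above easier to compare, here is a small sketch that serializes the two flags and reports which kind of result came back. It is an illustration rather than PDF.co's documented behavior: the “body” and “url” fields are the ones seen in this thread, while `jobId` is an assumed name for the returned ID, and the function names are hypothetical. One fact that is certain is that JSON itself only accepts lowercase `true` and `false`, which is what `json.dumps` writes.

```python
import json


def build_payload(pdf_url: str, inline: bool, run_async: bool) -> str:
    """Serialize the request body; json.dumps writes booleans as true/false."""
    return json.dumps({"url": pdf_url, "inline": inline, "async": run_async})


def describe_response(response: dict) -> str:
    """Report which kind of result a PDF-to-Text response appears to carry."""
    if response.get("body"):
        return "text"   # converted text is in "body" (seen with inline set to true)
    if response.get("jobId"):  # assumed field name for the returned ID
        return "job"
    if response.get("url"):
        return "link"   # a link to the text file (seen with inline set to false)
    return "empty"      # nothing usable came back
```

Running `describe_response` on the raw response for each combination shows whether the field the workflow expects (“body”) is actually present for the flag values the workflow sends.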

Is the text being produced by pdf.co?

Yes. I can see it in the raw data returned when I initialize the call. It’s under “body.”

In case it sparks any ideas, here is a screenshot of the raw data returned at the API call initialization:

Not when it is initialized … on the actual page … is it there? :wink:

Oh! Ha! Yes. When I grab the URL produced by PDF.co, it takes me to the text output. Here’s the URL and a screenshot.

URL:
https://pdf-temp-files.s3.us-west-2.amazonaws.com/UKS7VBYXHH8DI8QOF631RDU5W4E18U4D/MSFT_PDG_Adobe_Sign_ITDM_W2_Ebook_Beyond-the-dotted-line_REVISED.txt?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEIT%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDFplixPXPcwCEjuXeyKCAWYfS8N9HamXAG560gjYESmTl5KPJhQk%2F4oYgL0Pt3qZ%2Bf3K7yjoHLOFfoxW6QIXmhEE7XFDpYDl8WE1EahcnFMRMrVbEc5a3lCrvNCzUY4my19Ovb%2BqZM9Zoa8tshaWLqr0jpKo9adpX17G8qqQS7zlYCXBPyohyDCZxc64fGJmX4Mo3KDDrAYyKLJ77RRteNF90mJEBey6cd0%2BCr5tCkPAvvrZs0DBn5jTc%2FfPIFYMiRQ%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHECFBEDOG/20231231/us-west-2/s3/aws4_request&X-Amz-Date=20231231T044419Z&X-Amz-SignedHeaders=host&X-Amz-Signature=d65e221c9a1cb2adaccc5797f383584aefe133f0ae5c7a4903231f9e276fafc4

Screenshot of the page:

It’s only when I set the “inline” parameter to “False” that I get the “URL” address rather than the “body” text, which is what I need in the Bubble database and to send to Chat. I get the “body” text when I set “inline” to “true.”

(It’s tantalizing that I can see the output I need, I just can’t place it where it needs to go.)
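Since the text is reachable either way, one possible workaround is to accept whichever form arrives: use “body” when it is present, and otherwise download the text from the returned link. This is a minimal sketch under that assumption, with hypothetical function names; it is not something the thread confirms works inside Bubble. The link above carries `X-Amz-Expires=3600`, so a link like it would need to be fetched within about an hour.

```python
import urllib.request
from typing import Callable


def download_text(url: str) -> str:
    """Fetch the converted text from the temporary link."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_text(response: dict, fetch: Callable[[str], str] = download_text) -> str:
    """Return the converted text from either response shape, or "" if neither is present."""
    body = response.get("body")
    if body:
        return body         # inline response: text is already here
    link = response.get("url")
    if link:
        return fetch(link)  # link response: download the .txt file
    return ""
```

The `fetch` parameter is there so the download step can be swapped out or stubbed when trying this offline.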