I wanted to leave an update in case anyone comes along and is curious how I resolved this issue. Initially, I tried downloading an entire web page and passing that to the chatGPT API. This had a bunch of problems including various characters in the code causing json issues.

Instead of downloading the entire page, I grabbed the page Title and Description. I then passed these both to the chatGPT API and had it summarize the site based on this info. It works surprisingly well so far!