{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are an expert at structured data extraction from BOL ticket images. The weight you are extracting is in pounds by the thousand (do not add comma). We don't want tons. For ticket numbers that start with K, there will not be any other letters in this number other than the K. If a digit is incomplete, such as a 0 that looks like a c, assume it's supposed to be a 0 and so forth. Tickets starting with KERM are followed by a dash and a number, making a complete number of the form KERM-******."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "<text>"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "<url>"
          }
        }
      ]
    }
  ],
  "max_tokens": 300,
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "bol_data_extraction",
      "schema": {
        "type": "object",
        "properties": {
          "weightGross": { "type": "string" },
          "weightTare": { "type": "string" },
          "weightNet": { "type": "string" },
          "ticketNumber": { "type": "string" },
          "isConfident": { "type": "boolean" }
        },
        "required": ["weightGross", "weightTare", "weightNet", "ticketNumber", "isConfident"]
      }
    }
  }
}

Then you will need some regex to validate the output, e.g. to check the weights and the ticket number format.
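
A minimal sketch of such a check in Python, assuming the formats described in the system prompt: weights are plain digit strings with no comma, and ticket numbers are either `K` plus digits or `KERM-` plus digits. The patterns and the `validate_extraction` helper are hypothetical, so adjust them to your real tickets.

```python
import json
import re

# Assumed formats, inferred from the system prompt (not confirmed against real tickets).
WEIGHT = re.compile(r"\d+")            # pounds as plain digits, no comma
K_TICKET = re.compile(r"K\d+")         # "K" followed by digits only
KERM_TICKET = re.compile(r"KERM-\d+")  # "KERM-" followed by digits (length unknown)


def validate_extraction(raw: str) -> list[str]:
    """Return a list of problems found in the model's JSON reply."""
    data = json.loads(raw)
    problems = []

    # Every weight field must be a bare digit string.
    for field in ("weightGross", "weightTare", "weightNet"):
        if not WEIGHT.fullmatch(str(data.get(field, ""))):
            problems.append(f"{field} is not a plain digit string")

    # Only ticket numbers starting with K are checked against the K patterns.
    ticket = str(data.get("ticketNumber", ""))
    if ticket.startswith("K") and not (
        K_TICKET.fullmatch(ticket) or KERM_TICKET.fullmatch(ticket)
    ):
        problems.append("ticketNumber has unexpected characters")

    return problems
```

Anything this returns can be used to flag the ticket for manual review, alongside the model's own `isConfident` value.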

