Parsing Presentations
Extract structured data from rich visual PDFs and presentations.
Getting Started
VLM-1 can extract structured data from visually rich PDFs and presentations. Here’s an example of a slide image from a presentation and the structured JSON output that VLM-1 extracts from it:
Sample image from a financial presentation.
Notebook Example
In this notebook, we will use the VLM-1 model to understand financial presentations. As an example, we will use a dataset of financial presentations from various companies, taken from the SEC EDGAR database. We will extract information from these presentations with VLM-1 and then analyze the extracted data.
We will call the VLM-1 API using the Python requests library. We will use the generate endpoint of the API to extract visual information from the presentation slides.
Now, let’s list the models available through the VLM-1 API.
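A model listing call with requests might look like the minimal sketch below. The base URL, the `/models` path, and the `X-API-Key` header are placeholders assumed for illustration, not documented values, so substitute the ones from the API reference. The `get` parameter is only there so the function can be exercised with a stub instead of a live request.

```python
from typing import Any, Callable, Optional

# Placeholders assumed for illustration; replace with the real values.
BASE_URL = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential


def list_models(get: Optional[Callable[..., Any]] = None,
                timeout: float = 30.0) -> Any:
    """GET the (assumed) models endpoint and return the parsed JSON body."""
    if get is None:
        import requests  # imported lazily so a stub can be injected in tests
        get = requests.get
    response = get(
        f"{BASE_URL}/models",
        headers={"X-API-Key": API_KEY},
        timeout=timeout,
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()
```

Calling `list_models()` with no arguments sends the request through `requests.get`.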
We’ll call VLM-1 through a helper function that defines the request headers and the output schema. Note that this relies on a few utilities defined in the Colab notebook. Take a look at the link above for more details.
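A helper of this kind can be sketched as follows. This is not the notebook's actual helper: the base URL, the `/generate` path, the `X-API-Key` header, the `"vlm-1"` model string, the payload field names, and the schema fields are all assumptions made for illustration. The sketch encodes the slide image as base64, attaches a JSON Schema describing the desired output, and posts both to the generate endpoint.

```python
import base64
from typing import Any, Callable, Dict, Optional

# Placeholders assumed for illustration; replace with the real values.
BASE_URL = "https://api.example.com/v1"  # hypothetical base URL
API_KEY = "YOUR_API_KEY"                 # hypothetical credential

# Hypothetical JSON Schema naming the fields we want back for each slide.
SLIDE_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "markdown": {"type": "string"},
    },
    "required": ["title"],
}


def build_headers(api_key: str) -> Dict[str, str]:
    """Return the request headers (header name is an assumption)."""
    return {"X-API-Key": api_key, "Content-Type": "application/json"}


def build_payload(image_bytes: bytes,
                  schema: Dict[str, Any] = SLIDE_SCHEMA,
                  model: str = "vlm-1") -> Dict[str, Any]:
    """Bundle the base64-encoded image and the schema (field names are assumptions)."""
    return {
        "model": model,
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "schema": schema,
    }


def vlm(image_bytes: bytes,
        post: Optional[Callable[..., Any]] = None,
        timeout: float = 60.0) -> Any:
    """POST one slide image to the (assumed) generate endpoint and return parsed JSON."""
    if post is None:
        import requests  # imported lazily so a stub can be injected in tests
        post = requests.post
    response = post(
        f"{BASE_URL}/generate",
        headers=build_headers(API_KEY),
        json=build_payload(image_bytes),
        timeout=timeout,
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()
```

Keeping the header and payload construction in separate functions makes it easy to inspect exactly what would be sent before making a network call.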
Example Output
Now let’s try this out on another example slide.
We can render the Markdown inline.
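In a Jupyter or Colab notebook, inline rendering can be done with IPython's display utilities. The sketch below assumes the Markdown is already available as a string; `render_markdown` is a hypothetical helper name, and the sample text is a made-up placeholder rather than real model output.

```python
def render_markdown(markdown_text: str) -> str:
    """Render Markdown inline in a notebook, or print it when IPython is unavailable."""
    try:
        from IPython.display import Markdown, display
    except ImportError:
        print(markdown_text)  # plain-text fallback outside a notebook environment
    else:
        display(Markdown(markdown_text))
    return markdown_text


# Placeholder text standing in for Markdown extracted from a slide.
sample = "## Slide title\n\n- first point\n- second point"
render_markdown(sample)
```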
Get Started with our Document -> JSON API
Head over to our Document -> JSON API to start building your own document processing pipeline with VLM-1. Sign up for access to our API here.