VLM-1 can extract structured data from visually rich PDFs and presentations. Here is a slide from a financial presentation, followed by the structured JSON that VLM-1 extracts from it:
Sample image from a financial presentation.
{
  "id": "...",
  "created_at": "...",
  "completed_at": "...",
  "status": "completed",
  "response": {
    "description": "The document details the differentiated operating model of Selective Insurance, highlighting its unique, locally based field model and franchise value distribution model with high-quality partners. It also includes a pie chart showing the distribution of 2023 net premiums written.",
    "title": "Differentiated Operating Model",
    "page_number": 7,
    "tables": [
      {
        "description": "The table highlights two key aspects of Selective Insurance's operating model: its unique, locally based field model and its franchise value distribution model with high-quality partners. It includes details on the locally based specialists, distribution partners, office locations, and quotes from partners.",
        "title": "Differentiated Operating Model Overview",
        "caption": null,
        "markdown": "| Aspect | Description |\n|-----------------------------------------------------|-----------------------------------------------------------------------------------------------------|\n| Unique, locally based field model | - Locally based underwriting, claims, and safety management specialists |\n| | - Proven ability to develop and integrate actionable tools |\n| | - Enables effective portfolio management in an uncertain loss trend environment |\n| Franchise value distribution model with high-quality partners | - Approximately 1,550 distribution partners selling standard lines products and services through approximately 2,650 office locations|\n| | - ~850 of these distribution partners sell personal lines products |\n| | - ~90 wholesale agents sell E&S business |\n| | - ~6,400 distribution partners sell National Flood Insurance Program products across 50 states |\n| Quote from Selective Agent | \"Everyone with Selective makes our customers feel like the #1 priority. The ease of working with Selective is unmatched.\" |"
      }
    ],
    "charts": [
      {
        "type": "pie",
        "description": "Pie chart showing the distribution of 2023 net premiums written totaling $4 billion, with segments for Standard Commercial Lines (79%), Standard Personal Lines (10%), and Excess and Surplus Lines (11%).",
        "title": "2023 Net Premiums Written",
        "caption": null,
        "markdown": "| Category | Percentage |\n|------------------------------|------------|\n| Standard Commercial Lines | 79% |\n| Standard Personal Lines | 10% |\n| Excess and Surplus Lines | 11% |"
      }
    ]
  }
}
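The tables and charts in the response arrive as markdown strings, so a small parser is usually needed before the numbers can be analyzed. The following is a minimal sketch, not part of the VLM-1 API: the function name parse_markdown_table is hypothetical, and it assumes simple pipe-delimited tables like the ones shown above (no escaped pipes inside cells).

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Convert a pipe-delimited markdown table into a list of row dicts."""
    header = None
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip footnotes or blank lines outside the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # A separator row contains only dashes, colons and spaces.
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        if header is None:
            header = cells  # first non-separator row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Markdown copied from the "charts" entry of the sample response.
chart_markdown = (
    "| Category | Percentage |\n"
    "|------------------------------|------------|\n"
    "| Standard Commercial Lines | 79% |\n"
    "| Standard Personal Lines | 10% |\n"
    "| Excess and Surplus Lines | 11% |"
)

rows = parse_markdown_table(chart_markdown)
total = sum(int(row["Percentage"].rstrip("%")) for row in rows)
print(rows)
print("Percentages sum to", total)
```

Summing the percentages is a cheap sanity check on a pie chart extraction: a total far from 100 suggests a segment was missed or misread.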
If you just want to look at the code, skip directly to the Colab notebook.
In this notebook, we use the VLM-1 model to understand financial presentations.
As an example, we use a dataset of financial presentations from various companies,
drawn from the SEC EDGAR database. We use VLM-1 to extract information
from these presentations and then analyze the extracted data.
We call the VLM-1 API with the Python requests library, using the API's
image/generate endpoint to extract visual information from the presentation slides.
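Because the request body is JSON, the image has to be turned into text before it can be sent. The sketch below shows one common way to do that with base64. It is an assumption, not the notebook's code: the names encode_image_bytes and build_payload are hypothetical, the domain value is a placeholder, and the notebook's own encode_image helper takes a PIL image and may use a different wire format.

```python
import base64


def encode_image_bytes(image_bytes: bytes) -> str:
    """Return raw image bytes as an ASCII base64 string (assumed wire format)."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_payload(image_bytes: bytes, domain: str) -> dict:
    """Assemble a JSON-serializable body with the model, domain and image fields."""
    return {
        "model": "vlm-1",
        "domain": domain,
        "image": encode_image_bytes(image_bytes),
    }


# The 8-byte PNG signature stands in for a real slide image here.
sample = b"\x89PNG\r\n\x1a\n"
payload = build_payload(sample, "your-domain-here")  # placeholder domain
print(payload)
```

Keeping payload construction separate from the HTTP call makes it easy to inspect exactly what would be sent before any request is made.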
We'll call VLM-1 through a helper function that sets the request headers and payload schema.
Note that it relies on a few utilities defined in the Colab notebook.
See the notebook linked above for more details.
from pathlib import Path
from typing import Union

import requests
from PIL import Image

# encode_image, download_image, headers and VLM_BASE_URL are defined in the Colab notebook.


def vlm(image: Image.Image, domain: str):
    """Send an image to the VLM API."""
    data = {
        "model": "vlm-1",
        "domain": domain,
        "image": encode_image(image),
    }
    response = requests.post(f"{VLM_BASE_URL}/image/generate", headers=headers, json=data)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error body
    return response.json()


def vlm_visualize(image: Union[Image.Image, str, Path], domain: str):
    """Send an image to the VLM API and display the result."""
    # Normalize the input (URL, local path, or PIL image) to an RGB PIL image.
    if isinstance(image, str) and image.startswith("http"):
        image = download_image(image)
    elif isinstance(image, (str, Path)):
        if not Path(image).exists():
            raise FileNotFoundError(f"File not found {image}")
        image = Image.open(str(image)).convert("RGB")
    elif isinstance(image, Image.Image):
        image = image.convert("RGB")
    else:
        raise ValueError("Invalid image, must be a path, PIL Image or URL")
{
  "id": "95c76a66-4f9f-4a6f-b318-fcebeabae449",
  "created_at": "2024-08-13T23:26:51.916563",
  "completed_at": "2024-08-13T23:26:59.832087",
  "response": {
    "description": "The document from Selective Insurance describes the impact of their portfolio management approach on business mix improvements. It contains a bar chart along with a pie chart and supporting text.",
    "title": null,
    "page_number": 15,
    "tables": null,
    "charts": [
      {
        "type": "bar",
        "description": "The bar chart illustrates the Renewal Pure Price and Point of Renewal Retention across different retention groups: Excellent, Above Average, Average, Below Average, and Low & Very Low. It demonstrates that higher retention is linked with lower pricing.",
        "title": null,
        "caption": "Standard Commercial Lines Pricing by Retention Group",
        "markdown": "| Retention Group | Renewal Pure Price | Point of Renewal Retention | % of Premium |\n|-------------------|--------------------|----------------------------|--------------|\n| Excellent | ~7% | ~92% | 15% |\n| Above Average | ~10% | ~90% | 14% |\n| Average | ~11% | ~89% | 47% |\n| Below Average | ~11.5% | ~87% | 16% |\n| Low & Very Low | ~13% | ~75% | 8% |\n\n_As of December 31, 2023_"
      },
      {
        "type": "pie",
        "description": "The pie chart depicts the mix of Direct Premium Written (DPW) in 2023 across various segments. The segments include Contractors, Mercantile & Services, Community & Public Services, Manufacturing & Wholesale, and Bonds.",
        "title": null,
        "caption": "2023 DPW Mix",
        "markdown": "| Business Segment | Percentage |\n|-------------------------------|------------|\n| Contractors | 44% |\n| Mercantile & Services | 25% |\n| Community & Public Services | 16% |\n| Manufacturing & Wholesale | 14% |\n| Bonds | 1% |"
      }
    ]
  },
  "status": "completed"
}
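In this response the title and tables fields are null, and the charts carry their label in caption rather than title, so downstream code should not assume every field is populated. The following is a minimal sketch of a tolerant summary step; the function name summarize_response is hypothetical, and the input is a trimmed copy of the response above with only the fields the function reads.

```python
import json


def summarize_response(result: dict) -> list[str]:
    """Return one summary line per extracted table or chart."""
    response = result.get("response") or {}
    page = response.get("page_number")
    lines = []
    for kind in ("tables", "charts"):
        # The field may be missing or null, so fall back to an empty list.
        for item in response.get(kind) or []:
            label = item.get("title") or item.get("caption") or "(untitled)"
            item_type = item.get("type", kind.rstrip("s"))
            lines.append(f"page {page}: {item_type}: {label}")
    return lines


# Trimmed copy of the response shown above.
raw = """
{
  "status": "completed",
  "response": {
    "title": null,
    "page_number": 15,
    "tables": null,
    "charts": [
      {"type": "bar", "title": null,
       "caption": "Standard Commercial Lines Pricing by Retention Group"},
      {"type": "pie", "title": null, "caption": "2023 DPW Mix"}
    ]
  }
}
"""

result = json.loads(raw)
summary = summarize_response(result)
for line in summary:
    print(line)
```

Run over every slide in a deck, a summary like this gives a quick index of which pages contain charts or tables before any deeper analysis.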