Getting Started

vlm-1 can extract structured data from long documents and reports. Here’s a rough breakdown of the steps involved in parsing a document:

1

Upload Document

Use the /v1/files endpoint to upload the document you want to parse.

from pathlib import Path

from vlmrun.client import VLMRun
from vlmrun.client.types import FileResponse

# Initialize the client
client = VLMRun(api_key="<your-api-key>")

# Upload the file
response: FileResponse = client.files.upload(
    file=Path("<path/to/test.pdf>")
)
print(f"Uploaded file:\n {response.model_dump()}")

You should see a response like this:

Uploaded file:
{
  'id': '1e76cfd9-ba99-49b2-a8fe-2c8efaad2649',
  'filename': 'file-20240815-7UvOUQ-earnings_single_table.pdf',
  'bytes': 62430,
  'purpose': 'assistants',
  'created_at': '2024-08-15T02:22:06.716130',
  'object': 'file'
}
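Because the parsing endpoint in the next step currently accepts only PDF files, it can be worth checking a file locally before uploading it. The snippet below is a minimal sketch of such a check; `looks_like_pdf` is a hypothetical helper (not part of the vlmrun client) that only inspects the leading `%PDF-` signature, so treat it as a quick heuristic rather than full validation.

```python
from pathlib import Path

# Signature that PDF files start with.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path: Path) -> bool:
    """Return True if `path` is an existing file that starts with the PDF signature.

    Hypothetical helper for illustration; it does not validate the rest of the file.
    """
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as f:
        return f.read(len(PDF_MAGIC)) == PDF_MAGIC
```

A call such as `looks_like_pdf(Path("<path/to/test.pdf>"))` could then guard the upload shown above.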
2

Submit the Document AI Job

Submit the uploaded file (referenced by its file ID) to the /v1/document/generate endpoint to start the document parsing job. This endpoint currently supports only PDF files and places the job in a queue for processing (batch=True).

from vlmrun.client.types import PredictionResponse

# Submit the document for parsing
response: PredictionResponse = client.document.generate(
    file=response.id,
    domain="document.file",
)
print(f"Document parsing job submitted:\n {response.model_dump()}")

You should see a response like this:

Document parsing job submitted:
{
  "id": "052cf2a8-2b84-45f5-a385-ccac2aae13bb",
  "created_at": "2024-08-15T02:22:09.157788",
  "response": null,
  "status": "pending"
}
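Since the job is queued, its status starts as `pending`, and the results become available only once it changes to `completed`. One way to wait for that is to poll the fetch call described in the next step. The sketch below is a generic polling loop under stated assumptions: `wait_for_completion` and its parameters are hypothetical names, and `pending` and `completed` are the only status values assumed, because they are the only ones shown on this page.

```python
import time
from typing import Any, Callable


def wait_for_completion(
    fetch: Callable[[], Any],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `fetch` repeatedly until the returned object has status "completed".

    `fetch` is any zero-argument callable returning an object with a `.status`
    attribute. Raises TimeoutError if the job is not completed within `timeout`
    seconds. Hypothetical helper, not part of the vlmrun client.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result.status == "completed":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job not completed after {timeout} seconds")
        sleep(poll_interval)


# Possible usage with the client from this guide (needs a valid API key):
# result = wait_for_completion(lambda: client.document.get(response.id))
```

Passing the fetch call as a callable keeps the loop independent of the client, which also makes it easy to exercise with a stub.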
3

Fetch the Results

Use the /v1/document/{request_id} endpoint to fetch the results of the document parsing job. The results of the extraction job will be in JSON format under the response field.

# Fetch the results using the job ID returned by the previous step
request_id = response.id
response: PredictionResponse = client.document.get(request_id)
print(f"Document parsing job results:\n {response.model_dump()}")

You should see a response like this:

{
  "id": "052cf2a8-2b84-45f5-a385-ccac2aae13bb",
  "created_at": "2024-08-15T02:22:09.157788",
  "status": "completed",
  "response": {
    "pages": [
      {
        "title": "Reducing Our Environmental Impact: Go Zero",
        "page_number": 30,
        "description": "...",
        "lines": [...],
        "paragraphs": [...],
        "tables": null,
        "charts": [...]
      },
      ...
    ]
  }
}
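To illustrate how the `response` field might be consumed, the sketch below walks the `pages` list and collects a short summary per page. It assumes only the field names visible in the sample above (`title`, `page_number`, `tables`, `charts`); `summarize_pages` is a hypothetical helper, and `sample` is placeholder data that mirrors the shape of the example rather than real output.

```python
from typing import Any


def summarize_pages(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Build one summary dict per page of a parsed-document response.

    Fields that are missing or null (such as "tables" in the sample) count as empty.
    """
    summaries = []
    for page in response.get("pages") or []:
        summaries.append(
            {
                "page_number": page.get("page_number"),
                "title": page.get("title"),
                "num_tables": len(page.get("tables") or []),
                "num_charts": len(page.get("charts") or []),
            }
        )
    return summaries


# Placeholder data shaped like the example response (elided lists left empty).
sample = {
    "pages": [
        {
            "title": "Reducing Our Environmental Impact: Go Zero",
            "page_number": 30,
            "description": "...",
            "lines": [],
            "paragraphs": [],
            "tables": None,
            "charts": [],
        }
    ]
}

for summary in summarize_pages(sample):
    print(summary)
```

With the client from this guide, the dictionary to pass in would be the `response` entry of the fetched job's `model_dump()` output.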

Notebook Example

If you simply want to look at the code, skip directly to the Colab notebook.

Illustrative Examples

Here are some examples of the structured JSON output that vlm-1 can extract from long documents and reports:

Document AI with `vlm-1` - Example 1.

Document AI with `vlm-1` - Example 2.

Document AI with `vlm-1` - Example 3.

Document AI with `vlm-1` - Example 4.

Get Started with our Document -> JSON API

Head over to our Document -> JSON API to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.