The MarkdownDocument schema is the cornerstone of VLM Run’s document processing system, providing a standardized, machine-readable representation of complex documents. This technical reference guide details the schema’s architecture, components, and implementation patterns.
MarkdownDocument Data Model
The MarkdownDocument schema addresses the fundamental challenges in document processing:
Structural Preservation: Maintains document hierarchy and relationships
Content Extraction: Handles mixed content types (text, tables, figures, code)
Spatial Understanding: Preserves layout and positioning information
Data Integrity: Ensures accurate representation of structured elements
Extensibility: Supports custom annotations and metadata
1. MarkdownPage
A MarkdownDocument holds a list of MarkdownPage objects in its pages field, each representing one page of the document.
The table below lists the fields of MarkdownDocument, MarkdownPage, and their nested components:
Tabular Representation of `MarkdownPage`
| Component | Field | Type | Description |
|---|---|---|---|
| MarkdownDocument | pages | List[MarkdownPage] | Pages in the document |
| MarkdownPage | metadata | PageMetadata | Metadata of the page |
| | tables | List[Table] | Tables in the page |
| | figures | List[Figure] | Figures in the page |
| | content | str | Content of the page |
| PageMetadata | language | str | Language of the document |
| | page_number | int | Page number of the document (0-indexed) |
| Table | metadata.title | str | Title of the table |
| | metadata.caption | str | Caption of the table |
| | metadata.notes | str | Notes about the table |
| | headers.id | str | Unique identifier for the header |
| | headers.column | int | Column index of the header |
| | headers.name | str | Name of the header |
| | headers.dtype | str | Data type of the header |
| | data.* | dict[str, Any] | Maps column header ids to values |
| | bbox | BoxCoords | Bounding box of the table |
| Figure | id | int | Unique identifier for the figure |
| | title | str | Title of the figure |
| | caption | str | Caption of the figure |
| | bbox | BoxCoords | Bounding box of the figure |
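For orientation, the fields listed above can be mirrored with plain Python dataclasses. This is only an illustrative sketch, not the SDK's own classes: the default values, the use of Optional, and the untyped bbox (the fields of BoxCoords are not described here) are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class PageMetadata:
    language: Optional[str] = None
    page_number: int = 0  # 0-indexed


@dataclass
class TableMetadata:
    title: Optional[str] = None
    caption: Optional[str] = None
    notes: Optional[str] = None


@dataclass
class TableHeader:
    id: str
    column: int
    name: str
    dtype: str


@dataclass
class Table:
    metadata: TableMetadata = field(default_factory=TableMetadata)
    headers: list[TableHeader] = field(default_factory=list)
    # Each record maps a header id to the cell value in that column.
    data: list[dict[str, Any]] = field(default_factory=list)
    bbox: Any = None  # stands in for BoxCoords (fields unspecified here)


@dataclass
class Figure:
    id: int
    title: Optional[str] = None
    caption: Optional[str] = None
    bbox: Any = None  # stands in for BoxCoords (fields unspecified here)


@dataclass
class MarkdownPage:
    metadata: PageMetadata = field(default_factory=PageMetadata)
    tables: list[Table] = field(default_factory=list)
    figures: list[Figure] = field(default_factory=list)
    content: str = ""


@dataclass
class MarkdownDocument:
    pages: list[MarkdownPage] = field(default_factory=list)
```

A document is then a MarkdownDocument whose pages list holds one MarkdownPage per page, each carrying its own tables and figures alongside the markdown content.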
2. MarkdownTable
Tables are represented with a <Table id="tb-{id}"/> tag in the markdown content, with the actual table content stored in the tables list. This allows for a rich representation of the table’s data while preserving the document’s flow.
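As a rough sketch of how the table tags might be expanded back into the page text, the helper below works on a page given as a plain dict. The function name inline_tables is hypothetical, and the code assumes that the number in tb-{id} is a 0-based index into the tables list and that each table carries its markdown in a content key; neither assumption is confirmed here.

```python
import re
from typing import Any

# Matches tags such as <Table id="tb-0"/> and captures the numeric part.
TABLE_TAG = re.compile(r'<Table id="tb-(\d+)"\s*/>')


def inline_tables(page: dict[str, Any]) -> str:
    """Replace each <Table id="tb-N"/> tag with the markdown of tables[N]."""
    tables = page.get("tables", [])

    def substitute(match: re.Match) -> str:
        # Assumption: N is a 0-based index into page["tables"].
        index = int(match.group(1))
        if 0 <= index < len(tables):
            return tables[index].get("content", match.group(0))
        return match.group(0)  # leave unresolved tags untouched

    return TABLE_TAG.sub(substitute, page.get("content", ""))


sample_page = {
    "content": 'Intro\n\n<Table id="tb-0"/>\n\nOutro',
    "tables": [{"content": "| A |\n|---|\n| 1 |"}],
}
expanded = inline_tables(sample_page)
```

Tags that cannot be resolved are left in place, so a missing table does not silently remove text from the page.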
Charts and figures are represented with a <Chart id="ch-{id}"/> tag in the content. The chart details are stored in the figures list, including properties such as id, title, caption, and bbox.
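One way to connect chart tags back to their entries in the figures list is sketched below, again on a page given as a plain dict. The helper name referenced_figures is hypothetical, and the code assumes that the number in ch-{id} equals the figure's id field, which is not confirmed here.

```python
import re
from typing import Any

# Matches tags such as <Chart id="ch-0"/> and captures the numeric part.
CHART_TAG = re.compile(r'<Chart id="ch-(\d+)"\s*/>')


def referenced_figures(page: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the figures referenced by <Chart/> tags, in order of appearance."""
    # Assumption: the number in "ch-{id}" matches the figure's "id" field.
    by_id = {fig["id"]: fig for fig in page.get("figures", [])}
    ordered = []
    for match in CHART_TAG.finditer(page.get("content", "")):
        fig = by_id.get(int(match.group(1)))
        if fig is not None:
            ordered.append(fig)
    return ordered


sample_page = {
    "content": 'See <Chart id="ch-1"/> and then <Chart id="ch-0"/>.',
    "figures": [
        {"id": 0, "title": "First"},
        {"id": 1, "title": "Second"},
    ],
}
titles = [fig["title"] for fig in referenced_figures(sample_page)]
```

Returning figures in reading order rather than list order keeps captions aligned with where each chart appears in the page text.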
Example Usage
Here’s an example of processing a document and reading the result as a MarkdownDocument:
from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse, MarkdownDocument
# Initialize client
client = VLMRun(api_key="<VLMRUN_API_KEY>")
# Process document
response: PredictionResponse = client.document.generate(
    file=Path("document.pdf"),
    domain="document.markdown",
    batch=True,
)
# Access processed document
doc: MarkdownDocument = client.predictions.wait(response.id, timeout=120)
print(doc.model_dump_json(indent=2))
Example JSON Response
Here’s an example of how a MarkdownDocument, with its list of MarkdownPage entries, appears in a JSON response:
{
  "pages": [
    { // page 0
      "metadata": {
        "page_number": 0
      },
      "tables": [
        {
          "metadata": {
            "title": "Sample Data Table",
            "caption": "Table showing example data"
          },
          "content": "| Header 1 | Header 2 |\n|----------|----------|\n| Data 1 | Data 2 |\n| Data 3 | Data 4 |",
          "headers": [
            {
              "id": "h1",
              "column": 0,
              "name": "Header 1",
              "dtype": "string"
            },
            ...
          ],
          "data": [
            {
              "h1": "Data 1",
              "h2": "Data 2"
            },
            ...
          ]
        }
      ],
      "figures": [
        {
          "id": 0,
          "title": "Sample Bar Chart",
          "caption": "Example visualization",
          "content": "..."
        },
        ...
      ],
      "content": "..."
    },
    { // page 1
      ...
    },
    { // page 2
      ...
    },
    ...
  ]
}
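Because each entry in a table's data maps header ids to cell values, the rows can be rebuilt in column order from the headers list. The following is a minimal sketch on a table given as a plain dict; the helper name table_rows and the sample values are made up for illustration.

```python
from typing import Any


def table_rows(table: dict[str, Any]) -> list[list[Any]]:
    """Return a header row followed by data rows, ordered by column index."""
    headers = sorted(table.get("headers", []), key=lambda h: h["column"])
    rows = [[h["name"] for h in headers]]
    for record in table.get("data", []):
        # Each record maps header ids to cell values; missing cells become None.
        rows.append([record.get(h["id"]) for h in headers])
    return rows


# Hypothetical sample with headers deliberately listed out of column order.
sample_table = {
    "headers": [
        {"id": "h2", "column": 1, "name": "Header 2", "dtype": "string"},
        {"id": "h1", "column": 0, "name": "Header 1", "dtype": "string"},
    ],
    "data": [
        {"h1": "Data 1", "h2": "Data 2"},
        {"h1": "Data 3"},
    ],
}
rows = table_rows(sample_table)
```

Sorting by the column field rather than relying on list order keeps the output stable even if headers arrive in a different sequence.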