How does PrimeCut pricing work?

PrimeCut is one adaptive API, priced pay-as-you-go at a maximum of €0.01 per page. A page is one document page, one minute of audio or video, or 1,000 tokens of delivered content, whichever counts higher. Processing adapts to each document: hierarchy, cross-references, and visual content are preserved with full fidelity where the document needs it — technical documents like research papers, financial filings, or engineering specs — while simple documents are processed lean. One balance, one mode, priced per page.
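As a rough illustration of the page rule above, the sketch below counts billable pages as the largest of the three measures and multiplies by the €0.01 ceiling. The function names, the rounding up of partial minutes and partial 1,000-token blocks, and the use of the maximum price as a flat rate are assumptions made for illustration; actual billing may differ.

```python
import math
from decimal import Decimal

# Ceiling quoted above; the actual per-page price may be lower.
MAX_PRICE_PER_PAGE_EUR = Decimal("0.01")


def billable_pages(document_pages: int = 0, media_minutes: float = 0.0,
                   delivered_tokens: int = 0) -> int:
    """Largest of the three page measures (assumed reading of 'whichever counts higher')."""
    by_minutes = math.ceil(media_minutes)            # assumption: partial minutes round up
    by_tokens = math.ceil(delivered_tokens / 1000)   # assumption: partial blocks round up
    return max(document_pages, by_minutes, by_tokens)


def max_cost_eur(pages: int) -> Decimal:
    """Upper bound on the cost of a given number of billable pages."""
    return MAX_PRICE_PER_PAGE_EUR * pages
```

Under these assumptions, a 12-page document that yields 15,400 tokens of delivered content counts as 16 pages, so its cost is capped at €0.16.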

How is PrimeCut different from other chunkers like Unstructured.io, LangChain text splitters, or LlamaIndex?

PrimeCut is a full-document, structure-aware chunker: it detects document hierarchy (headings, tables, captions, sections) before chunking and preserves those relationships in the output. Most alternatives take a character- or token-level approach: LangChain’s RecursiveCharacterTextSplitter, LlamaIndex’s default chunkers, and Unstructured.io all flatten documents into a stream. As a result, naive chunkers can split a single argument across three chunks, while PrimeCut keeps the argument intact. On the open-source OfficeQA benchmark, PrimeCut delivers 23% of the tokens at 100% recall compared with Unstructured.io.

What filetypes does PrimeCut support?

PrimeCut supports 50+ filetypes across text and binary formats — including PDF, DOCX, PPTX, XLSX, HTML, Markdown, CSV, JSON, plain text, and image-based documents via OCR. Visual content (tables, charts, embedded images) is handled inline; tables retain their captions and column relationships, images retain their surrounding context. The same API call processes all filetypes; you don’t pre-classify input.

Does PrimeCut handle scanned PDFs?

Yes. PrimeCut is OCR-aware: scanned PDFs and image-based documents are processed through PrimeCut’s structure-aware pipeline alongside native text PDFs. The system extracts text from images, then applies the same hierarchy detection used on native PDFs. This matters because much enterprise content — contracts, research papers, financial filings, engineering specs — exists primarily as scanned PDFs.

How does PrimeCut integrate with my existing RAG pipeline, vector database, and LLM?

PrimeCut is a drop-in chunking and ingestion layer — your retrieval, embeddings, and vector store stay where they are. Integration is via MCP (Model Context Protocol), SDK, or REST API. PrimeCut produces structured chunks ready to embed; pass them to your embedding model and store them in any vector database. Compatible with Pinecone, Weaviate, Qdrant, and others. Works with major LLM providers (OpenAI, Anthropic) and frameworks (LangChain, LlamaIndex).

Our secret sauce
Document Ingestion and RAG Chunking

PrimeCut is POMA AI's patent-protected document ingestion & RAG chunking core — structure-aware across text, tables, scanned PDFs (OCR), and 50+ filetypes.

The structural awareness matters most on technical documents — research papers, financial filings, engineering specs — where naive chunkers split a single argument across three chunks and break retrieval. Drop it into your existing pipeline; retrieval, embeddings, and vector store stay where they are.

Try for free

Read the docs

How Hierarchical Ingestion and Chunking Works: Structure to RAG-Ready Chunks
How POMA PrimeCut Sees Your Document Hierarchy

Every document carries an internal logic: a hierarchy of headings, sub-sections, tables, lists, and supporting elements that define what content belongs together and why. That structure is not decoration — it is the semantic map of the document.

Standard ingestion pipelines discard this map. They extract raw text and hand it to a chunker that has no knowledge of where one idea ends and another begins.

PrimeCut understands your document’s content hierarchy before chunking — preserving structural relationships, eliminating context poisoning, and producing semantically coherent chunksets that make every downstream RAG component more accurate by default.
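To make the failure mode concrete, here is a toy comparison between a fixed-size character chunker and a splitter that starts a new chunk at each heading. The heading splitter is a deliberately simple stand-in used for illustration, not PrimeCut's algorithm, and the sample text and function names are made up for this sketch.

```python
import re

SAMPLE = (
    "# Fees\n"
    "The annual fee is 2 percent.\n"
    "# Risks\n"
    "Returns are not guaranteed.\n"
)


def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def heading_chunks(text: str) -> list[str]:
    """Toy structure-aware chunking: start a new chunk at each '# ' heading line."""
    parts = re.split(r"(?m)^(?=# )", text)  # zero-width split before each heading
    return [part for part in parts if part.strip()]
```

With a size of 20, the fixed-size chunker cuts the sample in the middle of a sentence and in the middle of the second heading, whereas the heading splitter returns one chunk per section with its heading attached.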

Text, Chart & Table — One Document, Fully Resolved

[Figure: MSCI World Index (USD) Factsheet, Sep 2025, chunksets 0–5 of 43. Legend: shared root; shared hierarchy; leaf (unique to one chunkset). Content types: text, image description, table data.]

Source: MSCI World Index (USD) factsheet — processed by POMA PrimeCut into 43 chunksets from 85 structural chunks.

What you get back
Every upload returns a POMA archive.

Send us a document — any supported filetype — and you get back a POMA archive: a zip containing the structured SDK output your pipeline can read directly. No glue code, no per-file branching.

The archive bundles the chunks your retrieval layer consumes alongside the intermediate artifacts that produced them, so you can debug, re-process, or surface source content without re-running the pipeline.

Read the POMA archive reference

Inside a POMA archive

Core SDK files

chunks.json: Extracted chunks with hierarchy and page references.
chunksets.json: Grouped chunk collections for retrieval.
image_sources.json: Image references and metadata (when applicable).
assets/: Supporting files referenced by chunks (when applicable).
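If you want to peek inside an archive without the SDK, the standard library is enough. The sketch below assumes the JSON files listed above sit at the root of the zip; it makes no assumption about their internal schema and simply returns whatever JSON they contain. The function name is hypothetical.

```python
import json
import zipfile


def read_archive(source) -> dict:
    """Load the JSON members of an archive, skipping any that are absent.

    `source` may be a file path or a binary file-like object.
    """
    wanted = ("chunks.json", "chunksets.json", "image_sources.json")
    loaded = {}
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        for name in wanted:
            if name in names:  # image_sources.json is only present when applicable
                loaded[name] = json.loads(archive.read(name))
    return loaded
```

Calling read_archive on a path to a downloaded archive returns a dictionary keyed by file name, which is handy for a quick inspection before wiring up the SDK.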

Extended processing artifacts

Every archive also carries the intermediate artifacts that produced those chunks: input as markdown and HTML, structurally-indented plain text, AI/OCR image descriptions, extracted tables, pre-processed source files, and archive- and content-level metadata.

What it does
From document to embedding-ready chunkset, one call.

PrimeCut treats each document as a structure, not just as bytes. It detects the hierarchy of your document: headings, subsections, and clauses. This allows it to preserve clauses with their definitions, tables with their captions, and graph content within its section. It then emits chunksets that your embedding model can use directly.

Structure-aware parsing

Headings, tables, lists, and captions retain their hierarchical relationships through chunking. No flattening to character runs.

Fifty-plus filetypes

PDF, DOCX, PPTX, XLSX, HTML: the same engine handles all of them. Images, charts, and tabulated data are handled inline as searchable content.

Hierarchical chunksets

Output is structured JSON: chunks with full ancestor metadata and ready-to-embed traversal paths. Drop into any embedding model or vector DB.
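One way to picture a traversal path is as the selected chunks printed with one tab per hierarchy level, with a marker wherever intermediate content was skipped. The sketch below does that for a hypothetical chunk schema: the field names `id`, `depth`, and `text`, and the assumption that ids follow document order, are illustrative only and are not taken from the actual output format.

```python
GAP_MARKER = "[…]"


def render_chunkset(chunks: list[dict], chunk_ids: list[int]) -> str:
    """Render selected chunks as tab-indented text, marking skipped chunks."""
    by_id = {chunk["id"]: chunk for chunk in chunks}
    lines = []
    previous = None
    for cid in sorted(chunk_ids):
        chunk = by_id[cid]
        indent = "\t" * chunk["depth"]
        if previous is not None and cid > previous + 1:
            lines.append(indent + GAP_MARKER)  # chunks between the two were left out
        lines.append(indent + chunk["text"])
        previous = cid
    return "\n".join(lines)


CHUNKS = [
    {"id": 0, "depth": 0, "text": "Device Cybersecurity Guidance"},
    {"id": 1, "depth": 1, "text": "A. Scope"},
    {"id": 2, "depth": 1, "text": "B. Designing for Security"},
    {"id": 3, "depth": 2, "text": "Security objectives apply across the device architecture."},
]
```

Rendering chunks 0, 2, and 3 yields the root title, a gap marker in place of section A, then section B with its paragraph indented beneath it, so the embedded text carries its ancestors with it.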

What POMA PrimeCut Does Differently
POMA PrimeCut vs Unstructured.io vs Conventional Chunking: Hierarchical Chunking Compared

Conventional Chunk

an SPDF is one approach to help ensure that the QS regulation is met. Because of its benefits in helping comply with the QS regulation and

cybersecurity, FDA encourages manufacturers to use an SPDF, but other approaches might also satisfy the QS regulation.

### B. Designing for Security

When reviewing premarket submissions, FDA intends to assess device cybersecurity based on a number of factors, including, but not limited to, the

device's ability to provide and implement the security objectives below throughout the device architecture. The security objectives below generally

may apply broadly to devices within the scope of this guidance, including, but not limited to, devices containing artificial intelligence (AI) and

cloud-based services.

Security Objectives:

• Authenticity, which includes integrity;

• Authorization:

• Availability:

• Confidentiality; and

• Secure and timely updatability and patchability.

Premarket submissions should include information that describes how the above security objectives are addressed by and integrated into the device

design. The extent to which security requirements, architecture, supply chain, and implementation are needed to meet these objectives will depend on

but may not be limited to:

- The device’s intended use, indications for use, and reasonably foreseeable misuse;

- The presence and functionality of its electronic data interfaces;

• Its intended and actual environment of use:

- The risks presented by cybersecurity vulnerabilities;

- The exploitability of the vulnerabilities; and

- The risk of patient harm due to vulnerability exploitation.

SPDF processes aim to reduce the number and severity of vulnerabilities and thereby reduce the exploitability of a medical device system and the

associated risk of patient harm. Because exploitation of known vulnerabilities or weak cybersecurity controls should be considered reasonably

foreseeable failure modes for medical device systems, these factors should be addressed in the device design. $ ^{19} $ One of the key benefits of

using an SPDF is that a medical device system is more likely to be secure by design, such that the device is designed from the outset to be secure

within its system and/or network of use throughout the device lifecycle.

### C. Transparency

A lack of cybersecurity information, such as information necessary to integrate the device into the use environment, as well as information needed

by users to maintain the medical device system’s cybersecurity over the device lifecycle, has the potential to affect the safety and effectiveness

of a device. In order to address these concerns, it is important for device users to

## Contains Nonbinding Recommendations

have access to information pertaining to the device’s cybersecurity controls, potential risks to the medical device system, and other relevant

Issues: very long; spans multiple sections; isolates the heading from subsequent content.

Unstructured.io Chunk

• The device’s intended use, indications for use, and reasonably foreseeable misuse;

• The presence and functionality of its electronic data interfaces;

• Its intended and actual environment of use; 18

• The risks presented by cybersecurity vulnerabilities;

• The exploitability of the vulnerabilities; and

• The risk of patient harm due to vulnerability exploitation.

Issues: no section indication; random artifacts appear as part of the main text (“18”); no context or positioning within the document.

POMA Chunk(Set): Full Context Path

Cybersecurity Guidance for Medical Devices: Quality Systems and Premarket Submission

Requirements

	[…]

	Guidance for Industry and Food and Drug Administration Staff

		[…]

		Contains Nonbinding Recommendations outline

			[…]

			B. Designing for Security

			When reviewing premarket submissions, FDA intends to assess device cybersecurity

			based on a number of factors, including, but not limited to, the device's ability

			to provide and implement the security objectives below throughout the device

			architecture.

				[…]

				The extent to which security requirements, architecture, supply chain, and

				implementation are needed to meet these objectives will depend on but may not

				be limited to:

					[…]

					• Its intended and actual environment of use:

						[…]

						• The risk of patient harm due to vulnerability exploitation.

It just works

What We Do Differently - Explained

The benchmark
23% of the tokens, 100% recall.

Standard chunking ignores how your documents are structured. So a query like 'How high was the interest rate last year?' retrieves a wide net of chunks where most of the content has nothing to do with the question, and you still pay for every token returned. PrimeCut's chunking is structure-aware: queries return only the relevant content, with no loss of recall.
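For readers who want to reproduce this kind of comparison on their own data, the two headline numbers can be computed along these lines. This is a generic sketch of a token ratio and a recall measure; the benchmark's exact definitions may differ, and the function names are illustrative.

```python
def token_ratio(candidate_tokens: int, baseline_tokens: int) -> float:
    """Tokens returned by the candidate as a fraction of the baseline's tokens."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return candidate_tokens / baseline_tokens


def recall(retrieved_ids: set[str], relevant_ids: set[str]) -> float:
    """Share of the relevant items that appear in the retrieved set."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    return len(retrieved_ids & relevant_ids) / len(relevant_ids)
```

A chunker does well on both when the ratio is low and recall stays at 1.0: it returns fewer tokens than the baseline without dropping anything the question needs.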

Full benchmark on GitHub

Explore chunking strategies

One API That Adapts to Every Document
PrimeCut Adaptive

PrimeCut adapts processing to each document — preserving hierarchy, cross-references, and visual content where it matters, and staying lean where it doesn't. One balance, priced per page.

PrimeCut Adaptive

Full structural and visual intelligence where your documents need it — lean, fast processing where they don't.

max €0.01 per page (one document page, 1,000 tokens, or one minute of audio or video)

Features

Rapid, full document hierarchy parsing
Semantically bounded, neighbour-aware chunks with ancestor context inheritance
Context-aware, ready-to-embed chunksets
Full AI processing — figures, tables, and images parsed as semantic content
Visual elements extracted, placeholdered, and converted to retrievable, context-aware textual chunks
Optimized for accurate, hierarchical textual representation of complex multimodal content
Optimized for low cost — lean processing where documents are simple
Title generation

One API, both ends of the spectrum

Complex, high-stakes documents (legal & regulatory, financial & insurance, medical, engineering) — full structural and visual fidelity, maximum search accuracy
Large, simple corpora — lean processing that keeps cost near the floor
Mixed knowledge bases — PrimeCut Adaptive auto-classifies every document and gives it the right processing at the right price

Structured files — xml, cir, json, yaml, toml, ini, env, csv, tsv, xls, xlsx, xlsb — are always chunked with full structural fidelity.
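PrimeCut classifies documents for you, so nothing like this is required. If you still want to predict on the client side which files fall under the structured-file rule, for example when estimating cost for a mixed corpus, a check against the extension list above could look like the sketch below. The function name and the extension-only matching are assumptions; the service may detect filetypes differently.

```python
from pathlib import Path

# Extensions listed above as always chunked with full structural fidelity.
STRUCTURED_EXTENSIONS = frozenset({
    "xml", "cir", "json", "yaml", "toml", "ini", "env",
    "csv", "tsv", "xls", "xlsx", "xlsb",
})


def is_structured_file(filename: str) -> bool:
    """True if the file name's last extension is in the structured-file list."""
    name = Path(filename).name.lower()
    # rsplit also covers dotfiles such as ".env", which Path.suffix reports as empty.
    extension = name.rsplit(".", 1)[-1] if "." in name else ""
    return extension in STRUCTURED_EXTENSIONS
```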

Integration into Your RAG Pipeline
LangChain Document Chunking and RAG Pipeline Integration — No Architectural Overhaul

PrimeCut sits at the ingestion layer of your RAG pipeline — upstream of your vector database, your embedding model, and your retrieval logic. It receives documents. It returns structured, hierarchically-bounded chunksets.

The SDK is lightweight. The API is flexible. PrimeCut's output schema is the same whether you use the SDK or the API.
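The downstream half of that hand-off can be sketched without any PrimeCut-specific calls: take the chunksets, embed their text, and build records for your vector store. Everything named here is an assumption for illustration. The chunkset fields `id`, `text`, and `source` are hypothetical, and `toy_embed` is a hash-based placeholder standing in for your real embedding model.

```python
import hashlib
from typing import Callable


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Placeholder for a real embedding model: deterministic pseudo-vector from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dims]]


def index_chunksets(chunksets: list[dict],
                    embed: Callable[[str], list[float]]) -> list[dict]:
    """Embed each chunkset's text and return records ready to upsert into a vector store."""
    records = []
    for chunkset in chunksets:
        records.append({
            "id": chunkset["id"],               # hypothetical field name
            "vector": embed(chunkset["text"]),  # hypothetical field name
            "metadata": {"source": chunkset.get("source")},
        })
    return records
```

In a real pipeline you would replace toy_embed with a call to your embedding provider and pass the returned records to your vector database client; the rest of the retrieval stack stays as it is.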

Compatible with:

LLMs: OpenAI, Anthropic, other leading LLMs

Vector Databases: Qdrant, Pinecone, Weaviate, other vector databases

Frameworks: LangChain, LlamaIndex, custom RAG implementations

Ready to get started?
Try it on your own pipeline.

Free tier covers 1,000 pages — drop the SDK in, point it at a document, see what comes back. No retrieval refactor, no vector DB swap, no architectural overhaul.

Processing at scale? Let's talk

1,000 free pages. No credit card required.
