Trovix AI Integration

Trovix Sift ©

Automated document data extraction integrated into your existing workflow

Sift is a document intelligence pipeline that extracts structured data from unstructured documents — contracts, policy slips, claims forms, financial statements, ACORD submissions — and delivers it in a structured format to your target system. It operates at volume without manual intervention, with a confidence scoring and human review layer for extractions that fall below your defined threshold. Sift is deployed inside your environment and connects to your document ingestion and output systems via API.

Systems Sift integrates with

Sift connects to your document ingestion sources and delivers structured output to your target systems. No rekeying. No manual routing.

iManage / NetDocuments
Document events in your DMS trigger extraction automatically. Extracted data is written back to matter metadata fields.
Guidewire ClaimCenter
FNOL and claims documents are processed on receipt. Extracted fields are written directly to the ClaimCenter claim record via REST API.
Whitespace / PPL
ACORD-standard slip data extracted from London market submissions. Compatible with Whitespace Digital Platform and PPL workflows.
Salesforce
Extracted data delivered to custom objects or standard fields via Salesforce REST API. Supports Financial Services Cloud objects.
SharePoint / OneDrive
Document libraries monitored for new items. Extraction triggered on upload. Output to SharePoint lists or external systems.
Custom databases
Output delivered via REST API, direct database write (PostgreSQL, SQL Server, Oracle) or file export (CSV, JSON, XLSX).
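As an illustration of the file export option, the sketch below serialises a small batch of extracted records to JSON and CSV using only the Python standard library. The record shape, field names and the helper names `to_json` and `to_csv` are assumptions made for this example, not Sift's actual output schema or API. XLSX is left out because it needs a third-party library.

```python
import csv
import io
import json

# Hypothetical extracted records; the field names are illustrative only.
records = [
    {"policy_number": "P-1001", "insured_name": "Example Ltd", "premium": "125000.00"},
    {"policy_number": "P-1002", "insured_name": "Sample plc"},
]


def to_json(rows):
    """Serialise records as a JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialise records as CSV, leaving missing fields blank."""
    # Collect column names in first-seen order across all records.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(records)
csv_text = to_csv(records)
```

A record that lacks a field (here the second record has no premium) produces an empty CSV cell rather than an error, which keeps the column layout stable across a batch.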

Integration architecture

Email / DMS Intake
RFC 5322 / iManage API
Bulk Upload
SFTP / SharePoint / S3
Sift Extraction
NLP + custom model
Confidence Score
per field extracted
Structured Output
JSON / CSV / XLSX
Flagged for Review
confidence < threshold
Target System
DMS / CRM / register

Each extracted field carries a confidence score and a citation to the source clause. Low-confidence fields are routed to the human review queue.

Technical specification

Extraction approach
Fine-tuned extraction model trained on your document types. Initial training uses your historical document library. Improves over time.
Confidence scoring
Each extracted field is scored 0–1. Threshold configurable per field type. Fields below threshold queued for human review automatically.
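The routing rule described here can be sketched as follows. This is a minimal illustration, assuming a hypothetical `ExtractedField` record and made-up threshold values; it is not Sift's internal implementation. A field goes to review only when its score is strictly below the threshold for its field type.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ExtractedField:
    name: str
    field_type: str
    value: str
    confidence: float  # score in the range 0 to 1


# Hypothetical per-field-type thresholds; real values are set per deployment.
THRESHOLDS: Dict[str, float] = {"date": 0.90, "currency_amount": 0.95}
DEFAULT_THRESHOLD = 0.80  # assumed fallback for field types not listed above


def route(fields: List[ExtractedField]) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Split fields into (accepted, queued_for_review) by confidence."""
    accepted, review = [], []
    for field in fields:
        threshold = THRESHOLDS.get(field.field_type, DEFAULT_THRESHOLD)
        if field.confidence < threshold:
            review.append(field)
        else:
            accepted.append(field)
    return accepted, review


fields = [
    ExtractedField("inception_date", "date", "2024-01-01", 0.97),
    ExtractedField("premium", "currency_amount", "125000.00", 0.91),
    ExtractedField("insured_name", "party_name", "Example Ltd", 0.80),
]
accepted, review = route(fields)
```

In this example the premium is queued for review because 0.91 is below the stricter 0.95 threshold assumed for currency amounts, while the insured name passes at exactly the default threshold.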
Document formats
PDF (native and scanned), DOCX, XLSX, MSG, EML, TIFF, PNG. OCR layer for scanned documents using Azure AI Document Intelligence.
Volume capacity
Designed for continuous batch processing. Tested to 50,000+ documents per day on standard cloud infrastructure.
Human review interface
Browser-based review queue. Reviewer sees the original document alongside extracted fields. Accept, correct or reject per field.
Audit trail
Every extraction logged: document ID, field, extracted value, confidence score, model version, review outcome and reviewer identity.
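One way to picture a single audit entry is as a flat record with the attributes listed above. The sketch below is an assumption about shape only: the class name `AuditRecord`, the attribute names, the example values and the JSON-line encoding are all hypothetical and do not describe Sift's stored format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    document_id: str
    field: str
    extracted_value: str
    confidence: float
    model_version: str
    # Review details stay empty when the field never entered the review queue.
    review_outcome: Optional[str] = None  # e.g. "accepted", "corrected", "rejected"
    reviewer: Optional[str] = None


def to_log_line(record: AuditRecord) -> str:
    """Encode one audit record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    document_id="DOC-0001",
    field="premium",
    extracted_value="125000.00",
    confidence=0.91,
    model_version="example-model-1",
    review_outcome="corrected",
    reviewer="reviewer@example.com",
)
line = to_log_line(record)
```

Writing one self-contained line per extracted field keeps each entry independently searchable by document ID, model version or reviewer.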
Integration method
Event-driven via webhooks or polling. Supports SFTP drop zones, S3 buckets, SharePoint event hooks and direct API submission.
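The polling variant can be sketched with a local directory standing in for an SFTP drop zone. Everything here is illustrative: `poll_once`, the `seen` set and the extension list (taken from the document formats above) are assumptions, and a real deployment would submit each new file for extraction rather than just returning it.

```python
import tempfile
from pathlib import Path
from typing import List, Set

# File extensions matching the supported document formats.
SUPPORTED = {".pdf", ".docx", ".xlsx", ".msg", ".eml", ".tiff", ".png"}


def poll_once(drop_zone: Path, seen: Set[str]) -> List[Path]:
    """Return supported files not seen on earlier polls, and mark them as seen."""
    new_files = []
    for path in sorted(drop_zone.iterdir()):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        if path.name in seen:
            continue
        seen.add(path.name)
        new_files.append(path)
    return new_files


# Demonstration against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    zone = Path(tmp)
    (zone / "slip_001.pdf").write_bytes(b"")
    (zone / "notes.txt").write_bytes(b"")  # unsupported type, ignored
    seen: Set[str] = set()
    first = [p.name for p in poll_once(zone, seen)]
    (zone / "claim_002.eml").write_bytes(b"")
    second = [p.name for p in poll_once(zone, seen)]
```

Tracking seen file names means each document is picked up once, which is the main thing a polling integration has to guarantee; a webhook integration gets the same property from the event itself.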

Regulatory compliance

FCA Consumer Duty
Extraction audit trail satisfies fair treatment documentation requirements. Human review thresholds configurable per regulatory obligation.
Lloyd's Blueprint Two
ACORD-standard output compatible with Lloyd's data quality requirements. Confidence scores available for MRC field validation.
SRA Code
Document processing logged with full provenance. Matter data handled within your environment under your data policies.
GDPR
Personal data extracted only as required for defined purposes. Data minimisation controls configurable per document type.

Discuss Sift integration with your document processing stack

A technical conversation about your existing systems and integration approach. 30 minutes.

Book a technical call