AI Extraction

Overview

Attribute Extraction identifies specific information within unstructured data and converts it into a structured format. It scans Datasources, isolates the pieces of information you specify, and presents them in a structured form.

Title	Description	Demo
Content Organization	Structure scattered data into defined formats.	Organize Data Formats
Tabular Data Extraction	Extract tabular data from PDFs and images.	Tabular Data Extraction
Financial Analysis Chatbot	Analyze financial documents and answer questions.	Financial Analysis Chatbot
Compare Documents	Compare documents and identify differences.	Comparing Insurance Policies

Usage

Add Datasource

Create a datasource by calling the Add datasource endpoint, where you define the field attributes to extract. Use the description parameter to guide the LLM.
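As a rough illustration of what an attribute definition might look like, the sketch below assembles a request body in Python. The field names (`name`, `attributes`, `description`), the helper `build_datasource_payload`, and the assumption that each attribute carries its own description are hypothetical and are not taken from the API reference; check the Add datasource endpoint documentation for the actual schema.

```python
import json


def build_datasource_payload(name, attributes):
    """Build a JSON-serializable body listing the attributes to extract.

    `attributes` maps an attribute name to the description that guides
    the LLM. The payload shape is an assumption made for illustration.
    """
    if not attributes:
        raise ValueError("at least one attribute is required")
    fields = []
    for attr_name, description in attributes.items():
        if not description.strip():
            raise ValueError(f"attribute {attr_name!r} needs a description")
        fields.append({"name": attr_name, "description": description.strip()})
    return {"name": name, "attributes": fields}


payload = build_datasource_payload(
    "insurance-policies",
    {
        "policy_number": "The unique policy identifier printed on the first page.",
        "annual_premium": "Total yearly premium as a number, without currency symbols.",
    },
)
print(json.dumps(payload, indent=2))
```

Specific, concrete descriptions (expected format, where the value usually appears) tend to give the LLM more to work with than a bare attribute name.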

Get Datasource

Retrieve a datasource by calling the Get datasource endpoint, which returns the datasource with the corresponding id.

File Uploading (Add Data Entities)

We support the following file types for Attribute Extraction:

.pdf
.csv
.docx
.xlsx
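A client-side check against the list above can catch unsupported files before upload. This is a minimal sketch; the helper name `is_supported` is made up for illustration, and the comparison is case-insensitive by assumption.

```python
from pathlib import Path

# File types listed above as supported for Attribute Extraction.
SUPPORTED_EXTENSIONS = {".pdf", ".csv", ".docx", ".xlsx"}


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension is in the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for candidate in ["report.PDF", "data.xlsx", "notes.txt"]:
    print(candidate, is_supported(candidate))
```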

Upon upload, these files run through the following pipeline:

The file is converted into a text representation, using OCR if applicable.
The text runs through a series of metadata extractors and augmenters.
The extracted attributes are stored in our database, ready to use in your indexes.
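The three stages above can be pictured with a toy pipeline. Everything in this sketch is a stand-in: `to_text` only decodes bytes instead of running OCR, the two extractors are invented examples, and a plain dictionary plays the role of the database. It shows the shape of the flow, not the platform's actual implementation.

```python
import re


def to_text(raw: bytes, needs_ocr: bool = False) -> str:
    """Stage 1: produce a text representation (real OCR is out of scope here)."""
    return raw.decode("utf-8", errors="replace")


def extract_invoice_number(text: str) -> dict:
    """Hypothetical extractor: pull an invoice number if one is present."""
    match = re.search(r"Invoice\s+#?(\d+)", text)
    return {"invoice_number": match.group(1)} if match else {}


def extract_word_count(text: str) -> dict:
    """Hypothetical augmenter: attach a simple derived value."""
    return {"word_count": len(text.split())}


def run_pipeline(raw, extractors, store, entity_id, needs_ocr=False):
    """Run stages 1-3 and record the resulting attributes under entity_id."""
    text = to_text(raw, needs_ocr)
    attributes = {}
    for extractor in extractors:  # Stage 2: each extractor adds attributes
        attributes.update(extractor(text))
    store[entity_id] = attributes  # Stage 3: persist (a dict stands in for the DB)
    return attributes


store = {}
run_pipeline(
    b"Invoice #1042 due on receipt",
    [extract_invoice_number, extract_word_count],
    store,
    "entity-1",
)
print(store)
```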

Definitions

Glossary

Term	Definition
Attribute	Specific pieces of information identified and extracted from the raw data.
Attribute Extraction template	An outline of the specified attributes and their descriptions.
Data Entity	A unique entry in a Datasource. For example, an uploaded file together with its extracted attributes.
Datasource	A collection of Data Entities addressing a specific data concern. It can come from an integration, a grouping of files, and so on.
OCR (Optical Character Recognition)	A technology that recognizes and converts different types of documents, such as scanned paper documents, PDF files, or images, into editable and searchable text.
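One way to picture how the glossary terms relate is as a small data model. The class and field names below (`AttributeSpec`, `DataEntity`, `Datasource`, `add_entity`) are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeSpec:
    """One line of an Attribute Extraction template: a name plus its description."""
    name: str
    description: str


@dataclass
class DataEntity:
    """A unique entry in a Datasource: an uploaded file and its extracted attributes."""
    entity_id: str
    source_file: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Datasource:
    """A collection of Data Entities that share one extraction template."""
    datasource_id: str
    template: list
    entities: list = field(default_factory=list)

    def add_entity(self, entity: DataEntity) -> None:
        # Reject attributes that the template does not define.
        allowed = {spec.name for spec in self.template}
        unknown = set(entity.attributes) - allowed
        if unknown:
            raise ValueError(f"attributes not in template: {sorted(unknown)}")
        self.entities.append(entity)


policies = Datasource(
    datasource_id="ds-1",
    template=[AttributeSpec("policy_number", "Identifier on the first page.")],
)
policies.add_entity(
    DataEntity("entity-1", "policy.pdf", {"policy_number": "P-2291"})
)
print(len(policies.entities))
```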
