AI Retrieval
Metal Retrieval allows you to utilize AI on your data.
Overview
LLM Retrieval is the process of making your data compatible with LLMs. Your data is embedded into a vector space, and the resulting embeddings are indexed in a database. You can then run semantic searches on your data, which is the core of retrieval for LLM applications.
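To make the embed, index, and search flow concrete, here is a minimal sketch in Python. It is not Metal's implementation: `embed` is a toy word-count function standing in for a real embeddings model, the "index" is an in-memory list, and every name in it is hypothetical. It shows only how a query vector is compared against stored vectors using cosine similarity.

```python
import math
from collections import Counter


def embed(text: str, vocabulary: list[str]) -> list[float]:
    """Toy embedding (assumption): count occurrences of each vocabulary word."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, index: list[tuple[str, list[float]]],
           vocabulary: list[str], limit: int = 1) -> list[str]:
    """Embed the query, then return the texts whose vectors are most similar."""
    query_vector = embed(query, vocabulary)
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vector, entry[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:limit]]


# "Indexing": store each text next to its embedding.
vocabulary = ["cat", "dog", "invoice", "payment"]
texts = ["the cat chased the dog", "invoice payment is overdue"]
index = [(text, embed(text, vocabulary)) for text in texts]

results = search("late payment", index, vocabulary)
```

A real embeddings model places texts with similar meaning close together even when they share no words, which this word-count stand-in cannot do.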
Use Cases
Title | Description | Demo |
---|---|---|
Semantic Search | LLMs give your search the knowledge of semantic meaning | Multi Tenant search |
Chatbots | Utilize LLMs to create chatbots that can answer questions on your data | RAG Chatbot |
Question+Answering | Build chat apps that answer questions on top of your unstructured data | Chat App |
Tabular Analysis | Analyze tabular data with LLMs | Financial Analysis Chatbot |
Clustering | Uncover hidden trends within your unstructured data | Clustering Tool |
Image Search | Search for images based on semantic meaning | Image Retrieval with CLIP |
Usage
Indexing
We provide APIs to easily push data into our system. We support the following “primitive” data types for ingestion:
- Image URLs (.jpg, .png, .gif, .bmp, .tiff)
- Text (string)
- Embeddings (number[])
This data is then pushed into our indexing pipeline, which generates embeddings and stores them in a vector database. Check out the Index endpoint for more details.
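Before sending data for indexing, it can help to check which of the three primitive types a value falls under. The sketch below is a hypothetical client-side helper, not part of the Metal API: the name `classify_input` and the returned labels are assumptions made for illustration. Only the extension list is taken from the supported types above.

```python
from urllib.parse import urlparse

# Image extensions taken from the list of supported primitive types.
IMAGE_EXTENSIONS = (".jpg", ".png", ".gif", ".bmp", ".tiff")


def classify_input(value) -> str:
    """Return 'image_url', 'text', or 'embedding' for a value (hypothetical helper)."""
    if isinstance(value, (list, tuple)):
        # An embedding is a non-empty sequence of numbers (bools are excluded).
        if value and all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in value
        ):
            return "embedding"
        raise ValueError("an embedding must be a non-empty list of numbers")
    if isinstance(value, str):
        parsed = urlparse(value)
        is_http = parsed.scheme in ("http", "https")
        if is_http and parsed.path.lower().endswith(IMAGE_EXTENSIONS):
            return "image_url"
        return "text"
    raise TypeError(f"unsupported input type: {type(value).__name__}")


kinds = [
    classify_input("https://example.com/cat.png"),
    classify_input("A plain sentence to index"),
    classify_input([0.12, -0.5, 0.33]),
]
```

The actual request format is defined by the Index endpoint.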
File Importing
We provide APIs to easily push larger collections of data into our system as well. We support the following file types for ingestion:
- .csv
- .docx
- .pdf
- .pptx
- .txt
- .xlsx
Upon upload, these files run through the following pipeline:
- The file runs through a series of metadata extractors and augmenters
- The file is converted into a text representation, via OCR if applicable
- The text is split into overlapping 500-token chunks
- The chunks are embedded with the chosen embeddings model (ada, clip, etc.)
- The embeddings are indexed into a vector database
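The chunking step in this pipeline can be sketched as a sliding window over a token list. This is an illustration under stated assumptions, not the pipeline's actual code: the 500-token chunk size comes from the description above, but the overlap of 50 tokens is an assumed value (the overlap size is not documented here), and tokens are simulated with plain strings rather than a model tokenizer.

```python
def split_into_chunks(tokens: list[str], chunk_size: int = 500,
                      overlap: int = 50) -> list[list[str]]:
    """Split tokens into chunks of at most chunk_size, each sharing
    `overlap` tokens with the previous chunk (overlap=50 is an assumption)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the window has reached the end of the token list.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Simulated tokens; a real pipeline would use the embedding model's tokenizer.
tokens = [f"tok{i}" for i in range(1200)]
chunks = split_into_chunks(tokens)
chunk_lengths = [len(chunk) for chunk in chunks]
```

Overlapping windows mean that a sentence falling on a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.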
Searching
Run semantic search out of the box with our API. We support the following search term types:
- Images (.jpg, .png, .gif, .bmp, .tiff)
- Text (string)
- Embeddings (number[])
Filtered search is also supported. Check out the Search endpoint for more details.
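Conceptually, a filtered search narrows the candidate documents by metadata and then ranks the remainder by similarity to the query embedding. The sketch below illustrates that idea in memory. The `Document` class, the `filtered_search` function, the `tenant` metadata key, and the exact-match filter semantics are all assumptions for illustration, not the Search endpoint's actual behavior.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Document:
    """A record holding an embedding and its metadata (hypothetical shape)."""
    id: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_search(query_embedding: list[float], documents: list[Document],
                    filters: dict, limit: int = 10) -> list[Document]:
    """Keep documents whose metadata matches every filter exactly,
    then rank them by similarity to the query embedding."""
    candidates = [
        doc for doc in documents
        if all(doc.metadata.get(key) == value for key, value in filters.items())
    ]
    candidates.sort(
        key=lambda doc: cosine_similarity(query_embedding, doc.embedding),
        reverse=True,
    )
    return candidates[:limit]


documents = [
    Document("a", [1.0, 0.0], {"tenant": "acme"}),
    Document("b", [0.9, 0.1], {"tenant": "globex"}),
    Document("c", [0.0, 1.0], {"tenant": "acme"}),
]
hits = filtered_search([1.0, 0.0], documents, {"tenant": "acme"})
hit_ids = [doc.id for doc in hits]
```

Document "b" is close to the query but is excluded because its metadata does not match the filter. This is the same pattern a multi-tenant search relies on.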
Definitions
Term | Definition |
---|---|
Document | A record that stores an embedding and its metadata |
Embedding | A vector representation of your data |
Index | A database of your embeddings |
Indexing | Pushing data (raw text, files, images, etc.) into our system |
Search | An operation that runs a semantic search on your index |
Tuning | A mechanism to improve the quality of your embeddings for your particular use case |