Request
Uploads a document to a dataset. Supports text content, PDFs, images, and other file types.
Path Parameters
Authorization (header): Bearer token for authentication
Body (JSON Upload)
For text content, send a JSON body:
content: Text content of the document
metadata: Custom metadata (title, category, tags, etc.)
Body (File Upload)
For files, use multipart/form-data:
file: File to upload (PDF, image, text, etc.)
metadata: JSON-encoded metadata object
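Because the metadata form field has to be a JSON string rather than a nested form value, it helps to serialize it in code instead of writing the string by hand. The following is a minimal sketch using the standard library; the helper name build_form_fields is hypothetical and not part of any FLTR client.

```python
import json


def build_form_fields(metadata: dict) -> dict:
    """Return the non-file form fields for a multipart upload.

    Hypothetical helper: the `metadata` field is serialized to a compact
    JSON string, matching the "JSON-encoded metadata object" described above.
    """
    return {"metadata": json.dumps(metadata, separators=(",", ":"))}


fields = build_form_fields({"title": "Product Guide", "category": "docs"})
print(fields["metadata"])
```

The returned dict is meant to be passed as the data argument of requests.post, alongside files={"file": f}.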
Response
document_id: Unique document identifier
filename: Original filename (for file uploads)
status: Processing status (processing, ready, or failed)
Number of searchable chunks (available when status is ready)
created_at: ISO 8601 creation timestamp
Examples
Text Upload
curl -X POST https://api.fltr.com/v1/datasets/ds_abc123/documents \
  -H "Authorization: Bearer fltr_sk_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "FLTR is a semantic search platform...",
    "metadata": {
      "title": "About FLTR",
      "category": "documentation",
      "tags": ["intro", "overview"]
    }
  }'
import requests

# Passing json= serializes the body and sets Content-Type: application/json,
# so the explicit Content-Type header is not needed.
response = requests.post(
    "https://api.fltr.com/v1/datasets/ds_abc123/documents",
    headers={"Authorization": "Bearer fltr_sk_abc123..."},
    json={
        "content": "FLTR is a semantic search platform...",
        "metadata": {
            "title": "About FLTR",
            "category": "documentation",
        },
    },
)
File Upload
curl -X POST https://api.fltr.com/v1/datasets/ds_abc123/documents \
  -H "Authorization: Bearer fltr_sk_abc123..." \
  -F "file=@document.pdf" \
  -F 'metadata={"title":"Product Guide","category":"docs"}'
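import requests

# Open in binary mode; requests builds the multipart/form-data body itself.
with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://api.fltr.com/v1/datasets/ds_abc123/documents",
        headers={"Authorization": "Bearer fltr_sk_abc123..."},
        files={"file": f},
        # metadata must be sent as a JSON-encoded string
        data={"metadata": '{"title":"Product Guide"}'},
    )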
Response
{
"document_id": "doc_xyz789",
"dataset_id": "ds_abc123",
"filename": "document.pdf",
"size_bytes": 1048576,
"status": "processing",
"created_at": "2024-01-10T12:00:00Z",
"metadata": {
"title": "Product Guide",
"category": "docs"
}
}
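A response like the one above can be turned into a typed object on the client side. This is a sketch only: the UploadedDocument class and parse_upload_response function are hypothetical names, and the code reads only fields that appear in the example response.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class UploadedDocument:
    """Hypothetical client-side view of an upload response."""
    document_id: str
    status: str
    created_at: datetime
    filename: Optional[str] = None
    metadata: dict = field(default_factory=dict)


def parse_upload_response(body: str) -> UploadedDocument:
    data = json.loads(body)
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return UploadedDocument(
        document_id=data["document_id"],
        status=data["status"],
        created_at=created,
        filename=data.get("filename"),  # only present for file uploads
        metadata=data.get("metadata", {}),
    )


sample = """{
  "document_id": "doc_xyz789",
  "dataset_id": "ds_abc123",
  "filename": "document.pdf",
  "size_bytes": 1048576,
  "status": "processing",
  "created_at": "2024-01-10T12:00:00Z",
  "metadata": {"title": "Product Guide", "category": "docs"}
}"""
doc = parse_upload_response(sample)
print(doc.document_id, doc.status, doc.created_at.isoformat())
```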
Processing
Documents are processed asynchronously:
- Upload - File received and stored
- Extraction - Text extracted from file
- Chunking - Content split into searchable segments
- Embedding - Vector embeddings generated
- Indexing - Added to search index
Processing typically takes 1-30 seconds depending on document size.
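Since processing is asynchronous, a client usually has to wait for the status to leave processing before searching. The sketch below shows one way to poll. This section does not document how a document's status is retrieved, so fetch_status is a caller-supplied, hypothetical callable that returns the current status string. The 60-second default timeout is an assumption chosen to sit above the typical 1-30 second range.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status is "ready" or "failed", or the timeout elapses.

    `fetch_status` is hypothetical: it should return one of the documented
    status values ("processing", "ready", "failed").
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("ready", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence standing in for real API calls.
statuses = iter(["processing", "processing", "ready"])
result = wait_until_ready(lambda: next(statuses), sleep=lambda _: None)
print(result)
```

Injecting the sleep function keeps the helper easy to exercise without real delays. A failed status is returned rather than raised so the caller can decide how to handle it.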
Supported File Types
- Text: .txt, .md, .csv
- Documents: .pdf, .docx, .pptx
- Images: .jpg, .png (OCR applied)
- Code: .py, .js, .java, etc.
- Data: .json, .xml, .yaml
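A client can check a filename against the list above before uploading. The mapping below is a sketch that copies only the extensions named in the list. The Code entry ends in "etc.", so the mapping is not exhaustive, and a None result means only that the extension is not listed here, not that the API would reject it. The names SUPPORTED_EXTENSIONS and file_category are hypothetical, and the case-insensitive match is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the documented list; not exhaustive for code files.
SUPPORTED_EXTENSIONS = {
    "text": {".txt", ".md", ".csv"},
    "documents": {".pdf", ".docx", ".pptx"},
    "images": {".jpg", ".png"},
    "code": {".py", ".js", ".java"},
    "data": {".json", ".xml", ".yaml"},
}


def file_category(filename: str) -> Optional[str]:
    """Return the documented category for a filename, or None if unlisted."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None


print(file_category("document.pdf"))
print(file_category("archive.zip"))
```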
Limits
- Max file size: 10 MB
- Max content length: 1M characters
- Max 10,000 documents per dataset
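The size and length limits can be checked on the client before a request is sent. The following is a minimal sketch; check_limits is a hypothetical helper, and it assumes the 10 MB limit means 10 × 1024 × 1024 bytes, which the text above does not state. The per-dataset document count cannot be checked locally and is left out.

```python
from typing import List, Optional

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB
MAX_CONTENT_CHARS = 1_000_000      # "1M characters"


def check_limits(
    content: Optional[str] = None,
    size_bytes: Optional[int] = None,
) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if content is not None and len(content) > MAX_CONTENT_CHARS:
        problems.append(
            f"content has {len(content)} characters, limit is {MAX_CONTENT_CHARS}"
        )
    if size_bytes is not None and size_bytes > MAX_FILE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}"
        )
    return problems


print(check_limits(size_bytes=1048576))
print(check_limits(content="a" * 1_000_001))
```

For a file on disk, os.path.getsize(path) gives the value to pass as size_bytes.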
Notes
- Set title in metadata for better search results
- Use category and tags for filtering
- Custom metadata is searchable
- Duplicate content is allowed