Documents API

Learn how to add, retrieve, and manage documents in your VectorForgeAI collections.

Document Overview

Documents are the core data units in VectorForgeAI. Each document contains text content that is automatically processed, chunked, and embedded for vector search and retrieval. The Documents API allows you to manage these documents within your collections.

List Documents in a Collection

Retrieve a paginated list of the documents in a specific collection.

GET /collections/{collection_id}/documents

Path Parameters

Parameter      Type    Description
collection_id  string  ID of the collection to list documents from

Query Parameters

Parameter  Type     Required  Description
limit      integer  No        Number of documents to return (1-100). Default: 10
page       integer  No        Page number for pagination. Default: 1

Request

cURL
curl -X GET "https://api.vectorforgeai.com/v1/collections/abc123xyz456/documents?limit=20&page=1" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Team-Token: YOUR_TEAM_TOKEN"

Response

JSON
{
  "documents": {
    "current_page": 1,
    "data": [
      {
        "identifier": "doc-001",
        "title": "What is VectorForgeAI",
        "body": "VectorForgeAI is a platform that simplifies AI infrastructure...",
        "metadata": {
          "author": "Jane Smith",
          "department": "Marketing"
        },
        "structured_problems": [],
        "created_at": "2025-05-01T09:15:22.123Z",
        "updated_at": "2025-05-01T09:15:22.123Z"
      },
      {
        "identifier": "doc-002",
        "title": "Getting Started Guide",
        "body": "This guide will help you get started with our platform...",
        "metadata": {
          "category": "Tutorials",
          "difficulty": "Beginner"
        },
        "structured_problems": [{"tags": ["guide", "getting-started"], "problem": "How do I get started with VectorForgeAI?", "summary": "This is a guide how to get started with VectorForgeAI", "solution": "First you go to the signup page, step 2 will be creating an API token. After this you can follow the documentation step by step."}],
        "created_at": "2025-05-02T14:25:12.789Z",
        "updated_at": "2025-05-02T14:25:12.789Z"
      }
    ],
    "first_page_url": "https://api.vectorforgeai.com/v1/collections/abc123xyz456/documents?page=1",
    "from": 1,
    "last_page": 3,
    "per_page": 20,
    "to": 20,
    "total": 42
  }
}
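
The pagination fields in the response above (current_page, last_page) are enough to walk a whole collection. The following is a minimal Python sketch under stated assumptions, not an official client: list_documents_url and iter_documents are hypothetical helper names, and fetch_json stands for any function you supply that performs the authenticated GET and returns the parsed JSON body.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://api.vectorforgeai.com/v1"


def list_documents_url(collection_id: str, limit: int = 10, page: int = 1) -> str:
    """Build the list URL, enforcing the documented limit range of 1-100."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if page < 1:
        raise ValueError("page must be 1 or greater")
    query = urlencode({"limit": limit, "page": page})
    return f"{BASE_URL}/collections/{collection_id}/documents?{query}"


def iter_documents(
    collection_id: str,
    fetch_json: Callable[[str], dict],  # hypothetical: performs the GET, returns parsed JSON
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every document, requesting pages until current_page reaches last_page."""
    page = 1
    while True:
        envelope = fetch_json(list_documents_url(collection_id, limit, page))["documents"]
        yield from envelope["data"]
        if envelope["current_page"] >= envelope["last_page"]:
            break
        page += 1
```

Passing the fetch function in keeps the paging logic independent of the HTTP library and makes it easy to exercise with canned responses.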

Add a Document to a Collection

Add a new document to an existing collection. If a document with the same identifier already exists, it will be updated.

POST /collections/{collection_id}/documents

Path Parameters

Parameter      Type    Description
collection_id  string  ID of the collection to add the document to

Request Parameters

Parameter       Type    Required  Description
identifier      string  Yes       Unique identifier for the document (max 512 characters)
title           string  No        Title of the document (max 512 characters)
body            string  Yes       The main content of the document
metadata        object  No        Key-value pairs with additional data about the document
system_context  string  No        Extra information or instructions for problem-solution generation (max 512 characters)

⚠️ Metadata Restrictions

The following keys are not allowed in the metadata object: tenant_id, collection_id, id, chunk_content, and chunk, as these are reserved for system use.
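
Checking a payload against these rules before sending it can save a round trip. The sketch below is illustrative only: validate_document is a hypothetical helper, the server remains the authority on what is accepted, and two details are assumptions rather than documented behavior (that required strings must be non-empty, and that the 512-character limit counts characters the way Python's len() does).

```python
RESERVED_METADATA_KEYS = {"tenant_id", "collection_id", "id", "chunk_content", "chunk"}
MAX_LENGTH = 512  # documented limit for identifier, title, and system_context


def validate_document(doc: dict) -> list[str]:
    """Return a list of problems found in a document payload (empty if none)."""
    errors = []

    # identifier and body are the two required fields.
    # Assumption: an empty string is treated the same as a missing value.
    for field in ("identifier", "body"):
        value = doc.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field} is required and must be a non-empty string")

    # Length limits apply to these three string fields.
    for field in ("identifier", "title", "system_context"):
        value = doc.get(field)
        if isinstance(value, str) and len(value) > MAX_LENGTH:
            errors.append(f"{field} exceeds {MAX_LENGTH} characters")

    # metadata is optional, but must be an object without reserved keys.
    metadata = doc.get("metadata", {})
    if not isinstance(metadata, dict):
        errors.append("metadata must be an object")
    else:
        reserved = sorted(RESERVED_METADATA_KEYS.intersection(metadata))
        if reserved:
            errors.append("metadata uses reserved keys: " + ", ".join(reserved))

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a document at once.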

Request

cURL
curl -X POST https://api.vectorforgeai.com/v1/collections/abc123xyz456/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Team-Token: YOUR_TEAM_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "identifier": "pricing-guide-2025",
    "title": "VectorForgeAI Pricing Guide 2025",
    "body": "VectorForgeAI offers flexible pricing plans to meet the needs of individuals, startups, and enterprises...",
    "metadata": {
      "category": "Documentation",
      "updated_date": "2025-05-01",
      "version": "2.4"
    }
  }'

Response

JSON
{
  "state": "created",
  "document": {
    "identifier": "pricing-guide-2025",
    "title": "VectorForgeAI Pricing Guide 2025",
    "body": "VectorForgeAI offers flexible pricing plans to meet the needs of individuals, startups, and enterprises...",
    "metadata": {
      "category": "Documentation",
      "updated_date": "2025-05-01",
      "version": "2.4"
    },
    "structured_problems": [{"tags": ["guide", "getting-started"], "problem": "How do I get started with VectorForgeAI?", "summary": "This is a guide how to get started with VectorForgeAI", "solution": "First you go to the signup page, step 2 will be creating an API token. After this you can follow the documentation step by step."}],
    "created_at": "2025-05-10T10:15:22.456Z",
    "updated_at": "2025-05-10T10:15:22.456Z"
  }
}
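
For reference, the same POST request can be sketched in Python with only the standard library. The helper names build_add_document_request and add_document are hypothetical, and the code assumes nothing about the API beyond the endpoint, headers, and JSON body shown in the cURL example.

```python
import json
import urllib.request

BASE_URL = "https://api.vectorforgeai.com/v1"


def build_add_document_request(
    collection_id: str, document: dict, api_key: str, team_token: str
) -> urllib.request.Request:
    """Prepare (but do not send) the POST that adds or updates a document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/collections/{collection_id}/documents",
        data=json.dumps(document).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Team-Token": team_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def add_document(collection_id: str, document: dict, api_key: str, team_token: str) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_add_document_request(collection_id, document, api_key, team_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because an existing identifier is updated rather than rejected, calling add_document twice with the same identifier should be safe to retry; the response's "state" field reports what happened ("created" in the example above).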

Delete a Document

Permanently remove a document from a collection, including all its vector embeddings.

DELETE /collections/{collection_id}/documents/{identifier}

Path Parameters

Parameter      Type    Description
collection_id  string  ID of the collection containing the document
identifier     string  Unique identifier of the document to delete

⚠️ Warning

This operation permanently deletes the document and all its associated vector embeddings. This action cannot be undone.

Request

cURL
curl -X DELETE https://api.vectorforgeai.com/v1/collections/abc123xyz456/documents/pricing-guide-2025 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Team-Token: YOUR_TEAM_TOKEN"

Response

JSON
{
  "message": "Document deleted successfully."
}
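
Because the identifier travels in the URL path, identifiers that contain characters such as "/" or ":" (for example when a URL is used as the identifier) need percent-encoding on the client side. The sketch below shows one way to build the DELETE request in Python; build_delete_request is a hypothetical helper, and it is an assumption, not documented behavior, that the API decodes percent-encoded identifiers.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.vectorforgeai.com/v1"


def build_delete_request(
    collection_id: str, identifier: str, api_key: str, team_token: str
) -> urllib.request.Request:
    """Prepare (but do not send) the DELETE request for one document."""
    # safe="" also encodes "/", so a URL-style identifier stays one path segment.
    path = (
        f"/collections/{quote(collection_id, safe='')}"
        f"/documents/{quote(identifier, safe='')}"
    )
    return urllib.request.Request(
        url=BASE_URL + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Team-Token": team_token,
        },
        method="DELETE",
    )
```

Since deletion cannot be undone, keeping request construction separate from sending makes it possible to log or review the exact URL before calling urllib.request.urlopen on it.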

Document Processing

When you add a document, VectorForgeAI automatically does two things:

Creates Problem-Solution pairs (New)

  1. Converts the content to Markdown
  2. Uses LLMs to generate one or more summaries, tags, problems, and solutions from the content
  3. Generates vector embeddings for each of the problem-solution pairs using our embedding model
  4. Chunks the document into smaller segments for optimal processing (100-200 words per chunk)
  5. Generates vector embeddings for each chunk using our embedding model
  6. Stores both the document metadata and embeddings for efficient retrieval
  7. Makes the document available for semantic search and AI-powered responses

Creates Chunks

  1. Converts the content to Markdown
  2. Chunks the document into smaller segments for optimal processing (100-200 words per chunk)
  3. Generates vector embeddings for each chunk using our embedding model
  4. Stores both the document metadata and embeddings for efficient retrieval
  5. Makes the document available for semantic search and AI-powered responses
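
To give a feel for what 100-200 words per chunk means in practice, here is a simplified word-count chunker in Python. It is only an illustration of the size range: it is not VectorForgeAI's actual chunking algorithm, which runs server-side and may well respect Markdown structure, and chunk_words is a hypothetical name.

```python
def chunk_words(text: str, max_words: int = 200, min_words: int = 100) -> list[str]:
    """Split text into chunks of at most max_words words.

    If the final chunk would fall below min_words, the last two chunks are
    rebalanced into two roughly equal halves so both stay within the range.
    Texts shorter than min_words are returned as a single short chunk.
    """
    words = text.split()
    chunks = [words[i:i + max_words] for i in range(0, len(words), max_words)]

    if len(chunks) > 1 and len(chunks[-1]) < min_words:
        tail = chunks[-2] + chunks[-1]
        middle = len(tail) // 2
        chunks[-2:] = [tail[:middle], tail[middle:]]

    return [" ".join(chunk) for chunk in chunks]
```

For example, a 450-word body would first split into 200, 200, and 50 words; the rebalancing step then turns the last two chunks into 125 words each.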

Best Practices

  • Document Size: While there's no hard limit on document body size, we recommend keeping documents focused on specific topics for better search relevance. We automatically convert HTML to Markdown.
  • Identifiers: Use consistent, meaningful identifiers to make document management easier. A document's URL also works as an identifier.
  • Metadata: Use metadata to add searchable attributes to your documents, such as authors, categories, and dates.
  • Organization: Group related documents in the same collection for better context during search and retrieval.
