# Embeddings

> Generate embeddings with OpenAI embedding models

Embeddings turn the `text` of each record into a numeric vector so that semantically similar content sits close together in vector space. `@ooneex/rag` generates embeddings with OpenAI models. You pick the model in your [vector database](/ai/rag/vector-database), and embedding then happens automatically on `add()` and `search()`.

## Choosing a model

Declare the provider and model in `getEmbeddingModel()`:

```typescript
public getEmbeddingModel = (): {
  provider: EmbeddingProviderType;
  model: EmbeddingModelType["model"];
} => ({ provider: "openai", model: "text-embedding-3-small" });
```

The provider is `openai`. The available models:

| Model                    | Dimensions | Notes                                              |
| ------------------------ | ---------- | -------------------------------------------------- |
| `text-embedding-3-small` | 1536       | Best price/performance — a strong default.         |
| `text-embedding-3-large` | 3072       | Highest quality; larger vectors, higher cost.      |
| `text-embedding-ada-002` | 1536       | Previous-generation model, kept for compatibility. |

<Tip>
  Start with `text-embedding-3-small`. Move to `text-embedding-3-large` only if retrieval quality on your data measurably improves — the larger vectors cost more to store and search.
</Tip>
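
To put the storage side of that trade-off in numbers, here is a rough sketch that estimates raw vector size from the dimensions in the table above. It assumes each component is stored as a 4-byte float32, which this page does not confirm, and it ignores index and metadata overhead. `estimateVectorBytes` is a hypothetical helper, not part of `@ooneex/rag`.

```typescript
// Dimensions are taken from the model table above.
const DIMENSIONS = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
  "text-embedding-ada-002": 1536,
} as const;

type ModelName = keyof typeof DIMENSIONS;

// Assumption: vectors are stored as float32 (4 bytes per component).
const BYTES_PER_FLOAT32 = 4;

// Raw vector bytes for `rows` records, excluding index and metadata overhead.
function estimateVectorBytes(model: ModelName, rows: number): number {
  return DIMENSIONS[model] * BYTES_PER_FLOAT32 * rows;
}

const rows = 1_000_000;
for (const model of Object.keys(DIMENSIONS) as ModelName[]) {
  const gib = estimateVectorBytes(model, rows) / 1024 ** 3;
  console.log(`${model}: ${gib.toFixed(2)} GiB for ${rows} rows`);
}
```

Under these assumptions, one million rows need roughly 5.7 GiB of raw vectors with `text-embedding-3-small` and roughly 11.4 GiB with `text-embedding-3-large`.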

## Configuration

Embeddings are generated through OpenAI, so set your API key in the environment:

```bash
OPENAI_API_KEY=sk-...
```

The key is read when embeddings are generated, so a missing or invalid key surfaces when you first `add()` or `search()`, not at construction time.
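
If you would rather fail at startup than on the first request, you can check for the key yourself. The sketch below is one way to do that. `assertOpenAIKey` is a hypothetical helper, not part of `@ooneex/rag`, and it only checks that the key is present; an invalid key will still surface on the first `add()` or `search()`.

```typescript
// Hypothetical helper, not part of @ooneex/rag.
// Takes the environment as a parameter so it is easy to test.
function assertOpenAIKey(env: Record<string, string | undefined>): string {
  const key = env.OPENAI_API_KEY?.trim();
  if (!key) {
    throw new Error("OPENAI_API_KEY is not set; embeddings cannot be generated.");
  }
  return key;
}

// Usage at application startup (Node.js or Bun):
// assertOpenAIKey(process.env);
```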

## How embedding works

You never compute or pass vectors yourself. When a table is created, the schema wires the embedding model into two roles:

* The **source field** is `text` — the column that gets embedded.
* The **vector field** is `vector` — where the generated embedding is stored and indexed.

From there:

1. On `add()`, the `text` of each record is sent to the embedding model and the resulting vector is stored on the row.
2. On `search()`, your query string is embedded with the **same** model and compared against stored vectors.

Stored vectors and query vectors are only comparable when they come from the same model, so the model is fixed for the life of a table.
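
To make the comparison step concrete, here is a minimal sketch that ranks stored rows against a query vector using cosine similarity. The three-dimensional vectors are toy values standing in for real embeddings, and cosine similarity is an illustrative choice: this page does not say which distance metric the library uses. `cosineSimilarity` is a hypothetical helper, not part of `@ooneex/rag`.

```typescript
// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    // Vectors from models with different dimensions cannot be compared.
    throw new Error("Vectors must have the same number of dimensions.");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy stand-ins for rows whose `text` was embedded on add().
const storedRows = [
  { text: "refund policy", vector: [0.9, 0.1, 0.0] },
  { text: "shipping times", vector: [0.1, 0.9, 0.2] },
];

// Toy stand-in for the embedded query string on search().
const queryVector = [0.8, 0.2, 0.1];

const ranked = storedRows
  .map((row) => ({ text: row.text, score: cosineSimilarity(queryVector, row.vector) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked[0].text); // prints "refund policy"
```

The length check shows one reason a table cannot mix models: a 1536-dimension vector cannot be compared with a 3072-dimension one. Even two models with the same dimensions place text in unrelated vector spaces, so their scores would not be meaningful.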

<Note>
  The embedding model is part of a table's schema. Changing `getEmbeddingModel()` after a table already exists does not re-embed existing rows. To switch models, create a fresh table, add your records to it so they are embedded with the new model, then switch over.
</Note>

## Where embeddings fit

Embeddings power the **vector** half of retrieval. At query time, the vector results are combined with a full-text search over the same `text`, and the two result lists are merged with a reciprocal rank fusion (RRF) reranker. See [Search](/ai/rag/search) for how the two halves come together.
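
As a rough illustration of the merge step, the sketch below implements plain reciprocal rank fusion over two ranked lists of document ids. It is not the library's implementation: `reciprocalRankFusion` is a hypothetical function, the ids are made up, and `k = 60` is a commonly used constant that may differ from what `@ooneex/rag` uses.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// where rank starts at 1. Assumption: k = 60, a commonly used default.
function reciprocalRankFusion(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up result lists, best match first.
const vectorHits = ["doc-a", "doc-b", "doc-c"];
const fullTextHits = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([vectorHits, fullTextHits]);
console.log(fused.map(([id]) => id)); // prints ["doc-b", "doc-a", "doc-d", "doc-c"]
```

Documents that appear in both lists (`doc-a` and `doc-b`) rise above those found by only one half, which is the behavior you want from hybrid retrieval.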
