# Vector Database

> Define and connect custom vector databases with LanceDB

A vector database describes *where* your data lives, *which* embedding model turns text into vectors, and *what* the schema looks like. You define one by extending `AbstractVectorDatabase`, then connect to it and open tables.

## Defining a database

Extend `AbstractVectorDatabase<DataType>` and implement three methods: `getDatabaseUri()`, `getEmbeddingModel()`, and `getSchema()`. The type parameter must include a `metadata` object, which carries your custom fields through records, filters, and results.

```typescript
import { AbstractVectorDatabase } from "@ooneex/rag";
import type {
  EmbeddingModelType,
  EmbeddingProviderType,
  FieldValueType,
} from "@ooneex/rag";
import { Utf8 } from "apache-arrow";

type ArticleData = {
  metadata: {
    title: string;
    category: string;
  };
};

class ArticleVectorDatabase extends AbstractVectorDatabase<ArticleData> {
  public getDatabaseUri = (): string => "./data/articles.lance";

  public getEmbeddingModel = (): {
    provider: EmbeddingProviderType;
    model: EmbeddingModelType["model"];
  } => ({ provider: "openai", model: "text-embedding-3-small" });

  public getSchema = (): { [K in keyof ArticleData]: FieldValueType } => ({
    metadata: new Utf8(),
  });
}
```

### `getDatabaseUri()`

Returns the URI where LanceDB stores its data — a local directory path (`./data/articles.lance`) or a remote/object-store URI supported by LanceDB.
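
Because the URI is a plain string, it can be computed rather than hard-coded. The sketch below picks the location from an environment-style map and falls back to the local path used above. `resolveDatabaseUri`, `DEFAULT_DATABASE_URI`, and the `ARTICLES_DB_URI` variable name are hypothetical; none of them is part of `@ooneex/rag`.

```typescript
// Hypothetical helper: choose the database URI from configuration.
// `ARTICLES_DB_URI` is an assumed variable name, not one the library reads.
type Env = { [name: string]: string | undefined };

const DEFAULT_DATABASE_URI = "./data/articles.lance";

function resolveDatabaseUri(env: Env): string {
  const configured = env["ARTICLES_DB_URI"];
  // Treat a missing or blank value as "not configured".
  if (configured === undefined || configured.trim() === "") {
    return DEFAULT_DATABASE_URI;
  }
  return configured.trim();
}
```

Inside the class, `getDatabaseUri` could then return `resolveDatabaseUri(process.env)`, which keeps the storage location out of the source code.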

### `getEmbeddingModel()`

Returns the embedding `provider` and `model`. The provider is `openai`; the model is one of the OpenAI embedding models. See [Embeddings](/ai/rag/embeddings) for the full list.

### `getSchema()`

Returns the schema for your custom columns, using [Apache Arrow](https://arrow.apache.org) data types. The `id`, `text`, and `vector` columns are added automatically — you only declare the rest. The accepted types are listed in [`FieldValueType`](#fieldvaluetype).

## Connecting

Call `connect()` before any table operation. It opens the LanceDB connection at the database URI.

```typescript
const db = new ArticleVectorDatabase();
await db.connect();
```

`getDatabase()` returns the underlying LanceDB `Connection`. It throws a `VectorDatabaseException` if you call it before `connect()`.

```typescript
const connection = db.getDatabase(); // throws if not connected
```
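
To make the ordering requirement concrete, here is a toy model of the guard described above. It is not the library's source: `ToyVectorDatabase` and `ToyConnection` are hypothetical stand-ins, the toy `connect()` is synchronous for brevity, and only the documented behaviour (throwing before `connect()`) is modeled.

```typescript
// Toy model of the documented guard; NOT the @ooneex/rag implementation.
type ToyConnection = { uri: string };

class ToyVectorDatabase {
  private readonly uri: string;
  private connection: ToyConnection | null = null;

  constructor(uri: string) {
    this.uri = uri;
  }

  // Stand-in for connect(): the real method opens a LanceDB connection.
  public connect(): void {
    this.connection = { uri: this.uri };
  }

  // Stand-in for getDatabase(): refuses to hand out a missing connection.
  public getDatabase(): ToyConnection {
    if (this.connection === null) {
      throw new Error("VECTOR_DB_NOT_CONNECTED");
    }
    return this.connection;
  }
}

const toyDb = new ToyVectorDatabase("./data/articles.lance");

// Calling getDatabase() first fails...
let failedBeforeConnect = false;
try {
  toyDb.getDatabase();
} catch {
  failedBeforeConnect = true;
}

// ...and succeeds once connect() has run.
toyDb.connect();
const toyConnection = toyDb.getDatabase();
```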

## Opening a table

`open()` returns a [`VectorTable`](/ai/rag/vector-table). If the table already exists it is opened as-is; if it does not exist, it is created with the schema and indexed automatically.

```typescript
const table = await db.open("articles");
```

On **creation**, three indexes are built so search is fast immediately:

| Column   | Index           | Purpose                                         |
| -------- | --------------- | ----------------------------------------------- |
| `id`     | btree           | Fast lookups by identifier.                     |
| `text`   | full-text (FTS) | Keyword half of hybrid search.                  |
| `vector` | IVF-PQ          | Approximate nearest-neighbor (semantic) search. |

### Options

| Option | Type                      | Default       | Description                                                                                                                        |
| ------ | ------------------------- | ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| `mode` | `"create" \| "overwrite"` | `"overwrite"` | How to create the table when it does not yet exist. `"create"` fails if it exists at the storage layer; `"overwrite"` replaces it. |

```typescript
const table = await db.open("articles", { mode: "create" });
```

<Note>
  The `mode` option only applies when the table is being created. If a table with that name already exists, `open()` returns the existing table and ignores `mode`.
</Note>
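
The interaction between an existing table and `mode` can be summarized with a small in-memory model. This is not the library's code: `ToyStore`, `ToyTable`, `openTable`, and the `created` flag are hypothetical, and only the rule stated in the note is modeled (an existing table is returned as-is and `mode` is ignored). Storage-layer failures for `"create"` are deliberately left out.

```typescript
// Toy model of the documented open() rule; NOT the @ooneex/rag implementation.
type OpenMode = "create" | "overwrite";
type ToyTable = { name: string; createdWith: OpenMode };
type ToyStore = { [name: string]: ToyTable | undefined };

function openTable(
  store: ToyStore,
  name: string,
  mode: OpenMode = "overwrite", // documented default
): { table: ToyTable; created: boolean } {
  const existing = store[name];
  if (existing !== undefined) {
    // Existing table: returned as-is, `mode` is ignored.
    return { table: existing, created: false };
  }
  // Missing table: this is the only path where `mode` matters.
  const table: ToyTable = { name, createdWith: mode };
  store[name] = table;
  return { table, created: true };
}

const toyStore: ToyStore = {};
const firstOpen = openTable(toyStore, "articles", "create");
const secondOpen = openTable(toyStore, "articles", "overwrite");
```

In this model the second call does not overwrite anything, because the table already exists by the time `mode` would be consulted.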

## Registering with the container

Use the `decorator.vectorDatabase()` decorator to register your database class with the DI container, then resolve it anywhere.

```typescript
import { AbstractVectorDatabase, decorator } from "@ooneex/rag";

@decorator.vectorDatabase()
class ArticleVectorDatabase extends AbstractVectorDatabase<ArticleData> {
  // ...
}
```

By default the class is registered as a singleton. Pass a scope to change that:

```typescript
import { EContainerScope } from "@ooneex/container";

@decorator.vectorDatabase(EContainerScope.Transient)
class ArticleVectorDatabase extends AbstractVectorDatabase<ArticleData> {
  // ...
}
```

Resolve it from the container:

```typescript
import { container } from "@ooneex/container";

const db = container.get(ArticleVectorDatabase);
await db.connect();
```
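
The practical difference between the two scopes is whether every resolution yields the same instance. The toy container below illustrates that idea only; `ToyContainer`, `ToyScope`, `register`, and the two placeholder classes are hypothetical and do not mirror the `@ooneex/container` API.

```typescript
// Toy container illustrating singleton vs transient scope.
// NOT the @ooneex/container implementation.
type ToyScope = "singleton" | "transient";
type Constructor<T> = new () => T;

class ToyContainer {
  private readonly scopes = new Map<Constructor<object>, ToyScope>();
  private readonly singletons = new Map<Constructor<object>, object>();

  public register<T extends object>(
    target: Constructor<T>,
    scope: ToyScope = "singleton",
  ): void {
    this.scopes.set(target, scope);
  }

  public get<T extends object>(target: Constructor<T>): T {
    const scope = this.scopes.get(target);
    if (scope === undefined) {
      throw new Error(`${target.name} is not registered`);
    }
    if (scope === "transient") {
      return new target(); // fresh instance on every call
    }
    // Singleton: create once, then reuse the cached instance.
    let instance = this.singletons.get(target) as T | undefined;
    if (instance === undefined) {
      instance = new target();
      this.singletons.set(target, instance);
    }
    return instance;
  }
}

class SharedDatabase {}
class PerUseDatabase {}

const toyContainer = new ToyContainer();
toyContainer.register(SharedDatabase); // default scope: singleton
toyContainer.register(PerUseDatabase, "transient");

const sameInstance =
  toyContainer.get(SharedDatabase) === toyContainer.get(SharedDatabase);
const freshInstance =
  toyContainer.get(PerUseDatabase) !== toyContainer.get(PerUseDatabase);
```

If the real container follows the same semantics, a singleton database only needs `connect()` once, while each transient instance has to be connected separately.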

## `FieldValueType`

Schema fields accept the following types. All are Apache Arrow data types except `EmbeddingFunction`:

`Null`, `Bool`, `Int8`–`Int64`, `Uint8`–`Uint64`, `Float16`–`Float64`, `Utf8`, `LargeUtf8`, `Binary`, `LargeBinary`, `Decimal`, `DateDay`, `DateMillisecond`, and `EmbeddingFunction`.

## Exceptions

`VectorDatabaseException` is thrown when a database operation fails — most commonly calling `getDatabase()` (directly or via `open()`) before `connect()`. It carries a machine-readable `key` (for example `VECTOR_DB_NOT_CONNECTED`) and a human-readable `message`.

```typescript
import { VectorDatabaseException } from "@ooneex/rag";

try {
  db.getDatabase();
} catch (error) {
  if (error instanceof VectorDatabaseException) {
    console.error(`[${error.key}] ${error.message}`);
  }
}
```
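
When one handler has to deal with several failure types, branching on `key` can be tidier than matching on the message text. The helper below is a sketch under stated assumptions: `isNotConnectedError` and `NOT_CONNECTED_KEY` are hypothetical names, the check is structural rather than `instanceof`-based so that it runs without the library, and the key value is the example quoted above and should be confirmed against the keys the library actually emits.

```typescript
// Hypothetical helper: detect the "not connected" failure by its key.
// The key string is taken from the example above; verify it before relying on it.
const NOT_CONNECTED_KEY = "VECTOR_DB_NOT_CONNECTED";

function isNotConnectedError(error: unknown): boolean {
  // Only objects can carry a `key` property.
  if (typeof error !== "object" || error === null) {
    return false;
  }
  const key = (error as { key?: unknown }).key;
  return key === NOT_CONNECTED_KEY;
}
```

A `catch` block could use this to call `connect()` and retry when the check passes, and rethrow every other error unchanged.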
