# PDF

> Generate, edit, split, and extract text and images from PDF files with page-level access

`@ooneex/pdf` is a toolkit for working with PDF documents on the file system. A single `PDF` class, pointed at a file path, lets you create documents, add pages, read and update metadata, split documents, remove pages, and extract text and images page by page. It builds on `pdf-lib`, `unpdf`, `pdf-to-img`, and `sharp`.

## Installation

Add the package to your project with Bun.

```bash theme={null}
bun add @ooneex/pdf
```

## Usage

Create a `PDF` instance with the path to a file. The same instance is used both to author new documents and to read or transform existing ones.

```typescript theme={null}
import { PDF } from "@ooneex/pdf";

// Create a new document with metadata, then add a page of text
const pdf = new PDF("/path/to/output.pdf");

await pdf.create({
  title: "My Document",
  author: "Jane Doe",
  keywords: ["example", "pdf"],
});

await pdf.addPage({ content: "Hello, World!", fontSize: 24 });
```

### Reading text content

Extract text from a single page with `getPageContent`, or stream every page with the `pagesToText` async generator.

```typescript theme={null}
const pdf = new PDF("/path/to/document.pdf");

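// Text of a single page
const pageText = await pdf.getPageContent(1);
console.log(pageText);

// Text of every page, yielded one page at a time
for await (const { page, text } of pdf.pagesToText()) {
  console.log(`Page ${page}: ${text}`);
}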
```
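
Because text arrives one page at a time, a search over a document can be written against the async iterable directly. The following is a minimal sketch, not part of `@ooneex/pdf`: `findPagesContaining`, `PageText`, and `samplePages` are hypothetical names, and the `{ page, text }` shape is taken from the loop above. A stand-in generator is used so the sketch runs without a PDF on disk.

```typescript
// Shape of the items consumed below, mirroring what the pagesToText() loop destructures.
type PageText = { page: number; text: string };

// Hypothetical helper (not part of @ooneex/pdf): collect the page numbers
// whose text contains `term`, ignoring case.
async function findPagesContaining(
  pages: AsyncIterable<PageText>,
  term: string,
): Promise<number[]> {
  const needle = term.toLowerCase();
  const matches: number[] = [];
  for await (const { page, text } of pages) {
    if (text.toLowerCase().includes(needle)) matches.push(page);
  }
  return matches;
}

// Stand-in for a real document: yields made-up pages in order.
async function* samplePages(): AsyncGenerator<PageText> {
  yield { page: 1, text: "Invoice 2024-001" };
  yield { page: 2, text: "Terms and conditions" };
  yield { page: 3, text: "Invoice total: 42.00" };
}
```

With a real file, passing `pdf.pagesToText()` in place of `samplePages()` should work the same way, assuming the generator yields the shape shown above.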

### Metadata and page count

`getMetadata` returns document properties such as the title, author, and page count; `updateMetadata` changes them.

```typescript theme={null}
const pdf = new PDF("/path/to/document.pdf");

const meta = await pdf.getMetadata();
console.log(meta.title, meta.author, meta.pageCount);

await pdf.updateMetadata({ title: "Updated Title", modificationDate: new Date() });
```

### Splitting and removing pages

`split` writes each range to its own file under `outputDir`; `removePages` rewrites the source file in place.

```typescript theme={null}
const pdf = new PDF("/path/to/document.pdf");

// Split into pages 1-3, page 5, and pages 7-10
for await (const part of pdf.split({
  outputDir: "/output",
  ranges: [[1, 3], 5, [7, 10]],
})) {
  console.log(part.path, part.pages);
}

// Remove individual pages and ranges (1-indexed)
const { remainingPages } = await pdf.removePages([1, [4, 6], 10]);
```
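
To make the range notation concrete, here is a small sketch of how a list such as `[1, [4, 6], 10]` maps to individual 1-indexed pages, with ranges treated as inclusive as in the comments above. `expandPageSpecs` and `PageSpec` are hypothetical names and not part of `@ooneex/pdf`; how the library itself handles overlapping or invalid entries is not documented here, so the deduplication and validation below are assumptions of the sketch.

```typescript
// A single page number, or an inclusive [start, end] range (both 1-indexed).
type PageSpec = number | [number, number];

// Hypothetical helper (not part of @ooneex/pdf): expand page specs into a
// sorted list of unique page numbers.
function expandPageSpecs(specs: PageSpec[]): number[] {
  const pages = new Set<number>();
  for (const spec of specs) {
    const [start, end]: [number, number] =
      typeof spec === "number" ? [spec, spec] : spec;
    // Assumption: reject non-integer, zero, negative, or reversed ranges.
    if (!Number.isInteger(start) || !Number.isInteger(end) || start < 1 || end < start) {
      throw new RangeError(`Invalid page spec: ${JSON.stringify(spec)}`);
    }
    for (let p = start; p <= end; p++) pages.add(p);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

console.log(expandPageSpecs([1, [4, 6], 10])); // [ 1, 4, 5, 6, 10 ]
```

Running a list through a helper like this before calling `removePages` is one way to check which pages a call would touch, since the source file is rewritten in place.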

### Converting pages and extracting images

Render pages to PNG images, or pull embedded images out of the document. Both save to disk and yield results as they go.

```typescript theme={null}
const pdf = new PDF("/path/to/document.pdf", { scale: 3 });

// Render every page to a PNG
for await (const { page, path } of pdf.pagesToImages({ outputDir: "/output" })) {
  console.log(`Rendered page ${page} to ${path}`);
}

// Extract embedded images from page 1
for await (const img of pdf.getImages({ outputDir: "/output", pageNumber: 1 })) {
  console.log(`${img.path} (${img.width}x${img.height})`);
}
```
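
When choosing a `scale`, it can help to estimate the size of the rendered PNG. The sketch below assumes that `scale` acts as a multiplier on the page size in PDF points (1 point = 1/72 inch), which is how pdf.js-style renderers commonly treat it; this is not confirmed for `@ooneex/pdf`. `estimateRenderSize` and `effectiveDpi` are hypothetical helpers, and the rounding is a guess.

```typescript
// Assumption: `scale` multiplies the page size in PDF points (1 pt = 1/72 inch).
// This is not confirmed for @ooneex/pdf.
const POINTS_PER_INCH = 72;

type Size = { width: number; height: number };

// Hypothetical helper: estimate PNG pixel dimensions for a page at a given scale.
function estimateRenderSize(pageInPoints: Size, scale: number): Size {
  if (!(scale > 0)) throw new RangeError("scale must be a positive number");
  return {
    width: Math.round(pageInPoints.width * scale),
    height: Math.round(pageInPoints.height * scale),
  };
}

// Hypothetical helper: effective resolution in dots per inch under the same assumption.
function effectiveDpi(scale: number): number {
  return POINTS_PER_INCH * scale;
}

const usLetter: Size = { width: 612, height: 792 }; // 8.5 x 11 inches in points
console.log(estimateRenderSize(usLetter, 3)); // { width: 1836, height: 2376 }
console.log(effectiveDpi(3)); // 216
```

Under that assumption, pixel count grows with the square of `scale`, so a lower value is likely enough for thumbnails.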

## When to use it

* Generating simple PDF documents on the server (reports, receipts, exports) with text and metadata.
* Extracting plain text from uploaded PDFs for search, indexing, or parsing.
* Splitting a large PDF into smaller files or removing unwanted pages.
* Rendering pages to PNG thumbnails/previews, or pulling embedded images out of a document.
* Opening encrypted PDFs, by passing a `password` in the constructor options.
* Not a fit for rich, pixel-perfect page layouts or HTML-to-PDF rendering; reach for a dedicated layout or print engine instead.
