@ooneex/html wraps Cheerio in a small, typed Html class for parsing HTML and pulling out structured data. Load markup from a string or a URL, then call focused extractors that return plain typed objects for images, links, headings, videos, and checkbox tasks. It’s built for scraping and content analysis rather than DOM mutation.
Installation
Install the package with Bun (bun add @ooneex/html).
Usage
Create an Html instance with a markup string (or empty), then query it. All extractors return typed arrays you can iterate directly.
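As a rough illustration of iterating an extractor's result, the sketch below walks the array returned by getImages and reports images without alt text. The ImageLike shape (src, alt) and the ImageSource interface are assumptions made for this example; the real types exported by @ooneex/html may differ.

```typescript
// Assumed shape of one extracted image; the real type exported by
// @ooneex/html may use different or additional fields.
interface ImageLike {
  src: string;
  alt?: string;
}

// Structural stand-in for the single Html method used in this sketch.
interface ImageSource {
  getImages(): ImageLike[];
}

// Return the src of every image whose alt text is missing or blank.
function imagesMissingAlt(doc: ImageSource): string[] {
  const missing: string[] = [];
  for (const image of doc.getImages()) {
    if (!image.alt || image.alt.trim() === "") {
      missing.push(image.src);
    }
  }
  return missing;
}
```

With the package installed, an Html instance created from a markup string could be passed as doc, provided its image objects match the assumed shape.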
Loading from a URL
Use loadUrl to fetch and parse a remote page. It returns the instance, so you can chain an extractor right away.
For markup you already hold as a string, use load, which returns this for chaining.
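The chaining pattern can be sketched against a small structural interface. ContentDoc below is a hypothetical stand-in for the Html class, not its real type: it assumes getContent returns a string and that loadUrl may hand back the instance either directly or through a Promise, so the helper awaits the result to cover both cases.

```typescript
// Hypothetical stand-in for the parts of Html used here. Whether loadUrl
// is synchronous or returns a Promise is an assumption; awaiting handles both.
interface ContentDoc {
  load(markup: string): ContentDoc;
  loadUrl(url: string): ContentDoc | Promise<ContentDoc>;
  getContent(): string;
}

// Fetch a page, then chain an extractor on the returned instance.
async function contentFromUrl(doc: ContentDoc, url: string): Promise<string> {
  const loaded = await doc.loadUrl(url);
  return loaded.getContent();
}

// Same idea for markup that is already in memory.
function contentFromString(doc: ContentDoc, markup: string): string {
  return doc.load(markup).getContent();
}
```

Because both loaders return the instance, the extractor call can follow in the same expression rather than needing a separate statement.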
Extracting videos and tasks
getVideos collects <video> elements with their attributes and nested <source> tags, and getTasks reads checkbox list items.
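A minimal sketch of consuming these two results follows. The TaskLike, SourceLike, and VideoLike shapes are assumptions chosen to mirror the description above (a checkbox state per task, nested sources per video); the field names in the real package may differ.

```typescript
// Assumed result shapes; not taken from the @ooneex/html type definitions.
interface TaskLike {
  text: string;
  checked: boolean;
}
interface SourceLike {
  src: string;
  type?: string;
}
interface VideoLike {
  src?: string;
  sources: SourceLike[];
}

// Count completed tasks and list the text of those still open.
function taskProgress(tasks: TaskLike[]): { done: number; total: number; remaining: string[] } {
  const remaining = tasks.filter((task) => !task.checked).map((task) => task.text);
  return { done: tasks.length - remaining.length, total: tasks.length, remaining };
}

// Collect every distinct URL from a video's own src and its nested sources.
function videoUrls(videos: VideoLike[]): string[] {
  const urls = new Set<string>();
  for (const video of videos) {
    if (video.src) urls.add(video.src);
    for (const source of video.sources) urls.add(source.src);
  }
  return Array.from(urls);
}
```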
To read the current markup back as a string, call getHtml().
When to use it
- Scraping or analyzing remote pages: fetch with loadUrl and pull out links, images, or headings.
- Extracting a table of contents or outline from rendered HTML via getHeadings().
- Collecting media references (getImages, getVideos) from user-supplied or fetched markup.
- Parsing checkbox task lists out of HTML (e.g. rendered Markdown) with getTasks().
- Grabbing the clean text content of a document with getContent().
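To make the table-of-contents case concrete, here is a small sketch that turns a heading list into an indented outline. It assumes each heading exposes a numeric level and a text field; HeadingLike is a hypothetical shape for illustration, not the package's own type.

```typescript
// Assumed heading shape: level 1..6 plus the heading text.
interface HeadingLike {
  level: number;
  text: string;
}

// Render headings as an indented bullet outline, with indentation
// relative to the shallowest heading level present.
function renderOutline(headings: HeadingLike[]): string {
  if (headings.length === 0) return "";
  const top = Math.min(...headings.map((heading) => heading.level));
  return headings
    .map((heading) => "  ".repeat(heading.level - top) + "- " + heading.text)
    .join("\n");
}
```

Indenting relative to the shallowest level keeps the outline flush left even when a fragment starts at h2 or h3 rather than h1.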