librarian

Changelog

1.7.1 - 2026-06-26

1.7.0 - 2026-06-26

1.6.1 - 2026-06-14

Relicensed to MIT, plus an OKF output mode in the Mac app and a small PDF cleanup. (The v1.6.0 tag was accidentally created on the v1.5.0 commit before this work merged; protected tags cannot be moved, so it is retained as inert history and superseded by 1.6.1.)

1.5.0 - 2026-06-14

Librarian can now emit a processed corpus as an Open Knowledge Format (OKF) v0.1 bundle, a vendor-neutral knowledge format readable by both agents and humans. This turns a pile of scanned PDFs, transcripts, and documents into a portable knowledge wiki that an agent can reason over.

1.4.0 - 2026-06-13

Makes the CLI fully scriptable, so an agent can drive bulk document processing end to end without scraping human-readable tables.

1.3.0 - 2026-06-13

Scanned and image-based PDFs now work in the Mac app, out of the box.

1.2.0 - 2026-06-12

Cleaned documents now come out of the pipeline shelf-ready: named, summarized, and tagged.

1.1.8 - 2026-06-11

Closes the last silent black hole. A stale preference combination — built-in engine disabled plus an empty external server address — could survive reinstalls indefinitely; the app started no engine, sent every file to an empty address, and showed nothing wrong.
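The failure mode described above comes down to a preference pair that is never validated when it is loaded. The following is a minimal Python sketch of such a guard. The `EnginePrefs` type, its field names, and the choice to fall back to the built-in engine are hypothetical illustrations, not Librarian's actual settings model or the fix that shipped.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EnginePrefs:
    """Hypothetical preference pair; the field names are assumptions."""
    use_builtin_engine: bool
    external_server_url: str


def repair_prefs(prefs: EnginePrefs) -> tuple[EnginePrefs, str | None]:
    """Return usable preferences, plus a warning if a repair was needed."""
    has_external = bool(prefs.external_server_url.strip())
    if prefs.use_builtin_engine or has_external:
        # At least one engine is reachable, so nothing needs fixing.
        return prefs, None
    # Built-in engine is off and no external server is configured:
    # every file would be sent to an empty address.
    # Assumed recovery: re-enable the built-in engine and say so.
    fixed = replace(prefs, use_builtin_engine=True)
    return fixed, "No engine was configured; re-enabled the built-in engine."
```

Returning the warning alongside the repaired value, rather than fixing the state silently, keeps the condition visible to the user.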

1.1.7 - 2026-06-11

Fixes “Couldn’t reach the AI provider” failures on Macs where the app could connect but the embedded engine could not. The app’s own networking follows macOS system settings; the bundled Python runtime does not. The app now bridges both into the engine’s environment at launch:

1.1.6 - 2026-06-11

Settings becomes connect-first and idiot-proof:

1.1.5 - 2026-06-11

The Mac app is redesigned around its real job — a pipeline, not a database browser. One window, one column, one verb: drop files, pick a destination, let it cook.

1.1.3 - 2026-06-11

First fully published release of the 1.1 line, containing all 1.1.0 and 1.1.1 changes below. This repository publishes immutable releases, so assets cannot be attached after publication; the release workflow now waits for the Mac app DMG builds, collects them as workflow artifacts, and includes them — checksummed alongside the engine artifacts — in a single atomic release creation. The v1.1.1 release published with engine artifacts only (its DMG attach step was rejected by release immutability) and is superseded by this version. The v1.1.0–v1.1.2 tags remain as inert history: v1.1.2 was accidentally created on the 1.1.1 commit before this release’s pipeline fix merged, and protected tags cannot be moved.
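Checksumming every artifact before a single release creation can be sketched as follows. This is an illustration only: the function names and the `sha256sum`-style manifest layout are assumptions, not the contents of the project's release workflow.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large DMGs never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(paths: list[Path]) -> str:
    """Build one manifest covering all artifacts, sorted for stable output."""
    lines = [f"{sha256_of(p)}  {p.name}" for p in sorted(paths)]
    return "\n".join(lines) + "\n"
```

Building the manifest only after all artifacts exist matches the constraint described above: with immutable releases, nothing can be added or corrected after publication.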

1.1.1 - 2026-06-11

Patch release on top of the unpublished 1.1.0. The Docker image build now upgrades base-layer packages, picking up Debian’s fix for CVE-2026-45447 (OpenSSL), which was published mid-release and blocked the image scan gate. The v1.1.0 GitHub release was never published: its tag hit a release-assembly race (fixed in this version’s workflows) and is retained as an inert tag. v1.1.1 is the first release with attached Mac app DMGs; all 1.1.0 changes below are included.

1.1.0 - 2026-06-11

Librarian 1.1.0 introduces the native macOS app: a self-contained download with the entire engine inside. Release builds bundle a relocatable Python runtime plus the Librarian wheel in Librarian.app, launch the backend automatically on a loopback port secured by a random per-launch API key, and store data in ~/Library/Application Support/Librarian. The app offers drag-and-drop ingest, live per-run progress with expandable run events, cleaned-output viewing with classification, full-text search, Markdown export, and a backend readiness checklist — all over the same public HTTP API the CLI uses. DMG installers for Apple Silicon and Intel are built by the new macapp.yml workflow and attached to releases, with optional Developer ID signing and notarization via repository secrets, plus a download landing page under site/.
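The pattern of a loopback port protected by a random per-launch API key can be sketched in Python using only the standard library. This shows the general technique under stated assumptions, not the app's actual launcher: the function names and the `X-API-Key` header name are hypothetical.

```python
from __future__ import annotations

import hmac
import secrets
import socket


def pick_loopback_port() -> int:
    """Ask the OS for a free TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        # Note: the port is released on close, so a real launcher must
        # tolerate another process taking it before the backend binds.
        return sock.getsockname()[1]


def new_launch_key() -> str:
    """Generate a fresh random key that is valid for this launch only."""
    return secrets.token_urlsafe(32)


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Constant-time comparison against the (hypothetical) X-API-Key header."""
    supplied = headers.get("X-API-Key", "")
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

Binding to `127.0.0.1` keeps the backend unreachable from other machines, and the per-launch key stops other local processes from calling it without the secret.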

Engine and tooling changes:

1.0.0 - 2026-05-22

Librarian 1.0.0 is the stable release of the local-first document ingestion, cleaning, classification, and search engine. It ships a focused user CLI and FastAPI service for converting documents, importing corpora, running provenance-rich LLM cleaning, classifying outputs with Dewey-style labels, searching SQLite FTS indexes, and exporting cleaned content with optional transcript citation evidence. The release supports Markdown, text-like files, DOCX, PDFs, OCR images, and SRT/VTT transcript normalization, including page-aware PDF extraction with durable OCR page manifests for long-running jobs. Operational commands are grouped under librarian admin, while evaluation and benchmark tools are grouped under librarian maintainer so the production surface stays clear. The release workflow keeps secret scanning, dependency audit, SBOM generation, checksums, artifact attestations, wheel smoke installation, Docker build, and image scanning, while removing alpha-era mock evidence artifacts from published releases.