Librarian uses hexagonal architecture so the core can run through a CLI, an API service, tests, or future workers without changing business logic.
Adapters
CLI: Typer
API: FastAPI
Mac app: SwiftUI client of the API (apps/macos); release builds embed the backend
Storage: SQLite repository and SQLite-backed content store
LLM: OpenAI-compatible, mock
Extraction: txt, md, csv, json, docx, pdf, OCR images
Application
IngestDocument
ProcessDocument
SearchLibrary
ExportDocument
Domain
Document
SourceFile
Chunk
ProcessingRun
CleanedOutput
Classification
Taxonomy
Ports
DocumentRepository
RunRepository
ContentStore
TextExtractor
LLMProvider
TaxonomyProvider
SearchIndex
EventSink
RunQueue
RunQueue is the durable job backend port. Queue adapters must support enqueue, claim, heartbeat,
complete, fail/retry, cancel, and paginated list operations so CLI/API operator views keep the same
shape when SQLite is replaced by a networked backend.
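As a rough illustration of what such a port could look like, here is a minimal sketch using typing.Protocol. The operation list comes from the paragraph above; every method name, parameter, and the Job and JobPage records are assumptions made for this example, not Librarian's actual interface.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Sequence, runtime_checkable


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative only.
    job_id: str
    run_id: str
    state: str  # e.g. "queued", "running", "done", "failed", "cancelled"
    attempts: int = 0


@dataclass(frozen=True)
class JobPage:
    # Hypothetical page shape shared by every queue adapter.
    items: Sequence[Job]
    next_cursor: Optional[str]  # None when there are no more pages


@runtime_checkable
class RunQueue(Protocol):
    """Assumed shape of the durable job backend port."""

    def enqueue(self, run_id: str) -> Job: ...

    def claim(self, worker_id: str, lease_seconds: float) -> Optional[Job]: ...

    def heartbeat(self, job_id: str, worker_id: str, lease_seconds: float) -> bool: ...

    def complete(self, job_id: str, worker_id: str) -> None: ...

    def fail(self, job_id: str, worker_id: str, error: str, retry: bool = True) -> None: ...

    def cancel(self, job_id: str) -> bool: ...

    def list_jobs(
        self, state: Optional[str] = None, cursor: Optional[str] = None, limit: int = 50
    ) -> JobPage: ...
```

Because operator views would depend only on Job and JobPage, a networked adapter could replace the SQLite one without changing what the CLI or API returns.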
The pipeline is a resumable DAG.
The default execution model should favor throughput:
fast: flat parallel chunk cleaning using the configured chunk overlap for boundary context.
balanced: parallel chunk groups with local carry-forward inside each group.
max-coherence: sequential carry-forward across the full document.
The default production mode is balanced, which keeps local context while preserving parallelism.
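One way to picture the three modes is as different groupings of chunk indexes: groups can run in parallel, while chunks inside a group run in order so each can carry context forward to the next. The sketch below is illustrative only; plan_groups and its group_size parameter are hypothetical names, not part of Librarian.

```python
from typing import List


def plan_groups(chunk_count: int, mode: str, group_size: int = 4) -> List[List[int]]:
    """Return groups of chunk indexes for a coherence mode.

    Groups are independent and may be cleaned in parallel. Chunks within
    one group are cleaned sequentially with carry-forward context.
    """
    indexes = list(range(chunk_count))
    if mode == "fast":
        # Every chunk is its own group: fully parallel, no carry-forward.
        return [[i] for i in indexes]
    if mode == "balanced":
        # Fixed-size groups: parallel across groups, sequential inside each.
        return [indexes[i:i + group_size] for i in range(0, chunk_count, group_size)]
    if mode == "max-coherence":
        # One group spanning the document: fully sequential.
        return [indexes] if indexes else []
    raise ValueError(f"unknown coherence mode: {mode!r}")
```

For example, plan_groups(5, "balanced", group_size=2) yields [[0, 1], [2, 3], [4]]: three units of parallel work, each with local context.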
The content store persists raw text, chunks, cleaned chunks, final outputs, and the FTS mirror in SQLite. This keeps the first release portable and easy to back up, but it duplicates large text payloads. A filesystem or object-store content adapter remains a future option for very large hosted deployments; SQLite is the supported 1.0 backend.
Search goes through the application-layer SearchIndex port. The default adapter is SQLite FTS over
cleaned and raw outputs, with snippets, facets, pagination, and filters. Results use BM25 ranking
with deterministic created-at and document-ID tie-breakers so pagination is stable. User queries are
normalized before MATCH so ordinary punctuation and hyphenated terms behave like word queries
instead of exposing raw FTS syntax. Future semantic or hybrid indexes should implement the same port
rather than changing API or CLI route code.
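The normalization and tie-breaking described above can be sketched as follows. This is one plausible approach, not Librarian's actual rules: normalize_query, Hit, and stable_order are hypothetical names, and ascending order on created_at is an assumption. Quoting each token makes FTS5 treat it as a string rather than as an operator such as AND, OR, or NEAR.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

_TOKEN = re.compile(r"\w+")


def normalize_query(user_query: str) -> str:
    """Turn free text into an FTS5 MATCH expression of quoted terms."""
    tokens = _TOKEN.findall(user_query)
    # Punctuation and hyphens are dropped by the tokenizer above, so
    # "state-of-the-art" becomes four plain word terms (implicit AND).
    return " ".join(f'"{token}"' for token in tokens)


@dataclass(frozen=True)
class Hit:
    document_id: str
    created_at: str  # ISO-8601 timestamp, so string order is time order
    score: float     # SQLite FTS5 bm25(): lower means a better match


def stable_order(hits: Iterable[Hit]) -> List[Hit]:
    """Sort by rank, then by created-at and document ID as tie-breakers."""
    return sorted(hits, key=lambda h: (h.score, h.created_at, h.document_id))
```

With a total order like this, two requests for adjacent pages cannot return the same row twice even when many documents share a BM25 score.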
Prompts live under src/librarian/prompts. Prompt text is versioned and recorded in run metadata.
The default cleaning prompt is cmos_v2, which preserves the prototype’s CMOS copy-editing intent
while adding explicit instructions for OCR cleanup, structure preservation, context-marker handling,
and chunk-local fidelity. cmos_v1 remains bundled so older run provenance and cache keys stay
resolvable. Classification prompts are versioned the same way with dewey_v1 and dewey_v2.
Startup settings reject prompt versions that are not bundled with the package.
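A startup check of this kind might look like the sketch below. The version names come from the text above; the BUNDLED_PROMPTS mapping and validate_prompt_version are hypothetical, and a real implementation would more likely discover bundled versions from the prompts directory than from a hard-coded table.

```python
from typing import Dict, FrozenSet

# Assumed registry of bundled prompt versions, keyed by prompt kind.
BUNDLED_PROMPTS: Dict[str, FrozenSet[str]] = {
    "cleaning": frozenset({"cmos_v1", "cmos_v2"}),
    "classification": frozenset({"dewey_v1", "dewey_v2"}),
}


def validate_prompt_version(kind: str, version: str) -> str:
    """Return the version unchanged, or raise if it is not bundled."""
    bundled = BUNDLED_PROMPTS.get(kind, frozenset())
    if version not in bundled:
        known = ", ".join(sorted(bundled)) or "none"
        raise ValueError(
            f"prompt version {version!r} is not bundled for {kind!r} (known: {known})"
        )
    return version
```

Failing at startup rather than at run time means a misconfigured deployment never records a run against a prompt whose text cannot be resolved later.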
SQLite schema changes live in src/librarian/storage/migrations and are applied in filename order. Applied versions are recorded in schema_migrations.
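A minimal migration runner along these lines could be written with the standard sqlite3 module. The sketch assumes migrations are plain .sql files and that schema_migrations stores the filename in a version column; both are guesses for illustration, and apply_migrations is a hypothetical name.

```python
import sqlite3
from pathlib import Path
from typing import List


def apply_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> List[str]:
    """Apply unapplied *.sql files in filename order; return what ran."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        " version TEXT PRIMARY KEY,"
        " applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}

    newly_applied: List[str] = []
    for path in sorted(migrations_dir.glob("*.sql"), key=lambda p: p.name):
        if path.name in applied:
            continue
        # executescript runs every statement in the file. The script and the
        # bookkeeping insert below are not one atomic unit in this sketch.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (path.name,))
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Zero-padded prefixes such as 001_ and 002_ keep filename order and intended order identical; running the function a second time applies nothing.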
The API submits processing work through an application-level job runner instead of FastAPI
BackgroundTasks. The default runner is bounded and in-process for local use. Production
deployments can set LIBRARIAN_JOB_BACKEND=sqlite and run librarian worker as a separate
process. The SQLite queue uses leases, retry backoff, attempt limits, and persisted state so API
processes can restart independently of workers.
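To make the lease and backoff ideas concrete, here is a small sketch built on the standard sqlite3 module. The jobs table, its columns, and the functions create_queue, enqueue, claim, and retry_delay are all assumptions for illustration, not Librarian's schema. A worker claims a job by writing a lease expiry; if the worker dies, the lease lapses and another worker can claim the job until the attempt limit is reached.

```python
import sqlite3
from typing import Optional


def retry_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Exponential backoff in seconds: base * 2**(attempt - 1), capped."""
    return min(cap, base * (2 ** max(0, attempt - 1)))


def create_queue(conn: sqlite3.Connection) -> None:
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            " id INTEGER PRIMARY KEY,"
            " run_id TEXT NOT NULL,"
            " state TEXT NOT NULL DEFAULT 'queued',"
            " attempts INTEGER NOT NULL DEFAULT 0,"
            " available_at REAL NOT NULL DEFAULT 0,"
            " lease_owner TEXT,"
            " lease_expires_at REAL)"
        )


def enqueue(conn: sqlite3.Connection, run_id: str) -> int:
    with conn:  # commits on success
        return conn.execute("INSERT INTO jobs (run_id) VALUES (?)", (run_id,)).lastrowid


def claim(conn: sqlite3.Connection, worker_id: str, now: float,
          lease_seconds: float = 30.0, max_attempts: int = 3) -> Optional[int]:
    """Claim the oldest runnable job, or return None if nothing is available."""
    # BEGIN IMMEDIATE takes the write lock up front so that two workers
    # cannot both select the same row before either updates it.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id FROM jobs WHERE attempts < ? AND ("
            " (state = 'queued' AND available_at <= ?)"
            " OR (state = 'running' AND lease_expires_at <= ?))"
            " ORDER BY id LIMIT 1",
            (max_attempts, now, now),
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET state = 'running', attempts = attempts + 1,"
                " lease_owner = ?, lease_expires_at = ? WHERE id = ?",
                (worker_id, now + lease_seconds, row[0]),
            )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return None if row is None else row[0]
```

A failed job would be returned to the queued state with available_at set to now + retry_delay(attempts), which spaces retries out without blocking other work.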
Run events can be fetched as JSON or streamed over server-sent events.
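For readers unfamiliar with the wire format, a server-sent event is a few text fields followed by a blank line. The sketch below formats one run event as an SSE frame; the function name sse_frame, the event type run_event, and the event fields are hypothetical, since the actual event schema is not shown here.

```python
import json
from typing import Any, List, Mapping


def sse_frame(event: Mapping[str, Any], event_type: str = "run_event") -> str:
    """Serialize one event as a server-sent events frame."""
    lines: List[str] = []
    if "id" in event:
        # The id field lets a reconnecting client resume via Last-Event-ID.
        lines.append(f"id: {event['id']}")
    lines.append(f"event: {event_type}")
    # json.dumps escapes newlines, so the payload fits on one data line.
    lines.append(f"data: {json.dumps(event, separators=(',', ':'))}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"
```

The same dictionary can be returned unchanged from the JSON endpoint, so both transports would carry one event shape.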
The librarian maintainer benchmark command uses deterministic synthetic text and the configured cleaner to measure chunking and cleaning throughput. This is the baseline harness for comparing chunking policies, coherence modes, providers, and concurrency settings.
The librarian maintainer eval command runs JSON eval suites against the configured chunking,
prompt, and provider stack. Evals are intentionally file-based so contributors can add sanitized
cases without coupling the harness to private corpora. Operational tuning guidance lives in
docs/OPERATIONS.md.