Almanak

An end-to-end digitization studio that takes an input document (a PDF or a folder of images) and produces an accurate transcription with rich Markdown-based annotation. It handles preprocessing, layout analysis, OCR, and transcription. Users can choose any locally installed Ollama LLM, Kraken OCR models, Apple's Vision framework, or Tesseract, and compare results from up to four models simultaneously. Designed to replace OCR4all or Transkribus in most workflows.
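
As a rough illustration of the multi-model comparison described above, the sketch below runs several transcription backends on the same page in parallel and scores how closely their outputs agree. It is not Almanak's actual code: the `Backend` type, `compare_backends`, `pairwise_agreement`, and the stub backends are all hypothetical names, and a real backend would wrap an Ollama model, Kraken, Apple's Vision framework, or Tesseract behind the same callable interface.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, Dict, Tuple

MAX_MODELS = 4  # the description mentions comparing up to four models

# Hypothetical interface: a backend maps a page image path to transcribed text.
Backend = Callable[[str], str]


def compare_backends(image_path: str, backends: Dict[str, Backend]) -> Dict[str, str]:
    """Run every backend on the same page concurrently and collect the text."""
    if not backends:
        raise ValueError("at least one backend is required")
    if len(backends) > MAX_MODELS:
        raise ValueError(f"at most {MAX_MODELS} backends can be compared")
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, image_path) for name, fn in backends.items()}
        return {name: future.result() for name, future in futures.items()}


def pairwise_agreement(results: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Character-level similarity (0.0 to 1.0) for each pair of transcriptions."""
    return {
        (a, b): SequenceMatcher(None, results[a], results[b]).ratio()
        for a, b in combinations(results, 2)
    }


if __name__ == "__main__":
    # Stub backends standing in for real OCR engines (assumption for the demo).
    stubs: Dict[str, Backend] = {
        "engine_a": lambda path: "The quick brown fox",
        "engine_b": lambda path: "The quick brown fox",
        "engine_c": lambda path: "The qu1ck brown f0x",
    }
    texts = compare_backends("page_001.png", stubs)
    for pair, score in pairwise_agreement(texts).items():
        print(pair, round(score, 3))
```

Low agreement between two engines on a page is a cheap signal that the page deserves manual review; a thread pool is used here on the assumption that real backends spend most of their time waiting on external processes or local model servers.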

Status: In development