Digital Projects
Tools and applications built to solve specific research problems. Most emphasise privacy, local processing, and researcher control over data. They are designed around researchers' actual workflows and needs.
Open-Source Software
Open-NotebookLM-Ollama ↗
A local, privacy-focused fork of open-notebooklm that generates podcast-style conversations from documents using Ollama instead of paid APIs. Features include local LLM support, focus areas, deep discussion mode, and extended dialogues.
Technologies: Python
Facet Manager ↗
An OpenRefine extension (Java and JavaScript) that allows users to import and export Facets to/from an OpenRefine project, bypassing the limited built-in Permalink functionality.
Technologies: Java, JavaScript
Web Projects
VERITRACE Research Platform ↗
A Flask-based web application with an Elasticsearch backend that enables sophisticated searches across a corpus of more than 413,000 digital texts in 6 languages spanning over 200 years. Sentence-level vector embeddings power the Text Matching tool.
Technologies: Python, Flask, JavaScript, Elasticsearch
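As a rough illustration of how sentence-level embeddings can drive text matching, the sketch below ranks stored sentences by cosine similarity to a query vector. It is not the platform's code: the function names, sentence ids, and three-dimensional toy vectors are all hypothetical, and a real deployment would use model-generated embeddings and an index rather than a linear scan.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_match(query_vec: list[float], sentence_vecs: dict[str, list[float]]) -> str:
    """Return the id of the stored sentence whose vector is closest to the query."""
    return max(sentence_vecs, key=lambda sid: cosine(query_vec, sentence_vecs[sid]))


# Hypothetical toy embeddings; real sentence vectors have hundreds of dimensions.
sentence_vecs = {
    "s1": [1.0, 0.0, 0.0],
    "s2": [0.6, 0.8, 0.0],
    "s3": [0.0, 0.0, 1.0],
}
query_vec = [0.5, 0.9, 0.0]

print(best_match(query_vec, sentence_vecs))
```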
williamcullen.org ↗
Digital archive of 18th-century writings with semantic entity tagging (people, institutions, works, places), full-text search, and analytics dashboards. Features William Cullen's published works and clinical correspondence, with occasional additions from contemporaries such as Joseph Black, David Hume, and Lord Kames.
Technologies: SvelteKit, JavaScript, Tailwind CSS
macOS Applications
Native applications built with Swift/SwiftUI
Almanak →
An end-to-end digitization studio, from input document (PDF or image folder) to accurate transcription with sophisticated markdown-based annotation. Handles preprocessing, layout analysis, OCR, and transcription. Users can choose any locally installed Ollama LLM, Kraken OCR models, Apple's Vision framework, or Tesseract, and compare results from up to four models simultaneously. Designed to replace OCR4all or Transkribus in most workflows.
Status: In development
Scribe [see the video walkthrough] →
Built on a Knowledge Graph, Scribe rethinks the historical research workflow from the ground up. It is designed around how historians (and humanities researchers more generally) actually work, write, and conduct research, rather than how they do so in theory. Planning and Search tools are built into the application; a Triage folder supports directed, uninterrupted research; notes stay connected to the citations they reference and are shown alongside project documents; a database-backed reference manager can handle tens of thousands of citations; and a word processing workflow exports to Microsoft Word whilst automatically formatting citations and bibliographies in the chosen citation style. Under the hood, analytics are calculated automatically from the Knowledge Graph.
Status: In development
Quire (open source) [see the video walkthrough] →
A paleographic analysis tool for historians and researchers working with manuscript handwriting. Combines computer vision techniques (SIFT, FAST) with intuitive interfaces for analyzing, comparing, and managing handwriting samples, including creating and comparing handwriting 'profiles'.
Status: In development
Proso (open source) →
A biographical research tool that streamlines prosopographical work by accessing modern authority control systems (VIAF, LOC, Wikidata) and aggregating narrative biographical content from authoritative sources (Wikisource, Wikipedia, Deutsche Biographie, BnF). Users enter names and the app helps identify the correct name authority and automatically downloads biographical data.
Status: In development
Variorum (coming soon) →
Variorum is a native macOS app that transforms the creation of critical editions from tedious manual labor into intelligent scholarly analysis. Designed for textual scholars working with multiple editions of the same text, Variorum allows you to import page images and transcriptions, refine them side-by-side, and then automatically align corresponding passages across witnesses—even when pagination, chapter numbering, or organization differs. The app detects all textual variants, helps you classify their significance, and generates properly formatted critical apparatus in traditional notation, reducing months of variant-hunting and manual formatting to days of focused editorial work. Export your finished edition as publication-ready markdown or structured JSON for digital humanities projects, maintaining a single authoritative source for both print and digital outputs.
Status: Early development
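To make the idea of variant detection concrete, here is a minimal sketch that compares two witnesses word by word using Python's standard `difflib`. It is an illustration only and not Variorum's method: the function name and sample sentences are hypothetical, and real collation also has to handle spelling normalisation, transpositions, and more than two witnesses.

```python
from difflib import SequenceMatcher


def find_variants(base: str, witness: str) -> list[tuple[str, str, str]]:
    """Return (kind, base_reading, witness_reading) for each point of difference.

    kind is one of difflib's opcode tags: "replace", "delete", or "insert".
    """
    a, b = base.split(), witness.split()
    # autojunk=False keeps frequent words (e.g. "the") from being ignored.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        variants.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants


# Hypothetical sample readings, not taken from any real edition.
base = "the quick brown fox jumps over the lazy dog"
witness = "the quick red fox leaps over the dog"

for kind, base_reading, witness_reading in find_variants(base, witness):
    print(kind, repr(base_reading), "->", repr(witness_reading))
```

Word-level comparison keeps each reported variant close to what an editor would record as a lemma and its reading; a character-level diff would fragment them.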
Postmark (coming soon) →
Postmark is a native macOS app for scholars managing large-scale correspondence projects. Rather than focusing on transcription, Postmark helps you build and analyze systematic databases of letters—tracking who wrote to whom, when, from where, and about what. Import existing spreadsheets or start fresh, creating rich records for each letter (dates, locations, repositories, transcription status) and each correspondent (biographical data, relationships, roles). Visualize correspondence networks with interactive graphs, plot letters geographically and chronologically, identify gaps in collections, and generate proper scholarly citations. Whether you're editing a complete papers project or studying an epistolary network, Postmark transforms scattered metadata into a queryable, analyzable research database—helping you see patterns, connections, and gaps that would remain invisible in spreadsheets alone.
Status: Early development
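The kind of query a letter database makes possible can be sketched in a few lines. The example below counts letters per sender-recipient pair (the edge weights of a correspondence network) and lists years with no surviving letters. The `Letter` record, its fields, and the placeholder correspondents "A", "B", and "C" are assumptions for illustration and do not reflect Postmark's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Letter:
    """Hypothetical minimal letter record: who wrote to whom, and when."""
    sender: str
    recipient: str
    year: int


def edge_weights(letters: list[Letter]) -> Counter:
    """Count letters for each directed (sender, recipient) pair."""
    return Counter((letter.sender, letter.recipient) for letter in letters)


def silent_years(letters: list[Letter], start: int, end: int) -> list[int]:
    """Years in [start, end] for which the collection holds no letters."""
    covered = {letter.year for letter in letters}
    return [year for year in range(start, end + 1) if year not in covered]


# Placeholder data, not drawn from any real correspondence.
letters = [
    Letter("A", "B", 1770),
    Letter("A", "B", 1771),
    Letter("B", "A", 1771),
    Letter("C", "A", 1774),
]

print(edge_weights(letters).most_common())
print(silent_years(letters, 1770, 1774))
```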
Analect (coming soon) →
Analect is a macOS app for searching and navigating your research document collection. It indexes PDFs, markdown files, text documents, and more, and provides a dual-mode search engine that combines traditional keyword matching with semantic search, letting you find documents by meaning and context, not just exact words. Whether you search for "machine learning algorithms" or "how computers learn from data," Analect surfaces relevant documents, shows results with highlighted context snippets, and opens files with proper rendering (native PDF viewing or formatted markdown) in a clean three-pane interface with native macOS integration. Instead of hunting through folders or relying on vague filename searches, you can search your entire document library by keyword and by meaning at once, which makes Analect well suited to researchers, writers, students, and anyone managing a large collection of reference materials.
Status: Early development
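One common way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). The sketch below shows that technique in general terms; whether Analect combines its two search modes this way is not stated here, and the function name, file names, and the constant `k = 60` (a conventional default) are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by both searches rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical result lists from the two search modes.
keyword_hits = ["notes.md", "paper.pdf", "draft.txt"]
semantic_hits = ["paper.pdf", "survey.pdf", "notes.md"]

print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

RRF uses only rank positions, so it avoids having to put keyword scores and embedding similarities on a common scale.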
View source code and additional projects on GitHub ↗