Python pipeline that automatically transforms any PDF into a fully PDF/UA-1 compliant, accessible document — passing PAC and veraPDF validation out of the box.
Engineer — Architecture, Pipeline Design, PDF Structure Implementation, Validation
PDFs distributed by organizations are rarely accessible to screen readers or other assistive technologies. Manual remediation is expensive, slow, and requires specialist knowledge of PDF structure standards, which creates a compliance bottleneck for legal, educational, and government documents.
Built a 4-stage Python pipeline that automatically injects structure tags, embeds fonts, writes XMP metadata, and wires link annotations to produce PDF/UA-1 compliant documents. Includes both a CLI and a Streamlit web UI. Output is validated with PAC and veraPDF, and the pipeline addresses 8 Matterhorn Protocol checkpoints.
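As a rough illustration of the four-stage flow described above, the sketch below chains four stage functions over a small state object. Everything in it is an assumption made for illustration: `DocState`, the stage function names, and the execution order (taken from the order in which the steps are listed) are hypothetical. The stage bodies only record that they ran and do not touch a real PDF.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class DocState:
    """Hypothetical stand-in for a PDF moving through the pipeline."""
    tagged: bool = False
    fonts_embedded: bool = False
    xmp_written: bool = False
    links_wired: bool = False
    log: List[str] = field(default_factory=list)


# A stage takes the document state and returns the updated state.
Stage = Callable[[DocState], DocState]


def inject_structure_tags(doc: DocState) -> DocState:
    # Placeholder: a real stage would build the structure tree here.
    doc.tagged = True
    doc.log.append("tags")
    return doc


def embed_fonts(doc: DocState) -> DocState:
    # Placeholder: a real stage would embed any non-embedded fonts.
    doc.fonts_embedded = True
    doc.log.append("fonts")
    return doc


def write_xmp_metadata(doc: DocState) -> DocState:
    # Placeholder: a real stage would write the XMP metadata packet.
    doc.xmp_written = True
    doc.log.append("xmp")
    return doc


def wire_link_annotations(doc: DocState) -> DocState:
    # Placeholder: a real stage would connect link annotations to tags.
    doc.links_wired = True
    doc.log.append("links")
    return doc


# Assumed order: the same order in which the steps are listed in the text.
STAGES: Sequence[Stage] = (
    inject_structure_tags,
    embed_fonts,
    write_xmp_metadata,
    wire_link_annotations,
)


def run_pipeline(doc: DocState, stages: Sequence[Stage] = STAGES) -> DocState:
    """Apply each stage in turn, passing the state along."""
    for stage in stages:
        doc = stage(doc)
    return doc


def is_remediated(doc: DocState) -> bool:
    """True only when every stage has marked its work as done."""
    return (doc.tagged and doc.fonts_embedded
            and doc.xmp_written and doc.links_wired)


if __name__ == "__main__":
    result = run_pipeline(DocState())
    print(result.log, is_remediated(result))
```

Keeping each stage as a plain function over a shared state object makes it possible to run or test a single stage in isolation. Both a CLI and a web UI could call the same `run_pipeline` entry point.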