Producing Technical Documents as Tagged PDFs – Accessibility in LaTeX

LaTeX's PDF tagging features are now becoming usable.

Accessibility with LaTeX

The problem

Ensuring the accessibility of the scientific and technical documents that we rely on in our research and teaching poses a significant challenge: many of these documents are produced using LaTeX, the format of choice for equation-heavy documents. Unfortunately, LaTeX does not yet support tagged PDFs natively.

Preparatory Steps

However, the required features are now ready to be used, though still only experimentally. You can achieve PDF tagging by loading some specialised LaTeX packages, which requires only a few extra tweaks:

  • make sure to use a recent LaTeX installation, as older versions do not support tagged PDF at all. The TeX Live 2024 distribution and later are appropriate.
  • update your TeX distribution so that you get the latest versions of all packages
  • make sure your LaTeX file is clean and follows coding conventions closely. You can check this using tools such as ‘chktex’
  • not all TeX functionality is currently supported for tagged PDF output, so it’s best to stick to simple packages and classes
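
One way to check the first two points from within a document is to ask the LaTeX format for its release date. The lines below are a minimal sketch, not an official test: \IfFormatAtLeastTF and \fmtversion are standard kernel commands, but the cutoff date 2023-11-01 is an assumption chosen for illustration and may not match what your packages actually need.

```latex
% Place these lines at the very top of the file, before \DocumentMetadata.
% Assumption: 2023-11-01 is used as an illustrative cutoff date only.
\IfFormatAtLeastTF{2023-11-01}
  {\typeout{LaTeX format \fmtversion: recent enough to try tagging.}}
  {\typeout{LaTeX format \fmtversion: probably too old, please update.}}
```

The message appears in the terminal output and in the .log file. On a format too old to know \IfFormatAtLeastTF, the run stops with an "undefined control sequence" error, which is itself a sign that an update is needed.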

Concrete Implementation

Once those conditions are met, you need to add a few things to the preamble of your LaTeX file to activate the functionality of the “pdfmanagement-testphase” package. Unlike most packages, this one requires a few lines before the \documentclass command. (For full documentation of the package, see its CTAN page.)

To satisfy the requirements of screen readers, you also need to include some document metadata, such as the title, author, and language. In the example below, the language is set via the \DocumentMetadata command, while the title, author, keywords, and subject are set via the \hypersetup entries of the hyperref package:

\RequirePackage{pdfmanagement-testphase} % Enable PDF management for tagging
\DocumentMetadata{
testphase = phase-II, % Use Phase-II for intermediate level tagging (Phase-III features are more comprehensive, but not fully functional currently)
pdfversion = 2.0, % Set the PDF version to 2.0 for better compatibility
lang=en-GB % set language for the document (British English)
}

\documentclass[11pt]{article}

\usepackage{amsmath} % AMS Math Package

\usepackage{hyperref}
\hypersetup{
pdftitle={Your title goes here}, % Title of the PDF
pdfauthor={Gunnar Moeller}, % Author
pdfkeywords={some topics maybe},% Keywords for the PDF
pdfsubject={What this is all about}, % Subject or description
}

\usepackage{tagpdf} % For tagged PDFs

% Activate tagging and PDF resource management
\tagpdfsetup{activate-all,uncompress}

… remainder of your header and the usual

\begin{document} … \end{document}

That’s all. Some LaTeX environments may use syntax that the tagging code cannot yet handle, which leads to compilation errors. Most of these errors are not serious, though, and you can tell LaTeX to ignore them and still produce a tagged PDF by responding “R” at each error prompt.
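
If you would rather not answer prompts by hand, the same mode can be selected from within the file. This is a minimal sketch, assuming you are happy for the run to continue past every error: \nonstopmode is the TeX primitive for the mode that answering “R” switches to.

```latex
% Very first line of the file, before \RequirePackage and \DocumentMetadata.
% Same mode as answering "R" at an error prompt: errors are reported,
% but TeX does not stop to ask what to do.
\nonstopmode
```

Errors are still written to the .log file, so it is worth checking it after the run. Errors that need input from the terminal, such as a missing file, still abort the run in this mode.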

In terms of LaTeX engines, lualatex is advertised as the most compatible with “pdfmanagement-testphase”, but in practice I found that the more common pdflatex engine works fine as well.
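
To compare the two engines yourself, a small test file is enough. The sketch below is a shortened variant of the preamble above and makes two assumptions: that your LaTeX format is recent enough (2022-06-01 or later) to provide \DocumentMetadata itself, so the \RequirePackage line is left out, and that testphase = phase-II already loads and activates tagging, so tagpdf is not loaded explicitly. The title and author are placeholders.

```latex
% Engine comparison sketch; compile once with lualatex and once with pdflatex.
% Assumption: the LaTeX kernel provides \DocumentMetadata (2022-06-01 or later).
\DocumentMetadata{
  testphase = phase-II, % intermediate level tagging, as in the text above
  pdfversion = 2.0,
  lang = en-GB
}
\documentclass{article}
\usepackage{iftex}    % provides the \ifluatex test
\usepackage{hyperref}
\hypersetup{
  pdftitle={Engine test},   % placeholder title
  pdfauthor={A. N. Author}, % placeholder author
}
\begin{document}
\section{Engine test}
% Print which engine produced this PDF.
\ifluatex
  This file was compiled with LuaLaTeX.
\else
  This file was compiled with an engine other than LuaLaTeX.
\fi
\end{document}
```

If this variant fails on your installation, the longer preamble shown earlier, with the explicit \RequirePackage and tagpdf lines, is the one to fall back on.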

A complete example

A live version of an example is available to view and compile on Overleaf (see overleaf project). For convenience, the LaTeX code of the test document is also given below; compiling it produces this DummyTaggedPDF:

% !TEX encoding = UTF-8 Unicode
\RequirePackage{pdfmanagement-testphase}
\DocumentMetadata{
testphase = phase-II, % Use Phase-II for intermediate level tagging
pdfversion = 2.0, % Set the PDF version to 2.0 for better compatibility
lang=en-GB % set language for the document (British English)
}

\documentclass{article}
\usepackage{graphicx} % provides \includegraphics, used in the figure example
\usepackage{hyperref}
\hypersetup{
pdftitle={A sample document supporting tagged pdf}, % Title of the PDF
pdfauthor={Dr Gunnar Möller, PQM Group, Physics and Astronomy, University of Kent}, % Author
pdfkeywords={how to make tagged latex},% Keywords for the PDF
pdfsubject={trying out tagged pdf}, % Subject or description
}

\title{Sample Document supporting tagged pdf}
\author{Dr Gunnar Möller, PQM Group, Physics and Astronomy, University of Kent}
\date{\today}

\begin{document}

\maketitle

\section{Introduction}
This section should be recognized as a heading in the PDF structure.

\subsection{Background}
This is a subsection that should also be tagged.

\subsubsection{Details}
This is a subsubsection.

\section{Compilation}

You can compile this document with an up-to-date \LaTeX{} distribution. \verb|lualatex| is the recommended engine, but \verb|pdflatex| can also handle most situations.

\section{Notes}

\begin{itemize}
\item For full compatibility, you should use UTF-8 encoding for the source file.
\end{itemize}

It is also possible to tag figures using the commands of the \verb|tagpdf| package:
% Example of adding alt-text to an image
\tagstructbegin{tag=Figure, alttext={This is a random smiley face.}}
\tagmcbegin{tag=Figure}
\includegraphics[width=0.2\textwidth]{a-smiley-face.png} % Your image here
\tagmcend
\tagstructend

\end{document}