Pandoc, Pagefind and Make
Recently I've refreshed my approach to website generation using three programs plus a dedicated Makefile.
- Pandoc
- Pagefind for providing a full text search of documentation
- GNU Make
- website.mak Makefile
Pandoc does the heavy lifting. It renders all the HTML pages, a CITATION.cff (from the project's codemeta.json) and an about.md file (also from the project's codemeta.json). This is done with three Pandoc templates. Pandoc can also be used to render man pages following a simple page recipe.
I've recently adopted Pagefind for indexing the HTML of the project's website and providing a full text search UI suitable for a static website. The Pagefind indexes can be combined with your group or organization's static website to provide a rich cross-project search (exercise left for another post).
Finally I orchestrate the site construction with GNU Make. I do this with a simple dedicated Makefile called website.mak.
website.mak
The website.mak file is relatively simple.
#
# Makefile for running pandoc on all Markdown docs ending in .md
#
PROJECT = PROJECT_NAME_GOES_HERE
# Path to the pandoc executable (empty when pandoc is not installed)
PANDOC = $(shell which pandoc)
# All Markdown pages plus the generated about.md (sort also removes duplicates)
MD_PAGES = $(sort $(wildcard *.md) about.md)
# One HTML page per Markdown page, including about.html
HTML_PAGES = $(MD_PAGES:.md=.html)
build: $(HTML_PAGES) $(MD_PAGES) pagefind
# Rename the JSON-LD "@" fields, then render about.md from the temporary copy
about.md: .FORCE
	cat codemeta.json | sed -E 's/"@context"/"at__context"/g;s/"@type"/"at__type"/g;s/"@id"/"at__id"/g' >_codemeta.json
	if [ -f "$(PANDOC)" ]; then echo "" | pandoc --metadata title="About $(PROJECT)" --metadata-file=_codemeta.json --template codemeta-md.tmpl >about.md; fi
	if [ -f _codemeta.json ]; then rm _codemeta.json; fi
$(HTML_PAGES): $(MD_PAGES) .FORCE
	pandoc -s --to html5 $(basename $@).md -o $(basename $@).html \
		--metadata title="$(PROJECT) - $@" \
		--lua-filter=links-to-html.lua \
		--template=page.tmpl
	git add $(basename $@).html
pagefind: .FORCE
	pagefind --verbose --exclude-selectors="nav,header,footer" --bundle-dir ./pagefind --source .
	git add pagefind
clean:
	@if [ -f index.html ]; then rm *.html; fi
	@if [ -f README.html ]; then rm *.html; fi
.FORCE:
Only the "PROJECT" value needs to be set. Typically this is just the name of the repository's base directory.
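Since PROJECT is usually just the directory name, it can also be supplied on the command line instead of editing the file. Here is a minimal sketch, assuming GNU Make (where a variable given on the command line overrides the assignment in the Makefile). The project_name helper is hypothetical and not part of website.mak.

```shell
#!/bin/sh
# Hypothetical helper (not part of website.mak): derive a project name
# from a directory path, defaulting to the current working directory.
project_name() {
  basename "${1:-$PWD}"
}

# Example usage, run from the repository root. It needs GNU Make, Pandoc
# and Pagefind installed, so it is left commented out here:
#   make -f website.mak PROJECT="$(project_name)"
```

Passing the value this way keeps website.mak identical across repositories, which may be convenient if you copy it between projects.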
Pandoc, filters and templates
When I write my Markdown documents I link to Markdown files instead of the HTML versions. This serves two purposes. First, GitHub can use these links directly. Second, if you decide to repurpose the website as a Gopher or Gemini resource, linking to the Markdown files makes more sense. To convert the ".md" names to ".html" when I render the HTML I use a simple Lua filter called links-to-html.lua.
-- links-to-html.lua
-- Rewrite link targets so ".md" becomes ".html" in the rendered page.
function Link(el)
  el.target = string.gsub(el.target, "%.md", ".html")
  return el
end
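To preview what the filter does to a given link target without running Pandoc, the same substitution can be approximated in the shell. This is only a sketch: it uses sed rather than Lua, and the md_to_html_target function name is made up for illustration. Like the Lua pattern, it is not anchored to the end of the string, so a target such as install.md#setup keeps its fragment.

```shell
#!/bin/sh
# Hypothetical helper: approximate the Lua filter's gsub with sed.
# Every literal ".md" in the target is replaced with ".html".
md_to_html_target() {
  printf '%s\n' "$1" | sed 's/\.md/.html/g'
}

md_to_html_target "install.md#setup"              # prints install.html#setup
md_to_html_target "https://example.org/page.html" # prints the URL unchanged
```

One consequence of the unanchored pattern is that any target containing ".md" elsewhere in its path is rewritten too, which is worth keeping in mind for external links.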
The "page.tmpl" file provides a nice wrapper to the Markdown rendered as HTML by Pandoc. It includes the site navigation and project copyright information in the wrapping HTML. It is based on the default Pandoc page template with some added markup for navigation and copyright info in the footer. I also update the link to the CSS to conform with our general site branding requirements. You can generate a basic template using Pandoc.
pandoc --print-default-template=html5
I also use Pandoc to generate an "about.md" file describing the project and author info. The content of the about.md is taken directly from the project's codemeta.json file after I've renamed the "@" JSON-LD fields (those cause problems for Pandoc). You can see the preparation of a temporary "_codemeta.json" using cat and sed to rename the fields. I then use a Pandoc template to render the Markdown from that temporary file.
---
title: $name$
---
About this software
===================
$name$ $version$
----------------
$if(author)$
### Authors
$for(author)$
- $it.givenName$ $it.familyName$
$endfor$
$endif$
$if(description)$
$description$
$endif$
$if(license)$- License: $license$$endif$
$if(codeRepository)$- GitHub: $codeRepository$$endif$
$if(issueTracker)$- Issues: $issueTracker$$endif$
$if(programmingLanguage)$
### Programming languages
$for(programmingLanguage)$
- $programmingLanguage$
$endfor$
$endif$
$if(operatingSystem)$
### Operating Systems
$for(operatingSystem)$
- $operatingSystem$
$endfor$
$endif$
$if(softwareRequirements)$
### Software Requirements
$for(softwareRequirements)$
- $softwareRequirements$
$endfor$
$endif$
$if(relatedLink)$
### Related Links
$for(relatedLink)$
- [$it$]($it$)
$endfor$
$endif$
This same technique can be repurposed to render a CITATION.cff if needed.
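The field renaming step can be tried on its own before wiring it into a Makefile. Below is a minimal sketch using the same sed expression as the about.md recipe. The rename_jsonld_keys function name and the sample record are made up for illustration, and the sample is not a complete codemeta.json.

```shell
#!/bin/sh
# Hypothetical helper: read JSON on stdin and rename the JSON-LD "@" keys
# using the same sed expression as the about.md recipe.
rename_jsonld_keys() {
  sed -E 's/"@context"/"at__context"/g;s/"@type"/"at__type"/g;s/"@id"/"at__id"/g'
}

# Made-up sample record for demonstration only.
printf '%s\n' '{"@context": "https://example.org/context", "@type": "SoftwareSourceCode", "name": "demo"}' \
  | rename_jsonld_keys
# prints {"at__context": "https://example.org/context", "at__type": "SoftwareSourceCode", "name": "demo"}
```

The output can then be saved to a temporary file and passed to Pandoc with --metadata-file, as the about.md recipe does. Rendering a CITATION.cff would presumably only require swapping in a different template.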
Pagefind
Pagefind provides three levels of functionality. First, it generates indexes for a full text search of your project's HTML pages. Second, it builds the necessary search UI for your static site. I include the search UI via a Markdown document that embeds the HTML markup described on Pagefind.app's Getting started page. When I invoke Pagefind I set the --bundle-dir option to "pagefind" rather than "_pagefind". The reason is that GitHub Pages ignores the "_pagefind" directory (it probably ignores all directories with a "_" prefix).
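As a rough illustration of such a search page, the sketch below writes a Markdown file containing the Pagefind UI markup. It assumes the page lives at the site root next to the bundle directory, so the relative paths resolve. The write_search_page function and the search.md file name are hypothetical. Check the markup against the current Getting started page, since it may differ between Pagefind versions.

```shell
#!/bin/sh
# Hypothetical helper: write a Markdown page embedding the Pagefind search UI.
# $1 is the output file; $2 is the bundle directory (defaults to "pagefind").
write_search_page() {
  out="$1"
  bundle="${2:-pagefind}"
  cat >"$out" <<EOF
# Search

<link href="$bundle/pagefind-ui.css" rel="stylesheet">
<script src="$bundle/pagefind-ui.js"></script>
<div id="search"></div>
<script>
  window.addEventListener('DOMContentLoaded', function () {
    new PagefindUI({ element: "#search" });
  });
</script>
EOF
}

# Example usage (left commented out so nothing is written by default):
#   write_search_page search.md
```

Because the file is ordinary Markdown with embedded HTML, it would be rendered by the same Pandoc rule as every other page.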
Finally, if you need a quick static web server while you're writing and developing your documentation website, Pagefind can provide that using the --serve option. Assuming you're in your project's directory, something like this should do the trick.
pagefind --source . --bundle-dir=pagefind --serve