Browser based site search

By R. S. Doiel, 2022-11-18

I recently read Brewster Kahle’s 2015 post about his vision for a distributed web. Many of his ideas have carried over into DWeb, Indie Web, Small Web, Small Internet and the like. A point he touches on is site search running in the web browser.

I’ve used this approach on my own website, relying on LunrJS by Oliver Nightingale. It is a common approach for small sites built using Markdown and Pandoc. In his post, Brewster mentions js-search, an implementation I was not familiar with. Like LunrJS, the query engine runs in the browser via JavaScript, but unlike LunrJS the indexes are built using PHP rather than JavaScript. For the last couple of years I’ve been generating indexes for my own website while using LunrJS for the browser-side query engine.

Today I checked to see what the Hugo community is using and found Pagefind. Pagefind looks impressive. There was a presentation on it at Hugo Conference 2022. It takes building a Lucene-like index several steps further. It appears to handle much larger indexes without requiring the full indexes to be downloaded into the browser. It seems like a good candidate for prototyping a personal search engine.
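The division of labor described above, where an index is built ahead of time and a small query engine works from that index alone, can be sketched in a few lines. This is only an illustration of the general idea, not how LunrJS, js-search, or Pagefind work internally; those add stemming, ranking, and more compact index formats. The page URLs, the page text, and the function names (`tokenize`, `build_index`, `search`) are made up for the example, and the query half is written in Python only to keep the sketch in one language, where a real site would run it as JavaScript in the browser.

```python
import json
import re
from collections import defaultdict

# Hypothetical pages; a real site would read these from its Markdown files.
PAGES = {
    "/blog/search.html": "Browser based site search with a prebuilt index",
    "/blog/pandoc.html": "Building a small site with Markdown and Pandoc",
    "/about.html": "About this site and its search page",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build-time step: map each term to the sorted list of URLs containing it."""
    postings = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            postings[term].add(url)
    return {term: sorted(urls) for term, urls in postings.items()}


def search(index, query):
    """Query step: return the URLs that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# The index is serialized as JSON, standing in for the file a browser would download.
index_json = json.dumps(build_index(PAGES))

# The query engine only needs the JSON, not the original pages.
index = json.loads(index_json)
print(search(index, "site search"))
print(search(index, "pandoc"))
```

Because the query step needs nothing but the serialized index, the indexer and the query engine can be written in different languages, which is the pattern several of the projects below follow.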

How long has browser-side search been around? I do not remember when I started using it. I explored seven projects on GitHub that implement browser-side site search. This is an arbitrary selection of projects, but even so I had no idea that this approach dates back over a decade!

| Project | Indexer | Query engine | Earliest commit [1] | Recent commit [2] |
|---|---|---|---|---|
| LunrJS | JavaScript | JavaScript | 2011 | 2020 |
| | JavaScript/TypeScript | JavaScript/TypeScript | 2012 | 2022 |
| search-index | JavaScript | JavaScript | 2013 | 2016 |
| js-search (cebe) | PHP | JavaScript | 2014 | 2022 |
| js-search (bvaughn) | JavaScript | JavaScript | 2015 | 2022 |
| | Python | Python or JavaScript | 2018 | 2022 |
| Pagefind | Rust | WASM and JavaScript | 2022 | 2022 |

  1. Years are based on reviewing the commit history on GitHub as of 2022-11-18.

  2. Years are based on reviewing the commit history on GitHub as of 2022-11-18.