Browser based site search
By R. S. Doiel, 2022-11-18
I recently read Brewster Kahle’s 2015 post about his vision for a distributed web. Many of his ideas have carried over into DWeb, Indie Web, Small Web, Small Internet and the like. A point he touches on is site search running in the web browser.
I’ve used this approach on my own website, relying on LunrJS by Oliver Nightingale. It is a common approach for small sites built using Markdown and Pandoc. In his post Brewster mentions js-search, an implementation I was not familiar with. Like LunrJS, the query engine runs in the browser via JavaScript, but unlike LunrJS the indexes are built using PHP rather than JavaScript. For the last couple of years I’ve used Lunr.py to generate indexes for my own website while using LunrJS for the browser side query engine. Today I checked to see what the Hugo community is using and found Pagefind. Pagefind looks impressive. There was a presentation on it at Hugo Conference 2022. It takes building a Lucene-like index several steps further. It appears to handle much larger indexes without requiring the full indexes to be downloaded into the browser. It seems like a good candidate for prototyping a personal search engine.
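To make the split between indexing and querying concrete, here is a minimal sketch in Python using only the standard library. It builds a tiny inverted index ahead of time, splits it into small JSON shards, and answers a query by loading only the shards it needs. Everything in it is an assumption made for illustration: the page paths, the function names, and especially the shard-by-first-letter scheme, which is not how LunrJS, js-search or Pagefind actually store their indexes. In a real site the `search` step would run in the browser and fetch each shard over HTTP.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build step: map each term to the sorted list of pages containing it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for term in tokenize(text):
            index[term].add(path)
    return {term: sorted(paths) for term, paths in index.items()}


def shard_index(index):
    """Split the index into JSON strings keyed by each term's first character.

    Hypothetical scheme: a client only needs the shards for the terms
    in its query, not the whole index.
    """
    shards = defaultdict(dict)
    for term, paths in index.items():
        shards[term[0]][term] = paths
    return {key: json.dumps(shard, sort_keys=True) for key, shard in shards.items()}


def search(query, load_shard):
    """Query step: return pages containing every term in the query.

    load_shard(key) returns the JSON text of a shard, or None if missing.
    """
    result = None
    for term in tokenize(query):
        raw = load_shard(term[0])
        postings = set(json.loads(raw).get(term, [])) if raw else set()
        result = postings if result is None else result & postings
    return sorted(result or [])


# Hypothetical pages standing in for a small static site.
documents = {
    "/blog/search.html": "Browser side site search with LunrJS",
    "/blog/pandoc.html": "Building a small site with Markdown and Pandoc",
}
shards = shard_index(build_index(documents))

# A dict lookup stands in for fetching a shard file from the web server.
print(search("site search", shards.get))
print(search("pandoc", shards.get))
```

The point of the sketch is the division of labor: the build step can run in any language at publish time, while the query step only needs to read JSON, which is why an indexer in PHP, Python or Rust can pair with a query engine in the browser.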
How long has browser side search been around? I do not remember when I started using it. I explored seven projects on GitHub that implemented browser side site search. This is an arbitrary selection of projects, but even so I had no idea that this approach dates back over a decade!
Project | Indexer | Query engine | Earliest commit [1] | Recent commit [2] |
---|---|---|---|---|
LunrJS | JavaScript | JavaScript | 2011 | 2020 |
Fuse.js | JavaScript/TypeScript | JavaScript/TypeScript | 2012 | 2022 |
search-index | JavaScript | JavaScript | 2013 | 2016 |
js-search (cebe) | PHP | JavaScript | 2014 | 2022 |
js-search (bvaughn) | JavaScript | JavaScript | 2015 | 2022 |
Lunr.py | Python | Python or JavaScript | 2018 | 2022 |
Pagefind | Rust | WASM and JavaScript | 2022 | 2022 |