Extremely lightweight fuzzy + highlighted search with @leeoniya/uFuzzy
swyx
2024 update: @leeoniya says “you might be interested in doing a follow-up/update to your post, to use the .search() api, which does more stuff out of the box, such as outOfOrder, quoted, and negatives”
1min youtube short
Context
I’ve wanted to add search to SwyxKit from day 1. Most developer blogs don’t really have search; they have filters: exact matching of lowercased substrings. That is not only the simplest implementation of search, it is also guaranteed to be the lightest-weight version of it in terms of clientside JS weight. The most popular fuzzy search library in JS is Fuse.js, which weighs 15kb - not horrendous, but almost half as much as the rest of the JS I ship put together (33kb).
Filters are clearly not the best user experience possible, yet every option on the existing menu involves a third-party SaaS (Algolia), a running server (Typesense), network latency (Fuse.js on server), or clientside burden (Fuse.js on client).
This changed for me as I learned about alternatives, for example https://github.com/kbrsh/wade, which is 1kb but has no fuzzy search. The breakthrough came when @leeoniya posted uFuzzy.js on Hacker News 3 months ago, with decent fuzzy search at only 4kb total size.
I implemented it today, and even a very naive implementation wasn’t too bad.
You can see the PR in #169, but I’ll describe the steps here.
Making Hay
The first problem you encounter is that the uFuzzy docs offer cute API examples that assume an array of strings, but most blog search runs over structured data: an array of objects. I solved it by simple concatenation (as I later found others also did):
// cleaned up code example
import uFuzzy from '@leeoniya/ufuzzy';
// flatten each post object into one searchable string
let haystack = rawdata.map(v => [v.title, v.subtitle, v.tags, v.content, v.description].join(' '));
let needle = 'foo'; // search term
let u = new uFuzzy();
let idxs = u.filter(haystack, needle);
let info = u.info(idxs, haystack, needle);
let order = u.sort(info, haystack, needle);
// each entry of order is a position in info.idx, which in turn points into haystack
let searchResults = order.map(i => haystack[info.idx[i]]);
and this worked nicely enough.
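As the 2024 update at the top notes, uFuzzy also has a `.search()` API that wraps the filter/info/sort calls. Here is a minimal sketch of how the same lookup might look with it; this is not what the PR ships. `toHaystack` and `searchPosts` are hypothetical names, and the sketch assumes `search` returns an `[idxs, info, order]` triple in which `info` and `order` can be null when results are not ranked, so check the uFuzzy README for the exact contract. The uFuzzy instance is passed in as `uf` so the function is easy to exercise with a stub.

```javascript
// Hypothetical helper: flatten each post object into a single searchable string.
// Missing fields become empty strings; array fields (e.g. tags) are joined with spaces.
function toHaystack(posts) {
  return posts.map(p =>
    [p.title, p.subtitle, p.tags, p.content, p.description]
      .map(f => (Array.isArray(f) ? f.join(' ') : (f ?? '')))
      .join(' ')
  );
}

// Hypothetical helper: `uf` is a uFuzzy instance, e.g. new uFuzzy().
// Assumption: uf.search(haystack, needle) returns [idxs, info, order],
// where info/order may be null and idxs may be null for an unsearchable needle.
function searchPosts(uf, posts, needle) {
  const haystack = toHaystack(posts);
  const [idxs, info, order] = uf.search(haystack, needle);
  if (idxs == null) return [];
  // ranked path: order points into info.idx, which points into haystack/posts
  if (info != null && order != null) {
    return order.map(i => posts[info.idx[i]]);
  }
  // unranked fallback: return the filtered matches in their original order
  return idxs.map(i => posts[i]);
}
```

Returning the original post objects instead of the joined strings means the template can still read fields like `description` off each result.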
Highlighting results
uFuzzy also offers a highlight function, which is super cool. It involves inserting HTML into the text:
// as before
let idxs = u.filter(haystack, needle);
let info = u.info(idxs, haystack, needle);
let order = u.sort(info, haystack, needle);
// highlighted search results
let searchResults = order.map(i => {
  const x = haystack[info.idx[i]];
  const hl = uFuzzy.highlight(
    x,
    info.ranges[i],
    // marker function. add color for a bit of flair
    (part, matched) => matched ? '<b style="color:var(--brand-accent)">' + part + '</b>' : part
  );
  return hl; // the text with matches wrapped in <b>
});
There are 3 small details to handle here:
- I don’t normally render the text as raw HTML, so I had to create a new path for rendering that HTML
- While I want to render the added marker HTML, my text itself has some HTML entities that I don’t want to render
- The marker just marks the part of the text, but doesn’t do anything to help you truncate the long text to hone in on the marker
Solution
<script>
  // ...
  let searchResults = order.map(i => {
    const idx = info.idx[i];
    const ranges = info.ranges[i];
    // the original post object; assumes haystack was built 1:1 from rawdata as above
    const x = rawdata[idx];
    const hl = uFuzzy.highlight(
      haystack[idx]
        // replace html brackets with the same # of characters of whitespace, as we don't actually want to render that html
        // (global regexes so every occurrence is replaced and the ranges stay aligned)
        .replace(/\/>/g, '  ')
        .replace(/[<>]/g, ' '),
      ranges,
      mark // mark function adds back the mark HTML we actually DO want to render
    )
      // nice slice surrounding the first marked match
      .slice(Math.max(ranges[0] - 100, 0), Math.min(ranges[1] + 100, haystack[idx].length));
    return {...x, highlightedResults: hl};
  });
</script>
{#each searchResults as item}
<li class="mb-8 text-lg">
{#if item.highlightedResults}
{@html item.highlightedResults}
{:else}
{item.description}
{/if}
</li>
{/each}
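One caveat with the slice above: it cuts the already-highlighted string using offsets measured on the un-highlighted text, so with several matches it can land in the middle of an inserted `<b>` tag. Below is a sketch of an alternative ordering: cut the excerpt first, clamp the ranges to it, then escape and mark each piece. It assumes ranges arrive as a flat, sorted `[start, end, start, end, ...]` array of offsets, which is how `info.ranges[i]` is used above. `escapeHtml` and `excerptAndMark` are hypothetical names, the code is plain JS that does not call uFuzzy, and the 100-character radius just mirrors the numbers in the slice.

```javascript
// Hypothetical helper: escape characters that would otherwise be parsed as HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: cut a window around the first match, then wrap matches in <b>.
// Assumption: `ranges` is a flat, sorted array of [start, end) offset pairs into `text`.
function excerptAndMark(text, ranges, radius = 100) {
  // no match info: fall back to an escaped prefix of the text
  if (!ranges || ranges.length < 2) return escapeHtml(text.slice(0, radius * 2));
  const from = Math.max(ranges[0] - radius, 0);
  const to = Math.min(ranges[1] + radius, text.length);
  let out = '';
  let cursor = from;
  for (let i = 0; i < ranges.length; i += 2) {
    // clamp each match to the excerpt window; skip matches that fall outside it
    const start = Math.max(ranges[i], from);
    const end = Math.min(ranges[i + 1], to);
    if (start >= end) continue;
    out += escapeHtml(text.slice(cursor, start)); // unmatched text before the match
    out += '<b>' + escapeHtml(text.slice(start, end)) + '</b>'; // the match itself
    cursor = end;
  }
  return out + escapeHtml(text.slice(cursor, to)); // unmatched tail of the window
}
```

Because the source text is escaped rather than blanked out, the only tags that reach `{@html}` are the `<b>` markers added here, and the excerpt never ends inside one of them.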
This leads to a very nice search experience: