My own personal search engine.
Installation:
sudo apt install fennel lua-sql-sqlite3 lua-luv pandoc
(You might not find fennel in apt yet depending on your distro, but it's available in my third-party repo or you can install manually easily enough.)
Create a file of URLs, then index it with:
make index URLS=/path/to/urls
Run server:
make run PORT=8080
Originally based on this post: https://hey.hagelb.org/users/technomancy/statuses/01J1AYF55SMFK81JS13S64YF5V
Well, it's very simple you see.
You start out with a file that contains a list of URLs. Maybe it's your bookmarks file that you've been lovingly collecting over the past 9 years. Maybe you got it from grepping some archive you found. Maybe it came from an RSS feed? Or you could extract it from your social media account. Doesn't matter.
Point the indexer at this file, and it will crawl it. For each successfully-fetched response, if it's text, it will toss it straight into SQLite's full-text search. If it's HTML, it'll pass it thru pandoc first to get something a little more palatable.
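To make the indexing step concrete, here is a rough sketch in Python rather than Fennel. It is not the code in this repo: the table name `pages`, its columns, and every function name are made up for illustration, and it assumes your Python's sqlite3 module was built with FTS5. The only external tool it touches is pandoc, which converts HTML to plain text.

```python
import sqlite3
import subprocess


def html_to_text(html):
    """Convert HTML to plain text by shelling out to pandoc (must be installed)."""
    result = subprocess.run(
        ["pandoc", "--from=html", "--to=plain"],
        input=html, capture_output=True, text=True, check=True,
    )
    return result.stdout


def open_index(path=":memory:"):
    """Open a database holding a hypothetical full-text table named `pages`."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url UNINDEXED, body)"
    )
    return conn


def index_page(conn, url, content_type, body, convert=html_to_text):
    """Store one fetched response. Returns True if it was indexed."""
    if content_type.startswith("text/html"):
        body = convert(body)          # HTML goes through pandoc first
    elif not content_type.startswith("text/"):
        return False                  # skip images, PDFs, and so on
    conn.execute("INSERT INTO pages (url, body) VALUES (?, ?)", (url, body))
    return True


def search(conn, query):
    """Return matching URLs, best match first according to FTS5's rank."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE pages MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]
```

The `convert` parameter is only there so the pandoc call can be swapped out, for example when pandoc isn't installed on the machine running the sketch.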
Once the indexing is done, you can launch a web server that will serve up a search page and give you responses in your browser! You probably want to put Caddy in front of it to give it TLS, but this isn't strictly required.
I lied! It's not that simple.
Every time you index a set of URLs, you provide a source field which will be stored with the pages from that indexing run. When you go to make a search, you can select what sources you wish to use on a per-query basis.
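Per-query source selection could look something like the following Python sketch. Again, this is an illustration and not the repo's actual schema: the `pages` table, the `source` column, and the function names are assumptions, as is the choice to treat an empty selection as "search everything". It needs an SQLite build with FTS5.

```python
import sqlite3


def make_index():
    """Create an in-memory full-text table with a hypothetical `source` column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE pages USING fts5(url UNINDEXED, source UNINDEXED, body)"
    )
    return conn


def add_page(conn, url, source, body):
    """Record a page along with the source label given for its indexing run."""
    conn.execute(
        "INSERT INTO pages (url, source, body) VALUES (?, ?, ?)",
        (url, source, body),
    )


def search(conn, query, sources):
    """Full-text search, restricted to the selected sources.

    Assumption of this sketch: an empty selection means no restriction.
    """
    sql = "SELECT url, source FROM pages WHERE pages MATCH ?"
    params = [query]
    if sources:
        placeholders = ", ".join("?" for _ in sources)
        sql += f" AND source IN ({placeholders})"
        params.extend(sources)
    sql += " ORDER BY rank"
    return conn.execute(sql, params).fetchall()
```

For example, after indexing one page under "bookmarks" and another under "feeds", `search(conn, "fennel", ["bookmarks"])` only returns rows whose source is "bookmarks".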
Local sources get added automatically at indexing time. But you can also add remote sources so that searches include results across a number of sites.
To add a remote, run this:
fennel main.fnl add-remote "https://search.technomancy.us?q=%s" \
"technomancy searches" "https://search.technomancy.us" \
"a search engine"
Then it will show up in the sources listing for each search.
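The first argument looks like a URL template with `%s` standing in for the query. How the substitution actually happens isn't spelled out here, so the Python sketch below is a guess at the idea: it URL-encodes the query and drops it into the template. The function name is hypothetical.

```python
from urllib.parse import quote_plus


def remote_search_url(template, query):
    """Fill the %s placeholder of a remote's URL template with an encoded query."""
    return template.replace("%s", quote_plus(query), 1)


url = remote_search_url("https://search.technomancy.us?q=%s", "fennel lang")
# url == "https://search.technomancy.us?q=fennel+lang"
```

Encoding the query first keeps characters like `&` or spaces from being misread as part of the URL.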
Search results are also available as JSON, for example from https://search.technomancy.us/q.json?q=hello:
{
"results": [
{
"url": "https://fennel-lang.org",
"rank": 1.9,
"title": "The Fennel Programming Language"
},
{
"url": "https://leiningen.org",
"rank": 1.68,
"title": "Leiningen"
},
{
"url": "https://technomancy.us/202",
"rank": 1.44,
"title": "in which things once suspended are resumed - Technomancy"
}
]
}
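If you want to consume this from a script, here's one way a client might parse responses of this shape and combine results from a couple of instances. It's a Python sketch built on assumptions: that a higher `rank` means a better match (the sample above is ordered that way), and that ranks from different instances are comparable, which may well not hold. The function names are made up.

```python
import json


def parse_results(payload):
    """Pull the list of result objects out of a q.json response body."""
    return json.loads(payload)["results"]


def merge_results(*result_lists):
    """Combine result lists, keeping the highest-ranked entry for each URL."""
    best = {}
    for results in result_lists:
        for result in results:
            seen = best.get(result["url"])
            if seen is None or result["rank"] > seen["rank"]:
                best[result["url"]] = result
    # Assumption: higher rank is better, so sort in descending order.
    return sorted(best.values(), key=lambda r: r["rank"], reverse=True)
```

Deduplicating by URL matters once several instances have indexed the same pages.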
Install tidy from apt or whatever.
Run make test-server in the background before make test.
Copyright © 2025 Phil Hagelberg
Released under the MIT license.
Dependencies: