~evan-hoose/a-shared-404

05404996830a78f5ecbf8e4a899bacbc0403f1ab — Evan 9 months ago 0fc2ecd
 On branch master
 Changes to be committed:
	modified:   top/blog/a-better-search-solution/a-better-search-solution.md
	modified:   top/blog/a-better-search-solution/index.html
 Changes not staged for commit:
	deleted:    footer.html
M top/blog/a-better-search-solution/a-better-search-solution.md => top/blog/a-better-search-solution/a-better-search-solution.md +33 -1
@@ 1,4 1,4 @@
#A less stupid search solution.
#A better search solution.

Drew DeVault wrote [this](https://drewdevault.com/2020/11/17/Better-than-DuckDuckGo.html).
(Read that first. It'll provide useful context I won't explain.)


@@ 9,8 9,40 @@ What is the best way to implement a search engine in this style?

---

##Before you begin:

This is a living document. I will make edits and other changes without warning.

I am mostly using this as notekeeping for myself, and have made it publicly
available in the hopes that it will be useful.

I do intend to keep playing around with this, but it is strictly on a spare
time/I feel like it basis.

##Resources for learning:

As I've started studying this, I've come across some resources. I'll link them
here in case anyone else is looking.

[Introduction to Information Retrieval](https://nlp.stanford.edu/IR-book/information-retrieval-book.html)
 -- A book by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 
Just started it, but it looks promising.

[Information Retrieval Resources](https://nlp.stanford.edu/IR-book/information-retrieval.html)
 -- Resource link dump provided by the above authors.

[Arden Dertat's Blog](http://www.ardendertat.com/2011/05/30/how-to-implement-a-search-engine-part-1-create-index/)
 -- Conveniently, Arden Dertat has a series of blog posts about building search
engines. It looks good, but I don't know enough about the topic to confirm or deny. 


##Proposed Architecture

NOTE: Most of what is described below is either frontend, or the very front of
the backend. Why? Because that's what I knew enough to write about when I
started. I'm currently studying the resources linked above, and will update as
I learn more.

We will have three main components:

* The web crawler

M top/blog/a-better-search-solution/index.html => top/blog/a-better-search-solution/index.html +32 -1
@@ 72,7 72,7 @@ code {
        <a href="/other-stuff" class="inactive">Other Stuff</a>
        <hr class="tab-bar-hr">
</div>
<h1>A less stupid search solution.</h1>
<h1>A better search solution.</h1>

<p>Drew DeVault wrote <a href="https://drewdevault.com/2020/11/17/Better-than-DuckDuckGo.html">this</a>.
(Read that first. It'll provide useful context I won't explain.)</p>


@@ 83,8 83,39 @@ code {

<hr />

<h2>Before you begin:</h2>

<p>This is a living document. I will make edits and other changes without warning.</p>

<p>I am mostly using this as notekeeping for myself, and have made it publicly
available in the hopes that it will be useful. </p>

<p>I do intend to keep playing around with this, but it is strictly on a spare
time/I feel like it basis.</p>

<h2>Resources for learning:</h2>

<p>As I've started studying this, I've come across some resources. I'll link them
here in case anyone else is looking.</p>

<p><a href="https://nlp.stanford.edu/IR-book/information-retrieval-book.html">Introduction to Information Retrieval</a>
 -- A book by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 
Just started it, but it looks promising.</p>

<p><a href="https://nlp.stanford.edu/IR-book/information-retrieval.html">Information Retrieval Resources</a>
 -- Resource link dump provided by the above authors.</p>

<p><a href="http://www.ardendertat.com/2011/05/30/how-to-implement-a-search-engine-part-1-create-index/">Arden Dertat's Blog</a>
 -- Conveniently, Arden Dertat has a series of blog posts about building search
engines. It looks good, but I don't know enough about the topic to confirm or deny. </p>

<h2>Proposed Architecture</h2>

<p>NOTE: Most of what is described below is either frontend, or the very front of
the backend. Why? Because that's what I knew enough to write about when I
started. I'm currently studying the resources linked above, and will update as
I learn more.</p>

<p>We will have three main components:</p>

<ul>