Site search using Golang?

Sibert · August 5, 2019, 8:42am

Almost all websites have a “search” button for searching the site's contents.
As a temporary solution I am using Google Custom Search, but I wonder if there is a simple way to do this using Go. Like:

Fetching all HTML pages (already in memory when using a Go web server?)
Search and analyze contents within tags from a form input.
Presenting the result as a list on a web page.

I have searched a lot, but found nothing.

Any tips or clues on where to look or how to approach this?

NobbZ · August 5, 2019, 9:58am

Usually I’d just ask the database. If you set up your indices correctly, even full-text search is quite efficient.

But in general the solution is very dependent on your implementation of the server itself… If you haven’t stored something in the database, you can’t really ask it.

acim · August 5, 2019, 2:37pm

If you have a large amount of data, you may consider using Elasticsearch or Solr. There are also two Golang projects: Bleve and Riot.

Sibert · August 6, 2019, 7:28am

There are a bunch of HTML templates. Think of them as a database with two columns: url and text. Is there no way to search among HTML templates?

Sibert · August 6, 2019, 7:38am

Is this not overkill for about 100 html pages?

NobbZ · August 6, 2019, 7:45am

Usually one wants to search the dynamic data that is filled into the template, rather than the static stuff surrounding that.

So from my point of view, searching the templates doesn’t make sense. But if you really want to, then build yourself an index over the searchable area of the templates and use that index for searches. You’ll probably need to study DB implementation to get that part right.
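
A hand-built index of the kind described here could look roughly like the following inverted index, which maps each word to the pages containing it. This is only a sketch under simple assumptions: the `index` type and its `add` and `lookup` methods are hypothetical names, words are matched exactly (no stemming or ranking), and the text passed to `add` is assumed to be the searchable area only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// index maps a lowercased word to the set of URLs whose text contains it.
type index map[string]map[string]bool

// tokenize splits text into lowercase words, treating anything that is
// not a letter or digit as a separator.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// add records every word of text under the given URL.
func (idx index) add(url, text string) {
	for _, w := range tokenize(text) {
		if idx[w] == nil {
			idx[w] = make(map[string]bool)
		}
		idx[w][url] = true
	}
}

// lookup returns the sorted URLs that contain every word of the query.
func (idx index) lookup(query string) []string {
	words := tokenize(query)
	if len(words) == 0 {
		return nil
	}
	var urls []string
	for url := range idx[words[0]] {
		all := true
		for _, w := range words[1:] {
			if !idx[w][url] {
				all = false
				break
			}
		}
		if all {
			urls = append(urls, url)
		}
	}
	sort.Strings(urls)
	return urls
}

func main() {
	idx := make(index)
	idx.add("/about", "Lorem ipsum about this site")
	idx.add("/contact", "Lorem ipsum: how to contact us")
	fmt.Println(idx.lookup("lorem"))         // [/about /contact]
	fmt.Println(idx.lookup("Lorem contact")) // [/contact]
}
```

The index would have to be rebuilt, or updated through `add`, whenever a page changes, which is the manual maintenance cost of this approach.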

Sibert · August 6, 2019, 7:48am

If I interpret you correctly, I have to build a sort of CMS web site?

NobbZ · August 6, 2019, 7:51am

Well, I thought you already built something.

But I do not know what you have built or how it works, so I am telling you what one would usually do.

And you said you have “templates”, which usually do not themselves contain useful searchable data, but a page skeleton instead, while the content comes from somewhere else. You still haven’t said where or what this somewhere else is.

Sibert · August 6, 2019, 9:15am

Well, you can use templates in at least two ways: data driven or “plain html”. I decided to start with “plain html”, so I guess I chose an old-fashioned way. The data is in the template itself.

<div class="container">

<div class="submenu">
 {{template "sub_form"}}
</div>

  <div class="content">
    <h1>Lorem ipsum</h1>
    <p>Lorem ipsum etc Lorem ipsum etc Lorem ipsum etc Lorem ipsum etc </p>

  <br><br>
</div></div>

NobbZ · August 6, 2019, 10:32am

This approach makes it harder to maintain everything in the long run. It is probably easier to have a database containing the contents. But as I said already, if you stick to your current approach you’ll need to manually maintain an index over the searchable content of your pages, as you probably do not want to find articles containing the <article>-tags when you search for the word “article”.

acim · August 6, 2019, 11:07am

In that case it is, but you should consider the quality of the results as well. I would choose either Bleve or Riot in your case, because you have a small amount of data. And like someone said, you should either index just the contents or strip out the HTML before indexing.
