Search Engine with Rails and Elastic Search
I had always wondered, "How do I search through a relational database, in any table, in any column, and get the related results?" After searching for a bit, I reached the conclusion: YOU DON'T.
Searching through more than one table, and on top of that more than one column, is very complex with a relational database. That's where Elasticsearch comes into play. Elasticsearch stores all records as documents and provides search functionality that is very fast.
DISCLAIMER
Before we move into the implementation part of the tutorial, I want to make this clear:
This is not a tutorial for building a web search engine like Google. What you will be building is a search engine for a Rails app where you can search for any string inside any table in the app. This will help you add app-wide search functionality.
Now then, let's create a search engine for our Rails App.
I have prepared an example app to make it easier to follow the tutorial and pushed it to GitHub. You can clone it from here.
Or with the following command:
git clone git@github.com:coolprobn/rails-search-engine.git
You can install Elasticsearch by following the instructions on the official website.
Once you have installed Elasticsearch on your machine, you will also be presented with commands to run the server. You can also find the commands to run Elasticsearch for your OS here.
The following commands are for macOS, where I installed Elasticsearch with Homebrew:
- Run only once:
  elasticsearch
- Run in the background and on machine restart:
  brew services start elastic/tap/elasticsearch-full
NOTE: It can take some time to fully start the server.
You can check if Elasticsearch is working by opening localhost on port 9200 in your browser:
http://localhost:9200/
You should see content similar to this:
{
  "name" : "Prabins-MacBook-Pro.local",
  "cluster_name" : "elasticsearch_cool",
  "cluster_uuid" : "J2CAnnSoRI6p2zZGV3K8eg",
  "version" : {
    "number" : "7.13.3",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "5d21bea28db1e89ecc1f66311ebdec9dc3aa7d64",
    "build_date" : "2021-07-02T12:06:10.804015202Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
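If you would rather check from Ruby than from the browser, the same endpoint can be queried with the standard library. This is only a minimal sketch: the helper names summarize_cluster_info and elasticsearch_status are made up for illustration, and the URL assumes the default local port used above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Pull the fields worth checking out of the JSON that Elasticsearch serves at "/".
def summarize_cluster_info(json_body)
  info = JSON.parse(json_body)
  {
    cluster_name: info['cluster_name'],
    version: info.dig('version', 'number'),
    tagline: info['tagline']
  }
end

# Returns the summary hash, or nil when the server is not reachable (yet).
# The default URL is an assumption based on the local setup in this tutorial.
def elasticsearch_status(url = 'http://localhost:9200/')
  response = Net::HTTP.get_response(URI(url))
  return nil unless response.is_a?(Net::HTTPSuccess)

  summarize_cluster_info(response.body)
rescue SystemCallError, SocketError, Timeout::Error, JSON::ParserError
  nil
end

# Trimmed-down copy of the response shown above, used as sample input.
sample = '{"cluster_name":"elasticsearch_cool","version":{"number":"7.13.3"},"tagline":"You Know, for Search"}'
p summarize_cluster_info(sample)
```

Calling elasticsearch_status returns nil while the server is still starting, so it can be retried until it returns a hash.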
Add the following to your Gemfile
# Elasticsearch for powerful searching
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
Install the gems with bundle install
Add the following to all the models. At present, there are three models in the app: Author, Article and Category.
Example for Author model:
class Author < ApplicationRecord
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
This will allow the models in the app to be searched through elasticsearch.
Hmm, that was cumbersome 😪. You can refactor and DRY the code by creating a concern called Searchable and just include that module in all models. This module can also have other configurations required to make the search more powerful.
Create the concern inside app/models/concerns/searchable.rb
and add the following
module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks
  end
end
Now you can include the Searchable module in the models. At the moment there are three models in the app (Author, Article and Category), and you can include the module in each one of them as required.
Example for Author model:
# app/models/author.rb
class Author < ApplicationRecord
  include Searchable

  # remaining code
end
Now all the code inside the Searchable module will be available to every model that includes it.
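To see why including one module is enough, here is a plain-Ruby sketch of the same idea without Rails. FakeSearchModel is a hypothetical stand-in for Elasticsearch::Model, and the self.included hook plays roughly the role that the included block plays in an ActiveSupport::Concern.

```ruby
# Hypothetical stand-in for Elasticsearch::Model: gives the class a `search` method.
module FakeSearchModel
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def search(query)
      "#{name} searched for #{query.inspect}"
    end
  end
end

module Searchable
  # Runs once for every class that includes Searchable,
  # similar to what `included do ... end` does in a concern.
  def self.included(base)
    base.include(FakeSearchModel)
  end
end

class Author
  include Searchable
end

class Article
  include Searchable
end

puts Author.search('Jane')
puts Article.search('ruby')
```

Any configuration added to Searchable later reaches every model that includes it, which is the point of the refactor.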
To make all records in the database available to Elasticsearch, they first need to be indexed, i.e. stored as documents. Compared to a relational database, indexing is similar to storing records in tables.
Create a file lib/tasks/elastic_search.rake
and add the following:
namespace :elastic_search do
  desc 'Index models to elasticsearch'
  task :index_models, [:models] => :environment do |_, args|
    # eager load first so that models are available for next step
    Rails.application.eager_load!

    # include all models inside the app/models folder
    default_models = ApplicationRecord.descendants.map(&:to_s)
    argument_models = args[:models]&.split(',')&.map(&:strip)
    models = argument_models || default_models

    model_classes = models.map { |model| model.underscore.camelize.constantize }

    model_classes.each do |model|
      model.import force: true
    end
  end
end
Inside the rake task, the following is happening:
- [:models] allows the rake task to accept arguments. In this case, the task accepts a single comma-separated string, i.e. the names of the models to index, separated by ",".
- Rails.application.eager_load! loads all the classes inside the "app" folder so that all model names are available to the rake task in the next step.
- ApplicationRecord.descendants.map(&:to_s) returns an array of the names of the models inside the folder app/models, meaning all models in the app will be indexed. If this is not what you want, you can replace the code with an array of model names like ['Author', 'Article'].
- If an argument is passed, which should be a comma-separated string as mentioned above, it is converted to an array with the split method, and unnecessary whitespace is then removed with the strip method.
- In model_classes, each model name is converted to camel case to maintain consistency and avoid errors, and then each model name, which is a string, is converted to a constant with the constantize method.
- Finally, each model is looped through and indexed one by one with the import method provided by the elasticsearch gem.
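The steps above can be tried outside Rails with a small plain-Ruby sketch. Everything here is illustrative: simple_camelize is a simplified stand-in for ActiveSupport's underscore/camelize pair, Object.const_get stands in for constantize, and the Author and Article classes only pretend to import.

```ruby
# Simplified stand-in for ActiveSupport's camelize (assumes plain snake_case or CamelCase input).
def simple_camelize(name)
  name.split('_').map { |part| part.sub(/\A[a-z]/, &:upcase) }.join
end

# Turn "author, Article" into ["Author", "Article"]; fall back to the defaults when no argument is given.
def resolve_model_names(argument, default_models)
  names = argument&.split(',')&.map(&:strip)
  (names || default_models).map { |name| simple_camelize(name) }
end

# Hypothetical stub for the `import` method that the elasticsearch gem adds to models.
module StubImport
  def import(force: false)
    "#{name} reindexed (force=#{force})"
  end
end

class Author
  extend StubImport
end

class Article
  extend StubImport
end

# Resolve names, turn each one into a class, then "index" it.
def index_models(argument, default_models)
  resolve_model_names(argument, default_models)
    .map { |name| Object.const_get(name) }
    .map { |model| model.import(force: true) }
end

puts index_models('author, Article', %w[Author Article])
puts index_models(nil, %w[Author])
```

Passing nil mirrors running the task with no arguments, in which case the default list is used.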
- Index all models
  If you are running the rake task for the first time, it's better not to pass any argument, since you want records from all models to be indexed.
  Run the following command in that case:
  rails elastic_search:index_models
- For new models
  If you add new models, you will normally want to index only those models. For that, you can pass the names of the new models when executing the rake task:
  rails "elastic_search:index_models[Comment\, Tag]"
  You need to escape the comma (,) with \ otherwise the text after it will be treated as a second argument to the rake task and only "Comment" will be passed in the models argument.
Now that all the records from the required models are indexed, it's finally time to search through them.
You can search only one model with ModelName.search 'query'
e.g. Author.search 'Jane'
> response = Author.search('Jane').results
> response.first.as_json
=> {"_index"=>"authors", "_type"=>"_doc", "_id"=>"2", "_score"=>0.6931471, "_source"=>{"id"=>2, "first_name"=>"Jane", "last_name"=>"Jones", "email"=>"[email protected]", "nickname"=>"marvellous.jane", "created_at"=>"2021-07-25T15:18:18.192Z", "updated_at"=>"2021-07-25T15:18:18.192Z"}}
For the result part, you can achieve a similar result by using the method "records" instead of "results". The difference between them is that "results" always returns the Elasticsearch result, while "records" converts the Elasticsearch results to an Active Record query.
You can read more about records here
You can search in multiple models with Elasticsearch::Model.search('query', [ModelName1, ModelName2])
e.g. Elasticsearch::Model.search('ruby', [Article, Category])
Elasticsearch::Model.search('Ruby', [Article, Category]).results.as_json
=> [{"_index"=>"categories", "_type"=>"_doc", "_id"=>"1", "_score"=>1.5697745, "_source"=>{"id"=>1, "title"=>"ruby", "created_at"=>"2021-07-25T15:18:18.202Z", "updated_at"=>"2021-07-25T15:18:18.202Z"}}, {"_index"=>"articles", "_type"=>"_doc", "_id"=>"2", "_score"=>1.2920684, "_source"=>{"id"=>2, "title"=>"Build Twitter Bot with Ruby", "content"=>"Today, we will be building a bot for Twitter that will retweet all hashtags related to #ruby or #rails. We can also configure it to retweet any hashtags so you can use this tutorial to create bot that can retweet whatever hashtag you want. Yes, and we will be building this Twitter bot with Ruby.\n\nWe will be using Twitter gem (Github) to help us in getting up and running quickly with Twitter APIs.\n", "published_on"=>"2021-04-23T05:00:00.000Z", "author_id"=>1, "created_at"=>"2021-07-25T15:18:18.236Z", "updated_at"=>"2021-07-25T15:18:18.236Z"}}, {"_index"=>"articles", "_type"=>"_doc", "_id"=>"4", "_score"=>0.83619946, "_source"=>{"id"=>4, "title"=>"Setup Factory Bot in Rails", "content"=>"Factory Bot is a library for setting up test data objects in Ruby. Today we will be setting up Factory Bot in Rails which uses RSpec for testing. If you are using different test suite, you can view all supported configurations in the official github repository of Factory Bot.\n", "published_on"=>"2021-06-13T13:00:00.000Z", "author_id"=>2, "created_at"=>"2021-07-25T15:18:18.245Z", "updated_at"=>"2021-07-25T15:18:18.245Z"}}]
You can convert the search result to Active Record objects with records.to_a
e.g. Author.search('john').records.to_a
Author.search('john').records.to_a
Author Load (0.4ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" = $1 [["id", 1]]
=> [#<Author id: 1, first_name: "John", last_name: "Doe", email: "[email protected]", nickname: "john101", created_at: "2021-07-25 15:18:18.190094000 +0000", updated_at: "2021-07-25 15:18:18.190094000 +0000">]
This tutorial won't use this technique, and it is best avoided in a real implementation, because it executes an extra query and adds time to the request, since each record has to be converted to Active Record. In a production application, and especially in a search app, even 1 millisecond matters, which is why in this tutorial all search results are converted to JSON and rendered directly in the view.
Adding the search functionality to the app means that everything done above in the Rails console has to be replicated and added to the API.
From command line, run the command to create the controller: touch app/controllers/search_controller.rb
Add the following to it:
class SearchController < ApplicationController
  def search
    if params[:q].blank?
      @results = []
    else
      @results = Elasticsearch::Model
                   .search(params[:q])
                   .results.as_json
                   .group_by { |result| result['_index'] }
    end
  end
end
Results are grouped by "_index" so that results can be separated for each model name when rendering in the view.
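To make the grouping concrete, here is a small plain-Ruby sketch that runs group_by over a hand-written array shaped like the search results shown earlier. The hits are trimmed sample data, not real output from Elasticsearch.

```ruby
# Trimmed sample hits, shaped like the `.results.as_json` output shown earlier.
hits = [
  { '_index' => 'categories', '_id' => '1', '_source' => { 'title' => 'ruby' } },
  { '_index' => 'articles', '_id' => '2', '_source' => { 'title' => 'Build Twitter Bot with Ruby' } },
  { '_index' => 'articles', '_id' => '4', '_source' => { 'title' => 'Setup Factory Bot in Rails' } }
]

# One key per index name, each holding the hits that came from that index.
grouped = hits.group_by { |hit| hit['_index'] }

grouped.each do |index_name, records|
  puts "#{index_name}: #{records.map { |record| record['_id'] }.join(', ')}"
end
```

The view can then loop over the hash once per index name and render that group's records under its own heading.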
Add the following to config/routes.rb
get :search, to: 'search#search'
To provide the search capability to the users, you will need to have:
- A search box where users can enter their search query
- A list of results for the search
Create a view
$ mkdir app/views/search
$ touch app/views/search/search.html.erb
Add the following inside:
<h1>App Search</h1>

<%= form_for search_path, method: :get do |f| %>
  <p>
    <%= f.label "Search for" %>
    <%= text_field_tag :q, params[:q] %>
    <%= submit_tag "Search", name: nil %>
  </p>
<% end %>

<% if params[:q] && @results.blank? %>
  <p>No results found for <%= params[:q] %></p>
<% end %>

<% @results.each do |group, records| %>
  <h3><%= group.titleize %></h3>
  <ul>
    <% records.each do |record| %>
      <% record_link = "#{group}/#{record['_id']}" %>
      <li>
        <%= link_to record_link, record_link %>
      </li>
    <% end %>
  </ul>
<% end %>
Fire up the Rails server with rails s and go to localhost:3000/search. You will see a view with a search box in it.
Type relevant text e.g. "ruby" and hit search.
Tada 🎉
You will see search results grouped by model name, with a link to each record's detail page.
If there are no results, you will see "No results found for [query]".
Since there aren't any APIs or views for the detail pages yet, the links will not work at the moment.
Elasticsearch can also highlight the matched text, like Google does in its search results. You can take the search feature to the next level by requesting the highlighted text and rendering it in the view.
Update the code inside the else part of the "search" action in the controller, and add a private method, as follows:
@results = Elasticsearch::Model
             .search(params[:q], [], { body: highlight_fields })
             .results.as_json
             .group_by { |result| result['_index'] }

private

def highlight_fields
  {
    highlight: {
      # tag settings sit next to "fields", not inside it
      pre_tags: ['<strong>'],
      post_tags: ['</strong>'],
      fields: {
        first_name: {},
        last_name: {},
        nickname: {},
        email: {},
        title: {},
        content: {}
      }
    }
  }
end
Inside "fields" in the method "highlight_fields", you can add the column names of any model whose text you want to highlight. The current configuration includes highlighting for all 3 models in the app.
By default, highlighted text is wrapped in an "em" tag, which can easily be overridden by specifying pre and post tags (pre_tags and post_tags); here the "em" tag is overridden by the "strong" tag because I felt that bold text catches more attention than italicized text. You can remove these two options completely if you think italicized text works well.
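If the list of columns grows, the highlight body can be built from an array of field names instead of being written by hand. This is a minimal sketch, not part of the example app: highlight_body and its tag parameter are hypothetical names, and it assumes the tag settings sit at the top level of "highlight", next to "fields".

```ruby
require 'json'

# Build the "highlight" part of a search body for the given field names.
# `tag` is the HTML tag wrapped around each match (Elasticsearch defaults to "em").
def highlight_body(field_names, tag: 'strong')
  {
    highlight: {
      pre_tags: ["<#{tag}>"],
      post_tags: ["</#{tag}>"],
      # every field gets an empty options hash, i.e. the default highlighter settings
      fields: field_names.map { |field| [field.to_sym, {}] }.to_h
    }
  }
end

puts JSON.pretty_generate(highlight_body(%i[title content]))
```

The returned hash has the same shape as the one from highlight_fields, so it could be passed as the body option in the same way.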
Your final controller will look like this:
class SearchController < ApplicationController
  def search
    if params[:q].blank?
      @results = []
    else
      @results = Elasticsearch::Model
                   .search(params[:q], [], { body: highlight_fields })
                   .results.as_json
                   .group_by { |result| result['_index'] }
    end
  end

  private

  def highlight_fields
    {
      highlight: {
        # tag settings sit next to "fields", not inside it
        pre_tags: ['<strong>'],
        post_tags: ['</strong>'],
        fields: {
          first_name: {},
          last_name: {},
          nickname: {},
          email: {},
          title: {},
          content: {}
        }
      }
    }
  end
end
Add the following code to the view, just below the "link_to":
<% record['highlight']&.each do |key, snippet| %>
<p><%= "#{key} - " %> <%= sanitize(snippet[0]) %></p>
<% end %>
The highlighted result for each hit is available under the key "highlight", which maps each matched field name to a list of snippets. The view renders the first snippet of each field.
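The shape of that "highlight" key is easier to see in plain Ruby. In this sketch the two records are hand-written samples and highlight_lines is a hypothetical helper that does what the view loop does, minus the HTML.

```ruby
# Hand-written sample hits: one with a highlight, one without.
with_highlight = {
  '_index' => 'articles',
  '_id' => '2',
  'highlight' => { 'title' => ['Build Twitter Bot with <strong>Ruby</strong>'] }
}
without_highlight = { '_index' => 'authors', '_id' => '1' }

# Return one "field - first snippet" line per highlighted field.
# A hit without a "highlight" key simply yields no lines.
def highlight_lines(record)
  (record['highlight'] || {}).map do |field, snippets|
    "#{field} - #{snippets.first}"
  end
end

puts highlight_lines(with_highlight)
puts highlight_lines(without_highlight).inspect
```

In the view, the safe navigation operator (&.) covers the same missing-key case, and sanitize keeps the strong tags while stripping anything unsafe from the snippet.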
Your final view will look like this:
<h1>App Search</h1>

<%= form_for search_path, method: :get do |f| %>
  <p>
    <%= f.label "Search for" %>
    <%= text_field_tag :q, params[:q] %>
    <%= submit_tag "Search", name: nil %>
  </p>
<% end %>

<% if params[:q] && @results.blank? %>
  <p>No results found for <%= params[:q] %></p>
<% end %>

<% @results.each do |group, records| %>
  <h3><%= group.titleize %></h3>
  <ul>
    <% records.each do |record| %>
      <% record_link = "api/v1/#{group}/#{record['_id']}" %>
      <li>
        <%= link_to record_link, record_link %>
        <% record['highlight']&.each do |key, snippet| %>
          <p><%= "#{key} - " %> <%= sanitize(snippet[0]) %></p>
        <% end %>
      </li>
    <% end %>
  </ul>
<% end %>
You can search again for the same query and you will see the highlighted text in the results.
There are many features that I have deliberately skipped to keep this tutorial short and simple. You can add the following if you want to play more with this app:
- Add an API and the required views to make the links to the detail pages work
- Show related articles when an associated author or category is searched, e.g. if a user searches for "jane", show articles by the author "Jane", or for "ruby", show all articles in the category "Ruby"
In a real-world application, I am sure that search functionality can be a lot more complex than what is shown here, but this is a start and you can build on top of it as required.
I had always wanted to explore the idea of app-wide search, and this blog is the result of my habit of exploring a new technology every Sunday. It was fun to learn about Elasticsearch, research how others have implemented similar search features, and actually implement this in the sample app and in an existing project for a client.
I hope you enjoyed this blog as much as I enjoyed building and writing it. Thank you for sticking with me to the very end.
Full code of this tutorial is available in the branch "app-search", you can find it here.
Happy tinkering and happy coding!