Auto Indexing With Indexers
Suppose you have a model named FooThing
that is all set up with index definitions, and you can build its index fine using the rake task.
First, make sure this model includes the IndexedSearch::Index
module:
# app/models/foo_thing.rb
class FooThing < ActiveRecord::Base
  include IndexedSearch::Index
  # ...
end
Then, generate an indexer in app/indexers/foo_thing_indexer.rb
using this generator:
$ rails generate indexed_search:indexer foo_thing
That’s it! That model will now automatically reindex itself whenever it is changed, without you having to run any further rake tasks. You do not have to do anything further most of the time (but you can… keep reading).
At heart, indexers are regular Rails observers, except that the standard ActiveRecord::Observer
class has been extended with short predefined after_create, after_update, and after_destroy
callbacks that provide reindexing by default.
The default generators create small files in the app/indexers
directory that glue the indexers to your models. Those files also serve as stubs where you can add application-specific optimizations and extensions.
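To make the pattern concrete, here is a minimal plain-Ruby sketch of a base class with default callbacks and a generated stub that inherits them. It is not the gem's source: SketchRecord, SketchIndexer, and SketchFooIndexer are hypothetical stand-ins so the example runs without Rails, and the assumption that every default callback simply calls update_search_index on the record is mine.

```ruby
# Hypothetical stand-in for a model; it only logs which index methods were called.
class SketchRecord
  attr_reader :calls

  def initialize
    @calls = []
  end

  def update_search_index
    @calls << :update_search_index
  end
end

# Stand-in for the extended observer base class: each callback has a
# default body that reindexes the record (an assumption, see above).
class SketchIndexer
  def after_create(record)
    record.update_search_index
  end

  def after_update(record)
    record.update_search_index
  end

  def after_destroy(record)
    record.update_search_index
  end
end

# A generated stub inherits all three defaults and overrides only what it needs.
class SketchFooIndexer < SketchIndexer
  def after_update(record)
    # application-specific refinement would go here; this one skips reindexing
  end
end

record = SketchRecord.new
indexer = SketchFooIndexer.new
indexer.after_create(record) # inherited default: reindexes
indexer.after_update(record) # overridden: does nothing
```

After these two calls, record.calls contains a single :update_search_index entry, which came from the inherited after_create.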
The getting started example creates a simple indexer that reindexes each model row whenever it is changed, added, or deleted. But it reindexes when any attribute changes, even attributes that the index does not use at all. This is inefficient, especially if the model has many attributes that have nothing to do with the index.
So, in order to refine the indexing, we can edit the indexer created in the previous example:
# app/indexers/foo_thing_indexer.rb
class FooThingIndexer < ApplicationIndexer
  observe FooThing
  # override what happens when a foo record is changed
  def after_update(foo)
    # modify to only update for things used in search_index_info (not unrelated changes):
    # note: foo.<attribute>_changed? doesn't seem to work right with null values... (rails bug or feature?)
    if foo.name_was != foo.name || foo.description_was != foo.description || foo.abstract_was != foo.abstract
      foo.update_search_index
    # if an attribute is only used by search_priority, then this is much more efficient than a full reindex:
    elsif foo.public != foo.public_was
      foo.update_search_priority
    end
  end
end
The default behavior is usually fine for adding and deleting, so we can leave those callbacks alone in this example.
Now suppose the index for a given model uses not only its own attributes, but also attributes from a related model. For example, suppose your models are set up like this:
# app/models/foo_thing.rb
class FooThing < ActiveRecord::Base
  has_many :bar_things
  def search_index_info
    [
      # ...
      [bar_things.collect(&:name), 10]
    ]
  end
  # ...
end
# app/models/bar_thing.rb
class BarThing < ActiveRecord::Base
  belongs_to :foo_thing
end
Now the problem is that when BarThing
objects change, their associated FooThing
objects should be reindexed. This does not happen by default.
To fix it, first generate an indexer for BarThing
too:
$ rails generate indexed_search:indexer bar_thing
Then modify its default behavior to reindex the associated model instead of itself:
# app/indexers/bar_thing_indexer.rb
class BarThingIndexer < ApplicationIndexer
  observe BarThing
  def after_update(bar)
    # if the relationship itself changed, update old and/or new one
    if bar.foo_thing_id_was != bar.foo_thing_id
      bar.foo_thing.update_search_index unless bar.foo_thing_id.nil?
      # find_by_id returns nil for a missing row, whereas find would raise RecordNotFound
      if !bar.foo_thing_id_was.nil? && !(old_foo = FooThing.find_by_id(bar.foo_thing_id_was)).nil?
        old_foo.update_search_index
      end
    # otherwise if just the name changed, update current one if there is one
    elsif bar.name_was != bar.name && !bar.foo_thing_id.nil?
      bar.foo_thing.update_search_index
    end
  end
  def after_create(bar)
    bar.foo_thing.update_search_index unless bar.foo_thing_id.nil?
  end
  def after_destroy(bar)
    bar.foo_thing.update_search_index unless bar.foo_thing_id.nil?
  end
end
If you find yourself repeating similar logic over and over, feel free to move the common indexer code into your ApplicationIndexer
class at app/indexers/application_indexer.rb. That’s what it’s there for! :)
It is quite normal to start using this file as your indexers grow.
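As one possible example of such shared code, the repeated name_was != name comparisons from the indexers above could be factored into a single helper. The sketch below is plain Ruby so it runs without Rails: IndexerChangeHelper, any_changed?, SketchIndexer, and the FakeFoo struct are all hypothetical names, not part of the gem. In an application the method could be defined directly in ApplicationIndexer.

```ruby
# Hypothetical helper; in an application this method could live in ApplicationIndexer.
module IndexerChangeHelper
  # True if any of the named attributes differs from its previous value.
  # Compares <attribute>_was with the current value, like the indexers above.
  def any_changed?(record, *attributes)
    attributes.any? do |attr|
      record.public_send("#{attr}_was") != record.public_send(attr)
    end
  end
end

# Stand-in for an ActiveRecord model with dirty tracking (assumption for the sketch).
FakeFoo = Struct.new(:name, :name_was, :description, :description_was)

class SketchIndexer
  include IndexerChangeHelper
end

indexer   = SketchIndexer.new
renamed   = FakeFoo.new("new name", "old name", "same", "same")
untouched = FakeFoo.new("a", "a", nil, nil)

indexer.any_changed?(renamed, :name, :description)   # => true
indexer.any_changed?(untouched, :name, :description) # => false
```

With a helper like this, an after_update callback could shrink to a single line such as foo.update_search_index if any_changed?(foo, :name, :description, :abstract).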
Sometimes you’re updating so many rows that it would be more efficient to skip reindexing for a while and reindex everything once you’re done. This might be the case, for example, in a mass import rake task.
This is easy. Just wrap your long-running task inside a without_indexing
block, and then call update_search_index
afterward, like so:
# lib/tasks/import_foo_things.rake
task :import_foo_things => :environment do
  FooThing.without_indexing do
    # ... do your long import code here
  end
  FooThing.update_search_index
end
Note that if you do this, you should make sure any custom callbacks you’ve written in your indexers are wrapped in an unless no_indexing?
block, like the default ones are:
# app/indexers/foo_thing_indexer.rb
class FooThingIndexer < ApplicationIndexer
  observe FooThing
  def after_update(foo)
    unless foo.no_indexing?
      if foo.name_was != foo.name || foo.description_was != foo.description
        foo.update_search_index
      end
    end
  end
end
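To see why the no_indexing? check matters, it can help to picture how a block-scoped switch like without_indexing might work. The following is a plain-Ruby sketch under my own assumptions, not the gem's implementation: SketchModel, its class-level flag, and the after_update lambda are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for a model class; not the gem's implementation.
class SketchModel
  @no_indexing = false

  class << self
    # True while a without_indexing block is running.
    def no_indexing?
      @no_indexing
    end

    # Suspend per-row reindexing for the duration of the block and restore
    # the previous state afterward, even if the block raises.
    def without_indexing
      previous = @no_indexing
      @no_indexing = true
      yield
    ensure
      @no_indexing = previous
    end
  end

  # Instance-level check, in the style the indexer callbacks use.
  def no_indexing?
    self.class.no_indexing?
  end
end

reindexed = []
# Stand-in for a custom callback that respects the flag.
after_update = lambda do |record|
  reindexed << record unless record.no_indexing?
end

row = SketchModel.new
SketchModel.without_indexing { after_update.call(row) } # skipped
after_update.call(row)                                  # reindexed
```

Only the second call adds the row to reindexed. A callback that omitted the unless check would reindex in both cases, which defeats the purpose of the block. Note that this sketch keeps the flag in a class-level instance variable, so it is not thread-safe.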
In some cases you might also find it faster to run delete_search_index
and create_search_index
instead of update_search_index,
because that simply discards the old index and writes a new one, instead of doing a lot of reading and comparing.
But if you do this, your site should probably be offline (i.e. in maintenance mode), so that users aren’t confused by search occasionally failing to return expected results. It’s also very important to make sure no indexer gets triggered on a given row before create_search_index
processes it, or it will blindly create bogus duplicate index entries. That’s why this is not the usual way. Usually it’s better to accept a somewhat longer run time and keep the site running.
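The duplicate hazard can be illustrated with a toy model. This is a deliberately simplified sketch, not how the gem stores its index: the in-memory index array and the create_entries and update_entries lambdas are hypothetical, and the only assumption carried over from the text is that a create writes entries without checking what already exists.

```ruby
# Toy in-memory index: an array of [row_id, word] entries. Purely illustrative.
index = []

# "create" writes entries without looking at what is already there.
create_entries = lambda do |row_id, words|
  words.each { |word| index << [row_id, word] }
end

# "update" first removes the row's existing entries, then writes the new ones.
update_entries = lambda do |row_id, words|
  index.reject! { |id, _| id == row_id }
  create_entries.call(row_id, words)
end

# An indexer fires for row 1 while the rebuild is still in progress...
update_entries.call(1, %w[foo bar])
# ...and then the bulk create reaches row 1 and writes it again.
create_entries.call(1, %w[foo bar])

duplicates = index.count { |id, _| id == 1 } # => 4, twice the expected 2
```

Wrapping the rebuild in without_indexing, or taking the site offline, keeps the first of those two writes from happening.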