You don’t hear this one every day: I just made a critical database query 10x faster by moving the bulk of the complexity out of SQL and back into Ruby. This post will explain how I did it, plus how you can apply a similar approach to rethinking search in your own application. (In a hurry? Jump straight to the codey bits.)

I maintain an app for studying Japanese called KameSame that’s built with Rails and Postgres (I talked about it in The Selfish Programmer). Last weekend I rewrote its Japanese-English dictionary search feature for at least the third time and I finally managed to strike the right balance between flexibility and speed.

Search is hard

If you’ve never built a search feature, it’s rarely as straightforward as it first seems ("how hard can it be to query a single text input?"). But search is a rare example of a problem whose implementation complexity is inversely proportional to the simplicity of the user interface. A user supplies a single piece of information (a query string in this case) and the computer needs to interpret the user’s intention before gathering and normalizing results from multiple sources of data. Oh, and the user is almost always blocked from proceeding until the initial results are displayed, so everything has to be as fast as possible.

Things can start out simply enough, though:

Item.where("text like ?", "%pants%")

There—that was easy!

But it won’t be long before it becomes apparent how much isn’t taken into account by a naive search implementation. What about searching text with different punctuation and casing than the query? Or misspelled words? Or stop words? And why doesn’t it search the title, tags, or author? And shouldn’t it rank results based on relevance?

Even if you overcome the odds and gracefully manage these complications, take your eyes off it for a year and you’ll have a new problem: your custom search query will likely have become much slower as more and more data was added to the system.

Postgres probably isn’t the bottleneck

At this point, a lot of developers will throw up their hands and look for an external solution (especially Rails developers, who often exhibit a SQL allergy on account of how well Active Record can help you hide from it). Sure, you could keep trying to implement search in Postgres, but commenters on Hacker News make it seem like nobody actually does that. Maybe you’re hearing about other teams reaching for alternatives like Elasticsearch or Solr or Amazon CloudSearch—but adopting one usually means complicating your operational infrastructure while also maintaining some kind of ETL process to sync your data from your application database to the search service. And as soon as your data resides in multiple places, you’ll have a whole new category of problems to worry about.

But if I’ve learned anything about data in the last decade, it’s that Postgres in particular has defied the conventional wisdom about what relational databases should and shouldn’t be used for. Its notify and listen operations combine to form a better pub/sub solution than most queuing systems. And its jsonb type can be used as a key-value store more effectively than most dedicated NoSQL platforms.

So it should come as no surprise that Postgres is incredibly competent at searching text, as well! It offers robust full-text search based on the tsvector data type and the tsquery syntax—including conveniences like websearch_to_tsquery to build queries from the sorts of strings users already type into search fields. And Postgres separately sports a performant “fuzzy” search based on the text similarity analysis included in its official pg_trgm module. And the performance of both full-text search and trigram analysis benefit from rather astonishingly-good index support.

Granted, the learning curve for all of this stuff is way, way too high. But when my naive search queries grew to be unacceptably slow, my decision was ultimately to either learn how to integrate a totally unknown-to-me dedicated search server or to dive headlong into the arcane documentation of the database I was already using. I chose the latter and ended up learning just enough about Postgres’s search capabilities to be dangerous.

Doing everything in SQL has gotta be faster, right?

In the second iteration of my search feature, I moved the entire query out of Active Record and into Postgres, on the theory that Ruby is slower than the database and if I just send everything to Postgres and then optimize each part of the query sufficiently, I’ll get the best performance possible.

And I ended up moving a lot of complexity into the database, combining 11 subqueries with UNION across two mega-queries: one reasonably minimal primary query and one more resource-intensive secondary query that was only executed if the first didn’t find any results. They combined for 167 lines of SQL. And that’s not counting the materialized view of pre-processed terms or the dozen indexes that the feature necessitated.

The result of all my efforts sprinkled over two months of evenings and weekends? The search feature was slower somehow. And because it was basically an inscrutable wall of SQL, even after painstakingly optimizing every query plan with the excellent tool pgMustard, the performance of my Ruby-free search implementation was middling, at best.

Welp.

On reflection, my issue hadn’t been that I had implemented the search with Active Record instead of raw SQL, it had been that I was running every single search term through every potential search condition. My fundamental failure was that I hadn’t appreciated the scope of my search feature’s requirements.

Appreciating the scope of a search feature’s requirements

One way to estimate the number of logical queries a search feature needs to consider (regardless of how they’re implemented) is to take the product of ❶ the number of intentions the user might have behind each term they type and ❷ the number of places the system might store data relevant to each of those intentions.

Take a simple example: suppose a user types into an address book search field. They might intend to find a contact by typing either a name or an email address. And the system might store names in either of two string columns and e-mail addresses in a third. That means there are 2 intentions and 3 places where the data might reside, or up to 6 logical queries. Suppose someone searches for "Jerry jerry@example.com". Not considering any implementation approach, that means 6 cases need to be covered:

  1. "Jerry" could be in users.name
  2. "Jerry" could be in nicknames.name
  3. "Jerry" could be in emails.address
  4. "jerry@example.com" could be in users.name
  5. "jerry@example.com" could be in nicknames.name
  6. "jerry@example.com" could be in emails.address

So I went through the same exercise for my Japanese-English dictionary and identified:

  • 8 user intentions: Latin characters representing English definitions, Latin strings of misspelled English definitions, Latin characters representing pronunciation of Japanese, Latin characters representing Japanese words that themselves contain Latin characters (like “NGワード”), Japanese characters representing Japanese kanji or words, hiragana or katakana strings representing pronunciation of Japanese, half-width characters representing either Japanese words or pronunciation, and Japanese strings of misspelled or mispronounced Japanese
  • 6 data locations: an array column of English definitions, a string column of the canonical textual representation of the Japanese kanji or word, an array column of alternate textual representations of the Japanese kanji or word, an array of Japanese pronunciations, a table of user-defined English definitions, a table of user-defined Japanese spellings

That means that there are a whopping 48 permutations my search feature needs to incorporate in order to cover every case that might be represented by a given search string. It was no small effort to cover all those cases in a “mere” 11 subqueries, in hindsight. But a single query like 待望 waiting expectantly たいぼう would still result in my mega-query having to churn through several dozen cheap-but-not-free operations over a quarter million records. As a result, it didn’t take many users at all to bring my meager Standard-0 Heroku Postgres instance to its knees.

It turned out the real thing crushing the performance of my search wasn’t the performance of any particular SQL query, it was the sheer combinatorial explosion of user intentions, data locations, and search terms I had been naively throwing over the fence for the database to deal with. If I wanted the feature to perform well, I needed to do a better job filtering which queries were appropriate for each query string by interpreting more about the user’s intention up front.

The Whereable Query Pattern

It had dawned on me that the path to better search performance would be to start doing more work up front: normalizing search input, segregating terms by potential intent, and then filtering out irrelevant subqueries. I had recently learned that trying to cram a bunch of conditionally-enabled filters inside a multi-part SQL query would be hard to pull off (I’ll save you some work: that approach usually leads to countless StackOverflow tabs full of people unhelpfully chiding that well, actually SQL is a relational language). And I separately had enough experience to know that maintaining a long PL/pgSQL function can be difficult to inspect and adjust.

So, I decided to change course and pull some of the complexity back into Ruby by leveraging Active Record’s or method in conjunction with a custom value type to represent each of the search conditions I needed to consider.

[I actually meant to implement all this months ago, but the implementation depended on my Japanese improving to the point that I could clearly understand this incredibly gracious and detailed blog post written by @ktou in response to a question I had raised about full-text Japanese search in Postgres.]

How to use or and merge

If you’ve never used or before, it will do the relational algebra necessary to reduce arbitrarily many relations into a single, minimal SQL query. A simple example:

Item.where("text like ?", "%柱%").or(
  Item.where("? = ANY(meanings)", "Pillar")
)

Upon evaluation, AR will combine those two conditions into one SQL query:

SELECT "items".* FROM "items"
WHERE (
  text like '%柱%'
  OR
  'Pillar' = ANY(meanings)
)

But our end goal in this case is to have a bunch of search conditions that can be individually enabled or disabled based on the type of search terms a user has entered. To do that, we can start down the path of building an abstraction based on a collection of unrelated search conditions.

To do that, we could store each where in a data structure and then reduce it for the same result:

wheres = [
  Item.where("text like ?", "%柱%"),
  Item.where("? = ANY(meanings)", "Pillar")
]

wheres.reduce(:or)

If there are any conditions that should be common across all subqueries (i.e. SQL AND) or any joins needed (as everything passed to or has to assume a common query structure), those can be applied to an initial relation and then the or reduction can be merged in using the merge method:

Item.where("type in ('kanji', 'vocabulary')")
  .merge(wheres.reduce(:or))

Which will produce this SQL:

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND (
  text like '%柱%'
  OR
  'Pillar' = ANY(meanings)
)

Both or and merge are handy tools to keep in mind whenever you’re interested in logically separating parts of a query or struggling to extract pieces from a large block of Active Record-dependent code.

Building a Whereable type

The purpose of this adventure is to speed up search performance by ensuring we’re only executing the search conditions that are necessary based on what we can infer from the terms provided. Most basically, that requires two pieces of information that need to travel together: ❶ a way to determine whether a particular condition is necessary, and ❷ the where clause itself. And whenever I need to tie a couple attributes together, I’ve learned to resist the urge to chuck them in an untyped hash and instead give myself the breathing room of a class or Struct with meaningful names.

So let’s start with these:

Whereable = Struct.new(
  :valid,
  :where,
  keyword_init: true
)

And then wrap our queries in it:

whereables = [
  Whereable.new(
    valid: true,
    where: Item.where("text like ?", "%柱%")
  ),
  Whereable.new(
    valid: true,
    where: Item.where("? = ANY(meanings)", "Pillar")
  )
]

results = Item.where("type in ('kanji', 'vocabulary')")
  .merge(whereables.select(&:valid).map(&:where).reduce(:or))

This refactor has now put us in a position to finally start doing the thing we set out to do: disable the query conditions that aren’t relevant. Doing this requires special knowledge of the safe assumptions that can be made about each place where your data is stored. For context and in this case, I know the text column of every single row will always contain at least one double-width Japanese character and the array of strings in the meanings column will never contain anything other than typical Latin characters.

As you might have inferred above, if we imagine a user entered 柱 Pillar into a search field, that’s why the above example would put "柱" in one query and "Pillar" into the other. But our new factoring should allow the new Whereable data structure to go to work for us here:

def search(q)
  japanese, english = q.split.partition { |s| s.match?(/\p{Han}/) }

  whereables = [
    Whereable.new(
      valid: japanese.present?,
      where: Item.where("text like ?", "%#{japanese.join}%")
    ),
    Whereable.new(
      valid: english.present?,
      where: Item.where("? = ANY(meanings)", english.join(" "))
    )
  ]

  Item.where("type in ('kanji', 'vocabulary')").merge(
    whereables.select(&:valid).map(&:where).reduce(:or)
  )
end

The conditions above have been trivialized for the sake of brevity, but hopefully this illustrates that by partitioning the terms based on whether they contain a Japanese kanji character (using the \p{Han} regex character class), the valid flag can be set so that Whereable instances are filtered out when they aren’t necessary.

The upshot is that the SQL that Active Record ultimately generates is now contingent on the search terms provided, so the database isn’t asked to search for results in more places than necessary.

For example, if called only with English, like search("To Swim"):

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND ('To Swim' = ANY(meanings))

And if called with Japanese only, such as search("幸い"):

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND (text like '%幸い%')

But if both are provided, as with search("幸い Happiness"), both queries are included:

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND (
  text like '%幸い%'
  OR
  'Happiness' = ANY(meanings)
)

Ordering results

The next major requirement of any search feature is that results be ordered in some meaningful way. Because Whereable helped us separate the search criteria, and the order of the results necessarily depends on which criteria we’re searching on, it stands to reason that the Whereable should also contain some means of ranking results, too.

We could add a rank attribute to our Whereable struct and then try something like this by updating our search method:

whereables = [
  Whereable.new(
    valid: japanese.present?,
    where: Item.where("text like ?", "%#{japanese.join}%"),
    rank: ActiveRecord::Base.sanitize_sql_for_conditions([
      "similarity(text::bytea::text, ?::bytea::text)", japanese.join
    ])
  ),
  Whereable.new(
    valid: english.present?,
    where: Item.where("? = ANY(meanings)", english.join(" ")),
    rank: ActiveRecord::Base.sanitize_sql_for_conditions([
      "similarity(array_to_string(meanings, ' '), ?)", english.join(" ")
    ])
  )
]

Item.where("type in ('kanji', 'vocabulary')").merge(
  whereables.select(&:valid).map(&:where).reduce(:or)
).order(Arel.sql(
  "greatest(#{whereables.select(&:valid).map(&:rank).join(", ")}) desc"
))

Again, like the search filters themselves, the ranking methods are contrived, but the point is each Whereable’s rank returns a SQL expression that evaluates to a number between 0 and 1 to represent the relevance of each match. In this way, ordering results based on the greatest of all query rankings can effectively compare results across all the criteria we’ve specified. However, just like with the search criteria, a ranking is only included if its correspondent search was also executed.

So the fullest query generated by the above with search("幸い Happiness") would look like:

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND (
  text like '%幸い%'
  OR 'Happiness' = ANY(meanings)
)
ORDER BY greatest(
  similarity(text::bytea::text, '幸い'::bytea::text),
  similarity(array_to_string(meanings, ' '), 'Happiness')
) desc

But a query containing only one type of search term like search("To Finish") will only include its where and rank expressions:

SELECT "items".* FROM "items"
WHERE (type in ('kanji', 'vocabulary'))
AND ('To Finish' = ANY(meanings))
ORDER BY greatest(
  similarity(array_to_string(meanings, ' '), 'To Finish')
) desc

Making it real

From here, I continued iterating on my Whereable type (which eventually graduated into a real class, with its own file listing and everything).

Once I got things working, I folded in the searching & sanitization behavior into Whereable so most queries could be expressed declaratively, like this one that searches Japanese pronunciation if any search terms could be construed as a valid reading based on a tsvector @@ tsquery search:

Whereable.new(
  valid: readings.present?,
  where: ["readings_tsvector @@ (#{readings.tsquery})", *readings.tokens],
  ranking_conditions: [
    ["(case when readings && Array[?]::character varying[] then 0.9 else 0 end)",
      readings.terms],

    ["ts_rank(readings_tsvector, (#{readings.tsquery}))", *readings.tokens]
  ]
),

Further, to support search conditions for which a preliminary query against a different table was necessary, I added a data_source proc to return an initial set of results and only include the where if its results were not empty, as in this case of user-defined definitions:

Whereable.new(
  valid: english.present?,
  data_source: -> {
    Definition
      .select(:item_id, sanitize("similarity(text, ?) rank", english))
      .where(user_id: user.id)
      .where("text % ?", english)
  },
  where: ->(definitions) {
    Item.where(id: definitions.map(&:item_id))
  },
  ranking_conditions: ->(definitions) {
    [[
      "case #{definitions.map { "when items.id = ? then ?" }.join(" ")} end",
      *definitions.flat_map { |d| [d.item_id, d.rank] }
    ]]
  }
)

If you’re interested in a more complete example of what a more fleshed-out Whereable type might look like, here’s the full source of my Whereable type.

In the end the results were pretty phenomenal. Easier to read and organize code. Significantly better and faster tests. And performance that’s so fast that twice now I’ve accidentally confused production for my local server, wondering why my code changes weren’t taking effect. Seriously, both very basic, single-term searches are often 15 to 20 times faster while longer queries that include every category of search term are still well over 10 times as fast as the previous implementation.

Search is still hard

Like so many things with software, search is an evergreen problem that’s never truly solved, because both our users’ behavior and our application’s data are moving targets that change over time. As such, a quality search feature needs to be regularly revisited, analyzed, and tuned. That’s the primary reason why I found it helpful to lean on the sort of composition afforded by Active Record’s oft-overlooked or and merge methods to generate the SQL instead of opting for a more verbose, harder-to-debug pure SQL implementation.

Every application’s search needs will differ greatly, but the Whereable design pattern may help you create some of the building blocks you need to focus on crafting one search condition at a time, to test each operation in isolation, and to save resources by culling unnecessary searches at runtime.

If this all feels harder than it should be: don’t worry, you’re not alone! If all you know is that your app’s search features need improvement and you aren’t sure where to start, feel free to email or send me a tweet or DM and it would be my pleasure to help you out.

Justin Searls

Person An icon of a human figure Status
Double Agent
Hash An icon of a hash sign Code Name
Agent 002
Location An icon of a map marker Location
Orlando, FL