Thinking Sphinx

Welcome to Thinking Sphinx version 3 – a complete rewrite from past versions. Right now it is a work-in-progress, missing many of the features in TS 2 and earlier. It’s also currently built for Rails 3.1 only.

Still interested? Well, read on!

Installation

There’s no gem (prerelease or otherwise) at this point – so you’ll need to use the git reference:

gem 'thinking-sphinx',
  :git    => 'git://github.com/freelancing-god/thinking-sphinx.git',
  :branch => 'edge',
  :ref    => 'current-commit-ref'

Usage

Indexes are no longer defined in models – they now live in `app/indices` (which you will need to create yourself). Each index should get its own file, and look something like this:

# app/indices/article_index.rb
ThinkingSphinx::Index.define :article, :with => :active_record do
  indexes title, content
  indexes user.name, :as => :user
  indexes user.articles.title, :as => :related_titles

  has published
end

You’ll notice the first argument is the model name downcased and as a symbol, and we are specifying the processor – :active_record. Everything inside the block is just like previous versions of Thinking Sphinx. Same goes for config/thinking_sphinx.yml (formerly config/sphinx.yml).

Other changes:

  • SphinxQL is now used instead of the old socket connections (hence the dependency on the mysql2 gem).
  • You’ll need to include ThinkingSphinx::Scopes into your models if you want to use Sphinx scopes.
  • The match mode is always extended – SphinxQL doesn’t know any other way.
  • ActiveRecord::Base.set_sphinx_primary_key has been replaced by the :primary_key option in the index definition (alongside :with in the above example), so it is no longer inherited between models.
  • If you’re explicitly setting a time attribute’s type, instead of :datetime it should now be :timestamp.
  • Delta arguments are passed in as an option of the define call, not within the block:
ThinkingSphinx::Index.define :article, :with => :active_record, :delta => true do
  # ...
end
  • Suspended deltas are no longer called from the model, but like so instead:
ThinkingSphinx::Deltas.suspend :article do
  article.update_attributes(:title => 'pancakes')
end
  • Excerpts through search results behave the same way as in previous versions. Excerpt options (like :before_match, :after_match and :chunk_separator) can be passed through when searching under the :excerpts option:
ThinkingSphinx.search 'foo',
  :excerpts => {:chunk_separator => ' -- '}

If you want to excerpt any string without going through an object, you can use the search result set’s excerpter object:

results = ThinkingSphinx.search 'foo'
results.excerpter.excerpt!('The Etymology of Foo')
  • When indexing models that use single-table inheritance (STI), make sure you have a database index on the type column. Thinking Sphinx needs to determine which subclasses are available, and we can’t rely on Rails having loaded all models at any given point, so it queries the database.
  • The option :rank_mode has now become :ranker – and the options (as strings or symbols) are as follows: proximity_bm25, bm25, none, wordcount, proximity, matchany, and fieldmask.
  • There are no explicit sorting modes – all sorting must be on attributes followed by ASC or DESC. For example: :order => 'weight DESC, created_at ASC'.
  • If you specify just an attribute name as a symbol for the :order option, it will be given the ascending direction by default. So, :order => :created_at is equivalent to :order => 'created_at ASC'.
  • If you want to use a calculated expression for sorting, you must specify the expression as a new attribute, then use that attribute in your :order option. This is done using the :select option to specify extra columns available in the underlying SphinxQL (not ActiveRecord/SQL) query.
ThinkingSphinx.search(
  :select => '@weight * 10 + document_boost as custom_weight',
  :order  => :custom_weight
)
  • Support for latitude and longitude attributes named something other than ‘lat’ and ‘lng’ or ‘latitude’ and ‘longitude’ has been removed. May add it back in if requested, but would be surprised if it’s a necessary feature.
  • Set INDEX_ONLY to true in your shell for the index task to re-index without regenerating the configuration file.
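
The renamed and reshaped search options in the list above can be summarised in a small sketch. The `upgrade_search_options` helper and the `RANKERS` constant below are hypothetical names made up for illustration and are not part of Thinking Sphinx. The helper only mirrors the rules stated in this README: :rank_mode becomes :ranker, the ranker must be one of the listed values, and a bare symbol for :order means ascending.

```ruby
# Rankers listed in this README for the :ranker option.
RANKERS = %w[proximity_bm25 bm25 none wordcount proximity matchany fieldmask].freeze

# Hypothetical helper (not part of Thinking Sphinx): takes a hash of search
# options written in the older style and returns a copy in the style
# described above.
def upgrade_search_options(options)
  upgraded = options.dup

  # :rank_mode has been renamed to :ranker.
  upgraded[:ranker] = upgraded.delete(:rank_mode) if upgraded.key?(:rank_mode)

  # Rankers may be strings or symbols, but must be one of the listed names.
  if upgraded.key?(:ranker)
    ranker = upgraded[:ranker].to_s
    raise ArgumentError, "unknown ranker: #{ranker}" unless RANKERS.include?(ranker)
  end

  # A bare symbol for :order is equivalent to that attribute ascending.
  upgraded[:order] = "#{upgraded[:order]} ASC" if upgraded[:order].is_a?(Symbol)

  upgraded
end
```

Thinking Sphinx already treats a symbol :order as ascending, so the last step is only there to make the equivalence visible. The resulting hash could then be passed on, for example as ThinkingSphinx.search 'foo', upgrade_search_options(:rank_mode => :bm25, :order => :created_at).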

Limitations

Basic indexing and searching should be fine. There are currently no facets or excerpts, and delta support is limited. Many settings haven’t yet been brought across, and many of the smaller features don’t yet exist either. Some may actually not return… we’ll see.

A list of what still needs to be implemented, in no particular order:

  • Facets
  • Bundled Searches
  • Delayed Deltas
  • Datetime Deltas
  • Attribute Updates
  • JRuby support
  • Abstract Inheritance support (maybe – not sure this is something many people want).
  • Rails 3.0 support
  • Searching from association contexts
  • Searching by specific indices
  • Matching many values on a single MVA attribute (:with_all)
  • Matching none of many values on a single MVA attribute (:without_any)
  • Sinatra support
  • Grouped search results
  • Group enumerators (with group, with count, with both)
  • File fields
  • ActiveRecord :include and :select options being passed through on searches
  • Test Helpers
  • Overwritable toggle_delta? method on model
  • Default Sphinx scopes
  • Relaxed error handling for deletions and delta calls
  • 64-bit integer detection
  • 64-bit integer attribute consistency
  • Handling of NULLs in group concatenation
  • Timezone support
  • Using :through association shortcuts in index definitions
  • Wordcount fields and attributes
  • Manual MVA type declarations
  • Timestamp MVAs
  • Query and Ranged Query sources for Attributes
  • Namespaced models support
  • Capistrano Tasks
  • Infixing and Prefixing of specific fields
  • Multiple sources for an index
  • sanitise_sql method in index definitions
  • Bitmask weighting helper
  • Query times for searches
  • Starred queries

Sphinx Versions

TS 3 is built for Sphinx 2.x only. You cannot use 1.10-beta, 0.9.9 or anything earlier than that.
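
If you want to guard against an unsupported daemon in your own setup scripts, a check along these lines might help. This is a minimal sketch: `supported_sphinx_version?` is a hypothetical helper, not something Thinking Sphinx provides, and how you obtain the version string from your Sphinx installation is left out.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): returns true only when
# the given Sphinx version string belongs to the 2.x series.
def supported_sphinx_version?(version)
  # Capture the digits before the first dot, e.g. "2" from "2.0.4".
  major = version.to_s[/\A(\d+)\./, 1]
  !major.nil? && major.to_i == 2
end

supported_sphinx_version?('2.0.4')     # 2.x release
supported_sphinx_version?('1.10-beta') # too old
supported_sphinx_version?('0.9.9')     # too old
```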

Rails Versions

Currently TS 3 is built to support Rails 3.1. Support for Rails 3.0 will almost certainly be added – not promising anything for Rails 2.3, and anything earlier than that definitely won’t be supported.

Sinatra is not yet supported, but it will be. Merb support has been discontinued.

Ruby Versions

Built on MRI 1.9.3, and tested on MRI 1.8.7, MRI 1.9.2, Rubinius and REE. Will never support 1.8.6, but will hopefully support JRuby (the one catch there is the different MySQL interfaces).

Database Versions

MySQL 5.x and Postgres 8.4 or better are supported.

Contributing

You’re brave! To contribute, clone this repository and have a good look through the specs – you’ll notice the distinction between acceptance tests that actually use Sphinx and go through the full stack, and unit tests (everything else) which use liberal test doubles to ensure they’re only testing the behaviour of the class in question. I’ve found this leads to far better code design.
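
To illustrate the unit-test side of that distinction, here is a rough sketch of a hand-rolled test double. It is written in plain Ruby so it runs without any test framework or a Sphinx daemon. `ExcerptFormatter` and `FakeExcerpter` are hypothetical classes invented for this example and do not exist in the project; only the `excerpt!` method name is borrowed from the excerpter shown earlier.

```ruby
# Hypothetical class under test: it asks its collaborator for an excerpt
# and tidies up the surrounding whitespace.
class ExcerptFormatter
  def initialize(excerpter)
    @excerpter = excerpter
  end

  def call(text)
    @excerpter.excerpt!(text).strip
  end
end

# Hand-rolled double standing in for a real excerpter: it records what it
# was asked to excerpt and returns a canned string, so Sphinx is never hit.
class FakeExcerpter
  attr_reader :received

  def initialize(canned)
    @canned   = canned
    @received = []
  end

  def excerpt!(text)
    @received << text
    @canned
  end
end

excerpter = FakeExcerpter.new('  The Etymology of <b>Foo</b>  ')
output    = ExcerptFormatter.new(excerpter).call('The Etymology of Foo')
```

Because the double is the only collaborator, a failure here can only come from `ExcerptFormatter` itself, which is the point of keeping unit tests separate from the acceptance tests that exercise the full stack.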

If you’re still interested in helping evolve this, then write the tests and then the code to get them passing, and send through a pull request. No promises on merging anything, but we shall see!

For some ideas behind my current approach, have a look through sketchpad.rb in the root of this project. If you can make sense of that, you’re doing very well indeed.

Licence

Copyright © 2011, Thinking Sphinx is developed and maintained by Pat Allan, and is released under the open MIT Licence. Many thanks to all who have contributed patches.
