Running Sunspot and Solr with Rails 4 in Production

I recently added full text search capabilities to one of my projects and I decided to use Apache Solr and the Sunspot gem for Rails. Sunspot is awesome and ridiculously easy to integrate with Rails in a development setting, and it comes with a built-in Solr binary that is very useful for development and testing. However, I had some trouble finding good examples of how to deploy it to production with a stand-alone Solr deployment on Tomcat. I ultimately figured it out, and here is what I had to do.

First, a few links to related documents that helped me on my way:

Installing Lucene/Solr on CentOS 6

Install Solr 4.4 with Jetty in CentOS, and set up Solr server to work with Sunspot Gem

Install & Configure Solr on Ubuntu, the quickest way

If you follow the approach in any of these docs, you end up with Solr deployed on Tomcat with a default configuration for both the schema and the cores. I used Solr 4.9.0 on Tomcat 7.0.54, but you should be able to use whatever combination your Linux distro’s package manager provides. The default configuration is nice for sanity testing, but it must be customized for Sunspot before anything will actually work from Rails.

These are the four steps I used to customize a default configuration Solr deployment for Sunspot:

1) Update the directory structure within the Solr home directory

The default directory structure of the Solr home directory looks like this:

4 drwxr-xr-x. 2 tomcat 4096 Jun 27 14:59 bin/
4 drwxr-xr-x. 4 tomcat 4096 Jul 22 06:42 collection1/
4 -rw-r--r--. 1 tomcat 2473 Jun 27 14:59 README.txt
4 -rw-r--r--. 1 tomcat  446 Jul 22 06:46 solr.xml
4 -rw-r--r--. 1 tomcat  501 Jun 27 14:59 zoo.cfg

As this stands, there is a single SolrCore configured as collection1, with config files residing within the collection1/conf/ directory. Sunspot, by default (in config/sunspot.yml in Rails), will look for SolrCores named after its development, test, and production modes. And we also want a single conf/ directory for all three cores. So we need to modify the directory structure to look like this:

4 drwxr-xr-x. 2 tomcat 4096 Jun 27 14:59 bin/
4 drwxr-xr-x. 6 tomcat 4096 Jul 22 06:43 conf/
4 drwxr-xr-x. 3 tomcat 4096 Jul 22 06:44 development/
4 drwxr-xr-x. 3 tomcat 4096 Jul 22 06:46 production/
4 -rw-r--r--. 1 tomcat 2473 Jun 27 14:59 README.txt
4 -rw-r--r--. 1 tomcat  446 Jul 22 06:46 solr.xml
4 drwxr-xr-x. 3 tomcat 4096 Jul 22 06:44 test/
4 -rw-r--r--. 1 tomcat  501 Jun 27 14:59 zoo.cfg

The new top-level conf/ directory contains everything previously in collection1/conf (i.e. “mv collection1/conf .”), which we will soon customize. The new directories development, test, and production are empty for now, but Solr will populate them when it restarts. Note that, strictly speaking, you only need to add the production directory to support Rails in production mode, but I added directories for development and test just in case I ever need to test things against the production Solr instance.
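The directory changes above can be sketched as a small shell helper. This is only an illustration: the function name restructure_solr_home is made up for this example, and the Solr home path is whatever your install uses.

```shell
# Sketch only: reshape a default Solr home into the layout described above.
# restructure_solr_home is a hypothetical helper name, not part of Solr or Sunspot.
# Usage: restructure_solr_home /path/to/solr/home
restructure_solr_home() {
  solr_home="$1"

  # Only proceed from the default layout, and never clobber an existing conf/.
  if [ ! -d "$solr_home/collection1/conf" ] || [ -e "$solr_home/conf" ]; then
    echo "unexpected layout under $solr_home" >&2
    return 1
  fi

  # Same effect as running "mv collection1/conf ." inside the Solr home.
  mv "$solr_home/collection1/conf" "$solr_home/conf" || return 1

  # Empty per-environment directories; Solr populates them on restart.
  mkdir -p "$solr_home/development" "$solr_home/test" "$solr_home/production"
}
```

The helper leaves what remains of collection1/ in place; the listing above no longer shows it, so remove it by hand once you are happy with the result. The new directories should end up owned by the same user as the rest of the Solr home (tomcat in the listings here).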

2) Configure the custom Solr schema for Sunspot in schema.xml
3) Configure the custom Solr config for Sunspot in solrconfig.xml

Steps 2 and 3 customize Solr for Sunspot and ActiveModel. You will need to find customized versions of these config files and place them in the new top-level conf/ directory. If you are developing your Rails app with Sunspot in development mode with its built-in binary, then you should see a solr/ directory in your Rails development directory and you can find Sunspot’s schema.xml and solrconfig.xml in solr/conf/. You can also grab these directly from the Sunspot repository on GitHub.

Either way, overwrite the existing versions of schema.xml and solrconfig.xml with the customized versions.
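As a rough sketch, copying the two files from a Rails app’s solr/conf/ directory could look like the following. The function name install_sunspot_conf is hypothetical, and keeping a .orig copy of each stock file is my own addition, not something Sunspot or Solr requires.

```shell
# Sketch only: copy Sunspot's schema.xml and solrconfig.xml over the defaults.
# install_sunspot_conf is a hypothetical helper name.
# Usage: install_sunspot_conf /path/to/rails/app /path/to/solr/home
install_sunspot_conf() {
  rails_root="$1"
  solr_home="$2"

  for f in schema.xml solrconfig.xml; do
    src="$rails_root/solr/conf/$f"
    if [ ! -f "$src" ]; then
      echo "missing $src" >&2
      return 1
    fi
    # Keep the stock file around in case the customized one needs comparing.
    if [ -f "$solr_home/conf/$f" ]; then
      cp "$solr_home/conf/$f" "$solr_home/conf/$f.orig" || return 1
    fi
    cp "$src" "$solr_home/conf/$f" || return 1
  done
}
```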

4) Configure the SolrCores that Sunspot expects to see for Rails in production mode in solr.xml

This last step was the one I didn’t find in any examples and this took the longest to figure out. By default, the solr.xml file in the top-level Solr home directory looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}">
    <core name="collection1"     instanceDir="." dataDir="default/data"/>
  </cores>
</solr>

In some distributions of Solr this file is omitted entirely, and the comments in this file state that in the absence of this file the default configuration will silently be the same (a single core named collection1) but without an explicit file-based definition. Fun.

We want to eliminate the unneeded default collection1 core definition and replace it with definitions for our production, test, and development cores. Note again that you can skip the test and development cores if you want. I want cores for test and development, so I updated solr.xml to this:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}">
    <core name="development" instanceDir="." dataDir="development/data"/>
    <core name="test"        instanceDir="." dataDir="test/data"/>
    <core name="production"  instanceDir="." dataDir="production/data"/>
  </cores>
</solr>

And finally, after all this, you should be able to restart Tomcat, and Sunspot and Rails should then be able to operate against this stand-alone instance of Solr. You can check that the cores are configured correctly by looking at the Solr Admin UI at http://[your Tomcat host]:[your Tomcat port]/solr/#/~cores/. You should see your configured cores in the list on the left.
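If you prefer the command line to the Admin UI, a quick probe of each core over HTTP might look like this. It is a sketch under assumptions: the helper names are made up, it relies on curl being installed, and it assumes the /select handler is enabled in your solrconfig.xml.

```shell
# Sketch only: probe each core over HTTP. solr_core_url and check_solr_cores
# are hypothetical helper names; the /select handler is assumed to be enabled.
solr_core_url() {
  # $1 = Tomcat host, $2 = Tomcat port, $3 = core name
  printf 'http://%s:%s/solr/%s/select?q=*:*&rows=0\n' "$1" "$2" "$3"
}

check_solr_cores() {
  host="$1"
  port="$2"
  failed=0
  for core in development test production; do
    # -f makes curl exit non-zero on HTTP errors such as 404.
    if curl -fsS -o /dev/null "$(solr_core_url "$host" "$port" "$core")"; then
      echo "ok: $core"
    else
      echo "not reachable: $core" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example, check_solr_cores localhost 8080 would probe all three cores on a Tomcat listening on its default port. Trim the core list if you only configured production.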

Two final pieces of advice:

- Make sure your Rails config/sunspot.yml port definitions match the configured TCP port of Tomcat. Sunspot defaults to port 8983 for production (and 8982 for development and 8981 for test), so be sure to either configure Tomcat to listen on these ports or change the port definitions in sunspot.yml to match Tomcat’s TCP listen port (which defaults to 8080).

- If you see HTTP 404 errors from Tomcat in your Rails log that mention “The requested resource is not available”, then your production SolrCore is probably not configured correctly. Check the URI that Rails is hitting on Tomcat and see if it matches your SolrCore config.
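To make the port and path advice concrete, here is a small sketch that prints a production stanza for config/sunspot.yml. The function name is hypothetical, and the host and port you pass are whatever your own Tomcat uses. The path value assumes the core is named production, as in the solr.xml above.

```shell
# Sketch only: print a production stanza for config/sunspot.yml.
# sunspot_production_stanza is a hypothetical helper name.
# $1 = Solr/Tomcat host, $2 = Tomcat port
sunspot_production_stanza() {
  cat <<EOF
production:
  solr:
    hostname: $1
    port: $2
    path: /solr/production
EOF
}
```

Running sunspot_production_stanza localhost 8080 prints a stanza pointing Rails at Tomcat’s default port. Merge it into your existing sunspot.yml by hand rather than overwriting the file.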

Hope this helps.

2 thoughts on “Running Sunspot and Solr with Rails 4 in Production”

  1. Hi,
    I followed the above steps and successfully got three cores.
    But what command do I use to reindex?
    In development I was using the "sunspot_solr" gem and starting/stopping/reindexing using rake sunspot:solr:[start/stop/reindex].
    But it doesn't work in production.
    If I move the gem out of the development group and run RAILS_ENV=production bundle exec rake sunspot:solr:start, it just starts an embedded Solr on the same port and uses that, not the core I just made.
    Any help ?
