Installing and Configuring Sphinx 2.2.11 (Yii with delta Indexes) on Ubuntu / Debian and RHEL / CentOS 7

A lot changed between Sphinx search version 1 and version 2. Queries on the command line no longer work through the search utility, many directives were removed from the configuration file, and more. Let's sort it out.

Setup Sphinx 2.2.11 on Ubuntu / Debian and RHEL / CentOS 7

First we need to install the package:

Ubuntu/Debian:
aptitude install sphinxsearch

RHEL/CentOS:
yum install -y postgresql-libs unixODBC
wget http://sphinxsearch.com/files/sphinx-2.2.11-1.rhel7.x86_64.rpm
yum install sphinx-2.2.11-1.rhel7.x86_64.rpm

Configuring Sphinx 2.2.11 on Ubuntu / Debian and RHEL / CentOS 7

Ubuntu/Debian:
nano /etc/sphinxsearch/sphinx.conf

RHEL/CentOS:
vi /etc/sphinx/sphinx.conf

The Sphinx configuration file consists of blocks: source, index, indexer and searchd.
Here is an example configuration with delta indexes:

# base source with the DB connection, inherited by the other sources
source dbconnect
{
	type			= mysql
	sql_host		= localhost
	sql_user		= your_db_user
	sql_pass		= your_db_pass
	sql_db			= your_db_name
	sql_port		= 3306
}

# --------- products --------- #
# sources products
source sphinx_source_products : dbconnect
{
        sql_query_pre		= SET NAMES utf8mb4
	sql_query_pre = \
	        update sphinx_delta_counter \
	        set last_post_id = (select max(id) from tb_products) \
	        where index_name = 'sphinx_index_products';
	sql_query = \
		SELECT id, title, descr \
		FROM tb_products \
		WHERE status = 1 # or any other condition you need
#	sql_attr_uint = id
	sql_field_string = title
	sql_field_string = descr
}
source sphinx_source_products_delta : dbconnect
{
        sql_query_pre		= SET NAMES utf8mb4
        sql_query = \
		SELECT id, title, descr \
		FROM tb_products \
		WHERE status = 1 AND id > (select last_post_id from sphinx_delta_counter where index_name = 'sphinx_index_products');
#	sql_attr_uint = id
	sql_field_string = title
	sql_field_string = descr
}
# indexes products
index sphinx_index_products
{
	source			= sphinx_source_products
	path			= /home-path-to-web-dir/example.com/sphinx_data/sphinx_products
	dict			= keywords
	morphology		= stem_ru, stem_en
	min_word_len		= 2
	docinfo			= extern
}
index sphinx_index_products_delta
{
        source                  = sphinx_source_products_delta
        path                    = /home-path-to-web-dir/example.com/sphinx_data/sphinx_products_delta
        dict                    = keywords
        morphology              = stem_ru, stem_en
        min_word_len            = 2
        docinfo                 = extern
}
# --------- / products --------- #

# add blocks for any other tables you need,
# following the same pattern as the section above

indexer
{
	mem_limit		= 32M
}

searchd
{
	listen			= 9312
	listen			= 9306:mysql41
	log			= /var/log/sphinx/searchd.log
	query_log		= /var/log/sphinx/query.log
	read_timeout		= 5
	max_children		= 30
	pid_file		= /var/run/sphinx/searchd.pid
#	max_matches		= 1000
	seamless_rotate		= 1
	preopen_indexes		= 1
	unlink_old		= 1
	workers			= threads # for RT to work
	binlog_path		= 
#	binlog_path		= /var/log/sphinx
 # on Ubuntu use the /var/log/sphinxsearch/ and /var/run/sphinxsearch/ paths
}

Replace “your_db_user”, “your_db_pass”, “your_db_name” and “/home-path-to-web-dir/example.com/” with your own values. Create a folder for storing the indexes, for example “sphinx_data”. Adapt sql_query and sql_query_pre to your database: decide what the search query should return, then pick the columns to search and assign each one to sql_field_string (several columns mean several sql_field_string lines).

Sphinx delta index. Indexing big data.

We use delta indexes to avoid re-indexing the entire database during the day (useful if you have a large MySQL database). Simply create the table “sphinx_delta_counter” in the MySQL database with the columns “index_name” and “last_post_id”. The id of the last indexed row is written there (see the sql_query_pre query in the config). After that, both indexes need to be included in the search query.
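The counter table could be created roughly like this. It is a minimal sketch: only the table name and the column names index_name and last_post_id come from the config; the column types, the primary key and the file name sphinx_delta_counter.sql are assumptions.

```shell
#!/bin/sh
# Sketch: generate the schema for the delta counter table used by sql_query_pre.
# Assumption: column types and the primary key are a guess; adjust to your schema.
cat > sphinx_delta_counter.sql <<'SQL'
CREATE TABLE IF NOT EXISTS sphinx_delta_counter (
    index_name   VARCHAR(64)  NOT NULL PRIMARY KEY,
    last_post_id INT UNSIGNED NOT NULL DEFAULT 0
);
-- One row per main index: the UPDATE in sql_query_pre only changes an existing row.
INSERT IGNORE INTO sphinx_delta_counter (index_name, last_post_id)
VALUES ('sphinx_index_products', 0);
SQL

# Load it into the database (placeholders as in the config):
# mysql -u your_db_user -p your_db_name < sphinx_delta_counter.sql
```

The seed row matters: without it the UPDATE in sql_query_pre matches nothing and the delta query has no last_post_id to compare against.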

To work with Russian (or other non-Latin languages), set the connection encoding with SET NAMES utf8 (or SET NAMES utf8mb4) and enable the matching morphology, here stem_ru. The line “sql_query_pre = SET NAMES utf8mb4” must be written in each source separately; if you put it only in the common connection source, it does not work. I spent a lot of time on this issue: Sphinx did not index Russian words.
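Since a missing SET NAMES line is easy to overlook, a quick check of the config can help. The following is only a sketch: sources_missing_set_names is a hypothetical helper (not part of Sphinx), and it assumes each source block starts with a line beginning with "source" and ends with a "}" at the start of a line, as in the example config.

```shell
#!/bin/sh
# Sketch: list sources that define sql_query but lack "sql_query_pre = SET NAMES ...".
# sources_missing_set_names is a hypothetical helper, not part of Sphinx.
sources_missing_set_names() {
    awk '
        # A new source block: remember its name and reset the flags.
        /^source[ \t]/ { name = $2; in_src = 1; has_query = 0; has_names = 0; next }
        # "sql_query =" (does not match sql_query_pre, which continues with "_pre").
        in_src && /sql_query[ \t]*=/ { has_query = 1 }
        in_src && /sql_query_pre[ \t]*=[ \t]*SET NAMES/ { has_names = 1 }
        # End of block: report sources that query data without setting the encoding.
        in_src && /^[}]/ { if (has_query && !has_names) print name; in_src = 0 }
    ' "$1"
}

# Usage (the path differs per distribution):
# sources_missing_set_names /etc/sphinx/sphinx.conf
```

A base source that only holds connection settings has no sql_query, so it is not reported.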

Configuring Sphinx for Yii2

  1. Install the Sphinx extension for Yii: yii2-sphinx.
  2. Edit config/main.php (“components” section). The port must be the one searchd listens on with the mysql41 protocol (9306 in the config above), not the native API port 9312:
    'sphinx' => [
    'class' => 'yii\sphinx\Connection',
    'dsn' => 'mysql:host=127.0.0.1;port=9306;',
    'username' => '', # there is usually no need to specify anything
    'password' => '', # there is usually no need to specify anything
    ],
  3. Full-text search with morphology and other goodies:
    use yii\sphinx\Query;
    $query_search = new Query();
    # 'siteSearch' is the index name, $q is the search phrase entered by the user
    $search_result = $query_search->from('siteSearch')->match($q)->all();

Indexing, starting and checking Sphinx

Once the configuration file is ready, index all the sources that we created:
indexer --all

Then add entries to cron (as you can see, the full index is rebuilt once a day and the delta every 5 minutes; on Ubuntu / Debian the config path is /etc/sphinxsearch/sphinx.conf):
crontab -e
0 1 * * * /usr/bin/indexer --config /etc/sphinx/sphinx.conf sphinx_index_products --rotate > /dev/null
*/5 * * * * /usr/bin/indexer --config /etc/sphinx/sphinx.conf sphinx_index_products_delta --rotate > /dev/null
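Because the config lives in a different place on each distribution, the cron job can go through a small wrapper that finds it first. This is a sketch under assumptions: first_existing and reindex are hypothetical helper names, and the script location in the crontab comment is invented for illustration; the two config paths and the indexer options are the ones used in this post.

```shell
#!/bin/sh
# Sketch of a cron wrapper: find the config on either distribution, then reindex.
# first_existing and reindex are hypothetical helpers, not part of Sphinx.
first_existing() {
    # Print the first argument that is an existing file; fail if none is.
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

reindex() {
    conf=$(first_existing /etc/sphinx/sphinx.conf /etc/sphinxsearch/sphinx.conf) || {
        echo "sphinx.conf not found" >&2
        return 1
    }
    # --rotate lets the running searchd pick up the new index files.
    indexer --config "$conf" "$1" --rotate > /dev/null
}

# In crontab (the script path is an assumption):
# */5 * * * * /usr/local/bin/sphinx-reindex.sh sphinx_index_products_delta
if [ $# -gt 0 ]; then
    reindex "$1"
fi
```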

Before starting the service, it is useful to make the “sphinxsearch” (or “sphinx”) user the owner of the folder with the indexes:
chown sphinxsearch: /home-path-to-web-dir/example.com/sphinx_data

Start the Sphinx service with the command:
systemctl start searchd

On Ubuntu (or Debian) we may get this message:
Failed to start searchd.service: Unit searchd.service not found.
In this case, try the other command:
systemctl start sphinxsearch
We can also start Sphinx with an alternative command:
/etc/init.d/sphinxsearch start
It may also be necessary to set “START=yes” (instead of “START=no”) in the config file /etc/default/sphinxsearch
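The START switch can be flipped from the shell instead of an editor. A minimal sketch, assuming the file contains a line that is exactly START=no; enable_sphinx_start is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: switch START=no to START=yes in the Debian/Ubuntu defaults file.
# enable_sphinx_start is a hypothetical helper; it edits the file given as $1.
enable_sphinx_start() {
    # Keep a .bak copy, then rewrite only a line that is exactly START=no.
    sed -i.bak 's/^START=no$/START=yes/' "$1"
}

# Usage (as root):
# enable_sphinx_start /etc/default/sphinxsearch && systemctl start sphinxsearch
```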

To verify, we can use this command:
systemctl status searchd
or
systemctl status sphinxsearch

To make sure that the service is running and listening on the configured ports:
ps aux | grep search
lsof -i tcp:9306
lsof -i tcp:9312

In the previous post (about Sphinx 1.x) I wrote that the search could be checked from the console:
search -i some_index_name "search word"

This utility has now been removed, but it is still possible to check whether indexing went well and whether the necessary words made it into the index.

Instead of searching in the console, we can use the following command:
indextool --dumphitlist some_index_name "search word" | more

In addition, you can check the search using a MySQL client, as suggested by the developer of Sphinx:

# To query search daemon using MySQL client:
mysql -h 0 -P 9306
mysql> SELECT * FROM test1 WHERE MATCH('test');
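With delta indexes, the same check should cover both the main and the delta index. In SphinxQL a comma in FROM enumerates the indexes to search. The sketch below only builds the query text; build_match_query is a hypothetical helper, the index names are the ones from the example config, and it assumes the search word contains no quotes or other special characters.

```shell
#!/bin/sh
# Sketch: build a SphinxQL query that searches the main and delta index together.
# build_match_query is a hypothetical helper; index names come from the example config.
# Assumption: the search word contains no quotes or other special characters.
build_match_query() {
    printf "SELECT * FROM sphinx_index_products, sphinx_index_products_delta WHERE MATCH('%s');\n" "$1"
}

# Usage against a running searchd (same host and port as in the example above):
# mysql -h 0 -P 9306 -e "$(build_match_query test)"
```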

Issues when starting the Sphinx service

Any errors about deprecated directives are a legacy of previous versions: comment out or delete the parameters they mention.