RediSearch: FullText Search and Secondary Index module for Redis

Andreas.Motl · April 19, 2019, 6:31pm

Overview

RediSearch implements a search engine on top of Redis, but unlike other Redis search libraries, it does not use internal data structures like Sorted Sets.

Inverted indexes are stored as a special compressed data type that allows fast indexing and searching with a low memory footprint.

This also enables more advanced features, like exact phrase matching and numeric filtering for text queries, that are not possible or efficient with traditional Redis search approaches.
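To make the idea concrete, here is a minimal sketch of a positional inverted index in plain Python. It is only an illustration of the general technique, not RediSearch's actual compressed data type; the function names `build_index` and `phrase_search` are made up for this example. Storing term positions is what makes exact phrase matching possible.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} (illustrative, uncompressed)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents that contain the terms of `phrase` consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = set()
    for doc_id in candidates:
        # A match starts at position p if term i occurs at p + i for every i.
        for p in index[terms[0]][doc_id]:
            if all(p + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


docs = {
    "doc1": "RediSearch implements a search engine on top of Redis",
    "doc2": "an engine for search on sorted sets",
}
index = build_index(docs)
print(phrase_search(index, "search engine"))  # {'doc1'}
```

Both documents contain "search" and "engine", but only doc1 has them next to each other in that order, so only doc1 matches the phrase.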

Features

Full-Text indexing of multiple fields in documents.
Incremental indexing without performance loss.
Document ranking (provided manually by the user at index time).
Complex boolean queries with AND, OR, NOT operators between sub-queries.
Optional query clauses.
Prefix-based searches.
Field weights.
Auto-complete suggestions (with fuzzy prefix suggestions).
Exact phrase search, slop-based search.
Stemming-based query expansion in many languages (using Snowball).
Support for logographic (Chinese, etc.) tokenization and querying (using Friso).
Support for custom functions for query expansion and scoring (see Extensions).
Limiting searches to specific document fields (up to 128 fields supported).
Numeric filters and ranges.
Geographical search utilizing Redis’ own Geo commands.
A powerful aggregations engine.
Unicode support (UTF-8 input required).
Retrieve full document content or just ids.
Automatically index existing HASH keys as documents.
Document deletion and updating with index garbage collection.
Partial and conditional document updates.
Sortable properties (e.g. sorting users by age or name).
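Several of the features above map onto RediSearch's query string syntax. The snippet below lists a few example query strings as I understand them from the RediSearch query syntax documentation; the field names (`title`, `price`) are hypothetical, and details may differ between RediSearch versions, so treat it as a rough cheat sheet rather than a reference.

```python
# Example RediSearch query strings, keyed by the feature they illustrate.
# Field names (title, price) are hypothetical and must exist in your schema.
EXAMPLE_QUERIES = {
    "AND (implicit intersection)": "search engine",
    "OR (union)": "search|engine",
    "NOT (negation)": "search -engine",
    "optional clause": "search ~engine",
    "prefix": "redi*",
    "exact phrase": '"search engine"',
    "limit to one field": "@title:redisearch",
    "numeric range": "@price:[10 100]",
}

for feature, query in EXAMPLE_QUERIES.items():
    print(f"{feature:30} {query}")
```

Any of these strings can be passed as the query text to a search call such as the Python client's `client.search(...)`.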

Sources

Andreas.Motl · April 19, 2019, 6:31pm

RediSearch Python Client

Usage

from redisearch import Client, TextField, Query

# Creating a client with a given index name
# (assumes a Redis server with the RediSearch module on localhost:6379, the client's defaults)
client = Client('myIndex')

# Creating the index definition and schema
client.create_index((TextField('title', weight=5.0), TextField('body')))

# Indexing a document
client.add_document('doc1', title='RediSearch', body='RediSearch implements a search engine on top of Redis')

# Simple search
res = client.search("search engine")

# Searching with snippets
res = client.search("search engine", snippet_sizes={'body': 50})

# Searching with complex parameters:
q = Query("search engine").verbatim().no_content().paging(0, 5)
res = client.search(q)

# The result has the total number of results and a list of documents
print(res.total)  # 1
# no_content() returns document ids only, so print the id rather than a field
print(res.docs[0].id)

Andreas.Motl · April 19, 2019, 6:41pm

Elasticsearch is a great feature-rich search product created by the great people at Elastic.co, but when it comes to performance, it has inherent architectural deficiencies compared with RediSearch, as can be seen in the following table.

– Search Benchmarking: RediSearch vs. Elasticsearch | Redis Labs

Component	RediSearch	Elasticsearch
Search engine	Dedicated engine based on modern and optimized data structures	20-year-old Lucene engine
Programming language	C-based, extremely optimized	Java
Memory technology	Runs natively on DRAM and Persistent Memory	Disk-based with a caching option
Protocol	The optimized RESP (REdis Serialization Protocol)	HTTP
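For readers unfamiliar with RESP, the following sketch shows how a client would frame a command on the wire: as an array of bulk strings, each prefixed with its byte length. The helper name `encode_resp_command` is made up for this illustration, and the index name and query are just the ones from the client example above; real clients such as redis-py do this for you.

```python
def encode_resp_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    # "*<n>\r\n" announces an array with n elements.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # "$<len>\r\n<bytes>\r\n" is one length-prefixed bulk string.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


wire = encode_resp_command("FT.SEARCH", "myIndex", "search engine")
print(wire)
# b'*3\r\n$9\r\nFT.SEARCH\r\n$7\r\nmyIndex\r\n$13\r\nsearch engine\r\n'
```

Because every element carries its length up front, the server can read arguments without scanning or parsing their contents, which keeps per-request overhead small compared with a text-heavy protocol such as HTTP.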