RediSearch: Full-Text Search and Secondary Index module for Redis

Overview

RediSearch implements a search engine on top of Redis, but unlike other Redis search libraries, it does not use Redis’ built-in data structures such as Sorted Sets.

Inverted indexes are stored as a special compressed data type that allows for fast indexing and searching with a low memory footprint.

This also enables more advanced features, like exact phrase matching and numeric filtering for text queries, that are not possible or efficient with traditional Redis search approaches.
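To illustrate why a dedicated inverted index makes exact phrase matching practical, here is a minimal sketch of a positional inverted index in plain Python. It is not RediSearch's actual data structure (which is a compressed C type); the class `PositionalIndex`, the helper `tokenize`, and the sample documents are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class PositionalIndex:
    """Toy inverted index mapping term -> {doc_id: [positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Record every position at which each term occurs in the document.
        for pos, term in enumerate(tokenize(text)):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def search_phrase(self, phrase):
        """Return sorted ids of documents containing the phrase terms consecutively."""
        terms = tokenize(phrase)
        if not terms or any(t not in self.postings for t in terms):
            return []
        # Candidate documents must contain every term of the phrase.
        candidates = set(self.postings[terms[0]])
        for t in terms[1:]:
            candidates &= set(self.postings[t])
        hits = []
        for doc_id in candidates:
            starts = self.postings[terms[0]][doc_id]
            # The phrase matches if term i appears at position start + i.
            if any(
                all(start + i in self.postings[t][doc_id] for i, t in enumerate(terms))
                for start in starts
            ):
                hits.append(doc_id)
        return sorted(hits)


index = PositionalIndex()
index.add("doc1", "RediSearch implements a search engine on top of Redis")
index.add("doc2", "an engine that can search")
print(index.search_phrase("search engine"))  # prints ['doc1']
```

Both documents contain the words "search" and "engine", but only doc1 has them next to each other in that order, which is the information a position-aware index keeps and a plain set of matching ids does not.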

Features

  • Full-Text indexing of multiple fields in documents.
  • Incremental indexing without performance loss.
  • Document ranking (provided manually by the user at index time).
  • Complex boolean queries with AND, OR, NOT operators between sub-queries.
  • Optional query clauses.
  • Prefix-based searches.
  • Field weights.
  • Auto-complete suggestions (with fuzzy prefix suggestions).
  • Exact Phrase Search, Slop-based search.
  • Stemming-based query expansion in many languages (using Snowball).
  • Support for logographic (Chinese, etc.) tokenization and querying (using Friso).
  • Support for custom functions for query expansion and scoring (see Extensions).
  • Limiting searches to specific document fields (up to 128 fields supported).
  • Numeric filters and ranges.
  • Geographical search utilizing Redis’ own Geo commands.
  • A powerful aggregations engine.
  • Unicode support (UTF-8 input required).
  • Retrieve full document content or just ids.
  • Automatically index existing HASH keys as documents.
  • Document deletion and updating with index garbage collection.
  • Partial and conditional document updates.
  • Sortable properties (e.g. sorting users by age or name).
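As a rough illustration of how the boolean operators and prefix searches in the list above relate to an inverted index, the sketch below evaluates AND, OR, NOT and prefix queries as set operations over posting sets. It is a conceptual sketch only, not how RediSearch is implemented; the `INDEX` data and the helpers `term`, `prefix`, `q_and`, `q_or` and `q_not` are hypothetical.

```python
# Toy index: term -> set of document ids (hypothetical sample data).
INDEX = {
    "redis": {"doc1", "doc2", "doc3"},
    "search": {"doc1", "doc3"},
    "searching": {"doc2"},
    "cache": {"doc2"},
}
ALL_DOCS = set().union(*INDEX.values())


def term(t):
    """Documents containing the exact term."""
    return set(INDEX.get(t, set()))


def prefix(p):
    """Union of the postings of every indexed term starting with `p`."""
    result = set()
    for t, docs in INDEX.items():
        if t.startswith(p):
            result |= docs
    return result


def q_and(*parts):
    """Documents matching every sub-query."""
    return set.intersection(*parts)


def q_or(*parts):
    """Documents matching at least one sub-query."""
    return set.union(*parts)


def q_not(docs):
    """Documents that do not match the sub-query."""
    return ALL_DOCS - docs


print(sorted(q_and(term("redis"), term("search"))))        # prints ['doc1', 'doc3']
print(sorted(q_or(term("search"), term("cache"))))         # prints ['doc1', 'doc2', 'doc3']
print(sorted(q_and(term("redis"), q_not(term("cache")))))  # prints ['doc1', 'doc3']
print(sorted(prefix("sear")))                              # prints ['doc1', 'doc2', 'doc3']
```

Sub-queries compose naturally because every helper takes and returns a set of document ids.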

Sources

RediSearch Python Client

Usage

from redisearch import Client, TextField, Query

# Creating a client with a given index name
client = Client('myIndex')

# Creating the index definition and schema
client.create_index((TextField('title', weight=5.0), TextField('body')))

# Indexing a document
client.add_document('doc1', title='RediSearch', body='Redisearch implements a search engine on top of redis')

# Simple search
res = client.search("search engine")

# The result has the total number of results, and a list of documents
print(res.total)  # "1"
print(res.docs[0].title)

# Searching with snippets
res = client.search("search engine", snippet_sizes={'body': 50})

# Searching with complex parameters
# (no_content() returns only document ids, so fields such as title are not loaded)
q = Query("search engine").verbatim().no_content().paging(0, 5)
res = client.search(q)

Elasticsearch is a great, feature-rich search product created by the great people at Elastic.co, but when it comes to performance, it has inherent architectural deficiencies compared with RediSearch, as can be seen in the following table.

Search Benchmarking: RediSearch vs. Elasticsearch | Redis Labs

Component            | RediSearch                                                     | Elasticsearch
Search engine        | Dedicated engine based on modern and optimized data structures | 20-year-old Lucene engine
Programming language | C-based, extremely optimized                                   | Java
Memory technology    | Runs natively on DRAM and Persistent Memory                    | Disk-based with a caching option
Protocol             | The optimized RESP (REdis Serialization Protocol)              | HTTP
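
To make the protocol row more concrete, here is a minimal sketch of how a client could encode a search command in RESP, where a request is an array of bulk strings. The function name `encode_resp_command` is hypothetical, and the index name and query are reused from the usage example above; a real client library handles this encoding internally.

```python
def encode_resp_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    # Array header: '*' followed by the number of elements.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Bulk string: '$' + byte length, then the payload, each terminated by CRLF.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


payload = encode_resp_command("FT.SEARCH", "myIndex", "search engine")
print(payload)
# prints b'*3\r\n$9\r\nFT.SEARCH\r\n$7\r\nmyIndex\r\n$13\r\nsearch engine\r\n'
```

The request is a length-prefixed byte sequence with no headers to parse, which is the kind of low overhead the table refers to when it contrasts RESP with HTTP.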