Elasticsearch in Java projects – query documents
The previous article, 'Elasticsearch in Java projects – index and read documents', showed how to store documents in an index. However, the primary function of Elasticsearch is fast and efficient searching of indexed documents based on the provided queries. This article presents the basics of queries: how they are structured and how they are used as part of a search request.
Queries and filters
By default, Elasticsearch sorts search results by relevance score, which indicates how well each document matches the query. The score is a floating-point value calculated by the query and returned in the _score metadata field of each hit. Whether a score is calculated depends on whether the query clause runs in query context or in filter context.
Filter context
A filter answers the question "Does the document match the query at all?" and no score is calculated. Elasticsearch automatically caches frequently used filters to speed up performance.
Query context
A query is similar to a filter, but it also answers the question "How well does the document match the query?". For instance, when the query asks for the word "run", documents containing "runs" or "running" can also match, provided the field's analyzer reduces these words to the same stem.
In some cases both can be used interchangeably, for example when matching on countryIso, but a filter is simply faster.
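The difference between the two contexts can be illustrated with a toy model in plain Java. This is only a sketch: the class ContextSketch and its methods are hypothetical, and the "score" used here (the fraction of tokens equal to the term) is a stand-in that has nothing to do with how Elasticsearch actually computes relevance.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model only: real Elasticsearch relevance scoring is far more involved.
final class ContextSketch {

    // Filter context: a plain yes/no answer, no score is computed.
    static boolean matchesFilter(String text, String term) {
        return tokens(text).contains(term.toLowerCase(Locale.ROOT));
    }

    // Query context: also says how well the text matches.
    // The stand-in score is the fraction of tokens equal to the term.
    static double score(String text, String term) {
        List<String> tokens = tokens(text);
        String needle = term.toLowerCase(Locale.ROOT);
        long hits = tokens.stream().filter(t -> t.equals(needle)).count();
        return (double) hits / tokens.size();
    }

    // Lowercases the text and splits it on whitespace (always at least one element).
    private static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String first = "run run fast";
        String second = "run slowly today please";
        // Both documents pass the filter, but they differ in score.
        System.out.println(matchesFilter(first, "run") && matchesFilter(second, "run"));
        System.out.println(score(first, "run") > score(second, "run"));
    }
}
```

Both documents pass the filter, yet only the scoring variant can tell that the first one is the better match, which is the extra work a filter avoids.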
You can find below the example of queried documents where the score is already calculated.
Demo project
The idea is to demonstrate the usage of queries by making it possible to filter the search results based on the provided parameters. The indexed documents are checked against the following simple criteria:
- whether the driver is active
- whether the driver has the specified nationality
- whether the driver's date of birth falls within the provided range of years
Java API
The previous article already introduced the SearchRequest and SearchSourceBuilder objects, which the High Level REST Client uses for any search operation. To run a more specific query within a search request, a query builder implementing the QueryBuilder interface needs to be added to the request. Every type of query supported by the Query DSL has a corresponding query builder. The query builder object can be created either by calling its constructor or by calling one of the helper methods of QueryBuilders.
A basic search request with a match query against the drivers index looks as follows:
MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("nationality", "italian");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(matchQueryBuilder);
SearchRequest searchRequest = new SearchRequest("drivers");
searchRequest.source(searchSourceBuilder);
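It can help to see the request body that such a builder chain stands for. The sketch below uses only the Java standard library to assemble the shorthand Query DSL JSON for a match query. The class MatchBodySketch and its methods are hypothetical and are not part of the Elasticsearch client, whose own serialization is typically more verbose while being equivalent in meaning.

```java
// Hypothetical helper for illustration; not part of the Elasticsearch client.
final class MatchBodySketch {

    // Builds the shorthand Query DSL body for a match query on a single field.
    static String matchBody(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escape(field) + "\":\"" + escape(text) + "\"}}}";
    }

    // Minimal JSON string escaping: handles backslash and double quote only,
    // control characters are not covered in this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Prints: {"query":{"match":{"nationality":"italian"}}}
        System.out.println(matchBody("nationality", "italian"));
    }
}
```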
If there are more conditions, they can be combined using BoolQueryBuilder, where the individual queries are passed as parameters to the filter method. The code snippet below shows how to create a QueryBuilder with several conditions. This query is equivalent to the sentence: "All drivers that are active AND are from Germany AND have no titles AND have at least one win AND were born on or after 01-01-1985"
// termQuery and rangeQuery are assumed to be static imports from QueryBuilders
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(termQuery("active", true));
boolQueryBuilder.filter(termQuery("nationality", "German"));
boolQueryBuilder.filter(termQuery("statistics.titles", 0));
// AT_LEAST_ONE is a constant defined elsewhere in the project
boolQueryBuilder.filter(rangeQuery("statistics.wins").gte(AT_LEAST_ONE));
boolQueryBuilder.filter(rangeQuery("dateOfBirth").gte("01-01-1985").includeLower(true));
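Since the demo project filters by parameters that may or may not be provided, the clauses are usually added conditionally. The following sketch, which uses only the standard library, shows that idea by assembling the Query DSL JSON of a bool query with filter clauses. The class DriverFilterSketch, the parameter names and the ISO date format (yyyy-MM-dd, with four-digit years) are assumptions made for illustration. The date format in particular has to match the mapping of the dateOfBirth field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds bool/filter clauses from optional search parameters.
final class DriverFilterSketch {

    // Any parameter may be null, meaning "do not filter on this criterion".
    static String boolFilterBody(Boolean active, String nationality,
                                 Integer bornFromYear, Integer bornToYear) {
        List<String> clauses = new ArrayList<>();
        if (active != null) {
            clauses.add("{\"term\":{\"active\":" + active + "}}");
        }
        if (nationality != null) {
            // Escape backslash and double quote so the value stays valid JSON.
            String value = nationality.replace("\\", "\\\\").replace("\"", "\\\"");
            clauses.add("{\"term\":{\"nationality\":\"" + value + "\"}}");
        }
        if (bornFromYear != null || bornToYear != null) {
            List<String> bounds = new ArrayList<>();
            if (bornFromYear != null) {
                bounds.add("\"gte\":\"" + bornFromYear + "-01-01\"");
            }
            if (bornToYear != null) {
                bounds.add("\"lte\":\"" + bornToYear + "-12-31\"");
            }
            clauses.add("{\"range\":{\"dateOfBirth\":{" + String.join(",", bounds) + "}}}");
        }
        return "{\"query\":{\"bool\":{\"filter\":[" + String.join(",", clauses) + "]}}}";
    }

    public static void main(String[] args) {
        // Active German drivers born in 1985 or later; no upper bound on the year.
        System.out.println(boolFilterBody(true, "German", 1985, null));
    }
}
```

With the Java client the same pattern would apply to BoolQueryBuilder: call filter only for the parameters that were actually supplied.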
Most important Queries and filters
Below you can find the list of the most popular search queries with short description and method of QueryBuilders
is presented.
Match All Query
- match all query (QueryBuilders.matchAllQuery): all documents are returned because all of them are considered to be equally relevant. It is the default query that is used if no other query has been specified;
Full Text Queries
- match query (QueryBuilders.matchQuery): the standard query for performing full-text search. It returns documents that match the provided value (text, number, date or boolean);
- multi match query (QueryBuilders.multiMatchQuery): runs the same match query on multiple fields;
- match phrase query (QueryBuilders.matchPhraseQuery): analyzes the text and creates a phrase query out of the analyzed text, so by default the terms have to appear in the same order;
Term level queries
- term query (QueryBuilders.termQuery): returns only documents that contain an exact value in the provided field. To find a precise value such as a driver ID, the term query is preferable to the match query;
- terms query (QueryBuilders.termsQuery): similar to the term query, but allows multiple values to be specified. The document matches if the field contains any of them;
- range query (QueryBuilders.rangeQuery): returns documents in which numbers or dates fall into the specified range. It accepts the following operators:
  - gt: greater than
  - gte: greater than or equal to
  - lt: less than
  - lte: less than or equal to;
- exists query (QueryBuilders.existsQuery): returns documents that contain an indexed value for a field, since a document may have no indexed value for a given field;
- regexp query (QueryBuilders.regexpQuery): returns documents that contain terms matching a regular expression;
- ids query (QueryBuilders.idsQuery): returns documents based on their IDs;
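The practical difference between a term query and a match query is that the match query analyzes its input before looking it up, while the term query looks up the value exactly as given. The toy model below illustrates this with plain Java. The class TermVsMatchSketch is hypothetical, and its analyze method (lowercasing and splitting on non-alphanumeric characters) is only a rough stand-in for a real analyzer.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of an analyzed text field; not the Elasticsearch implementation.
final class TermVsMatchSketch {

    // Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Term-style lookup: the value is compared to the indexed terms as is.
    static boolean termMatches(Set<String> indexedTerms, String value) {
        return indexedTerms.contains(value);
    }

    // Match-style lookup: the input is analyzed first; any resulting term may match.
    static boolean matchMatches(Set<String> indexedTerms, String text) {
        for (String t : analyze(text)) {
            if (indexedTerms.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("German");            // stored as "german"
        System.out.println(termMatches(indexed, "German")); // false: case differs
        System.out.println(termMatches(indexed, "german")); // true
        System.out.println(matchMatches(indexed, "German")); // true: input is analyzed
    }
}
```

This is why term queries are usually run against fields that are indexed without analysis, such as keyword fields, where the stored value is identical to the original one.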
Compound queries
- bool query (QueryBuilders.boolQuery): returns documents that match boolean combinations of other queries. It is built from one or more boolean clauses, each with a typed occurrence:
  - must: the query must appear in matching documents and contributes to the score;
  - filter: similar to must, but the score is ignored;
  - should: the query should appear in matching documents;
  - mustNot: the query must not appear in matching documents;
- constant score query (QueryBuilders.constantScoreQuery): wraps a filter query and returns every matching document with the same relevance score, equal to the boost parameter;
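How the four occurrence types of a bool query interact can be sketched with a small toy evaluator. Everything in it is hypothetical (the class BoolClauseSketch, documents modelled as plain strings, scores supplied by hand), and it simplifies one point: should clauses are treated as purely optional, whereas in Elasticsearch at least one should clause has to match by default when the bool query has no must or filter clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

// Toy evaluator for bool clauses over string "documents"; illustration only.
final class BoolClauseSketch {
    // Scoring clauses return 0 for "no match" and a positive score otherwise.
    private final List<ToDoubleFunction<String>> mustClauses = new ArrayList<>();
    private final List<ToDoubleFunction<String>> shouldClauses = new ArrayList<>();
    // Non-scoring clauses only accept or reject a document.
    private final List<Predicate<String>> filterClauses = new ArrayList<>();
    private final List<Predicate<String>> mustNotClauses = new ArrayList<>();

    BoolClauseSketch must(ToDoubleFunction<String> q) { mustClauses.add(q); return this; }
    BoolClauseSketch should(ToDoubleFunction<String> q) { shouldClauses.add(q); return this; }
    BoolClauseSketch filter(Predicate<String> q) { filterClauses.add(q); return this; }
    BoolClauseSketch mustNot(Predicate<String> q) { mustNotClauses.add(q); return this; }

    // Empty result: the document is rejected. Otherwise: its accumulated score.
    OptionalDouble evaluate(String doc) {
        for (Predicate<String> f : filterClauses) {
            if (!f.test(doc)) return OptionalDouble.empty();
        }
        for (Predicate<String> n : mustNotClauses) {
            if (n.test(doc)) return OptionalDouble.empty();
        }
        double score = 0.0;
        for (ToDoubleFunction<String> m : mustClauses) {
            double s = m.applyAsDouble(doc);
            if (s <= 0.0) return OptionalDouble.empty(); // a must clause failed
            score += s;
        }
        for (ToDoubleFunction<String> o : shouldClauses) {
            score += o.applyAsDouble(doc); // optional: only raises the score
        }
        return OptionalDouble.of(score);
    }

    public static void main(String[] args) {
        BoolClauseSketch query = new BoolClauseSketch()
                .filter(d -> d.contains("active"))
                .mustNot(d -> d.contains("retired"))
                .must(d -> d.contains("german") ? 1.0 : 0.0)
                .should(d -> d.contains("champion") ? 0.5 : 0.0);
        System.out.println(query.evaluate("active german champion").isPresent());
        System.out.println(query.evaluate("active german retired").isPresent());
    }
}
```

Only must and should add to the score; filter and mustNot just decide whether the document stays in the result set.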
Joining queries
- nested query (QueryBuilders.nestedQuery): wraps another query to search the fields of nested objects and returns the parent documents whose nested objects match it.
Summary
This article introduced queries, the heart of searching in Elasticsearch. It showed how to add specific queries to a project so that only the documents fulfilling the given requirements are returned. It also listed the filters and queries that are commonly used.