Solr is an open-source search platform from the Apache Lucene project, written in Java. Its features include full-text search, spell-checking, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling, which together make it easy for developers to build high-performance search applications.
With distributed search and index replication, Apache Solr is built for scalability and fault tolerance, and it's often used for enterprise search and analytics use cases.
After deployment, Solr uses Lucene to create an inverted index. It is called inverted because it turns a page-centric data structure (one that maps documents to words) into a keyword-centric structure (one that maps words to documents). In many ways, it's like the index at the end of a book, which shows the pages on which certain words occur.
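To make the idea concrete, here is a minimal Python sketch of an inverted index. It is only an illustration of the data structure, not how Lucene implements it; the function name `build_inverted_index` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Naive whitespace tokenization; real analyzers do far more.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "solr is a search platform",
    2: "lucene powers solr",
    3: "search with lucene",
}

index = build_inverted_index(docs)
print(index["solr"])    # [1, 2]
print(index["search"])  # [1, 3]
```

Looking up a word is then a single dictionary access, which is why this structure suits keyword search.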
Solr runs as a standalone full-text search server and has a very active development community behind it. It offers regular releases as well.
Why use a Telegraf plugin for Apache Solr?
Since Solr provides high-performance search in your application, keeping it performant and available is important, which makes monitoring crucial. The Solr Telegraf plugin collects stats that are exposed via the MBean Request Handler. These stats are per core and are the same information provided on the Plugin/Stats page of the Solr Admin UI, which shows information and statistics about the status and performance of various plugins running in each Solr core. Information about the performance of the Solr caches, the state of Solr’s searchers, and the configuration of Request Handlers and Search Components is available to collect with this Telegraf plugin for storage and visualization in InfluxDB.
How to monitor Apache Solr using the Telegraf plugin
The Apache Solr Telegraf plugin collects metrics via the MBean Request Handler. You configure Telegraf with a list of Solr servers, optionally a list of Solr cores, and, if your Solr instance requires HTTP Basic Auth, your credentials.
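For a sense of what such a request looks like, the Python sketch below builds a per-core MBean Request Handler URL and an optionally authenticated request. The path `/solr/<core>/admin/mbeans` and the `stats`, `wt`, and `cat` parameters come from Solr's MBean Request Handler; the helper names, the default category list, and the sample server and core are assumptions for illustration and may not match exactly what the plugin sends.

```python
import base64
import urllib.request
from urllib.parse import urlencode

# Assumed default categories, chosen to match the stats described above.
DEFAULT_CATEGORIES = ("CORE", "QUERYHANDLER", "UPDATEHANDLER", "CACHE")


def mbeans_url(server, core, categories=DEFAULT_CATEGORIES):
    """Build the stats URL for one core on one Solr server."""
    params = [("stats", "true"), ("wt", "json")]
    params += [("cat", category) for category in categories]
    return f"{server.rstrip('/')}/solr/{core}/admin/mbeans?{urlencode(params)}"


def build_request(server, core, username=None, password=None):
    """Create a request object, adding HTTP Basic Auth only when given."""
    request = urllib.request.Request(mbeans_url(server, core))
    if username is not None:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical server and core; no network call is made here.
print(mbeans_url("http://localhost:8983", "main"))
```

Passing the result of `build_request` to `urllib.request.urlopen` would fetch the JSON stats from a running Solr instance.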
To configure the Solr input plugin for your own environment, use the following Telegraf configuration:
    [[inputs.solr]]
      ## specify a list of one or more Solr servers
      servers = ["http://localhost:8983"]
      ##
      ## specify a list of one or more Solr cores (default - all)
      # cores = ["main"]
      ##
      ## Optional HTTP Basic Auth Credentials
      # username = "username"
      # password = "pa$$word"
Replace the sample values with information relevant to your environment. After specifying one or more Solr servers to monitor, you can optionally list one or more cores; by default, all cores are monitored. If your Solr instance requires authentication, uncomment the username and password settings. Telegraf will then start collecting the relevant stats and forward them to your configured output, such as InfluxDB.
Key Apache Solr metrics to use for monitoring
Some of the important Apache Solr metrics that you should proactively monitor include:
- Memory utilization
- Thread usage details
  - waiting threads, blocked threads, terminated threads, peak threads, etc.
- How well the query handler is processing incoming search requests
  - requests per minute, search errors per minute, search timeouts per minute
- Cache level details
  - lookups, hit ratio, evictions, and cache size
- How the update handler is handling update requests
  - number of commits, rollbacks, the documents that are added/deleted, the pending documents, the errors per minute, etc.
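Several of the metrics above are ratios or rates derived from raw counts. The sketch below shows one plausible way to compute a cache hit ratio and a per-minute rate from two samples of a cumulative counter. The function names and sample values are hypothetical, and treating a decreasing counter as a restart is an assumption of this sketch.

```python
def hit_ratio(hits, lookups):
    """Fraction of cache lookups that were hits; 0.0 when there were no lookups."""
    return hits / lookups if lookups else 0.0


def per_minute(previous_count, current_count, interval_seconds):
    """Rate per minute from two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current_count - previous_count
    if delta < 0:
        # Assumption: the counter was reset (for example, Solr restarted),
        # so count only what accumulated since the reset.
        delta = current_count
    return delta * 60.0 / interval_seconds


# Hypothetical samples taken 10 seconds apart.
print(hit_ratio(900, 1000))        # 0.9
print(per_minute(1200, 1500, 10))  # 1800.0
```

A persistently low hit ratio combined with a high eviction count is a common sign that a cache is undersized for the workload.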
Note that if you don't want to retrieve all metrics at once (the default), you can restrict the request to individual metric types: counter, gauge, histogram, meter, or timer. You can also request more than one type at a time by separating them with commas.
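The text does not say which endpoint this filtering applies to. Assuming it refers to the `type` parameter of Solr's Metrics API at `/solr/admin/metrics`, a request URL could be built as in the sketch below. The helper name `metrics_url` and the sample server are hypothetical.

```python
from urllib.parse import urlencode

# Metric types named in the text, plus "all" (the default).
VALID_TYPES = {"all", "counter", "gauge", "histogram", "meter", "timer"}


def metrics_url(server, types=("all",)):
    """Build a metrics URL limited to the given comma-separated metric types."""
    unknown = set(types) - VALID_TYPES
    if unknown:
        raise ValueError(f"unknown metric types: {sorted(unknown)}")
    # safe="," keeps the comma separator readable instead of encoding it as %2C.
    query = urlencode({"type": ",".join(types), "wt": "json"}, safe=",")
    return f"{server.rstrip('/')}/solr/admin/metrics?{query}"


# Hypothetical server; request only counters and gauges.
print(metrics_url("http://localhost:8983", ("counter", "gauge")))
```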