A Teachable Moment...
I've been pushing Bro logs to ELK for some time now. I think it's a brilliant way to index and search the 125GB+ of Bro logs my organisation generates each day. That particular installation was not tweaked beyond what was necessary in logstash to read in the Bro logs and do some basic manipulation (which I'll cover here in a future post!) to make them a tiny bit more useful.
Since I didn't really change anything, I left the number of shards at the default of five...but I only have three elasticsearch nodes in that cluster. Five shards can't divide evenly across three nodes, so two of the nodes have to search two shards each *every time* I search an index.
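The arithmetic behind that can be sketched in a few lines of shell. This is only a rough model: it assumes elasticsearch balances an index's shards as evenly as it can and it ignores replicas, which simplifies how allocation really works.

```shell
# Rough model: spread the shards of one index evenly over the nodes.
shards=5
nodes=3

# Ceiling division gives the share of the busiest node.
max_per_node=$(( (shards + nodes - 1) / nodes ))

# The remainder is how many nodes carry one shard more than the rest
# (0 means the shards divide evenly).
busy_nodes=$(( shards % nodes ))

echo "$busy_nodes of $nodes nodes search $max_per_node shards each"
```

Setting nodes=1 shows the single-node case this post is about: all five shards land on the same machine.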
Which brings me to tonight's post. What do you do if you have one node in your cluster but the default number of shards is five? Let's see!
Laying Out a Template
Remember that every item in elasticsearch is a JSON object - and that includes templates. For example, an object to set the number of shards to two for an index named "my_index" would look like this:
{
"template" : "my_index*",
"settings" :
{
"index" :
{
"number_of_shards" : 2
}
}
}
This could also be written on one line:
{ "template" : "my_index*", "settings" : { "index" : { "number_of_shards" : 2 } } }
Or, if I wanted to set the number of shards to one and disable replication (all on one line):
{ "template" : "my_index*", "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 } } }
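A stray comma or quote is enough for elasticsearch to reject a template, so it can help to check the JSON before sending it anywhere. Here is a minimal sketch, assuming python3 is installed; the scratch directory and file name are placeholders I chose for illustration.

```shell
# Write the one-shard/zero-replica template to a scratch file.
# The directory and file name are placeholders for illustration.
tmpdir="$(mktemp -d)"
template_file="$tmpdir/my_template.json"

cat > "$template_file" <<'EOF'
{ "template" : "my_index*", "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 } } }
EOF

# json.tool exits non-zero on a syntax error, so it doubles as a check.
if python3 -m json.tool "$template_file" > /dev/null; then
  template_ok="yes"
else
  template_ok="no"
fi
echo "template valid: $template_ok"
```

This only checks that the file is well-formed JSON; it says nothing about whether elasticsearch will accept the settings inside it.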
There are two ways to apply the template -- by using curl to interact with elasticsearch directly or by telling logstash how to define the index when it writes to elasticsearch.
Create the Template With curl
First, I want to make sure I don't have a template or index named "my_index" using:
curl http://192.168.1.111:9200/_cat/indices?pretty
curl http://192.168.1.111:9200/_cat/templates?pretty
I have one template, "logstash", that gets applied to any new indexes with a name starting with "logstash-", and one index, ".kibana", that's used to store the settings for kibana. Note that the ".kibana" index has a status of "green" - that means elasticsearch has been able to allocate all of the shards for that index (I've already set it to only have one shard).
Now I can create my template with one curl command (this would all be on one line):
curl -XPUT http://192.168.1.111:9200/_template/my_template?pretty -H 'Content-Type: application/json' -d '{ "template" : "my_index*", "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 } } } '
And then I can verify it with:
curl http://192.168.1.111:9200/_template/my_template?pretty
The response is the stored template, echoed back as JSON.
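As an alternative to the long one-line command, the same template can be uploaded from a file using curl's --data-binary option. The sketch below wraps that in a hypothetical helper, put_template, which is my own name and not part of elasticsearch; ES_URL defaults to the address used in this post, and setting DRY_RUN=1 prints the command instead of running it.

```shell
# Address of the cluster; override by exporting ES_URL.
ES_URL="${ES_URL:-http://192.168.1.111:9200}"

# put_template NAME FILE: upload FILE as the template called NAME.
# put_template is a hypothetical helper, not an elasticsearch command.
# With DRY_RUN=1 the curl command is printed instead of executed.
put_template() {
  name="$1"
  file="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -s -XPUT $ES_URL/_template/$name?pretty --data-binary @$file"
  else
    curl -s -XPUT "$ES_URL/_template/$name?pretty" \
      -H 'Content-Type: application/json' \
      --data-binary "@$file"
  fi
}

# Example against a live cluster:
#   put_template my_template /etc/logstash/my_template.json
```

Keeping the JSON in a file means there is no quoting to fight with on the command line.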
Now if I create two indexes, "test" and "my_index-2017.05.07", I can see whether my template is used as intended. The date isn't really important in that index name, I just added it because I used "my_index*" in the template and I want to show that anything beginning with "my_index" has the template applied. I can create an index with:
curl -XPUT http://192.168.1.111:9200/test?pretty
curl -XPUT http://192.168.1.111:9200/my_index-2017.05.07?pretty
Each request should come back with "acknowledged" : true in the response.
Then I can verify the indexes were created and the template applied with:
curl http://192.168.1.111:9200/test
curl http://192.168.1.111:9200/my_index-2017.05.07
Notice how "test" has the default five shards/one replica and "my_index-2017.05.07" has one shard/zero replicas. The template worked!
Create the Template With Logstash
Defining the template with curl is interesting but you can also create/apply the template in your logstash configuration using the "template" and "template_name" options of the elasticsearch output plugin. I did this with the same information I used when I defined it with curl - I want to name the index "my_index-<date>", I want to name the template "my_template" and I want to set one shard with zero replicas.
First, I wrote out a file in the /etc/logstash/ directory called "my_template.json". It contains the JSON object that I want to use as my template:
{
"template": "my_index*",
"settings": { "index" : { "number_of_replicas": 0, "number_of_shards": 1 } }
}
Then I added the "template" and "template_name" options to the elasticsearch section of my output { } block to tell the plugin to use my template.
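A minimal sketch of what such an output block could look like, written out with a shell heredoc so it can be pasted and inspected. The option names (hosts, index, template, template_name, template_overwrite) belong to the logstash elasticsearch output plugin, but the values are my assumptions: the host is the address used with curl earlier, the daily date pattern is a guess at "my_index-<date>", and template_overwrite is optional. The file goes to a scratch location here rather than into the real logstash configuration.

```shell
# Scratch location for illustration only; a real pipeline file would
# live wherever your logstash install reads its configuration from.
conf_file="$(mktemp)"

cat > "$conf_file" <<'EOF'
output {
  elasticsearch {
    # Assumed values: adjust the host and date pattern to your setup.
    hosts              => ["192.168.1.111:9200"]
    index              => "my_index-%{+YYYY.MM.dd}"
    # Path to the JSON template file and the name to store it under.
    template           => "/etc/logstash/my_template.json"
    template_name      => "my_template"
    # Optional: replace a stored template of the same name on startup.
    template_overwrite => true
  }
}
EOF

cat "$conf_file"
```

With template_overwrite enabled, edits to the JSON file should be picked up the next time logstash starts; without it, a template already stored under that name is left alone.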
At that point I can restart logstash and it will create the index (and template) as soon as it uses the elasticsearch plugin for storage.
In Closing
The default settings for elasticsearch are Pretty Good; if you're going to change anything then it's good to make small, incremental changes and document the results after each one. Maybe you need to increase/decrease shard counts per index, maybe you want to change the number of replica shards per index, maybe you want to specify a data type for a given field - all of these are set at index creation by templates.
So what's the best way to test? I'd say take a sample data set, import it with a given configuration, see what you think. Don't like it? No problem! Delete the index, tweak the template, import the data again. For me, that's the real benefit of setting the template via the elasticsearch plugin -- if I don't like something I don't have to try to copy/paste and then edit a command, I can just open the file in an editor and continue testing.
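The "delete the index" step of that loop can be sketched as below. delete_index is a hypothetical helper of my own, ES_URL defaults to the address used in this post, and DRY_RUN=1 prints the command rather than running it, which is worth doing first because the delete is permanent.

```shell
# Address of the cluster; override by exporting ES_URL.
ES_URL="${ES_URL:-http://192.168.1.111:9200}"

# delete_index NAME: remove one index so the next import recreates it
# from whatever template is current. delete_index is a hypothetical
# helper. With DRY_RUN=1 the command is printed instead of executed.
delete_index() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl -s -XDELETE $ES_URL/$1?pretty"
  else
    curl -s -XDELETE "$ES_URL/$1?pretty"
  fi
}

# Example (destructive against a live cluster):
#   delete_index my_index-2017.05.07
```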
As I've said before, don't be afraid to try different things and see what works for you!