Elasticsearch: Add a method to compute alias names for index templates

Created on 6 Mar 2014  ·  67 Comments  ·  Source: elastic/elasticsearch

This is a proposal to extend the existing support for aliases within templates (#5180, #1825, #2739).

One use case that the current implementation does not cover is the index-per-day scenario. In this case it would be nice to present the daily indices, which for the sake of argument look like "product_elasticsearch_20140306", through a single unified alias such as "product_elasticsearch".

The idea is to allow alias names to be formatted similarly to how Python handles format strings: http://docs.python.org/2/library/string.html#formatstrings

If a script defines the value of each format string parameter in the alias name, then regexes are not required (although still an option) to extract a substring of the index name.

Here is an example:

PUT _template/template_1

{
    "template" : "product_*",
    "settings" : {
        "number_of_shards" : 1
    },
    "aliases": {
        "alias1" : {},
        "alias2" : {
            "filter" : {
                "term" : {"user" : "kimchy" }
            },
            "routing" : "kimchy"
        },
        "{index}-alias-for-{gender}": {
            "filter" : {
                "term" : {"product" : "Elasticsearch" }
            },
            "routing" : "Elasticsearch",
            "format": {
                "index": {
                    "script": "_index['name'].value"
                },
                "gender": {
                    "script": "doc['isFemale'].value ? f : m",
                    "params": {
                        "m": "males",
                        "f": "females"
                    }
                }
            }
        }
    }
}
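
The placeholder expansion proposed above can be sketched in Python, since the proposal borrows Python's format-string syntax. This is only an illustration: `render_alias_name` is a hypothetical helper, and the `values` dictionary stands in for whatever the per-placeholder scripts in the proposed "format" section would evaluate to. Nothing here is an existing Elasticsearch API.

```python
from string import Formatter


def render_alias_name(alias_format: str, values: dict) -> str:
    """Expand {placeholders} in an alias format string (hypothetical helper)."""
    # Collect the placeholder names that appear in the format string.
    fields = {name for _, name, _, _ in Formatter().parse(alias_format) if name}
    missing = fields - set(values)
    if missing:
        raise KeyError(f"no value for placeholder(s): {sorted(missing)}")
    return alias_format.format(**values)


# Assumed script results: the index name and the "females" parameter.
alias = render_alias_name(
    "{index}-alias-for-{gender}",
    {"index": "product_elasticsearch_20140306", "gender": "females"},
)
print(alias)  # product_elasticsearch_20140306-alias-for-females
```

Failing loudly on a missing placeholder is a design choice of this sketch; the proposal itself does not say what should happen in that case.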

All 67 comments

Along similar lines, the definitive Elasticsearch guide says to "use aliases instead of indices in your application"; however, with templates and timestamped indices this is AFAIK not possible, e.g. I can't have a template that matches my_index_v1-2014.03.26 and creates an alias of my_index-2014.03.26.

This is basically the same as Shay's comment on #5180

I would also really like the alias to include the date part taken from the index name, so that a query hits only the indices for a specific date; otherwise it hits all the indices and performance degrades.

Would love to revisit this. It is rather clunky to hack around this with a cron job to create aliases every day.

It would be nice to be able to do some scripting for creating an alias name. But most of the time it would be used to implement the index-per-day pattern, so it still seems like a hack.
That's why I've created a separate issue ( #7631 ) for natively supporting this common pattern.

What do you think?

An alternative way to trigger alias creation, instead of running cron jobs, would be to know whether the index was automatically created by the index action. I've created a separate issue ( #7634 ) for it as well. I guess it is the easiest approach to implement (compared to scripting in templates or native rolling indices).

+1

To start with I would love to be able to specify time-based aliases. For example:

PUT /_template/my_template

...
"aliases": { "my_alias-{YYYY}-{MM}-{DD}": {} }

would create my_alias-2015-03-27 on the new index, etc. Although the full scripting would be awesome.
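
A rough sketch of how such date placeholders might expand, written in Python for illustration. `expand_date_alias` is a hypothetical name, and the `{YYYY}`, `{MM}`, `{DD}` tokens are taken from the pattern suggested in this comment; Elasticsearch templates do not support them.

```python
from datetime import date


def expand_date_alias(pattern: str, day: date) -> str:
    """Replace {YYYY}, {MM} and {DD} in a hypothetical alias pattern."""
    return (
        pattern.replace("{YYYY}", f"{day.year:04d}")
        .replace("{MM}", f"{day.month:02d}")
        .replace("{DD}", f"{day.day:02d}")
    )


print(expand_date_alias("my_alias-{YYYY}-{MM}-{DD}", date(2015, 3, 27)))
# my_alias-2015-03-27
```

The separators come straight from the pattern, so a pattern written with dots would produce a dotted alias name.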

We discussed this today and thought about the regex-based solution that Shay suggested in the linked issue, for example, if we supported full regexps in the template name with captured vars:

{ 
  "template": "(logstash-\\d\\d\\d\\d)-.*",
  ...
  "aliases": {
    "{index:$1}": { ... }
  }
}

Then the alias for an index logstash-2014-03-20 would be logstash-2014. I am concerned about running scripts during index creation (especially with dynamic scripts being disabled for Groovy).
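
The capture-group idea can be sketched as follows. This is an illustration only: it uses Python's `re` module rather than the Java regex engine Elasticsearch would use, the `{index:$N}` syntax is the one floated in this comment, and `alias_from_capture` is a hypothetical helper. The pattern is written as a full regex (`-.*`) so that it matches the whole index name.

```python
import re
from typing import Optional


def alias_from_capture(template_regex: str, alias_pattern: str, index_name: str) -> Optional[str]:
    """Build an alias name from capture groups of the template regex."""
    match = re.fullmatch(template_regex, index_name)
    if match is None:
        return None  # the template does not apply to this index
    # Replace each {index:$N} with capture group N of the template match.
    return re.sub(
        r"\{index:\$(\d+)\}",
        lambda m: match.group(int(m.group(1))),
        alias_pattern,
    )


print(alias_from_capture(r"(logstash-\d\d\d\d)-.*", "{index:$1}", "logstash-2014-03-20"))
# logstash-2014
```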

@javanna what do you think of this? You have been pretty involved in the alias templates stuff.

what do you think of this?

what can I say? I don't like regexes... if I have to choose between groovy scripts and regexes...I might actually go with scripts... sorry.

Scripts seem much too complicated here. In fact we already basically support regexes in the template parameter (just with the only operator being *?), this would just extend them to be _real_ regexes so that you could use pattern subs in other places.

I think the regexes with captured vars would be sufficiently powerful to solve this problem. Would be fantastic if it worked with #7634 as well.

I was hoping for an alias name, computed/based on a field's value.

+1

I definitely agree that something like this is necessary. I am currently also using cronjobs to create aliases for daily indices, but solving this in the template would definitely be a better way.
A full-blown regex solution is not even really necessary; a simple fix would be to allow substitution of the fixed part of the "template" value. So if the template applies to "logstash-*", keep the part matched by the * but allow the logstash part to be replaced with another fixed string.
That would effectively cater to any recurring index.
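
A minimal sketch of that prefix substitution, assuming a template with a single trailing `*`. `substitute_prefix` and the `logs-` replacement prefix are made up for illustration.

```python
from typing import Optional


def substitute_prefix(template: str, alias_prefix: str, index_name: str) -> Optional[str]:
    """Swap the fixed part of a 'prefix*' template for another fixed string."""
    if not template.endswith("*") or template.count("*") != 1:
        raise ValueError("this sketch only handles a single trailing '*'")
    fixed = template[:-1]
    if not index_name.startswith(fixed):
        return None  # the template does not apply to this index
    # Keep whatever the wildcard matched and put the new prefix in front.
    return alias_prefix + index_name[len(fixed):]


print(substitute_prefix("logstash-*", "logs-", "logstash-2014.03.20"))
# logs-2014.03.20
```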

+1

+1

+1

+1

+1

+1

+1

+1

+1

+1

+1

+1

Another common use case that will benefit from extending support for aliases is shared indices with routing. Would be helpful to be able to generate these dynamically using a template (for some deployments, can be thousands+ of them).

POST /_aliases
{
    "actions": [{
        "add": {
            "index": "sales",
            "alias": "sales_<CUST_ID>",
            "filter": {"term": {"customer_id": "<CUST_ID>"}},
            "routing": "<CUST_ID>"
        }
    }]
}
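
Until templates can do this, the request body can be generated client-side. The sketch below only builds the JSON for `POST /_aliases`; it does not send it. The `customer_alias_actions` helper and the customer IDs 17 and 42 are hypothetical.

```python
import json


def customer_alias_actions(index: str, customer_ids) -> dict:
    """Build one filtered, routed alias per customer on a shared index."""
    actions = []
    for cust_id in customer_ids:
        actions.append({
            "add": {
                "index": index,
                "alias": f"{index}_{cust_id}",
                "filter": {"term": {"customer_id": cust_id}},
                "routing": str(cust_id),  # routing values are strings
            }
        })
    return {"actions": actions}


body = customer_alias_actions("sales", [17, 42])
print(json.dumps(body, indent=2))
```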

@martijnvg has this issue been solved by https://github.com/elastic/elasticsearch/pull/12209 ?

Partly yes, and I'm sure it would solve my immediate issue.

More interesting would be to extend the concept, and have capturing groups in the name regex for index templates, so to be able to reuse part of the index name in the alias definition. But I can call myself happy enough with #12209 :+1:

If I define an alias per user and I have millions of users, would this enhancement allow me to keep one alias for all users instead of manually defining an alias per user?

With fewer aliases, some load would be taken off the cluster state, right?

Could really use this feature. Maybe make a path definable in the alias, the way it works in the mapping for other fields, so we have a workaround?

For example, the _id field can be customised like this; aliases could support the same:
"_id": {
"path": "event_id"
},

+1

:+1: Just curious if there is any update on whether this is still being considered and/or worked on?

+1

+1

+1

Is there a way to rotate aliases based on time-based events?
I was wondering if I could create aliases like "current", "yesterday", "last_month", "last_year" and have them rotate based on the current date. For instance, when the day changes from, let's say, 2016.06.30 to 2016.07.01, the aliases would be automatically changed so that current would be the index 2016.07.01, yesterday would be 2016.06.30, and so on.
So I would always have an up-to-date alias pointing to the correct index.

Thank you for your answer, Clinton, but I'm not sure that would solve my problem. This rolls indices based on an incrementing integer.
I do not want to roll my indices; I want to roll my aliases over the newly created indices.
I want the alias "current" to always point to today's index (e.g. index-2016.07.01).
But when the day ends, the "current" alias should point to the newly created index index-2016.07.02.
I have solved the automatic daily index creation part.
I still need to solve repointing the aliases.
And there's one more restriction: I'm currently using Elasticsearch 1.7.3.
So I wouldn't have that feature anyway.
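
One way to approach the rotation described here is to have a daily job build a single `_aliases` request that moves both pointers at once. The sketch below only constructs the request body. It assumes indices named `index-YYYY.MM.DD` (as in this comment) and assumes the same job ran the day before, since a `remove` action is expected to fail if the alias is not on that index. `index_for` and `rotation_actions` are hypothetical names.

```python
from datetime import date, timedelta


def index_for(day: date, prefix: str = "index-") -> str:
    """Name of the daily index for a given day, e.g. index-2016.07.01."""
    return f"{prefix}{day:%Y.%m.%d}"


def rotation_actions(today: date) -> dict:
    """Body for POST /_aliases that repoints 'current' and 'yesterday'."""
    yesterday = today - timedelta(days=1)
    day_before = today - timedelta(days=2)
    return {
        "actions": [
            {"remove": {"index": index_for(yesterday), "alias": "current"}},
            {"add": {"index": index_for(today), "alias": "current"}},
            {"remove": {"index": index_for(day_before), "alias": "yesterday"}},
            {"add": {"index": index_for(yesterday), "alias": "yesterday"}},
        ]
    }


for action in rotation_actions(date(2016, 7, 2))["actions"]:
    print(action)
```

Putting all four actions in one request means the aliases move together instead of briefly pointing at nothing.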

+1

+1

+1

+1

https://github.com/elastic/elasticsearch/issues/20367 should mostly solve this for the logging use case

Discussed in fix it friday. Closing in favour of #20367.

Please, let's leave it open. #20367 solves only one use case (agreed that probably most popular), but what has been discussed here has broader scope.

Has anything changed since the last post? Any chance of getting the full feature set?

+1

+1

+1

+1

+1

+1

+1

Can this get reopened? @clintongormley

+1

+1
I would like to point out that this conversation does not cover the case where the index name contains a version number.
As an example, I'm using indices with names like july_dec_2018_v2 (where v2 is the version number of the index, which should be incremental).
The alias of this index looks like: @july_dec_2018

It would be great to be able to do this with a template! I don't think it's possible for now, is it?
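
For reference, the mapping from a versioned index name to its versionless alias is easy to express client-side. A small sketch, assuming the `_v<number>` suffix and the `@` alias prefix from this comment; `versionless_alias` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches names such as july_dec_2018_v2: a base name plus a _v<number> suffix.
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_v(?P<version>\d+)$")


def versionless_alias(index_name: str, prefix: str = "@") -> Optional[str]:
    """Derive the alias for a versioned index, or None if it has no version."""
    match = _VERSION_SUFFIX.match(index_name)
    if match is None:
        return None
    return prefix + match.group("base")


print(versionless_alias("july_dec_2018_v2"))  # @july_dec_2018
```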

Sorry, this is my setting:

{
    "template" : "debug-*",
    "settings" : {
        "number_of_shards" : 1
    },
    "aliases": {
        "{application}-alias-for-{env}": {
            "format": {
                "env": {
                    "lang": "expression",
                    "script": "doc['name'].value"
                },
                "application": {
                    "lang": "expression",
                    "script": "doc['app'].value"
                }
            }
        }
    }
}

but the result is:
"debug-2018": {
    "aliases": {
        "application": {},
        "{application}-alias-for-{env}": {}
    }
}

+1

+1

We also use versioned time-based indices and could really use this feature. As it stands now, we have to create indices before we do any writing so that we can create the versionless alias we use for writing.

+1

+1

@rbraley has this been introduced?

In general, if Elasticsearch recommends an approach, it's important that the feature can satisfy people's needs. However, there is still no well-designed way to use aliases this way. #20367 only solves a very specific problem.

+1

I found this, maybe it helps:
https://stackoverflow.com/questions/37690083/elasticsearch-alias-auto-update-for-rolling-index

PUT /_template/my_elmah_template
   {
       "template" : "elmah_*",
       "aliases" : {
             "elmah_all" : { }
       }
   }