Update Index sizing guidelines in docs after Autosharding #7151
ankikuma wants to merge 5 commits into
✅ Elastic Docs Style Checker (Vale): No issues found on modified lines! The Vale linter checks documentation changes against the Elastic Docs style guide. To use Vale locally or report issues, refer to Elastic style guide for Vale.
## Elasticsearch index sizing guidelines [elasticsearch-differences-serverless-index-size]
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations.
If you created your index after **June 1, 2026**, your index can grow upto 4.8TB without any performance impact.
Is 4.8TB the max index size that has been tested across all use-cases w/o any performance impact?
Thanks for reviewing Yuvi.
I am actually not sure if we have specific tests with a large number of shards. Let me check and get back.
The original guidelines calculate the maximum index size based on 100 GB shards and default shard count per project. This is because performance is expected to degrade when shards grow larger than 100 GB.
With Autosharding, we can increase the number of shards in an index once its shards grow larger than 100 GB. That means we can support a larger index while still keeping the shard size at 100 GB. We currently have an upper limit (48) on the number of shards an index can autoshard up to. This is only because autosharding is a new feature and we wanted to guard against runaway autosharding. So the 4.8 TB number is just 48 x 100 GB.
We believe that autoscaling will take care of distributing the shards across nodes as needed, as the index grows.
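The arithmetic behind the 4.8 TB figure can be sketched as follows. This is a minimal illustration, not Elasticsearch code: the function and constant names are hypothetical, and only the values (100 GB target shard size, 48-shard cap) come from the comment above.

```python
# Hypothetical names for illustration; only the numbers come from the discussion.
TARGET_SHARD_SIZE_GB = 100    # performance is expected to degrade beyond this shard size
AUTO_RESHARD_MAX_SHARDS = 48  # current upper limit on the autosharded shard count


def max_index_size_gb(max_shards: int, shard_size_gb: int = TARGET_SHARD_SIZE_GB) -> int:
    """Largest index size that keeps every shard at or below shard_size_gb."""
    return max_shards * shard_size_gb


print(max_index_size_gb(AUTO_RESHARD_MAX_SHARDS))  # 4800 GB, i.e. 4.8 TB
```

Raising the shard cap scales the figure linearly, which is why the number reflects the current cap rather than a hard technical limit.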
Thanks Ankita for the comment. Let's confirm if we have tests that can align with the statement (without any performance impact) so that we can say it with confidence.
I wonder if we want to be this specific. 48 shards is just a limit of the autosharding feature, and I can imagine us manually resharding to 96 (and beyond?) on a case-by-case basis. AFAIK there is no technical limitation here?
I agree @lkts. We can round that up to ~5 TB. I do mention that the limit can be raised further via overrides. We could reword that as well to say something like "This is a soft limit and can be raised on a case-by-case basis".
I wanted to see if Yuvi agrees with this doc change and how much information we want to give the customer.
I talked to Jason about performance tests and he agrees that it will be good to set a baseline for 6 shards x 100 GB indices to compare against larger 48 x 100 GB indices. I am working with him to run some tests.
I would maybe frame this as "There is no limit, and we have confirmed internally that 5 TB definitely works. Work with support if your use case exceeds that." What do you think (obviously the wording can be much nicer)?
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations.
If you created your index after **June 1, 2026**, your index can grow upto 4.8TB without any performance impact.
That limit can be raised further via certain project overrides.
Is there a knob that we control internally that we can use to increase the index size above 4.8TB without any performance impact? If yes, do we know the max size?
The knob is AUTO_RESHARD_MAX_SHARDS_SETTING, but again, I am not sure about tests.
This seems to be an internal knob that we don't expose to customers, right?
That's correct, my understanding is that we don't want to expose shard level details to serverless customers.
## Elasticsearch index sizing guidelines [elasticsearch-differences-serverless-index-size]
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations.
If you created your index after **June 1, 2026**, your index can grow upto 4.8TB without any performance impact.
Are there any caveats to June 1, 2026, or can we say explicitly that all indices created after June 1, 2026 across all project types (O11y, Security, ES3) benefit from this?
The June 1, 2026 date comes from the fact that the minimum required index version SHARD_OBLIVIOUS_SLICING was committed on May 22 and was picked up in production by June 1.
I can't think of any other caveats.
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations.
If you created your index after **June 1, 2026**, your index can grow upto 4.8TB without any performance impact.
That limit can be raised further via overrides.
I am not sure we should advertise overrides in public documentation. Is there precedent for this?
So "overrides" is not a public-facing word that we use, so we cannot use that.
We have precedent with the index limit (as an example), where we mention that there are limits that can be increased.
Verbiage from the public doc:
The index limit is adjustable and can be increased by request, while others are fixed. To request a limit increase, open a support case, and include your preferred new value and a brief description of your use case. Providing meaningful details around your use case and desired outcome ensures that Elastic can make recommendations that best suit your workload.
So similar to that, if we are mentioning a limit (that can be increased), we can use similar verbiage (or just combine within existing verbiage that it also applies to this limit).
Agreed. Perhaps we can say "This is a soft limit and can be raised on a case-by-case basis"?
That limit can be raised further via overrides.
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations:
If you created your index before **June 1, 2026**, follow these recommendations according to project type:
can we tell them to reindex if the size becomes an issue? will they automatically get the benefit for models where the index is abstracted, like data streams?
## Elasticsearch index sizing guidelines [elasticsearch-differences-serverless-index-size]
To ensure optimal performance in Serverless Elasticsearch projects, follow these sizing recommendations.
the heading and therefore body might be wrong here. I assume that this index sizing guideline applies to all projects equally, not just ES ones?
Update the docs here.
Indices that are created with an index version greater than or equal to SHARD_OBLIVIOUS_SLICING can grow up to 48 shards (the dynamic setting AUTO_RESHARD_MAX_SHARDS_SETTING) via Autosharding. An index is resharded (to 2x shards) when its average shard size crosses 100 GB.
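The resharding rule in the description can be sketched as a small simulation. It is an illustration under stated assumptions, not the actual Autosharding implementation: the function name is hypothetical, data is assumed to be spread evenly across shards, the starting shard count is a parameter because the description does not give project defaults, and a doubling that would exceed the cap is assumed not to happen (the description does not say what occurs in that case).

```python
def shards_after_growth(index_size_gb: float, initial_shards: int,
                        max_shards: int = 48, threshold_gb: float = 100.0) -> int:
    """Return the shard count once an evenly distributed index reaches index_size_gb.

    Hypothetical model: the index doubles its shard count each time the
    average shard size crosses threshold_gb, as long as the doubled count
    stays within max_shards (assumption).
    """
    shards = initial_shards
    while index_size_gb / shards > threshold_gb and shards * 2 <= max_shards:
        shards *= 2
    return shards


# Assumed starting point of 6 shards, taken from the baseline mentioned in the review.
for size_gb in (500, 700, 2500, 4800, 6000):
    print(size_gb, shards_after_growth(size_gb, initial_shards=6))
```

Under this model, an index starting at 6 shards doubles to 12, 24, and 48 shards, after which further growth increases the average shard size beyond 100 GB unless the cap is raised.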