Skip to content

Only call doCache one per request#6001

Open
cwperks wants to merge 4 commits intoopensearch-project:mainfrom
cwperks:skip-do-cache
Open

Only call doCache one per request#6001
cwperks wants to merge 4 commits intoopensearch-project:mainfrom
cwperks:skip-do-cache

Conversation

@cwperks
Copy link
Member

@cwperks cwperks commented Mar 11, 2026

Description

We recently saw this method in the hotthreads output of a cluster that had many aliases pointing to the same concrete index.

   100.4% (502.1ms out of 500ms) cpu usage by thread 'opensearch[...][search][T#11]'
     3/10 snapshots sharing following 49 elements
       org.opensearch.security.privileges.dlsfls.AbstractRuleBasedPrivileges.hasRestrictedRulesExplicit(AbstractRuleBasedPrivileges.java:286)
       org.opensearch.security.privileges.dlsfls.AbstractRuleBasedPrivileges.isUnrestricted(AbstractRuleBasedPrivileges.java:212)
       org.opensearch.security.privileges.dlsfls.FieldMasking.isUnrestricted(FieldMasking.java:53)
       org.opensearch.security.configuration.DlsFlsValveImpl.hasFlsOrFieldMasking(DlsFlsValveImpl.java:519)
       org.opensearch.security.OpenSearchSecurityPlugin$4.doCache(OpenSearchSecurityPlugin.java:839)
       app//org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:981)
       app//org.opensearch.search.internal.ContextIndexSearcher.createWeight(ContextIndexSearcher.java:239)
       app//org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:58)
       app//org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:265)
       app//org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:979)
       app//org.opensearch.search.internal.ContextIndexSearcher.createWeight(ContextIndexSearcher.java:239)
       app//org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:58)
       app//org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:265)
       app//org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:979)
       app//org.opensearch.search.internal.ContextIndexSearcher.createWeight(ContextIndexSearcher.java:239)
       app//org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:58)
       app//org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:265)
       app//org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:979)
       app//org.opensearch.search.internal.ContextIndexSearcher.createWeight(ContextIndexSearcher.java:239)

On further investigation, I found that doCache can sometimes be called many times per query when I would expect it to run once per request. The changes in this PR ensure that its called once per request per shard, stores the result in the threadcontext and uses that for subsequent clauses.

For instance, in the query below doCache would be called 5x per shard

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "field_b": "other_0" } },
        { "term": { "field_b": "other_1" } },
        { "term": { "field_b": "other_2" } },
        { "term": { "field_b": "other_3" } },
        { "term": { "field_b": "other_4" } }
      ]
    }
  }
}
  • Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation)

Bug fix

Issues Resolved

To be filed

Check List

  • New functionality includes testing
  • New functionality has been documented
  • New Roles/Permissions have a corresponding security dashboards plugin PR
  • API changes companion pull request created
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@codecov
Copy link

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.81%. Comparing base (1d77e46) to head (0e8791a).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6001      +/-   ##
==========================================
- Coverage   73.82%   73.81%   -0.02%     
==========================================
  Files         439      439              
  Lines       27087    27104      +17     
  Branches     4018     4022       +4     
==========================================
+ Hits        19998    20007       +9     
- Misses       5180     5189       +9     
+ Partials     1909     1908       -1     
Files with missing lines Coverage Δ
.../opensearch/security/OpenSearchSecurityPlugin.java 85.19% <100.00%> (+0.12%) ⬆️

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cwperks cwperks closed this Mar 12, 2026

@Override
public Weight doCache(Weight weight, QueryCachingPolicy policy) {
if (!indexSettings.getValue(IndicesRequestCache.INDEX_CACHE_REQUEST_ENABLED_SETTING)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure if this setting is the correct one here. The method is a about the query cache, but this setting is index cache specific.

Maybe we can approach the issue by adding a cache to hasRestrictedRulesExplicit() to avoid recurring rule calculations. I can look into this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, I closed it because I'm not confident that I understand the different levels of caching deeply enough yet. I will re-open once I can take a deeper dive if it make sense.

Copy link
Member Author

@cwperks cwperks Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing, I'm finding that doCache can be called multiple times per query. For the example below it would be called 5x for each clause:

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "field_b": "other_0" } },
        { "term": { "field_b": "other_1" } },
        { "term": { "field_b": "other_2" } },
        { "term": { "field_b": "other_3" } },
        { "term": { "field_b": "other_4" } }
      ]
    }
  }
}

I think we can do some caching of the first result and short-circuit accordingly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i can take care of that.

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks cwperks reopened this Mar 12, 2026
@cwperks cwperks changed the title Skip logic in doCache if index request cache is disabled Only call doCache one per request Mar 12, 2026
@cwperks cwperks closed this Mar 12, 2026
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks cwperks reopened this Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants