Description
What is the bug?
For clusters with large numbers of indices (e.g. ~11,000 in the case where I first observed this issue), with the plugins.security.audit.config.resolve_indices OpenSearch setting enabled and the appender.rolling_audit.layout.maxMessageLength=0 log4j setting applied (to disable log message truncation), the audit_trace_indices and audit_trace_resolved_indices fields in audit logs for requests that relate to all indices can be extremely long. For example, 11,000 indexes with 100-character names require 1,100,000 characters just to represent the index names alone (plus even more characters for the surrounding quotes and the commas between them).
These messages can get so large that they cause problems for downstream parts of your logging pipeline (for example, the default Apache Kafka maximum message size is 1 MB). In these cases, it is usually recommended to split large messages into smaller ones, since smaller messages can be handled more efficiently than giant ones.
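To make the size estimate concrete, here is a rough sketch that serializes a list of placeholder index names as compact JSON and counts the bytes. The function name and the 1,048,576-byte comparison value are assumptions for illustration only; the real limit depends on how your pipeline is configured.

```python
import json


def estimated_index_list_bytes(index_count: int, name_length: int) -> int:
    """Return the size in bytes of a compact JSON array of index names.

    Placeholder names are used because only their length matters here.
    """
    names = ["x" * name_length for _ in range(index_count)]
    # separators=(",", ":") removes the spaces json.dumps adds by default.
    return len(json.dumps(names, separators=(",", ":")).encode("utf-8"))


# Assumed downstream limit of 1 MiB, used purely as a comparison point.
ASSUMED_LIMIT_BYTES = 1024 * 1024

size = estimated_index_list_bytes(11_000, 100)
print(size, size > ASSUMED_LIMIT_BYTES)
```

Under these assumptions a single resolved-indices array already exceeds the comparison limit before any other audit fields are added.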
It would be ideal if OpenSearch could be configured to split audit messages with huge numbers of index names (such as by specifying a maximum number of index name characters per log message) into multiple smaller messages, keeping all other audit message fields the same apart from audit_trace_indices and audit_trace_resolved_indices. For a simple example, if we could set the maximum index name characters per message to 18, this original message:
{
"audit_trace_indices": [
"index*"
],
"audit_trace_resolved_indices": [
"index1",
"index2",
"index3",
"index4",
"index5",
"index6",
"index7",
"index8"
],
"audit_category": "AUTHENTICATED"
}
Would then be split into the following 3 messages:
{
"audit_trace_indices": [
"index*"
],
"audit_trace_resolved_indices": [
"index1",
"index2",
"index3"
],
"audit_category": "AUTHENTICATED"
}
{
"audit_trace_resolved_indices": [
"index4",
"index5",
"index6"
],
"audit_category": "AUTHENTICATED"
}
{
"audit_trace_resolved_indices": [
"index7",
"index8"
],
"audit_category": "AUTHENTICATED"
}
These 3 split messages contain exactly the same information as the source message, so no information is lost (although it takes more effort to reconstruct the original event).
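The splitting behaviour requested above could look roughly like the following sketch. It is not how the security plugin works today; the function names and the max_chars parameter are hypothetical. One design choice differs from the example: every part repeats all other fields, including audit_trace_indices, so that each part can be read on its own.

```python
from typing import Any, Dict, Iterator, List


def chunk_by_chars(names: List[str], max_chars: int) -> Iterator[List[str]]:
    """Yield groups of names whose combined length stays within max_chars.

    A single name longer than max_chars is emitted alone rather than dropped.
    """
    chunk: List[str] = []
    used = 0
    for name in names:
        if chunk and used + len(name) > max_chars:
            yield chunk
            chunk, used = [], 0
        chunk.append(name)
        used += len(name)
    if chunk:
        yield chunk


def split_audit_message(
    message: Dict[str, Any],
    max_chars: int,
    field: str = "audit_trace_resolved_indices",
) -> List[Dict[str, Any]]:
    """Split one audit message into several, chunking only the given field."""
    names = message.get(field)
    if not names:
        # Nothing to split: return an unchanged copy.
        return [dict(message)]
    parts = []
    for chunk in chunk_by_chars(names, max_chars):
        part = dict(message)  # shallow copy keeps every other field as-is
        part[field] = chunk
        parts.append(part)
    return parts


original = {
    "audit_trace_indices": ["index*"],
    "audit_trace_resolved_indices": [f"index{i}" for i in range(1, 9)],
    "audit_category": "AUTHENTICATED",
}
parts = split_audit_message(original, max_chars=18)
```

With a limit of 18 characters and six-character names, each part carries at most three index names, and concatenating the chunked field across all parts gives back the original list.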
How can one reproduce the bug?
Steps to reproduce the behavior:
- Enable log4j audit logging on your cluster with an unlimited max message size, e.g. appender.rolling_audit.layout.maxMessageLength=0, and with plugins.security.audit.config.resolve_indices enabled.
- Create 11,000 indexes with names at least 100 characters long.
- Perform a simple request that relates to every index, e.g. "GET /*"
- Observe the giant audit logs produced in the log4j output file.
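For the index-creation step, a small helper such as the one below could generate suitable names. This is only a sketch: the "audit-repro-" prefix and the padding scheme are made up for illustration, and actually creating the indexes (for example with one PUT request per name) is left to whichever client you prefer.

```python
from typing import List


def make_index_names(
    count: int = 11_000,
    length: int = 100,
    prefix: str = "audit-repro-",  # hypothetical prefix, lowercase as index names require
) -> List[str]:
    """Return `count` unique lowercase names, each exactly `length` characters."""
    width = len(str(max(count - 1, 0)))
    names = []
    for i in range(count):
        stem = f"{prefix}{i:0{width}d}-"
        if len(stem) > length:
            raise ValueError("length is too small for the prefix and counter")
        # Pad with a filler character to reach the target length.
        names.append(stem.ljust(length, "x"))
    return names


names = make_index_names()
print(len(names), len(names[0]))
```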
What is the expected behavior?
It would be ideal if OpenSearch could be configured to split audit messages with huge numbers of index names (such as by specifying a maximum number of index name characters per log message) into multiple smaller messages, keeping all other audit message fields the same apart from audit_trace_indices and audit_trace_resolved_indices.
What is your host/environment?
- OS: Debian 12
Do you have any additional context?
Somewhat relates to #5363