[EVALS] Kiali tasks run during standard evaluations despite filtering infrastructure #642

@manusa

Description

The gevals workflow (gevals.yaml) includes infrastructure intended to filter out Kiali tasks when Istio/Kiali is not installed. However, this filtering mechanism is not working as intended, causing Kiali tasks to run and fail during standard evaluation runs, which negatively impacts the overall success rate.

This is also a problem when running gevals locally - there is currently no easy or documented way to exclude Kiali tasks from a local evaluation run.

Background / Analysis

Current Workflow Logic

The workflow at .github/workflows/gevals.yaml has the following logic:

  1. Lines 82-87: Sets kiali-run=true ONLY if the task-filter input contains the word "kiali"
  2. Lines 109-111: Istio/Kiali infrastructure is ONLY installed when kiali-run=true
  3. Lines 113-116: The MCP server enables the kiali toolset ONLY when kiali-run=true
  4. Line 124: The task-filter parameter defaults to empty string ''
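The four steps above can be modelled as a small function. This is only a sketch of the gating described in this issue, not the workflow's actual code: `plan_run` is a hypothetical name, and a plain case-sensitive substring check is assumed for the "contains the word kiali" test.

```python
def plan_run(task_filter: str = "") -> dict:
    """Model of the gating steps listed above (illustrative, not the real workflow)."""
    # Step 1: kiali-run is true only when the filter mentions "kiali"
    # (assumed here to be a simple substring check).
    kiali_run = "kiali" in task_filter
    return {
        "kiali_run": kiali_run,
        "install_istio_kiali": kiali_run,   # step 2: infrastructure gated on kiali-run
        "enable_kiali_toolset": kiali_run,  # step 3: MCP toolset gated on kiali-run
        "task_filter": task_filter,         # step 4: defaults to the empty string
    }


default_plan = plan_run()          # scheduled or local run with no filter
kiali_plan = plan_run("kiali")     # run that explicitly asks for Kiali tasks

print(default_plan)
print(kiali_plan)
```

With the default empty filter, nothing Kiali-related is installed or enabled, yet nothing in this logic restricts which tasks are selected.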

The Problem

When task-filter is empty (the default for scheduled runs and local executions):

  • All tasks run because the glob pattern in eval.yaml (../tasks/*/*/*.yaml) matches both kubernetes/ and kiali/ task directories
  • Istio/Kiali is NOT installed because task-filter doesn't contain "kiali"
  • Result: All ~19 Kiali tasks run but fail because the required infrastructure isn't present
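To illustrate why the glob selects both categories, here is a minimal sketch using Python's `pathlib` matching as a stand-in for whatever glob implementation gevals uses. The file paths are hypothetical examples, and the leading `../` of the pattern is dropped for simplicity.

```python
from pathlib import PurePosixPath

# Hypothetical task files; the real directory and file names may differ.
task_files = [
    "tasks/kubernetes/fix-crashloop/task.yaml",
    "tasks/kiali/mesh-status/task.yaml",
]

# The glob from eval.yaml, without the leading "../".
pattern = "tasks/*/*/*.yaml"

# Each "*" matches exactly one path component, so the category
# directory (kubernetes or kiali) is matched by the first wildcard.
selected = [p for p in task_files if PurePosixPath(p).match(pattern)]

print(selected)
```

Both files are selected, because the first wildcard accepts any category directory.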

Why Regex Filtering Won't Work

The gevals tool filters tasks using the --run flag, which matches against taskSpec.Metadata.Name (the task name in the YAML file), NOT the file path.

Kiali task names (no consistent prefix):

  • "mesh-status", "get-namespaces", "Create a gateway", "Remove fault Injection", "List all VS in bookinfo namespace", etc.

Kubernetes task names (no consistent prefix):

  • "create-canary-deployment", "fix-crashloop", "horizontal-pod-autoscaler", "scale-deployment", etc.

There is no naming convention that would allow a regex to distinguish between Kiali and Kubernetes tasks. A regex like ^kubernetes/ would not work either: it targets file paths, while --run compares the pattern against task names only.
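The mismatch can be shown with a short sketch. Python's `re` module stands in for the gevals matcher here, the task names are taken from the lists above, and the file paths are hypothetical.

```python
import re

# (file path, task name) pairs. Paths are hypothetical; names come from the lists above.
tasks = [
    ("tasks/kiali/mesh-status/task.yaml", "mesh-status"),
    ("tasks/kiali/create-gateway/task.yaml", "Create a gateway"),
    ("tasks/kubernetes/fix-crashloop/task.yaml", "fix-crashloop"),
    ("tasks/kubernetes/scale-deployment/task.yaml", "scale-deployment"),
]

pattern = re.compile(r"^kubernetes/")

# What a name-based filter sees: the pattern is tested against task names.
by_name = [name for _, name in tasks if pattern.search(name)]

# What the pattern was written for: paths relative to the tasks directory.
by_path = [
    name for path, name in tasks
    if pattern.search(path.removeprefix("tasks/"))
]

print(by_name)
print(by_path)
```

Matched against names, the pattern selects nothing; matched against paths, it would select exactly the Kubernetes tasks.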

Impact

  • Scheduled evaluations (weekly) run all ~43 tasks but only have infrastructure for kubernetes tasks (~24)
  • ~19 Kiali tasks fail because Istio/Kiali is not installed
  • Overall pass rate drops significantly: roughly 19 of 43 tasks (~44%) fail solely because the Kiali infrastructure is missing
  • Local gevals runs have no easy way to exclude Kiali tasks, making local testing difficult
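The percentages follow directly from the approximate task counts quoted above. A quick check, treating the counts as exact for the sake of the arithmetic:

```python
total_tasks = 43   # approximate total from this issue
kiali_tasks = 19   # approximate number of Kiali tasks
kubernetes_tasks = total_tasks - kiali_tasks

# Share of tasks that fail only because Istio/Kiali is absent.
failure_share = kiali_tasks / total_tasks

# Best pass rate a standard run can reach while Kiali tasks still execute.
max_pass_rate = kubernetes_tasks / total_tasks

print(f"{kubernetes_tasks} kubernetes tasks")
print(f"{failure_share:.0%} guaranteed failures, {max_pass_rate:.0%} pass-rate ceiling")
```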

Evidence

See workflow run: https://github.com/containers/kubernetes-mcp-server/actions/runs/20913812421/job/60082412281

Proposed Solutions

Option 1: Adopt a Task Naming Convention (Recommended)

Prefix all task names with their category:

  • Kiali tasks: kiali-mesh-status, kiali-get-namespaces, etc.
  • Kubernetes tasks: k8s-create-canary-deployment, k8s-fix-crashloop, etc.

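This would allow filtering with a regex such as ^k8s-. A negative-lookahead form like ^(?!kiali-) is only an option if the regex engine behind --run supports lookaheads; Go's standard regexp package (RE2 syntax) does not, so the positive prefix is the safer choice if gevals relies on it.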
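A minimal sketch of how the prefix convention would make name-based filtering work. The renamed tasks are the examples proposed above, and Python's `re` stands in for the gevals matcher. Only the positive `^k8s-` form is used, since it does not depend on lookahead support in the regex engine.

```python
import re

# Task names after the proposed renaming (examples from this issue).
renamed_tasks = [
    "kiali-mesh-status",
    "kiali-get-namespaces",
    "k8s-create-canary-deployment",
    "k8s-fix-crashloop",
]

# A standard run keeps only tasks carrying the k8s- prefix.
k8s_only = [name for name in renamed_tasks if re.search(r"^k8s-", name)]

# A Kiali run keeps only tasks carrying the kiali- prefix.
kiali_only = [name for name in renamed_tasks if re.search(r"^kiali-", name)]

print(k8s_only)
print(kiali_only)
```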

Option 2: Separate Eval Configuration Files

Create separate eval.yaml files:

  • eval-kubernetes.yaml with glob ../tasks/kubernetes/*/*.yaml
  • eval-kiali.yaml with glob ../tasks/kiali/*/*.yaml

Update the workflow to use the appropriate config based on the run type. This also makes local execution clearer.
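One way the config selection could look, sketched under assumptions: `select_eval_config` and `tasks_for` are hypothetical helpers, the selection reuses the same "filter contains kiali" rule the workflow already applies, the globs omit the leading `../`, and the file paths are made up for illustration.

```python
from pathlib import PurePosixPath

# Proposed config files and their globs (leading "../" dropped for simplicity).
CONFIG_GLOBS = {
    "eval-kubernetes.yaml": "tasks/kubernetes/*/*.yaml",
    "eval-kiali.yaml": "tasks/kiali/*/*.yaml",
}


def select_eval_config(task_filter: str = "") -> str:
    """Pick the config for a run; assumes the existing 'contains kiali' rule."""
    return "eval-kiali.yaml" if "kiali" in task_filter else "eval-kubernetes.yaml"


def tasks_for(config: str, task_files: list[str]) -> list[str]:
    """Return the task files matched by the chosen config's glob."""
    glob = CONFIG_GLOBS[config]
    return [p for p in task_files if PurePosixPath(p).match(glob)]


# Hypothetical task files.
task_files = [
    "tasks/kubernetes/fix-crashloop/task.yaml",
    "tasks/kiali/mesh-status/task.yaml",
]

standard_tasks = tasks_for(select_eval_config(""), task_files)
kiali_tasks = tasks_for(select_eval_config("kiali"), task_files)

print(standard_tasks)
print(kiali_tasks)
```

Because each glob names its category directory explicitly, the two runs select disjoint task sets without any name-based filtering.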

Recommended Next Steps

Consider enhancing gevals to support path-based filtering (e.g., a --path-filter or --exclude-path flag that matches against file paths instead of task names). This would be a more robust solution that doesn't require changes to existing task definitions.
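The semantics of such a flag might look like the following sketch. Note that `--exclude-path` does not exist in gevals today; `filter_by_path` is a hypothetical helper, the paths are illustrative, and Python's `re` is used only to show the intended behaviour.

```python
import re
from typing import Optional


def filter_by_path(task_files: list[str], exclude_path: Optional[str] = None) -> list[str]:
    """Drop task files whose path matches the exclude regex (hypothetical flag)."""
    if exclude_path is None:
        # No exclusion requested: keep every discovered task file.
        return list(task_files)
    pattern = re.compile(exclude_path)
    return [p for p in task_files if not pattern.search(p)]


# Hypothetical task files.
task_files = [
    "tasks/kubernetes/fix-crashloop/task.yaml",
    "tasks/kubernetes/scale-deployment/task.yaml",
    "tasks/kiali/mesh-status/task.yaml",
]

# Equivalent of passing something like --exclude-path '(^|/)tasks/kiali/'.
remaining = filter_by_path(task_files, r"(^|/)tasks/kiali/")

print(remaining)
```

Filtering on the path keeps existing task names untouched and automatically covers new tasks added under the same directory.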

Acceptance Criteria

  • Standard (non-Kiali) evaluation runs do NOT execute Kiali tasks
  • Kiali-specific runs (with appropriate task-filter) still work correctly
  • Local gevals runs have a documented and easy way to exclude Kiali tasks
  • The solution is maintainable when new tasks are added

Labels

bug, enhancement

Tests

  • Run standard gevals evaluation and verify only kubernetes tasks execute
  • Run kiali-specific gevals evaluation and verify kiali tasks execute with Istio installed
  • Verify local ./gevals-linux-amd64 eval evals/claude-code/eval.yaml can filter out kiali tasks
