Description
The gevals workflow (gevals.yaml) includes infrastructure intended to filter out Kiali tasks when Istio/Kiali is not installed. However, this filtering mechanism is not working as intended, causing Kiali tasks to run and fail during standard evaluation runs, which negatively impacts the overall success rate.
This is also a problem when running gevals locally - there is currently no easy or documented way to exclude Kiali tasks from a local evaluation run.
Background / Analysis
Current Workflow Logic
The workflow at .github/workflows/gevals.yaml has the following logic:
- Lines 82-87: Sets kiali-run=true ONLY if the task-filter input contains the word "kiali"
- Lines 109-111: Istio/Kiali infrastructure is ONLY installed when kiali-run=true
- Lines 113-116: The MCP server enables the kiali toolset ONLY when kiali-run=true
- Line 124: The task-filter parameter defaults to empty string ''
The Problem
When task-filter is empty (the default for scheduled runs and local executions):
- All tasks run because the glob pattern in eval.yaml (../tasks/*/*/*.yaml) matches both kubernetes/ and kiali/ task directories
- Istio/Kiali is NOT installed because task-filter doesn't contain "kiali"
- Result: All ~19 Kiali tasks run but fail because the required infrastructure isn't present
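The interaction described above can be sketched in a few lines of Python. This is an illustrative model, not the workflow's actual code: the two task file paths are hypothetical, the segment-wise matcher only approximates whatever glob expansion gevals performs, and the substring check stands in for however lines 82-87 detect "kiali".

```python
from fnmatch import fnmatchcase

GLOB = "../tasks/*/*/*.yaml"  # pattern quoted from eval.yaml

# Hypothetical task files, one per category, for illustration only.
TASK_FILES = [
    "../tasks/kubernetes/fix-crashloop/task.yaml",
    "../tasks/kiali/mesh-status/task.yaml",
]


def glob_match(pattern: str, path: str) -> bool:
    """Match segment by segment so that '*' never crosses a '/'."""
    p_parts, f_parts = pattern.split("/"), path.split("/")
    return len(p_parts) == len(f_parts) and all(
        fnmatchcase(f, p) for p, f in zip(p_parts, f_parts)
    )


def plan_run(task_filter: str) -> tuple[bool, list[str]]:
    """Return (is Kiali infrastructure installed?, task files selected)."""
    kiali_run = "kiali" in task_filter  # assumed stand-in for lines 82-87
    selected = [f for f in TASK_FILES if glob_match(GLOB, f)]
    return kiali_run, selected


# Default case: empty task-filter, as in scheduled and local runs.
kiali_installed, selected = plan_run("")
print(kiali_installed, selected)
```

With an empty filter, the Kiali task file is still selected while the infrastructure flag stays false, which is the mismatch this issue reports.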
Why Regex Filtering Won't Work
The gevals tool filters tasks using the --run flag, which matches against taskSpec.Metadata.Name (the task name in the YAML file), NOT the file path.
Kiali task names (no consistent prefix):
"mesh-status","get-namespaces","Create a gateway","Remove fault Injection","List all VS in bookinfo namespace", etc.
Kubernetes task names (no consistent prefix):
"create-canary-deployment","fix-crashloop","horizontal-pod-autoscaler","scale-deployment", etc.
There is no naming convention that would allow a regex to distinguish between Kiali and Kubernetes tasks. A regex like ^kubernetes/ would not work because it targets file paths, whereas --run matches against task names only.
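A small Python sketch makes the mismatch concrete. The task names are taken from the lists above; the file paths are hypothetical, and the unanchored search is only an assumption about how --run applies its pattern.

```python
import re

# (hypothetical file path, task name as listed in this issue)
TASKS = [
    ("kiali/mesh-status/task.yaml", "mesh-status"),
    ("kiali/get-namespaces/task.yaml", "get-namespaces"),
    ("kubernetes/fix-crashloop/task.yaml", "fix-crashloop"),
    ("kubernetes/scale-deployment/task.yaml", "scale-deployment"),
]

PATTERN = re.compile(r"^kubernetes/")

# What --run does according to this issue: match against the task name.
by_name = [name for _, name in TASKS if PATTERN.search(name)]

# What the pattern was written for: match against the file path.
by_path = [name for path, name in TASKS if PATTERN.search(path)]

print(by_name)  # no task name starts with "kubernetes/"
print(by_path)
```

The pattern itself is fine; it simply never sees the directory information it relies on.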
Impact
- Scheduled evaluations (weekly) run all ~43 tasks but only have infrastructure for the Kubernetes tasks (~24)
- ~19 Kiali tasks fail because Istio/Kiali is not installed
- Overall pass rate drops significantly (roughly 19 of 43 tasks, ~44%, fail solely because of missing Kiali infrastructure)
- Local gevals runs have no easy way to exclude Kiali tasks, making local testing difficult
Evidence
See workflow run: https://github.com/containers/kubernetes-mcp-server/actions/runs/20913812421/job/60082412281
Proposed Solutions
Option 1: Adopt a Task Naming Convention (Recommended)
Prefix all task names with their category:
- Kiali tasks: kiali-mesh-status, kiali-get-namespaces, etc.
- Kubernetes tasks: k8s-create-canary-deployment, k8s-fix-crashloop, etc.
This would allow filtering with regex like ^k8s- or ^(?!kiali-).
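As a rough check of the two patterns, the following Python sketch applies them to the example names above. The select helper is hypothetical and uses Python's re module, which may behave differently from the regex engine behind --run.

```python
import re

# Example names from the proposed convention.
RENAMED = [
    "kiali-mesh-status",
    "kiali-get-namespaces",
    "k8s-create-canary-deployment",
    "k8s-fix-crashloop",
]


def select(pattern: str, names: list[str]) -> list[str]:
    """Keep the names the pattern matches (hypothetical stand-in for --run)."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


only_k8s = select(r"^k8s-", RENAMED)        # positive prefix match
not_kiali = select(r"^(?!kiali-)", RENAMED)  # negative lookahead

print(only_k8s)
print(not_kiali)
```

Both patterns select the same two tasks here. Note that negative lookahead is not available in every engine: Go's standard regexp package, for example, does not support it. If gevals is built on that package (not verified here), ^k8s- would be the safer form.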
Option 2: Separate Eval Configuration Files
Create separate eval.yaml files:
- eval-kubernetes.yaml with glob ../tasks/kubernetes/*/*.yaml
- eval-kiali.yaml with glob ../tasks/kiali/*/*.yaml
Update the workflow to use the appropriate config based on the run type. This also makes local execution clearer.
Recommended Next Steps
Consider enhancing gevals to support path-based filtering (e.g., a --path-filter or --exclude-path flag that matches against file paths instead of task names). This would be a more robust solution that doesn't require changes to existing task definitions.
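The intended semantics of such a flag could look like the sketch below. Everything in it is hypothetical: gevals has no such option today, the function name is invented for illustration, and the file paths are made up.

```python
import re


def exclude_by_path(task_files: list[str], exclude_pattern: str) -> list[str]:
    """Drop task files whose path matches the pattern.

    Models a hypothetical --exclude-path flag; not an existing gevals API.
    """
    rx = re.compile(exclude_pattern)
    return [f for f in task_files if not rx.search(f)]


# Hypothetical task files, for illustration only.
files = [
    "../tasks/kubernetes/fix-crashloop/task.yaml",
    "../tasks/kiali/mesh-status/task.yaml",
]

kept = exclude_by_path(files, r"/kiali/")
print(kept)
```

Because the filter works on paths, new tasks placed under the kiali/ directory would be excluded automatically, with no renaming of existing tasks.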
Acceptance Criteria
- Standard (non-Kiali) evaluation runs do NOT execute Kiali tasks
- Kiali-specific runs (with appropriate task-filter) still work correctly
- Local gevals runs have a documented and easy way to exclude Kiali tasks
- The solution is maintainable when new tasks are added
Labels
bug, enhancement
Tests
- Run standard gevals evaluation and verify only kubernetes tasks execute
- Run kiali-specific gevals evaluation and verify kiali tasks execute with Istio installed
- Verify local ./gevals-linux-amd64 eval evals/claude-code/eval.yaml can filter out kiali tasks