[EVALS] Kiali tasks run during standard evaluations despite filtering infrastructure

## Description

The gevals workflow (`.github/workflows/gevals.yaml`) includes logic intended to keep Kiali tasks from running when Istio/Kiali is not installed. That logic only controls whether the Istio/Kiali infrastructure and the kiali toolset are set up; it does not restrict which tasks are selected. As a result, Kiali tasks run and fail during standard evaluation runs, which lowers the overall success rate.

**This is also a problem when running gevals locally**: there is currently no easy or documented way to exclude Kiali tasks from a local evaluation run.

## Background / Analysis

### Current Workflow Logic

The workflow at `.github/workflows/gevals.yaml` has the following logic:

1. [**Lines 82-87**](https://github.com/containers/kubernetes-mcp-server/blob/f6a3e4fbab9d7cff72a426129c8c34ffb8c9b290/.github/workflows/gevals.yaml#L82-L87): Sets `kiali-run=true` ONLY if the `task-filter` input contains the word "kiali"
2. [**Lines 109-111**](https://github.com/containers/kubernetes-mcp-server/blob/f6a3e4fbab9d7cff72a426129c8c34ffb8c9b290/.github/workflows/gevals.yaml#L109-L111): Istio/Kiali infrastructure is ONLY installed when `kiali-run=true`
3. [**Lines 113-116**](https://github.com/containers/kubernetes-mcp-server/blob/f6a3e4fbab9d7cff72a426129c8c34ffb8c9b290/.github/workflows/gevals.yaml#L113-L116): The MCP server enables the kiali toolset ONLY when `kiali-run=true`
4. [**Line 124**](https://github.com/containers/kubernetes-mcp-server/blob/f6a3e4fbab9d7cff72a426129c8c34ffb8c9b290/.github/workflows/gevals.yaml#L124): The `task-filter` parameter defaults to empty string `''`

### The Problem

When `task-filter` is empty (the default for scheduled runs and local executions):
- **All tasks run** because the glob pattern in `eval.yaml` (`../tasks/*/*/*.yaml`) matches both `kubernetes/` and `kiali/` task directories
- **Istio/Kiali is NOT installed** because `task-filter` doesn't contain "kiali"
- **Result**: All ~19 Kiali tasks run but fail because the required infrastructure isn't present
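
The mismatch can be reproduced in a few lines. The sketch below models the two independent decisions (task selection by glob, infrastructure setup by `task-filter`). The file paths and the case-sensitive substring check are assumptions for illustration, not taken from the repository.

```python
from pathlib import PurePosixPath

# Hypothetical task files; the real repository layout may differ.
TASK_FILES = [
    "tasks/kubernetes/fix-crashloop/task.yaml",
    "tasks/kubernetes/scale-deployment/task.yaml",
    "tasks/kiali/mesh-status/task.yaml",
]


def kiali_run(task_filter: str) -> bool:
    """Model of the workflow gate: true only if the filter mentions 'kiali'."""
    return "kiali" in task_filter


def selected(pattern: str) -> list[str]:
    """Model of the eval.yaml glob: '*' matches exactly one path component."""
    return [p for p in TASK_FILES if PurePosixPath(p).match(pattern)]


def tasks_missing_infra(task_filter: str) -> list[str]:
    """Kiali tasks that are selected even though Istio/Kiali was not installed."""
    if kiali_run(task_filter):
        return []
    return [p for p in selected("tasks/*/*/*.yaml") if "/kiali/" in p]


print(tasks_missing_infra(""))       # empty filter: Kiali tasks run without infra
print(tasks_missing_infra("kiali"))  # Kiali run: infra is installed
```

The point of the model is that nothing connects `kiali_run` to `selected`: the gate decides what is installed, while the glob alone decides what runs.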

### Why Regex Filtering Won't Work

The gevals tool filters tasks using the `--run` flag, which matches against `taskSpec.Metadata.Name` (the task name in the YAML file), NOT the file path.

**Kiali task names** (no consistent prefix):
- `"mesh-status"`, `"get-namespaces"`, `"Create a gateway"`, `"Remove fault Injection"`, `"List all VS in bookinfo namespace"`, etc.

**Kubernetes task names** (no consistent prefix):
- `"create-canary-deployment"`, `"fix-crashloop"`, `"horizontal-pod-autoscaler"`, `"scale-deployment"`, etc.

There is **no naming convention** that would allow a regex to distinguish between Kiali and Kubernetes tasks. A path-style regex like `^kubernetes/` would not work either: `--run` is matched against task names, not file paths, so a pattern written against the directory layout matches nothing.
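
To make this concrete, the sketch below applies patterns to the task names listed above. It uses Python's `re.search` as a stand-in for the matcher behind `--run`, which is an assumption about how gevals applies the pattern.

```python
import re

# Task names quoted in this issue (not the full set).
KIALI_NAMES = [
    "mesh-status",
    "get-namespaces",
    "Create a gateway",
    "Remove fault Injection",
    "List all VS in bookinfo namespace",
]
K8S_NAMES = [
    "create-canary-deployment",
    "fix-crashloop",
    "horizontal-pod-autoscaler",
    "scale-deployment",
]


def run_filter(pattern: str, names: list[str]) -> list[str]:
    """Return the names a name-based filter would select."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


all_names = KIALI_NAMES + K8S_NAMES

# A path-style pattern selects nothing, because names carry no directory.
path_style = run_filter(r"^kubernetes/", all_names)

# Names also overlap across categories, so word-based patterns are unreliable.
overlap = run_filter(r"(?i)^create", all_names)

print(path_style)
print(overlap)
```

The second pattern selects one Kiali task and one Kubernetes task, which illustrates why no pattern over the current names separates the two groups cleanly.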

## Impact

- **Scheduled evaluations** (weekly) run all ~43 tasks but only have infrastructure for the ~24 Kubernetes tasks
- **~19 Kiali tasks fail** because Istio/Kiali is not installed
- **Overall pass rate drops significantly** (~44% of tasks fail just from missing Kiali infrastructure)
- **Local gevals runs** have no easy way to exclude Kiali tasks, making local testing difficult
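
The percentages follow directly from the approximate task counts above. A quick check, treating the rounded counts as exact for the sake of the arithmetic:

```python
# Approximate counts taken from this issue.
total_tasks = 43
kiali_tasks = 19
kubernetes_tasks = total_tasks - kiali_tasks

# Share of tasks that fail purely because Istio/Kiali is missing.
fail_share = kiali_tasks / total_tasks

# Best achievable pass rate if every Kubernetes task passes.
pass_ceiling = kubernetes_tasks / total_tasks

print(f"guaranteed failures: {fail_share:.1%}")  # 44.2%
print(f"pass-rate ceiling:   {pass_ceiling:.1%}")  # 55.8%
```

In other words, a standard run cannot score above roughly 56% even when every Kubernetes task succeeds.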

## Evidence

See workflow run: https://github.com/containers/kubernetes-mcp-server/actions/runs/20913812421/job/60082412281

## Proposed Solutions

### Option 1: Adopt a Task Naming Convention (Recommended)
Prefix all task names with their category:
- Kiali tasks: `kiali-mesh-status`, `kiali-get-namespaces`, etc.
- Kubernetes tasks: `k8s-create-canary-deployment`, `k8s-fix-crashloop`, etc.

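This would allow filtering with a regex like `^k8s-`. A negated pattern such as `^(?!kiali-)` relies on negative lookahead, which Go's `regexp` package (RE2 syntax) does not support, so if gevals compiles `--run` patterns with that package, the positive `^k8s-` form is the one to use.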
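
A minimal sketch of how prefixed names would behave under a name-based filter. Python's `re` stands in for the matcher behind `--run` (an assumption), and the names are the examples proposed above.

```python
import re

# Example names using the proposed prefixes.
PREFIXED_NAMES = [
    "kiali-mesh-status",
    "kiali-get-namespaces",
    "k8s-create-canary-deployment",
    "k8s-fix-crashloop",
]


def select(pattern: str, names: list[str] = PREFIXED_NAMES) -> list[str]:
    """Return the task names matched by a --run style pattern."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


standard_run = select(r"^k8s-")   # what a scheduled run would pass
kiali_run = select(r"^kiali-")    # what a Kiali-specific run would pass

print(standard_run)
print(kiali_run)
```

Only positive prefixes are used here, so the patterns do not depend on lookahead support in the regex engine.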

### Option 2: Separate Eval Configuration Files
Create separate eval.yaml files:
- `eval-kubernetes.yaml` with glob `../tasks/kubernetes/*/*.yaml`
- `eval-kiali.yaml` with glob `../tasks/kiali/*/*.yaml`

Update the workflow to use the appropriate config based on the run type. This also makes local execution clearer.
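
One way the workflow could pick a config is to reuse the existing `kiali-run` decision. The sketch below is illustrative only: the selection rule (Kiali runs use only the Kiali config) is an assumption, and a real setup might run both files in a Kiali-enabled job.

```python
# Config files and globs as proposed in this option.
CONFIG_GLOBS = {
    "eval-kubernetes.yaml": "../tasks/kubernetes/*/*.yaml",
    "eval-kiali.yaml": "../tasks/kiali/*/*.yaml",
}


def choose_config(task_filter: str) -> str:
    """Pick the eval config using the same rule that sets kiali-run."""
    if "kiali" in task_filter:
        return "eval-kiali.yaml"
    return "eval-kubernetes.yaml"


for task_filter in ("", "kiali"):
    config = choose_config(task_filter)
    print(f"task-filter={task_filter!r} -> {config} ({CONFIG_GLOBS[config]})")
```

Because the default (empty) filter maps to the Kubernetes-only config, scheduled and local runs would skip Kiali tasks without any extra flags.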

## Recommended Next Steps

Consider enhancing gevals to support path-based filtering (e.g., a `--path-filter` or `--exclude-path` flag that matches against file paths instead of task names). This would be a more robust solution that doesn't require changes to existing task definitions.
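
The intended semantics of such flags could look like the sketch below. The flag names are the hypothetical ones suggested above, the file paths are made up for illustration, and none of this reflects existing gevals behavior.

```python
import re
from typing import Optional


def filter_paths(
    paths: list[str],
    path_filter: Optional[str] = None,   # hypothetical --path-filter
    exclude_path: Optional[str] = None,  # hypothetical --exclude-path
) -> list[str]:
    """Keep paths matching path_filter (if set) and not matching exclude_path."""
    include = re.compile(path_filter) if path_filter else None
    exclude = re.compile(exclude_path) if exclude_path else None
    kept = []
    for path in paths:
        if include is not None and not include.search(path):
            continue
        if exclude is not None and exclude.search(path):
            continue
        kept.append(path)
    return kept


# Hypothetical task files.
paths = [
    "tasks/kubernetes/fix-crashloop/task.yaml",
    "tasks/kiali/mesh-status/task.yaml",
]

print(filter_paths(paths, exclude_path=r"/kiali/"))
print(filter_paths(paths, path_filter=r"/kiali/"))
```

Filtering on paths keeps the category in the directory layout, so new tasks are classified by where they are placed rather than by how they are named.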

## Acceptance Criteria

- [ ] Standard (non-Kiali) evaluation runs do NOT execute Kiali tasks
- [ ] Kiali-specific runs (with appropriate task-filter) still work correctly
- [ ] Local gevals runs have a documented and easy way to exclude Kiali tasks
- [ ] The solution is maintainable when new tasks are added

## Labels

bug, enhancement

## Tests

- Run a standard gevals evaluation and verify that only Kubernetes tasks execute
- Run a Kiali-specific gevals evaluation and verify that Kiali tasks execute with Istio installed
- Verify local `./gevals-linux-amd64 eval evals/claude-code/eval.yaml` can filter out kiali tasks
