Add analysis modules for code health and technical debt tracking #1232
karthiknadig merged 13 commits into main
Pull request overview
This pull request introduces a comprehensive code health analysis system for tracking technical debt and code quality metrics over time. The implementation consists of Python-based analysis modules that examine git history, code complexity, dependency patterns, and debt indicators, plus a CI workflow to generate regular snapshots.
Changes:
- Adds Python analysis modules for git history analysis, complexity metrics, dependency analysis, and technical debt detection
- Introduces GitHub Actions workflow for automated snapshot generation on pushes to main
- Adds agent configuration and hooks for maintainer/reviewer workflows
- Creates skill documentation for snapshot generation and development workflows
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `analysis/snapshot.py` | Main orchestrator that aggregates analysis results into a JSON snapshot |
| `analysis/git_analysis.py` | Analyzes git history for hotspots, churn, and temporal coupling |
| `analysis/dependency_analysis.py` | Analyzes TypeScript/JavaScript module dependencies and coupling |
| `analysis/debt_indicators.py` | Scans for TODO/FIXME markers and code smells |
| `analysis/complexity_analysis.py` | Calculates complexity metrics using radon and regex |
| `analysis/__init__.py` | Package initialization |
| `analysis/pyproject.toml` | Python dependencies specification |
| `.github/workflows/code-analysis.yml` | CI workflow for generating snapshots |
| `.github/hooks/maintainer-hooks.json` | Agent hooks configuration |
| `.github/hooks/scripts/*.py` | Hook scripts for session management and validation |
| `.github/agents/*.agent.md` | Agent definitions for reviewer and maintainer |
| `.github/skills/*/SKILL.md` | Skill documentation for development workflows |
| `.gitignore` | Excludes generated snapshot files |
Comments suppressed due to low confidence (2)
analysis/dependency_analysis.py:17
- Import of 'Tuple' is not used.
from typing import Dict, List, Optional, Set, Tuple
.github/hooks/scripts/session_start.py:94
- 'except' clause does nothing but pass and there is no explanatory comment.
except json.JSONDecodeError:
- Fix Python 3.9 typing compatibility (use typing module)
- Remove unused imports: defaultdict in dependency_analysis.py
- Remove unused variables: func_start, since_date, agent_id
- Add explanatory comments for except pass blocks
- Fix uv pip install command in code-analysis.yml
- Update README.md to say Python 3.9+ instead of 3.10+
- Remove unused gitpython dependency from pyproject.toml
- Rename manager-discovery skill to python-manager-discovery
- Quote # symbol in cross-platform-paths description
- Set user-invocable: false for reference skills
- Remove / prefix from skill name references in agents/hooks

- Fix Python 3.9 typing compatibility (use typing module)
- Remove unused imports: defaultdict in dependency_analysis.py, Tuple in dependency_analysis.py
- Remove unused variables: func_start, since_date, agent_id
- Add explanatory comments for except pass blocks
- Fix uv pip install command in code-analysis.yml
- Update README.md to say Python 3.9+ instead of 3.10+
- Remove unused gitpython dependency from pyproject.toml
- Rename manager-discovery skill to python-manager-discovery
- Quote # symbol in cross-platform-paths description
- Set user-invocable: false for reference skills
- Remove / prefix from skill name references in agents/hooks
ac29f94 to 93b7995
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (4)
analysis/debt_indicators.py:258
- The function boundary detection for Python is overly simplistic and may misidentify function ends. It detects the end of a function when encountering a line with indentation less than or equal to the function definition indentation, but this doesn't account for nested functions, class methods, decorators, or other Python constructs. This could lead to incorrect function length calculations. Consider using an AST-based approach for more accurate function boundary detection.
    # Detect when we've left the current function (dedent to same or less level)
    elif current_func and line.strip():
        line_indent = len(line) - len(line.lstrip())
        if line_indent <= current_indent and not line.strip().startswith("#"):
            # End of function
            length = i - func_start
            if length > LONG_FUNCTION_THRESHOLD:
                long_funcs.append(
                    LongFunction(
                        file=rel_path,
                        function_name=current_func,
                        line=func_start + 1,
                        length=length,
                    )
                )
            current_func = None
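The AST-based approach suggested in this comment could look roughly like the sketch below. It is not the PR's implementation: the `find_long_functions` name, the trimmed-down `LongFunction` tuple, and the threshold value of 50 are assumptions made for illustration. It relies on `end_lineno`, which `ast.parse` populates on Python 3.8 and later.

```python
import ast
from typing import List, NamedTuple

# Assumed value; the PR's actual LONG_FUNCTION_THRESHOLD is not shown in the excerpt.
LONG_FUNCTION_THRESHOLD = 50


class LongFunction(NamedTuple):
    """Hypothetical, simplified record (the PR's version also stores the file)."""
    function_name: str
    line: int
    length: int


def find_long_functions(
    source: str, threshold: int = LONG_FUNCTION_THRESHOLD
) -> List[LongFunction]:
    """Return functions whose span (def line through last body line) exceeds threshold."""
    results = []
    for node in ast.walk(ast.parse(source)):
        # ast.walk visits nested functions and methods as well as top-level ones.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > threshold:
                results.append(LongFunction(node.name, node.lineno, length))
    return sorted(results, key=lambda f: f.line)
```

Because the parser determines where each function ends, nested functions, methods, and dedented comments do not affect the measured length. Files that fail to parse raise `SyntaxError`, which a caller would need to handle.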
analysis/complexity_analysis.py:195
- The ternary operator pattern '\b\?\s*[^:]+\s*:' is problematic because it will match the colon in object literals, type annotations, and other TypeScript constructs that contain '?'. For example, it would incorrectly count 'type Foo = { bar?: string }' as a branching statement. Consider making the pattern more specific or using an AST-based approach.
r"\b\?\s*[^:]+\s*:", # ternary
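Regex behaviour around `\b` and `?` is easy to misjudge, so a quick probe can show which inputs the quoted pattern actually counts. The sample strings are illustrative assumptions, not taken from the repository.

```python
import re

# Pattern copied verbatim from the excerpt above.
TERNARY = re.compile(r"\b\?\s*[^:]+\s*:")

# Hypothetical TypeScript snippets chosen to exercise different uses of "?".
samples = {
    "optional property": "type Foo = { bar?: string }",
    "spaced ternary": "const v = cond ? a : b;",
    "unspaced ternary": "const v = cond?a:b;",
    "optional chaining before a colon": "const o = { a: obj?.x, b: 1 };",
}

matches = {label: bool(TERNARY.search(text)) for label, text in samples.items()}

for label, hit in matches.items():
    print(f"{label}: {'match' if hit else 'no match'}")
```

Under Python's `re`, `\b` requires a word character directly before the `?`, and `[^:]+` requires at least one character between `?` and `:`. As a result the optional-property sample and the conventionally spaced ternary are not matched, while the unspaced ternary and the optional-chaining sample are. The pattern therefore appears to both miss real ternaries and count non-ternaries, which supports the suggestion to tighten it or move to an AST.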
analysis/debt_indicators.py:194
- The code line counting logic is overly simplistic and doesn't account for multi-line comments, docstrings, or other non-code content. For Python, it only excludes lines starting with '#', but triple-quoted strings (docstrings) and multi-line comments are still counted. For TypeScript, it only excludes lines starting with '//', but multi-line /* */ comments and JSDoc blocks are still counted. Consider using a more robust approach or document this limitation.
    # Count code lines (non-empty, non-comment)
    suffix = filepath.suffix.lower()
    if suffix == ".py":
        code_lines = sum(
            1 for line in lines if line.strip() and not line.strip().startswith("#")
        )
    elif suffix in {".ts", ".js", ".tsx", ".jsx"}:
        code_lines = sum(
            1 for line in lines if line.strip() and not line.strip().startswith("//")
        )
    else:
        code_lines = total_lines
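For the Python branch, one possible "more robust approach" is to let the standard `tokenize` module decide what counts as code. The sketch below is an assumption about how that might be done, not code from the PR. The `count_python_code_lines` name is hypothetical, and it treats any bare string statement (including docstrings) as non-code.

```python
import io
import tokenize

# Token types that never make a line count as code on their own.
_NON_CODE_TOKENS = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def count_python_code_lines(source: str) -> int:
    """Count lines holding at least one code token, ignoring comments and docstrings."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    code_lines = set()
    at_statement_start = True
    for index, tok in enumerate(tokens):
        if tok.type == tokenize.NEWLINE:
            at_statement_start = True  # the next token begins a new logical line
            continue
        if tok.type in _NON_CODE_TOKENS:
            continue
        if at_statement_start and tok.type == tokenize.STRING:
            following = tokens[index + 1] if index + 1 < len(tokens) else None
            if following is not None and following.type == tokenize.NEWLINE:
                continue  # a string that is the whole statement: docstring-like
        at_statement_start = False
        # Multi-line tokens (e.g. an assigned triple-quoted string) count every line.
        code_lines.update(range(tok.start[0], tok.end[0] + 1))
    return len(code_lines)
```

A docstring followed by a trailing comment on the same line would still be counted as code in this sketch. TypeScript `/* */` and JSDoc blocks would need a separate scanner or parser.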
analysis/complexity_analysis.py:198
- The logical operator patterns '\|\|' and '&&' will be counted even within strings or comments, potentially inflating complexity scores. Consider filtering out matches that appear within string literals or comments, or document this as a known limitation of the regex-based approach.
r"\|\|", # logical or
r"&&", # logical and
- Add timeout parameters to subprocess calls in snapshot.py and git_analysis.py
- Fix debt marker regex pattern (remove literal pipe from character class)
- Fix import pattern regex to avoid cross-statement matches
- Use astral-sh/setup-uv@v4 action instead of curl | sh

- Fix should_analyze_file() to use relative path parts instead of absolute
- Add [build-system] table to pyproject.toml for uv pip install
- Fix snapshot path in skill docs and session_start.py (use repo root)
- Rename open_issues_count to recent_issues_count (reflects --limit 5)

- Use relative imports in snapshot.py for proper package structure
- Add traceback.print_exc() for better CI debugging on failures
- Add test file patterns to _should_skip_file in git_analysis.py
- Include .tsx, .js, .jsx in complexity analysis (not just .ts)
- Update workflow to use 'python -m analysis.snapshot' invocation
- Update skill docs with new execution method

- Add FileNotFoundError to get_git_info exception handling
- Ensure repo_root is absolute in resolve_import_path
- Fix misleading comment in stop_hook.py
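The timeout and `FileNotFoundError` changes listed in these commits can be illustrated with a small wrapper. This is a sketch, not the PR's code: the `run_command` signature mirrors how the hook scripts appear to call it, while the 30-second default and the 124/127 return codes are assumptions borrowed from common shell conventions.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Tuple


def run_command(cmd: List[str], cwd: Path, timeout: float = 30.0) -> Tuple[int, str]:
    """Run cmd in cwd and return (exit code, stripped stdout) without raising."""
    try:
        result = subprocess.run(
            cmd,
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout,  # prevents a hung git call from stalling CI
        )
    except FileNotFoundError:
        # Executable (for example git) is not installed, or cwd does not exist.
        return 127, ""
    except subprocess.TimeoutExpired:
        return 124, ""
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    # Usage example: run the current interpreter so the demo works without git.
    print(run_command([sys.executable, "-c", "print('ok')"], Path(".")))
```

Returning a code instead of raising keeps callers on a single `if code == 0` path, at the cost of hiding which failure occurred. Logging the exception to stderr would be a reasonable addition.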
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
.github/hooks/scripts/stop_hook.py:70
`check_ts_files_changed()` only looks at `git diff` outputs (working tree vs HEAD and staged), which do not include untracked new files. Even if untracked files are considered elsewhere, new `*.ts`/`*.tsx` files won't be detected here. Consider also checking `git ls-files --others --exclude-standard` (and filtering for TS extensions) so new files participate in the reminder logic.
    def check_ts_files_changed(repo_root: Path) -> bool:
        """Check if any TypeScript files were changed."""
        code, output = run_command(
            ["git", "diff", "--name-only", "HEAD"],
            repo_root,
        )
        if code == 0 and output:
            return any(f.endswith((".ts", ".tsx")) for f in output.split("\n"))
        # Also check staged files
        code, output = run_command(
            ["git", "diff", "--cached", "--name-only"],
            repo_root,
        )
        if code == 0 and output:
            return any(f.endswith((".ts", ".tsx")) for f in output.split("\n"))
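A version that also considers untracked files, as the comment suggests, might look like the sketch below. It is an illustration under assumptions rather than the merged fix: the injectable `run` parameter and the minimal `run_command` stand-in are hypothetical. The sketch also keeps checking later commands when an earlier one reports only non-TypeScript changes, which the excerpt above does not do.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Tuple

TS_SUFFIXES = (".ts", ".tsx")
Runner = Callable[[List[str], Path], Tuple[int, str]]


def run_command(cmd: List[str], cwd: Path) -> Tuple[int, str]:
    """Minimal stand-in for the hook's helper: (exit code, stripped stdout)."""
    try:
        result = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, timeout=30
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return 1, ""
    return result.returncode, result.stdout.strip()


def check_ts_files_changed(repo_root: Path, run: Runner = run_command) -> bool:
    """Return True if any modified, staged, or untracked file is TypeScript."""
    commands = [
        ["git", "diff", "--name-only", "HEAD"],  # working tree vs HEAD
        ["git", "diff", "--cached", "--name-only"],  # staged changes
        ["git", "ls-files", "--others", "--exclude-standard"],  # untracked files
    ]
    for cmd in commands:
        code, output = run(cmd, repo_root)
        if code == 0 and any(
            name.endswith(TS_SUFFIXES) for name in output.splitlines()
        ):
            return True
    return False
```

Passing the runner as a parameter is only a testing convenience. It lets the logic be exercised with canned output instead of a real repository.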
…tracked files (PR #1232)
- Use git ls-files instead of rglob to avoid scanning node_modules/dist
- Fix get_layer() to return None for unknown layers and skip violations
- Include untracked .ts/.tsx files in pre-commit check trigger

…stderr for progress (PR #1232)
- Extract shared utilities into analysis/file_discovery.py
- Update complexity, debt, and dependency modules to import from file_discovery
- Expand _should_skip_file() to handle test_*.py prefix and directory parts
- Use print(..., file=sys.stderr) for progress logs to avoid mixing with JSON
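The file-discovery changes in these commits (filtering `git ls-files` output and matching on directory parts and the `test_*.py` prefix) can be sketched as follows. The function names and the `SKIP_DIRS` set are assumptions, and the real `analysis/file_discovery.py` may differ. Matching on repo-relative path parts avoids false positives from directories above the repository root.

```python
from pathlib import PurePosixPath
from typing import List

# Assumed skip list; the PR's actual set of directories is not shown.
SKIP_DIRS = {"node_modules", "dist", "tests", "__pycache__"}


def should_skip_file(rel_path: str) -> bool:
    """Decide from a repo-relative path whether a file is excluded from analysis."""
    path = PurePosixPath(rel_path)
    # Compare whole directory components, so "src/distance.ts" is not skipped
    # just because its name contains "dist".
    if any(part in SKIP_DIRS for part in path.parts[:-1]):
        return True
    return path.name.startswith("test_") and path.name.endswith(".py")


def filter_tracked_files(ls_files_output: str) -> List[str]:
    """Filter the text output of `git ls-files` down to analyzable paths."""
    return [
        line
        for line in ls_files_output.splitlines()
        if line and not should_skip_file(line)
    ]
```

`git ls-files` prints forward-slash paths relative to the repository, which is why `PurePosixPath` is used here regardless of platform.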
Re: pyproject.toml uv_build comment - uv_build is a valid build backend from Astral (the uv project). See https://docs.astral.sh/uv/concepts/projects/init/#packaged-applications for documentation. This is the correct build backend for uv-based Python projects.
Introduce tools for analyzing code health and tracking technical debt. This includes a snapshot generation workflow that aggregates various analysis results into a single JSON output. The implementation supports dependency analysis, complexity analysis, and debt indicators, enhancing the ability to monitor and manage technical debt over time.
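As a rough illustration of the aggregation described here, the sketch below combines per-module results into one JSON document on stdout while sending progress messages to stderr. The `build_snapshot` and `write_snapshot` names, and the idea of passing analyzers as a dictionary of callables, are assumptions and not the structure of `analysis/snapshot.py`.

```python
import json
import sys
from typing import Callable, Dict, TextIO


def build_snapshot(
    analyzers: Dict[str, Callable[[], dict]], log: TextIO = sys.stderr
) -> dict:
    """Run each analyzer and collect its result under the analyzer's name."""
    snapshot = {}
    for name, analyze in analyzers.items():
        # Progress goes to the log stream so stdout stays valid JSON.
        print(f"Running {name}...", file=log)
        snapshot[name] = analyze()
    return snapshot


def write_snapshot(snapshot: dict, out: TextIO = sys.stdout) -> None:
    """Serialize the snapshot as indented, key-sorted JSON."""
    json.dump(snapshot, out, indent=2, sort_keys=True)
    out.write("\n")


if __name__ == "__main__":
    # Usage example with stand-in analyzers returning fixed data.
    demo = build_snapshot({"complexity": lambda: {"files": 3}})
    write_snapshot(demo)
```

Keeping the two streams separate means the output can be redirected, for example `python -m analysis.snapshot > snapshot.json`, without progress lines corrupting the file.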