feat: add shared YAML-config file size checker with tiered limits #98

Open
JacobPEvans wants to merge 5 commits into main from feature/file-size-yaml

Conversation

@JacobPEvans (Owner) commented Mar 15, 2026

Summary

  • Add shared check-file-sizes.sh (155 lines) that reads .file-size.yml for tiered file size limits
  • Three enforcement tiers: default (5KB warn / 10KB error), extended (10KB warn / 20KB error), exempt
  • Reusable workflow updated with 3-priority delegation: repo script → YAML-config checker → inline fallback
  • Same script runs in pre-commit hook and GH Actions (DRY)
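
As a rough illustration of the default tier, the sketch below classifies a byte count against the 5KB warn / 10KB error thresholds. The function names `classify_size` and `file_bytes` are hypothetical, 1KB is assumed to mean 1024 bytes, and treating a size exactly at the threshold as passing is also an assumption; the actual script may differ on all three points.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the PR's script.
# Assumption: 1KB = 1024 bytes; a size equal to a threshold still passes.
WARN_BYTES=$((5 * 1024))
ERROR_BYTES=$((10 * 1024))

# classify_size BYTES [WARN] [ERROR] -> prints "ok", "warn", or "error"
classify_size() {
  local bytes=$1 warn=${2:-$WARN_BYTES} error=${3:-$ERROR_BYTES}
  if (( bytes > error )); then
    echo "error"
  elif (( bytes > warn )); then
    echo "warn"
  else
    echo "ok"
  fi
}

# file_bytes PATH -> size in bytes via wc -c
# (tr strips the padding that some wc implementations add)
file_bytes() {
  wc -c < "$1" | tr -d '[:space:]'
}
```

The extended tier would reuse the same function with different thresholds, for example `classify_size "$(file_bytes README.md)" 10240 20480`.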

Changes

| File | Description |
| --- | --- |
| scripts/workflows/check-file-sizes.sh | New YAML-driven checker (requires yq, cross-platform via wc -c) |
| .github/workflows/_file-size.yml | 3-priority delegation with curl retries and graceful fallback |
| .file-size.yml | Org-wide default tiers, exemptions, and scan extensions |
| .pre-commit-config.yaml | Local hook with pass_filenames: false and always_run: true |
| .cspell.json | Dictionary entries for awk/yq terms |

Test plan

  • bash scripts/workflows/check-file-sizes.sh passes locally
  • Repos with .file-size.yml get YAML-config checker via reusable workflow
  • Repos without config fall back to legacy inline check
  • Pre-commit hook runs file size check on commit
  • Script fits under default 10KB limit (currently 5KB)

Related: #61

Replace the blanket 500KB limit with a proper tiered system:
- 5KB warning / 10KB error (default)
- 10KB warning / 20KB error (extended, for CHANGELOG/README)
- Per-repo overrides via .file-size.yml

The reusable workflow now supports 3 priority levels:
1. Repo's own script (existing behavior preserved)
2. Shared YAML-config checker (new)
3. Inline fallback (legacy)

DRY: same script runs in pre-commit hook and GH Actions.

(claude)
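
The three priority levels above could be resolved roughly as follows. This is a sketch only: `select_checker`, its return labels, and the default location of a repo's own script are hypothetical, since the PR text does not state where the workflow looks for a repo-level script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the 3-priority delegation.
# select_checker REPO_ROOT [REPO_SCRIPT] -> prints which checker would run.
# Assumption: the repo's own script lives at scripts/check-file-sizes.sh
# unless another relative path is passed as the second argument.
select_checker() {
  local root=$1
  local repo_script=${2:-scripts/check-file-sizes.sh}
  if [[ -f "$root/$repo_script" ]]; then
    echo "repo-script"            # 1. repo's own script wins
  elif [[ -f "$root/.file-size.yml" ]]; then
    echo "shared-yaml-checker"    # 2. config present: use the shared checker
  else
    echo "inline-fallback"        # 3. legacy inline check
  fi
}
```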
Copilot AI review requested due to automatic review settings March 15, 2026 16:18
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust and configurable system for enforcing file size limits across the organization. By centralizing configuration in a .file-size.yml file and integrating a shared bash script with a pre-commit hook, it ensures that file sizes are consistently monitored and controlled, both locally and in CI environments, promoting better repository health and performance.

Highlights

  • Shared File Size Checker Script: A new shared check-file-sizes.sh script has been added to enforce file size limits based on a .file-size.yml configuration file. This script supports tiered limits (warn/error) and file exemptions.
  • YAML Configuration for File Sizes: An organization-wide .file-size.yml configuration file has been introduced, defining default and extended file size limits, as well as patterns for exempting files from checks and specifying file extensions to scan.
  • Pre-commit Hook Integration: A new pre-commit hook has been added to .pre-commit-config.yaml to automatically run the file size checker script, ensuring local enforcement of file size policies before commits.
  • Reusable Workflow Update (Intent): The description indicates an update to the _file-size.yml reusable workflow to delegate to the new YAML-config checker, providing a 3-priority delegation: repo script → YAML config → inline fallback. (Note: The _file-size.yml file itself is not part of this patch, but the intent is noted.)
Changelog
  • .cspell.json
    • Added 'kislyuk', 'gsub', and 'RLENGTH' to the cspell dictionary to prevent false positives related to the new script's dependencies and logic.
  • .file-size.yml
    • Introduced a new configuration file to define default warning and error thresholds (5KB/10KB), extended limits for specific files (10KB/20KB for CHANGELOG.md, README.md), and patterns for exempting files (e.g., lock files). Configured file extensions to scan.
  • .pre-commit-config.yaml
    • Integrated a new local pre-commit hook named file-size-check that executes scripts/workflows/check-file-sizes.sh to ensure files adhere to defined size limits before committing.
  • scripts/workflows/check-file-sizes.sh
    • Implemented a bash script that reads file size limits from .file-size.yml, supports default and extended limits, handles file exemptions, and provides fallback YAML parsing using awk if yq is not available. The script outputs warnings/errors with GitHub Actions annotations in CI and exits with a count of error-level violations.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/_file-size.yml

A file grows large, a warning rings,
Then error shouts, on digital wings.
With YAML's grace, and script's sharp eye,
We keep our bytes beneath the sky.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

Copilot AI left a comment

Pull request overview

Adds an org-standard, YAML-configurable file-size enforcement mechanism that can run both locally (pre-commit) and in CI via the reusable _file-size workflow, with layered fallback behavior.

Changes:

  • Introduces scripts/workflows/check-file-sizes.sh to enforce tiered size limits from .file-size.yml (with CI annotations).
  • Updates reusable workflow .github/workflows/_file-size.yml to prefer repo script → shared YAML-config checker → legacy inline fallback.
  • Adds org-wide default .file-size.yml plus a local pre-commit hook, and updates cspell dictionary entries.
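
For readers unfamiliar with CI annotations, the sketch below shows one way a checker might print a GitHub Actions workflow command in CI and a plain line locally. The function name `emit` matches the helper named in a later commit message, but this body is an assumption, not the script's actual code; it relies on the `GITHUB_ACTIONS` variable that GitHub sets to `true` on its runners.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an annotation helper (requires bash 4+ for ${var^^}).
# emit LEVEL FILE MESSAGE, where LEVEL is "warning" or "error".
emit() {
  local level=$1 file=$2 message=$3
  if [[ "${GITHUB_ACTIONS:-}" == "true" ]]; then
    # Workflow command syntax: ::warning file=PATH::MESSAGE
    printf '::%s file=%s::%s\n' "$level" "$file" "$message"
  else
    # Plain output for local runs such as the pre-commit hook
    printf '%s: %s: %s\n' "${level^^}" "$file" "$message"
  fi
}
```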

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| scripts/workflows/check-file-sizes.sh | New YAML-driven checker script (yq/awk parsing, annotations, tiered limits). |
| .github/workflows/_file-size.yml | Adds shared-checker download path when .file-size.yml exists; keeps inline fallback. |
| .file-size.yml | Adds org default tiers, exemptions, and scan extensions. |
| .pre-commit-config.yaml | Adds local hook to run the checker on commits. |
| .cspell.json | Adds dictionary words used by the new script/comments. |

The script scans the whole repo via find and ignores positional args,
so passing staged filenames is unnecessary and wasteful.

(claude)
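
A whole-repo scan of the kind this commit describes might look like the sketch below, which is why staged filenames are not needed. `scan_files` is a hypothetical name, and skipping `.git` is an assumption about what such a scan would exclude.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a find-based scan.
# scan_files ROOT EXT... -> prints files under ROOT whose extension is listed,
# one path per line. Extensions are given without the leading dot.
scan_files() {
  local root=$1
  shift
  # Nothing to scan for when no extensions are configured
  if (( $# == 0 )); then
    return 0
  fi
  local -a name_args=()
  local ext
  for ext in "$@"; do
    # Join the patterns with -o so find matches any listed extension
    if (( ${#name_args[@]} > 0 )); then
      name_args+=(-o)
    fi
    name_args+=(-name "*.${ext}")
  done
  # Prune .git, then print regular files matching any extension
  find "$root" -path "$root/.git" -prune -o -type f \( "${name_args[@]}" \) -print
}
```

Because the function takes its file list from `find` rather than from positional filenames, a hook configured with `pass_filenames: false` loses nothing.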
…ilure

Adds --retry 3 --retry-connrefused to match _markdown-lint.yml pattern.
When the shared checker download fails, emit a warning and fall through
to the inline fallback instead of failing the workflow.

(claude)
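
The retry-then-fall-through behavior described in this commit can be sketched as below. The `--retry 3 --retry-connrefused` flags come from the commit message; the function names, the extra `-fsSL` flags, and the warning text are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a download step with graceful fallback.
# fetch_checker URL DEST: download with retries; on failure print a warning
# annotation and return non-zero so the caller can fall through.
fetch_checker() {
  local url=$1 dest=$2
  if curl -fsSL --retry 3 --retry-connrefused -o "$dest" "$url"; then
    return 0
  fi
  echo "::warning::Could not download shared file size checker; using inline fallback"
  return 1
}

# run_check URL -> prints which path was taken instead of failing the job
run_check() {
  local script
  script=$(mktemp)
  if fetch_checker "$1" "$script"; then
    echo "shared-yaml-checker"
  else
    echo "inline-fallback"
  fi
  rm -f "$script"
}
```

Wrapping the download in an `if` keeps a failed `curl` from aborting a step that runs under `set -e`.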
…tput

- has_yq now only checks yq availability, not config validity
- parse_config validates YAML upfront and fails fast on malformed configs
- Add -r flag to yq calls for raw scalar output, preventing quoted
  patterns (e.g., "*.lock") from breaking glob/case matching
- Fix arithmetic increment under set -e: a post-increment such as ((n++)) evaluates to 0 when n is 0, so it returns a non-zero status and caused an early exit
- Add large repo docs to extended files list in .file-size.yml

(claude)
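
The `set -e` increment pitfall mentioned in this commit can be reproduced in a few lines. The variable name `errors` is illustrative, and the two safe forms shown are common alternatives rather than necessarily the fix the commit applied.

```shell
#!/usr/bin/env bash
set -e

errors=0

# Unsafe under set -e: (( errors++ )) evaluates to the old value 0, so the
# command returns status 1 and the shell would exit at this line.
#   (( errors++ ))

# Safe: an assignment using arithmetic expansion always returns status 0.
errors=$((errors + 1))

# Also safe when counting up from 0: pre-increment evaluates to the new,
# non-zero value.
(( ++errors ))

echo "errors=$errors"   # prints errors=2
```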
- Drop awk fallback, require yq (available on GH Actions + Nix devShells)
- Use wc -c instead of stat platform dance
- Merge emit_warning/emit_error into single emit() function
- Extract read_into() helper to DRY 6 identical while-read loops
- Merge is_exempt/is_extended into matches_list() with nameref
- Resolve CI mode once at startup, declare locals outside loop
- Remove self-exemption from .file-size.yml (script now 5KB, under 10KB limit)

(claude)
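
Two of the refactors listed above, `read_into()` and `matches_list()` with a nameref, might look roughly like this. Only the function names come from the commit message; the bodies, the internal variable names, and the choice to match against both the full path and the basename are assumptions. Namerefs require bash 4.3 or newer.

```shell
#!/usr/bin/env bash
# Hypothetical sketches, not the PR's code.

# read_into ARRAY_NAME: read non-empty stdin lines into the named array.
read_into() {
  local -n _ri_target=$1
  local line
  _ri_target=()
  while IFS= read -r line; do
    if [[ -n "$line" ]]; then
      _ri_target+=("$line")
    fi
  done
  return 0
}

# matches_list PATH ARRAY_NAME: succeed if PATH or its basename matches any
# glob pattern stored in the named array.
matches_list() {
  local path=$1
  local -n _ml_patterns=$2
  local pattern
  for pattern in "${_ml_patterns[@]}"; do
    # An unquoted right-hand side of == inside [[ ]] is treated as a glob
    if [[ "$path" == $pattern || "${path##*/}" == $pattern ]]; then
      return 0
    fi
  done
  return 1
}
```

Passing the array by name lets one function serve both the exempt list and the extended list, for example `matches_list "$file" exempt` and `matches_list "$file" extended`, with each array declared by the caller beforehand.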
@JacobPEvans changed the title from "feat: shared YAML-config file size checker" to "feat: add shared YAML-config file size checker with tiered limits" on Mar 15, 2026