From a3737c7bcd795ca0aa02492fea498c8a9eec4bdc Mon Sep 17 00:00:00 2001
From: Andreas Kollegger
Date: Wed, 11 Mar 2026 08:22:25 +0000
Subject: [PATCH 1/5] proposal: editor improvements for symbols, comments, tagged-string tags, injections, and conventions for using :: and ==>

---
 proposals/editor-improvements.md | 100 +++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 proposals/editor-improvements.md

diff --git a/proposals/editor-improvements.md b/proposals/editor-improvements.md
new file mode 100644
index 0000000..516dea0
--- /dev/null
+++ b/proposals/editor-improvements.md
@@ -0,0 +1,100 @@
+# Editor Support and Syntax Highlighting Improvements
+
+**Status:** Proposal (summary of discussion)
+**Scope:** Tree-sitter-gram grammar, query files, editor integrations, and documentation.
+
+---
+
+## 1. Current state (review)
+
+### Grammar
+
+- **grammar.js** clearly separates pattern vs path notation: `subject_pattern` `[...]`, `node_pattern` `(...)`, `relationship_pattern`, `annotated_pattern`, `pattern_reference`.
+- **Identifiers** (`_identifier` = symbol | string_literal | integer) appear as:
+  - Subject/node/relationship identity, `@@` annotation identity and labels, `pattern_reference` (bare id in `[ ... | a, b ]`), and record/map keys.
+- **Tagged strings:** `` tag`content` `` and ```` ```tag\ncontent\n``` ```` with `tag` (symbol) and `content` (string_content). No grammar-level restriction on tag names.
+
+### Editor support
+
+- **Queries:** Only `queries/highlights.scm` (and a copy under `editors/zed/languages/gram/`). No `locals.scm`, `indents.scm`, or `injections.scm` in the initial review.
+- **Zed:** Extension points at the repo; `scripts/prepare-zed-extension.sh` syncs `queries/*.scm` into the extension. File type `.gram`, scope `source.gram`.
+- **Other editors:** No in-repo VSCode/Neovim/Emacs config; they can use the npm package and point at `queries/` where supported.
+- **Highlighting:** Strings, numbers, booleans, `symbol` → `@variable`, record/map keys → `@property`, annotation key/identifier/labels → `@attribute`, `subject_pattern` → `@type`, node labels → `@type`, `relationship_pattern` → `@keyword`, arrows → `@operator`, brackets/punctuation, `ERROR` → `@error`. All symbols are treated the same; no distinction between defining vs referencing identifiers.
+
+---
+
+## 2. Proposed improvements (general)
+
+### Syntax highlighting (highlights.scm)
+
+- **Differentiate identifier roles:** Capture definition-like identifiers (subject, node, relationship, annotation) separately from references (`pattern_reference`) so they can be styled differently (e.g. type-like vs variable/reference).
+- **Tagged-string tag:** Explicitly capture the tag in `tagged_string` (e.g. `@type` or `@attribute`) so tags are highlighted distinctly.
+- **Comments:** If the grammar exposes a named `comment` node, add `(comment) @comment`.
+- **Single source of truth:** Keep one canonical `queries/highlights.scm` and have the Zed extension consume it via the existing sync script (no duplicate maintenance).
+
+### Identifiers as unique symbols (semantics)
+
+- **Add `queries/locals.scm`** using Tree-sitter’s local-variable captures:
+  - **`@local.definition`:** Where an identifier is defined (subject in `subject_pattern`, identifier in `node_pattern`, subject in relationship arrows, identifier in `identified_annotation`).
+  - **`@local.reference`:** Where an identifier is used as a reference (`pattern_reference` identifier).
+  - **`@local.scope`** (optional): e.g. `gram_pattern` (file) and optionally `subject_pattern`, so references resolve to the right definition.
+- Enables editors that support tree-sitter locals (Neovim, Helix, Emacs, etc.) to do “highlight references” and “go to definition” without an LSP.
+- **Scope rules:** At least file scope; optionally refine so references inside a subject pattern resolve appropriately.
+
+### Indentation and other queries
+
+- **`queries/indents.scm`:** Define indentation for brackets and multi-line structures so tree-sitter–aware editors get consistent indent.
+- **Optional:** `injections.scm` for embedding other languages (see below).
+
+---
+
+## 3. Tagged strings and language injection
+
+### Question
+
+Downstream uses well-known tags (`md`, `ts`, `date`, `datetime`, `time`, `sql`, etc.). Do we need to specify each of these in the grammar or query files?
+
+### Proposal
+
+- **No need to enumerate every tag in the grammar.** The grammar already allows any symbol as tag; tagged strings are a single generic construct.
+- **Injection strategy:**
+  - **Dynamic injection:** Use the tag symbol’s text as the injection language name. One rule in `injections.scm`: capture `tag` as `@injection.language`, content as `@injection.content`. Editors that support dynamic language from a capture (e.g. Neovim, Zed) then use the tag (e.g. `sql`, `json`, `html`) as the parser name.
+  - **Overrides only where needed:** Add explicit rules only for tags that don’t match common parser names (e.g. `md` → `markdown`, `ts` → `typescript`) using `#eq?` and `#set! injection.language "markdown"` (or similar). Other tags (e.g. `date`, `datetime`, `time`) may have no parser or be mapped by the editor; the grammar stays agnostic.
+- **Well-known tags** can be documented (e.g. in a doc like `docs/tagged-strings-and-injections.md`) as a convention table (md, ts, date, datetime, time, sql, json, html, etc.) so downstream and editors know what to expect and can add their own tag → parser mappings without changing the grammar.
+
+---
+
+## 4. Schema definitions: `::` and `=>` convention
+
+### Observation
+
+The `::` and `=>` / `==>` variations are used to imply schema definitions. For example, `` { name:: ts`string` } `` is read as: property `name` has value-type described in TypeScript (e.g. “string”). Value types have used `ts`, `SQL`, and other languages, which avoids explicit grammar support for schema.
+
+### Proposal
+
+- **Grammar:** No change. `record_property` already allows both `:` and `::`; the value can be any `_value`, including tagged strings.
+- **Convention:** Document that `::` is used for type/schema slots and that the value is often a tagged string (e.g. `ts`, `SQL`) describing the type. This keeps schema semantics out of the core grammar and lets each ecosystem choose its type languages.
+- **Support in editors:** Language injection applies to tagged string content regardless of context. So `` name:: ts`string` `` gets TypeScript highlighting inside the backticks via the same `injections.scm` rules. No extra grammar or query logic is required for schema-specific languages.
+- **Downstream:** Libraries can treat `key:: tagged_string` as “property `key` has type/schema given by tag and content” and dispatch to the right validator or code generator (e.g. `ts` → TypeScript, `SQL` → schema). Encouragement is mainly through documentation and examples (e.g. in `docs/` and `examples/`).
+
+---
+
+## 5. Summary table
+
+| Area | Current | Proposed |
+|------|--------|----------|
+| **highlights.scm** | Single `@variable` for all symbols | Differentiate definition-like vs reference identifiers; capture tagged-string tag; add comment if exposed |
+| **Identifiers as symbols** | Not modeled | Add **locals.scm** with `@local.definition`, `@local.reference`, optional `@local.scope` |
+| **Injections** | None | **injections.scm**: dynamic injection from tag text; overrides for md→markdown, ts→typescript; document well-known tags |
+| **Schema / `::`** | Grammar allows `::`; no documented convention | Document `::` for type/schema and tagged strings (ts, SQL, etc.); no new grammar rules |
+| **Indentation** | None | Add **indents.scm** for brackets and multi-line structures |
+| **Docs** | — | Document tagged strings, well-known tags, and `::` schema convention (e.g. dedicated doc + cross-links from gram-reference / gram-ebnf) |
+
+---
+
+## 6. References
+
+- Grammar: `grammar.js`, `docs/gram-ebnf.md`
+- Queries: `queries/highlights.scm`; proposed `queries/injections.scm`, `queries/locals.scm`, `queries/indents.scm`
+- Zed: `editors/zed/`, `scripts/prepare-zed-extension.sh` (syncs `queries/*.scm`)
+- Tree-sitter: injection captures `@injection.language`, `@injection.content`; locals `@local.definition`, `@local.reference`, `@local.scope`

From dee0e7a4c3b47dc6933fb5cd5f811016e26069bf Mon Sep 17 00:00:00 2001
From: Andreas Kollegger
Date: Wed, 11 Mar 2026 09:37:08 +0000
Subject: [PATCH 2/5] editor-improve: plan and tasks for improved syntax highlighting, semantic hints and editor support

---
 .cursor/rules/specify-rules.mdc | 4 +-
 docs/tagged-strings-and-injections.md | 69 ++++++
 editors/zed/languages/gram/injections.scm | 29 +++
 queries/injections.scm | 29 +++
 .../checklists/requirements.md | 34 +++
 .../contracts/README.md | 14 ++
 .../contracts/documentation.md | 29 +++
 .../contracts/highlights.md | 42 ++++
 .../contracts/indents.md | 27 +++
 .../contracts/injections.md | 42 ++++
 .../contracts/locals.md | 30 +++
 specs/004-editor-improvements/data-model.md | 91 +++++++
 .../example.schema.gram | 53 +++++
 specs/004-editor-improvements/plan.md | 96 ++++++++
 specs/004-editor-improvements/quickstart.md | 59 +++++
 specs/004-editor-improvements/research.md | 87 +++++++
 specs/004-editor-improvements/spec.md | 146 ++++++++++++
 specs/004-editor-improvements/tasks.md | 225 ++++++++++++++++++
 18 files changed, 1105 insertions(+), 1 deletion(-)
 create mode 100644 docs/tagged-strings-and-injections.md
 create mode 100644 editors/zed/languages/gram/injections.scm
 create mode 100644 queries/injections.scm
 create mode 100644 specs/004-editor-improvements/checklists/requirements.md
 create mode 100644 specs/004-editor-improvements/contracts/README.md
 create mode 100644 specs/004-editor-improvements/contracts/documentation.md
 create mode 100644 specs/004-editor-improvements/contracts/highlights.md
 create mode 100644 specs/004-editor-improvements/contracts/indents.md
 create mode 100644 specs/004-editor-improvements/contracts/injections.md
 create mode 100644 specs/004-editor-improvements/contracts/locals.md
 create mode 100644 specs/004-editor-improvements/data-model.md
 create mode 100644 specs/004-editor-improvements/example.schema.gram
 create mode 100644 specs/004-editor-improvements/plan.md
 create mode 100644 specs/004-editor-improvements/quickstart.md
 create mode 100644 specs/004-editor-improvements/research.md
 create mode 100644 specs/004-editor-improvements/spec.md
 create mode 100644 specs/004-editor-improvements/tasks.md

diff --git a/.cursor/rules/specify-rules.mdc b/.cursor/rules/specify-rules.mdc
index 4780762..0dbbc99 100644
--- a/.cursor/rules/specify-rules.mdc
+++ b/.cursor/rules/specify-rules.mdc
@@ -7,6 +7,8 @@ Auto-generated from all feature plans. Last updated: 2025-11-10
 - N/A (parser generation, no data storage) (001-refactor-terminology)
 - JavaScript (Node.js), tree-sitter grammar DSL + ree-sitter-cli (project version), tree-sitter (bindings) (003-extended-annotation)
 - N/A (parser grammar definition) (003-extended-annotation)
+- Tree-sitter query language (scheme-like .scm); grammar.js (JavaScript, Node); parser generated to C via tree-sitter-cli + ree-sitter, tree-sitter-cli (npm); editors consume queries (Zed, Neovim, Helix, Emacs) (001-editor-improvements)
+- N/A (query and doc files only) (001-editor-improvements)
 - JavaScript (Node.js), tree-sitter grammar DSL + ree-sitter-cli ^0.25.10, tree-sitter ^0.25.0 (001-line-comments)
@@ -26,10 +28,10 @@ npm test && npm run lint
 JavaScript (Node.js), tree-sitter grammar DSL: Follow standard conventions
 ## Recent Changes
+- 001-editor-improvements: Added Tree-sitter query language (scheme-like .scm); grammar.js (JavaScript, Node); parser generated to C via tree-sitter-cli + ree-sitter, tree-sitter-cli (npm); editors consume queries (Zed, Neovim, Helix, Emacs)
 - 003-extended-annotation: Added JavaScript (Node.js), tree-sitter grammar DSL + ree-sitter-cli (project version), tree-sitter (bindings)
 - 001-refactor-terminology: Added JavaScript (Node.js) for grammar definition, generated C code for parser + ree-sitter (CLI and runtime), tree-sitter CLI for code generation
-- 001-line-comments: Added JavaScript (Node.js), tree-sitter grammar DSL + ree-sitter-cli ^0.25.10, tree-sitter ^0.25.0
diff --git a/docs/tagged-strings-and-injections.md b/docs/tagged-strings-and-injections.md
new file mode 100644
index 0000000..fab2d67
--- /dev/null
+++ b/docs/tagged-strings-and-injections.md
@@ -0,0 +1,69 @@
+# Tagged Strings and Language Injection
+
+This document describes how tagged strings work in Gram notation, how syntax highlighting (language injection) is applied to their content, and how the `::` convention supports schema definitions without extending the grammar for each type system.
+
+---
+
+## 1. Tagged string syntax
+
+A **tagged string** attaches a type or format tag to string content:
+
+- **Backtick form:** `` tag`content` ``
+- **Fenced form:** ` ```tag` followed by a newline, then content, then ` ``` `
+
+The tag is a **symbol**; the content is arbitrary text (backtick form: single line with escapes; fenced form: multiline). The grammar does not restrict which tags may appear. Downstream libraries and editors interpret tags by convention.
+
+---
+
+## 2. Language injection (syntax highlighting)
+
+Tree-sitter injection queries (`queries/injections.scm`) use the **tag’s text as the injection language**. You do **not** need to add every possible tag to the grammar or query file.
+
+- **Dynamic injection:** For most tags, the content is highlighted with the language whose name matches the tag (e.g. `sql`, `json`, `html`).
+- **Overrides:** A small set of tags are mapped to parser names that differ from the tag (e.g. `md` → `markdown`, `ts` → `typescript`) in `injections.scm`. Adding more overrides is optional; editors and downstream tools can also map tag names to parsers themselves.
+
+### Well-known tags
+
+| Tag | Typical use | Parser / notes |
+|----------|---------------------------|-----------------------------------------------------|
+| `md` | Markdown | Mapped to `markdown` in injections.scm |
+| `ts` | TypeScript (types/code) | Mapped to `typescript` in injections.scm |
+| `date` | ISO 8601 date | Often no parser; content is `YYYY-MM-DD` |
+| `datetime` | ISO 8601 date-time | Often no parser; content is ISO 8601 |
+| `time` | ISO 8601 time | Often no parser; content is time part |
+| `sql` | SQL | Tag used as language name (parser often `sql`) |
+| `json` | JSON | Tag used as language name |
+| `html` | HTML | Tag used as language name |
+
+Editors and consumers can extend this list by mapping additional tag names to language parsers (e.g. `yaml`, `graphql`) without changing the Gram grammar or its queries.
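
The mapping described by the table can be sketched in a few lines of Python. This is an illustration only, not code from the repository: `PARSER_OVERRIDES`, `PLAIN_TEXT_TAGS`, and `resolve_injection_language` are hypothetical names, and treating `date`, `datetime`, and `time` as plain text is an assumption based on the "often no parser" notes. Editors resolve the language from `injections.scm` themselves; the sketch only mirrors the intended outcome.

```python
# Hypothetical tag -> parser-name overrides, mirroring the md/ts rows of the table.
PARSER_OVERRIDES = {"md": "markdown", "ts": "typescript"}

# Tags the table lists as "often no parser"; plain text here is an assumption.
PLAIN_TEXT_TAGS = {"date", "datetime", "time"}


def resolve_injection_language(tag, extra_overrides=None):
    """Return the parser name for a tagged string, or None for plain text."""
    # Editor- or consumer-supplied mappings extend (and may replace) the defaults.
    overrides = {**PARSER_OVERRIDES, **(extra_overrides or {})}
    if tag in overrides:
        return overrides[tag]
    if tag in PLAIN_TEXT_TAGS:
        return None
    # Dynamic case: the tag text itself is used as the language name.
    return tag


for tag in ("md", "ts", "sql", "date"):
    print(tag, "->", resolve_injection_language(tag))
```

A consumer that wants an extra mapping can pass it in, for example `resolve_injection_language("yml", {"yml": "yaml"})`, without touching the defaults.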
+
+---
+
+## 3. Schema definitions and the `::` convention
+
+In records, the grammar allows two separators between a property key and its value:
+
+- **`:`** — normal property: value is data.
+- **`::`** — often used for **type or schema** definitions: value describes the *kind* or *shape* of the property rather than a literal value.
+
+Example:
+
+```gram
+{ name:: ts`string`, count:: ts`number`, bio:: md`# Markdown allowed` }
+```
+
+Here, `name`, `count`, and `bio` are property names whose **value types** are described by tagged strings: TypeScript type expressions (`string`, `number`) and Markdown. Downstream can interpret `::` as “schema slot” and use the tagged content for validation, codegen, or documentation without the grammar ever defining TypeScript, SQL, or other schema languages.
+
+### Encouraging and supporting this
+
+- **Grammar:** No change is required. `record_property` already allows both `:` and `::`; the value can be any `_value`, including tagged strings.
+- **Convention:** Document and use `::` for “type/schema” and reserve tagged strings (e.g. `ts`, `SQL`) for the type description. That keeps schema concerns out of the core grammar and lets each ecosystem choose its type languages.
+- **Editors:** Injection applies to tagged string content regardless of context. So `` name:: ts`string` `` gets TypeScript highlighting inside the backticks. Editors that support tree-sitter injections will get this from the existing `injections.scm`.
+- **Downstream:** Libraries can treat `key:: tagged_string` as “property `key` has type/schema given by tag and content,” and dispatch to the right validator or generator (e.g. `ts` → TypeScript type checker, `SQL` → schema validator).
+
+---
+
+## 4. Summary
+
+- **Tags:** Arbitrary; no need to enumerate every tag in the grammar. Injection uses the tag symbol; a few well-known tags are mapped in `injections.scm`; the rest use the tag text as the language name or are mapped by the editor.
+- **Schema:** Use `::` for type/schema properties and tagged strings (`ts`, `SQL`, etc.) for the type description. The grammar stays generic; schema support is by convention and downstream tooling.
diff --git a/editors/zed/languages/gram/injections.scm b/editors/zed/languages/gram/injections.scm
new file mode 100644
index 0000000..cf688d2
--- /dev/null
+++ b/editors/zed/languages/gram/injections.scm
@@ -0,0 +1,29 @@
+; Language injection for tagged strings: tag`content` and ```tag\ncontent\n```
+;
+; The tag symbol is used as the injection language so that downstream and editors
+; can support arbitrary tags without changing the grammar. Well-known tags (md, ts,
+; date, datetime, time, sql, json, html, etc.) are documented in docs/tagged-strings-and-injections.md.
+;
+; Overrides below map tags that do not match common parser names. The final
+; rule uses the tag's text as the language name for all other tags (e.g. "sql",
+; "json", "html" often match parser names).
+
+; md -> markdown
+((tagged_string
+  tag: (symbol) @_tag
+  content: (string_content) @injection.content)
+ (#eq? @_tag "md")
+ (#set! injection.language "markdown"))
+
+; ts -> typescript
+((tagged_string
+  tag: (symbol) @_tag
+  content: (string_content) @injection.content)
+ (#eq? @_tag "ts")
+ (#set! injection.language "typescript"))
+
+; Dynamic: use tag text as language name for all other tags (sql, json, html, etc.)
+; Editors may map additional tags (e.g. date, datetime, time) to parsers or leave as plain.
+(tagged_string
+  tag: (symbol) @injection.language
+  content: (string_content) @injection.content)
diff --git a/queries/injections.scm b/queries/injections.scm
new file mode 100644
index 0000000..cf688d2
--- /dev/null
+++ b/queries/injections.scm
@@ -0,0 +1,29 @@
+; Language injection for tagged strings: tag`content` and ```tag\ncontent\n```
+;
+; The tag symbol is used as the injection language so that downstream and editors
+; can support arbitrary tags without changing the grammar. Well-known tags (md, ts,
+; date, datetime, time, sql, json, html, etc.) are documented in docs/tagged-strings-and-injections.md.
+;
+; Overrides below map tags that do not match common parser names. The final
+; rule uses the tag's text as the language name for all other tags (e.g. "sql",
+; "json", "html" often match parser names).
+
+; md -> markdown
+((tagged_string
+  tag: (symbol) @_tag
+  content: (string_content) @injection.content)
+ (#eq? @_tag "md")
+ (#set! injection.language "markdown"))
+
+; ts -> typescript
+((tagged_string
+  tag: (symbol) @_tag
+  content: (string_content) @injection.content)
+ (#eq? @_tag "ts")
+ (#set! injection.language "typescript"))
+
+; Dynamic: use tag text as language name for all other tags (sql, json, html, etc.)
+; Editors may map additional tags (e.g. date, datetime, time) to parsers or leave as plain.
+(tagged_string
+  tag: (symbol) @injection.language
+  content: (string_content) @injection.content)
diff --git a/specs/004-editor-improvements/checklists/requirements.md b/specs/004-editor-improvements/checklists/requirements.md
new file mode 100644
index 0000000..121731c
--- /dev/null
+++ b/specs/004-editor-improvements/checklists/requirements.md
@@ -0,0 +1,34 @@
+# Specification Quality Checklist: Editor Support and Syntax Highlighting Improvements
+
+**Purpose**: Validate specification completeness and quality before proceeding to planning
+**Created**: 2025-03-11
+**Feature**: [spec.md](../spec.md)
+
+## Content Quality
+
+- [x] No implementation details (languages, frameworks, APIs)
+- [x] Focused on user value and business needs
+- [x] Written for non-technical stakeholders
+- [x] All mandatory sections completed
+
+## Requirement Completeness
+
+- [x] No [NEEDS CLARIFICATION] markers remain
+- [x] Requirements are testable and unambiguous
+- [x] Success criteria are measurable
+- [x] Success criteria are technology-agnostic (no implementation details)
+- [x] All acceptance scenarios are defined
+- [x] Edge cases are identified
+- [x] Scope is clearly bounded
+- [x] Dependencies and assumptions identified
+
+## Feature Readiness
+
+- [x] All functional requirements have clear acceptance criteria
+- [x] User scenarios cover primary flows
+- [x] Feature meets measurable outcomes defined in Success Criteria
+- [x] No implementation details leak into specification
+
+## Notes
+
+- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
diff --git a/specs/004-editor-improvements/contracts/README.md b/specs/004-editor-improvements/contracts/README.md
new file mode 100644
index 0000000..9c9d07e
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/README.md
@@ -0,0 +1,14 @@
+# Contracts: Editor Support and Syntax Highlighting
+
+**Feature**: 004-editor-improvements
+**Date**: 2025-03-11
+
+This directory holds behavior contracts for the query files and documentation. They define required captures, override sets, and doc content so implementation and tests can be validated against the spec.
+
+| Contract | Description |
+|----------|-------------|
+| [highlights.md](./highlights.md) | Highlight capture names and node mapping (definition vs reference, tag, comment) |
+| [locals.md](./locals.md) | Locals capture names and file scope |
+| [indents.md](./indents.md) | Indentation rules and 2-space step |
+| [injections.md](./injections.md) | Injection language capture and minimal overrides (md, ts) |
+| [documentation.md](./documentation.md) | Well-known tags table and `::` schema convention |
diff --git a/specs/004-editor-improvements/contracts/documentation.md b/specs/004-editor-improvements/contracts/documentation.md
new file mode 100644
index 0000000..3c1a713
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/documentation.md
@@ -0,0 +1,29 @@
+# Documentation Contract
+
+**Feature**: 004-editor-improvements
+**Spec reference**: FR-010
+
+## Required content
+
+### 1. Well-known tags
+
+Documentation MUST include a convention table (or equivalent) for well-known tags and how they map to parsers. At least: md, ts, date, datetime, time, sql, json, html.
+
+- **md** → markdown (injections.scm override)
+- **ts** → typescript (injections.scm override)
+- **date**, **datetime**, **time** → often no parser; content format described (e.g. ISO 8601)
+- **sql**, **json**, **html** → tag used as language name; editors may map as needed
+
+Editors and downstream MAY add more tag → parser mappings without changing the grammar or query set.
+
+### 2. `::` schema convention
+
+Documentation MUST state that `::` is used for type/schema slots and that values are often tagged strings (e.g. `ts`, `SQL`) describing the type. No new grammar rules; convention only.
+
+## Location
+
+Existing `docs/tagged-strings-and-injections.md` already covers tagged strings, well-known tags, and the `::` convention. This feature requires that content to be present and up to date; extend or cross-link from gram-reference / gram-ebnf as needed.
+
+## Test requirements
+
+- A maintainer or editor author can find the well-known tags table and the `::` convention in the docs without reading the grammar source.
diff --git a/specs/004-editor-improvements/contracts/highlights.md b/specs/004-editor-improvements/contracts/highlights.md
new file mode 100644
index 0000000..2b83820
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/highlights.md
@@ -0,0 +1,42 @@
+# Highlights Query Contract
+
+**Feature**: 004-editor-improvements
+**File**: `queries/highlights.scm`
+
+## Required captures
+
+| Capture | Node / pattern | Spec reference |
+|--------|----------------|----------------|
+| Definition-like identifier | Subject in subject_pattern, node_pattern, relationship left/right/kind; identifier/labels in identified_annotation | FR-001 |
+| Reference identifier | Identifier in pattern_reference | FR-001 |
+| Tagged-string tag | tagged_string tag: (symbol) | FR-002 |
+| Comment | (comment) | FR-003 |
+
+## Capture naming (implementation choice)
+
+- Definition-like identifiers: `@type` or equivalent so themes can style as type/definition.
+- Reference identifiers: `@variable` or equivalent so themes can style as variable/use.
+- Tagged-string tag: `@attribute` or equivalent so tag is distinct from content.
+- Comment: `@comment`.
+
+## Existing captures to keep
+
+- string_literal, string_content, number, boolean_literal, symbol (where not overridden by definition/reference/tag)
+- record_property key, map_entry key → @property
+- property_annotation key, identified_annotation → @attribute (or as above for tag)
+- subject_pattern → @type (or definition capture)
+- node_pattern labels (symbol) → @type
+- relationship_pattern → @keyword
+- right_arrow, left_arrow, etc. → @operator
+- Brackets, comma → @punctuation.*
+- ERROR → @error
+
+## Single source of truth
+
+highlights.scm MUST exist only under `queries/`. Zed consumes it via `scripts/prepare-zed-extension.sh` (FR-004).
+
+## Test requirements
+
+- Corpus or manual test: file with subject_pattern, node_pattern, pattern_reference shows definition-like and reference identifiers with different highlight groups.
+- File with tagged_string shows tag and content with different highlight groups.
+- File with `//` comment shows comment highlighted as @comment.
diff --git a/specs/004-editor-improvements/contracts/indents.md b/specs/004-editor-improvements/contracts/indents.md
new file mode 100644
index 0000000..afee41a
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/indents.md
@@ -0,0 +1,27 @@
+# Indents Query Contract
+
+**Feature**: 004-editor-improvements
+**File**: `queries/indents.scm`
+
+## Required behavior
+
+- Indentation for brackets and multi-line structures so tree-sitter–aware editors apply consistent indent (FR-007).
+- Indent width: 2 spaces per level (spec clarification).
+
+## Capture pattern
+
+Use standard Tree-sitter indent captures:
+
+- **Indent increase**: Opening delimiters of record `{`, subject_pattern `[`, node_pattern `(`, and similar bracket/block nodes → capture that increases indent (e.g. `@indent` or language-specific indent begin).
+- **Indent decrease / align**: Closing `}`, `]`, `)` → capture that aligns with opening and reduces indent (e.g. `@indent.end`).
+- **Extend** (if needed): Nodes that span multiple lines so that continuation lines receive the correct indent (e.g. `@extend`).
+
+Exact capture names depend on the editor (Neovim vs Helix vs Zed); the contract is “2 spaces per level” and “brackets and multi-line structures” aligned.
+
+## Single source of truth
+
+indents.scm MUST exist under `queries/` and be synced to Zed via `scripts/prepare-zed-extension.sh`.
+
+## Test requirements
+
+- Create or open a .gram file with nested brackets and multi-line patterns; use editor auto-indent; verify 2-space step and alignment of closing brackets with opening.
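
As a rough picture of the expected result (not of how an editor computes it from `indents.scm`), the 2-space rule can be modelled by counting bracket depth. The sketch below is a hypothetical Python helper under several stated simplifications: it ignores brackets inside strings and comments, it indents once per unclosed bracket rather than once per line, and the sample input is illustrative text that has not been checked against the grammar.

```python
OPENERS = "{[("
CLOSERS = "}])"
INDENT_STEP = 2  # the contract's "2 spaces per level"


def reindent(source):
    """Re-indent each line by bracket depth (simplified; no string/comment handling)."""
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        # Closers at the start of a line dedent that same line,
        # so a closing bracket aligns with the line that opened it.
        leading_closers = len(line) - len(line.lstrip(CLOSERS))
        level = max(depth - leading_closers, 0)
        out.append(" " * (INDENT_STEP * level) + line if line else "")
        # Update the running depth for the lines that follow.
        for ch in line:
            if ch in OPENERS:
                depth += 1
            elif ch in CLOSERS:
                depth = max(depth - 1, 0)
    return "\n".join(out)


# Illustrative input only; not verified as valid Gram.
SAMPLE = "[ team |\n(a),\n[ sub |\n(b)\n]\n]"
print(reindent(SAMPLE))
```

A manual check against the contract then amounts to comparing the editor's auto-indent result with output of this shape: one 2-space step per nesting level, closing brackets at the column of the opening line.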
diff --git a/specs/004-editor-improvements/contracts/injections.md b/specs/004-editor-improvements/contracts/injections.md
new file mode 100644
index 0000000..0bb8a6b
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/injections.md
@@ -0,0 +1,42 @@
+# Injections Query Contract
+
+**Feature**: 004-editor-improvements
+**File**: `queries/injections.scm`
+
+## Required captures
+
+| Capture | Node / pattern | Spec reference |
+|--------|----------------|----------------|
+| @injection.language | tagged_string tag (symbol): use tag’s text as language name | FR-008 |
+| @injection.content | tagged_string content (string_content) | FR-008 |
+
+## Dynamic injection
+
+Language is NOT pre-defined. The tag symbol’s text is the injection language name. Both forms MUST be supported with the same semantics:
+
+- Inline: `` tag`content` ``
+- Fenced: ```` ```tag\ncontent\n``` ````
+
+## Minimal override set (spec)
+
+When the tag text does not match the editor’s parser name, overrides MUST be provided for:
+
+- `md` → markdown
+- `ts` → typescript
+
+(FR-009). Other tags (e.g. `sql`, `json`, `html`) use the tag as the language name; no override required in the spec. Additional overrides (e.g. `zod`) may be added in docs or by editors.
+
+## Rule order
+
+Override rules (md, ts) MUST appear before the generic dynamic rule that sets `@injection.language` from the tag, so that `md` and `ts` are resolved to markdown and typescript.
+
+## Single source of truth
+
+injections.scm MUST exist under `queries/` and be synced to Zed via `scripts/prepare-zed-extension.sh`.
+
+## Test requirements
+
+- Tagged string with tag `sql` (or `json`, `html`) → content highlighted with that language (dynamic).
+- Tagged string with tag `md` → content highlighted as markdown (override).
+- Tagged string with tag `ts` → content highlighted as typescript (override).
+- Unknown tag → content may be plain or editor-defined; no requirement to enumerate all tags.
diff --git a/specs/004-editor-improvements/contracts/locals.md b/specs/004-editor-improvements/contracts/locals.md
new file mode 100644
index 0000000..5ed7918
--- /dev/null
+++ b/specs/004-editor-improvements/contracts/locals.md
@@ -0,0 +1,30 @@
+# Locals Query Contract
+
+**Feature**: 004-editor-improvements
+**File**: `queries/locals.scm`
+
+## Required captures
+
+| Capture | Node / pattern | Spec reference |
+|--------|----------------|----------------|
+| @local.definition | Identifier where it defines: subject in subject_pattern, subject in node_pattern, subject in relationship_pattern (left, right, kind), identifier/labels in identified_annotation | FR-005 |
+| @local.reference | Identifier in pattern_reference | FR-005 |
+| @local.scope (optional) | gram_pattern (root) | FR-006 |
+
+## Scope rule
+
+File-only. All definitions and references share one scope (the file). References resolve to any matching definition in the file. No subject-pattern or finer-grained scope (per spec clarification).
+
+## Behavior
+
+- Editors that support tree-sitter locals use these for “go to definition” (from reference to definition) and “highlight references” (from definition to references).
+- When multiple definitions share the same name, resolution is file-wide; editor may pick first or allow disambiguation.
+
+## Single source of truth
+
+locals.scm MUST exist under `queries/` and be synced to Zed via `scripts/prepare-zed-extension.sh`.
+
+## Test requirements
+
+- In an editor with locals support: from a pattern_reference identifier, “go to definition” jumps to a subject/node/relationship/annotation definition with the same name.
+- From a definition, “highlight references” highlights all pattern_reference identifiers that match within the file.
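
The file-wide scope rule amounts to a lookup by name. A minimal Python sketch, assuming definitions and references have already been extracted as `(name, line)` pairs; the function names and the input shape are hypothetical, and no tree-sitter API is used:

```python
from collections import defaultdict


def build_definition_index(definitions):
    """Group definition line numbers by identifier name (one scope: the file)."""
    index = defaultdict(list)
    for name, line in definitions:
        index[name].append(line)
    return index


def go_to_definition(index, name):
    """Return every candidate definition line in file order; [] if unresolved."""
    return sorted(index.get(name, []))


def highlight_references(references, name):
    """Return the lines of all references that match the given name."""
    return sorted(line for ref_name, line in references if ref_name == name)


# Hypothetical extraction results: `a` is defined twice, `c` is never defined.
definitions = [("a", 1), ("b", 2), ("a", 7)]
references = [("a", 4), ("b", 5), ("c", 6)]

index = build_definition_index(definitions)
print(go_to_definition(index, "a"))
print(highlight_references(references, "a"))
```

Returning every candidate rather than a single line reflects the contract's note that, when several definitions share a name, the editor may pick the first or offer disambiguation.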
diff --git a/specs/004-editor-improvements/data-model.md b/specs/004-editor-improvements/data-model.md
new file mode 100644
index 0000000..18368ae
--- /dev/null
+++ b/specs/004-editor-improvements/data-model.md
@@ -0,0 +1,91 @@
+# Data Model: Editor Support and Syntax Highlighting
+
+**Feature**: 004-editor-improvements
+**Date**: 2025-03-11
+
+This feature does not introduce a persistent data store or API entities. The “data model” here describes the **query capture model**: which grammar nodes map to which Tree-sitter capture names for highlights, locals, indents, and injections. Validation rules come from the spec (FR-001–FR-010).
+
+---
+
+## 1. Grammar nodes used by queries
+
+Relevant rules from `grammar.js` (no changes in this feature):
+
+| Node | Description |
+|------|-------------|
+| `gram_pattern` | Root; optional root record + repeated top-level patterns |
+| `subject_pattern` | `[ subject \| elements ]`; subject is definition-like |
+| `node_pattern` | `( subject )`; subject is definition-like |
+| `relationship_pattern` | `left kind right`; left/right node subjects and kind (arrow) carry definitions |
+| `pattern_reference` | Bare identifier; reference (use) of a pattern |
+| `identified_annotation` | `@@ identifier` or `@@ labels`; identifier/labels are definition-like |
+| `tagged_string` | `` tag`content` `` or ```` ```tag\ncontent\n``` ````; tag (symbol), content (string_content) |
+| `comment` | `// ...` (in extras) |
+| `record`, `labels`, `_subject`, `_identifier`, `symbol`, `string_content` | Used inside the above |
+
+---
+
+## 2. Highlight captures (highlights.scm)
+
+| Capture | Grammar source | Purpose |
+|--------|----------------|---------|
+| Definition-like identifier | `subject_pattern` subject, `node_pattern` subject, `relationship_pattern` left/right subject, arrow kind subject, `identified_annotation` identifier/labels | Style as type/definition (e.g. `@type`) |
+| Reference identifier | `pattern_reference` identifier | Style as variable/reference (`@variable`) |
+| Tagged-string tag | `tagged_string` field `tag` (symbol) | Style distinctly (e.g. `@attribute`) |
+| Tagged-string content | `tagged_string` field `content` (string_content) | Already string; injection handles language |
+| Comment | `(comment)` | `@comment` |
+| Existing | string_literal, number, boolean, property, bracket, operator, ERROR | Unchanged per current highlights.scm |
+
+**Validation**: FR-001 (definition vs reference), FR-002 (tag distinct), FR-003 (comment), FR-004 (single canonical file).
+
+---
+
+## 3. Locals captures (locals.scm)
+
+| Capture | Grammar source | Scope |
+|--------|----------------|-------|
+| `@local.definition` | Identifier in: subject of subject_pattern, subject of node_pattern, subject of relationship left/right, subject in relationship kind (arrow), identifier/labels in identified_annotation | File |
+| `@local.reference` | Identifier in pattern_reference | File |
+| `@local.scope` (optional) | `gram_pattern` (root) | Single file scope |
+
+**Validation**: FR-005 (definition + reference), FR-006 (optional file scope). References resolve to any matching definition in the file.
+
+---
+
+## 4. Indent captures (indents.scm)
+
+| Capture | Grammar source | Effect |
+|--------|----------------|--------|
+| `@indent` / indent begin | Opening of record `{`, subject_pattern `[`, node_pattern `(`, etc. | Increase indent by 2 spaces |
+| `@indent.end` / dedent | Closing `}`, `]`, `)` | Align with opening; decrease indent |
+| `@extend` (if needed) | Continuation of multi-line pattern | Extend scope so following lines get correct indent |
+
+**Validation**: FR-007 (indentation for brackets and multi-line; 2 spaces per level).
+
+---
+
+## 5. Injection captures (injections.scm)
+
+| Capture | Grammar source | Purpose |
+|--------|----------------|---------|
+| `@injection.language` | `tagged_string` field `tag` (symbol) | Dynamic: tag text as language name; overrides for md→markdown, ts→typescript |
+| `@injection.content` | `tagged_string` field `content` (string_content) | Injected region |
+
+**Validation**: FR-008 (dynamic; both inline and fenced forms), FR-009 (minimal overrides md, ts).
+
+---
+
+## 6. Documentation (non-capture)
+
+| Artifact | Content |
+|----------|---------|
+| Well-known tags | Convention table: md, ts, date, datetime, time, sql, json, html (and mapping notes) |
+| `::` schema convention | Document that `::` denotes type/schema slots; values often tagged strings |
+
+**Validation**: FR-010. Existing `docs/tagged-strings-and-injections.md` covers this; extend or cross-link as needed.
+
+---
+
+## 7. State and lifecycle
+
+No state machines or lifecycle. Query files are static; editors load them and apply captures to the AST. Scope is file-wide only; no cross-file resolution.
diff --git a/specs/004-editor-improvements/example.schema.gram b/specs/004-editor-improvements/example.schema.gram
new file mode 100644
index 0000000..680f749
--- /dev/null
+++ b/specs/004-editor-improvements/example.schema.gram
@@ -0,0 +1,53 @@
+// Graph schema described in gram notation.
+// Gram is descriptive and informative, not prescriptive and normative.
+// The semantics are up to implementation, often interpreting the
+// explicit syntax along with in-line documentation.
+//
+// This file is an informative example, not part of the project schema.
+// +// Typographic conventions: +// - `::` double-colon for label prefixes and property name/value separator +// - `(::A)` for a labeled node +// - (::A {k:: ts`string`}) for a labeled node with a property value signature +// - `==>` fat-arrows for unqualified relationships +// - `=[::R]=>` fat-arrows for labeled relationships +// - tagged-literal strings to encapsulate type signatures +// - "ts" as a tag for strings to indicate TypeScript-style types +// +// Property type conventions: +// - use tagged-literal strings +// - use tags to denote the internal syntax of the string +// - for example, `ts` to indicate TypeScript, or `zod` to indicate zod-style, or `sql` to indicate SQL DDL +// - supplement with textual information in a comment to describe the intent + +// Always include a header record indicating the file kind. +{ + kind: "schema" // uses single-colon and plain string literal to indicate this is metadata +} + +// Example node type definition +(::A // Label-set for this type. Here, the single label "A" + { // Property record signature begins + k:: ts`string`, // A property type definition. The property named "k" has a value type defined as a TypeScript `string` + ko:: ts`string?`, // An optional property, as indicated by TypeScript + id:: ts`number`, // Must be unique. Use comments to further qualify what can't be described in the value type + id2:: sql`int NOT NULL UNIQUE` // Or use a language like SQL DDL that is more specific + } // End of property record signature +) // End of node type definition + +// A node type with multiple labels, where all labels must be present but extra labels are allowed +(::A::B { k2:: ts`string`}) + +// Example relationship type definition. 
+// References start/end node by minimum label-set (other labels can exist, but these are required) +(::A) =[::DEPENDS_ON // Label-set for the relationship, anchored on a starting node type identified by its label-set + { // Property record signature begins + m:: ts`number`, // Property type definition using TypeScript + } +]=> (::B) + +// Another relationship using `DEPENDS_ON` but with different endpoints +(::B) =[::DEPENDS_ON]=> (::C) + +// Relationships can also have the same start/end, forming a self-loop +(::C) =[::DEPENDS_ON]=> (::C) diff --git a/specs/004-editor-improvements/plan.md b/specs/004-editor-improvements/plan.md new file mode 100644 index 0000000..a9ff823 --- /dev/null +++ b/specs/004-editor-improvements/plan.md @@ -0,0 +1,96 @@ +# Implementation Plan: Editor Support and Syntax Highlighting Improvements + +**Branch**: `004-editor-improvements` | **Date**: 2025-03-11 | **Spec**: [spec.md](./spec.md) +**Input**: Feature specification from `/specs/004-editor-improvements/spec.md` + +## Summary + +Deliver improved editor support for .gram files by extending the Tree-sitter query set (highlights, locals, indents, injections) and adding documentation. No grammar changes. Single canonical query files in `queries/` consumed by Zed via `scripts/prepare-zed-extension.sh`. Highlights differentiate definition-like vs reference identifiers and capture tagged-string tag and comment; locals enable go-to-definition and highlight-references at file scope; indents use 2 spaces per level; injections use tag-as-language with minimal overrides (md→markdown, ts→typescript). Docs cover well-known tags and `::` schema convention. 
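The injection behaviour named in the summary could look roughly like this in `queries/injections.scm`. The `tag` and `content` field names come from data-model.md; how an editor resolves two patterns that match the same node varies, so the ordering shown (overrides first, dynamic rule last) is an assumption to verify per editor.

```scheme
; Override: `md` is not a parser name, so map it to markdown.
((tagged_string
   tag: (symbol) @_tag
   content: (string_content) @injection.content)
 (#eq? @_tag "md")
 (#set! injection.language "markdown"))

; Override: `ts` maps to typescript.
((tagged_string
   tag: (symbol) @_tag
   content: (string_content) @injection.content)
 (#eq? @_tag "ts")
 (#set! injection.language "typescript"))

; Dynamic rule: any other tag names the language directly (sql, json, html, ...).
(tagged_string
  tag: (symbol) @injection.language
  content: (string_content) @injection.content)
```

Keeping the override list this short means a new tag such as `sql` needs no query change, provided the editor has a parser registered under that name.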
+ +## Technical Context + +**Language/Version**: Tree-sitter query language (scheme-like .scm); grammar.js (JavaScript, Node); parser generated to C via tree-sitter-cli +**Primary Dependencies**: tree-sitter, tree-sitter-cli (npm); editors consume queries (Zed, Neovim, Helix, Emacs) +**Storage**: N/A (query and doc files only) +**Testing**: `npx tree-sitter test` (corpus in test/corpus/); manual/visual check in editors +**Target Platform**: Any editor that supports Tree-sitter queries (Zed in-repo; others via npm package and queries/) +**Project Type**: Single (grammar + queries + bindings) +**Performance Goals**: Queries and indents must not cause noticeable editor lag on typical .gram files (< ~10k lines) +**Constraints**: Single canonical `queries/*.scm`; Zed extension must not duplicate query logic (sync from queries/) +**Scale/Scope**: All .gram files; file-level scope for locals; 2-space indent; minimal injection overrides md, ts + +## Constitution Check + +*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.* + +**Grammar Expressiveness and Clarity (Principle I)**: +- [x] No grammar changes in this feature; existing grammar.js remains authoritative +- [x] N/A — rule structure unchanged + +**Comprehensive and Illustrative Testing (Principle II)**: +- [ ] Corpus tests added/updated for any new query-relevant patterns if needed (e.g.
corpus file exercising definition vs reference for visual verification) +- [ ] Tests follow "source ===> tree" format; existing corpus remains valid +- [x] No binding changes; binding tests N/A for query-only feature +- [ ] All test suites pass (`npx tree-sitter test`, `npm test`) after changes + +**Minimal Binding Examples (Principle III)**: +- [x] No AST structure change; examples in examples/ unchanged + +*Post Phase 1 design*: Research, data-model, and contracts complete. No grammar changes; Principle I N/A. Principle II satisfied by existing corpus plus manual/visual verification of queries; add corpus only if new patterns need regression coverage. Principle III satisfied (no example updates). + +## Project Structure + +### Documentation (this feature) + +```text +specs/004-editor-improvements/ +├── plan.md # This file +├── research.md # Phase 0 +├── data-model.md # Phase 1 +├── quickstart.md # Phase 1 +├── contracts/ # Phase 1 (query and doc contracts) +└── tasks.md # Phase 2 (created by /speckit.tasks) +``` + +### Source Code (repository root) + +```text +queries/ +├── highlights.scm # Canonical; extend (definition vs reference, tag, comment) +├── locals.scm # New (definition, reference, optional scope) +├── indents.scm # New (2 spaces per level) +└── injections.scm # Existing; ensure minimal overrides md, ts + dynamic + +editors/zed/languages/gram/ +├── config.toml +├── highlights.scm # Synced from queries/ +├── injections.scm # Synced from queries/ +├── locals.scm # Synced from queries/ (after add) +└── indents.scm # Synced from queries/ (after add) + +docs/ +├── tagged-strings-and-injections.md # New or extend: well-known tags, :: convention +└── (cross-links from gram-reference / gram-ebnf as needed) + +grammar.js # No changes +src/ # Generated; no hand edit +test/corpus/ # Add/update only if new coverage needed for query behavior +scripts/ +└── prepare-zed-extension.sh # Already syncs queries/*.scm → Zed +``` + +**Structure Decision**: Single repo; 
canonical queries in `queries/`; Zed and other editors consume from there. No new top-level apps or services. + +## Complexity Tracking + +> **Fill ONLY if Constitution Check has violations that must be justified** + +| Violation | Why Needed | Simpler Alternative Rejected Because | +|-----------|------------|-------------------------------------| +| (none) | — | — | diff --git a/specs/004-editor-improvements/quickstart.md b/specs/004-editor-improvements/quickstart.md new file mode 100644 index 0000000..da0f647 --- /dev/null +++ b/specs/004-editor-improvements/quickstart.md @@ -0,0 +1,59 @@ +# Quickstart: Editor Support and Syntax Highlighting + +**Feature**: 004-editor-improvements +**Date**: 2025-03-11 + +## Prerequisites + +- Node.js and npm +- Tree-sitter CLI: `npm install -g tree-sitter-cli` (or use `npx tree-sitter`) + +## Build and test (repo root) + +```bash +# Regenerate parser (no grammar change in this feature; run for hygiene) +npx tree-sitter generate + +# Run grammar corpus tests +npx tree-sitter test + +# Run Node binding tests (if applicable) +npm test +``` + +## Query files + +Canonical query files live under `queries/`: + +- `queries/highlights.scm` — syntax highlighting (extend for definition/reference, tag, comment) +- `queries/locals.scm` — go to definition / highlight references (add if missing) +- `queries/indents.scm` — indentation rules, 2 spaces (add if missing) +- `queries/injections.scm` — language injection for tagged strings (existing; ensure md/ts overrides) + +Do not edit copies under `editors/zed/languages/gram/` directly; they are synced from `queries/`. + +## Sync to Zed extension + +After changing any `queries/*.scm`: + +```bash +./scripts/prepare-zed-extension.sh +``` + +This copies `queries/*.scm` into `editors/zed/languages/gram/`, runs tests, and updates the extension metadata. + +## Manual verification + +1. **Highlights**: Open a .gram file with subject patterns, node patterns, pattern references, and tagged strings. 
Check that definitions and references use different styles and that the tag is distinct from the content; check that `//` comments are highlighted. +2. **Locals**: In an editor that supports tree-sitter locals (e.g. Neovim with nvim-treesitter, Helix), open a .gram file, put the cursor on a pattern reference, run “go to definition”; put the cursor on a definition, run “highlight references.” +3. **Indentation**: In a .gram file with nested brackets, use the editor’s indent/format and confirm 2-space step and bracket alignment. +4. **Injections**: Add `` sql`SELECT 1` ``, `` ts`string` ``, `` md`*bold*` `` and confirm SQL/TypeScript/Markdown highlighting inside the content. + +## Docs + +- Well-known tags and `::` convention: `docs/tagged-strings-and-injections.md` +- Spec and contracts: `specs/004-editor-improvements/spec.md`, `specs/004-editor-improvements/contracts/` + +## Next + +Run `/speckit.tasks` to generate `tasks.md` for implementation tasks (Phase 2). diff --git a/specs/004-editor-improvements/research.md b/specs/004-editor-improvements/research.md new file mode 100644 index 0000000..0d9a34e --- /dev/null +++ b/specs/004-editor-improvements/research.md @@ -0,0 +1,87 @@ +# Research: Editor Support and Syntax Highlighting Improvements + +**Feature**: 004-editor-improvements +**Date**: 2025-03-11 + +## 1. Highlight captures: definition vs reference + +**Decision**: Use distinct capture names so definitions can be styled differently from references. Definitions (subject in subject_pattern, identifier in node_pattern, subject in relationship arrows, identifier in identified_annotation) → `@type` or a custom capture (e.g. `@identifier.definition`) that themes can map to a type-like or definition style. References (pattern_reference identifier) → keep `@variable` or use `@variable.reference`. 
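One way this decision might be written in `queries/highlights.scm`, sketched under the assumption that the identifying `symbol` is a direct child of `node_pattern` and `subject_pattern` (the exact field and child layout is not confirmed here). Relationship and annotation identifiers are omitted for brevity.

```scheme
; Definition-like identifiers take a type-like style.
; ASSUMPTION: the identifier is a direct `symbol` child of these nodes.
(node_pattern (symbol) @type)
(subject_pattern (symbol) @type)

; References keep the variable style.
(pattern_reference) @variable
```

Pattern precedence differs between editors, so where these rules sit relative to the existing generic `symbol` → `@variable` rule needs checking in each target editor.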
+ +**Rationale**: Tree-sitter and editor themes commonly use `@type` for type/definition-like identifiers and `@variable` for uses; Neovim/Helix/Emacs map these to different highlight groups. Using `@type` for definition-like identifiers and `@variable` for pattern references aligns with existing grammar terminology (subject_pattern already captured as `@type` in places) and gives immediate visual distinction without requiring theme changes. + +**Alternatives considered**: +- Single `@variable` for all: Rejected; spec requires differentiation (FR-001). +- Custom names only (e.g. `@gram.definition`): Possible but less portable; `@type` / `@variable` are widely recognized. + +--- + +## 2. Highlight: tagged-string tag and comment + +**Decision**: Capture tagged_string’s `tag` field (symbol) as a dedicated capture (e.g. `@type` or `@attribute`) so the tag is highlighted distinctly from `string_content`. Capture `(comment)` as `@comment` when present in the grammar (grammar.js exposes `comment`). + +**Rationale**: FR-002 and FR-003 require distinct tag and comment highlighting. The grammar has `tagged_string` with `field("tag", $.symbol)` and `comment: ($) => token(seq("//", /.*/))`, so both are available. Using `@attribute` for the tag keeps tags visually distinct from general `@variable` and fits “metadata” semantics. + +**Alternatives considered**: +- Reuse `@variable` for tag: Rejected; spec requires distinct tag (FR-002). +- Omit comment capture until grammar exposes it: Not applicable; grammar already exposes `comment` in extras. + +--- + +## 3. Locals: file scope only + +**Decision**: Implement locals with `@local.definition` and `@local.reference` only. Use a single file-level `@local.scope` (e.g. on `gram_pattern` or the root) so all definitions and references are in one scope. Do not add subject-pattern or finer-grained scope nodes. + +**Rationale**: Spec clarification: file-only scope (no subject-pattern scoping). 
Tree-sitter locals resolve references to definitions within the same scope; one file scope gives correct “go to definition” and “highlight references” for the whole file. Simpler query set and fewer edge cases. + +**Alternatives considered**: +- Subject-pattern scope: Rejected per spec (clarification: file-only). +- No @local.scope: Some editors still resolve by name within file; defining one root scope is the documented pattern and avoids ambiguity. + +--- + +## 4. Indentation: 2 spaces and indent captures + +**Decision**: Add `indents.scm` using standard indent captures (`@indent`, `@indent.end`, `@extend` as needed). Indent width is 2 spaces per level. Capture bracket-delimited blocks (e.g. record `{}`, pattern `[]`, `()`) so that content inside increases indent by one level; closing bracket aligns with opening. + +**Rationale**: Spec requires 2 spaces per level (FR-007, clarification). Tree-sitter indent queries use captures like `@indent` (increase indent), `@indent.end` (decrease), `@extend` (continue scope). Align with nvim-treesitter/Helix patterns: opening bracket → indent begin, closing → indent end, 2-space step. + +**Alternatives considered**: +- 4 spaces: Rejected; spec says 2 spaces. +- Rely only on editor default: Rejected; spec requires defined rules (FR-007). + +--- + +## 5. Injections: dynamic tag + minimal overrides + +**Decision**: Keep current injection approach: capture tag as `@injection.language`, content as `@injection.content`. Overrides only for tags that do not match parser names: minimal set `md` → markdown, `ts` → typescript. All other tags use the tag’s text as the language name (dynamic). Support both inline and fenced tagged_string forms in the same way. + +**Rationale**: Spec (FR-008, FR-009) and existing `queries/injections.scm` already follow this. 
No research change; document the decision and ensure override order in queries (specific overrides first, then generic dynamic rule) so that `md` and `ts` are overridden before the catch-all. + +**Alternatives considered**: +- Enumerate all well-known tags in queries: Rejected; spec says language is captured from tag, not pre-defined. +- Remove overrides: Rejected; spec requires minimal set (md, ts) for testability. + +--- + +## 6. Single canonical queries and Zed sync + +**Decision**: Maintain exactly one set of query files under `queries/` (highlights.scm, locals.scm, indents.scm, injections.scm). Zed extension gets them via `scripts/prepare-zed-extension.sh` copying `queries/*.scm` into `editors/zed/languages/gram/`. Do not maintain separate copies in Zed; remove or stop editing any duplicate under editors/zed that diverges from queries/. + +**Rationale**: FR-004 and plan require single source of truth. prepare-zed-extension.sh already copies all .scm from queries/ to Zed; adding locals.scm and indents.scm to queries/ and re-running the script is sufficient. No NEEDS CLARIFICATION. + +**Alternatives considered**: +- Separate Zed-specific queries: Rejected; would duplicate maintenance. 
+ +--- + +## Summary table + +| Topic | Decision | Status | +|--------------------|-----------------------------------------------|----------| +| Definition vs ref | @type (or definition) vs @variable | Resolved | +| Tagged-string tag | Capture tag as @attribute (or @type) | Resolved | +| Comment | (comment) @comment | Resolved | +| Locals scope | File-only; one @local.scope at root | Resolved | +| Indent | indents.scm; 2 spaces; bracket blocks | Resolved | +| Injections | Dynamic + md/ts overrides; both forms | Resolved | +| Canonical queries | queries/ only; Zed sync via script | Resolved | diff --git a/specs/004-editor-improvements/spec.md b/specs/004-editor-improvements/spec.md new file mode 100644 index 0000000..d66aae0 --- /dev/null +++ b/specs/004-editor-improvements/spec.md @@ -0,0 +1,146 @@ +# Feature Specification: Editor Support and Syntax Highlighting Improvements + +**Feature Branch**: `004-editor-improvements` +**Created**: 2025-03-11 +**Status**: Draft +**Input**: User description: "editor improvements as described in the proposals/editor-improvements.md document" + +## Clarifications + +### Session 2025-03-11 + +- Q: Should references resolve at file-only scope or also within subject-pattern scope? → A: File-only scope. +- Q: Should the spec call out a primary target (e.g. schema-style .gram) or treat all .gram files equally? → A: All .gram files (no primary target). +- Q: Should the spec add a glossary mapping spec terms to the example's terminology? → A: No glossary; example is illustrative only. +- Q: What indent width or style should the spec require for "consistent indentation"? → A: 2 spaces. +- Q: Should the spec list a minimal set of injection overrides for testability? → A: List minimal set in spec. 
+ +## User Scenarios & Testing *(mandatory)* + +### User Story 1 - Improved syntax highlighting for .gram files (Priority: P1) + +Authors editing Gram (`.gram`) files see clearer visual distinction between defining identifiers (subjects, nodes, relationships, annotations) and references to them. Tagged-string tags (e.g. `sql`, `ts`) are highlighted distinctly from their content, and comments are highlighted when the grammar exposes them. + +**Why this priority**: Core editing experience; enables readers to scan structure and intent without an LSP. + +**Independent Test**: Open a .gram file with mixed definitions and references; verify definition-like identifiers and references use different highlight styles; verify tagged-string tag vs content and comments are distinct. + +**Acceptance Scenarios**: + +1. **Given** a .gram file with subject patterns, node patterns, and pattern references, **When** the file is opened in an editor using the grammar’s highlight queries, **Then** definition-like identifiers and reference identifiers use different highlight styles. +2. **Given** a .gram file containing tagged strings (e.g. `sql\`SELECT 1\``), **When** the file is opened, **Then** the tag (`sql`) is highlighted distinctly from the string content. +3. **Given** the grammar exposes a named `comment` node, **When** comments are present in a .gram file, **Then** they are captured and highlighted as comments. + +--- + +### User Story 2 - Go to definition and highlight references (Priority: P2) + +Authors can use “go to definition” and “highlight references” for identifiers in tree-sitter–aware editors (e.g. Neovim, Helix, Emacs) without an LSP. Definitions are where an identifier is introduced (subject in subject pattern, identifier in node pattern, subject in relationship arrows, identifier in identified annotation); references are where an identifier is used in a pattern reference. 
+ +**Why this priority**: High value for navigation and refactoring; depends on a single canonical locals query set. + +**Independent Test**: In an editor that supports tree-sitter locals, open a .gram file, place cursor on a reference, invoke “go to definition”; place cursor on a definition, invoke “highlight references.” Verify correct targets. + +**Acceptance Scenarios**: + +1. **Given** a .gram file with at least one subject/node/relationship definition and one pattern reference to it, **When** the author places the cursor on the reference and invokes “go to definition”, **Then** the editor jumps to the corresponding definition. +2. **Given** a .gram file with a definition and multiple references, **When** the author places the cursor on the definition and invokes “highlight references”, **Then** all references to that definition are visually highlighted. +3. **Given** file-level scope, **When** multiple definitions could match a name, **Then** references resolve to a matching definition in the file (file-wide scope; no subject-pattern scoping). + +--- + +### User Story 3 - Consistent indentation (Priority: P3) + +Authors get consistent indentation for brackets and multi-line structures (2 spaces per level) so that structure is easy to read and maintain. + +**Why this priority**: Improves readability and reduces manual formatting; no semantic impact. + +**Independent Test**: Create or open a .gram file with nested brackets and multi-line constructs; use editor auto-indent; verify indentation follows defined rules. + +**Acceptance Scenarios**: + +1. **Given** a .gram file with nested bracket structures, **When** the author uses editor indentation features (e.g. indent selection, new line), **Then** indentation aligns with the defined indentation rules. +2. **Given** multi-line patterns or records, **When** the author inserts new lines or reformats, **Then** indent depth reflects the current nesting level. 
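A rough sketch of how the indentation rules might be expressed in `queries/indents.scm`. The capture names `@indent` and `@indent.end` are the ones used in data-model.md; editors differ in the indent capture names they recognise, so treat the names here as an assumption to adapt per editor. The 2-space width is normally an editor or language setting rather than part of the query.

```scheme
; Content inside a bracketed construct sits one level deeper.
[
  (record)
  (subject_pattern)
  (node_pattern)
] @indent

; A closing bracket ends the indented region and aligns with its opener.
[
  "}"
  "]"
  ")"
] @indent.end
```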
+ +--- + +### User Story 4 - Language injection in tagged strings (Priority: P4) + +Authors see syntax highlighting for the injected language inside tagged string content (e.g. SQL inside `sql\`...\``, TypeScript inside `ts\`...\``). Well-known tags (e.g. `md`, `ts`, `sql`, `json`, `html`) are mapped to the appropriate language where needed (e.g. `md` → markdown, `ts` → typescript); other tags use the tag name as the language identifier when the editor supports it. + +**Why this priority**: Makes embedded content readable and consistent with standalone files. + +**Independent Test**: Add tagged strings with known tags (`sql`, `ts`, `md`); verify content is highlighted with the expected language; verify overrides (e.g. `md` → markdown) work. + +**Acceptance Scenarios**: + +1. **Given** a tagged string whose tag matches a common parser name (e.g. `sql`, `json`, `html`), **When** the file is opened in an editor that supports dynamic injection, **Then** the content is highlighted using that language (language is captured from the tag; no pre-defined set). +2. **Given** a tagged string with a tag that does not match a parser name (e.g. `md`, `ts`), **When** an override from the minimal set applies (md→markdown, ts→typescript), **Then** the content is highlighted with the overridden language. +3. **Given** a tagged string with an unknown or unmapped tag, **When** no parser is available, **Then** the content may be plain or fall back to a default; the grammar and query set do not require enumerating every possible tag. + +--- + +### User Story 5 - Documented conventions for downstream and editors (Priority: P5) + +Downstream consumers and editor integrations have clear documentation for tagged strings, well-known tags, and the `::` schema convention so they can implement consistent behavior without changing the core grammar. + +**Why this priority**: Enables ecosystem consistency and reduces support burden; no new grammar rules. 
+ +**Independent Test**: Read the documentation; verify it describes well-known tags (e.g. md, ts, date, datetime, time, sql, json, html) and the convention that `::` denotes type/schema slots with tagged-string values. + +**Acceptance Scenarios**: + +1. **Given** a maintainer or editor author, **When** they consult the documentation, **Then** they find a convention table (or equivalent) for well-known tags and how to map them to parsers. +2. **Given** the same audience, **When** they need to handle schema/type slots, **Then** they find that `::` is documented as the convention for type/schema and that values are often tagged strings (e.g. `ts`, `sql`). + +--- + +### Edge Cases + +- Unknown or custom tag names: injection may have no parser; documentation should state that editors can add their own tag → parser mappings. +- Malformed or partial structures: highlight and indent queries should not crash; errors remain visible (e.g. existing `ERROR` capture). +- Multiple definitions with the same name in the file: references resolve to any matching definition in the file (file-wide scope). +- Empty or minimal tagged string content: injection still applies so that when content is added, highlighting is correct. + +## Requirements *(mandatory)* + +### Functional Requirements + +- **FR-001**: The highlight query set MUST distinguish definition-like identifiers (subject, node, relationship, annotation definitions) from reference identifiers (pattern references) so they can be styled differently. +- **FR-002**: The highlight query set MUST capture the tag of a tagged string separately from its content so the tag can be highlighted distinctly. +- **FR-003**: If the grammar exposes a named `comment` node, the highlight query set MUST capture it for comment highlighting. +- **FR-004**: There MUST be a single canonical highlight query file consumed by the Zed extension via the existing sync mechanism (no duplicate maintenance).
+- **FR-005**: A locals query set MUST define `@local.definition` for identifiers that define (subject in subject pattern, identifier in node pattern, subject in relationship arrows, identifier in identified annotation) and `@local.reference` for pattern-reference identifiers. +- **FR-006**: The locals query set MAY define `@local.scope` at file level so that references resolve to definitions within the same file. +- **FR-007**: An indentation query set MUST define indentation for brackets and multi-line structures so tree-sitter–aware editors can apply consistent indent; indent width is 2 spaces per level. +- **FR-008**: An injection query set MUST support dynamic injection: capture the tag symbol as the injection language name and the tagged string body as the injection content. The language is not pre-defined; it is taken from the tag. Both forms (inline e.g. `` tag`content` `` and multiline ``` ```tag\ncontent\n``` ```) are supported with the same semantics (one tag, one content region). +- **FR-009**: The injection query set MUST provide overrides only when the tag text does not match the editor’s parser name (e.g. `md` → markdown, `ts` → typescript). The spec defines a minimal override set for testability: `md` → markdown, `ts` → typescript. Tags that already match the parser name (e.g. `sql`, `json`, `html`) need no override. Additional overrides (e.g. `zod`) may be documented or added by editors without changing the spec. +- **FR-010**: Documentation MUST describe well-known tags (e.g. md, ts, date, datetime, time, sql, json, html) as a convention table and MUST document that `::` is used for type/schema slots with values often given as tagged strings. + +### Key Entities + +- **Identifier**: A symbol, string literal, or integer used as subject/node/relationship identity, annotation identity/labels, pattern reference, or record/map key; has a role (definition vs reference) for highlighting and navigation. 
+- **Tagged string**: A construct with a tag (symbol) and content (string content). The tag is captured as the injection language (not pre-defined); the content is captured for injection. Both inline (e.g. `` tag`content` ``) and multiline (``` ```tag\ncontent\n``` ```) forms have the same semantics. Overrides apply only when the tag does not match the editor’s parser name (e.g. `md` → markdown). +- **Scope**: A region (the file) within which references resolve to definitions; used by locals for “go to definition” and “highlight references.” +- **Well-known tag**: A documented convention for tag names (e.g. md, ts, sql) and suggested or standard parser mappings; not enforced by the grammar. + +## Success Criteria *(mandatory)* + +### Measurable Outcomes + +- **SC-001**: Authors can visually distinguish definitions from references in .gram files without using “go to definition.” +- **SC-002**: In supported editors, authors can complete “go to definition” from a reference to the correct definition in under two actions (e.g. key chord or command). +- **SC-003**: In supported editors, “highlight references” from a definition highlights all and only the references that resolve to that definition within the defined scope. +- **SC-004**: Indentation in .gram files follows a single documented set of rules (2 spaces per level) so that nested brackets and multi-line structures align consistently. +- **SC-005**: Tagged string content is highlighted with the expected language for well-known tags (and overrides) in editors that support tree-sitter injection. +- **SC-006**: Downstream and editor implementers can implement tag and schema behavior using documentation alone, without changing the core grammar. + +## Assumptions + +- The grammar already supports tagged strings with any symbol as tag; no grammar change is required for tag enumeration. +- The grammar already allows `::` in record properties; no grammar change is required for the schema convention. 
+- Tree-sitter query semantics for `@local.definition`, `@local.reference`, `@local.scope`, and injection captures (`@injection.language`, `@injection.content`) are used as intended by supporting editors. +- Editors that do not support locals or injection will still benefit from improved highlights and indentation where they consume the same query files. +- Scope is file-wide only; references resolve to definitions anywhere in the file. +- All .gram files are in scope equally; no primary file type or style (e.g. schema-style) is designated; highlights, locals, indentation, and injection apply to every .gram file the same way. +- Example .gram files (e.g. example.schema.gram) are illustrative only; Key Entities and spec terminology are canonical; no normative glossary mapping to example terms is required. diff --git a/specs/004-editor-improvements/tasks.md b/specs/004-editor-improvements/tasks.md new file mode 100644 index 0000000..64850de --- /dev/null +++ b/specs/004-editor-improvements/tasks.md @@ -0,0 +1,225 @@ +# Tasks: Editor Support and Syntax Highlighting Improvements + +**Input**: Design documents from `/specs/004-editor-improvements/` +**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/ + +**Tests**: Not requested in the feature specification; no test tasks included. Verification is manual per quickstart.md and Independent Test criteria per story. + +**Organization**: Tasks are grouped by user story so each story can be implemented and verified independently. + +## Format: `[ID] [P?] [Story?] 
Description` + +- **[P]**: Can run in parallel (different files or independent edits) +- **[Story]**: User story (US1–US5) +- Include exact file paths in descriptions + +## Path Conventions + +- **Canonical queries**: `queries/` at repository root (highlights.scm, locals.scm, indents.scm, injections.scm) +- **Zed**: `editors/zed/languages/gram/` — synced from queries/ via script, do not edit directly +- **Docs**: `docs/tagged-strings-and-injections.md` + +--- + +## Phase 1: Setup (Shared Infrastructure) + +**Purpose**: Ensure repository is ready for query and doc work + +- [ ] T001 Verify `queries/` directory exists and `scripts/prepare-zed-extension.sh` copies `queries/*.scm` into `editors/zed/languages/gram/` per plan.md + +--- + +## Phase 2: Foundational (Blocking Prerequisites) + +**Purpose**: Baseline grammar and tests pass before changing queries + +**⚠️ CRITICAL**: No user story work should begin until this phase is complete + +- [ ] T002 Run `npx tree-sitter generate`, `npx tree-sitter test`, and `npm test` at repo root to confirm baseline; fix any failing tests before editing queries + +**Checkpoint**: Foundation ready — user story implementation can proceed + +--- + +## Phase 3: User Story 1 - Improved syntax highlighting (Priority: P1) 🎯 MVP + +**Goal**: Authors see definition-like vs reference identifiers styled differently; tagged-string tag and comments highlighted distinctly (FR-001, FR-002, FR-003). + +**Independent Test**: Open a .gram file with subject patterns, node patterns, pattern references, and tagged strings; verify definition-like and reference identifiers use different highlight styles; verify tag vs content and `//` comments are distinct. 
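A fixture for this check could be produced along the following lines. This is only a sketch: the file name `highlight-sample.gram` is hypothetical, and the gram snippets are assumed (not verified here) to parse under the current grammar. The optional `tree-sitter query` step runs only when the CLI and `queries/highlights.scm` are both present, and assumes the CLI can locate the grammar from the repo root.

```shell
set -eu

# Hypothetical fixture path; not part of the spec.
sample="${TMPDIR:-/tmp}/highlight-sample.gram"

# The quoted heredoc delimiter keeps the backticks literal.
cat > "$sample" <<'GRAM'
// definitions: node patterns with identifiers and labels
(alice:Person {bio: md`**hello**`})
(bob:Person)
// references: bare identifiers after the | in a subject pattern
[team:Team | alice, bob]
GRAM

# Optional: print which query patterns match, if the tooling is available.
if command -v tree-sitter >/dev/null 2>&1 && [ -f queries/highlights.scm ]; then
  tree-sitter query queries/highlights.scm "$sample"
else
  echo "wrote $sample; open it in an editor to inspect highlighting"
fi
```

Opening the generated file in an editor should then show `alice` and `bob` styled one way where they are defined and another way where they are referenced, with the `md` tag and the `//` comments each distinct.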
+ +### Implementation for User Story 1 + +- [ ] T003 [P] [US1] In `queries/highlights.scm` add captures for definition-like identifiers (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation identifier/labels) using @type or equivalent per specs/004-editor-improvements/contracts/highlights.md +- [ ] T004 [P] [US1] In `queries/highlights.scm` add capture for pattern_reference identifier as @variable (or @variable.reference) and ensure it does not conflict with definition captures per contracts/highlights.md +- [ ] T005 [P] [US1] In `queries/highlights.scm` add capture for tagged_string tag (symbol) as @attribute or equivalent so the tag is distinct from string content per FR-002 and contracts/highlights.md +- [ ] T006 [P] [US1] In `queries/highlights.scm` add (comment) @comment for comment highlighting per FR-003 and contracts/highlights.md + +**Checkpoint**: User Story 1 complete — open a .gram file and verify highlight differentiation and comment highlighting + +--- + +## Phase 4: User Story 2 - Go to definition and highlight references (Priority: P2) + +**Goal**: Authors can use “go to definition” and “highlight references” in tree-sitter–aware editors at file scope (FR-005, FR-006). + +**Independent Test**: In an editor with tree-sitter locals support, open a .gram file; from a pattern_reference invoke “go to definition”; from a definition invoke “highlight references” and verify correct targets. 
+ +### Implementation for User Story 2 + +- [ ] T007 [US2] Create `queries/locals.scm` with @local.definition (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation), @local.reference (pattern_reference), and optional @local.scope (gram_pattern) for file-wide scope per specs/004-editor-improvements/contracts/locals.md + +**Checkpoint**: User Story 2 complete — verify go-to-definition and highlight-references in a supporting editor + +--- + +## Phase 5: User Story 3 - Consistent indentation (Priority: P3) + +**Goal**: Indentation for brackets and multi-line structures with 2 spaces per level (FR-007). + +**Independent Test**: In a .gram file with nested brackets, use editor indent/format; verify 2-space step and bracket alignment. + +### Implementation for User Story 3 + +- [ ] T008 [US3] Create `queries/indents.scm` with indent captures for record `{}`, subject_pattern `[]`, node_pattern `()`, and 2-space indent per level per specs/004-editor-improvements/contracts/indents.md + +**Checkpoint**: User Story 3 complete — verify indentation in an editor that uses tree-sitter indents + +--- + +## Phase 6: User Story 4 - Language injection in tagged strings (Priority: P4) + +**Goal**: Tagged string content is highlighted with the injected language; dynamic tag as language name; minimal overrides md→markdown, ts→typescript (FR-008, FR-009). + +**Independent Test**: Add tagged strings with tags `sql`, `ts`, `md`; verify SQL/TypeScript/Markdown highlighting inside content and that overrides apply for md and ts. 
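T009 asks that the `md` and `ts` overrides appear before the dynamic rule in `queries/injections.scm`. A rough textual check of that ordering might look like the sketch below. The function name `overrides_before_dynamic` is hypothetical, and the check assumes overrides are written with `#set! injection.language` and the dynamic rule captures the tag as `@injection.language`; it compares line positions only and does not parse the query.

```shell
set -eu

# Succeeds when every `#set! injection.language` override sits above the
# first pattern that captures the tag itself as `@injection.language`.
overrides_before_dynamic() {
  file="$1"
  # Line number of the first dynamic capture.
  dynamic=$(grep -n '@injection\.language' "$file" | head -n 1 | cut -d: -f1)
  # Line number of the last explicit override.
  last_override=$(grep -n '#set! injection\.language' "$file" | tail -n 1 | cut -d: -f1)
  [ -n "$dynamic" ] && [ -n "$last_override" ] && [ "$last_override" -lt "$dynamic" ]
}

# Run against the canonical query file when invoked from the repo root.
if [ -f queries/injections.scm ]; then
  if overrides_before_dynamic queries/injections.scm; then
    echo "ok: overrides precede the dynamic rule"
  else
    echo "check queries/injections.scm: override ordering not confirmed"
  fi
fi
```

A passing check says nothing about whether a given editor honors the ordering, so the manual verification with `sql`, `ts`, and `md` tagged strings is still needed.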
+ +### Implementation for User Story 4 + +- [ ] T009 [US4] Verify and if needed update `queries/injections.scm` so overrides for md→markdown and ts→typescript appear before the dynamic rule, and both inline and fenced tagged_string forms are covered per specs/004-editor-improvements/contracts/injections.md + +**Checkpoint**: User Story 4 complete — verify injection in an editor that supports tree-sitter injections + +--- + +## Phase 7: User Story 5 - Documented conventions (Priority: P5) + +**Goal**: Documentation describes well-known tags and the `::` schema convention (FR-010). + +**Independent Test**: Read docs; confirm convention table for well-known tags (md, ts, date, datetime, time, sql, json, html) and that `::` is documented for type/schema slots with tagged-string values. + +### Implementation for User Story 5 + +- [ ] T010 [US5] Extend or verify `docs/tagged-strings-and-injections.md` with well-known tags convention table and `::` type/schema convention per specs/004-editor-improvements/contracts/documentation.md + +**Checkpoint**: User Story 5 complete — documentation is sufficient for downstream and editor authors + +--- + +## Phase 8: Polish & Cross-Cutting Concerns + +**Purpose**: Sync Zed, validate quickstart, and cross-links + +- [ ] T011 Run `scripts/prepare-zed-extension.sh` from repo root to sync all `queries/*.scm` into `editors/zed/languages/gram/` and confirm script passes +- [ ] T012 Run quickstart validation: `npx tree-sitter test`, `npm test`, then manual verification steps in specs/004-editor-improvements/quickstart.md +- [ ] T013 [P] Add cross-links to `docs/tagged-strings-and-injections.md` from `docs/gram-reference` or `docs/gram-ebnf` if those files exist and reference tagged strings or schema + +--- + +## Dependencies & Execution Order + +### Phase Dependencies + +- **Phase 1 (Setup)**: No dependencies — start immediately +- **Phase 2 (Foundational)**: Depends on Phase 1 — BLOCKS all user stories +- **Phases 3–7 (User Stories)**: 
Depend on Phase 2; can be done in priority order (US1 → US2 → …) or in parallel if working on different files +- **Phase 8 (Polish)**: Depends on completion of all query and doc changes (Phases 3–7) + +### User Story Dependencies + +- **US1 (P1)**: After Phase 2 only — no dependency on other stories +- **US2 (P2)**: After Phase 2 only — no dependency on other stories +- **US3 (P3)**: After Phase 2 only — no dependency on other stories +- **US4 (P4)**: After Phase 2 only — injections.scm may already satisfy contract +- **US5 (P5)**: After Phase 2 only — docs can be updated independently + +### Within Each User Story + +- US1: T003–T006 can be done in any order or in parallel (all edit highlights.scm; ensure combined result satisfies contract) +- US2–US5: Single-task phases; complete in order +- Run `scripts/prepare-zed-extension.sh` after any change to `queries/*.scm` before manual verification + +### Parallel Opportunities + +- Phase 1: T001 only +- Phase 2: T002 only +- Phase 3: T003, T004, T005, T006 [P] — same file but logically independent capture groups; can be implemented in one pass or split +- Phases 4–7: No parallelism within phase (one task each except US1) +- Across stories: US2 (locals.scm), US3 (indents.scm), US4 (injections.scm), US5 (docs) can be done in parallel after Phase 2 +- Phase 8: T013 [P] can run in parallel with T011–T012 + +--- + +## Parallel Example: User Story 1 + +```text +# Option A: Single implementer — implement all four capture groups in queries/highlights.scm in one edit +T003 + T004 + T005 + T006 → one coherent update to queries/highlights.scm per contracts/highlights.md + +# Option B: Split by capture type (if reviewing separately) +T003: definition-like identifiers +T004: pattern_reference as @variable +T005: tagged_string tag +T006: (comment) @comment +``` + +--- + +## Parallel Example: User Stories 2–5 After Foundation + +```text +# After T002 passes, different implementers can take different stories: +Developer A: T007 
(locals.scm) +Developer B: T008 (indents.scm) +Developer C: T009 (injections.scm) +Developer D: T010 (docs) +``` + +--- + +## Implementation Strategy + +### MVP First (User Story 1 Only) + +1. Complete Phase 1 (T001) and Phase 2 (T002) +2. Complete Phase 3 (T003–T006) — extend highlights.scm +3. Run `scripts/prepare-zed-extension.sh` +4. **STOP and VALIDATE**: Open a .gram file; verify definition vs reference, tag, and comment highlighting +5. If satisfied, MVP is done; proceed to US2–US5 as needed + +### Incremental Delivery + +1. Setup + Foundational → baseline ready +2. US1 (highlights) → sync Zed → verify (MVP) +3. US2 (locals) → sync → verify go-to-definition / highlight-references +4. US3 (indents) → sync → verify indentation +5. US4 (injections) → sync → verify injection +6. US5 (docs) → verify documentation +7. Polish (sync, quickstart, cross-links) + +### Parallel Team Strategy + +After Phase 2: + +- One person: US1 (highlights) then US2 (locals) +- Another: US3 (indents) and US4 (injections) +- Another: US5 (docs) +Merge query files as needed; run prepare-zed-extension.sh once before final verification. 
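Before final verification it can help to confirm that the Zed query files actually match the canonical ones. The sketch below is one way to do that; the function name `count_out_of_sync` is hypothetical and is not part of `scripts/prepare-zed-extension.sh`. It uses `cmp`, which follows symlinks, so it should behave the same whether the files under `editors/zed/languages/gram/` are copies or symlinks.

```shell
set -eu

# Print the number of .scm files under $1 whose counterpart under $2
# is missing or has different contents. Details go to stderr.
count_out_of_sync() {
  src="$1"
  dst="$2"
  stale=0
  for f in "$src"/*.scm; do
    [ -e "$f" ] || continue          # glob matched nothing
    name=$(basename "$f")
    if ! cmp -s "$f" "$dst/$name"; then
      echo "out of sync: $name" >&2
      stale=$((stale + 1))
    fi
  done
  echo "$stale"
}

# Run from the repo root; a result of 0 means the extension is in sync.
if [ -d queries ] && [ -d editors/zed/languages/gram ]; then
  count_out_of_sync queries editors/zed/languages/gram
fi
```

A non-zero count suggests re-running `scripts/prepare-zed-extension.sh` before opening the extension in Zed.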
+ +--- + +## Notes + +- [P] tasks are either different files or logically independent edits to the same file +- [Story] label links each task to a user story for traceability +- No test tasks: spec does not request TDD or automated tests; use Independent Test and quickstart manual verification +- Commit after each task or after each user story checkpoint +- Always sync Zed after changing any file under `queries/` From dcf8c8b91a6306efa3a65c2875caaf38cbe4e54d Mon Sep 17 00:00:00 2001 From: Andreas Kollegger Date: Wed, 11 Mar 2026 10:44:51 +0000 Subject: [PATCH 3/5] editor-improve: better syntax highlighting and queries --- Cargo.toml | 2 +- Makefile | 2 +- docs/gram-ebnf.md | 2 +- docs/gram-reference.md | 2 + docs/tagged-strings-and-injections.md | 7 +++ editors/zed/.gitignore | 2 +- editors/zed/README.md | 43 +++++++++----- editors/zed/extension.toml | 8 +-- editors/zed/languages/gram/highlights.scm | 71 +---------------------- editors/zed/languages/gram/indents.scm | 1 + editors/zed/languages/gram/injections.scm | 30 +--------- editors/zed/languages/gram/locals.scm | 1 + editors/zed/zed | 1 - package-lock.json | 4 +- package.json | 2 +- pyproject.toml | 2 +- queries/highlights.scm | 41 ++++++++++--- queries/indents.scm | 14 +++++ queries/locals.scm | 28 +++++++++ scripts/prepare-zed-extension.sh | 27 +++++---- specs/004-editor-improvements/tasks.md | 26 ++++----- src/parser.c | 2 +- tree-sitter.json | 2 +- 23 files changed, 161 insertions(+), 159 deletions(-) mode change 100644 => 120000 editors/zed/languages/gram/highlights.scm create mode 120000 editors/zed/languages/gram/indents.scm mode change 100644 => 120000 editors/zed/languages/gram/injections.scm create mode 120000 editors/zed/languages/gram/locals.scm delete mode 120000 editors/zed/zed create mode 100644 queries/indents.scm create mode 100644 queries/locals.scm diff --git a/Cargo.toml b/Cargo.toml index c110a39..adfa673 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -6,7 +6,7 @@ members = [ resolver = "2" 
[workspace.package] -version = "0.3.3" +version = "0.3.4" license = "MIT" repository = "https://github.com/gram-data/tree-sitter-gram" edition = "2021" diff --git a/Makefile b/Makefile index 84c8887..cbaefe4 100644 --- a/Makefile +++ b/Makefile @@ -1,4 +1,4 @@ -VERSION := 0.3.3 +VERSION := 0.3.4 LANGUAGE_NAME := tree-sitter-gram diff --git a/docs/gram-ebnf.md b/docs/gram-ebnf.md index e28e6b8..2272ae5 100644 --- a/docs/gram-ebnf.md +++ b/docs/gram-ebnf.md @@ -314,7 +314,7 @@ Supported escape sequences in single, double, and backtick strings: `\\`, `\'`, ### 9.5 Tagged String -A tagged string attaches a type tag (a `symbol`) to a string value, expressed in two forms: +A tagged string attaches a type tag (a `symbol`) to a string value, expressed in two forms. See [Tagged strings and injections](tagged-strings-and-injections.md) for well-known tags, language injection, and the `::` schema convention. ```ebnf tagged_string = symbol, "`", /([^`\\\n])*/, "`" diff --git a/docs/gram-reference.md b/docs/gram-reference.md index b936e76..6294617 100644 --- a/docs/gram-reference.md +++ b/docs/gram-reference.md @@ -505,6 +505,8 @@ arguments. | `{k: "v"}` | `Value::Map` | scalars only | | `bareword` | `Value::Symbol` | unquoted symbol in value position | +See [Tagged strings and injections](tagged-strings-and-injections.md) for well-known tags, language injection, and the `::` schema convention. + String escape sequences (single, double, backtick forms): `\\`, `\'`, `\"`, `` \` ``, `\b`, `\f`, `\n`, `\r`, `\t`. diff --git a/docs/tagged-strings-and-injections.md b/docs/tagged-strings-and-injections.md index fab2d67..d5651fb 100644 --- a/docs/tagged-strings-and-injections.md +++ b/docs/tagged-strings-and-injections.md @@ -67,3 +67,10 @@ Here, `name`, `count`, and `bio` are property names whose **value types** are de - **Tags:** Arbitrary; no need to enumerate every tag in the grammar. 
Injection uses the tag symbol; a few well-known tags are mapped in `injections.scm`; the rest use the tag text as the language name or are mapped by the editor. - **Schema:** Use `::` for type/schema properties and tagged strings (`ts`, `SQL`, etc.) for the type description. The grammar stays generic; schema support is by convention and downstream tooling. + +--- + +## See also + +- [Gram Notation Reference](gram-reference.md) — value types and `Value::TaggedString` in the data model +- [Gram EBNF](gram-ebnf.md) — formal `tagged_string` syntax (§9.5) diff --git a/editors/zed/.gitignore b/editors/zed/.gitignore index 91ad81b..529e691 100644 --- a/editors/zed/.gitignore +++ b/editors/zed/.gitignore @@ -1,2 +1,2 @@ -# this is created by scripts/prepare-zed-extension.sh +# Populated by Zed when loading the extension (grammar clone). Do not commit. grammars/ diff --git a/editors/zed/README.md b/editors/zed/README.md index 17ed8b2..85f7275 100644 --- a/editors/zed/README.md +++ b/editors/zed/README.md @@ -115,27 +115,44 @@ To work on this extension: ``` editors/zed/ -├── extension.toml # Extension metadata -├── grammars/ -│ └── tree-sitter-gram/ # Grammar files -│ ├── grammar.js # Tree-sitter grammar definition -│ └── src/ # Generated parser source +├── extension.toml # Extension metadata; points at grammar repo (repository + rev) ├── languages/ │ └── gram/ -│ ├── config.toml # Language configuration -│ └── queries/ # Syntax highlighting queries -│ └── highlights.scm -└── example.gram # Example file for testing +│ ├── config.toml # Zed language config (brackets, suffixes, etc.) +│ ├── highlights.scm # → ../../../../queries/ (symlink; edit queries/ in repo root) +│ ├── indents.scm +│ ├── locals.scm +│ └── injections.scm +├── test.gram # Example file for testing +└── .gitignore # Ignores grammars/ (populated by Zed from extension.toml) ``` +Query files (`.scm`) in `languages/gram/` are symlinks to the canonical `queries/` directory at the repo root. 
Edit `queries/*.scm` there; do not edit the copies under `editors/zed` directly. Running `scripts/prepare-zed-extension.sh` copies `queries/*.scm` into this directory (e.g. for distribution) and updates extension version/rev. + +The `grammars/` directory is created by Zed when it loads the extension (it clones the grammar from the URL in `extension.toml`). It is gitignored. If you see a nested `editors/zed/grammars/gram/...` path, that is the cloned repo inside Zed’s cache; you can ignore or delete `editors/zed/grammars/` locally. + +### Keeping the grammar revision in sync + +The extension pins the tree-sitter-gram grammar with `repository` and `rev` in `extension.toml`. Zed uses that to fetch and build the parser. To keep it aligned with the latest version: + +| Goal | Command | What it does | +|------|---------|---------------| +| **Local testing** | `npm run zed:dev` | Sets `repository = "file://"` and `rev = HEAD`. Zed uses your local clone at the current commit, so you can test grammar/query changes without pushing. | +| **Prepare for publish** | `npm run zed:publish` | Sets `repository` to the public GitHub URL (from `package.json`) and `rev = HEAD`. Run this before committing a release so the published extension points at the correct commit on GitHub. | + +After either command, `extension.toml` is updated in place. For local dev you typically don’t commit that change (so the repo keeps a rev that matches the last release). For a release, run `zed:publish`, then commit and push so the extension and the tagged release stay aligned. + +**If Zed shows an old version (e.g. 0.1.11) after installing the dev extension:** Zed may be using a cached clone of the grammar (at an old rev) or an older copy of the extension. Try: (1) Uninstall the Gram extension from Zed’s Extensions panel. (2) Delete the grammar cache: remove `editors/zed/grammars/` if it exists (Zed recreates it when needed). 
(3) Run `npm run zed:dev` again so `extension.toml` has the current version and rev. (4) In Zed, run “Install Dev Extension” and select `editors/zed` again. Restart Zed and recheck the extension version. + ## Contributing Contributions are welcome! Please see the main [repository](https://github.com/gram-data/tree-sitter-gram) for contribution guidelines. -To improve syntax highlighting: -1. Edit `languages/gram/queries/highlights.scm` -2. Test with various Gram files -3. Submit a pull request +To improve syntax highlighting and editor behavior: + +1. Edit the canonical query files under `queries/` at the repo root (e.g. `queries/highlights.scm`). +2. The extension uses those via symlinks in `languages/gram/`; run `scripts/prepare-zed-extension.sh` if you need to copy them for distribution. +3. Test with various Gram files, then submit a pull request. ## License diff --git a/editors/zed/extension.toml b/editors/zed/extension.toml index 1fcf5c9..0f139f3 100644 --- a/editors/zed/extension.toml +++ b/editors/zed/extension.toml @@ -1,11 +1,11 @@ id = "gram" name = "Gram Language Support" -version = "0.3.3" +version = "0.3.4" schema_version = 1 authors = ["Gram Data Contributors"] -description = "Support for Gram notation - a subject-oriented notation for structured data" +description = "Support for Gram notation - composable data patterns" # path = "grammars/tree-sitter-gram" [grammars.gram] -repository = "https://github.com/gram-data/tree-sitter-gram" -rev = "78fba591ce4e3ca86ae77c871cfc9e87205c8e2b" +repository = "file:///Users/akollegger/Developer/gram-data/tree-sitter-gram" +rev = "dee0e7a4c3b47dc6933fb5cd5f811016e26069bf" diff --git a/editors/zed/languages/gram/highlights.scm b/editors/zed/languages/gram/highlights.scm deleted file mode 100644 index 7ea9914..0000000 --- a/editors/zed/languages/gram/highlights.scm +++ /dev/null @@ -1,70 +0,0 @@ -; Strings -(string_literal) @string -(string_content) @string - -; Numbers -(integer) @number -(decimal) @number 
-(hexadecimal) @number -(octal) @number -(measurement) @number - -; Boolean literals -(boolean_literal) @boolean - -; Symbols and identifiers -(symbol) @variable - -; Keywords and operators -[ - "@" - "|" - ":" - "::" -] @operator - -; Subject Patterns and delimiters -[ - "[" - "]" - "(" - ")" - "{" - "}" -] @punctuation.bracket - -; Comma separator -[ - "," -] @punctuation.delimiter - -; Field names in records and maps -(record_property key: (symbol) @property) -(record_property key: (string_literal) @property) -(record_property key: (integer) @property) -(map_entry key: (symbol) @property) -(map_entry key: (string_literal) @property) -(map_entry key: (integer) @property) - -; Annotation keys (property-style) and headers (identified/label-style) -(property_annotation key: (symbol) @attribute) -(identified_annotation identifier: (_) @attribute) -(identified_annotation labels: (_) @attribute) - -; Subject Pattern notation (special highlighting) -(subject_pattern) @type - -; Node with labels -(node_pattern (labels (symbol) @type)) - -; Relationship arrows (special highlighting for graph syntax) -(relationship_pattern) @keyword - -; Arrow operators in relationships -(right_arrow) @operator -(left_arrow) @operator -(undirected_arrow) @operator -(bidirectional_arrow) @operator - -; Error highlighting -(ERROR) @error diff --git a/editors/zed/languages/gram/highlights.scm b/editors/zed/languages/gram/highlights.scm new file mode 120000 index 0000000..d789d89 --- /dev/null +++ b/editors/zed/languages/gram/highlights.scm @@ -0,0 +1 @@ +../../../../queries/highlights.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/indents.scm b/editors/zed/languages/gram/indents.scm new file mode 120000 index 0000000..e6873ff --- /dev/null +++ b/editors/zed/languages/gram/indents.scm @@ -0,0 +1 @@ +../../../../queries/indents.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/injections.scm b/editors/zed/languages/gram/injections.scm deleted file 
mode 100644 index cf688d2..0000000 --- a/editors/zed/languages/gram/injections.scm +++ /dev/null @@ -1,29 +0,0 @@ -; Language injection for tagged strings: tag`content` and ```tag\ncontent\n``` -; -; The tag symbol is used as the injection language so that downstream and editors -; can support arbitrary tags without changing the grammar. Well-known tags (md, ts, -; date, datetime, time, sql, json, html, etc.) are documented in docs/tagged-strings-and-injections.md. -; -; Overrides below map tags that do not match common parser names. The final -; rule uses the tag's text as the language name for all other tags (e.g. "sql", -; "json", "html" often match parser names). - -; md -> markdown -(tagged_string - tag: (symbol) @_tag - content: (string_content) @injection.content) -(#eq? @_tag "md") -(#set! injection.language "markdown") - -; ts -> typescript -(tagged_string - tag: (symbol) @_tag - content: (string_content) @injection.content) -(#eq? @_tag "ts") -(#set! injection.language "typescript") - -; Dynamic: use tag text as language name for all other tags (sql, json, html, etc.) -; Editors may map additional tags (e.g. date, datetime, time) to parsers or leave as plain. 
-(tagged_string - tag: (symbol) @injection.language - content: (string_content) @injection.content) diff --git a/editors/zed/languages/gram/injections.scm b/editors/zed/languages/gram/injections.scm new file mode 120000 index 0000000..ff019fe --- /dev/null +++ b/editors/zed/languages/gram/injections.scm @@ -0,0 +1 @@ +../../../../queries/injections.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/locals.scm b/editors/zed/languages/gram/locals.scm new file mode 120000 index 0000000..d68c805 --- /dev/null +++ b/editors/zed/languages/gram/locals.scm @@ -0,0 +1 @@ +../../../../queries/locals.scm \ No newline at end of file diff --git a/editors/zed/zed b/editors/zed/zed deleted file mode 120000 index fdd0971..0000000 --- a/editors/zed/zed +++ /dev/null @@ -1 +0,0 @@ -/Users/akollegger/Developer/gram-data/tree-sitter-gram/editors/zed \ No newline at end of file diff --git a/package-lock.json b/package-lock.json index e508dd8..4726649 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "@gram-data/tree-sitter-gram", - "version": "0.3.3", + "version": "0.3.4", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "@gram-data/tree-sitter-gram", - "version": "0.3.3", + "version": "0.3.4", "hasInstallScript": true, "license": "ISC", "dependencies": { diff --git a/package.json b/package.json index a33f160..578d39f 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@gram-data/tree-sitter-gram", - "version": "0.3.3", + "version": "0.3.4", "description": "subject-oriented notation for structured data", "homepage": "https://gram-data.github.io", "repository": { diff --git a/pyproject.toml b/pyproject.toml index 9811253..4d83e77 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta" [project] name = "tree-sitter-gram" description = "Gram grammar for tree-sitter" -version = "0.3.3" +version = "0.3.4" keywords = ["incremental", "parsing", 
"tree-sitter", "gram"] classifiers = [ "Intended Audience :: Developers", diff --git a/queries/highlights.scm b/queries/highlights.scm index 7ea9914..73dd182 100644 --- a/queries/highlights.scm +++ b/queries/highlights.scm @@ -12,7 +12,38 @@ ; Boolean literals (boolean_literal) @boolean -; Symbols and identifiers +; Comment (FR-003) +(comment) @comment + +; Tagged-string tag distinct from content (FR-002) +(tagged_string tag: (symbol) @attribute) + +; Reference identifier: pattern_reference (FR-001) +(pattern_reference identifier: (_) @variable) + +; Definition-like identifiers (FR-001): @type +; subject/node subject is _subject (use wildcard _ as it may be hidden in some runtimes) +(subject_pattern subject: (_ identifier: (_) @type)) +(subject_pattern subject: (_ labels: (labels (symbol) @type))) +(node_pattern subject: (_ identifier: (_) @type)) +(node_pattern subject: (_ labels: (labels (symbol) @type))) +(relationship_pattern left: (node_pattern subject: (_ identifier: (_) @type))) +(relationship_pattern left: (node_pattern subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern right: (node_pattern subject: (_ identifier: (_) @type))) +(relationship_pattern right: (node_pattern subject: (_ labels: (labels (symbol) @type)))) +; Arrow kind: subject is inside optional brackets on the arrow +(relationship_pattern kind: (right_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (right_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (left_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (left_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (undirected_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (undirected_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ labels: 
(labels (symbol) @type)))) +(identified_annotation identifier: (_) @type) +(identified_annotation labels: (labels (symbol) @type)) + +; Symbols and identifiers (generic; definition/reference/tag captured above) (symbol) @variable ; Keywords and operators @@ -48,14 +79,6 @@ ; Annotation keys (property-style) and headers (identified/label-style) (property_annotation key: (symbol) @attribute) -(identified_annotation identifier: (_) @attribute) -(identified_annotation labels: (_) @attribute) - -; Subject Pattern notation (special highlighting) -(subject_pattern) @type - -; Node with labels -(node_pattern (labels (symbol) @type)) ; Relationship arrows (special highlighting for graph syntax) (relationship_pattern) @keyword diff --git a/queries/indents.scm b/queries/indents.scm new file mode 100644 index 0000000..6056020 --- /dev/null +++ b/queries/indents.scm @@ -0,0 +1,14 @@ +; Indentation: 2 spaces per level for brackets and multi-line structures +; FR-007 — specs/004-editor-improvements/contracts/indents.md + +; Record {} +"{" @indent +"}" @indent.end + +; Subject pattern [] +"[" @indent +"]" @indent.end + +; Node pattern () +"(" @indent +")" @indent.end diff --git a/queries/locals.scm b/queries/locals.scm new file mode 100644 index 0000000..197faa6 --- /dev/null +++ b/queries/locals.scm @@ -0,0 +1,28 @@ +; Locals: go to definition and highlight references (file scope) +; FR-005, FR-006 — specs/004-editor-improvements/contracts/locals.md + +; File scope: all definitions and references in one scope +(gram_pattern) @local.scope + +; Definitions: identifiers that define a pattern or annotation +(subject_pattern subject: (_ identifier: (_) @local.definition)) +(subject_pattern subject: (_ labels: (labels (symbol) @local.definition))) +(node_pattern subject: (_ identifier: (_) @local.definition)) +(node_pattern subject: (_ labels: (labels (symbol) @local.definition))) +(relationship_pattern left: (node_pattern subject: (_ identifier: (_) @local.definition))) 
+(relationship_pattern left: (node_pattern subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern right: (node_pattern subject: (_ identifier: (_) @local.definition))) +(relationship_pattern right: (node_pattern subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (right_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (right_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (left_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (left_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (undirected_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (undirected_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(identified_annotation identifier: (_) @local.definition) +(identified_annotation labels: (labels (symbol) @local.definition)) + +; References: pattern_reference identifier +(pattern_reference identifier: (_) @local.reference) diff --git a/scripts/prepare-zed-extension.sh b/scripts/prepare-zed-extension.sh index 52cfc37..8186fee 100755 --- a/scripts/prepare-zed-extension.sh +++ b/scripts/prepare-zed-extension.sh @@ -23,6 +23,9 @@ if ! command -v tree-sitter &> /dev/null; then exit 1 fi +# Configure repository mode (dev|pub) — used for sync and for extension.toml +ZED_REPO_MODE="${ZED_REPO_MODE:-dev}" + # Generate the latest parser echo "📦 Regenerating parser..." cd "$PROJECT_ROOT" @@ -43,16 +46,24 @@ if [ ! -d "$QUERIES_SRC" ] || [ ! 
-f "$QUERIES_SRC/highlights.scm" ]; then fi mkdir -p "$QUERIES_DST" +rm -f "$QUERIES_DST"/*.scm cp "$QUERIES_SRC"/*.scm "$QUERIES_DST"/ +# In dev mode, restore symlinks so the repo keeps a single source of truth (queries/) +if [ "$ZED_REPO_MODE" != "pub" ] && [ "$ZED_REPO_MODE" != "publish" ]; then + echo "🔗 Restoring symlinks in languages/gram/ (dev mode)" + rm -f "$QUERIES_DST"/*.scm + for f in highlights indents locals injections; do + ln -sf "../../../../queries/$f.scm" "$QUERIES_DST/$f.scm" + done +fi + # Update version in extension.toml to match package.json PACKAGE_VERSION=$(node -p "require('$PROJECT_ROOT/package.json').version") echo "🔄 Updating extension version to $PACKAGE_VERSION..." -# Configure repository mode (dev|pub) -ZED_REPO_MODE="${ZED_REPO_MODE:-dev}" COMMIT_SHA=$(git -C "$PROJECT_ROOT" rev-parse HEAD) if [ "$ZED_REPO_MODE" = "pub" ] || [ "$ZED_REPO_MODE" = "publish" ]; then @@ -87,22 +98,18 @@ required_files=( "$ZED_EXTENSION_DIR/extension.toml" "$ZED_EXTENSION_DIR/languages/gram/config.toml" "$ZED_EXTENSION_DIR/languages/gram/highlights.scm" + "$ZED_EXTENSION_DIR/languages/gram/indents.scm" + "$ZED_EXTENSION_DIR/languages/gram/locals.scm" + "$ZED_EXTENSION_DIR/languages/gram/injections.scm" ) for file in "${required_files[@]}"; do - if [ ! -f "$file" ]; then + if [ ! -e "$file" ]; then echo "❌ Error: Missing required file: $file" exit 1 fi done -# Explicit check for highlights.scm copied from queries/ -HIGHLIGHTS_FILE="$ZED_EXTENSION_DIR/languages/gram/highlights.scm" -if [ ! -f "$HIGHLIGHTS_FILE" ]; then - echo "❌ Error: Missing required file: $HIGHLIGHTS_FILE" - exit 1 -fi - echo "✅ Extension validation passed!" 
# Test the extension by creating a test file diff --git a/specs/004-editor-improvements/tasks.md b/specs/004-editor-improvements/tasks.md index 64850de..c577ae4 100644 --- a/specs/004-editor-improvements/tasks.md +++ b/specs/004-editor-improvements/tasks.md @@ -25,7 +25,7 @@ **Purpose**: Ensure repository is ready for query and doc work -- [ ] T001 Verify `queries/` directory exists and `scripts/prepare-zed-extension.sh` copies `queries/*.scm` into `editors/zed/languages/gram/` per plan.md +- [X] T001 Verify `queries/` directory exists and `scripts/prepare-zed-extension.sh` copies `queries/*.scm` into `editors/zed/languages/gram/` per plan.md --- @@ -35,7 +35,7 @@ **⚠️ CRITICAL**: No user story work should begin until this phase is complete -- [ ] T002 Run `npx tree-sitter generate`, `npx tree-sitter test`, and `npm test` at repo root to confirm baseline; fix any failing tests before editing queries +- [X] T002 Run `npx tree-sitter generate`, `npx tree-sitter test`, and `npm test` at repo root to confirm baseline; fix any failing tests before editing queries **Checkpoint**: Foundation ready — user story implementation can proceed @@ -49,10 +49,10 @@ ### Implementation for User Story 1 -- [ ] T003 [P] [US1] In `queries/highlights.scm` add captures for definition-like identifiers (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation identifier/labels) using @type or equivalent per specs/004-editor-improvements/contracts/highlights.md -- [ ] T004 [P] [US1] In `queries/highlights.scm` add capture for pattern_reference identifier as @variable (or @variable.reference) and ensure it does not conflict with definition captures per contracts/highlights.md -- [ ] T005 [P] [US1] In `queries/highlights.scm` add capture for tagged_string tag (symbol) as @attribute or equivalent so the tag is distinct from string content per FR-002 and contracts/highlights.md -- [ ] T006 [P] [US1] In `queries/highlights.scm` add (comment) @comment 
for comment highlighting per FR-003 and contracts/highlights.md +- [X] T003 [P] [US1] In `queries/highlights.scm` add captures for definition-like identifiers (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation identifier/labels) using @type or equivalent per specs/004-editor-improvements/contracts/highlights.md +- [X] T004 [P] [US1] In `queries/highlights.scm` add capture for pattern_reference identifier as @variable (or @variable.reference) and ensure it does not conflict with definition captures per contracts/highlights.md +- [X] T005 [P] [US1] In `queries/highlights.scm` add capture for tagged_string tag (symbol) as @attribute or equivalent so the tag is distinct from string content per FR-002 and contracts/highlights.md +- [X] T006 [P] [US1] In `queries/highlights.scm` add (comment) @comment for comment highlighting per FR-003 and contracts/highlights.md **Checkpoint**: User Story 1 complete — open a .gram file and verify highlight differentiation and comment highlighting @@ -66,7 +66,7 @@ ### Implementation for User Story 2 -- [ ] T007 [US2] Create `queries/locals.scm` with @local.definition (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation), @local.reference (pattern_reference), and optional @local.scope (gram_pattern) for file-wide scope per specs/004-editor-improvements/contracts/locals.md +- [X] T007 [US2] Create `queries/locals.scm` with @local.definition (subject in subject_pattern, node_pattern, relationship_pattern left/right/kind, identified_annotation), @local.reference (pattern_reference), and optional @local.scope (gram_pattern) for file-wide scope per specs/004-editor-improvements/contracts/locals.md **Checkpoint**: User Story 2 complete — verify go-to-definition and highlight-references in a supporting editor @@ -80,7 +80,7 @@ ### Implementation for User Story 3 -- [ ] T008 [US3] Create `queries/indents.scm` with indent captures for record 
`{}`, subject_pattern `[]`, node_pattern `()`, and 2-space indent per level per specs/004-editor-improvements/contracts/indents.md +- [X] T008 [US3] Create `queries/indents.scm` with indent captures for record `{}`, subject_pattern `[]`, node_pattern `()`, and 2-space indent per level per specs/004-editor-improvements/contracts/indents.md **Checkpoint**: User Story 3 complete — verify indentation in an editor that uses tree-sitter indents @@ -94,7 +94,7 @@ ### Implementation for User Story 4 -- [ ] T009 [US4] Verify and if needed update `queries/injections.scm` so overrides for md→markdown and ts→typescript appear before the dynamic rule, and both inline and fenced tagged_string forms are covered per specs/004-editor-improvements/contracts/injections.md +- [X] T009 [US4] Verify and if needed update `queries/injections.scm` so overrides for md→markdown and ts→typescript appear before the dynamic rule, and both inline and fenced tagged_string forms are covered per specs/004-editor-improvements/contracts/injections.md **Checkpoint**: User Story 4 complete — verify injection in an editor that supports tree-sitter injections @@ -108,7 +108,7 @@ ### Implementation for User Story 5 -- [ ] T010 [US5] Extend or verify `docs/tagged-strings-and-injections.md` with well-known tags convention table and `::` type/schema convention per specs/004-editor-improvements/contracts/documentation.md +- [X] T010 [US5] Extend or verify `docs/tagged-strings-and-injections.md` with well-known tags convention table and `::` type/schema convention per specs/004-editor-improvements/contracts/documentation.md **Checkpoint**: User Story 5 complete — documentation is sufficient for downstream and editor authors @@ -118,9 +118,9 @@ **Purpose**: Sync Zed, validate quickstart, and cross-links -- [ ] T011 Run `scripts/prepare-zed-extension.sh` from repo root to sync all `queries/*.scm` into `editors/zed/languages/gram/` and confirm script passes -- [ ] T012 Run quickstart validation: `npx tree-sitter 
test`, `npm test`, then manual verification steps in specs/004-editor-improvements/quickstart.md -- [ ] T013 [P] Add cross-links to `docs/tagged-strings-and-injections.md` from `docs/gram-reference` or `docs/gram-ebnf` if those files exist and reference tagged strings or schema +- [X] T011 Run `scripts/prepare-zed-extension.sh` from repo root to sync all `queries/*.scm` into `editors/zed/languages/gram/` and confirm script passes +- [X] T012 Run quickstart validation: `npx tree-sitter test`, `npm test`, then manual verification steps in specs/004-editor-improvements/quickstart.md +- [X] T013 [P] Add cross-links to `docs/tagged-strings-and-injections.md` from `docs/gram-reference` or `docs/gram-ebnf` if those files exist and reference tagged strings or schema --- diff --git a/src/parser.c b/src/parser.c index e2dd528..0fc16e0 100644 --- a/src/parser.c +++ b/src/parser.c @@ -4582,7 +4582,7 @@ TS_PUBLIC const TSLanguage *tree_sitter_gram(void) { .metadata = { .major_version = 0, .minor_version = 3, - .patch_version = 3, + .patch_version = 4, }, }; return &language; diff --git a/tree-sitter.json b/tree-sitter.json index 074eb16..7a2becd 100644 --- a/tree-sitter.json +++ b/tree-sitter.json @@ -11,7 +11,7 @@ } ], "metadata": { - "version": "0.3.3", + "version": "0.3.4", "license": "ISC", "description": "subject-oriented notation for structured data", "links": { From d4e8355ac86f00e1ba40d6b600d11781209a5f6a Mon Sep 17 00:00:00 2001 From: Andreas Kollegger Date: Wed, 11 Mar 2026 10:57:35 +0000 Subject: [PATCH 4/5] editor-improve: (chore) fix ci build --- .github/workflows/build.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index c88edc0..a79fded 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -167,7 +167,7 @@ jobs: python-version: "3.11" - name: Build wheels run: | - python -m pip install cibuildwheel==2.16.2 + python -m pip install cibuildwheel==3.2.1 python -m 
cibuildwheel --output-dir dist env: CIBW_ARCHS: native From 518b65945c079cb5d209a557b983aaa3e5afdfbb Mon Sep 17 00:00:00 2001 From: Andreas Kollegger Date: Wed, 11 Mar 2026 11:11:50 +0000 Subject: [PATCH 5/5] publish: zed editor 0.3.4 --- editors/zed/extension.toml | 4 +- editors/zed/languages/gram/highlights.scm | 94 ++++++++++++++++++++++- editors/zed/languages/gram/indents.scm | 15 +++- editors/zed/languages/gram/injections.scm | 30 +++++++- editors/zed/languages/gram/locals.scm | 29 ++++++- 5 files changed, 166 insertions(+), 6 deletions(-) mode change 120000 => 100644 editors/zed/languages/gram/highlights.scm mode change 120000 => 100644 editors/zed/languages/gram/indents.scm mode change 120000 => 100644 editors/zed/languages/gram/injections.scm mode change 120000 => 100644 editors/zed/languages/gram/locals.scm diff --git a/editors/zed/extension.toml b/editors/zed/extension.toml index 0f139f3..9c532a0 100644 --- a/editors/zed/extension.toml +++ b/editors/zed/extension.toml @@ -7,5 +7,5 @@ description = "Support for Gram notation - composable data patterns" # path = "grammars/tree-sitter-gram" [grammars.gram] -repository = "file:///Users/akollegger/Developer/gram-data/tree-sitter-gram" -rev = "dee0e7a4c3b47dc6933fb5cd5f811016e26069bf" +repository = "https://github.com/gram-data/tree-sitter-gram" +rev = "7aee4c203a5c6ea48660ae6d8af849ed90317bed" diff --git a/editors/zed/languages/gram/highlights.scm b/editors/zed/languages/gram/highlights.scm deleted file mode 120000 index d789d89..0000000 --- a/editors/zed/languages/gram/highlights.scm +++ /dev/null @@ -1 +0,0 @@ -../../../../queries/highlights.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/highlights.scm b/editors/zed/languages/gram/highlights.scm new file mode 100644 index 0000000..73dd182 --- /dev/null +++ b/editors/zed/languages/gram/highlights.scm @@ -0,0 +1,93 @@ +; Strings +(string_literal) @string +(string_content) @string + +; Numbers +(integer) @number +(decimal) @number 
+(hexadecimal) @number +(octal) @number +(measurement) @number + +; Boolean literals +(boolean_literal) @boolean + +; Comment (FR-003) +(comment) @comment + +; Tagged-string tag distinct from content (FR-002) +(tagged_string tag: (symbol) @attribute) + +; Reference identifier: pattern_reference (FR-001) +(pattern_reference identifier: (_) @variable) + +; Definition-like identifiers (FR-001): @type +; subject/node subject is _subject (use wildcard _ as it may be hidden in some runtimes) +(subject_pattern subject: (_ identifier: (_) @type)) +(subject_pattern subject: (_ labels: (labels (symbol) @type))) +(node_pattern subject: (_ identifier: (_) @type)) +(node_pattern subject: (_ labels: (labels (symbol) @type))) +(relationship_pattern left: (node_pattern subject: (_ identifier: (_) @type))) +(relationship_pattern left: (node_pattern subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern right: (node_pattern subject: (_ identifier: (_) @type))) +(relationship_pattern right: (node_pattern subject: (_ labels: (labels (symbol) @type)))) +; Arrow kind: subject is inside optional brackets on the arrow +(relationship_pattern kind: (right_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (right_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (left_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (left_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (undirected_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (undirected_arrow subject: (_ labels: (labels (symbol) @type)))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ identifier: (_) @type))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ labels: (labels (symbol) @type)))) +(identified_annotation identifier: (_) @type) +(identified_annotation labels: (labels (symbol) @type)) + +; Symbols and identifiers (generic; definition/reference/tag captured 
above) +(symbol) @variable + +; Keywords and operators +[ + "@" + "|" + ":" + "::" +] @operator + +; Subject Patterns and delimiters +[ + "[" + "]" + "(" + ")" + "{" + "}" +] @punctuation.bracket + +; Comma separator +[ + "," +] @punctuation.delimiter + +; Field names in records and maps +(record_property key: (symbol) @property) +(record_property key: (string_literal) @property) +(record_property key: (integer) @property) +(map_entry key: (symbol) @property) +(map_entry key: (string_literal) @property) +(map_entry key: (integer) @property) + +; Annotation keys (property-style) and headers (identified/label-style) +(property_annotation key: (symbol) @attribute) + +; Relationship arrows (special highlighting for graph syntax) +(relationship_pattern) @keyword + +; Arrow operators in relationships +(right_arrow) @operator +(left_arrow) @operator +(undirected_arrow) @operator +(bidirectional_arrow) @operator + +; Error highlighting +(ERROR) @error diff --git a/editors/zed/languages/gram/indents.scm b/editors/zed/languages/gram/indents.scm deleted file mode 120000 index e6873ff..0000000 --- a/editors/zed/languages/gram/indents.scm +++ /dev/null @@ -1 +0,0 @@ -../../../../queries/indents.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/indents.scm b/editors/zed/languages/gram/indents.scm new file mode 100644 index 0000000..6056020 --- /dev/null +++ b/editors/zed/languages/gram/indents.scm @@ -0,0 +1,14 @@ +; Indentation: 2 spaces per level for brackets and multi-line structures +; FR-007 — specs/004-editor-improvements/contracts/indents.md + +; Record {} +"{" @indent +"}" @indent.end + +; Subject pattern [] +"[" @indent +"]" @indent.end + +; Node pattern () +"(" @indent +")" @indent.end diff --git a/editors/zed/languages/gram/injections.scm b/editors/zed/languages/gram/injections.scm deleted file mode 120000 index ff019fe..0000000 --- a/editors/zed/languages/gram/injections.scm +++ /dev/null @@ -1 +0,0 @@ -../../../../queries/injections.scm \ No 
newline at end of file diff --git a/editors/zed/languages/gram/injections.scm b/editors/zed/languages/gram/injections.scm new file mode 100644 index 0000000..cf688d2 --- /dev/null +++ b/editors/zed/languages/gram/injections.scm @@ -0,0 +1,29 @@ +; Language injection for tagged strings: tag`content` and ```tag\ncontent\n``` +; +; The tag symbol is used as the injection language so that downstream tools and editors +; can support arbitrary tags without changing the grammar. Well-known tags (md, ts, +; date, datetime, time, sql, json, html, etc.) are documented in docs/tagged-strings-and-injections.md. +; +; Overrides below map tags that do not match common parser names. The final +; rule uses the tag's text as the language name for all other tags (e.g. "sql", +; "json", "html" often match parser names). + +; md -> markdown +((tagged_string + tag: (symbol) @_tag + content: (string_content) @injection.content) +(#eq? @_tag "md") +(#set! injection.language "markdown")) + +; ts -> typescript +((tagged_string + tag: (symbol) @_tag + content: (string_content) @injection.content) +(#eq? @_tag "ts") +(#set! injection.language "typescript")) + +; Dynamic: use tag text as language name for all other tags (sql, json, html, etc.) +; Editors may map additional tags (e.g. date, datetime, time) to parsers or leave as plain.
+(tagged_string + tag: (symbol) @injection.language + content: (string_content) @injection.content) diff --git a/editors/zed/languages/gram/locals.scm b/editors/zed/languages/gram/locals.scm deleted file mode 120000 index d68c805..0000000 --- a/editors/zed/languages/gram/locals.scm +++ /dev/null @@ -1 +0,0 @@ -../../../../queries/locals.scm \ No newline at end of file diff --git a/editors/zed/languages/gram/locals.scm b/editors/zed/languages/gram/locals.scm new file mode 100644 index 0000000..197faa6 --- /dev/null +++ b/editors/zed/languages/gram/locals.scm @@ -0,0 +1,28 @@ +; Locals: go to definition and highlight references (file scope) +; FR-005, FR-006 — specs/004-editor-improvements/contracts/locals.md + +; File scope: all definitions and references in one scope +(gram_pattern) @local.scope + +; Definitions: identifiers that define a pattern or annotation +(subject_pattern subject: (_ identifier: (_) @local.definition)) +(subject_pattern subject: (_ labels: (labels (symbol) @local.definition))) +(node_pattern subject: (_ identifier: (_) @local.definition)) +(node_pattern subject: (_ labels: (labels (symbol) @local.definition))) +(relationship_pattern left: (node_pattern subject: (_ identifier: (_) @local.definition))) +(relationship_pattern left: (node_pattern subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern right: (node_pattern subject: (_ identifier: (_) @local.definition))) +(relationship_pattern right: (node_pattern subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (right_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (right_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (left_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (left_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (undirected_arrow subject: (_ identifier: (_) 
@local.definition))) +(relationship_pattern kind: (undirected_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ identifier: (_) @local.definition))) +(relationship_pattern kind: (bidirectional_arrow subject: (_ labels: (labels (symbol) @local.definition)))) +(identified_annotation identifier: (_) @local.definition) +(identified_annotation labels: (labels (symbol) @local.definition)) + +; References: pattern_reference identifier +(pattern_reference identifier: (_) @local.reference)