Add support for writing HTML literals using UTF8 strings #12848
DamianEdwards wants to merge 6 commits into main
Implements the @utf8HtmlLiterals directive (with boolean token) that when
enabled causes the Razor compiler to emit HTML literal blocks as C# UTF-8
string literals ("..."u8) instead of regular string literals.
This allows the page's base class to provide a WriteLiteral(ReadOnlySpan<byte>)
overload that writes pre-encoded UTF-8 bytes directly to the output, avoiding
runtime UTF-16 to UTF-8 encoding and associated memory allocations.
Key changes:
- Add WriteHtmlUtf8StringLiterals flag to RazorCodeGenerationOptions
- Add Utf8HtmlLiteralsDirective and Utf8HtmlLiteralsDirectivePass
- Register directive for Legacy (.cshtml) files, gated on Version_11_0
- Modify CodeWriterExtensions to append u8 suffix when flag is set
- Modify RuntimeNodeWriter to pass flag from options to code writer
- Use documentNode.Options in lowering phase (respects directive passes)
- Relax directive keyword validation to allow digits (not just letters)
Fixes #8429
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
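To make the description above concrete, here is a minimal sketch of a page base class that offers both overloads. The names Utf8PageBase and HelloPage, and the choice of a Stream as the output, are hypothetical and used only for illustration; they are not taken from this PR, from MVC, or from Razor Slices. The Execute body is hand-written to resemble what generated code might look like under this feature, not actual compiler output.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical base class (illustration only, not part of this PR).
public abstract class Utf8PageBase
{
    private readonly Stream _output;

    protected Utf8PageBase(Stream output) => _output = output;

    // Existing path: the UTF-16 string is encoded to UTF-8 at run time,
    // which allocates a temporary byte array in this simple version.
    public void WriteLiteral(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        _output.Write(bytes, 0, bytes.Length);
    }

    // UTF-8 path: the bytes were already encoded at compile time,
    // so they are copied to the output without any transcoding.
    public void WriteLiteral(ReadOnlySpan<byte> value) => _output.Write(value);
}

// Hand-written stand-in for a generated page class.
public sealed class HelloPage : Utf8PageBase
{
    public HelloPage(Stream output) : base(output) { }

    public void Execute()
    {
        // A "..."u8 literal has type ReadOnlySpan<byte> (C# 11 and later),
        // so this call binds to the byte-span overload.
        WriteLiteral("<h1>Hello</h1>"u8);
    }
}
```

With only the string overload present, the same call would not compile, which is why the base class has to opt in by providing the byte-span overload.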
Since this is for .NET 11 anyway, seems like there's plenty of time to get the ROS overload into the runtime. Should also ideally detect whether such an overload exists or not, and error if not, so people can polyfill easily on older runtimes. Also should probably have an LDM about this :)
The intent wasn't to get an overload into the runtime, at least not at this time. While we certainly could do that, it would make it slower in that case, not faster, as it would then convert from UTF8 bytes to
For now, the goal here is to enable other .cshtml-based scenarios (i.e. non-MVC) to leverage this support and get the performance benefits, e.g. Razor Slices.
We don't do this for other directives when custom base classes are being used AFAIK, e.g. if I use
Didn't realize we discussed Razor compiler stuff there now, cool. LMK what the process is.
That at least answers my other (unasked) question about why this is .cshtml only.
Oh, I don't mean the C# LDM. There has been one Razor LDM meeting so far, and I was asleep at the time, but the plan is for there to at least be some committee that can sign off on things, I believe.
I know we don't, but IMO that is not a good thing, and something we should be better about in future. BUT this is also something we can discuss at LDM and see if anyone else agrees with me :)
Replace @utf8HtmlLiterals directive wiring with automatic detection based on whether the inherited base type exposes a callable WriteLiteral(ReadOnlySpan<byte>) overload. Update compiler/source-generator plumbing and tests accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
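The detection idea in this commit can be illustrated with a rough runtime analogue. This sketch uses System.Reflection instead of the compiler's symbol model, so it is not how the Razor compiler pass is implemented; the names Utf8WriteLiteralProbe, SampleBase, SamplePage and PlainPage are hypothetical. It also simplifies what "callable" means: only public, protected and protected internal methods count, and internal visibility and member hiding are ignored.

```csharp
using System;
using System.Reflection;

public static class Utf8WriteLiteralProbe
{
    // Returns true if `type` or one of its ancestors declares an instance method
    // WriteLiteral(ReadOnlySpan<byte>) that a derived class would be able to call.
    public static bool HasCallableUtf8WriteLiteral(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        // Walk the inheritance chain one declaring type at a time.
        for (Type? current = type; current is not null; current = current.BaseType)
        {
            MethodInfo? method = current.GetMethod(
                "WriteLiteral", flags, binder: null,
                types: new[] { typeof(ReadOnlySpan<byte>) }, modifiers: null);

            // Simplified accessibility check (assumption: derived type lives
            // in another assembly, so plain internal methods do not count).
            if (method is not null &&
                (method.IsPublic || method.IsFamily || method.IsFamilyOrAssembly))
            {
                return true;
            }
        }

        return false;
    }
}

// Hypothetical types used to exercise the probe.
public abstract class SampleBase
{
    protected void WriteLiteral(ReadOnlySpan<byte> value) { }
}

public class SamplePage : SampleBase { }

public class PlainPage
{
    public void WriteLiteral(string value) { }
}
```

Here HasCallableUtf8WriteLiteral(typeof(SamplePage)) finds the protected overload on the base type, while typeof(PlainPage) only offers the string overload and would stay on the existing string-literal path.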
Resolve post-conflict source generator breakage by removing the unresolved suppression call, aligning option provider tuple shape, restoring cshtml test execution against output compilation, and updating incremental step expectations for compilation-dependent options.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Implements #8429 by enabling UTF-8 HTML literal emission for legacy .cshtml code generation when the generated type's inherited base class has a callable:
When that overload is available to the generated type, Razor emits HTML literals as C# UTF-8 literals ("..."u8), allowing direct binding to the byte-span overload. If the overload is not available, generation remains the existing string-literal path.
Implementation
- Utf8WriteLiteralDetectionPass for legacy documents to detect UTF-8 WriteLiteral capability from the inherited base type.
- WriteLiteral(ReadOnlySpan<byte>) overload across the inheritance chain.
Behavior
Given:
and a base type containing:
generated HTML literal calls are emitted using UTF-8 string literals ("..."u8).
Tests