Fix HTML logger crash on invalid XML chars in test names #16052
Draft
nohwnd wants to merge 2 commits into
When a test's DisplayName (e.g. from a DataRow attribute) contains XML 1.0 invalid control characters such as 0x01-0x08, 0x0B, 0x0C, 0x0E-0x1F, DataContractSerializer throws XmlException and silently prevents the HTML report from being generated.
Apply the same sanitization pattern already used by TrxLogger's XmlPersistence to replace invalid XML characters with their Unicode escape representation (e.g. \u0001) before they are stored in the HTML logger object model.
Fixes microsoft#10431
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rs correctly
- Use a static readonly compiled Regex instead of re-creating it on every call
- Exclude the surrogate range from the negated char class in the first alternative so valid surrogate pairs are not matched; add explicit lone-surrogate alternatives with lookahead/lookbehind to catch only invalid lone surrogates
- Add test verifying emoji (valid surrogate pair) passes through unchanged
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
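The regex design described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `XmlTextSanitizer`, the method name `Sanitize`, and the exact pattern are assumptions reconstructed from the description (negated class that excludes surrogates, plus two lone-surrogate alternatives), and the lowercase `\uXXXX` replacement format is a guess based on the expected values in the unit test.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; names and pattern are illustrative, not taken from the PR.
public static class XmlTextSanitizer
{
    // Built once and reused, as the commit message suggests.
    // Alternative 1: any UTF-16 code unit outside the XML 1.0 ranges, with the
    //                surrogate block excluded so valid pairs are left alone.
    // Alternative 2: a high surrogate that is NOT followed by a low surrogate.
    // Alternative 3: a low surrogate that is NOT preceded by a high surrogate.
    private static readonly Regex InvalidXmlChar = new Regex(
        @"[^\x09\x0A\x0D\x20-\uD7FF\uD800-\uDFFF\uE000-\uFFFD]" +
        @"|[\uD800-\uDBFF](?![\uDC00-\uDFFF])" +
        @"|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Replaces each invalid code unit with a literal backslash-u escape such as \u0001.
    public static string Sanitize(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        return InvalidXmlChar.Replace(
            text,
            match => $@"\u{(ushort)match.Value[0]:x4}");
    }
}
```

Under these assumptions, `XmlTextSanitizer.Sanitize("TestMethod(\u0001value)")` yields the literal text `TestMethod(\u0001value)`, while a string containing an emoji encoded as a valid surrogate pair is returned unchanged.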
Contributor
Pull request overview
This PR fixes a failure mode in the HTML logger where DataContractSerializer can throw XmlException (preventing HTML report generation) when test names or failure details include XML 1.0–invalid control characters. It introduces XML-character sanitization (aligned with the TRX logger’s approach) before persisting the HTML logger object model.
Changes:
- Added XML 1.0 invalid-character detection (with surrogate-pair preservation) and sanitization to `HtmlLogger`.
- Applied sanitization to `DisplayName`, `FullyQualifiedName`, `ErrorMessage`, and `ErrorStackTrace` in `TestResultHandler`.
- Added unit tests covering invalid control character replacement and preservation of valid surrogate pairs.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Microsoft.TestPlatform.Extensions.HtmlLogger/HtmlLogger.cs | Sanitizes XML-invalid characters before serializing test result fields; adds regex + helper. |
| test/Microsoft.TestPlatform.Extensions.HtmlLogger.UnitTests/HtmlLoggerTests.cs | Adds unit tests for invalid XML control characters and surrogate-pair preservation. |
Comment on lines +251 to +273
    [TestMethod]
    public void TestResultHandlerShouldSanitizeInvalidXmlCharsInDisplayName()
    {
        // Characters like \x01 (SOH) are invalid in XML 1.0 and would cause DataContractSerializer to throw.
        var testCase = CreateTestCase("Pass1");
        testCase.FullyQualifiedName = "fully";
        testCase.Source = "abc/def.dll";

        var testResult = new ObjectModel.TestResult(testCase)
        {
            DisplayName = "TestMethod(\x01value)",
            ErrorMessage = "error\x02message",
            ErrorStackTrace = "stack\x03trace",
        };

        _htmlLogger.TestResultHandler(new object(), new Mock<TestResultEventArgs>(testResult).Object);

        var result = _htmlLogger.TestRunDetails!.ResultCollectionList!.First().ResultList!.First();

        Assert.AreEqual(@"TestMethod(\u0001value)", result.DisplayName);
        Assert.AreEqual(@"error\u0002message", result.ErrorMessage);
        Assert.AreEqual(@"stack\u0003trace", result.ErrorStackTrace);
    }
Comment on lines +471 to +474
    /// XML 1.0 valid characters: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD].
    /// Control characters in the range #x00-#x08, #x0B, #x0C, #x0E-#x1F are not valid and
    /// will cause <see cref="DataContractSerializer"/> to throw an <see cref="System.Xml.XmlException"/>.
    /// Invalid characters are replaced with their Unicode escape representation.
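The ranges listed in the doc comment above can also be read as a plain predicate over UTF-16 code units. The sketch below is only an illustration of those ranges; the class `Xml10Chars` and both method names are hypothetical and do not appear in the PR. It treats a high surrogate immediately followed by a low surrogate as valid, and any other surrogate as invalid.

```csharp
using System;

// Hypothetical helper for illustration; not part of the PR.
public static class Xml10Chars
{
    // Mirrors the quoted ranges: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD].
    public static bool IsValidNonSurrogate(char c) =>
        c == '\t' || c == '\n' || c == '\r'
        || (c >= '\u0020' && c <= '\uD7FF')
        || (c >= '\uE000' && c <= '\uFFFD');

    // Returns the index of the first invalid UTF-16 code unit, or -1 if none is found.
    public static int IndexOfFirstInvalid(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];

            // A high surrogate followed by a low surrogate is a valid pair: skip both halves.
            if (char.IsHighSurrogate(c) && i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
            {
                i++;
                continue;
            }

            // Everything else, including lone surrogates, must fall in the listed ranges.
            if (!IsValidNonSurrogate(c))
            {
                return i;
            }
        }

        return -1;
    }
}
```

A check like this could be used to decide whether a string needs sanitizing at all before any replacement work is done, although whether that is worthwhile depends on how often invalid characters actually occur.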
This was referenced May 23, 2026
Fix #10431
When test display names contain XML 1.0 invalid control characters (`\x01`–`\x08`, `\x0B`, `\x0C`, `\x0E`–`\x1F`), `DataContractSerializer` throws an `XmlException` and the HTML report is silently broken: test entries with those characters are missing from the output.
Sanitize those characters before storing them in the HTML logger object model, the same approach `TrxLogger` already uses. Invalid chars are replaced with their `\uXXXX` escape representation. Valid surrogate pairs (emoji etc.) pass through unchanged.
Applied to `DisplayName`, `FullyQualifiedName`, `ErrorStackTrace`, and `ErrorMessage` in `TestResultHandler`.
Before (screenshot): HTML report with control chars in test names; test entries are missing and only `Test(normal)` shows up.
After (screenshot): control chars sanitized to `\u0001`, all three tests visible.