Skip to content

disable dtd and external entities in stax xml parsers#12251

Open
jmestwa-coder wants to merge 1 commit into
apache:masterfrom
jmestwa-coder:stax-xxe-hardening
Open

disable dtd and external entities in stax xml parsers#12251
jmestwa-coder wants to merge 1 commit into
apache:masterfrom
jmestwa-coder:stax-xxe-hardening

Conversation

@jmestwa-coder

@jmestwa-coder jmestwa-coder commented Jun 10, 2026

Copy link
Copy Markdown

XXE via external entity resolution in the StAX XML parsers

Maven's hand-written StAX readers build their parser with XMLInputFactory.newFactory(), and the Woodstox defaults resolve external entities. A pom.xml, META-INF/maven/extension.xml or plugin.xml read from a downloaded artifact can pull in file:// or http:// SYSTEM entities, which gives local file disclosure and SSRF while a build runs. SUPPORT_DTD and IS_SUPPORTING_EXTERNAL_ENTITIES are now turned off on every input factory used to read XML. None of Maven's own formats carry a DOCTYPE, so this only rejects hostile input.

The added test in ExtensionDescriptorBuilderTest leaks a temp file through an external entity in an extension descriptor; it fails on the current tree and passes with the factories hardened.

  • Your pull request should address just one issue, without pulling in other changes.

  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.

  • Each commit in the pull request should have a meaningful subject line and body.

  • Write unit tests that match behavioral changes, where the tests fail if the changes to the runtime are not applied.

  • Run mvn verify to make sure basic checks pass.

  • You have run the Core IT successfully.

  • I hereby declare this contribution to be licenced under the Apache License Version 2.0, January 2004

  • In any other case, please file an Apache Individual Contributor License Agreement.

hand-written XMLInputFactory readers resolved external entities, allowing xxe and ssrf from downloaded poms, extension and plugin descriptors

@gnodet gnodet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this — the XXE hardening is a good idea.

A few observations:

  1. Duplication: the PR copies an identical newInputFactory() helper into 5 separate classes. A single shared method would be easier to maintain.
  2. Missing coverage: the Velocity templates (reader-stax.vm, reader.vm) that generate StAX readers for settings.xml, toolchains.xml, plugin.xml, metadata.xml, etc. also create their own XMLInputFactory instances and are not hardened by this PR.
  3. Context: external entity / XInclude support was an intentional feature (MNG-5862) that was later extracted into maven-xinclude-extension. That extension creates its own factory with entities enabled under a restricted resolver, so it is unaffected by this hardening.

I've opened #12256 as an alternative that centralizes the factory creation into XmlService.newXMLInputFactory() and also covers the Velocity templates.

This review was generated by an AI agent and may contain inaccuracies. Please verify all suggestions before applying.

Claude Code on behalf of Guillaume Nodet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants