Picture of stu

Never untaint

  • Posted By Stuart Halloway on January 17, 2008

For the last few weeks, Aaron and I have been worrying about the state of XSS security. As we have said before, input sanitization is not enough, you need something like SafeERB for all external data.

But after looking at XSS Shield, and thinking a bit more, I have concerns about SafeERB's approach to safety. Here's the problem:

  • External data is marked as tainted. Great care should be taken in using tainted data in any potentially executable context.
  • SafeERB sanitizes strings against XSS, and then untaints the strings so that you know they are safe in an HTML context.

Notice the subtle asymmetry. External data is dangerous in any executable context. SafeERB untaints data after making it safe for only one specific context--HTML. What if your freshly untainted string contains SQL? Shell script code? Or some DSL you never heard of?

You could argue that the data is being marked untainted in the view layer of an HTML MVC framework, so we don't have to worry about non-HTML dangers. I don't buy this at all. How can you be sure that none of your Rails code will ever be repurposed in another context? Sounds like a recipe for legacy to me.

XSS Shield takes a more cautious approach. Instead of calling untaint, XSS shield uses Ruby's open classes to define new methods such as mark_as_xss_protected. So you can leave a string tainted, but mark it as XSS safe.

Let me be clear: SafeERB is way better than doing nothing. But I am moving toward the conclusion that you should never untaint data. What do you think?

Comments
  1. Paul DoerwaldJanuary 17, 2008 @ 06:24 PM

    I’ve been thinking about these issues since I read and commented on your other post a few days ago on this topic.

    I think it is reasonable to untaint data once you’ve asserted that the data is safe. It strikes me that the method “untaint” exists for that reason, and using it is appropriate (i.e. within the contract of the method). As long as you don’t break the contract of ‘taint’ and ‘untaint’ (i.e. weaken the responsibilities of the method w.r.t. any other systems that may be calling those methods), you can do whatever you want to their behaviour. A big side-benefit of using taint/untaint is that any other software that would interact with the system using taint/untaint can count on these methods being meaningful. For instance, if an unsafe string is passed to a third-party system and string.tainted? false but string.xss_safe? false, the third-party software won’t check xss_safe because it won’t know of that method. By using the published API within its contractual limits, every other piece of software can depend on the functionality working as expected.

    I agree that a problem with SafeERB is that it only focuses on Erb, and not on the rest of the RoR stack, which is why, as you say, there is a risk of malicious SQL leaking through, or other as-yet-unknown attacks. I would like to see the entire RoR stack run in safe mode, thus eliminating the need for SafeERB. Every string that comes in from an outside source must be checked according to some criteria before it is marked untainted.

    For instance, files read from the hard drive that are owned by a certain user/group and are not writable from the web server for instance should be automatically untainted. That way strings read from routes.rb and your views are treated as “safe”, which in all likelihood they are. Files in /public that are writable by the web server process (i.e. photos/files uploaded by your users) are default tainted. They have to be checked before they’re untainted. For a photo this is trivial (if type == jpg, photo.untaint) because the type itself is fundamentally safe. For a Word document, you might want to check for macro viruses before you untaint. For an HTML file, check for XSS attacks, but for a plaintext file it’s theoretically not necessary, but let’s do it anyway since a misconfigured web server might serve the plaintext file as text/html or a misconfigured browser (?!) might read it as such and start executing scripts. Regardless of the type of tainting (XSS, SQL injection, macro virus), the fact is that the object is tainted and therefore unsafe. Once it’s been checked, it’s safe and can be used anywhere with no fear.

    Paul.

  2. Tom MoertelJanuary 19, 2008 @ 06:03 AM

    What most tainting-based solutions seem to miss is that “safeness” or “taintedness” is not a property of a string but a relationship between the string and the contexts in which it is used. Most tainting-based solutions, unfortunately, look at the problem from a single context and therefore get things subtly wrong when you consider the whole picture, as you have observed.

    Which brings us to your question about whether one should ever untaint strings. In most tainting (i.e., single-context) implementations, the taintedness of a string tells you what it represents, e.g., untrusted user input or HTML. So you should only change the taintedness of a string if you change its underlying representation. So if you have a string that represents untrusted user input, it should be marked as tainted. If you HTML-escape the string, the resulting string should be marked as untainted. Again, this only works from the perspective of a single context, here HTML templates.

    So, what do we do about all those other string contexts—SQL, shell scripts, and so on?

    One reasonable solution is to use richer type information. If you know what each string represents, and if you also know what each context in which strings can be used represents, you can then determine whether each use of your strings is safe.

    In effect, this is what good programmers already do in their heads. They reconcile what each string represents with each context in which it is used, applying escaping operations as needed to make the string compatible with its context. This mental burden, however, has proved difficult to carry reliably. The result is a world-wide web full of XSS holes and invalid X/HTML.

    Fortunately, this burden can be given to your computer. In Haskell, for example, I’ve written a library that detects XSS holes (and other “string errors”) at compile time (see link [1] below for more). I suspect that similar ideas could work in Ruby if you could combine them with code coverage analysis. (Because error detection would occur at run time, you would want to exercise all of your string uses during testing to be sure your code was safe. Catching XSS errors at run time in a live, production application is better than nothing, but it may be too late to meaningfully recover from the error’s consequences. Best case is that you must abort the current user transaction, which is pretty ugly.)

    Cheers!—Tom

    [1] http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem