Don’t Trust Your Users

November 3rd, 2009

My programming teacher was full of useful acronyms when it came to teaching us things. KISS (Keep it Simple Stupid), DRY (Don’t Repeat Yourself), IPO (Input-Process-Output), and probably one of the most useful, if understated:

GIGO = Garbage In, Garbage Out

Those four letters are probably some of the most ignored four (OK, three) letters in programming. Most programmers wrongly assume that their users are not out to get them and destroy precious data, but they are. Fortunately, most users don’t know that they are upsetting the delicate balance that most web apps depend on; unfortunately, most web apps don’t check for it. On top of those accidental cases, programmers have to worry about users who try to abuse the system, and about users who maliciously attack it. Ignore GIGO and you end up with a sad list like the OWASP Top 10.

OWASP Top 10 – We Shouldn’t Need It

Despite what that heading says, I’m not saying that the Top 10 is a bad thing. It’s just sad that most of the vulnerabilities could easily be done away with if programmers took a little extra time and remembered to treat everything that a user submits as garbage. Let’s take a look at two that should never have made the list, let alone made it to the top two:

  • Cross Site Scripting (XSS)
  • Injection Flaws

What do both of these problems have in common? Both are caused by applications blindly accepting user input. The problem is made even sadder by the fact that PHP, especially, has many tools available to help mitigate these two types of attacks.

Filtering Out Cross Site Scripting

Cross Site Scripting, or XSS, is actually a form of injection, but instead of being executed by the server it is executed by the user’s browser. The classic example, one that happens so frequently it makes an appearance in the satirical Forum Warz, is the guestbook or forum that just displays whatever a user types in back to everyone. Since the forum (in this case) doesn’t filter the user’s input when a post is made, a user can enter things like this:

[cce_html]
U’ve been pwned!!!!!~~~!!11009“1~

<script type="text/javascript">alert(document.cookie);</script>
[/cce_html]

While the author of the forum may have never intended for users to type in HTML tags, in this case a user did. Now anytime someone visits that post, a pop-up will show the cookie data to the user. An actual attack would probably post that back to a URL and be completely silent. So, how do we get rid of this attack?
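To make the flaw concrete, here is a minimal sketch of the vulnerable pattern. The renderPost() function is hypothetical, not taken from any real forum software; it simply drops the stored message into the page unchanged.

[cce_php]
<?php
// Hypothetical forum helper: builds the HTML for one post WITHOUT any filtering.
function renderPost($author, $message)
{
    return '<div class="post"><strong>' . $author . '</strong>: ' . $message . '</div>';
}

// What the attacker typed into the message box
$message = 'Nice forum! <script type="text/javascript">alert(document.cookie);</script>';

$html = renderPost('attacker', $message);

// The script tag survives intact, so every visitor's browser will run it
var_dump(strpos($html, '<script') !== false); // bool(true)
[/cce_php]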

PHP’s Filtering Functions

Whitelist HTML Tags with strip_tags()

strip_tags() can be used in two ways: to remove all the HTML tags in a string, or to remove only the tags that are not in a whitelist. The second form is convenient if you want to allow a subset of HTML to be used in an input form.

[cce_php]
// Grab the user's input from POST (empty string if the field is missing)
$message = isset($_POST['message']) ? $_POST['message'] : '';

// Just remove all the HTML tags
$plain = strip_tags($message);

// Only allow basic text formatting such as bold, italics, underline
$whitelist = '<b><i><u>';
$formatted = strip_tags($message, $whitelist);
[/cce_php]

Either call will filter out that nasty <script> tag, although the text that was inside the tag stays behind as plain text. The whitelist form is extremely nice if the app uses a WYSIWYG editor like TinyMCE or FCKEditor that uses HTML natively instead of something like BBCode.
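Here is a small sketch of both modes run against the attack string from earlier, using a hard-coded string in place of $_POST so it can be run from the command line. One caveat worth knowing: strip_tags() leaves the attributes of whitelisted tags untouched, so a whitelist on its own does not stop something like an onmouseover handler.

[cce_php]
<?php
$attack = 'Hi <b>there</b> <script type="text/javascript">alert(document.cookie);</script>';

// No whitelist: every tag is removed, the text between the tags stays
echo strip_tags($attack), "\n";
// Hi there alert(document.cookie);

// Whitelist: <b> survives, <script> does not
echo strip_tags($attack, '<b><i><u>'), "\n";
// Hi <b>there</b> alert(document.cookie);

// Caveat: attributes on allowed tags are NOT removed
$sneaky = '<b onmouseover="alert(1)">hover me</b>';
echo strip_tags($sneaky, '<b><i><u>'), "\n";
// <b onmouseover="alert(1)">hover me</b>
[/cce_php]

If a whitelist is used on untrusted input, the allowed tags still need their attributes checked by some other means.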

Convert HTML with htmlentities()

htmlentities() converts special characters into their HTML entity equivalents, so that they are displayed instead of being interpreted as markup. For example, to display the ‘less than’ symbol (<) the page should actually contain ‘&lt;’ to make sure that browsers display it properly. htmlentities() will take a string and convert every character that has an entity into that entity.

[cce_php]

// Message is = <b>This is cool!</b>
$message = $_POST['message'];

// Now we'll get: &lt;b&gt;This is cool!&lt;/b&gt;
echo htmlentities($message);
[/cce_php]

This is useful when you want to take text as it is and return it so that it displays properly. It has the side effect of also destroying attacks that rely on HTML tags, since the tags are converted into displayable entities instead of being interpreted by the browser. I call this a side effect because I don’t really suggest using htmlentities() in this way. If you are not going to allow HTML at all, strip it with strip_tags().
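A short sketch of htmlentities() on the kind of payload shown earlier. It assumes UTF-8 input (and a UTF-8 source file) and passes ENT_QUOTES so that quote characters are converted as well; the sample strings are made up for illustration.

[cce_php]
<?php
$message = '<script>alert("pwned");</script>';

// Tags become harmless text; ENT_QUOTES also converts " and '
echo htmlentities($message, ENT_QUOTES, 'UTF-8'), "\n";
// &lt;script&gt;alert(&quot;pwned&quot;);&lt;/script&gt;

// Accented characters that have a named entity are converted as well
echo htmlentities('café', ENT_QUOTES, 'UTF-8'), "\n";
// caf&eacute;
[/cce_php]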

Use PHP 5’s filter_input for a common interface

PHP 5.2.0 introduced a new function, filter_input(), which lets a single function do different types of filtering and validation.

[cce_php]
// These two do roughly the same thing: strip all HTML tags
// (FILTER_SANITIZE_STRING also encodes quotes by default and is deprecated as of PHP 8.1)
$message = filter_input(INPUT_POST, 'message', FILTER_SANITIZE_STRING);
$message = strip_tags($_POST['message']);
[/cce_php]

While nice, the set of built-in filters needs to be expanded a bit before filter_input() becomes truly functional for anything other than basic use.
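filter_input() reads from the request, which makes it awkward to try outside a web server, so this sketch uses its sibling filter_var() (also added in PHP 5.2.0) on hard-coded values. It shows the validation side of the extension; the sample values and the age range are assumptions made up for the example.

[cce_php]
<?php
// Validation filters return the converted value on success, false on failure
var_dump(filter_var('42', FILTER_VALIDATE_INT));    // int(42)
var_dump(filter_var('42abc', FILTER_VALIDATE_INT)); // bool(false)

// Options narrow the accepted range
$ageRange = array('options' => array('min_range' => 1, 'max_range' => 120));
var_dump(filter_var('200', FILTER_VALIDATE_INT, $ageRange)); // bool(false)

var_dump(filter_var('user@example.com', FILTER_VALIDATE_EMAIL)); // string(16) "user@example.com"
var_dump(filter_var('not an email', FILTER_VALIDATE_EMAIL));     // bool(false)
[/cce_php]

In a form handler the same constants can be passed to filter_input(), for example filter_input(INPUT_POST, 'age', FILTER_VALIDATE_INT, $ageRange), where 'age' is a hypothetical field name.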

Zend_Filter Makes it Way Too Easy

Zend_Filter, part of the Zend Framework, exposes a great deal of filtering and makes it easy to apply to user input. Zend_Filter can filter by:

  • Alphanumeric
  • Only Alpha (with or without whitespace)
  • Numeric
  • HTML Entities
  • Whitelist HTML Tags

[cce_php]
$alnum = new Zend_Filter_Alnum(true); // Filters out everything but letters, numbers, and whitespace
$alpha = new Zend_Filter_Alpha(); // Filters out everything but letters (whitespace is removed too)
$tags = new Zend_Filter_StripTags(); // Filters out just HTML tags
$message = "Let's <b>filter</b> this message out!!!111!";
echo $alnum->filter($message); // Lets bfilterb this message out111
echo $alpha->filter($message); // Letsbfilterbthismessageout
echo $tags->filter($message); // Let's filter this message out!!!111!
[/cce_php]

Zend_Filter also supports filter chains, which let a programmer add a bunch of filters and apply them all at once. This is useful if you want to strip out all the HTML and then strip out any remaining special characters, unlike the first filter example, where the angle brackets are removed but the tag names are left behind as stray letters.
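As a rough illustration of what a chain does, here is a plain-PHP stand-in that runs two steps in order: strip the tags, then drop everything except letters, digits, and whitespace. It is not Zend_Filter itself, and the applyChain() and keepAlnum() helpers are hypothetical; with Zend Framework 1 the equivalent chain is built by calling addFilter() on a Zend_Filter object.

[cce_php]
<?php
// Hypothetical helper: run a value through each filter callback in order
function applyChain(array $filters, $value)
{
    foreach ($filters as $filter) {
        $value = call_user_func($filter, $value);
    }
    return $value;
}

// Second step of the chain: keep only letters, digits, and whitespace
function keepAlnum($value)
{
    return preg_replace('/[^a-zA-Z0-9\s]/', '', $value);
}

$message = "Let's <b>filter</b> this message out!!!111!";

// Order matters: tags go first, so no stray "b" is left behind
echo applyChain(array('strip_tags', 'keepAlnum'), $message), "\n";
// Lets filter this message out111
[/cce_php]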

The real jewel with Zend_Filter comes when it is used with Zend_Form. Zend_Form allows forms to be built in PHP and then sent to the browser. When generating elements for the form, the programmer has the option to add filters and validators to elements, and these are fired off when the form is processed. A full tutorial is beyond the scope of this article, but the Zend_Form documentation shows how to use filters on the elements.

Remember, Don’t Trust Your Users

I’ll tackle Injection attacks in another post, as there are different types of injection attacks. For now though, always remember to sanitize and filter ANY input that you accept from the user. While most users are not trying to be harmful, there are some out there who are. Filtering everything also means one less thing you have to worry about being a flaw or bug in an application.

PS: Hungarian Notation

One thing that may be useful that I picked up from Joel Spolsky is the concept of (proper) Hungarian Notation. Have a read through his post for the proper way to do Hungarian Notation and why it makes sense.
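As a quick sketch of how that idea applies here: in Joel’s post, a prefix such as us marks an unsafe string straight from the user and s marks a safe one. The variable names and the hard-coded input below are only an illustration of that convention.

[cce_php]
<?php
// "us" = unsafe string: raw text exactly as the user typed it
$usMessage = '<b>This is cool!</b>';

// "s" = safe string: encoded and ready to be written into a page
$sMessage = htmlentities($usMessage, ENT_QUOTES, 'UTF-8');

echo $sMessage, "\n";
// &lt;b&gt;This is cool!&lt;/b&gt;

// echo $usMessage; // looks wrong at a glance: an "us" value going straight to output
[/cce_php]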