Security

The Anatomy of Cross Site Scripting

LogError writes "Many documents discuss the actual insertion of HTML into a vulnerable script, but stop short of explaining the full ramifications of what can be done with a successful XSS attack. While this is adequate for prevention, the exact impact of cross site scripting attacks has not been fully appreciated. This paper will explore those possibilities."
This discussion has been archived. No new comments can be posted.

  • by akedia ( 665196 ) * on Friday November 07, 2003 @12:42PM (#7417183)
    The Anatomy of Cross Site Scripting
    Anatomy, Discovery, Attack, Exploitation
    by Gavin Zuchlinski (gav@libox.net)
    http://libox.net/
    November 5, 2003

    Introduction
    Cross site scripting (XSS) flaws are a relatively common issue in web
    application security, but they are still extremely lethal. They are
    unique in that, rather than attacking a server directly, they use a
    vulnerable server as a vector to attack a client. This can lead to
    extreme difficulty in tracing attackers, especially when requests are
    not fully logged (such as POST requests). Many documents discuss the
    actual insertion of HTML into a vulnerable script, but stop short of
    explaining the full ramifications of what can be done with a successful
    XSS attack. While this is adequate for prevention, the exact impact of
    cross site scripting attacks has not been fully appreciated. This paper
    will explore those possibilities.

    Anatomy of a Cross Site Scripting Attack
    A cross site scripting attack is typically done with a specially crafted
    URI that an attacker provides to their victim. The XSS attack could be
    considered analogous to a buffer overflow, where the injected script is
    similar to overwriting an EIP. In both techniques, there are two options
    once a successful attack has occurred: insert funny data or jump to
    another location. Insertion of funny data during a buffer overflow
    typically results in a denial of service attack. In the case of an XSS
    attack, it allows the attacker to display arbitrary information, and
    suppress the display of the original webpage. When jumping to
    another location during a buffer overflow attack, the new location is
    another location in memory where shellcode or other important data
    resides - allowing the attacker to take control of the flow of the
    program. Within the XSS context, the attacker instead jumps the
    victim to another location on the Internet (typically under the
    attacker's control), hijacking the victim's web browsing session.

    Discovery
    But how do cross site scripting attacks occur? XSS attacks are the
    result of flaws in server-side web applications and are rooted in user
    input which is not properly sanitized for HTML characters. If the
    attacker can insert arbitrary HTML then they could control execution of
    the page under permissions of the site. A simple page vulnerable to
    cross site scripting looks like:

    Once the page is accessed, the variable sent via the GET method is
    placed directly on the rendered page. Since the input is not marked as
    variable input, the user-supplied input is interpreted exactly as its
    metacharacters command, very similar to SQL injection. Passing
    "Gavin Zuchlinski" as an argument outputs the content in correct form:
    Sending input with HTML metacharacters allows for unexpected output:
    The input is not validated by the script before rendering by the victim's
    web browser. This allows for user controlled HTML to be inserted on to
    the vulnerable page. Occasionally user input is not directly parsed by the
    script it is sent to. Rather, the data is inserted into a file or database
    and retrieved later to be reinserted on the page.
    Common points where cross site scripting exists are confirmation
    pages (such as search engines which echo back user input in the event
    of a search) and error pages that help the user by filling in parts of the
    form which were correct. Commonly in the latter case (and sometimes
    the former) the containment of the form text box must be escaped
    with a quote and a greater than sign ("> - the quote closes the value
    property and the greater than closes the tag).
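A minimal sketch of the kind of page this section describes, written in Python rather than the paper's PHP. The function names and the form field name `q` are hypothetical and chosen only for illustration. The first function pastes the GET value straight into the page; the other two escape it for the body and for a form field's value attribute.

```python
import html


def render_greeting_unsafe(name: str) -> str:
    # Vulnerable: the user-supplied value is pasted into the page as-is,
    # so any HTML metacharacters in it are interpreted by the browser.
    return "<html><body>Hello, " + name + "!</body></html>"


def render_greeting_safe(name: str) -> str:
    # Escaped: <, >, & and both quote characters become entities, so the
    # value is displayed as text instead of being parsed as markup.
    return "<html><body>Hello, " + html.escape(name, quote=True) + "!</body></html>"


def render_form_field_safe(value: str) -> str:
    # Attribute context: escaping the double quote stops the "> break-out
    # from the value attribute described above.
    return '<input type="text" name="q" value="' + html.escape(value, quote=True) + '">'
```

With a plain name both greeting functions return the same page; they differ only once the input contains metacharacters.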

    Attack
    Once a vulnerable input is identified the valid HTTP methods must be
    determined. The way in which the variables are sent to the target
    script is an important consideration; are variables sent by GET, POST,
    or will either work? Some scripts are specific, but several which use
    canned methods (like PHP and Perl scr
  • Why is Cross Site Scripting XSS? Or have we reverted to referring to letters by the way they look?
  • Booring (Score:2, Interesting)

    Cross-site scripting vulnerabilities are just not as exciting as your standard buffer overflow. There are no crashes, no worms, etc. Unfortunately people are just not going to pay attention.
    • Re:Booring (Score:5, Insightful)

      by xchino ( 591175 ) on Friday November 07, 2003 @01:11PM (#7417461)
      I completely disagree. XSS is as dangerous as a buffer overflow, if not more so, in many cases. Take this for example:

      Your target is one or more users of a community web site. The site itself isn't the target, only the means to your own ends. Remember, it's the users you are after, not the site itself. So you smash the stack on the server, grab the MySQL database, and open it up. Bummer, all the passwords are md5'd and basically useless. With XSS you could conceivably alter the login form that they get, and before md5($password) is executed you export $password (still in plain text) off to your little database.

      Cracking isn't about what is the most "exciting and leet" way to do it, it's about using the tools you have at your disposal to get what you want done, done. Sometimes this is a buffer overflow, sometimes it's an XSS attack, sometimes an emailed trojan, and sometimes even social engineering to gain physical access (even via an unwitting human proxy).
      • Re:Booring (Score:3, Informative)

        by Carnildo ( 712617 )
        Just because a password is MD5-encoded doesn't mean it's useless.

        1) You can put the user ID and MD5-encoded password in your own cookies, and log in as the user.
        2) You can find another site that user logs in on, find their user ID, and use the captured MD5 password to log in as them -- people tend to use the same password in many places
        3) You can feed the MD5 password into a password cracker. If it's in a dictionary, you'll get the cleartext version in seconds; brute-forcing all possible 7-character pass
      • What you are describing has nothing to do with cross-site scripting. You're just cracking a web site to redirect traffic. At best it's a man-in-the-middle attack.

        A cross-site scripting exploit would be posting a specially crafted comment to the forums that would result in visitors' cookies being sent to the attacker.
    • Cross-site scripting and other attacks based on unescaped input are actually pretty much like executing arbitrary code using a buffer overflow. You can hijack sessions (get elevated privileges) and cause limited denial of service, just to mention a few possibilities.

      I think it would be possible to design a self-propagating exploit, that would "infect" user sessions and then use those sessions to "infect" more vulnerable pages, which leads to more infected sessions, and so on. If the attack script was be com

  • XSS Protection (Score:5, Informative)

    by mnmlst ( 599134 ) on Friday November 07, 2003 @12:49PM (#7417253) Homepage Journal

    Cross Site Scripting attack protection is a standard feature of many network security products these days. Check Point NG with Application Intelligence (Feature Pack 4 in other words) includes XSS protection as part of its so-called SmartDefense. I am curious if anyone has run into situations where SmartDefense is screwing up legitimate traffic, especially traffic that resembles an XSS attack.

    BTW, does anybody have some good recommendations for cheaper alternatives with pretty comparable protection to Check Point? I would like something that is as defensive, but not as configurable or extensible.

    • I am curious if anyone has run into situations where SmartDefense is screwing up legitimate traffic, especially traffic that resembles an XSS attack.

      "Cross-site scripting" or XSS sounds like a fancy way of saying "failed to properly escape arbitrary client-submitted data". If user input is properly escaped (or otherwise cleansed) when outputted to a web page, there shouldn't be any vulnerability or interference with legitimate input submission.

      • The biggest problem is figuring out what data needs to be escaped, and how. For example, did you know that you can place a working onMouseClick javascript in a <u> tag? Instant "hyperlink"! What about in an <img> tag? What attributes will you allow in an <a> tag? How are you going about converting those [i] BBCode tags? Can someone sneak code in that way?
        • What's just as bad is a user deciding to send off "s to the database server. When properly escaped by certain database modules, they pose no harm. The moral to all of this is to never trust user input. But when staff is short and a deadline is looming, these sorts of silly little goofs can turn into big problems further down the line.
    • Re:XSS Protection (Score:3, Informative)

      by nehril ( 115874 )
      look at netscreen. pretty advanced firewall in a box, many different levels of hardware available, pretty secure and far, far cheaper than checkpoint.

      hardware accelerated vpns, available redundancy/HA, straightforward config, and no need to buy/maintain server class hardware + os in order to run it (no moving parts except fan I think).

      not a bad deal if you don't need specific Checkpoint features. unfortunately their last firmware update seems to have undone the "simplicity factor" that they were so popula
    • I'm not sure if it's some XSS protection, but I've noticed certain submissions to one of my sites to be occasionally mangled. E.g. a variable called foo1234.1.2 will have its name mangled to foo~~~~.1.2 as if something on the user's site considered 1234 to be a dangerous number. I've seen this happen very rarely however, maybe 3-4 times in the last 6 months with a 100k unique clients. Since all those numbers are generated on the client and are hidden variables obviously this protection, if it is indeed tryi
    • Re:XSS Protection (Score:3, Informative)

      by Anonymous Coward
      ASP.NET 1.1 will, by default, refuse to process any forms that have fields in which the user has tried to post values that contain HTML. You can override on a per-page basis, but I think it's a reasonable default.
  • Short solution (Score:5, Insightful)

    by Anonymous Coward on Friday November 07, 2003 @12:54PM (#7417307)
    Do not have a blacklist for denying invalid input. It's hell/impossible to maintain such blacklists

    Handle all user input as if it were written by Satan himself, and only allow input that complies with your strict specification.
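One way to read "strict specification" is an allow-list check that accepts a value only if the whole string matches a known-good pattern. The sketch below is in Python; the username rule (3 to 16 ASCII letters, digits or underscores) and the names `USERNAME_SPEC` and `is_valid_username` are assumptions made up for the example, not anything from this thread.

```python
import re

# Hypothetical specification: usernames are 3-16 characters drawn from
# ASCII letters, digits and underscore. Anything else is rejected.
USERNAME_SPEC = re.compile(r"[A-Za-z0-9_]{3,16}")


def is_valid_username(value: str) -> bool:
    # fullmatch requires the whole string to satisfy the specification,
    # instead of searching the input for known-bad substrings (a blacklist).
    return USERNAME_SPEC.fullmatch(value) is not None
```

Nothing here enumerates dangerous input; anything that is not explicitly permitted simply fails the match.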
    • "That which is not explicitly permitted is denied." That is, whitelists rather than blacklists.

      Of course, you should also familiarize yourself with the mechanisms your language of choice has to help defend against such attacks. In PHP, this means register_globals = off, and there are also freely available input validation functions designed with XSS in mind.

      I like to make sure that as many of my forms use POST as possible, and include code at the top to halt processing if anyone attempts to pass a PHP p
      • Using POST rather than GET does not address XSS in any way at all. POST values can be sent as easily as putting GET values into a query string.

        The issue is addressed by simply parsing any user input that's sent back to the browser. This parsing can be as simple as replacing HTML brackets <> with entity codes &lt;&gt;

        This is as basic as web development 101 gets. Any site that falls victim to XSS does so due to sloppy coding at its best, and rightly deserves to be compromised!

        • I don't see how you can trick someone into clicking on a URL that sends POST values using their Web browser. You could send it using something like netcat, sure. Just because something doesn't completely eliminate the problem doesn't mean it does nothing at all.

          Also, you can't just rely on replacing HTML brackets, especially if you're using any sort of SQL database (or any database, really). Even if you're not, your scripts could be tricked into revealing the contents of files.

          • An HTML form using POST can be submitted with a hyperlink.

            And you're right, it's not just brackets that need replacing, quotes should be replaced too. This massive set of 3 characters are all that defines the bulk of HTML.

            However, the issue with XSS is formatting user input that is sent back to the browser.

            Obviously user input must be parsed for insertion into SQL queries, but this is not an XSS issue.

            As for code being tricked by user input, I've never heard of anyone actually writing code that attempts
          • Really?

            <a href="mylink.html" onClick="javascript:document.forms.length++;
            theForm=document.forms[document.forms.length-1];
            theForm.action='http://myserver.com';
            theForm.elements.length=1;
            theForm.elements[0].name='password';
            theForm.elements[0].value='yourpw';
            theForm.elements[0].type='text';
            theForm.submit();">

  • From the article:

    Supposing that the above hello.php was on the same domain as a message board, posting the link to the board would illicit many victims.

    That's 'elicit', not 'illicit'.

  • This is news? (Score:5, Informative)

    by unfortunateson ( 527551 ) on Friday November 07, 2003 @01:00PM (#7417360) Journal
    I'm surprised this merited a news item.
    Webmonkey [lycos.com] had a similar article three and a half years ago that provided some more solid examples of what happens.

    I designed an e-commerce site over the last six years, and evaluated where there might be XSS vulnerabilities. Not having a bulletin board or guestbook removes many areas for exploitation.

    So if someone types contaminated data into their address field when checking out, you'd think all it hoses is their own purchase, right?

    Well, with PHP or Perl CGI, it's possible that the inputted variables could exploit server knowledge: if you know the variable names used in the PHP code for, say, the MySQL password, then embedding this in the input to be evaluated on output can open an avenue for hacking. The variable has to be evaluated in most cases, although code which generates new PHP pages could result in similar problems.

    HTML encode EVERYTHING the user sends to you.

    • Its news to you (Score:5, Informative)

      by samjam ( 256347 ) on Friday November 07, 2003 @02:33PM (#7418334) Homepage Journal
      HTML encode EVERYTHING the user sends to you


      *cough*

      It's this kind of lack of understanding that makes the problem so prevalent.

      First, it doesn't make sense to htmlencode everything, just as it doesn't make sense to addslashes everything (now turned off by default in all good PHP configurations).

      Here's why: Not everything that comes in is to be displayed as html, just as not everything that comes in is destined for the database.

      Unless you understand the risks, you can't guard against them though it appears some people are still able to be certain they have guarded against them.

      If you do this,

      sqlquery("select * from user where username='$user'") then you need to think about what the problem is. It's a well-defined problem: $user may contain a closing ' mark and then some; maybe:
      $user="jimjoe' or '1'='1"

      so your preferences page now shows the first user in the db, or depending on your web page, all of them.

      In PHP, htmlentities doesn't encode the ' unless you pass ENT_QUOTES.

      If you are invoking system commands (and yes, I once had to do a LOT of this from PHP) then be careful about shell metacharacters like ` ' " and $ in certain cases.

      The principle is that you need to make sure the system you are passing data on to interprets it in the literal sense that you require and you cannot do this unless you understand completely how each of the systems you will pass the data on to really does interpret data.

      So if your user data is destined for the database, then escape it, something like:

      sqlquery(sprintf("select * from user where username='%s'",addslashes($user)));
      (yes, there are other, better ways of doing it)

      If you want to display on the web page inline:
      echo htmlentities($user);
      On the other hand, if you want to display in a text area I think there is other encoding to use. If it is for a URL you need to urlencode and htmlentities, but I forget the order.

      Understand the system you are communicating with.

      Sam
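The "understand the system you are passing data to" principle can be sketched as one escaping function per destination. This is an illustration in Python using standard-library calls; the helper names and the `/search?q=` URL are hypothetical. For a URL that ends up inside an HTML attribute, the sketch URL-encodes the value first and then HTML-escapes the finished URL.

```python
import html
import shlex
from urllib.parse import quote


def for_html(value: str) -> str:
    # HTML body or attribute context.
    return html.escape(value, quote=True)


def for_url_param(value: str) -> str:
    # Query-string context: percent-encode everything that is not unreserved.
    return quote(value, safe="")


def for_shell(value: str) -> str:
    # Shell argument context: quote so metacharacters are taken literally.
    return shlex.quote(value)


def link_for(value: str) -> str:
    # A URL inside an HTML attribute needs both layers: URL-encode the
    # value first, then HTML-escape the finished URL.
    url = "/search?q=" + for_url_param(value)
    return '<a href="' + for_html(url) + '">search</a>'
```

The same input string produces three different encodings, which is why a single "encode everything" pass applied at input time cannot be right for every destination.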
      • sqlquery("select * from user where username='$user'")

        Which is why you should be using bound variables or placeholders in the first place:

        sqlquery("select * from user where username=?", $user)

        The db library should do the right thing. It will even take care of stuff like VARCHAR vs NUMBER. No need to remember to quote or escape.
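As a runnable illustration of placeholders, here is a small Python sketch using the standard sqlite3 module against an in-memory database. The table name `users`, its columns and the `find_user` helper are assumptions for the example; the `sqlquery` function in the comment above is not modeled.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    # The ? placeholder keeps the value out of the SQL text entirely;
    # the driver passes it separately, so quotes in it are just data.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


# Throwaway in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("jimjoe",), ("sam",)])
```

A value such as `jimjoe' or '1'='1` is then compared literally against the username column and matches no row.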

        As an aside, you should never do 'SELECT *', because it's ALWAYS overkill. For example, in the above query you don't have to retrieve 'username' as you obviously already have i

    • Clarifications (Score:3, Interesting)

      I always type too fast and leave important things out, so here's a little more:

      1) I meant "HTML Encode anything that will end up in HTML output again."

      2) I didn't bother talking about SQL insertion, that's a different gremlin from XSS.

      3) I didn't implement the things I said were stupid to do... I avoided them for that particular reason. I'm saying that there are traps to avoid, such as evaluating the contents of inputted variables. Some ways of implementing template toolkits will have you build a large
    • Yeah, that old Webmonkey article really nipped the whole XSS problem right in the bud; no need to talk about XSS vulnerabilities any more - Webmonkey took care of that three years ago. Gavin Zuchlinski is such a n00b. He should find some more important vulnerability to write about, 'cause it's pretty obvious to all that XSS is yesterday's news.
  • "Many documents discuss the actual insertion of HTML into a vulnerable script, but stop short of explaining the full ramifications of what can be done with a successful XSS attack"

    I was hoping for some enlightened progression into the interesting-for-some field of cross-siting, but am sorely disappointed at this basic-tut-in-PDF-clothing. Most people don't really discuss much when it comes to XSS as there isn't much to discuss. Well, maybe there is, but this paper doesn't highlight anything new or advent

    • I'm trying to figure out how to use cookies from others' computers on mine to access sites they use that have a "save my password" feature.

      Right now the copied cookie is ignored by the web site.
      • by Anonymous Coward
        You're probably copying the session ID in addition to the user ID and password. Session IDs are usually bound to a URL or small range of URLs, so submitting a stolen session ID invalidates the password.
  • by shiflett ( 151538 ) on Friday November 07, 2003 @01:06PM (#7417409) Homepage

    Although it is PHP-specific, this free article explains XSS and CSRF in quite a bit of detail and might be useful for Web developers using any language:

    http://www.phparch.com/sample.php?mid=16 [phparch.com]

    Enjoy

  • Lethal !!! (Score:5, Funny)

    by Timesprout ( 579035 ) on Friday November 07, 2003 @01:07PM (#7417419)
    Cross site scripting (XSS) flaws are a relatively common issue in web application security, but they are still extremely lethal

    You better believe it. Why, only last week I had one of my web developers executed for writing code vulnerable to a Cross Site Scripting attack. I don't want any slackers on my team.

    PS I now have an opening for an experienced web developer. Send resumes to spareme@icodetolive.com
  • Asp.NET 1.1 and XSS (Score:4, Interesting)

    by palad1 ( 571416 ) on Friday November 07, 2003 @01:14PM (#7417489)
    Asp.Net protects users from XSS by default since version 1.1 by parsing the parameters of a page and looking for javascript/html code in the query.
    Of course I was bitten by this feature when upgrading from 1.0 to 1.1, but that's just because I didn't bother reading the readme.txt :)
    Automatic protection bundled with any application server is good, especially if you can turn it off [you can in asp.net, validateRequest=false et voila].
    • Ah, that explains it. I was toying around with a forum that I use and found out that they check for JavaScript (or any script), but at the same time they permit HTML including frames, forms, etc., so you can use iframe to do your dirty work.
      I was amazed that they would go to the trouble of checking for scripts and yet permit all this stuff.
      If ASP does it for them then it explains a lot. :)
      It means they are plain idiots and not twisted idiots.
  • This site found out about the lesser known cross site script problem, the a href= tag, in particular, the a href tag that happens to be posted on slashdot! another victim to the effect.
  • by iceco2 ( 703132 ) <meirmaor@gmai l . com> on Friday November 07, 2003 @01:19PM (#7417535)
    The general technique described above relies on
    vulnerable scripts which display text that came
    from URL-encoded data. This is one of many methods
    to display the attacker's HTML in an unsuspecting
    user's browser.
    It is very common for the 404 message on a website to contain the URL which was entered, In the past this was done mostly by copying it as is. This would allow an attack.

    In order to hide the attack hex encoding is used in the URL so the victim would not notice the script in the URL.
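To make the hex-encoding point concrete, here is a short Python sketch. The payload string is a harmless stand-in and the variable names are made up for the example. It shows that a percent-encoded value contains no visible angle brackets yet decodes to the original markup, which is why escaping has to be applied to the decoded value the script actually receives.

```python
from urllib.parse import unquote

payload = "<script>alert(1)</script>"

# Percent-encode every byte, not just the reserved ones, so nothing in
# the URL looks like markup to a person glancing at the link.
disguised = "".join("%{:02X}".format(b) for b in payload.encode("ascii"))

# The web server performs this decoding before the script sees the value.
decoded = unquote(disguised)
```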

    Still, the attacker needs to minimize the length of the URL, which leads him to use HTML options
    such as iframe in order to insert a lot of HTML
    taken from a different site.

    The main point of interest is that the page appears to be coming from the (probably trusted) server. This can convince the user to do things he would not do on the attacker's web site, for example enter credit card info.

    Also, one could collect cookies this way; the cookies are likely to contain passwords or equivalent information for sites with user login.

    In some forums a user can put scripts in his signature or profile. This allows similar results,
    but without sending funny URLs.

    DO NOT TRUST USER INPUT, it may harm not only you
    but also the user, they must be protected.

    Me.
    1. Don't allow any control structures whatsoever to be added to a web page.
    2. Go back to 1.


    Seriously, the only problem is that of control structures. Most tags don't change the flow, or make data modifications, they'll simply set the style (eg: the bold and italic tags) or inject characters (eg: the paragraph tag).


    However, if you want to be safe, simply don't allow any HTML in a page, and require users to format in TeX.

  • I've been working on a personal website [no-ip.com] for the last month and a half, and had this happen. Not to any degree of maliciousness, but it did screw with PHP.

    My experience with XSS was due to my lazy programming. I didn't filter user content before it was displayed. Someone posted guestbook content with malformed PHP commands. Luckily they didn't know PHP that well, and an error was returned.

    As far as PHP goes, functions like str_replace(), and strip_tags() should be used to parse all user-inputted data befo
    • As far as PHP goes, functions like str_replace(), and strip_tags() should be used to parse all user-inputted data before it is displayed. I'm sure other serverside scripting languages have similar functions.

      But it's also possible to use str_replace and strip_tags in ways that DON'T protect from malformed user input, and how are you to know the difference?

      Cavalier input processing is another curse of the internet, like email validators that think a-z0-9. are all the characters allowed in an email address, or t

  • Perl CGI coders (Score:3, Interesting)

    by Hecubas ( 21451 ) on Friday November 07, 2003 @01:32PM (#7417673)
    Take note of escapeHTML() [perldoc.com] in the CGI module. Use that on the form input that you save into a database, and that should cut down on most of the XSS problem. It's quite humiliating for a webmaster to have a guestbook get trashed by a load of img tags and evil links to offending sites (although I see a lot of Slashdotters abusing the URL feature this way).

    --
    hecubas
    • Why don't you use escapeHTML() on the data that you take out of the database to display, or do you really want a database full of html that makes it harder to search and match on or use for anything else apart from a web page.

      Sam
      • Well, there are such things as SQL injection attacks you'll want to avoid. I'd rather play on the safe side and clean any input as soon as I can before it gets stored. Also, you never know if the next developer after you is going to have enough sense to escape on output. Besides, most search engines index on words, not special characters. In the case of gathering input for things like name and address, you'll want to take the filtering a step further and completely remove html elements. I'd hate to
  • Going back a good few years I remember finding one of the first sites to allow online shopping. Unbelievable as it might now seem they actually passed the id and the PRICE of the item you were attempting to purchase via the GET method in a query string! I remember having fun changing all the prices to negative numbers, and seeing the total come to around - $1,000,000. Of course I never had the balls to enter my credit card number and see if they would bill it for a negative amount :)
    • I highly doubt it, I am sure that when placing an order all that is sent is your personal info and a list of item#s to be purchased
      • Don't doubt it. This kind of crap is still out there today.

        I was adding a new module to an ASP shopping cart a few years back and found it was calculating the grand total to display on the confirmation page. Instead of recalculating the value when submitting the credit card, they passed the value from the confirmation page via a hidden input field directly to the CC processor.

        The company was grateful enough when I pointed out the problem I had fixed that they gave me a $200 gift certificate.
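The fix described here, recalculating the total on the server instead of trusting a hidden field, might look roughly like the following Python sketch. The catalog contents, item names and the `order_total` function are all hypothetical; the point is only that prices come from server-side data and the client supplies nothing but item identifiers and quantities.

```python
from decimal import Decimal

# Hypothetical server-side catalog; prices never come from the client.
CATALOG = {"widget": Decimal("19.99"), "gadget": Decimal("5.00")}


def order_total(cart: dict) -> Decimal:
    # cart maps item id -> quantity, as submitted by the client.
    total = Decimal("0")
    for item_id, quantity in cart.items():
        if item_id not in CATALOG:
            raise ValueError("unknown item: " + item_id)
        if not isinstance(quantity, int) or quantity < 1:
            raise ValueError("quantity must be a positive integer")
        total += CATALOG[item_id] * quantity
    return total
```

The amount sent to the card processor would be this recomputed value, so editing the confirmation page changes nothing.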
  • Can someone explain to me how this isn't like shooting yourself in the foot? Isn't the intent to harm a server with a client-injected script? I keep reading it and it seems like the point is to harm a client with a script submitted by a client - that is - yourself???
    • You of course design the attack script in such a way that it doesn't hurt you. Some examples of what you can do with cross-site scripting:

      Stealing user credentials, for identity theft, gaining administrator access etc.
      Modifying transactions performed on the site
      Inserting false information

      And the list goes on. Cross-site scripting attacks very often take advantage of browser security holes, but even with a 100% secure browser but less-than-perfect script, some types of attacks are still possible.

    • If you inject the script into something that saves data across sessions (like a guestbook or forum), you'll hit yourself -- and everyone else who views the page!
    • You have a popular page, and it links to something people want. What they don't realize is that your link to that page changes the page, so they really are at ebay.com, but using content from your site so when they go to login, you get their password instead.

      Or, make a link to a really good deal at amazon.com, and grab their credit cards.

      Of course, ebay and Amazon are probably not that stupid, but many smaller sites are. (Banks maybe?)
  • If you want to learn about cross-site scripting attacks, just click here! [evilhackers.com]
  • You should never design your website when angry. Wait till you calm down, or you'll make mistakes. Cool, Calm Site Scripting (CCSS) is much better.
  • This article somehow reminds me about the old VeriSign's ad campaign with naked women [slashdot.org] by which I was outrageously offended once.

    Seriously, there was a huge cross site scripting vulnerability in Omniture [omniture.com]'s "award-winning web analytics solution for large, complex sites" (too complex for them, I guess) which was included in the famous VeriSign's Site Finder we all loved so much. It's not that VeriSign handles any sensitive data, fortunately...

    (My link doesn't work any more, but the comment is still +5,

  • by Marak ( 722420 ) on Friday November 07, 2003 @02:13PM (#7418131)
    In high school for economics class we got to play a mock stock market game (on the web). Well my stock market team consisted of myself and another CS student.

    On the website you would enter in the amount of stock, stock symbol, and BUY or SELL in a form. That form would POST to a confirmation page and from there you would click "TRADE" and it would post to some server side page to execute the trade. The fools that designed the site thought it would be a good idea to validate all the data on the confirmation page and NOT on the server side page. We created a local version of the initial confirmation page, changed the action of the form to "http://www.tradingsite.com/cgi-bin/trade.pl". We then proceeded to buy -100000 shares of MSFT for about 40 bucks a pop.
    The server had a formula of something like:

    (STOCKPRICE * SHARES) + COMMISSION = SUM
    The sum was then checked against your accounts cash balance.
    Something like:
    IF (SUM > CASHBALANCE)
    ERROR;
    ELSE
    EXECUTE TRADE;

    Well we had a big negative number for our SUM so it passed.

    The server then proceeded to:
    CASHBALANCE = CASHBALANCE - SUM

    Well anyone who has taken 5th grade math knows what happens when you subtract a negative number.
    To make a long story short... we come into school about 2 weeks later and there is a big list of all the teams playing the stock market game in NY state. Our team is number 1 by about 2 million bucks; 2nd place is at about 105k. We confessed to the whole thing, explained to the site what they did wrong, and didn't get in any trouble.

    The moral of this story:

    Validate all user input before you perform ANY actions with it.
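A sketch of what the trade handler could have checked, in Python. The function name, the use of integer cents and the error messages are assumptions for illustration. It repeats the validation on the server that executes the trade and rejects non-positive share counts before applying the same SUM and CASHBALANCE arithmetic as above.

```python
def execute_trade(cash_balance: int, price_cents: int, shares: int,
                  commission_cents: int) -> int:
    # Re-validate in the handler that executes the trade, not only on the
    # confirmation page: a client can post anything to this endpoint.
    if shares <= 0:
        raise ValueError("share count must be positive")
    if price_cents <= 0 or commission_cents < 0:
        raise ValueError("invalid price or commission")
    total = price_cents * shares + commission_cents
    if total > cash_balance:
        raise ValueError("insufficient funds")
    # Returns the new cash balance after the purchase.
    return cash_balance - total
```

With the share-count check in place, the -100000 share order never reaches the balance comparison that a negative sum would otherwise pass.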
  • by cluge ( 114877 ) on Friday November 07, 2003 @02:15PM (#7418155) Homepage
    Slashdot - you provide some good security information and the next thing you know - 2.5 million hits later your server is a puddle of smoldering silicon and smells really bad. XSS isn't anything compared to the damage that slashdot's attention can get you.

    Our next paper - how to survive a slashdotting.
  • Paper mirror (Score:1, Informative)

    by Anonymous Coward
  • I'm developing a GPLed PHP application and was just wondering if anyone knew of an HTML sanitizing script that would allow for the input of a list of allowed tags.

    I need something with a GPL-compatible license.

    I guess I could just re-write Brad Choate's Sanitize Plugin [bradchoate.com] in PHP, but it would be nice to not have to go through the trouble. :-)
    • by ajs318 ( 655362 ) <sd_resp2@@@earthshod...co...uk> on Friday November 07, 2003 @04:15PM (#7419293)
      You need to do something like this. Use preg_replace to change all angle brackets to &lt; and &gt; sequences. But that's overzealous - you need to un-munge sequences that look like HTML tags you regard as innocuous. Now you have to define an array of allowed HTML tags, indexed by their "munged" form, like this:
      $allowed_tags = array('&lt;B&gt;' => '<B>',
      '&lt;/B&gt;' => '</B>',
      /* and so on */);
      Do a foreach ($allowed_tags as $i=>$j), and str_replace {it's supposedly quicker than preg_replace} each occurrence of the index $i with the value $j. Only permitted HTML tags will remain. You can even do a second foreach further down the page to list the permitted tags {since they're already HTML-escaped you can just display the indexes and the reader will see it rendered to look like a HTML tag}.

      If you want to allow <A> or <IMG> tags, you should use preg_match expressions for elementary sanity checking.
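The escape-everything-then-restore approach can be sketched in Python as follows. The `ALLOWED_TAGS` table (only lowercase, attribute-free bold and italic tags) and the `sanitize` function are hypothetical stand-ins for the `$allowed_tags` loop above, not a port of any existing plugin.

```python
import html

# Hypothetical allow-list, keyed by the escaped ("munged") form of each tag.
ALLOWED_TAGS = {
    "&lt;b&gt;": "<b>", "&lt;/b&gt;": "</b>",
    "&lt;i&gt;": "<i>", "&lt;/i&gt;": "</i>",
}


def sanitize(text: str) -> str:
    # Step 1: escape everything, so no user-supplied markup survives.
    escaped = html.escape(text, quote=True)
    # Step 2: restore only the exact tags on the allow-list.
    for munged, tag in ALLOWED_TAGS.items():
        escaped = escaped.replace(munged, tag)
    return escaped
```

Because only exact matches are restored, a tag carrying attributes such as `<b onclick="...">` stays escaped, and text that already contained `&lt;b&gt;` is not turned into a tag since its ampersands were escaped in the first step.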
    • strip_tags() [php.net] is probably a good place to start. It does exactly what you're asking for.

      Say you want to strip everything but bold and italic tags from some text:

      $foo = strip_tags($foo, "<i>, <b>");

      This by itself isn't sufficient to prevent XSS problems, but it's a start. Read over the user contributed notes on that page for some more good tips and example code.
