Study Shows Many Sites Still Failing Basic Security Measures

Posted by Unknown Lamer
from the remember-stack-smashing dept.
Orome1 writes with a summary of a large survey of web applications by Veracode. From the article: "Considered 'low hanging fruit' because of their prevalence in software applications, XSS and SQL Injection are two of the most frequently exploited vulnerabilities, often providing a gateway to customer data and intellectual property. When applying the new analysis criteria, Veracode reports eight out of 10 applications fail to meet acceptable levels of security, marking a significant decline from past reports. Specifically for web applications, the report showed a high concentration of XSS and SQL Injection vulnerabilities, with XSS present in 68 percent of all web applications and SQL Injection present in 32 percent of all web applications."

  • by Anonymous Coward on Wednesday December 07, 2011 @02:41PM (#38293540)

    I work at Veracode, and can share how we test. I'll be brief and technical here, as there's lots of marketing material available elsewhere. In short, we scan web sites and web applications that our customers pay us to scan for them; the "State of Software Security" report is the aggregate sanitized data from all of our customers. We provide two distinct kinds of scans: dynamic and static.

    With dynamic scans, we perform a deep, wide array of "simulated attacks" (e.g. SQL Injection, XSS, etc.) on the customer's site, looking for places where the site appears to respond in a vulnerable way. For example, if the customer's site has a form field, then our dynamic scanner might try to send some javascript in that field, and then can detect if the javascript is executed. If so, that's an XSS vulnerability. As you might imagine, the scanner can try literally hundreds of different attack approaches for each potentially vulnerable point on the site.
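
    To make the form-field example concrete, here is a minimal sketch of one such probe. It is not Veracode's scanner: the two page functions and the probe string are hypothetical stand-ins for a customer site, and the check only looks for the payload coming back unescaped, which is a weaker signal than detecting that the script actually ran.

```python
import html

# Hypothetical marker payload; a real scanner would try many variants.
PROBE = '<script>alert("xss-probe-7f3a")</script>'


def vulnerable_page(name: str) -> str:
    """Stand-in for a site that echoes a form field without escaping it."""
    return f"<p>Hello, {name}</p>"


def escaped_page(name: str) -> str:
    """Stand-in for a site that HTML-escapes the field before echoing it."""
    return f"<p>Hello, {html.escape(name)}</p>"


def reflects_unescaped(render, probe: str = PROBE) -> bool:
    """Submit the probe and report whether it comes back verbatim."""
    return probe in render(probe)


if __name__ == "__main__":
    print("vulnerable_page:", reflects_unescaped(vulnerable_page))
    print("escaped_page:", reflects_unescaped(escaped_page))
```

    A verbatim reflection marks the field as a candidate for XSS; repeating the check with other payloads is where the "hundreds of different attack approaches" would come in.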

    The static scans are a little fancier. The customer uploads to Veracode a copy of the executable binary build of their application (C/C++, Java, .NET, iPhone app, and a couple of other platforms). From the executable binary, the Veracode systems then create a complete, in-depth model of the program, including control flow, data flow, program structure, stack and heap memory analysis, etc. This model is then scanned for patterns of vulnerability, which are then reported back to the customer. For example, if the program accepts data from an incoming HTTP request, and then if any portion of that data can somehow find its way into a database query without being cleansed of SQL escape characters, then the application is vulnerable to SQL Injection attacks. There are hundreds of other scans, including buffer overflows, etc.
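
    The SQL Injection example is a data-flow ("taint") question, and a toy version can be sketched in a few lines. This is an illustration only, not how Veracode's binary analysis works: the three-field statement format, the operation names, and `db_query` are all assumptions made up for the sketch, and it handles straight-line code with no branches or loops.

```python
def find_tainted_sinks(program):
    """Return (line, sink) pairs where request data reaches a sink uncleansed.

    Each statement is a tuple (op, dest, args) with op one of:
      "source"   - dest receives data from the HTTP request
      "sanitize" - dest receives a cleansed copy of args
      "assign"   - dest is computed from args
      "sink"     - dest names the sink (e.g. a query call) fed by args
    """
    tainted = set()
    findings = []
    for lineno, (op, dest, args) in enumerate(program, start=1):
        if op == "source":
            tainted.add(dest)
        elif op == "sanitize":
            tainted.discard(dest)  # output of a sanitizer is treated as clean
        elif op == "assign":
            if any(arg in tainted for arg in args):
                tainted.add(dest)  # taint propagates through the assignment
            else:
                tainted.discard(dest)
        elif op == "sink":
            if any(arg in tainted for arg in args):
                findings.append((lineno, dest))
    return findings


vulnerable = [
    ("source", "user", []),
    ("assign", "sql", ["prefix", "user"]),
    ("sink", "db_query", ["sql"]),
]

fixed = [
    ("source", "user", []),
    ("sanitize", "clean", ["user"]),
    ("assign", "sql", ["prefix", "clean"]),
    ("sink", "db_query", ["sql"]),
]

if __name__ == "__main__":
    print(find_tainted_sinks(vulnerable))
    print(find_tainted_sinks(fixed))
```

    The first program is flagged at its query call because `user` flows into `sql` untouched; the second is not, because the only path to the query passes through the sanitizer.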

    Personally, I think what we do at Veracode is pretty amazing, particularly the static binary scans. I mean: you upload your executable, and you get back a report telling you where the flaws are and what you need to fix. The technical gee-whiz factor is pretty high, even for a jaded old-timer like me.

  • by kriegsman (55737) on Wednesday December 07, 2011 @02:44PM (#38293596) Homepage
    Oops, I wasn't logged in. The above comment is from me, Mark Kriegsman, Director of Engineering at Veracode.
  • Re:200 (Score:5, Informative)

    by jc42 (318812) on Wednesday December 07, 2011 @02:46PM (#38293628) Homepage Journal

    Why _would_ you [send valid content with a 4xx or 5xx code]? Is there incentive to be standards-compliant, friendly, and heterogenous-mix-of-clients interoperative with attackers?

    Perhaps because you know that the "attacks" are coming from sites that don't know they're attacking you, but are merely asking for content.

    The specific cases I'm thinking of are some sites that I'm responsible for, which can deliver the "content" information in a list of different formats such as HTML, PS, EPS, PDF, RTF, GIF, PNG (and even plain text ;-). The request pages list the formats that are available; a client clicks on the one(s) that they want and presses the "Send" button, and gets back the information in the requested format(s). The data is stored in a database, of course, and converted on the fly to whatever format is requested. Things like PS and PDF are huge in comparison, so we don't save them. The required disk space would be exorbitantly expensive.

    There is a real problem with such an approach: The search sites' bots tend to hit your site with requests for all of your data in all of your formats. Some of them do this from several addresses simultaneously, hitting the poor little server with large numbers of conversion requests per second, bringing the server to its knees. Converting plain text to all the above formats can be quite expensive.

    I handled this in two stages. First, as an emergency measure, I simply dropped requests from an "attacker" IP address, which gave me breathing space to implement the rest. What's in place now is code that honors single requests; if it sees multiple such requests in the same second coming from a single address or a known search-site address block, it replies to just one of them and sends the rest an HTML page explaining why their request was rejected.
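
    A rough sketch of that one-per-second rule might look like the following. It is a guess at the shape of the logic, not the code running on those sites: the class name, the rejection text, and the example address block (taken from the documentation-only range 192.0.2.0/24) are all assumptions, and the table of last-served times is never pruned.

```python
import ipaddress

# Hypothetical explanation page sent in place of the converted content.
REJECTION_PAGE = (
    "<html><body><p>Your request was rejected: we received several "
    "conversion requests from your address in the same second.</p>"
    "</body></html>"
)


class PerSecondLimiter:
    """Serve at most one request per second per address or known block."""

    def __init__(self, blocks=()):
        self._blocks = [ipaddress.ip_network(block) for block in blocks]
        self._last_served = {}  # key -> whole second of the last served request

    def _key(self, address: str) -> str:
        ip = ipaddress.ip_address(address)
        for block in self._blocks:
            if ip in block:
                return str(block)  # a whole search-site block shares one key
        return address

    def allow(self, address: str, now: float) -> bool:
        key = self._key(address)
        second = int(now)
        if self._last_served.get(key) == second:
            return False
        self._last_served[key] = second
        return True


def handle(limiter, address, now, convert):
    """Run the expensive conversion only for the one request that is allowed."""
    if limiter.allow(address, now):
        return convert()
    return REJECTION_PAGE
```

    Passing the timestamp in explicitly (for example `time.time()` in a real handler) keeps the rule easy to test; the second request from the same block within one second gets the explanation page instead of a conversion.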

    Over time, this tends to get the message through to the guys behind the search bots, and they add code on their side to be nicer to smaller sites like ours.

    I've also used this approach to explain to search-site developers why they should honor a nofollow attribute. After all, they get no information from the expensive formats like PS, PDF or PNG that's not in the plain-text or HTML file, so there's no real reason for a search site to request them.

    Note that, in this case, we do actually refer to such misbehaved search bots as "attackers". They're clearly DOSing us, for no good reason. But the people responsible aren't actually malevolent; they just didn't realize what they're doing to small sites. If you can defuse their attacks gently, with human-readable explanations, they'll usually relent and become better neighbors. This helps their site, too, since they no longer waste disk space and CPU time dealing with duplicate information in formats that are expensive to decode and eat disk space.

    It's yet another case where the usual simplistic approach to "security" doesn't match well with reality.

    (It should be noted that the above code also has a blacklist, which lists addresses that are simply blocked, because the code at that site either doesn't relent, or attempts things like XSS or SQL attacks, which are recognized during the input-parsing phase. Those sites simply get a 404. But those are a minority of our rejections. We don't mind being in the search site's indexes; we just don't like being DOS'd by their search bots.)

  • Re:Citicorp Hack (Score:4, Informative)

    by tomhudson (43916) <barbara.hudson@D ... com minus painte> on Wednesday December 07, 2011 @04:25PM (#38294826) Journal
    Latest stats for the US - 2nd quarter of 2010 from the FBI: 1,007 bank robberies (includes credit unions, savings and loans, as well as the "too big to fail" commercial banks). [fbi.gov]

    Total loot: $7,820,347.96 in cash, $298.88 in cheques. So far, they've gotten back $1,801,073.18, for a net loss of $6,019,573.66.

    Extrapolated to an entire year, that would still be under $25 million net. A rounding error compared to all the US bank bail-outs.
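
    For anyone who wants to check the sums, the figures quoted above work out as follows. The only assumption is the extrapolation itself: one quarter's net loss multiplied by four, with nothing adjusted for seasonality or later recoveries.

```python
from decimal import Decimal

# Figures as quoted from the FBI quarterly statistics above.
cash = Decimal("7820347.96")
cheques = Decimal("298.88")
recovered = Decimal("1801073.18")

net_quarter = cash + cheques - recovered
net_year = net_quarter * 4  # flat extrapolation: four identical quarters

if __name__ == "__main__":
    print(f"Net loss for the quarter: ${net_quarter:,.2f}")
    print(f"Extrapolated to a year:   ${net_year:,.2f}")
```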

  • by kriegsman (55737) on Wednesday December 07, 2011 @05:51PM (#38295894) Homepage
    That is a GREAT question, and the full answer is complicated and partially proprietary. But basically, you've touched on the problem of indirect control flow, which exists in C (call through a function pointer), C++ (virtual function calls), and in Java, .NET, ObjC, etc. The general approach is that at each indirect call site, you "solve for" what the actual targets of the call could possibly be, and take it from there. The specific example you gave is actually trivially solved, since there's only one possible answer in the program; in large scale applications it is what we call "hard." And yes, in some cases we (necessarily) lose the trail; see "halting problem" as noted. But we do a remarkably good job on most real world application code. I've been working with this team on this static binary analysis business for eight or nine years, and we still haven't run out of interesting problems to work on, and this is definitely one of them.
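
    One textbook way to "solve for" the targets of a virtual call is class hierarchy analysis: take the static type at the call site, collect every subtype, and see which method definition each one would dispatch to. The sketch below shows that technique only as an illustration; it is not a description of Veracode's proprietary approach, and the `Shape` hierarchy and dictionary layout are invented for the example.

```python
def subtypes(parents, cls):
    """Return cls plus every class that transitively inherits from it."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in parents.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def resolve(parents, defines, cls, method):
    """Walk up from cls to the nearest class that defines the method."""
    while cls is not None:
        if method in defines.get(cls, set()):
            return f"{cls}.{method}"
        cls = parents.get(cls)
    return None


def possible_targets(parents, defines, static_type, method):
    """All definitions a call on static_type could dispatch to at run time."""
    targets = {
        resolve(parents, defines, cls, method)
        for cls in subtypes(parents, static_type)
    }
    targets.discard(None)
    return targets


# Hypothetical hierarchy: class -> direct parent, class -> methods it defines.
parents = {"Shape": None, "Circle": "Shape", "Square": "Shape", "Unit": "Circle"}
defines = {"Shape": {"area"}, "Circle": {"area"}, "Square": set(), "Unit": set()}

if __name__ == "__main__":
    print(sorted(possible_targets(parents, defines, "Shape", "area")))
    print(sorted(possible_targets(parents, defines, "Circle", "area")))
```

    A call through a `Circle` reference has a single possible target, which is the "trivially solved" situation; a call through a `Shape` reference has two, and in a large application the candidate sets grow quickly, which is where the problem becomes hard.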
