Why pen-testing doesn’t matter
Pen-testing is an art, not a science
Penetration-testing is the art of finding vulnerabilities in software. But what kind of an “art” is it? Is there any science to it? Is pen-testing the “only” way or the “best” way to find vulnerabilities in software?
When I took my first fine arts class, we learned that “art is for art’s sake” and “beauty is in the eye of the beholder”. I spent some time philosophizing on whether or not that was true. After years, I was never able to prove those concepts wrong. However, I did learn something interesting about art. If you’re an artist trying to improve technique, trying to sell art, or trying to send a message — it all comes down to one thing: goal setting and accomplishment. Does your artistic outlet meet your needs towards your goal? Did you accomplish what you wanted to?
Compliance “audits” and “education/awareness” vs. Security “testing” and “assurance/process-improvement”
Many organizations are attempting to improve software security assurance by improving process and technology. Others are just trying to increase security awareness, or meet compliance objectives. Some are trying to keep their heads above water, and every day they worry that another breach will reach media attention or become a statistic on Etiolated. Those who are showing improvements and making software assurance a science and a reality are few in number.
Microsoft started the Trustworthy Computing initiative via a memo from Bill Gates in 2002. The end result was the Security Development Lifecycle (SDL), a process to improve software security. The Open Web Application Security Project (OWASP) was started in 2001, and began a project that was completed last year called the Comprehensive, Lightweight Application Security Process (CLASP), which utilized a lot of the research OWASP members had been working on for years. Also in 2001, Gary McGraw and John Viega wrote a book called Building Secure Software: How to Avoid Security Problems the Right Way, which later became a methodology for Cigital (Gary McGraw’s company) to move software security process knowledge into the hands of Cigital clients. Also last year, McGraw released a new book, Software Security: Building Security In, which is the culmination of his Touchpoints process model.
One year and one week after 9/11/2001, the National Strategy to Secure Cyberspace was released for the public eye. The US Department of Homeland Security created a National Cyber Security Division, which in turn created a strategic initiative, the SwA Program (Software Assurance). This program is based on one short, but very important part of the National Strategy to Secure Cyberspace document, “DHS will facilitate a national public-private effort to promulgate best practices and methodologies that promote integrity, security, and reliability in software code development, including processes and procedures that diminish the possibilities of erroneous code, malicious code, or trap doors that could be introduced during development”. The current director of the SwA Program is Joe Jarzombek, who is responsible for many important objectives, including the Build Security In web portal. This portal includes much of the on-going work from Cigital, NIST, MITRE, and the DHS on software assurance process improvements.
The week of 9/11/2007, OWASP planned a huge event known as OWASP Day. OWASP is planning another OWASP Day with a March 2008 time frame for those of us who missed out on the first one of its kind. One of the presenters in Belgium, Bart De Win, gave a presentation on “CLASP, SDL, and Touchpoints Compared”. All three are really just books on secure software processes, so comparing them at first seems a bit like doing a bunch of book reports (and possibly subjective going back to the whole “art is for art’s sake” argument). Bart’s comparison is interesting, but I’m interested in what all three are missing. Towards the end of this blog entry, I’ll recommend a new secure software process that takes into account security testing from both a software assurance model and a penetration-testing model.
The premise behind having a software development lifecycle that takes security into account is that at some point business analysts, requirements writers, software architects, software engineers, programmers, and/or testers will perform tasks that are part of a process that involves security as a forethought. In other words, testing “for security” is not done “after-the-fact”, nor is it done “right before release”. Security testing before release is typically when development hands off the application to a support or operations team. Quality testers refer to this as “operations testing”. If security is a total afterthought, quality testers usually call this “maintenance testing”. Both situations are really where penetration-testing is done, which is usually accomplished by security professionals, usually in an IT security team (or consultants hired by such a team). Many of these individuals actually prefer a black-box assessment, where knowledge or access to the configurations and source code is again an afterthought. Some pen-testers prefer a “source-code assisted black-box assessment” and would like access to the source code and configuration files, but policy or other constraints limit this kind of access.
One of the questions that might come up here has to do with penetration-testing as part of compliance objectives, such as Sarbanes-Oxley, SAS70, HIPAA, or the dreaded PCI-DSS. In this situation, you have assessors working in an auditor role. A very common trend is for a PCI “approved scanning vendor” (ASV) to perform a penetration-test using COTS “security scanners” which often require both customization and “vulnerability verification”. The verification comes into play because scanners will often identify a vulnerability when it turns out later that the vulnerability does not exist (a condition known as a Type I error, or false positive). ASVs test once a year against a “test-ground” network and web application approved by the PCI Council, but nowhere does this process require the ASV to remove false positives or customize their scanner tools. Most COTS security scanners simply work the first time against the ASV test-grounds. How often they work or don’t work against real-world networks and web applications without proper customization is left as an exercise for the reader to determine. Your guess is as good as mine.
Whatever happened to full-disclosure, free information, and the hackers?
Free security research has been available since the MIT TMRC started playing with mainframes and micros. The media and software manufacturers to this day still don’t understand the motivations, tools, and techniques that hobbyist security researchers have employed — much of which has truly been the “art” of vulnerability finding. However, many hobbyists turned to starting or joining up with businesses during the dot-com era. The lost “art” of vulnerability finding made its way into the corporate environment. Around 2001 and 2002, the largest of software corporations (Microsoft was already mentioned) learned the benefit of performing self-assessments, including secure code review and even secure design inspection. Companies such as @Stake and Foundstone were founded, often brought in to perform these reviews/inspections as consultants, and then were both later acquired by Symantec and McAfee, respectively.
Other security researchers (especially ones that were unable to take part in the dot-com era success due to previous computer felony convictions, or other disadvantaged situations such as living in a third-world country) are possibly now what has become the criminal underground of the Internet. There are still many people who find themselves in between these two camps (gray hat hackers), but their numbers are few compared to what they used to be. If penetration-testing is still an art form, then these are the only people practicing it — the black hat and gray hat hackers. It is quite possible that some of the improvements in fuzz testing have come from these types in the past few years, although even many of those people have started their own companies or joined up with some larger organization. Where are the “hacker groups” that remain out there?
Software manufacturers are beginning to understand the problem, and big financials and e-commerce also have implemented their own secure software processes. Gadi Evron gave a presentation where he called out who was using fuzz testing in the corporate world earlier this year. The word on the street is that financials and e-commerce are “fuzzing before purchase,” i.e. they won’t buy a new product, especially a network security device or the latest DLP-HIPS-NAC-UTM solution without running an internally purchased Codenomicon, BreakingPoint Systems, Mu Security, or beSTORM fuzz testing engine and doing the best they can to break it first. “Fuzz before release” occurs when some vendors such as Microsoft, Symantec, and Cisco build their own custom fuzz testing engines such as FuzzGuru (Microsoft), SEEAS (Symantec), and Michael Lynn (Cisco — oh wait they sued him) — I mean CIAG (oh wait they dismantled that group, didn’t they?).
“The future is already here — it’s just unevenly distributed”
The quote above, from William Gibson, describes how the situation that we’re in doesn’t apply to everyone. However, there are some things that it obviously does apply to, which I’m about to cover. Surprisingly futuristic, today’s security testing tools are almost as good as all of the ones mentioned in the previous section. This is partially because fuzz testing isn’t the end-all-be-all for security testing. In fact, fault-injection and network security scanners (e.g. Hailstorm and Nessus) also aren’t the end-all-be-all in security testing. Secure design inspection and secure code review are what make the secure software processes actually work. However, testing tools for secure inspection/review are few and far between. They’re maturing very slowly, and many penetration-testers, developers, and managers feel that:
- Secure inspection/review tools have too many false positives for developers to deal with, slowing down the programming phase
- Static analysis tools have more false negatives than runtime analysis that combines fuzz or fault-injection testing, missing a lot of vulnerabilities
- Design/code review cannot verify vulnerabilities as well as runtime analysis, making removal of false positives that much more difficult and time consuming
- Runtime analysis tools combined with fuzz testing and fault-injection provide a much easier path to writing exploits
- Developers are difficult to work with and will never understand security issues
- Automated source code analyzers don’t support programming languages or frameworks used
- It’s cost-prohibitive to give every programmer a security testing tool when licensed on a per-IDE basis
If the vendors behind these products or I can put these notions to rest, let us give it a shot. In 2008 there is no reason that any of the excuses above will apply for new software projects. Sure, there are tons of existing code, a lot of it in binary format, much of it legacy, and worst of all: your company or organization still relies on it without a plan to replace or even augment its functionality.
I feel as if I’m stuck in a similar situation using the primary software pieces that I use every day: Firefox, IE, all the major browser-plugins made by Adobe (Flash and Acrobat), Apple (QuickTime), or Sun Microsystems (Java). Then there’s the other software that I use made by the likes of AOL, Mozilla + the usual suspects (Adobe, Apple, Microsoft, and Sun) in the form of instant messaging clients, productivity applications (MS-Office, OpenOffice, iWork), and arts/entertainment (Windows MediaPlayer, iTunes, Adobe everything, Apple everything). These are the targets: the important software that we need to keep secure. Yet the only software manufacturer out of the list above that has a secure software process and writes their own fuzz testing engine is Microsoft. However, if we were able to secure these applications properly then other software would instead be targeted. I use enough embedded devices running some sort of burned-in software (that never or rarely updates) to come to the realization of this outcome. I’m also one of those types of security professionals that buys into some of the FUD with regards to web applications (especially SaaS) and open-source software used as third-party components in everything (the RNA to a full application’s DNA).
The Continuous-Prevention Security Lifecycle
The reality is that all software needs to be properly designed and inspected — all software requires a secure software process. Earlier I mentioned that the SDL, CLASP, and Touchpoints processes were “missing something”. While working on the matter, I have discovered some unique approaches that extend and simplify the primary three secure software process models. My suggested secure software process consists of only four elements:
- Developers using Continuous Integration (Fagan inspection + coding standards + unit testing + source code management + issue tracking + “nightly” build-servers)
- MITRE CAPEC used in the design review ; Secure design inspection performed using CAPEC
- MITRE CWE used in automated secure static code analyzers at build-time ; Secure manual code review performed using CWE
- CAPEC and CWE-driven automated fault-injection and/or fuzz testing tools at build-time measured with code coverage ; Verification of non-exploitables vs. exploitables
All of the above steps can be performed by untrained developers except for the parts after the semi-colons. For step 2, developers can use Klocwork K7 or Rational Rose/RequisitePro along with security professionals during secure design inspection, or provide the security team with their UML designs or requirements. For step 3, a manual code review workflow tool such as Atlassian Crucible can be used to combine Fagan inspection with the necessary security sign-off to complete a secure manual code review (to be completed on every check-in, component check-in, or before every nightly/major build, depending on the environment). The step 4 verification process requires the most attention by security professionals, although there is little reason that all vulnerabilities found cannot be issued with a low priority and verified before release. All the other steps are continuous and can be performed/fixed every day, possibly at every check-in of code, but usually at least once a day in the nightly build.
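To make step 4 concrete, here is a minimal sketch of a mutation fuzzer that could run as a build-time check. Everything in it is an assumption for illustration: `parse_record` is a hypothetical target with a planted defect, the record layout is invented, and the rule that a clean `ValueError` counts as an expected rejection is a convention chosen for this example. Code-coverage measurement is left out, and the findings it returns still need the manual exploitable vs. non-exploitable verification described above.

```python
import random
from typing import Callable, Iterator, List, Tuple


def parse_record(data: bytes) -> Tuple[str, int, bytes]:
    """Hypothetical target: 1 length byte, a UTF-8 name, 1 checksum byte, then payload."""
    if not data:
        raise ValueError("empty record")
    name_len = data[0]
    if len(data) < 1 + name_len:
        raise ValueError("truncated name")
    name = data[1:1 + name_len].decode("utf-8")
    checksum = data[1 + name_len]  # planted defect: IndexError when nothing follows the name
    return name, checksum, data[2 + name_len:]


def mutations(seed: bytes, rounds: int, rng: random.Random) -> Iterator[bytes]:
    """Yield every truncation of the seed, then `rounds` random single-byte overwrites."""
    for cut in range(len(seed)):
        yield seed[:cut]
    for _ in range(rounds):
        buf = bytearray(seed)
        buf[rng.randrange(len(buf))] = rng.randrange(256)
        yield bytes(buf)


def fuzz(target: Callable[[bytes], object], seed: bytes,
         rounds: int = 200, rng_seed: int = 0) -> List[Tuple[bytes, Exception]]:
    """Return (input, exception) pairs for every failure that is not a clean ValueError."""
    if not seed:
        raise ValueError("seed input must not be empty")
    rng = random.Random(rng_seed)  # fixed seed so nightly builds are reproducible
    findings = []
    for case in mutations(seed, rounds, rng):
        try:
            target(case)
        except ValueError:
            continue  # expected: malformed input was rejected cleanly
        except Exception as exc:  # anything else is a candidate defect to verify by hand
            findings.append((case, exc))
    return findings


findings = fuzz(parse_record, b"\x02hi\x07data")
```

On a build server, a wrapper script could exit non-zero whenever `findings` is non-empty, which is what turns the fuzzer into a gate rather than a report.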
The most important part of my “Continuous-Prevention Security Lifecycle” (CPSL) process is for developers to write unit tests that assert the behavior of each defect’s fix. This is known as continuous-prevention development, and it’s a special kind of regression test that works especially well for security vulnerabilities because it:
- Tests for the bug, and can identify bugs with similar behavior
- Fixes the bug, and possibly any bugs that work in the same way if generic enough
- Can be re-used in build-servers across projects
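As an illustration of such a test, the sketch below pins down the fix for a path traversal defect (CWE-22). The function `resolve_upload_path`, the base directory, and the attack strings are all hypothetical and stand in for whatever defect a real project had to fix; the point is only the shape of the test: one case for the original bug, a few for similar variants, and one that confirms legitimate input still works.

```python
import posixpath
import unittest


def resolve_upload_path(base_dir: str, user_path: str) -> str:
    """Hypothetical fixed function: resolve user_path under base_dir, rejecting escapes."""
    base = posixpath.normpath(base_dir)
    candidate = posixpath.normpath(posixpath.join(base, user_path))
    if candidate != base and not candidate.startswith(base + "/"):
        raise ValueError("path escapes base directory")
    return candidate


class PathTraversalRegressionTest(unittest.TestCase):
    BASE = "/srv/uploads"

    def test_original_defect_stays_fixed(self):
        # The exact input from the (hypothetical) original bug report.
        with self.assertRaises(ValueError):
            resolve_upload_path(self.BASE, "../../etc/passwd")

    def test_similar_variants_are_rejected(self):
        # Bugs that behave the same way: nested escapes, absolute paths, sibling prefixes.
        for attempt in ("a/../../secret", "/etc/passwd", "../uploads_evil/x"):
            with self.subTest(attempt=attempt):
                with self.assertRaises(ValueError):
                    resolve_upload_path(self.BASE, attempt)

    def test_legitimate_paths_still_work(self):
        self.assertEqual(
            resolve_upload_path(self.BASE, "img/logo.png"),
            "/srv/uploads/img/logo.png",
        )
```

Saved as a test module, this can run on every check-in with `python -m unittest`, and the same test class can be copied to other projects that expose a similar function.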
Penetration-testers should take special notice that my CPSL process does not include any operations or maintenance testing. All of the testing is done before quality testers (or developer-testers) even get to begin system integration or functional testing. This type of security testing is suggested to be done very early in the process, which follows similar guidelines as the SDL, CLASP, and Touchpoints processes suggest.
The benefits and drawbacks of open-source software
There are some that may complain about my itemized suggestions based on a limited budget. For those situations, open-source software can be used: e.g. Fujaba instead of Klocwork K7, NASA’s Software Assurance Technology Center (SATC) Automated Requirement Tool (ARM 2.1) instead of IBM Rational RequisitePro, and Trac instead of Atlassian Crucible. If you spent any time reading my last blog entry on 2007 Security Testing tools in review, then you’ll find gems such as PMD SQLi and FindBugs as reference secure static code analyzers (as well as the many mentioned for PHP, ASP, and Java web applications), plus countless open-source fuzzers and fault-injectors.
As for defining a secure software process for open-source software projects, many of these are integrated or bundled with commercial software. Which brings me to a few points. First of all, commercial software developers should be testing third-party components in addition to their own code — anything that gets built on the build-server should go through the same checks, imported or not. Bugs will get found and fixed in open-source projects through this sort of effort, in addition to open-source projects that operate under my CPSL or other secure process. As a final point, it’s no longer theoretical that “the world can review open-source” thanks to efforts such as BUGLE: Google Based Secure Code Review.
Software security assurance: Predictions for 2008
One of my predictions for 2008 is that we’ll start to see individuals and companies that have invested in penetration-testing skills move towards awareness and compliance. The shift will in part be due to security testing moving to a place earlier in the development lifecycle, with “penetration-style” security testing tools being replaced with “secure software process friendly” tools. Many new tools for secure software process models will evolve from existing workflow management and design inspection development tools. Classic, gray hat “penetration-tester” tools such as automated fault-injectors and fuzzers will become Ant tasks on a build-server. Security testing, if pushed early in the life cycle, will actually improve code quality — causing less spending on quality testing at the cost of more time/dollars spent on developer-testing.
Do not let all of this confuse you into thinking there isn’t room for major improvements to secure software processes, security testing tools, or other security research. It’s just a simple re-focusing of where, who, and when security testing is done. This paradigm shift will allow initiatives like Build Security In, CAPEC, and CWE to really take off. New projects that concentrate on measuring and understanding false positives are already in larval stages. Combining data from CAPEC into other projects such as the WASC Threat Classifications (in a similar way that the OWASP T10-2007 used CWE data) will lead to new attack patterns and ways of understanding current attack patterns. Maturity of CWE and CVE data will drive results for CWE-Compatible tools and services to lead into CWE-Effective equivalents.
By allowing developers “in” on the security industry’s closely-guarded and well-kept secrets, we’ll be able to protect applications in ways we have never done in the past. Secure frameworks such as HDIV will continue to improve, possibly to the point where security testing isn’t necessary for a large majority of attack paths and security weaknesses. Exploitation countermeasures based on AI might move into applications to prevent a large number of exceptions such as those explored during penetration-testing efforts. At the very least we’ll start to see distributed applications log out users automatically or disable accounts that attempt automated fault-injection, potential fraud, or other unwanted attacks. It’s possible that you’ll even make a friend on a development team, or maybe even become a full-time “security developer” yourself. There will always be room for pen-tester artisans in the wild world of computer science and software engineering.

Hi there. It seems to me that you are mixing various concepts. For example, application pen testing is different from network pen testing (e.g., in network pen testing there is no source code analysis approach). (Many things you say appear in an address Dan Geer gave at the ACSAC conference a few years ago; however, he makes the right distinction between the different forms of penetration testing.)
But more importantly, I’d like to remark that penetration testing is an assessment method. Its objectives and results are aligned to give those responsible for the security of the test’s target a snapshot of the target’s security, possibly along with remediation recommendations.
On the other hand, you cannot say that penetration testing is an art. This is the same as saying that engineering is an art because a few kids build sand castles without careful design and implementation. One could say that most practitioners don’t have a systematic approach based on careful studies, but this is of little importance. (Notice that we can say the same for most of the field of security, including crypto designs, e.g., AES, SHA-1, DH.) When houses and bridges were built by craftsmen, with no engineering involved, their work wasn’t called art. It wasn’t. This is regardless of their beauty, Mr. Keats. Methinks that the stage of penetration testing with no design or careful analysis is ending, and we need a better grasp of what is important to understand about it, and how to do it. Systematic research on penetration testing has been brewing for the last few years, and it will get better. I am working on this, and I think that more will follow.
Some questions: how can we measure the coverage of a pen test? Or what tools do we have to analyze a pen test (a posteriori)? Or even to compare two pen tests executed against the same target?
Cheers,
Ariel
@Ariel:
Actually, application pen-testing and network pen-testing are the same thing. Software security penetration-testing could be combined with source-code to provide a hybrid approach directly, by using the information from source-code to improve the test cases for the software security penetration-testing.
Application penetration-testing could also utilize source code, in a way similar to how W3AF plans to use path-traversal or other predictable locations to identify the source and then use it for additional test cases. Since network pen-testing includes application pen-testing, it could use it in the same way. Similarly, application and network pen-testing do not have to be zero-knowledge, and could incorporate aspects of source-code and/or configurations, other customizations, and environmental factors (e.g. architectural threat-modeling) to improve test cases.
I understand the goal of pen-testing in an assessment process, but I am questioning the validity and usefulness of black-box assessments.
The crypto designs you mentioned were not designed as an art, but as a science. Most modern applications are not designed like crypto systems. Crypto systems are usually designed and tested using formal, semi-formal, or informal methods. Pen-testing (including manual or automatic, using fuzz testing or fault-injection - as well as other techniques) is neither formal, semi-formal, nor informal. It is “ad-hoc” and therefore an art because it has no predefined logical flow, no scientific input or output, and precludes the use of both proofs and arguments for correctness. I will get to the differences in more detail in a later blog post.
If you can point me towards systematic research that includes at least “arguments for correctness”, so that logical errors in testing can at least be described, then I will slightly concede your point. I’m working on this type of research with a team of individuals but it has just started and we haven’t released anything yet. I’m not at liberty to discuss it yet (another future blog post), but I can tell you that it has to do with benchmarking automated fault-injection scanners.
We can measure the coverage of a software or application penetration-test very easily using code coverage tools, however this does not provide a benchmark for the tool, as it doesn’t take into account false positives or false negatives, nor does it take into account configuration/customization or the tester behind the testing tool. Take a look at what I posted in response to Acidus on memestreams. I don’t want to say that this will be part of a later blog post, but hey - that’s one of my best answers right now because of my limited time in responding to your post here. I will try to respond to further questions if you have any more.
Hi again dre,
I should have described myself more clearly when differentiating application vs. network penetration testing. The difference I see is that in a network pen test most of the devices and programs you’ll find running are well known, most of these have previously been audited and there might be known vulnerabilities for some versions of these; on the other hand, custom applications require a different approach (I’m not saying that all applications are custom, but you naming fuzzers and application scanners seems to agree with this). In a network penetration test the focus is on assessing what an attacker could do (with the publicly known information, e.g., known vulnerabilities) and is typically restricted to exploits available to the pen tester; say, because developing an exploit for a binary vulnerability is much more difficult (these days) than finding an injection vulnerability and developing the exploit for it. So often, in a pen-test with a fixed duration a black box approach is the only possibility.
We have been doing research in pen testing for several years at Core Security Technologies. In their PacSec talk Ivan and Gera show several facts (http://pacsec.jp/psj03/en/2-3iarce-gera-PacSec-JP2003.ppt), say, they provided a methodology to analyze computer attacks and show that some “paths” are more efficient than others. This work stands on the model of Futoransky et al. (http://www.coresecurity.com/files/attachments/Futoransky_Notarfrancesco_Richarte_Sarraute_NetworkAttacks_2003.pdf)
Further, Tiscornia and Russ recently presented a generalization of the above model to applications and a very interesting implementation of some of their ideas in Hack.lu and PacSec. We continue this line of research as we think there is a lot to learn about penetration testing.
I must say that web application scanners have proved useful in certain situations. Yet black, grey and white analysis still throw loads of false positives and false negatives and require many hours to analyze a single application. At present, automated detection of vulnerabilities cannot replace manual work. Hence, the systematic analysis of these tools will help to improve them. But we need to get it right. You cite some of the first attempts on this in your answer to Acidus. Some of these are closely related to the work in software analysis; and we needn’t reinvent the wheel (e.g., the many known problems with coverage and heuristics that attack these problems). Personally, I still do not see the relevance of some of these measures. I am convinced that user intervention is necessary and therefore prefer to concentrate on helping these users with their work (e.g., replay information for all alarms to eliminate false positives or even automated exploit construction when possible).
Two last bits:
-Going back to the “art” argument and considering the dozens of available tools that are used for assessing the security of a web application I think that there is little art in them (and the underlying pen tests).
-What I meant by the crypto designs I mentioned is that there is no security proof accompanying them. They are only believed to be secure because of their untarnished reputation (and maybe some sound but insufficient arguments).
Cheers
@ Ariel: all very good points.
I want to see software assurance (SwA) tools take CAPEC/WASC and CWE/OWASP-T10 as input or send as output. Net/app pen-testing tools can take CVE as input or send as output. CVE is known vulnerabilities while CWE is known software weaknesses and CAPEC is known attack-paths.
Some pen-test scanning tools already combine these concepts (e.g. ones made by CORE, ImmunitySec, Rapid7, Foundstone), while others will be adding SwA support very soon (Qualys, W3AF, Tenable, nCircle). I’m not sure we’re going to see CWE-Compatible tools add support for net/app pen-testing. Similarly, I don’t think that CVE-based scanners are ever going to be CWE-Compatible or CWE-Effective.
Thanks for the discussion and pointers towards those documents. I agree that net/app pen-testing certainly have elements of science (also: I didn’t intentionally mean to mix the concepts). However, I was referring to SwA pen-testing (fuzzing or fault-injection), which are largely ad-hoc testing with little to no informal arguments or proofs. Just as you said, even crypto systems rarely undergo any formal, semi-formal, or informal method specification or testing.
As for the automated vs. manual testing: I agree. You’ll see in my CPSL above that Fagan inspection, manual code review, and manual verification of exploitables are required. What differentiates my CPSL from any other secure lifecycle is that the checks are turned into unit tests - which provides a process improvement sans type I/II errors.
I think to some degree - anything can be automated, even if you’re just using software to improve or speed-up workflow. This is why I listed some tools such as Klocwork K7 and Atlassian Crucible. Crucible doesn’t make it so that Fagan inspection removes the manual work, but rather makes the process more formal and improves the time to deliver results.
As for heuristics and coverage in fuzz or fault-injection testing tools, you are correct that not all measures are sound (e.g. protocol informatics), but certainly others (e.g. GA’s with coverage in EFS) provide value. At least, that’s what I got from Charlie Miller’s Real World Fuzzing presentation at Toorcon 9. There are certainly many limitations to the EFS approach, but these are well-known. Commercial fuzzers such as Codenomicon and MuSecurity borrow from these GA/cov and PI heuristics, respectively. Of course, BreakingPoint Systems and beSTORM have their own innovations, mostly around intelligent fault detection instead of heuristics and coverage.
Jared DeMott, who wrote EFS, is working on benchmarking (as stated in his BH-US-07 talk) as well as a book with Ari Takanen and Charles Miller (due in 2008). Gadi Evron and Noam Rathaus were also supposed to release a book on “Open Source Fuzzing Tools” that hasn’t made it out yet. Other benchmarking work is also being performed by the open reverse benchmarking project, which was started by Tom Stracener.
The next iteration of these types of tools is clearly going to include informal testing methods, although how far we go with formal methods is as yet unknown. From reading recent literature on ATP, I remain skeptical about formal methods. However, one cannot argue with the results of Coverity (i.e. model-checking) and Fortify (or Ounce, GrammaTech, or Armorize) in the static analysis space. This is why I recommend both approaches: model-checking or static analysis to handle Type I errors (i.e. false positives), and fuzz/fault-injection testing to handle Type II errors (i.e. false negatives), both backed by manual verification processes. Results can then be used to generate reusable unit tests that continuously prevent security-related defects.
I would like to see more formal/semi-formal/informal methods in secure design review tools and processes. This area is open for the most research today, and could provide huge wins. The problem is that terminology and approaches are widely varied: MITRE has CAPEC, Microsoft has their STRIDE model and TAM tool (and now a TAM-E tool for Enterprises), Octotrike has the Trike threat-modeling methodology (and Octotrike tool) and the Privilege-Centric Security Analysis, CMU has OCTAVE, and BSI/DHS/Cigital has Architectural Risk Analysis. Then there are several other methods that are derived from rating/scoring systems such as Microsoft DREAD, FIRST’s CVSS[2], Wysopal’s work on combining CWE with CVSS2, and HP/SPI-Dynamics’ Web Application Risk Modeling (W.A.R.M.). If further research is done in this area to turn it into more of a science, then this can be easily passed down to security testing by designing/enhancing the test cases.
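As a rough illustration of how Type I and Type II errors could be counted when two kinds of tools are scored against manually verified results, here is a small sketch. The `Finding` type, the file locations, and every data point are hypothetical; the only thing it shows is the set arithmetic.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Finding:
    cwe: str       # weakness class, e.g. "CWE-89"
    location: str  # hypothetical "file:line" identifier


def score(reported: Set[Finding], verified: Set[Finding]) -> Dict[str, float]:
    """Compare a tool's report against the manually verified set of real defects."""
    true_pos = len(reported & verified)
    return {
        "type_1": len(reported - verified),  # false positives: reported but not real
        "type_2": len(verified - reported),  # false negatives: real but not reported
        "precision": true_pos / len(reported) if reported else 0.0,
        "recall": true_pos / len(verified) if verified else 0.0,
    }


# Made-up findings for illustration only.
sqli = Finding("CWE-89", "login.py:42")
xss = Finding("CWE-79", "search.py:10")
overflow = Finding("CWE-120", "parser.c:88")
traversal = Finding("CWE-22", "files.py:7")
bogus = Finding("CWE-89", "report.py:3")

VERIFIED = {sqli, xss, overflow, traversal}  # confirmed by manual verification
STATIC_REPORT = {sqli, xss, bogus}           # hypothetical static analyzer output
FUZZ_REPORT = {xss, overflow}                # hypothetical fuzzer output

static_score = score(STATIC_REPORT, VERIFIED)
fuzz_score = score(FUZZ_REPORT, VERIFIED)
combined_score = score(STATIC_REPORT | FUZZ_REPORT, VERIFIED)
```

With this made-up data, each tool alone misses two verified defects, while the union of both reports misses one. The single false positive still has to be removed by hand, which is why manual verification backs both approaches.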
@ dre:
Great. This work you mention can only benefit us (and many thanks for the pointers!).
It seems to me that webapp testing tools comprising fuzzers et al. are still young and have a lot ahead of them, static analysis tools are further along (they build on several years of work), and these two are catching up with security requirements. As you say, we are becoming more competent at specifying these requirements and describing threats. It is only sensible that these two meet.
We are becoming better at assessing certain security properties. For these, tests are necessary. As you mention, we are combining these tests intelligently. Good!
However, we are faced with a problem that is very difficult to solve: designing a good security model. We can build tests for certain known attacks, but we aim to prove that our systems escape all attacks. Even if we got automated theorem proving right for a security model (which I don’t think is feasible), we’d still need to come up with the right models. For example, crypto security models (e.g., Canetti, Dolev-Dwork-Naor, …) do not involve side channels (e.g., a system can be called secure under a security model and still be vulnerable to a timing attack).
Nobody said it would be easy ;)
@ Ariel: Let me give the pen-test vernacular one more try.
In information security, pen-testing refers to breaking into specific networks or computers (i.e. Application or Network Pen-Testing). In vulnerability research or software assurance, it refers to finding security-related bugs in software (i.e. Software Pen-Testing, Cryptanalysis).
> However, we are faced with a problem that is very difficult to solve: designing a good security model
For systems, we have learned to mold concepts together. For example: Samhain, chkrootkit, rkhunter, the99lb, DieHard, PaX, chroot, improvements to GCC (userland) + St. Jude / St. Michael, DSI/DigSig, grsecurity (kernel) + protection from virtualization rootkits + detection of network covert channels + managed SIEM, etc. No formal methods here, but we could start by eliciting their designs, proving some of them, and model-checking their sources.
The same is not yet true for applications, which suffer from problems similar to those that systems have gone through (with very few sandboxes, trusted paths, logging, thresholding, monitoring, etc). Jeff Williams is working on a project called ESAPI which works to solve some of these issues for web applications. Modern applications of all types need higher functional protections and assurance levels. OWASP is also starting a Browser Security Project to solve the “other” side of this problem.
For defining access control matrices: RBAC is better than closed DAC, which is better than open DAC, which is better than programmatic controls. An even better method would be to use declarative controls, whereby application components have access to different databases with varying levels of database user privilege depending on application user group access rights.
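A rough sketch of what I mean by declarative controls, in Python. The group names, database names, and account names below are all made up for illustration. The point is only that the mapping from application group to database privilege lives in one declared table rather than in `if` statements scattered through the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbRole:
    """A database account and the privileges granted to it (hypothetical)."""
    database: str
    db_user: str
    privileges: frozenset


# Declarative policy: application user group -> database account.
# All names here are illustrative assumptions, not a real schema.
POLICY = {
    "anonymous": DbRole("catalog", "app_ro", frozenset({"SELECT"})),
    "customer": DbRole("orders", "app_rw",
                       frozenset({"SELECT", "INSERT", "UPDATE"})),
    "admin": DbRole("orders", "app_admin",
                    frozenset({"SELECT", "INSERT", "UPDATE", "DELETE"})),
}


def role_for(group: str) -> DbRole:
    """Look up the declared role; undeclared groups are denied by default."""
    try:
        return POLICY[group]
    except KeyError:
        raise PermissionError(
            f"no database role declared for group {group!r}"
        ) from None


def is_allowed(group: str, operation: str) -> bool:
    """Check an operation against the declared privileges for a group."""
    return operation.upper() in role_for(group).privileges
```

In a real deployment the database itself would enforce these privileges through separate accounts and grants. The application-side table only picks which connection a component gets, so a bug in application logic cannot exceed what the database account allows.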
I checked out some of the crypto threat models you mentioned, and while they appear interesting for solving crypto design and crypto protocol implementation security models, I’ll have to analyze them further to see if there is any value to other components in secure software design review.
Clearly you know exactly what I’m referring to with all of this since your presentation on ND2DB is exactly the type of research in software assurance that needs to happen.
@ dre:
I understand the difference and I agree on your take. There’s a lot of good research to work over. All the constructs you mention have really helped to secure our systems.
Notice that there’s better analysis in securing systems than in finding bugs or breaking them. It seems to me that one needs to be proficient in the latter to master the former.
I didn’t mean that the crypto models I mentioned could apply to a broader scope. In fact, I don’t think so. Also, they are aimed at proving security properties of systems, but not at finding problems in them.
Cheers,
Ariel
Yikes. Don’t tell the client. They like stuff to be scientific and repeatable. The art is quite lost on them. This is why I don’t like my penetration tests to be that “organized”.
I always make sure coverage is good across all the security areas like authentication, access control, XSS, CSRF, etc., but complicated pen testing structures are only useful for those people who don’t get the art of hacking. I’m glad there are other people who see the intersection.
The recommendations, cleanup and the overall process of helping the client build more secure code are what should be demonstrably scientific. Fortunately, you can never lay out an “execution plan” for a great piece of art.
This is a discussion that people should have more as I think most clients fundamentally don’t understand what hackers do and how they think.
@ Arshan:
Often, I see penetration-testing or security companies or penetration-testing security vendors doing what I consider to be slightly more useful than commercial AV or overly expensive firewall appliances.
The presupposition of this thread is that pen-testing is dead / pen-testing doesn’t matter… whether software or network.
What I’m trying to say here is that a model like the CPSL can completely replace the need/desire to pen-test by security professionals — and for those people to take a hint and get out of the software assurance / security industry before they don’t have jobs just like the AV people.
If pen-testers still want to pretend that they are 17-year-old haxors and “legally break in” to Fortune 1000 companies, then they can do it under the auspices of compliance.
From what I understand, Aspect Security does both PCI review as well as “real” security review. So you should be familiar with the difference here.
> The recommendations, cleanup and the overall process of helping the client build more secure code are what should be demonstrably scientific. Fortunately, you can never lay out an “execution plan” for a great piece of art.
I’m not exactly sure what you mean here. If I hear you correctly, you’re saying that the only way to be scientific is to help customers build more secure code. This is part of what I’m trying to say here. As for an “execution plan”… it appears that both Ariel and I have given some pretty heavy plans for all angles of software assurance and penetration-testing (to include cryptanalysis). Did you miss those, because I can point you towards them?
> I think most clients fundamentally don’t understand what hackers do and how they think.
I think clients need to have a security program that takes into account both application and software security. In my post on Building a security plan, I discuss possible frameworks for building a security program, as well as risk analysis for applications/software, vulnerability management for application security, vulnerability theory to understand how hackers think (which provides more details than most hackers know how they think themselves), and software assurance practices, as well as how they relate to the customer’s customer.
I think you have a very narrow view of my approaches, although not quite as narrow as most of the rest of the industry, which usually disagrees outright with what I’m saying in this sort of anti-pen-test rant.
I guess I don’t know what to say about this without knowing who you were referring to:
> Often, I see penetration-testing or security companies or
> penetration-testing security vendors doing what I consider to
> be slightly more useful than commercial AV or overly expensive
> firewall appliances.
I haven’t interacted with all of the companies but I know that Aspect gives its clients an unbelievable amount of useful information on how to break and how to fix without FUD and useless false positives.
> This is part of what I’m trying to say here. As for an “execution
> plan”… it appears that both Ariel and I have given some pretty
> heavy plans for all angles of software assurance and
> penetration-testing (to include cryptanalysis). Did you miss
> those, because I can point you towards them?
Maybe I’m too romantic - I just don’t see someone who follows instructions well with an extremely detailed, multi-angle test plan doing as good a job as an experienced, organized hacker. Deviation, adaptation and creativity are what really allows attacks to happen. Too simple, maybe, but I think there’s truth in it.
> and for those people to take a hint and get out of the software
> assurance / security industry before they don’t have jobs just
> like the AV people.
Whoa, reverse-giving-it-back-to-the-security-vendors FUD! I like it. =P
> I’m not exactly sure what you mean here. If I hear you
> correctly, you’re saying that the only way to be scientific is to
> help customers build more secure code.
What I was saying is that metrics, assurance, SDLC improvement, security processes - that is where the repeatable, demonstrable science really can be.
Also, email me, I’ve got news about the browser group. Well, maybe not news. Just email me!
@ Arshan:
> Maybe I’m too romantic - I just don’t see someone who follows instructions well with an extremely detailed, multi-angle test plan doing as good a job as an experienced, organized hacker. Deviation, adaptation and creativity are what really allows attacks to happen. Too simple, maybe, but I think there’s truth in it
I think that experience is also best. It’s the brain behind the tools, not the tools that count most. However, there are good testing tools and bad testing tools. There are good test cases and bad test cases. There are good testing methodologies/processes and there are poor ones.
The CPSL guide mentioned specific tools, but it did not enumerate them in a list. I casually made references to them. You should also note that the CPSL is only informally documented, on this blog and nowhere else, in a very conversational style with excellent commentary such as Ariel’s and yours.
First of all, I want to say that “checklists save lives”. If they’re good enough for the most difficult triage in an ER, and good enough to keep people alive in ICUs, then I’m sure there is something to humans using good tools with good checklists.
MITRE CAPEC and CWE are our checklists. MITRE CWE Vulnerability Theory can turn a newbie into a professional with a little passion and some intense involvement in a weekend. Hand a Java programmer a copy of Secure Programming with Static Analysis, TAOSSA, and a bottle of tequila — then watch the vulns fly.
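Here is a small sketch of what using CWE as a checklist could look like in practice, in Python. The CWE identifiers are real entries, but the choice of these five and the `test_log` structure are just assumptions for the example. It does nothing more than report which checklist items an assessment has not touched yet.

```python
# A tiny slice of CWE used as a checklist. The IDs are real CWE entries;
# which ones belong on your list is an assumption for this example.
CHECKLIST = {
    "CWE-79": "Cross-site scripting",
    "CWE-89": "SQL injection",
    "CWE-284": "Improper access control",
    "CWE-287": "Improper authentication",
    "CWE-352": "Cross-site request forgery",
}


def coverage_gaps(checklist, test_log):
    """Return checklist IDs with no recorded test, ordered by CWE number."""
    tested = {entry["cwe"] for entry in test_log}
    missing = set(checklist) - tested
    return sorted(missing, key=lambda cwe_id: int(cwe_id.split("-")[1]))


# Hypothetical record of what a tester has exercised so far.
test_log = [
    {"cwe": "CWE-79", "target": "/search", "result": "finding"},
    {"cwe": "CWE-89", "target": "/login", "result": "clean"},
    {"cwe": "CWE-287", "target": "/login", "result": "clean"},
]

gaps = coverage_gaps(CHECKLIST, test_log)
```

The checklist does not replace the brain behind the tools. It just makes it obvious when the experienced hacker skipped access control and CSRF because something else was more fun.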
We’ve been doing non-standardized, ad-hoc testing for too long. What some people have created is a sort of movement towards a standard, informal method of security inspection/testing. We need some people using formal methods for security. We need lots of people performing secure SDLC work, preferably something closer to my CPSL than to the Microsoft SDL.
Of course we’ll still need new vulnerability research done by new vulnerability researchers. It should be a little different than in the past, and it already basically is if you look around our industry.
This isn’t a new argument; there has been controversy over software testing for at least 20 years now. I think exploratory testing is best done with all the knowledge possible and the best tools.
What security professionals are missing is Vulnerability Theory, and they need to start with a Glossary of Vulnerability Testing Terminology.
They’re also missing out on some of the best tools. When Eugene Kaspersky noticed that Peter Szor was using manual renaming of every variable in 1997, he introduced Peter to IDA Pro, which supported automated cross-references. Peter’s methodology was flawless, but he didn’t know or wasn’t using the latest in technology for structural analysis.
Right now all the best tools are language/bytecode-specific: the best example is .NET Reflector and how it rocks all over IDA Pro. However, there are even more choice examples that would affect the CPSL. For example, testing tools are now being built into frameworks, such as WebTest in Grails. Defenses are being baked into frameworks, such as HDIV. Certain techniques allow “after-the-fact” upgrades, such as AOP weaves for code instrumentation and dependency injection for design (only supported well in Spring MVC today).
I think I’m arguing against, “A fool with a tool is still a fool”, but I’m not sure yet.