Re: [svadev] safecode tests

From: John Criswell <criswell AT illinois.edu>
To: geremy condra <debatem1 AT gmail.com>
Cc: svadev AT cs.uiuc.edu
Subject: Re: [svadev] safecode tests
Date: Tue, 25 Oct 2011 16:57:34 -0500
List-archive: <http://lists.cs.uiuc.edu/pipermail/svadev>
List-id: <svadev.cs.uiuc.edu>
Organization: University of Illinois

On 10/25/11 1:20 PM, geremy condra wrote:

On Tue, Oct 25, 2011 at 8:58 AM, John Criswell <criswell AT illinois.edu> wrote:

On 10/24/11 6:32 PM, Matthew Wala wrote:

It looks like you've configured everything okay. As far as I know no
one's been running the tests in the mem_safety directory recently so
it's not surprising SAFECode isn't catching everything.

There are at least two reasons I can think of that a lot of the double
frees are going unnoticed. First, just going through the runtime
library source code, it seems that the function checkForBadFrees is
currently disabled in the debug runtime (I'm not sure why and I don't
know much about that part of the code, so something else might be
going on).

That's correct. Looking over the revision history, I found out why I
disabled it. Quoting from the commit log of Revision 136632:

"Do not report errors for bad frees; we need complete vs. incomplete
versions of pool_unregister() to report errors accurately."

The short answer is that several enhancements need to be made to get it to
work without generating false positives.

For the curious, there are two types of checks in SAFECode: incomplete and
complete. When SAFECode transforms a program one compilation unit at a time
within Clang, it does not know everything about the program (because it
can't do whole-program analysis), and so it inserts incomplete checks.
Incomplete checks try their best to detect memory safety errors, but
sometimes they conservatively allow operations to proceed if they can't find
the memory object in question in the lookup tables. The assumption is that
the memory object isn't registered because it was allocated by external
code.

SAFECode's version of libLTO (which is now available but not yet documented
in the Install Guide) will do whole-program analysis and convert incomplete
checks to complete checks when appropriate. It will use DSA to figure out
which pointers always point to memory objects allocated within the program
and which pointers can point to memory objects allocated by external library
code. Checks on the former pointers will be changed to complete checks;
these checks will raise a SAFECode run-time error if they can't find the
memory object to which the pointer points.
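
To make the incomplete/complete distinction concrete, here is a minimal toy model of a check on pointer arithmetic. It is only a sketch: every name in it (mem_object, register_object, find_object, check_gep) is hypothetical and is not SAFECode's runtime API, and a linear search over a fixed-size array stands in for whatever lookup structure the real runtime uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not SAFECode's API. */
typedef struct {
    uintptr_t start; /* address of the first byte of the object */
    size_t len;      /* size of the object in bytes */
} mem_object;

typedef enum { CHECK_PASS, CHECK_FAIL } check_result;

#define MAX_OBJECTS 16
static mem_object object_table[MAX_OBJECTS];
static size_t object_count = 0;

/* Record an object allocated by instrumented code. */
bool register_object(const void *p, size_t len) {
    assert(p != NULL && len > 0);
    if (object_count == MAX_OBJECTS)
        return false;
    object_table[object_count].start = (uintptr_t)p;
    object_table[object_count].len = len;
    object_count++;
    return true;
}

/* Find the registered object containing addr, or NULL if there is none. */
const mem_object *find_object(uintptr_t addr) {
    for (size_t i = 0; i < object_count; i++) {
        const mem_object *obj = &object_table[i];
        if (addr >= obj->start && addr - obj->start < obj->len)
            return obj;
    }
    return NULL;
}

/* Check pointer arithmetic, where dst was computed from src. */
check_result check_gep(const void *src, const void *dst, bool complete) {
    const mem_object *obj = find_object((uintptr_t)src);
    if (obj == NULL) {
        /* Unknown object: a complete check reports an error, while an
           incomplete check assumes external code allocated the object
           and lets the operation proceed. */
        return complete ? CHECK_FAIL : CHECK_PASS;
    }
    /* Known object: the result must stay inside it in either mode. */
    uintptr_t d = (uintptr_t)dst;
    bool in_bounds = d >= obj->start && d - obj->start < obj->len;
    return in_bounds ? CHECK_PASS : CHECK_FAIL;
}
```

In this model, an overflow out of a registered buffer fails in both modes, which fits the observation that incomplete checks can still catch buffer overflows. The two modes differ only when the source pointer is not in the table. Note that the toy bounds test also rejects a one-past-the-end pointer, a simplification made for brevity.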

So, there are three issues here:

1) We need to have complete and incomplete versions of the checks for
invalid frees. Real programs were flagging false positives because
checkForBadFrees() always acted like a complete check.

2) We need to get libLTO polished and ready. Many of SAFECode's checks just
aren't valuable without it.

3) We should modify SAFECode's libLTO to transform the program to use the
automatic pool allocation memory allocator. This allocator is tolerant of
invalid frees (i.e., it will detect them and ignore them).
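
As a rough illustration of issue 1, the sketch below models a free check that comes in both flavors. It is an assumption-laden toy, not the actual checkForBadFrees() or pool_unregister() code: the names track_alloc, untrack_alloc, and check_free are hypothetical, and the check only returns a decision rather than calling free() itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not SAFECode's runtime functions. */
typedef enum { FREE_ALLOWED, FREE_REPORTED } free_decision;

#define MAX_LIVE 16
static const void *live_allocs[MAX_LIVE];
static size_t live_count = 0;

/* Record a pointer returned by an instrumented allocation. */
bool track_alloc(const void *p) {
    assert(p != NULL);
    if (live_count == MAX_LIVE)
        return false;
    live_allocs[live_count++] = p;
    return true;
}

/* Remove p from the live set; returns false if it was not there. */
bool untrack_alloc(const void *p) {
    for (size_t i = 0; i < live_count; i++) {
        if (live_allocs[i] == p) {
            /* Fill the hole with the last entry. */
            live_allocs[i] = live_allocs[--live_count];
            return true;
        }
    }
    return false;
}

/* Decide whether a call to free(p) should be allowed to go ahead. */
free_decision check_free(const void *p, bool complete) {
    if (untrack_alloc(p))
        return FREE_ALLOWED; /* known live object: a valid free */
    /* Unknown pointer: either memory from external code or an invalid
       free. Only a complete check can safely report it. */
    return complete ? FREE_REPORTED : FREE_ALLOWED;
}
```

In this model the complete variant reports the second free of the same pointer, while the incomplete variant lets it through because it cannot tell a double free from a pointer allocated by external code. A check that always behaves like the complete variant would flag externally allocated pointers, which is the false-positive problem described in issue 1.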

Very interesting, thanks for the information. Can you elaborate on
what other checks should or do need whole-program analysis?
Information on how to take advantage of the modifications you've made
to libLTO (apologies for my noobness) would also be very helpful.

SAFECode performs three kinds of checks: checks on loads and stores, checks on GEPs (pointer arithmetic), and checks on indirect function calls. The first two checks require whole-program analysis in order to determine whether all memory objects to which the pointer points are allocated within and manipulated by the instrumented program. If they are, then SAFECode knows that any errors it finds are real errors. Otherwise, an error might be due to a pointer that was passed in from external code which SAFECode has not analyzed and instrumented.

Indirect function call checks are pretty similar. Without the whole program, you can't compute an accurate and complete call-graph.

Whole-program analysis is useful for optimizations, too. With automatic pool allocation and DSA's type inference capability, we can remove run-time checks on loads and stores (which doesn't break sound analysis; you can read about that in the PLDI 2006 paper). When we have the whole program, we can change checks that use splay-tree lookups into checks that do no lookup (this cannot be done when a check is applied to a pointer from a global variable defined in another compilation unit).

To make SAFECode usable, we split it up into two components: a conservative set of passes in Clang that insert incomplete checks and a set of whole-program transforms that modify/optimize the checks within libLTO. If you don't use libLTO, some of the checks become weaker (load/store checks and indirect function call checks do very little), but you can still catch quite a few errors (practically any buffer overflow error). Many or all of the bugs we caught in the Linux kernel (the SOSP 2007 paper) were caught with incomplete GEP checks.

For publications on SAFECode and its techniques, you can take a look at our publications page at http://sva.cs.illinois.edu/pubs.html. You may also be interested in the Memory Safety Menagerie (http://sva.cs.illinois.edu/menagerie/) which catalogs various memory safety techniques.

The SAFECode libLTO will probably work on most programs and is nearly ready to go. It's just too slow on one of our test cases (OpenSSH ssh client), and so I've been a little reluctant to suggest that people use it. You can find it in safecode/tools/LTO, and to install it, just follow the directions for regular libLTO for Linux (http://llvm.org/docs/GoldPlugin.html) or copy it into /usr/lib on Mac OS X (just be sure to back up the old /usr/lib/libLTO.dylib!).

Also a number of those tests use indirect function calls to
free(), which I think SAFECode doesn't handle....

This is a good point. There was a transform pass called
RaiseAllocationsPass that ensured that all calls to malloc() and free() were
direct calls. I don't recall if this was an LLVM pass or Poolalloc pass,
but we should get it working again with LLVM 3.0.
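
For readers unfamiliar with the pattern, the snippet below shows what an indirect call to free() looks like next to a direct one. It is only an illustration with hypothetical helper names (release_indirect, release_direct); it is not taken from the mem_safety tests and does not show how RaiseAllocationsPass itself works.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*free_fn)(void *);

/* Hypothetical helper: frees p through a function pointer, so the call
   site does not name free() directly. */
void release_indirect(free_fn deallocate, void *p) {
    assert(deallocate != NULL);
    deallocate(p); /* indirect call: the callee is not visible here */
}

/* The same operation written as a direct call to free(). */
void release_direct(void *p) {
    free(p);
}
```

Calling release_indirect(free, p) deallocates p just as release_direct(p) does, but an instrumentation pass that only looks for call sites naming free() would see only the second form.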

Time to go file some bug reports...

Is it possible to get links to these? I'd like to try to follow along
at home, if it isn't a problem.

Sure! All SAFECode bugs are in the LLVM Bug Database. These particular bugs are PR#11230 (http://llvm.org/bugs/show_bug.cgi?id=11230) and PR#11231 (http://llvm.org/bugs/show_bug.cgi?id=11231).

-- John T.

Thanks again!
Geremy Condra
