svadev AT lists.siebelschool.illinois.edu

Subject: Svadev mailing list

List archive

Re: [svadev] Documentation

From: Jean-Baptiste Tristan <jean.baptiste.tristan AT gmail.com>
To: John Criswell <criswell AT uiuc.edu>
Cc: "svadev AT cs.uiuc.edu" <svadev AT cs.uiuc.edu>
Subject: Re: [svadev] Documentation
Date: Wed, 10 Mar 2010 17:18:55 -0500
List-archive: <http://lists.cs.uiuc.edu/pipermail/svadev>
List-id: <svadev.cs.uiuc.edu>

On Wed, Mar 10, 2010 at 4:07 PM, Jean-Baptiste Tristan
<jean.baptiste.tristan AT gmail.com>
wrote:
> On Wed, Mar 10, 2010 at 2:48 PM, John Criswell
> <criswell AT uiuc.edu>
> wrote:
>> Jean-Baptiste Tristan wrote:
>>>
>>> I was expecting to be able to apply SAFEcode on the example in figure
>>> 2 a of the PLDI 06
>>> paper and match the result (in figure 4) with what I obtain.
>>>
>>
>> Note that SAFECode in the PLDI 2006 publication used automatic pool
>> allocation. Using the default options to sc, automatic pool allocation is
>> not used, so you will only see one pool.
>>
>> Automatic pool allocation hasn't been used in awhile, so it is experiencing
>> some bitrot issues. Improving pool allocation's quality of implementation
>> is on our TODO list.
>>>
>>> With the input code
>>>
>>> int **x, *y, *z, ***w, u;
>>> x = (int **) malloc(4);
>>> y = (int *) malloc (4);
>>> z = (int *) malloc(4);
>>>
>>> *x = y;
>>> *y = 5;
>>> free(z);
>>> *z = 10;
>>>
>>> u = *z;
>>> w = (int ***) z;
>>> *w = x;
>>>
>>> return 0;
>>>
>>>
>>>
>>> SAFEcode gives me :
>>>
>>>
>>> call void @__sc_dbg_poolargvregister(i32 %argc, i8** %argv)
>>> %sizetmp = mul i32 1, 4 ; <i32> [#uses=2]
>>> %call29 = call i8* @__sc_dbg_src_poolalloc(%PoolDescriptor*
>>> @__poolalloc_GlobalPool, i32 %sizetmp, i32 0, i8* getelementptr
>>> inbounds ([5 x i8]* @sourcefile, i32 0, i32 0), i32 1) ; <i8*>
>>> [#uses=2]
>>> %call2.casted = bitcast i8* %call29 to i8* ; <i8*> [#uses=1]
>>> %__poolalloc_GlobalPool.casted8 = bitcast %PoolDescriptor*
>>> @__poolalloc_GlobalPool to i8* ; <i8*> [#uses=1]
>>> call void @__sc_dbg_src_poolregister(i8*
>>> %__poolalloc_GlobalPool.casted8, i8* %call2.casted, i32 %sizetmp, i32
>>> 6, i8* getelementptr inbounds ([5 x i8]* @sourcefile7, i32 0, i32 0),
>>> i32 7)
>>> %0 = bitcast i8* %call29 to i8* ; <i8*> [#uses=1]
>>> %conv = bitcast i8* %0 to i32** ; <i32**> [#uses=2]
>>> %sizetmp3 = mul i32 1, 4 ; <i32> [#uses=2]
>>> %call1410 = call i8* @__sc_dbg_src_poolalloc(%PoolDescriptor*
>>> @__poolalloc_GlobalPool, i32 %sizetmp3, i32 1, i8* getelementptr
>>> inbounds ([5 x i8]* @sourcefile2, i32 0, i32 0), i32 2) ; <i8*>
>>> [#uses=2]
>>> %call14.casted = bitcast i8* %call1410 to i8* ; <i8*> [#uses=1]
>>> %__poolalloc_GlobalPool.casted7 = bitcast %PoolDescriptor*
>>> @__poolalloc_GlobalPool to i8* ; <i8*> [#uses=1]
>>> call void @__sc_dbg_src_poolregister(i8*
>>> %__poolalloc_GlobalPool.casted7, i8* %call14.casted, i32 %sizetmp3,
>>> i32 5, i8* getelementptr inbounds ([5 x i8]* @sourcefile6, i32 0, i32
>>> 0), i32 6)
>>> %1 = bitcast i8* %call1410 to i8* ; <i8*> [#uses=1]
>>> %conv2 = bitcast i8* %1 to i32* ; <i32*> [#uses=2]
>>> %sizetmp5 = mul i32 1, 4 ; <i32> [#uses=2]
>>> %call3611 = call i8* @__sc_dbg_src_poolalloc(%PoolDescriptor*
>>> @__poolalloc_GlobalPool, i32 %sizetmp5, i32 2, i8* getelementptr
>>> inbounds ([5 x i8]* @sourcefile3, i32 0, i32 0), i32 3) ; <i8*>
>>> [#uses=3]
>>> %call36.casted = bitcast i8* %call3611 to i8* ; <i8*> [#uses=1]
>>> %__poolalloc_GlobalPool.casted = bitcast %PoolDescriptor*
>>> @__poolalloc_GlobalPool to i8* ; <i8*> [#uses=1]
>>> call void @__sc_dbg_src_poolregister(i8*
>>> %__poolalloc_GlobalPool.casted, i8* %call36.casted, i32 %sizetmp5, i32
>>> 4, i8* getelementptr inbounds ([5 x i8]* @sourcefile5, i32 0, i32 0),
>>> i32 5)
>>> %2 = bitcast i8* %call3611 to i8* ; <i8*> [#uses=1]
>>> %conv4 = bitcast i8* %2 to i32* ; <i32*> [#uses=4]
>>> store i32* %conv2, i32** %conv
>>> store i32 5, i32* %conv2
>>> %conv8 = bitcast i32* %conv4 to i8* ; <i8*> [#uses=1]
>>> call void @__sc_dbg_poolunregister_debug(i8* bitcast
>>> (%PoolDescriptor* @__poolalloc_GlobalPool to i8*), i8* %call3611, i32
>>> 7, i8* getelementptr inbounds ([5 x i8]* @sourcefile8, i32 0, i32 0),
>>> i32 8)
>>> call void @__sc_dbg_src_poolfree(%PoolDescriptor*
>>> @__poolalloc_GlobalPool, i8* %conv8, i32 3, i8* getelementptr inbounds
>>> ([5 x i8]* @sourcefile4, i32 0, i32 0), i32 4)
>>> store i32 10, i32* %conv4
>>> %tmp11 = load i32* %conv4 ; <i32> [#uses=0]
>>> %conv13 = bitcast i32* %conv4 to i32*** ; <i32***> [#uses=1]
>>> store i32** %conv, i32*** %conv13
>>> ret i32 0
>>>
>>>
>>> And I have difficulties to find the pool information that may be used
>>> to perform type verification
>>> or the poolinit, pooldestroy operations.
>>>
>>
>> The pool is explicitly passed into the poolalloc and poolfree calls (named
>> __sc_dbg_src_poolalloc and __sc_dbg_src_poolfree, respectively). I do not
>> see the call to poolinit(); I believe calls to poolinit() for global (i.e.,
>> context insensitive) pools are done in a constructor function instead of in
>> the main() function (because C++ global variables may have constructors
>> that
>> are called before main(), so the pools that they use must be initialized
>> before main() too).
>
> Yes, OK.
>
> %PoolDescriptor = type [92 x i8*]
> @__poolalloc_GlobalPool = global %PoolDescriptor zeroinitializer ;
> <%PoolDescriptor*> [#uses=9]
>
> Having only one pool doesn't change the safety guarantees, it only
> results in more runtime checks
> and an increased memory consumption that what would be needed with a
> better pool allocation. Is that right ?
>

Hmmm... Please let me try to refine this... If there's only one pool,
then it is more likely
that it will not be homogeneous, and therefore will require
runtime-checks for memory accesses.

>> If automatic pool allocation would have been used, then each allocation
>> would have ended up in a different pool (unless DSA thinks they may alias).
>>
>> The SAFECode transform does not annotate the LLVM bitcode to link pools to
>> the types inferred for that pool by DSA. Instead, that information is kept
>> in memory by the DSA/Pool Allocation passes; the SAFECode transform passes
>> query DSA/Pool Allocation whenever they need to know the type and pool
>> information of a pointer.
>>
>> Writing a pass that annotates pools with the DSA-inferred type of the
>> objects within the pool would probably be a straightforward pass to write.
>
> That's a good news. I think it would be nice to add this piece of
> information
> so that type checking can be done completely independently of the
> SAFEcode tools and
> allowing some kind of proof-carying code.
>
>
>>
>> -- John T.
>>
>>> -John
>>>
>>>
>>> On Tue, Mar 9, 2010 at 7:09 PM, Adve, Vikram Sadanand
>>> <vadve AT illinois.edu>
>>> wrote:
>>>
>>>>
>>>> Jean,
>>>>
>>>> FYI, the type encoding from the PLDI06 paper hasn't been tested much
>>>> since the SOSP 07 paper. DSA and Poolalloc (and LLVM) have evolved
>>>> since then so the old code probably doesn't work.
>>>>
>>>> What exactly do you want with the code? I think the PLDI paper is
>>>> more reliable, anyway, though much less detailed.
>>>>
>>>> --Vikram
>>>> http://llvm.org/~vadve
>>>>
>>>>
>>>> On Mar 9, 2010, at 3:24 PM, "John Criswell"
>>>> <criswell AT uiuc.edu>
>>>> wrote:
>>>>
>>>>
>>>>>
>>>>> Jean-Baptiste Tristan wrote:
>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have transformed a piece of code using sc, and the resulting code
>>>>>> has a few extra function declarations such as:
>>>>>>
>>>>>> declare void @sc.lscheck(i8*, i8*)
>>>>>>
>>>>>> Is there any documentation that could help me understand what those
>>>>>> functions do ?
>>>>>>
>>>>>>
>>>>>
>>>>> There isn't documentation per se, but the comments in
>>>>> runtime/DebugRuntime might be able to help explain what these
>>>>> functions
>>>>> do. You might also want to consult the SOSP 2007 paper on SVA; it
>>>>> might
>>>>> describe the run-time checks more clearly and is closer to the current
>>>>> SAFECode implementation than the PLDI paper.
>>>>>
>>>>> Briefly, the run-time functions that SAFECode inserts are:
>>>>>
>>>>> 1) sc.lscheck - Load/store check. This is the poolcheck() function
>>>>> from
>>>>> the PLDI 2006 publication.
>>>>>
>>>>> 2) sc.poolcheckalign - Load/store alignment check. Ensures that the
>>>>> pointer points to a valid object and is of the correct alignment
>>>>>
>>>>> 3) sc.boundscheck[ui] - Ensure that an indexing operation (a GEP) does
>>>>> not create an out of bounds pointer. This check was introduced, I
>>>>> believe, in the ICSE publication.
>>>>>
>>>>> 4) pool_register/pool_unregister - These functions tell the run-time
>>>>> when objects have been allocated or deallocated. There are now
>>>>> separate
>>>>> versions for global, stack, and heap objects so that the runtime can
>>>>> treat them differently.
>>>>>
>>>>> 5) pool_shadow/pool_unshadow - These functions implement the MMU
>>>>> remapping for danging pointer detection. This technique is
>>>>> described in
>>>>> the DSN paper. As dangling pointer *detection* (as opposed to
>>>>> *protection*) is disabled by default, you probably won't see any calls
>>>>> to these functions.
>>>>>
>>>>> Note that there are alternative versions of these functions that
>>>>> take a
>>>>> tag and source filename/line number information. These are used for
>>>>> the
>>>>> SAFECode debugging tool.
>>>>>
>>>>>
>>>>>>
>>>>>> Some of those functions have a name that is close to the name of some
>>>>>> of the statements
>>>>>> or expressions used in the type system described in the paper
>>>>>> "SAFEcode: Enforcing Alias Analysis
>>>>>> for Weakly Typed Languages". Do you use those functions to "encode"
>>>>>> the type system described in
>>>>>> this paper into regular LLVM code ?
>>>>>>
>>>>>>
>>>>>
>>>>> I don't believe they "encode" the type system; rather, they implement
>>>>> the run-time checks required to provide the memory safety guarantees
>>>>> described in the PLDI and ICSE papers. The typing information, I
>>>>> believe, is implicit in the pools used in the calls to poolalloc and
>>>>> the
>>>>> above run-time checks (when automatic pool allocation is enabled;
>>>>> currently, SAFECode disables automatic pool allocation and uses a
>>>>> simpler transform which puts all heap objects into a single pool; it
>>>>> adjusts the points-to analysis results to reflect the fact that
>>>>> everything is going into a global pool).
>>>>>
>>>>> There is code to encode the points-to analysis results in the LLVM
>>>>> bitcode. That is the code in the safecode/lib/ProofWrap directory.
>>>>> Dinakar wrote it for the SOSP 2007 paper. I have never used it, so I
>>>>> can't comment on its functionality at present. I may look into it at
>>>>> some point, but that depends on work priorities.
>>>>>
>>>>> -- John T.
>>>>>
>>>>>>
>>>>>> -John
>>>>>> _______________________________________________
>>>>>> svadev mailing list
>>>>>> svadev AT cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/svadev
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> svadev mailing list
>>>>> svadev AT cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/svadev
>>>>>
>>
>>
>

[svadev] Documentation, Jean-Baptiste Tristan, 03/09/2010
- Re: [svadev] Documentation, John Criswell, 03/09/2010
  - Re: [svadev] Documentation, Adve, Vikram Sadanand, 03/09/2010
    - Re: [svadev] Documentation, Jean-Baptiste Tristan, 03/10/2010
      - Re: [svadev] Documentation, John Criswell, 03/10/2010
        
        Re: [svadev] Documentation, Jean-Baptiste Tristan, 03/10/2010
        
        Re: [svadev] Documentation, Jean-Baptiste Tristan, 03/10/2010
        
        Re: [svadev] Documentation, Adve, Vikram Sadanand, 03/10/2010
        
        Re: [svadev] Documentation, John Criswell, 03/11/2010
        
        Re: [svadev] Documentation, Adve, Vikram Sadanand, 03/11/2010