[svadev] run-time check clarification

From: Daniel Huang <dehuang AT fas.harvard.edu>
To: "<svadev AT cs.illinois.edu>" <svadev AT cs.illinois.edu>
Subject: [svadev] run-time check clarification
Date: Thu, 31 Jan 2013 12:05:23 -0500
List-archive: <http://lists.cs.uiuc.edu/pipermail/svadev/>
List-id: <svadev.cs.uiuc.edu>

Hi Svadev,

I don't understand how to read some of the code generated by the SAFEcode compiler. Below is an excerpt of the emitted code.

struct.foo = { i32; float }

PDa: struct.foo

1 %call = call i8* @poolalloc([92 x i8*]* %PDa, 8)

2 %x1 = bitcast i8* %call to i32*

3 %PDa5 = bitcast [92 x i8*]* %PDa to i8*

4 call void @poolcheckui_debug(i8* %PDa5, i8* %call, ...)

5 store i32 42, i32* %x1

In line 1, we call poolalloc for 8 bytes of memory in pool PDa. Thus, I deduce that pool PDa holds objects of type struct.foo. In line 2, we cast the result of the allocation to an i32*. At this point, I'm confused. Why is this a bitcast instead of a getelementptr 0, 0, where the first 0 selects the first memory object and the second 0 selects the first element of the struct? I agree that, per LLVM semantics, this is "safe", but wouldn't we emit a getelementptr if I were accessing the second element of the struct instead of the first? In other words, why does SAFEcode special-case accesses to the first element of a struct?

Line 4 is also confusing to me. We call poolcheck, but I would have expected the run-time boundscheck. I understand poolcheck(pool, ptr) to mean: check that ptr points into pool at the correct alignment for the type stored in pool (in this case struct.foo). I understand boundscheck(pool, ptr, ptrO) to mean: check that ptr points into pool at the correct alignment for the type stored in pool, and additionally check that ptrO is at a correct offset from ptr, so that it points to a value of the type found at that offset within the type stored in the pool.

As an aside, don't all pool descriptors have type [92 x i8*]*? If so, why don't all SAFEcode library functions accept that type, instead of an i8*, in argument positions that expect a pool?
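The reading of poolcheck and boundscheck described above can be written down as a small toy model. This is only a sketch of my stated understanding, not SAFEcode's run-time library: the `Pool` class, its fields, the addresses, and the Python functions named `poolcheck` and `boundscheck` are all hypothetical, and "correct alignment" is modeled as "at the start of an element of the pool".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    # Hypothetical pool model; real SAFEcode pool descriptors are opaque.
    base: int             # address of the first object in the pool
    count: int            # number of objects the pool holds
    elem_size: int        # size in bytes of the pool's element type
    field_offsets: tuple  # byte offsets of the element type's fields

    def contains(self, addr):
        return self.base <= addr < self.base + self.count * self.elem_size


def poolcheck(pool, ptr):
    """True if ptr lies inside the pool and sits at the start of an element."""
    return pool.contains(ptr) and (ptr - pool.base) % pool.elem_size == 0


def boundscheck(pool, ptr, derived):
    """True if ptr passes poolcheck and derived is a field of that element."""
    return poolcheck(pool, ptr) and (derived - ptr) in pool.field_offsets


# A pool of four struct.foo = { i32; float } objects at an assumed address.
pda = Pool(base=0x1000, count=4, elem_size=8, field_offsets=(0, 4))

print(poolcheck(pda, 0x1008))            # start of the second element
print(poolcheck(pda, 0x1004))            # middle of the first element
print(boundscheck(pda, 0x1008, 0x100C))  # float field of the second element
```

Under this model, a pointer to the first field of an element passes poolcheck on its own, because it has the same address as the element itself. A pointer to any later field does not, and would need the three-argument check.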

I expected the code to look like this (I've just shown the delta):

2 %x1 = getelementptr %call i64 0, i32 0

2.5 %x1.casted = bitcast i32* %x1 to i8*

4 call void @boundscheckui_debug(i8* %PDa5, i8* %call, i8* %x1.casted, ....)

Given my understanding of the run-time checks, I would say that the emitted code will execute "safely". The only benefit of emitting the top code instead of the bottom one is that the top may be slightly faster, due to boundscheck being more expensive than a poolcheck. The bitcast vs. getelementptr should reduce to the same thing during code generation.
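The claim that the bitcast and a getelementptr 0, 0 come down to the same address can be checked outside LLVM by looking at field offsets. The sketch below uses Python's ctypes to mirror struct.foo; the class name `Foo`, the field names `x` and `y`, and the helper `field_address` are illustrative assumptions, and the layout is the one the host C ABI picks rather than anything specific to SAFEcode.

```python
import ctypes


class Foo(ctypes.Structure):
    # Mirrors struct.foo = { i32; float } from the excerpt.
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_float)]


def field_address(obj, name):
    """Address of a named field: the base address plus the field's offset."""
    return ctypes.addressof(obj) + getattr(type(obj), name).offset


foo = Foo()
base = ctypes.addressof(foo)

# Offset of each field relative to the start of the struct.
first_field_offset = field_address(foo, "x") - base
second_field_offset = field_address(foo, "y") - base
struct_size = ctypes.sizeof(Foo)

print(first_field_offset, second_field_offset, struct_size)
```

The first field sits at offset 0, so reinterpreting the struct pointer as an i32* yields the same address as indexing field 0. The second field sits at a nonzero offset, which is why accessing it needs real address arithmetic and not just a cast.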

If the explanation of the top code seems reasonable, I'll update my formalism of poolcheck to implicitly handle sub-typing. However, I would argue that the bitcast in line 2 really should be a getelementptr, since I expect bitcast mostly to be used because LLVM does not have parametric polymorphism, not to index into an aggregate data structure.

Any clarification would be appreciated. This is really important if I want as "complete" a verified type-checker as possible.

Regards,

Dan
