When Do I Save Caller Save Registers
Lecture 6: Checking for errors and calling functions
Where we left off last time, we could work with both numbers and booleans in our program. Unfortunately, we had no way of ensuring that we only worked with them consistently, as opposed to, say, applying an arithmetic operation to boolean values. Let's remedy that.
1 Checking for errors: Calling functions
Error handling is going to be a pervasive feature in our compiled output: we need to check that the arguments and returned values of every operation are valid. That sequence of checking instructions will appear at every location in the program where it's needed. But the error handling itself is identical everywhere: we want to show some kind of message that describes the error that occurred. So for example, we might want to check if a value is a number as follows:
  ...                        ;; get EAX to have the value we need
  test EAX, 0x00000001       ;; check only the tag bit of the value
  jnz error_not_number       ;; if the bit is set, go to some centralized error handler

error_not_number:
  ?????
(The test instruction performs a bitwise-and of its two arguments and discards the result, but sets the zero and sign flags to be used with conditional jumps. It's convenient in this case, though for more complex tag checks, we might need a cleverer sequence of assembly instructions, possibly involving a second register as a temporary value.)
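The same tag check can be written in C, which may help when reading values back in main.c. This is a minimal sketch, assuming (as the jnz above implies) that numbers carry a 0 in their lowest bit and everything else carries a 1; the names TAG_MASK and is_number are hypothetical and not part of the lecture's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The sketch assumes 32-bit machine words, matching EAX. */
static_assert(sizeof(uint32_t) == 4, "one machine word is four bytes");

/* Hypothetical name: the mask selecting only the tag bit. */
#define TAG_MASK 0x00000001u

/* Mirrors `test EAX, 0x00000001` followed by `jnz`: AND the value with
   the mask, throw the result away, and keep only "was it zero?". */
static bool is_number(uint32_t val) {
    return (val & TAG_MASK) == 0;
}
```

A value such as 0x00000004 passes the check, while any value with its lowest bit set, such as 0xFFFFFFFF, would be routed to the error handler.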
1.1 Caller- and callee-save registers
What code should we have at the error_not_number label? We'd like to be able to print some kind of error message when the arguments to an operation are invalid. But our language doesn't have any notion of strings yet, so we have no way of representing a message, let alone actually printing it. Fortunately, we have main.c available to us, and C does have strings and printing. Our goal should then be to define a new function in main.c and somehow call it from our compiled output.
(Note that this does not necessarily mean that we can call arbitrary functions from our source language, though it certainly does set the stage for us to do so later!)
To understand how to call a C function, we need to understand a bit about the C calling convention. The calling convention describes an agreement between the callers of functions and the callee functions on where to place arguments and return values so the other function can find them, and on which function is responsible for saving any temporary values in registers.
We've already encountered one part of the calling convention: "the answer goes in EAX." This simple statement asserts that the callee places its answer in EAX, and the caller should expect to look for the answer there...which means that the caller should expect the value of EAX to change as a result of the call. If the caller needs to keep the old value of EAX, it is responsible for saving that value before performing the call: we say this is a caller-save register. On the other hand, functions are allowed to manipulate the ESP register to allocate variables on the stack. When the function returns, though, ESP must be restored to its value prior to the function invocation, or else the caller will be hopelessly confused. We say that ESP is a callee-save register. In general, the calling convention specifies which registers are caller-save, and which are callee-save, and we must encode those rules into our compilation as appropriate.
So far, our compilation has only ever dealt with local variables; we haven't considered what it means to accept any parameters as inputs. Where should they go? Our current reference point for finding local variables is ESP, and our variables are found at consecutive offsets just smaller than it, which implies our parameters probably ought to go in the other direction, at consecutive offsets after it. Since we're counting "outwards" from this reference point, we should wind up with a picture that looks vaguely like this:
This suggests that we can call a function in C by pushing its arguments onto the stack in reverse order, and then simply calling it. The push instruction decrements ESP by one word, and then moves its argument into the memory location now pointed to by the new ESP value. Pictorially,
[Figure: the stack in its initial state and after each push]
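Since the picture is hard to convey in text, the effect of push can also be modeled in a few lines of C. This is only a toy sketch: the Machine struct, the 64-word stack size, and the helper names are assumptions made for illustration, not anything the compiler or main.c defines.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_SIZE 4      /* bytes per 32-bit word */
#define STACK_WORDS 64   /* hypothetical size of the toy stack */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, indexed by word */
    uint32_t esp;              /* byte offset of the top of the stack */
} Machine;

/* An empty stack: ESP sits just past the highest word. */
static void machine_init(Machine *m) {
    m->esp = STACK_WORDS * WORD_SIZE;
}

/* push: decrement ESP by one word first, then store at the new ESP. */
static void push(Machine *m, uint32_t value) {
    assert(m->esp >= WORD_SIZE);       /* toy overflow check */
    m->esp -= WORD_SIZE;
    m->mem[m->esp / WORD_SIZE] = value;
}

/* Read the word at [ESP + offset]. */
static uint32_t load(const Machine *m, uint32_t offset) {
    return m->mem[(m->esp + offset) / WORD_SIZE];
}
```

Pushing a second argument and then a first argument leaves the first argument at [ESP] and the second at [ESP + 4], which is why arguments are pushed in reverse order.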
The call instruction is slightly more complicated, mostly because it needs to handle the bookkeeping for what should happen when the call returns: where should execution resume? Every instruction exists at some address in memory, and the currently executing instruction's address is stored in EIP, the instruction pointer. Our assembly code should never modify this register directly. Instead, the call instruction first pushes a return address describing the location of the next instruction to run (that is, the value of EIP just after the call instruction itself) and then jumps to the target label.
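The bookkeeping done by call, and undone by the matching ret, can be modeled in C as well. This is a toy sketch under stated assumptions: the Cpu struct and the explicit call_len parameter (the byte length of the call instruction) are hypothetical devices for the model, since real hardware knows the instruction length implicitly.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_SIZE 4
#define STACK_WORDS 64 /* hypothetical toy stack size */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, indexed by word */
    uint32_t esp;              /* byte offset of the top of the stack */
    uint32_t eip;              /* address of the current instruction */
} Cpu;

static void push(Cpu *c, uint32_t value) {
    assert(c->esp >= WORD_SIZE);
    c->esp -= WORD_SIZE;
    c->mem[c->esp / WORD_SIZE] = value;
}

static uint32_t pop(Cpu *c) {
    assert(c->esp < STACK_WORDS * WORD_SIZE);
    uint32_t value = c->mem[c->esp / WORD_SIZE];
    c->esp += WORD_SIZE;
    return value;
}

/* call: push the address of the instruction after the call, then jump. */
static void call(Cpu *c, uint32_t target, uint32_t call_len) {
    push(c, c->eip + call_len);
    c->eip = target;
}

/* ret: pop the return address back into EIP. */
static void ret(Cpu *c) {
    c->eip = pop(c);
}
```

Note that ret only works if ESP points at the return address again by the time it runs, which is one reason the callee must put ESP back where it found it.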
Putting these two instructions together, we can implement error_not_number:

error_not_number:
  push EAX    ;; Arg 2: push the badly behaved value
  push 1      ;; Arg 1: a constant describing which error-code occurred
  call error  ;; our error handler
And then in main.c:

#include <stdio.h>
#include <stdlib.h>

const int ERR_NOT_NUMBER = 1;
const int ERR_NOT_BOOLEAN = 2;
// other error codes here

void error(int errCode, int val) {
  if (errCode == ERR_NOT_NUMBER) {
    fprintf(stderr, "Expected number, but got %010x\n", val);
  } else if (errCode == ERR_NOT_BOOLEAN) {
    fprintf(stderr, "Expected boolean, but got %010x\n", val);
  } else ...
  exit(errCode);
}
1.2 The base pointer
If we try to use this code as is, it will crash. We are not yet respecting the calling convention completely. Notice that we refer to it as the "call stack", and yet we are only pushing onto our stack: we're never popping anything back off! Moreover, all this pushing and (eventually) popping keeps changing our ESP value, which means anything we try to do to access our local variables will break at runtime.
Instead of basing our local variables off this constantly-shifting stack pointer, the calling convention stipulates that we save a copy of the stack pointer, as it was at the beginning of our function, in a second register. Provided we never modify that register during the execution of our function, it will remain a constant against which our local variables' offsets can be based. This register, EBP, is accordingly known as the base pointer.
Do Now!
Should EBP be a caller-save or callee-save register?
Since the mere act of using a call instruction changes ESP, we can't know what the correct value should be until inside the callee. Therefore, if the callee sets the value, it must be responsible for restoring the old value. To do that, we must push the old value of EBP onto the stack, and restore it on our way out. We can now revise our original sketch of a C-compatible stack:
Between the parameters and the locals we have two additional slots, for storing the return address and the old value of EBP. The stack obeys the invariant that
- Every function's locals are saved in its stack frame, the region between EBP and ESP.
- EBP always points to the beginning of the current stack frame.
- ESP always points to the beginning of the next stack frame.
To preserve this invariant, we have to add some setup and teardown code to the beginnings and ends of functions. In particular, we need to add this code to our_code_starts_here, because our code really just is a function that participates in the C stack. We also need to add code surrounding every function call.
In the callee:
- At the start of the function:
    push EBP        ; save (previous, caller's) EBP on stack
    mov EBP, ESP    ; make current ESP the new EBP
    sub ESP, 4*N    ; "allocate space" for N local variables
- At the end of the function:
    mov ESP, EBP    ; release locals: restore ESP to its value just after the push above
                    ; now, value at [ESP] is caller's (saved) EBP
    pop EBP         ; so: restore caller's EBP from stack [ESP]
    ret             ; return to caller
In the caller:
- To call a function target that takes M parameters:
    push arg_M      ; push last arg first ...
    ...
    push arg_2      ; then the second ...
    push arg_1      ; finally the first
    call target     ; make the call (which puts return addr on stack)
    add ESP, 4*M    ; now we are back: "clear" args by adding 4*numArgs
Note that if you are compiling on MacOS, you must respect the 16-Byte Stack Alignment Invariant.
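To see where everything lands, the caller and callee steps above can be replayed on a toy stack in C. This is a sketch for checking offsets only: the Machine struct and the function names caller_setup, prologue, epilogue, and caller_cleanup are hypothetical, and the return address is passed in by hand because the model has no real instructions.

```c
#include <assert.h>
#include <stdint.h>

#define WORD 4 /* bytes per 32-bit word */

typedef struct {
    uint32_t mem[64];  /* toy stack memory, indexed by word */
    uint32_t esp, ebp; /* byte offsets into mem */
} Machine;

static void push(Machine *m, uint32_t v) {
    assert(m->esp >= WORD);
    m->esp -= WORD;
    m->mem[m->esp / WORD] = v;
}

static uint32_t pop(Machine *m) {
    assert(m->esp < sizeof m->mem);
    uint32_t v = m->mem[m->esp / WORD];
    m->esp += WORD;
    return v;
}

static uint32_t load(const Machine *m, uint32_t addr) {
    return m->mem[addr / WORD];
}

/* Caller: push args last-to-first, then the return address (as `call` does). */
static void caller_setup(Machine *m, const uint32_t *args, int num_args,
                         uint32_t return_addr) {
    for (int i = num_args - 1; i >= 0; i--) push(m, args[i]);
    push(m, return_addr);
}

/* Callee prologue: push EBP; mov EBP, ESP; sub ESP, 4*N. */
static void prologue(Machine *m, int num_locals) {
    push(m, m->ebp);
    m->ebp = m->esp;
    m->esp -= WORD * (uint32_t)num_locals;
}

/* Callee epilogue: mov ESP, EBP; pop EBP; ret. Returns the return address. */
static uint32_t epilogue(Machine *m) {
    m->esp = m->ebp;
    m->ebp = pop(m);
    return pop(m);
}

/* Caller cleanup: add ESP, 4*M. */
static void caller_cleanup(Machine *m, int num_args) {
    m->esp += WORD * (uint32_t)num_args;
}
```

After the prologue, the saved EBP sits at [EBP], the return address at [EBP + 4], the first argument at [EBP + 8], and the locals occupy the N words below EBP. Once the epilogue and the caller's cleanup have run, both ESP and EBP hold their original values.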
Do Now!
Draw the sequence of stacks that results from our error-handling code, using this more precise understanding of what belongs on the stack.
1.3 Putting the pieces together
Our compiler should now insert type checks, akin to the test instruction with which we started this lecture, everywhere that we need to assert the type of some value. We then need to add a label at the end of our compiled output for each kind of error scenario that we face. The code at those labels should push the offending value (likely in EAX) onto the stack, followed by the error code, and then call the error function in main.c. That function must be elaborated with a case for each kind of error we face.
We must change the prologue and epilogue of our_code_starts_here to include the callee's responsibilities of saving and restoring EBP and ESP. Also, we must change our compilation of local variables! In particular, we need to stop referring directly to ESP, and instead use EBP as the base for their offsets.
With all these changes in place, our code should now properly test for erroneous operands and complain rather than produce bogus output. Unfortunately, it's still slightly broken.
1.4 Stack frame sizes
We have one last problem to manage. Since the code sequence for a function call uses push to push arguments onto the stack, we need to ensure that ESP is above all our local variables. We need to "preallocate" space for them in our stack frame. To do that, we need to know how much space we'll need, which in turn means we need to know how many local variables we'll need. Sadly, computing the exact answer here is undecidable. But we can overapproximate:
- We could just allocate 100 variables and call it good-enough-for-now. This is lousy, but it's a decent heuristic with which to test all the rest of our code.
- We could count the total number of let-bindings in our expression.
Can we do better? Consider how the environment grows and shrinks as we manipulate an expression: its maximum size occurs at the deepest number of nested let-bindings. Surely we'll never need more locals than that, since we'll never have more names in scope at once than that!
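The difference between the two estimates shows up on a small tree. The following is a rough sketch in C rather than the compiler's implementation language, over a made-up Expr type that is not the course's ANF'ed AST; it also assumes a let-bound name is not in scope inside its own bound expression, so it illustrates the idea rather than answering the exercise below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature AST. For EXPR_LET, `a` is the bound expression
   and `b` is the body; for EXPR_IF, `a`/`b`/`c` are condition/then/else. */
typedef enum { EXPR_NUM, EXPR_ID, EXPR_PRIM2, EXPR_IF, EXPR_LET } ExprKind;

typedef struct Expr {
    ExprKind kind;
    const struct Expr *a, *b, *c; /* unused children are NULL */
} Expr;

static int max2(int x, int y) { return x > y ? x : y; }

/* Overapproximation: every let-binding in the whole expression. */
static int total_lets(const Expr *e) {
    if (e == NULL) return 0;
    int here = (e->kind == EXPR_LET) ? 1 : 0;
    return here + total_lets(e->a) + total_lets(e->b) + total_lets(e->c);
}

/* Tighter bound: the most names ever in scope at once. */
static int max_let_depth(const Expr *e) {
    if (e == NULL) return 0;
    if (e->kind == EXPR_LET) {
        assert(e->b != NULL); /* a let always has a body */
        /* The new name occupies a slot only while the body runs. */
        return max2(max_let_depth(e->a), 1 + max_let_depth(e->b));
    }
    /* Sibling subexpressions can reuse each other's slots. */
    return max2(max_let_depth(e->a),
                max2(max_let_depth(e->b), max_let_depth(e->c)));
}
```

For an if whose branches are `let x = 1 in x` and `let y = 2 in (let z = 3 in z)`, total_lets reports 3 while max_let_depth reports 2, because x and y never coexist.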
Exercise
Construct a function countVars that computes this quantity. Work by cases over the ANF'ed AST.
Once we've computed this quantity, we use it to decrement ESP at the start of our_code_starts_here, and now at last our compiled output works correctly.
2 Testing
Our programs can now produce observable output! Granted, it is only complaints about type mismatches, so far, but even this is useful.
Exercise
Construct a test program that demonstrates that your type testing works properly. Construct a second test program that demonstrates that your if expressions work properly, too. Hint: sometimes, "no news is good news."
Exercise
Enhance your compiler with a new unary primitive,
Source: https://course.ccs.neu.edu/cs4410sp19/lec_function-calls_notes.html
Posted by: roccoloond1999.blogspot.com