Linking C++ and Assembly

Using Visual Studio C++ Version 6

MASM 6.13

 

The task is to link a C++ module with an Assembly language module. To accomplish this, I began with a simple C++ application that calls an external function:

 

void sample();

int main(){

     sample();

     return 0;

}

 

I set the compile options to produce an assembly listing file (with source). Here is the result (reduced to a few critical lines):

     .386P

.model FLAT

PUBLIC    _main

EXTRN     ?sample@@YAXXZ:NEAR               ; sample

_TEXT     SEGMENT

_main     PROC NEAR                    ; COMDAT

; 4    : sample();

     call ?sample@@YAXXZ               ; sample

 

The .386 and .model directives will be needed when we write the function in assembly language. The curious part is the symbol used for the external function sample: ?sample@@YAXXZ. This is the C++ compiler's name decoration (often called name mangling). The extra characters encode the function's parameter and return types, which lets the linker tell the various overloads of a function apart. We could give our assembly language implementation the decorated name, but the name would have to change whenever the prototype changes, so this is not the best option.
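
To see why the compiler needs decoration, consider a small sketch with two overloads of sample. The second overload, the lastOverload variable, and checkOverloads are hypothetical additions for illustration only, and the symbol in the second comment is what 32-bit Visual C++ would be expected to generate; it is an assumption, not something taken from the listing.

```cpp
#include <cassert>

// Hypothetical marker recording which overload ran last.
static int lastOverload = 0;

// Same source name, different parameter lists. Each needs its own linker symbol.
void sample()    { lastOverload = 1; }  // ?sample@@YAXXZ, as in the listing
void sample(int) { lastOverload = 2; }  // expected: ?sample@@YAXH@Z (assumption)

// Calls each overload and checks that the call reached the matching body.
void checkOverloads() {
    sample();
    assert(lastOverload == 1);
    sample(7);
    assert(lastOverload == 2);
}
```

If both functions were exported under the plain name sample, the linker could not tell which body a given call refers to.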

 

A search of the MSDN Library on linking mixed-language modules shows how to eliminate this annoying decoration. Enclosing the prototype declaration in an extern "C" block gives the usual C name “decoration”, which is simply an underscore added to the front of the function name:

 

extern "C" {void sample();}

 

Recompiling yields the following function call:

 

            call _sample

 

This is a simpler function name to match in our separate module. We also get the expected link error, since the function does not exist yet:

 

main.obj : error LNK2001: unresolved external symbol _sample
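
While the assembly module does not exist yet, one way to check that the C++ side now refers to the undecorated name is a temporary C++ stand-in with C linkage. This is only a sketch: the counter sampleCalls and the helper callSampleTwice are hypothetical, and the stand-in must be removed before the real object file is linked, or the linker will see two definitions of the same symbol.

```cpp
#include <cassert>

// Declaration with C linkage: no C++ decoration is applied to the name.
extern "C" { void sample(); }

// Hypothetical counter used only to observe the calls in this sketch.
static int sampleCalls = 0;

// Temporary stand-in for the assembly routine. It takes its C linkage from
// the declaration above, so it satisfies the unresolved reference.
extern "C" void sample() { ++sampleCalls; }

// Calls the stand-in twice and reports how many calls were recorded.
int callSampleTwice() {
    sample();
    sample();
    return sampleCalls;
}
```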

 

Next, we create an assembly language module defining the procedure _sample. The following is entered in sample.asm and then assembled to produce the object file sample.obj.

 

.386

.model flat

.code

public _sample

_sample proc

     ret

_sample endp

end

 

The object file should next be made a part of the C++ project, so the linker knows to look in the file for the required public function. A rebuild of the project yields the following:

 

.\Sample.obj : warning LNK4033: converting object format from OMF to COFF

main.obj : error LNK2001: unresolved external symbol _sample

 

The first message, the warning, may be ignored. The linker used in Visual Studio expects object files in Common Object File Format (COFF), while the assembler creates object files in 32-bit Object Module Format (OMF). The linker converts them automatically each time it links. There are utilities to convert an object file permanently (EDITBIN or the 32-bit LIB utility supplied with Visual Studio), and versions of ML that accept the /coff option can produce COFF output directly.

 

The second problem, still being unable to find the function _sample, is more serious. Loading sample.obj into Visual Studio (it is displayed in hex) makes the cause apparent: the function name is stored in the object file as _SAMPLE. Remember that C++ is a case-sensitive language, and the linker matches symbols the same way. There are at least two solutions: make the assembler preserve the case of symbols (the /Mx option), or change the case of the function name in the C++ program to match the one in the object file. I chose to reassemble with the /Mx option. After that step, linking went smoothly.

 

The next step is to investigate how arguments are passed and what the function's responsibilities are. The MSDN Library explains the available options under Calling Conventions. C++ compiled functions can use a variety of calling mechanisms, and the programmer selects one with a keyword (__cdecl, __stdcall, __fastcall). Arguments are pushed on the stack in right-to-left order (fastcall passes the first two suitably sized arguments in registers and resorts to the stack for the others). In cdecl, the calling function must remove the arguments after the return. In stdcall, the called function removes the arguments. Each convention has its own name decoration for functions declared extern "C". Cdecl and stdcall both prepend an underscore. Stdcall also appends an @ and the total number of bytes in the parameter list, in decimal (for example, _sample@16). Fastcall prepends an @ instead of the underscore and likewise appends an @ and the parameter list size.
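
The decoration rules can be summarized in a few lines of C++. This is a minimal sketch of the rules as described here for extern "C" functions on 32-bit Visual C++; the CallConv enum and the decorate function are hypothetical names invented for the illustration, not part of any toolchain.

```cpp
#include <cassert>
#include <string>

enum CallConv { Cdecl, Stdcall, Fastcall };

// Builds the linker symbol for an extern "C" function.
// argBytes is the total size of the parameter list in bytes,
// counted after each argument has been widened to 4-byte slots.
std::string decorate(const std::string& name, CallConv conv, unsigned argBytes) {
    assert(!name.empty());
    switch (conv) {
    case Cdecl:    return "_" + name;
    case Stdcall:  return "_" + name + "@" + std::to_string(argBytes);
    case Fastcall: return "@" + name + "@" + std::to_string(argBytes);
    }
    return name;  // not reached for valid enum values
}
```

For a function named sample with 16 bytes of parameters, decorate gives _sample for cdecl, _sample@16 for stdcall, and @sample@16 for fastcall. The assembly module must define whichever of these names the C++ side refers to.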

 

Also of interest is how arguments are actually stored on the stack. The compiler widens any argument smaller than a double word (32 bits) to fill one, so each argument occupies a multiple of 4 bytes. Passing by reference is always accomplished by passing the address of the argument. Addresses are 4-byte values.
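
The widening rule also gives the byte count that stdcall and fastcall append to the name. The sketch below works it out; widenedSize and parameterListBytes are hypothetical helper names, and the sizes passed in are assumed to be those of the 32-bit target (4 for an int or an address, 1 for a char), not whatever sizeof reports on the machine running the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Widens one argument to a whole number of 4-byte stack slots.
std::size_t widenedSize(std::size_t argBytes) {
    assert(argBytes > 0);
    return (argBytes + 3) / 4 * 4;
}

// Sums the widened sizes of all arguments in a parameter list.
std::size_t parameterListBytes(const std::vector<std::size_t>& argBytes) {
    std::size_t total = 0;
    for (std::size_t size : argBytes) {
        total += widenedSize(size);
    }
    return total;
}
```

A parameter list made of an int, a reference, a char, and an array (which is passed as an address) has target sizes 4, 4, 1, and 4, so parameterListBytes reports 16 bytes.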

 

Here is the earlier example changed to incorporate arguments:

 

extern "C" {void sample(int, int &, char, int []);}

int main(){

     int a, b, c[10];

     char w;

     sample(a, b, w, c);

     return 0;

}

It results in the following assembly listing (reduced to the lines for the call):

 

_a$ = -4

_b$ = -8

_c$ = -48

_w$ = -52

     lea  eax, DWORD PTR _c$[ebp]

     push eax

     mov  cl, BYTE PTR _w$[ebp]

     push ecx

     lea  edx, DWORD PTR _b$[ebp]

     push edx

     mov  eax, DWORD PTR _a$[ebp]

     push eax

     call _sample

     add  esp, 16                      ; 00000010H

 

Notice how the locations of the automatic variables are defined symbolically. These are offsets from ebp into main’s stack frame on the runtime stack. Look at how each argument is converted to a double word and pushed onto the stack, starting with the last argument (c) and ending with the first (a). Also note that the arguments are removed after the call by the calling program (add esp, 16), not by the procedure.
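
A toy model of the stack may help in picturing what the called procedure finds on entry. This is a sketch only: stackAfterCall and argumentAt are hypothetical names, the stack is modeled as a vector of 32-bit words, and the values stand in for whatever the caller actually pushed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models the caller's side of a call. Index 0 of the result is the top of
// the stack (lowest address) just after the call instruction has pushed
// the return address.
std::vector<std::uint32_t> stackAfterCall(const std::vector<std::uint32_t>& args,
                                          std::uint32_t returnAddress) {
    std::vector<std::uint32_t> pushed;  // in push order, oldest first
    for (auto it = args.rbegin(); it != args.rend(); ++it) {
        pushed.push_back(*it);          // rightmost argument is pushed first
    }
    pushed.push_back(returnAddress);    // pushed by the call instruction
    // Reverse so that index 0 is the most recent push.
    return std::vector<std::uint32_t>(pushed.rbegin(), pushed.rend());
}

// Returns argument number index (0 = first in the C++ call) from the model.
std::uint32_t argumentAt(const std::vector<std::uint32_t>& stack, std::size_t index) {
    assert(index + 1 < stack.size());
    return stack[index + 1];            // skip the return address at index 0
}
```

Because the rightmost argument is pushed first, the first argument ends up nearest the return address, at [esp+4] on entry to the procedure.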

 

If the calling convention is changed to stdcall, look at the effect:

Source:  extern "C" {void _stdcall sample(int, int &, char, int []);}

Assembly listing:

     lea  eax, DWORD PTR _c$[ebp]

     push eax

     mov  cl, BYTE PTR _w$[ebp]

     push ecx

     lea  edx, DWORD PTR _b$[ebp]

     push edx

     mov  eax, DWORD PTR _a$[ebp]

     push eax

     call _sample@16

 

There is no stack operation after the call because the called function now has the responsibility of removing the arguments from the stack. Note also that the function name has changed to _sample@16. We would have to make the corresponding changes in our assembly module: rename the procedure and return with ret 16. Be sure you understand which calling convention is in effect, because failing to do the correct thing to the stack will result in a disaster.
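
The bookkeeping can be checked with a little arithmetic. The following is a sketch, not a model of any real tool: espAfterCall is a hypothetical function that tracks the stack pointer relative to its value before the first push, given the convention the caller was compiled with and the one the called routine actually implements.

```cpp
#include <cassert>

enum CallConv { Cdecl, Stdcall };

// Returns the stack pointer after a complete call, relative to its value
// before the first push. Zero means the stack is balanced.
int espAfterCall(CallConv callerThinks, CallConv calleeUses, int argBytes) {
    assert(argBytes >= 0 && argBytes % 4 == 0);
    int esp = 0;
    esp -= argBytes;          // caller pushes the arguments
    esp -= 4;                 // call pushes the return address
    esp += 4;                 // ret pops the return address
    if (calleeUses == Stdcall) {
        esp += argBytes;      // ret N: the called function removes the arguments
    }
    if (callerThinks == Cdecl) {
        esp += argBytes;      // add esp, N: the caller removes the arguments
    }
    return esp;
}
```

When both sides agree the result is 0. If the caller assumes cdecl while the routine ends with ret 16, the arguments are removed twice and the result is 16; in the opposite case nobody removes them and the result is -16. The decorated names usually catch such a mismatch at link time, but a hand-written assembly routine can carry the right name and still end with the wrong ret.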

 

To allow the assembly language module to call one of the C++ functions, we do essentially the same thing in reverse. In the assembly module we declare the name to be extern. In C++, functions automatically have external linkage, so nothing special needs to be done other than to define the function and, as before, declare it extern "C" so that its name is not decorated.

 

In the assembly module:

extern _backAtYou:proc

;also add a suitable call wrapped with appropriate stack manipulations

 

In the C++ module:

extern "C" {void _cdecl backAtYou(int, char *);}

void _cdecl backAtYou(int x, char * y){

     char temp;

     temp = (char)x;

     *y = temp;

}

 

Here is the result of the compile (unrelated items removed). A careful study of the function will help in understanding how the passed arguments are accessed and utilized. Notice that the formal parameters are assigned names representing offsets in the stack. Look carefully at the allocation of local storage for temp and how it is accessed. Notice that ret 0 is the exit protocol – this function was declared using cdecl so it does not remove the arguments.

 

PUBLIC    _backAtYou

_TEXT     SEGMENT

_x$ = 8

_y$ = 12

_temp$ = -4

_backAtYou PROC NEAR                       ; COMDAT

; 13   : void _cdecl backAtYou(int x, char * y){

     push ebp

     mov  ebp, esp

     sub  esp, 4

; 14   : char temp;

; 15   : temp = (char)x;

     mov  al, BYTE PTR _x$[ebp]

     mov  BYTE PTR _temp$[ebp], al

; 16   : *y = temp;

     mov  ecx, DWORD PTR _y$[ebp]

     mov  dl, BYTE PTR _temp$[ebp]

     mov  BYTE PTR [ecx], dl

; 17   : }

     mov  esp, ebp

     pop  ebp

     ret  0

_backAtYou ENDP
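
The values 8 and 12 assigned to _x$ and _y$ follow from the frame layout after the prologue: [ebp] holds the saved ebp and [ebp+4] holds the return address, so the first parameter starts at ebp+8. The sketch below computes these offsets; parameterOffsets is a hypothetical helper, and the sizes are assumed to be those of the 32-bit target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets of the formal parameters from ebp after the standard prologue
// (push ebp / mov ebp, esp).
std::vector<int> parameterOffsets(const std::vector<std::size_t>& argBytes) {
    std::vector<int> offsets;
    int next = 8;  // skip the saved ebp (4 bytes) and the return address (4 bytes)
    for (std::size_t size : argBytes) {
        assert(size > 0);
        offsets.push_back(next);
        next += static_cast<int>((size + 3) / 4 * 4);  // widen to 4-byte slots
    }
    return offsets;
}
```

For backAtYou, with an int and a pointer of 4 bytes each, the result is 8 and 12, matching the listing. Locals such as temp sit on the other side of ebp, at negative offsets.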

 

The program links successfully, and if the stack is manipulated properly, the program can be executed. Using the debugger, one can follow the steps from main into sample, on to backAtYou, and then back through sample to main again. You may need to open the Disassembly window (under the View menu) to step into the sample function’s code.

 

Assignment: Write a C++ program that accepts strings from the user and then calls an assembly language routine that causes each string to be displayed in all uppercase and in all lowercase. The assembly language function must allocate space on the stack to copy the array (be sure to leave room for the nul character). The copy is then converted to all uppercase characters, passed to a C++ function named display, converted to all lowercase characters, and passed to display again. The display function, of course, prints the string followed by a newline.

 

Use the stdcall calling convention for your assembly routine and cdecl for display. The required prototypes are as follows:

 

extern "C" {void _stdcall UpDn(int length, char * thestring); }

extern "C" {void _cdecl display(char * thestring); }
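
When testing, it can help to have a portable C++ model of the expected behavior to compare against. The following is a sketch under stated assumptions, not the required solution: UpDnModel and displayModel are hypothetical names, displayModel records strings instead of printing them, the calling convention keywords are left out so the sketch compiles anywhere, and length is assumed to count the characters of the string without the terminating nul.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for display: records each string instead of printing.
static std::vector<std::string> shown;
void displayModel(char* thestring) { shown.push_back(thestring); }

// Portable model of what UpDn is asked to do. The caller's string is left
// untouched because all conversions happen on a private copy.
void UpDnModel(int length, char* thestring) {
    assert(length >= 0 && thestring != nullptr);
    std::vector<char> copy(thestring, thestring + length);  // local copy
    copy.push_back('\0');                                   // room for the nul
    for (int i = 0; i < length; ++i) {
        copy[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(copy[i])));
    }
    displayModel(copy.data());                              // uppercase version
    for (int i = 0; i < length; ++i) {
        copy[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(copy[i])));
    }
    displayModel(copy.data());                              // lowercase version
}
```

The assembly routine does the same work with a buffer carved out of its own stack frame instead of a vector.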