Name Mangling

1 Introduction to Name Mangling

In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

It provides a way of encoding additional information in the name of a function, structure, class, or other datatype so that compilers can pass more semantic information to linkers.

The C++ and Java languages provide function overloading, which means that you can write many functions with the same name, provided that each function takes parameters of different types. To distinguish these identically named functions, C++ and Java encode each one into a low-level assembler name that uniquely identifies that version. This process is known as mangling.
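As a small illustration of why overloading needs mangling, the sketch below defines two overloads of a hypothetical function add (the name is an assumption chosen for this example). The symbol names in the comments are what the Itanium C++ ABI scheme used by GCC and Clang would produce; other compilers use different encodings.

```cpp
// Two overloads that share one source-level name. The linker only sees
// symbol names, so the compiler must give each overload a distinct symbol.

// Mangled as _Z3addii under the Itanium C++ ABI: "3add" + "i" + "i".
int add(int a, int b) {
    return a + b;
}

// Mangled as _Z3adddd under the Itanium C++ ABI: "3add" + "d" + "d".
double add(double a, double b) {
    return a + b;
}
```

Compiling this with g++ -c and running nm on the object file should list both symbols side by side.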

References:
Calling Conventions for different C++ compilers and operating systems: http://www.agner.org/optimize/calling_conventions.pdf
Name mangling demystified: http://www.int0x80.gr/papers/name_mangling.pdf
c++filt: https://sourceware.org/binutils/docs/binutils/c_002b_002bfilt.html#c_002b_002bfilt

1.1 How to Demangle

Method 1: use c++filt from GNU binutils.
Method 2: use nm with the -C option, which demangles the symbol names in an object file.
Method 3: demangle online: http://demangler.com/
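Besides the command-line tools above, demangling can also be done from inside a program. The sketch below assumes a GCC or Clang toolchain, whose runtime provides abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle is hypothetical and not part of any library.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtime, not standard C++)
#include <cstdlib>    // std::free
#include <memory>
#include <string>

// Hypothetical helper: return the demangled form of an Itanium ABI symbol,
// or the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, void (*)(void*)> result(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    if (status == 0 && result) {
        return std::string(result.get());
    }
    return std::string(mangled);
}
```

For example, demangle("_Z4fun1Pcidcf") gives fun1(char*, int, double, char, float), the same text c++filt prints for that symbol.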

2 C++ Name Mangling

The C++ language does not define a standard decoration scheme, so each compiler uses its own. Few linkers can link object code that was produced by different compilers.

Note: in C++, different compilers use different mangling rules.

2.1 Name Mangling rule in GCC 3.x and later

GCC 3.x and later has adopted the name mangling scheme defined in the Itanium C++ ABI.

2.1.1 A Simple Example

Here is a very simple example of C++ name mangling (tested with GCC 4.9.2):

$ cat 1.c
int fun1(char*, int, double, char, float)
{
    return 1;
}
$ g++ -c 1.c    # Use g++, rather than gcc.
$ nm 1.o | grep fun1
0000000000000000 T _Z4fun1Pcidcf

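In the example above, _Z4fun1Pcidcf is the mangled name of int fun1(char*, int, double, char, float). Note that the return type (int) is not part of the encoding.

The parts are:
_Z is the prefix of every mangled name (identifiers that begin with an underscore followed by an uppercase letter are reserved by the C standard, so user code should not use them and cannot cause a name clash); the number 4 is the length of the function name; fun1 is the function name; the rest encodes the parameter types (Pc stands for pointer to char, i for int, d for double, c for char, f for float).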
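The breakdown above can be turned into a few lines of code. What follows is only a minimal sketch of the scheme, not GCC's implementation: the helper names encode_type and mangle_simple are hypothetical, and it covers only free functions outside any namespace whose parameters are a handful of builtin types or pointers to them, written without spaces (for example "char*"). It ignores const, references, templates, and the substitution compression that the Itanium C++ ABI applies when a compound type such as char* appears more than once.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: encode one parameter type such as "int" or "char*".
// Each trailing '*' becomes a leading 'P' (pointer to ...).
// Throws std::out_of_range for any type not in the small table below.
std::string encode_type(std::string type) {
    static const std::map<std::string, std::string> builtin = {
        {"void", "v"}, {"bool", "b"},  {"char", "c"},  {"short", "s"},
        {"int", "i"},  {"long", "l"},  {"float", "f"}, {"double", "d"}};
    std::string prefix;
    while (!type.empty() && type.back() == '*') {
        prefix += 'P';
        type.pop_back();
    }
    return prefix + builtin.at(type);
}

// Hypothetical helper: "_Z" + <name length> + <name> + <parameter codes>.
// An empty parameter list is encoded as a single "v" (void).
std::string mangle_simple(const std::string& name,
                          const std::vector<std::string>& params) {
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (params.empty()) {
        return out + "v";
    }
    for (const auto& p : params) {
        out += encode_type(p);
    }
    return out;
}
```

Calling mangle_simple("fun1", {"char*", "int", "double", "char", "float"}) reproduces the _Z4fun1Pcidcf symbol from the nm output. For anything beyond such simple signatures, the compiler's own output remains the reference.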


Author: cig01

Created: <2011-08-12 Fri 00:00>

Last updated: <2016-07-09 Sat 18:16>

Creator: Emacs 25.1.1 (Org mode 9.0.7)