Gcc

1 Introduction to GCC

GCC stands for “GNU Compiler Collection”. GCC is an integrated distribution of compilers for several major programming languages. These languages currently include C, C++, Objective-C, Objective-C++, Java, Fortran, Ada, and Go. The name historically stood for “GNU C Compiler”, and this usage is still common when the emphasis is on compiling C programs.

GCC website: https://gcc.gnu.org/
GCC online documentation: https://gcc.gnu.org/onlinedocs/
GCC Manual: https://gcc.gnu.org/onlinedocs/gcc/
GCC Internals Manual: https://gcc.gnu.org/onlinedocs/gccint/
GCC Releases: https://www.gnu.org/software/gcc/releases.html

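2 GCC command-line options

2.1 Basic options

Table 1: Basic GCC options
GCC option Description
-v Print the commands executed at each stage of compilation (verbose mode).
-Wall Enable a broad set of commonly useful warnings (despite the name, not every warning).
-O Optimize the code. -O is equivalent to -O1. -O2 and -O3 optimize more aggressively; -O0 disables optimization. If multiple -O options are given, only the last one takes effect.
-g Emit debugging symbol information. -g does not imply that optimization is disabled.
-E Preprocess only; do not compile.
-S Preprocess and compile (producing assembly source), but do not assemble into an object file.
-c Compile and assemble, but do not link.
-o file Write output to file, which may be an executable file, an object file, an assembler file, or preprocessed C code.
-Idir Add the directory dir to the head of the list of directories to be searched for header files.
-llibrary Search the library named library when linking.
-Ldir Add directory dir to the list of directories to be searched for -l.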

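2.1.1 Selecting the C standard (-std)

Select the 1990 C standard (c89 and c90 are synonyms: ANSI ratified the standard in 1989 and ISO adopted it in 1990):

gcc -ansi prog.c      # -ansi is equivalent to -std=c89 for C and to -std=c++98 for C++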
gcc -std=c89 prog.c
gcc -std=c90 prog.c

Select the 1999 C standard:

gcc -std=c99 prog.c

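Note: if no standard is specified explicitly, GCC versions before 5 compile C with -std=gnu89, which adds some C99 features and some GCC-specific extensions on top of c89. GCC 5 changed the default to -std=gnu11, and later releases default to newer dialects.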
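
One way to see which dialect is actually in effect is to inspect the standard macro __STDC_VERSION__. The following is a minimal sketch; the function name c_standard_name is hypothetical and the labels it returns are only illustrative.

```c
/* Returns a label for the C standard the compiler was invoked with.
 * __STDC_VERSION__ is not defined in C89/C90 mode
 * (for example with -ansi, -std=c89, or -std=gnu89). */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C89/C90";
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C90 with Amendment 1";
#endif
}
```

Printing the result from main and rebuilding with -std=c89, -std=c99, or no -std option at all shows how the default differs from the explicit choices.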

2.2 Useful options

2.2.1 -fomit-frame-pointer

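-fomit-frame-pointer omits the frame pointer, so the %ebp register (%rbp on x86-64) is no longer reserved to hold it.

Since GCC 4.6, -fomit-frame-pointer has been the default for 32-bit x86 programs on GNU/Linux and Darwin when not optimizing for size:
http://gcc.gnu.org/gcc-4.6/changes.html

To keep the frame pointer, specify -fno-omit-frame-pointer.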

2.2.2 -fstack-protector

-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
-fstack-protector-all
Like -fstack-protector except that all functions are protected.
-fstack-protector-strong
Like -fstack-protector but includes additional functions to be protected — those that have local array definitions, or have references to local frame addresses.
-fstack-protector-explicit
Like -fstack-protector but only protects those functions which have the stack_protect attribute

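Note: -fstack-protector does not prevent a stack buffer overflow; it detects one when the function returns and terminates the program before the corrupted stack can be used.

References: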
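
To make the guard concrete, here is a deliberately unsafe function of the kind these options protect. It is a minimal sketch: the name copy_name and the buffer size are hypothetical, chosen only so that the buffer exceeds the 8-byte threshold mentioned above.

```c
#include <string.h>

/* Deliberately unsafe: strcpy does not check that src fits in buf.
 * buf is larger than 8 bytes, so -fstack-protector guards this function. */
size_t copy_name(const char *src)
{
    char buf[16];
    strcpy(buf, src);      /* overflows buf when strlen(src) >= 16 */
    return strlen(buf);
}
```

Built with -fstack-protector, a call with a short string runs normally, while a string long enough to overwrite the guard is expected to make the check at function exit fail and abort the program.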
https://en.wikipedia.org/wiki/Buffer_overflow_protection
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
http://www.ibm.com/developerworks/cn/linux/l-cn-gccstack/
http://www.thegeekstuff.com/2013/02/stack-smashing-attacks-gcc/

2.2.3 -Wsequence-point

-Wsequence-point
Warn about code that may have undefined semantics because of violations of sequence point rules in the C and C++ standards.

This warning is enabled by -Wall for C and C++.

Never write code like the following; its behavior is undefined:

a = a++;
a[n] = b[n++];
a[i++] = i;
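
Each of the statements above can be rewritten so that every object is modified at most once between sequence points. The sketch below shows one possible intent for each; the function names are hypothetical, and the original statements do not say which intent was meant.

```c
/* Well-defined alternatives to the three statements above.
 * Each function makes one possible intent explicit. */

/* Instead of `a = a++;`: increment a exactly once. */
int increment(int a)
{
    a++;
    return a;
}

/* Instead of `a[n] = b[n++];`: copy element n, then advance n. */
int copy_then_advance(int *a, const int *b, int n)
{
    a[n] = b[n];
    return n + 1;
}

/* Instead of `a[i++] = i;`: store the old index, then advance i. */
int store_index_then_advance(int *a, int i)
{
    a[i] = i;
    return i + 1;
}
```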

References:
A brief discussion of sequence points and side effects in C/C++: http://www.cnblogs.com/dolphin0520/archive/2011/04/20/2022330.html

2.2.4 -finstrument-functions

-finstrument-functions
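Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the profiling functions __cyg_profile_func_enter and __cyg_profile_func_exit are called with the address of the current function and its call site.

References: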
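
The hooks that -finstrument-functions calls are __cyg_profile_func_enter and __cyg_profile_func_exit, which the program must define itself. The following is a minimal sketch that prints an indented call trace to stderr; the call_depth counter is a hypothetical helper and not part of GCC.

```c
#include <stdio.h>

/* Nesting depth of instrumented calls (hypothetical helper variable). */
static int call_depth = 0;

/* The hooks must not be instrumented themselves, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

/* Called just after entry to every instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    fprintf(stderr, "%*senter %p (called from %p)\n",
            2 * call_depth, "", this_fn, call_site);
    call_depth++;
}

/* Called just before exit from every instrumented function. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    call_depth--;
    fprintf(stderr, "%*sexit  %p (called from %p)\n",
            2 * call_depth, "", this_fn, call_site);
}
```

Linking these definitions into a program built with gcc -finstrument-functions should print one enter/exit pair per call. The addresses can be mapped back to function names with a tool such as addr2line.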
https://balau82.wordpress.com/2010/10/06/trace-and-profile-function-calls-with-gcc/
http://linuxgazette.net/151/melinte.html
https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#Code-Gen-Options

2.2.5 -fno-asynchronous-unwind-tables (suppress .cfi_xxx directives)

First, a test: compile a simple C program to assembly.
Test environment: Ubuntu 14.04, gcc 4.9

$ cat fun1.c
int fun1(void){
    return 0;
}

$ gcc -S fun1.c && cat fun1.s
	.file	"fun1.c"
	.text
	.globl	fun1
	.type	fun1, @function
fun1:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	fun1, .-fun1
	.ident	"GCC: (Ubuntu 4.9.2-0ubuntu1~14.04) 4.9.2"
	.section	.note.GNU-stack,"",@progbits

The assembly contains many directives prefixed with .cfi. What are they?
CFI stands for Call Frame Information, which is described in the DWARF specification.

See the as documentation:
https://sourceware.org/binutils/docs-2.17/as/CFI-directives.html

2.2.5.1 What Call Frame Information is for

Modern ABIs don't require frame pointers to be used in functions. However missing FPs bring difficulties when doing a backtrace. One solution is to provide Dwarf-2 CFI data for each such function.
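
A backtrace can be requested from inside a program as well as from a debugger. The sketch below uses backtrace() from <execinfo.h>, a glibc extension; on glibc systems it typically relies on this kind of unwind information rather than on frame pointers, though that is an assumption about the platform. The function name count_frames is hypothetical.

```c
#include <execinfo.h>

/* Captures the current call stack and returns the number of frames found.
 * backtrace() is a glibc extension declared in <execinfo.h>. */
int count_frames(void)
{
    void *frames[64];
    return backtrace(frames, 64);
}
```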

Reference: http://www.logix.cz/michal/devel/gas-cfi/

2.2.5.2 How to suppress Call Frame Information

Passing -fno-asynchronous-unwind-tables to gcc suppresses the Call Frame Information seen in the test above.

$ gcc -fno-asynchronous-unwind-tables -S fun1.c && cat fun1.s
	.file	"fun1.c"
	.text
	.globl	fun1
	.type	fun1, @function
fun1:
	pushq	%rbp
	movq	%rsp, %rbp
	movl	$0, %eax
	popq	%rbp
	ret
	.size	fun1, .-fun1
	.ident	"GCC: (Ubuntu 4.9.2-0ubuntu1~14.04) 4.9.2"
	.section	.note.GNU-stack,"",@progbits

2.2.6 Passing options to Assembler, Preprocessor, Linker

Table 2: gcc options for passing options to the assembler, preprocessor, and linker
gcc option Description
-Wa,option Pass option as an option to the assembler. If option contains commas, it is split into multiple options at the commas.
-Xassembler option Pass option as an option to the assembler. If you want to pass an option that takes an argument, you must use -Xassembler twice, once for the option and once for the argument.
-Wp,option Pass option as an option to the preprocessor.
-Xpreprocessor option Pass option as an option to the preprocessor.
-Wl,option Pass option as an option to the linker.
-Xlinker option Pass option as an option to the linker.

2.3 Environment variables affecting GCC

Table 3: Environment variables affecting GCC
Environment variable Description
LANG This variable is used to pass locale information to the compiler. LC_CTYPE and LC_MESSAGES default to the value of the LANG environment variable.
LC_CTYPE Specifies character classification. GCC uses it to determine the character boundaries in a string; this is needed for some multibyte encodings that contain quote and escape characters that are otherwise interpreted as a string end or escape.
LC_MESSAGES Specifies the language to use in diagnostic messages
LC_ALL If the LC_ALL environment variable is set, it overrides the value of LC_CTYPE and LC_MESSAGES
TMPDIR If TMPDIR is set, it specifies the directory to use for temporary files.
GCC_COMPARE_DEBUG Setting GCC_COMPARE_DEBUG is nearly equivalent to passing -fcompare-debug to the compiler driver.
GCC_EXEC_PREFIX If GCC_EXEC_PREFIX is set, it specifies a prefix to use in the names of the subprograms executed by the compiler.
COMPILER_PATH GCC tries the directories thus specified when searching for subprograms, if it can't find the subprograms using GCC_EXEC_PREFIX.
LIBRARY_PATH GCC uses these directories when searching for ordinary libraries for the -l option (but directories specified with -L come first).
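
Table 4: Environment variables affecting the GCC preprocessor
Environment variable Description
CPATH Specifies directories to search for header files, as if given with -I. GCC searches the directories given with -I first and falls back to CPATH only if the header is not found there. Applies when preprocessing any language (C, C++, Objective-C, and so on).
C_INCLUDE_PATH Directories are treated as if given with -isystem, but with lower priority than directories given with -isystem on the command line. Applies only when preprocessing C source.
CPLUS_INCLUDE_PATH Directories are treated as if given with -isystem, but with lower priority than directories given with -isystem on the command line. Applies only when preprocessing C++ source.
OBJC_INCLUDE_PATH Directories are treated as if given with -isystem, but with lower priority than directories given with -isystem on the command line. Applies only when preprocessing Objective-C source.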
DEPENDENCIES_OUTPUT This variable is equivalent to combining the options -MM and -MF (see Preprocessor Options), with an optional -MT switch too.
SUNPRO_DEPENDENCIES This variable is the same as DEPENDENCIES_OUTPUT, except that system header files are not ignored, so it implies -M rather than -MM.

Reference: https://gcc.gnu.org/onlinedocs/gcc/Environment-Variables.html

3 GNU Linker (ld) options

gcc passes option to the linker when invoked with -Wl,option.
If option contains commas, it is split into multiple options at the commas. You can use this syntax to pass an argument to the option. For example, -Wl,-Map,output.map passes -Map output.map to the linker. When using the GNU linker, you can also get the same effect with -Wl,-Map=output.map.

References:
gcc options for linking: https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html#Link-Options
GNU Linker (ld) options: https://sourceware.org/binutils/docs/ld/Options.html#Options

3.1 Detecting unresolved symbols in a shared library at link time (-z defs, --no-undefined)

The linker option -z defs (equivalently, --no-undefined) reports unresolved symbols.

$ cat silly.c
#include <stdio.h>

void forgot_to_define(FILE *fp);

void doit(const char *filename)
{
    FILE *fp = fopen(filename, "r");
    if (fp != NULL)
    {
        forgot_to_define(fp);
        fclose(fp);
    }
}

$ gcc -shared -fPIC -o libsilly.so silly.c && echo succeeded || echo failed
succeeded

$ gcc -Wl,-z,defs -shared -fPIC -o libsilly.so silly.c && echo succeeded || echo failed
/tmp/ccmfaNa8.o: In function `doit':
silly.c:(.text+0x35): undefined reference to `forgot_to_define'
collect2: ld returned 1 exit status
failed

Reference: http://stackoverflow.com/questions/1617286/easy-check-for-unresolved-symbols-in-shared-libraries

3.2 How ld searches for shared libraries

The linker uses the following search paths to locate required shared libraries:

  1. Any directories specified by -rpath-link options.
  2. Any directories specified by -rpath options. The difference between -rpath and -rpath-link is that directories specified by -rpath options are included in the executable and used at runtime, whereas the -rpath-link option is only effective at link time. Searching -rpath in this way is only supported by native linkers and cross linkers which have been configured with the --with-sysroot option.
  3. On an ELF system, for native linkers, if the -rpath and -rpath-link options were not used, search the contents of the environment variable LD_RUN_PATH.
  4. On SunOS, if the -rpath option was not used, search any directories specified using -L options.
  5. For a native linker, search the contents of the environment variable LD_LIBRARY_PATH.
  6. For a native ELF linker, the directories in DT_RUNPATH or DT_RPATH of a shared library are searched for shared libraries needed by it. The DT_RPATH entries are ignored if DT_RUNPATH entries exist.
  7. The default directories, normally /lib and /usr/lib.
  8. For a native linker on an ELF system, if the file /etc/ld.so.conf exists, the list of directories found in that file.

If the required shared library is not found, the linker will issue a warning and continue with the link.

Reference: https://sourceware.org/binutils/docs/ld/Options.html

4 GIMPLE

GIMPLE is a family of intermediate representations (IR) based on the tree data structure.

GCC options such as -fdump-tree-all, -fdump-tree-ssa, and -fdump-tree-optimized dump the GIMPLE intermediate representations in a simplified textual form.
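
A small function is enough to try the dump options. This is a minimal sketch; the file name sum.c and the function sum_to are hypothetical.

```c
/* sum.c (hypothetical file name)
 *
 *   gcc -O2 -fdump-tree-optimized -c sum.c
 *
 * writes a dump file whose name ends in .optimized; it contains the
 * GIMPLE for sum_to after the tree optimizers have run.
 */
int sum_to(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Comparing the dump produced with -O0 against the one produced with -O2 is a quick way to see what the optimizers did to the loop.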

References:
https://gcc.gnu.org/wiki/GIMPLE
https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html

5 Tips

5.1 What the terms winding and unwinding mean

Adding a subroutine's entry to the call stack is sometimes called "winding"; conversely, removing entries is "unwinding".

Reference: http://en.wikipedia.org/wiki/Call_stack#Description

5.2 Checking the system's glibc version

How can you check which glibc version the current system uses?

Method 1:

$ ldd --version
GNU C Library stable release version 2.5, by Roland McGrath et al.
Copyright (C) 2006 Free Software Foundation, Inc.

Method 2:

$ /lib/libc.so.6
GNU C Library stable release version 2.5, by Roland McGrath et al.
Copyright (C) 2006 Free Software Foundation, Inc.
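
The version can also be queried from inside a program. The sketch below wraps gnu_get_libc_version(), a glibc extension declared in <gnu/libc-version.h>; the wrapper name glibc_version is hypothetical, and the code builds only on glibc systems.

```c
#include <gnu/libc-version.h>

/* Returns the version string of the glibc the program is running against.
 * gnu_get_libc_version() is a glibc extension. */
const char *glibc_version(void)
{
    return gnu_get_libc_version();
}
```

Unlike the two commands above, this reports the library the process actually loaded at run time, which can differ from the system default when LD_LIBRARY_PATH or an rpath is in effect.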

Author: cig01

Created: <2013-08-17 Sat 00:00>

Last updated: <2017-12-25 Mon 12:17>
