C

1. Introduction to C

C was originally developed by Dennis Ritchie between 1969 and 1973 at AT&T Bell Labs and was used to re-implement the Unix operating system. The developers had considered rewriting the system in B, Thompson's simplified version of BCPL. However, B's inability to take advantage of some of the PDP-11's features, notably byte addressability, led to C.

References:
https://en.wikipedia.org/wiki/The_C_Programming_Language
The C Programming Language, 2nd: http://www.ime.usp.br/~pf/Kernighan-Ritchie/C-Programming-Ebook.pdf
C Programming - A Modern Approach, 2nd Edition: http://www.amazon.com/C-Programming-Modern-Approach-2nd/dp/0393979504/

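1.1. C Language Standards

In 1990, the International Organization for Standardization (ISO) adopted the 1989 ANSI C standard as the ISO C standard (ISO/IEC 9899:1990); this version is known as ANSI C (also C89 or C90). In 1999, ISO revised the standard, keeping the existing language largely intact while adding features to meet application needs; the result was named ISO/IEC 9899:1999 (C99). On December 8, 2011, ISO officially published the next C standard, ISO/IEC 9899:2011, known as C11.

The latest publicly available version of the C99 standard is the combined C99 + TC1 + TC2 + TC3, WG14 N1256, dated 2007-09-07.
The latest publicly available version of the C11 standard is the document WG14 N1570, dated 2011-04-12.

Table 1 shows how the set of standard headers has changed.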

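Table 1: Evolution of the C standard headers (header counts in parentheses)
+-------------------+------------------------+----------------+-----------------+
| ANSI C, 1990 (15) | Amendment 1, 1995 (18) | C99, 1999 (24) | C11, 2011 (29)  |
+-------------------+------------------------+----------------+-----------------+
| <assert.h>        | <assert.h>             | <assert.h>     | <assert.h>      |
| <ctype.h>         | <ctype.h>              | <ctype.h>      | <ctype.h>       |
| <errno.h>         | <errno.h>              | <errno.h>      | <errno.h>       |
| <float.h>         | <float.h>              | <float.h>      | <float.h>       |
| <limits.h>        | <limits.h>             | <limits.h>     | <limits.h>      |
| <locale.h>        | <locale.h>             | <locale.h>     | <locale.h>      |
| <math.h>          | <math.h>               | <math.h>       | <math.h>        |
| <setjmp.h>        | <setjmp.h>             | <setjmp.h>     | <setjmp.h>      |
| <signal.h>        | <signal.h>             | <signal.h>     | <signal.h>      |
| <stdarg.h>        | <stdarg.h>             | <stdarg.h>     | <stdarg.h>      |
| <stddef.h>        | <stddef.h>             | <stddef.h>     | <stddef.h>      |
| <stdio.h>         | <stdio.h>              | <stdio.h>      | <stdio.h>       |
| <stdlib.h>        | <stdlib.h>             | <stdlib.h>     | <stdlib.h>      |
| <string.h>        | <string.h>             | <string.h>     | <string.h>      |
| <time.h>          | <time.h>               | <time.h>       | <time.h>        |
|                   | <iso646.h>             | <iso646.h>     | <iso646.h>      |
|                   | <wchar.h>              | <wchar.h>      | <wchar.h>       |
|                   | <wctype.h>             | <wctype.h>     | <wctype.h>      |
|                   |                        | <complex.h>    | <complex.h>     |
|                   |                        | <fenv.h>       | <fenv.h>        |
|                   |                        | <inttypes.h>   | <inttypes.h>    |
|                   |                        | <stdbool.h>    | <stdbool.h>     |
|                   |                        | <stdint.h>     | <stdint.h>      |
|                   |                        | <tgmath.h>     | <tgmath.h>      |
|                   |                        |                | <stdalign.h>    |
|                   |                        |                | <stdatomic.h>   |
|                   |                        |                | <stdnoreturn.h> |
|                   |                        |                | <threads.h>     |
|                   |                        |                | <uchar.h>       |
+-------------------+------------------------+----------------+-----------------+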

References:
C committee website: http://www.open-std.org/jtc1/sc22/wg14/
ISO/IEC 9899 - Programming languages - C: http://www.open-std.org/jtc1/sc22/wg14/www/standards

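2. C Data Types

2.1. Characters and Numbers

2.1.1. Sizes of the Numeric Types in Bytes

Table 2 shows the number of bytes occupied by the numeric types in C. The values may differ between compilers; the 64-bit column corresponds to an LP64 system such as Linux, whereas 64-bit Windows keeps long at 4 bytes.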

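Table 2: Sizes of the C numeric types in bytes
+-----------------------------------+----------------+----------------+
| C type declaration                | 32-bit machine | 64-bit machine |
+-----------------------------------+----------------+----------------+
| [signed/unsigned] char            | 1              | 1              |
| [signed/unsigned] short [int]     | 2              | 2              |
| [signed/unsigned] int             | 4              | 4              |
| [signed/unsigned] long [int]      | 4              | 8              |
| [signed/unsigned] long long [int] | 8              | 8              |
| char *                            | 4              | 8              |
| float                             | 4              | 4              |
| double                            | 8              | 8              |
| long double                       | -              | -              |
+-----------------------------------+----------------+----------------+

Reference: Computer Systems: A Programmer's Perspective (2nd Edition), Section 2.1.3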

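2.1.2. Fixed-Width Integers (uint32_t)

For better portability, C99 introduced the header stdint.h, which defines a set of fixed-width integer types. They are named intN_t and uintN_t, denoting N-bit signed and unsigned integers respectively, where N is usually 8, 16, 32, or 64. For example, uint32_t is a 32-bit unsigned integer.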
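The following is a minimal sketch of how the fixed-width types behave. The helper names low_byte and wrap_add_u32 are made up for this illustration and are not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a 32-bit value. */
uint8_t low_byte(uint32_t v)
{
    return (uint8_t)(v & 0xFFu);
}

/* Hypothetical helper: the result is converted back to uint32_t, so it is
   always reduced modulo 2^32, regardless of the width of int or long. */
uint32_t wrap_add_u32(uint32_t a, uint32_t b)
{
    return a + b;
}
```

To print such values portably, the format macros from inttypes.h can be used, for example printf("%" PRIu32 "\n", wrap_add_u32(UINT32_MAX, 1)); after including stdio.h and inttypes.h.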

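2.1.3. Floating-Point Numbers

Most C implementations represent floating-point numbers according to the IEEE 754 standard (the C standard itself does not require it):

  • Single precision corresponds to float in C
  • Double precision corresponds to double in C
  • Extended precision typically corresponds to long double in C

Online IEEE 754 converters: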
http://www.h-schmidt.net/FloatConverter/
http://babbage.cs.qc.cuny.edu/IEEE-754/

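2.1.4. Binary Representation of Signed Integers

Signed integers are represented in two's complement on virtually every current machine (C standards before C23 also permitted other representations).

The two's complement of a non-negative integer is identical to its plain binary representation. For a negative integer, take the binary representation of its absolute value, invert every bit, and add 1. Consequently, for a signed integer, a leftmost bit of 0 means the value is non-negative and a leftmost bit of 1 means it is negative.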

#include<stdio.h>

int main()
{
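    /* %x expects an unsigned int, hence the casts; the output shown assumes a 32-bit int */
    printf(" 1=0x%08x\n", (unsigned)1);    /*  1=0x00000001 */
    printf("-1=0x%08x\n", (unsigned)-1);   /* -1=0xffffffff */
    printf("-2=0x%08x\n", (unsigned)-2);   /* -2=0xfffffffe */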
    return 0;
}

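2.2. Literals

2.2.1. Character Literals

A character literal is enclosed in a pair of single quotes ('). Special characters can be escaped with a preceding backslash (\), for example: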

#include<stdio.h>

int main()
{
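    printf("%c\n", 'A');           // prints A
    printf("%c\n", '\'');          // prints '; the single quote must be escaped
    printf("%c\n", '\"');          // prints "
    printf("%c\n", '"');           // prints "; the double quote need not be escaped
    printf("%c\n", '\\');          // prints \, the escape character itself
    printf("%c\n", '\t');          // prints a tab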
    return 0;
}

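In addition, \ooo (o is an octal digit; one, two, or three digits are allowed) and \xhh (h is a hexadecimal digit; the examples here use two, although C accepts one or more) denote character literals written in octal and hexadecimal respectively, for example: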

#include<stdio.h>

int main()
{
    printf("%c\n", '\0');          // null character, octal escape with 1 digit
    printf("%c\n", '\45');         // prints %, octal escape with 2 digits
    printf("%c\n", '\145');        // prints e, octal escape with 3 digits

    printf("%c\n", '\x00');        // null character, hexadecimal escape, same as \0
    printf("%c\n", '\x41');        // prints A, hexadecimal escape
    return 0;
}

Character literals can also take the prefixes L, u, and U (u and U since C11), with the following meanings:

L'x'                  // type wchar_t
u'x'                  // type char16_t
U'x'                  // type char32_t

Reference: https://en.cppreference.com/w/c/language/character_constant

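2.2.2. Integer Literals

Integer literals can be octal (starting with 0), hexadecimal (starting with 0x or 0X), or decimal, for example:

int d = 42;               // decimal
int o = 052;              // octal
int x = 0x2a;             // hexadecimal
int X = 0X2A;             // hexadecimal

Integer literals can take a suffix: u (or U) means unsigned, l (or L) means long, and ll (or LL) means long long, for example: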

#include<stdio.h>

int main()
{
    printf("%u\n", 123u);                       // unsigned
    printf("%ld\n", 123l);                      // long
    printf("%lld\n", 123ll);                    // long long

    printf("%lu\n", 123ul);                     // unsigned long
    printf("%llu\n", 12345678901234567890ull);  // unsigned long long
    return 0;
}

2.2.3. Floating-Point Literals

A floating-point literal has type double by default. With an f or F suffix it is a float; with an l or L suffix it is a long double, for example:

#include<stdio.h>

int main()
{
    printf("%f\n", 123.2);                      // double
    printf("%f\n", 123.2f);                     // float
    printf("%Lf\n", 123.2l);                    // long double

    return 0;
}

Floating-point literals can be written in scientific notation, marked by e or E, for example:

15.75
1.575E1       /* = 15.75   */
1575e-2       /* = 15.75   */
-2.5e-3       /* = -0.0025 */
25E-4         /* =  0.0025 */

If the integer part or the fractional part of a floating-point literal is 0, it can be omitted, for example:

0.075e1
.0075e2
.075e1
75e-2

2.2.4. String Literals

A string literal is enclosed in a pair of double quotes ("). Its type is char[N], and adjacent string literals are automatically merged into one.

Adjacent string literals are concatenated into a single string. After any concatenation, a null byte \0 is appended to the string.

For example:

char* p = "\x12" "3"; // creates a static char[3] array holding {'\x12', '3', '\0'}
                      // sets p to point to the first element of the array

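Like character literals, string literals can contain octal and hexadecimal escape sequences, for example:

char* p1 = "abc\0def";      // strlen(p1) == 3, but the array has size 8
char* p2 = "abc\x00" "def"; // same as above; the literal is split because \x consumes every hex digit that follows, so "\x00def" would be the single out-of-range escape \x00def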

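A string literal can be used to initialize a character array. If the explicitly specified array length leaves no room for the terminating null character, the null character is simply not stored. An initializer with more characters than the array can hold is a constraint violation in C, although compilers such as GCC typically only warn and discard the part that does not fit, for example:

char a1[] = "abc";   // a1 is char[4] holding {'a', 'b', 'c', '\0'}
char a2[4] = "abc";  // a2 is char[4] holding {'a', 'b', 'c', '\0'}
char a3[3] = "abc";  // a3 is char[3] holding {'a', 'b', 'c'} (no terminating '\0')
char a4[3] = "abcd"; // too many characters: compilers typically warn, and a4 holds {'a', 'b', 'c'}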

String literals can also take the prefixes L, u, and U, with the following meanings:

L"xx"                  // type wchar_t[N]
u"xx"                  // type char16_t[N]
U"xx"                  // type char32_t[N]

2.3. Creating New Names for Existing Types (typedef)

typedef creates a new name for an existing type; it does not create a new type. Its usage is:

typedef existing_type new_type_name;

Some typedef examples:

typedef char C;
typedef unsigned int WORD;

2.3.1. typedef and Arrays

typedef can create an alias for an array type, for example:

typedef char Line[81];
Line line, secondline;

This is equivalent to:

char line[81];
char secondline[81];

2.3.2. typedef and Function Pointers

typedef can create an alias for a function pointer type, for example:

typedef void (*PrintHelloHandle)(int);
PrintHelloHandle pFunc;

This is equivalent to:

void (*pFunc)(int);

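3. C Operators

3.1. Precedence and Associativity of C Operators

C operators fall into 15 precedence levels. For operators at the same level, the order of application is determined by their associativity.

The following table lists the C operators in order of precedence (highest to lowest). Their associativity indicates the order in which operators of equal precedence in an expression are applied.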

+-----------------+--------------------------------------------+---------------+
|    Operator     |          Description                       | Associativity |
+-----------------+--------------------------------------------+---------------+
| ()              | Function call                              | left-to-right |
| []              | Array subscript                            |               |
| .               | Member of structure via object name        |               |
| ->              | Member of structure via pointer            |               |
+-----------------+--------------------------------------------+---------------+
| ++ --           | Prefix increment/decrement                 | right-to-left |
| + -             | Unary plus/minus                           |               |
| ! ~             | Logical negation/bitwise complement        |               |
| (type)          | Cast                                       |               |
| *               | Dereference                                |               |
| &               | Address (of operand)                       |               |
| sizeof          | Determine size in bytes                    |               |
+-----------------+--------------------------------------------+---------------+
| *  /  %         | Multiplication/division/modulus            | left-to-right |
+-----------------+--------------------------------------------+---------------+
| + -             | Addition/subtraction                       | left-to-right |
+-----------------+--------------------------------------------+---------------+
| <<  >>          | Bitwise shift left, Bitwise shift right    | left-to-right |
+-----------------+--------------------------------------------+---------------+
| <  <=           | Less than/less than or equal to            | left-to-right |
| >  >=           | Greater than/greater than or equal to      |               |
+-----------------+--------------------------------------------+---------------+
| ==  !=          | Equal to/not equal to                      | left-to-right |
+-----------------+--------------------------------------------+---------------+
| &               | Bitwise AND                                | left-to-right |
+-----------------+--------------------------------------------+---------------+
| ^               | Bitwise exclusive OR                       | left-to-right |
+-----------------+--------------------------------------------+---------------+
| |               | Bitwise inclusive OR                       | left-to-right |
+-----------------+--------------------------------------------+---------------+
| &&              | Logical AND                                | left-to-right |
+-----------------+--------------------------------------------+---------------+
| ||              | Logical OR                                 | left-to-right |
+-----------------+--------------------------------------------+---------------+
| ? :             | Ternary conditional                        | right-to-left |
+-----------------+--------------------------------------------+---------------+
| =               | Assignment                                 | right-to-left |
| += -=           | Addition/subtraction assignment            |               |
| *= /=           | Multiplication/division assignment         |               |
| %= &=           | Modulus/bitwise AND assignment             |               |
| ^= |=           | Bitwise exclusive/inclusive OR assignment  |               |
| <<= >>=         | Bitwise shift left/right assignment        |               |
+-----------------+--------------------------------------------+---------------+
| ,               | Comma (separate expressions)               | left-to-right |
+-----------------+--------------------------------------------+---------------+

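The table above yields the following general pattern of operator precedence:

Primary operators () [] . ->
    ↓
Unary operators
    ↓
Arithmetic operators (multiplication, division, and modulus first, then addition and subtraction)
    ↓
Relational operators
    ↓
Logical operators (excluding the unary operator !)
    ↓
Conditional operator
    ↓
Assignment operators
    ↓
Comma operator

The bitwise operators are scattered across these levels: bitwise complement (~) is a unary operator, the shifts << and >> rank just above the relational operators, and &, ^, | rank below the relational and equality operators.

References: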
http://www.difranco.net/compsci/C_Operator_Precedence_Table.htm
The C Programming Language, 2nd, 2.12 Precedence and Order of Evaluation

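3.2. C Bitwise Operators (&, |, ^, ~, >>, <<)

The bitwise operators in C are &, |, ^, ~, >>, and <<. Table 3 is the truth table for the first three, &, |, and ^.

Table 3: The truth tables for &, |, and ^.
+---+---+-------+-------+-------+
| p | q | p & q | p | q | p ^ q |
+---+---+-------+-------+-------+
| 0 | 0 | 0     | 0     | 0     |
| 0 | 1 | 0     | 1     | 1     |
| 1 | 1 | 1     | 1     | 0     |
| 1 | 0 | 0     | 1     | 1     |
+---+---+-------+-------+-------+

The exclusive-or (XOR) operator ^ works bit by bit: a result bit is 1 when the two operand bits differ and 0 when they are the same.

XOR can be used for simple encryption. For example, with plaintext text and key security, XORing them yields the ciphertext:

text ^ security = ciphertext

XORing the ciphertext with the key once more decrypts it:

security ^ ciphertext = text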
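As a rough sketch of this idea (an illustration of the XOR round trip only, not a secure cipher), the hypothetical helper xor_with_key below XORs a buffer with a repeating key. Calling it twice with the same key restores the original bytes.

```c
#include <stddef.h>

/* Hypothetical helper: XOR every byte of buf with the key, repeating the
   key as needed. key_len must be greater than 0. */
void xor_with_key(unsigned char *buf, size_t len,
                  const unsigned char *key, size_t key_len)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i % key_len];
    }
}
```

The same function serves for both encryption and decryption because (t ^ k) ^ k == t for every bit pattern.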

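3.2.1. Right Shift

A right shift x >> k comes in two forms: a logical right shift fills the left end with k zeros, while an arithmetic right shift fills the left end with k copies of the most significant bit.

The C standard does not specify which form is used for signed values (the result of right-shifting a negative value is implementation-defined). For unsigned values, the right shift must be logical. For signed values, almost all compilers use an arithmetic right shift (filling the left end with the sign bit).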
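The two forms can be sketched as follows. Both function names are invented for this example; the arithmetic version is written so that it never right-shifts a negative value, and therefore does not depend on the compiler's choice. In both functions k must be smaller than 32.

```c
#include <stdint.h>

/* Logical right shift: an unsigned operand always gets zeros from the left. */
uint32_t shift_right_logical(uint32_t x, unsigned k)
{
    return x >> k;
}

/* Arithmetic right shift without shifting a negative value:
   for x < 0, (-1 - x) is non-negative, so its shift is well defined,
   and -1 - (that result) puts the sign bits back. */
int32_t shift_right_arithmetic(int32_t x, unsigned k)
{
    if (x >= 0) {
        return x >> k;
    }
    return -1 - ((-1 - x) >> k);
}
```

For example, shift_right_logical(0x80000000u, 4) gives 0x08000000, while shift_right_arithmetic(-8, 1) gives -4.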

4. C Enumerations

In C, enum defines an enumeration. Enumerators start from the integer 0 by default, for example:

#include<stdio.h>

int main()
{
    enum boolean { NO, YES };
    printf("NO=%d\n", NO);          // prints NO=0
    printf("YES=%d\n", YES);        // prints YES=1
    return 0;
}

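The value of an enumerator can also be given explicitly; each following enumerator without an explicit value is one greater than its predecessor, for example: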

#include<stdio.h>

enum week{Mon=1, Tue, Wed, Thur, Fri, Sat, Sun};    // the first enumerator is set to 1

int main()
{
    enum week day;

    day = Wed;
    printf("day=%d\n", day);        // prints day=3
    return 0;
}

5. C Structures

Pointers to structures are so frequently used that an alternative notation (->) is provided as a shorthand.
If p is a pointer to a structure, then p->member-of-structure refers to the particular member.

Both . and -> associate from left to right, so if we have

struct point {
    int x;
    int y;
};

struct rect {
    struct point pt1;
    struct point pt2;
};

struct rect r, *rp = &r;

then these four expressions are equivalent:

    r.pt1.x
    rp->pt1.x
    (r.pt1).x
    (rp->pt1).x

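Reference: The C Programming Language, 2nd, Section 6.2

5.1. Structure Memory Alignment

Two rules govern the memory alignment of structures:

  1. The starting address of each member must be a multiple of the alignment of its type; if it is not, padding bytes are inserted between it and the previous member.
  2. The total size of the structure must be a multiple of the largest alignment among the types of its members; if it is not, padding bytes are added after the last member.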
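Both rules can be checked with offsetof and the C11 _Alignof operator. This is only a sketch: struct Sample and the function name are made up for the illustration, and the check assumes that double has the largest alignment among the three member types, which holds on common platforms.

```c
#include <stddef.h>   /* offsetof */

struct Sample {       /* hypothetical struct used only for this sketch */
    char   tag;
    double value;
    short  count;
};

/* Returns 1 if both alignment rules hold for struct Sample, otherwise 0. */
int sample_layout_obeys_rules(void)
{
    /* Rule 1: every member starts at a multiple of its type's alignment. */
    int members_aligned =
        offsetof(struct Sample, tag)   % _Alignof(char)   == 0 &&
        offsetof(struct Sample, value) % _Alignof(double) == 0 &&
        offsetof(struct Sample, count) % _Alignof(short)  == 0;

    /* Rule 2: the total size is a multiple of the largest member alignment
       (assumed here to be that of double). */
    int size_is_multiple = sizeof(struct Sample) % _Alignof(double) == 0;

    return members_aligned && size_is_multiple;
}
```

Printing offsetof and sizeof for the members is a convenient way to see where a given compiler inserts padding.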

5.1.1. Alignment Values by Type

The following typical alignments are valid for compilers from Microsoft (Visual C++), Borland/CodeGear (C++Builder), Digital Mars (DMC), and GNU (GCC) when compiling for 32-bit x86:

  • A char (one byte) will be 1-byte aligned.
  • A short (two bytes) will be 2-byte aligned.
  • An int (four bytes) will be 4-byte aligned.
  • A long (four bytes) will be 4-byte aligned.
  • A float (four bytes) will be 4-byte aligned.
  • A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on Linux (8-byte with -malign-double compile time option).
  • A long long (eight bytes) will be 8-byte aligned.
  • A long double (ten bytes with C++Builder and DMC, eight bytes with Visual C++, twelve bytes with GCC) will be 8-byte aligned with C++Builder, 2-byte aligned with DMC, 8-byte aligned with Visual C++, and 4-byte aligned with GCC.
  • Any pointer (four bytes) will be 4-byte aligned. (e.g.: char*, int*)

The only notable differences in alignment for an LP64 64-bit system when compared to a 32-bit system are:

  • A long (eight bytes) will be 8-byte aligned.
  • A double (eight bytes) will be 8-byte aligned.
  • A long double (eight bytes with Visual C++, sixteen bytes with GCC) will be 8-byte aligned with Visual C++ and 16-byte aligned with GCC.
  • Any pointer (eight bytes) will be 8-byte aligned.

Reference: http://en.wikipedia.org/wiki/Data_structure_alignment

5.1.2. Structure Alignment Example

How many bytes does a variable of the following type occupy?

struct MixedData
{
    char Data1;
    short Data2;
    int Data3;
    char Data4;
};

The answer is 12 bytes (with the typical alignments listed above). The analysis is as follows:

struct MixedData  /* After compilation in 32-bit(64-bit) x86 machine */
{
    char Data1;   /* 1 byte */
    char Padding1[1]; /* 1 byte for the following 'short' to be aligned on
                         a 2 byte boundary assuming that the address where
                         structure begins is an even number */
    short Data2;  /* 2 bytes */
    int Data3;    /* 4 bytes - largest structure member */
    char Data4;   /* 1 byte */
    char Padding2[3]; /* 3 bytes to make total size of the structure 12 bytes */
};

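Example taken from: https://en.wikipedia.org/wiki/Data_structure_alignment

6. C Pointers and Arrays

Consider the following two declarations:

int a[5];               // a is an array of int that can hold 5 elements
int *b;                 // b is a pointer to int

Both the array a and the pointer b can be used with indirection and subscripting, but they differ significantly.

When an array such as a is declared, the compiler reserves memory for the number of elements given in the declaration and then creates the array name, whose value is a constant pointing to the start of that memory.
When a pointer such as b is declared, the compiler reserves memory only for the pointer itself; it does not allocate memory for any integer values. Moreover, the pointer is not initialized to point to any existing memory, and if it is an automatic variable it is not initialized at all.

The two declarations can be pictured as follows: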

a
+----+----+----+----+----+
|    |    |    |    |    |
+----+----+----+----+----+

b
+----+
|    |
+----+

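Therefore, after the declarations above, the expression *a is perfectly legal (it is the value of the first element of the array a), whereas the expression *b accesses some indeterminate location in memory. Conversely, the expression b++ compiles but a++ does not, because the value of a is a constant.

Reference: Pointers on C, Section 8.1.5 (Arrays and Pointers)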

6.1. Pointers

A pointer is a variable that contains the address of a variable.

The unary operator & gives the address of an object.
The unary operator *, when applied to a pointer, accesses the object the pointer points to.

6.2. Array Decay to Pointer

"Decay" refers to the implicit conversion of an expression from an array type to a pointer type.

int a[] = { 1, 3, 5, 7, 9 };
int *p = a;

You lose the ability of the sizeof operator to count elements in the array:

printf("%zu\n", sizeof(a));   // prints 20 (with 4-byte int), from which the element count 5 can be computed
printf("%zu\n", sizeof(p));   // prints 8 on a 64-bit machine; the element count cannot be recovered

This lost ability is referred to as "decay".

Reference: http://stackoverflow.com/questions/1461432/what-is-array-decaying

6.3. An Array Name Is the Address of Its First Element

Since the name of an array is a synonym for the location of the initial element, the assignment pa=&a[0] can also be written as pa = a.

int a[10];             /* defines an array of size 10 */
int *pa;
pa = &a[0];            /* sets pa to point to a[0] */
pa = a;                /* same as above */

6.4. Convert a[i] to *(a+i)

C converts a[i] to *(a+i) immediately; the two forms are always equivalent.
Thus, &a[i] and a+i are also identical.
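A small sketch of this equivalence: the two hypothetical functions below compute the same sum, one with pointer arithmetic and one with subscripts.

```c
/* Sum n ints using pointer arithmetic only; *(a + i) is what a[i] means. */
int sum_by_pointer(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += *(a + i);
    }
    return total;
}

/* The same loop written with subscripts. */
int sum_by_index(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Given int v[] = { 1, 3, 5, 7, 9 };, both sum_by_pointer(v, 5) and sum_by_index(v, 5) return 25, and &v[2] == v + 2 holds.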

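6.5. Two-Dimensional Arrays

Define a two-dimensional array a:

int a[2][4] = { {0,1,2,3}, {4,5,6,7} };

Its elements can be accessed as a[0][0], a[0][1], and so on.

The C compiler always performs the conversion a[i][j] = *(a[i] + j) = *(*(a+i) + j).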
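The three spellings can be compared directly. In this sketch the function names are invented, and the parameter type int (*a)[4] (a pointer to a row of 4 ints) is what a two-dimensional array with 4 columns decays to when passed to a function.

```c
/* Three equivalent ways to read element (i, j) of an array with 4 columns. */
int get_by_subscript(int (*a)[4], int i, int j)
{
    return a[i][j];
}

int get_by_mixed(int (*a)[4], int i, int j)
{
    return *(a[i] + j);
}

int get_by_pointers(int (*a)[4], int i, int j)
{
    return *(*(a + i) + j);
}
```

With int a[2][4] = { {0,1,2,3}, {4,5,6,7} };, all three functions return 6 for i = 1 and j = 2.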

6.5.1. Two-Dimensional Arrays and Pointers: Why the Warning?

Consider the following code (1.c):

#include<stdio.h>

int main()
{
    int a[2][4] = { {0,1,2,3}, {4,5,6,7} };
    int *p=a;                 /* Wrong!!! Please use &a[0][0] or a[0] */
    printf("%d\n", *(p+1));
    return 0;
}

Compiling this program produces a warning similar to the following:

cc     1.c   -o 1
1.c: In function 'main':
1.c:6:12: warning: initialization from incompatible pointer type
     int *p=a;
            ^

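Why? a is a two-dimensional array with two elements, each of which is a one-dimensional array. As noted earlier, an array name is the address of the array's first element, so a is the address of a one-dimensional array, and assigning a to a pointer of type int * is inappropriate (it is an incompatible pointer conversion). To make the warning disappear, write int *p=&a[0][0]; or, more briefly, int *p=a[0]; (a[0] is itself an array of int, so it decays to &a[0][0]).

Summary: a[0] points to the first element of row 0, and a[1] points to the first element of row 1.

6.5.2. Two-Dimensional Arrays and Pointers: Row Pointers

To initialize a row pointer (such as p in the example below), simply use the name of the two-dimensional array (the address of its first element).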

#include<stdio.h>

int main()
{
    int a[2][4] = { {0,1,2,3}, {4,5,6,7} };
    int (*p)[4]=a;                  /* &a[0] also works but is unnecessary */
    printf("%p\n", (void *)*(p+1)); /* prints the address of the array {4,5,6,7} */
    printf("%d\n", *(*(p+1) + 1));  /* prints the number 5 */
    return 0;
}

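Analysis: a is a two-dimensional array with two elements, each of which is a one-dimensional array. As noted earlier, an array name is the address of the array's first element, so a is the address of a one-dimensional array. We can therefore define p to point to the first row of the array with int (*p)[4]=a; or, equivalently, with int (*p)[4]=&a[0]; (because &a[i] and a+i are always identical).

Reference: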
http://stackoverflow.com/questions/24578628/pointer-to-an-entire-row-in-a-2-d-array

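6.5.3. Can a Two-Dimensional Array Be Converted to a Pointer to a Pointer?

A two-dimensional array cannot be converted to a pointer to a pointer: a 2D array and a pointer-to-pointer are incompatible types.

The following code is wrong and produces a warning (incompatible pointer types) at compile time.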

  int a[2][3] = {{1,2,3}, {4,5,6}};
  int **p=a;                        /* Wrong! Incompatible pointer types. */

If a pointer to a pointer is really needed, the two-dimensional array can be "converted" like this:

  int a[2][3] = {{1,2,3}, {4,5,6}};
  int *a_rows[2] = {a[0], a[1]};      /* a_rows acts as the intermediary */
  int **p=a_rows;

References:
http://stackoverflow.com/questions/8203700/conversion-of-2d-array-to-pointer-to-pointer
http://stackoverflow.com/questions/1052818/create-a-pointer-to-two-dimensional-array

6.5.4. Passing a Two-Dimensional Array to a Function

If a two-dimensional array is to be passed to a function, the parameter declaration in the function must include the number of columns; the number of rows is irrelevant.

For instance, consider this two-dimensional array:

int daytab[2][13] = {
    {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
    {0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
};

If the array daytab is to be passed to a function f, the declaration of f would be:

f(int daytab[2][13]) { ... }

It could also be

f(int daytab[][13]) { ... }

since the number of rows is irrelevant, or it could be

f(int (*daytab)[13]) { ... }

which says that the parameter is a pointer to an array of 13 integers. The parentheses are necessary since brackets [] have higher precedence than *.
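As a brief sketch of the last form, the hypothetical function days_in_year below receives a pointer to rows of 13 ints and sums one row. The column count 13 is part of the parameter type, while the number of rows is not.

```c
/* Sum months 1..12 of the chosen row; leap is 0 or 1 and selects the row. */
int days_in_year(int (*daytab)[13], int leap)
{
    int total = 0;
    for (int month = 1; month <= 12; month++) {
        total += daytab[leap][month];
    }
    return total;
}
```

Called with an int array of the shape shown above, days_in_year(daytab, 0) yields 365 and days_in_year(daytab, 1) yields 366.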

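Reference:
"The C Programming Language, 2nd" 5.7 Multi-dimensional Arrays

6.6. Initializing Character Arrays

When there are fewer initializers than array elements, all remaining elements are automatically set to the null character \0: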

char c[5]={'a', 'b'};
char c[5]={'a', 'b', '\0', '\0', '\0'};

The two definitions above are identical.

char c[]={"I am happy"};
char c[]="I am happy";
char c[]={'I',' ','a','m',' ','h','a','p','p','y','\0'};

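The three forms above are equivalent (length 11). Note that they differ from the following array (length 10), which has no terminating null character.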

char c[]={'I',' ','a','m',' ','h','a','p','p','y'};

6.7. Pointers to Pointers

When are pointers to pointers used?

  • The name of an array usually yields the address of its first element. So if the array contains elements of type t, a reference to the array has type t *. Now consider an array whose elements are themselves pointers of type t * (for example, the row pointers of a dynamically allocated matrix): a reference to this array has type (t *)* = t **, and is hence a pointer to a pointer. (A true 2D array t a[M][N] decays to t (*)[N] instead, as section 6.5.3 explains.)
  • Even though an array of strings sounds one-dimensional, it is usually stored as an array of char * (as with argv), each element pointing to a character array. Hence: char **.
  • A function f will need to accept an argument of type t ** if it is to alter a variable of type t *.
  • Many other reasons that are too numerous to list here.
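The third case above (a function that alters a variable of type t *) can be sketched like this. The function name skip_spaces is invented for the example.

```c
/* Advance *cursor past any leading spaces. The caller's pointer itself is
   updated, which is only possible because its address is passed in. */
void skip_spaces(const char **cursor)
{
    while (**cursor == ' ') {
        (*cursor)++;
    }
}
```

After const char *s = "   abc"; skip_spaces(&s); the pointer s points at the 'a'. Had the parameter been a plain const char *, only the function's local copy would have moved.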

Reference: http://stackoverflow.com/questions/897366/how-do-pointer-to-pointers-work-in-c

6.7.1. Allocating Memory for a Pointer to a Pointer

We will take an example for allocating memory to a pointer to pointer to float values. Let the number of rows be '4' and the number of columns '3'.

float **float_values;

float_values = (float**) malloc(4 *sizeof(float*));         //allocate memory for rows

for(int i=0; i<4; i++) {
   *(float_values + i) = (float*) malloc(3 *sizeof(float)); //for each row allocate memory for columns
}

Figure 1: Allocating memory to a 'Pointer to Pointer' variable (image c_ptr_to_ptr.jpg)

Reference: http://www.codeproject.com/Articles/12449/Allocating-memory-to-a-Pointer-to-Pointer-variable
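The snippet above omits error handling and cleanup. One possible way to package the same idea, with hypothetical function names alloc_rows and free_rows, is sketched below; it checks every malloc result and releases the memory in the reverse order of allocation.

```c
#include <stdlib.h>

/* Allocate `rows` row pointers, each pointing to `cols` floats.
   Returns NULL if any allocation fails (nothing is leaked in that case). */
float **alloc_rows(size_t rows, size_t cols)
{
    float **m = malloc(rows * sizeof *m);
    if (m == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < rows; i++) {
        m[i] = malloc(cols * sizeof **m);
        if (m[i] == NULL) {
            while (i > 0) {          /* undo the rows allocated so far */
                free(m[--i]);
            }
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Free every row first, then the array of row pointers. */
void free_rows(float **m, size_t rows)
{
    if (m == NULL) {
        return;
    }
    for (size_t i = 0; i < rows; i++) {
        free(m[i]);
    }
    free(m);
}
```

For the 4 x 3 case in the text, float **float_values = alloc_rows(4, 3); gives the same layout, and free_rows(float_values, 4); releases it.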

7. C Declarations

The declaration syntax of C99:

declaration:
    declaration-specifiers [init-declarator-list];

declaration-specifiers:
    storage-class-specifier [declaration-specifiers]
    type-specifier [declaration-specifiers]
    type-qualifier [declaration-specifiers]

......

Reference: ISO/IEC 9899:1999 (E), 6.7 Declarations

7.1. Storage-class specifiers (extern, static, ...)

The storage-class-specifier syntax:

storage-class-specifier:
    typedef
    extern
    static
    auto
    register

Note 1: In C99, a declaration may contain at most one storage-class specifier.
Note 2: typedef has nothing to do with storage classes; it is grouped with them only to simplify the grammar.
The typedef specifier is called a 'storage-class specifier' for syntactic convenience only.

References:
http://stackoverflow.com/questions/8674236/is-typedef-a-storage-class-specifier
ISO/IEC 9899:1999 (E), 6.7.1 Storage-class specifiers

7.1.1. Storage-class specifier: extern

extern changes the linkage. With this keyword, the function/variable is assumed to be available somewhere else and the resolving is deferred to the linker.

extern indicates that the function or variable is defined elsewhere.

7.1.1.1. extern with Functions

Function declarations are extern by default.

int foo(int arg1, char arg2);
extern int foo(int arg1, char arg2);    /* extern keyword can be omitted */

7.1.1.2. extern with Variables

If the program is in several source files, and a variable is defined in file1 and used in file2 and file3, then extern declarations are needed in file2 and file3 to connect the occurrences of the variable.

extern example 1:

/* file1.c */
int a=1;

/* file2.c */
#include<stdio.h>
extern int a;             /* extern says that a is defined elsewhere */

int main()
{
    printf("a=%d\n", a);   /* prints a=1 */
    return 0;
}

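The example above is the typical use of extern: it states that a global variable is defined in another file.
Note: if the extern keyword is removed from file2.c, then int a; becomes a tentative definition of a second, uninitialized global variable a in file2.c, and the program has two external definitions of a, which the C standard leaves undefined. Toolchains that follow the traditional "common" model treat the uninitialized a as a weak symbol; when file2.o and file1.o are linked, the initialized a from file1.o is chosen because it is a strong symbol, so the program still prints a=1. GCC 10 and later default to -fno-common and report a multiple-definition error at link time instead.

extern example 2: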

#include<stdio.h>

int a=1;
int b=1;

int main()
{
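    int a;              /* this local a hides the global a and is not initialized */
    extern int b;       /* extern says that b is defined elsewhere */

    printf("a=%d\n", a);   /* the value is indeterminate: the local a was never initialized */
    printf("b=%d\n", b);   /* prints b=1 */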

    return 0;
}

Note: because b is defined earlier in the same file, the extern int b; line in the example above is redundant and is usually omitted in practice.

Reference: The C Programming Language, 2nd, 1.10 External Variables and Scope

7.1.2. Storage-class specifier: static

The static declaration, applied to an external variable or function, limits the scope of that object to the rest of the source file being compiled.

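When a global variable (or function) is used only within the current file, it is best declared static so that other files cannot reference it; this helps keep the program modular. Different files can use static global variables (or functions) with the same name without affecting one another.

7.1.2.1. static with Local Variables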

The static declaration can also be applied to internal variables. Internal static variables are local to a particular function just as automatic variables are, but unlike automatics, they remain in existence rather than coming and going each time the function is activated. This means that internal static variables provide private, permanent storage within a single function.
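A minimal sketch of such private, permanent storage (the function name next_ticket is invented for the example):

```c
/* Each call returns the next number in sequence. `count` is visible only
   inside this function but keeps its value between calls. */
int next_ticket(void)
{
    static int count = 0;   /* initialized once, before the first call */
    count++;
    return count;
}
```

Successive calls return 1, 2, 3, and so on. Without the static keyword, count would be re-created and reset to 0 on every call, and the function would always return 1.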

7.1.3. Storage-class specifier: auto

auto is the default storage class for local variables. auto can only be used within functions.

Because local variables inside functions are auto by default, the auto keyword is almost never written explicitly.

7.1.4. Storage-class specifier: register

A register declaration advises the compiler that the variable in question will be heavily used.
The idea is that register variables are to be placed in machine registers, which may result in smaller and faster programs. But compilers are free to ignore the advice.

However, if an object is declared register, the unary & operator may not be applied to it, explicitly or implicitly. It is illegal to take the address of an object declared register, whether or not it is actually placed in a register.

7.2. Type specifiers (void, char, ...)

The type-specifier syntax:

type-specifier:
    void
    char
    short
    int
    long
    float
    double
    signed
    unsigned
    _Bool
    _Complex
    _Imaginary
    struct-or-union-specifier
    enum-specifier
    typedef-name

More than one type specifier may be given, as in unsigned int.

Reference: ISO/IEC 9899:1999 (E), 6.7.2 Type specifiers

7.3. Type qualifiers (const, restrict, volatile)

The type-qualifier syntax:

type-qualifier:
    const
    restrict
    volatile
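
Table 4: Type qualifiers in C
+----------------+--------------------------------------------+
| Type qualifier | Description                                |
+----------------+--------------------------------------------+
| const          | Read-only object                           |
| restrict       | Added in C99; applies only to pointers     |
| volatile       | Object whose value may change unexpectedly |
+----------------+--------------------------------------------+

Note: the standard does not limit a declaration to a single type qualifier. The following example uses both const and volatile:

extern const volatile int real_time_clock;
/* real_time_clock may be modifiable by hardware, but cannot
   be assigned to, incremented, or decremented. */

Reference: ISO/IEC 9899:1999 (E), 6.7.3 Type qualifiers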

7.3.1. Type qualifier: const

The qualifier const can be applied to the declaration of any variable to specify that its value will not be changed.

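Example: const and pointers

const int *A;        // const qualifies the pointed-to object: A can change, the object A points to cannot
int const *A;        // const qualifies the pointed-to object: A can change, the object A points to cannot (same as above)
int *const A;        // const qualifies the pointer A: A cannot change, the object A points to can
const int *const A;  // neither the pointer A nor the object it points to can change

Note: a declaration such as int *const A without an initializer is of little use (because A can never be changed afterwards); it should be initialized where it is defined, for example int *const A=&a;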

7.3.2. Type qualifier: volatile

volatile prevents the compiler from optimizing code in a way that changes the program's intended behavior.
Consider the following code:

XBYTE[2]=0x55;
XBYTE[2]=0x56;
XBYTE[2]=0x57;
XBYTE[2]=0x58;

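To the external hardware, these four statements are four distinct operations that trigger four different actions. The compiler, however, may optimize them, decide that only the last one has any effect, and drop the first three (emitting a single store). If the location is declared volatile, the compiler compiles each statement in turn and emits the corresponding machine code (four stores).

7.3.2.1. volatile Example

The following is a volatile example:

/* This program is not portable; different compilers give different results! */
#include <stdio.h>
int main()
{
    volatile int i = 10;
    int a,b;

    a = i;
    printf("i=%d\n", a);
    /* The inline assembly below is meant to change the value of i in memory to 20
       without the C compiler knowing about it; the stack offset of i depends on
       the compiler's frame layout */
    asm ("movl $20,   8(%rbp)");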

    b = i;
    printf("i=%d\n", b);

    return 0;
}

Because the variable i is declared volatile, the output is:

i=10
i=20

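If the volatile keyword is removed from i, the output may instead be the following (the C compiler does not know that the inline assembly modified i, so it may optimize b = i into a use of the initial value of i):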

i=10
i=10

References:
http://blog.chinaunix.net/uid-22906954-id-4598507.html
http://baike.baidu.com/view/608706.htm

7.3.3. Type qualifier: restrict

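restrict was introduced by the C99 standard and can only be used to qualify pointers. It states that the pointer is the sole and initial means of accessing the object it points to. In other words, it tells the compiler that every modification of the memory the pointer points to is made through this pointer (or a pointer derived from it), and not through any other variable or pointer. This is a promise made by the programmer, and it lets the compiler optimize more aggressively and generate more efficient code.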
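A short sketch of a function that makes this promise (the name add_arrays is invented for the example): because dst, a, and b are all restrict-qualified, the compiler may assume that writing dst[i] never changes any a[j] or b[j].

```c
#include <stddef.h>

/* The caller promises that dst does not overlap a or b. */
void add_arrays(size_t n, float *restrict dst,
                const float *restrict a, const float *restrict b)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

The compiler does not check the promise. Calling add_arrays with a dst that overlaps a or b breaks it, and the behavior is then undefined.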

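7.4. How to Read Complex C Declarations

Declarations in C can become very complicated; the following precedence rules help in understanding them.

Precedence rules for C declarations

A Start reading the declaration at its name, then continue in order of precedence.
B The precedence, from highest to lowest, is:
  B.1 The parts of the declaration enclosed in parentheses.
  B.2 Postfix operators: parentheses () indicate a function, and square brackets [] indicate an array.
  B.3 Prefix operator: an asterisk * means "pointer to ...".
C If a const and/or volatile keyword is immediately followed by a type specifier (such as int or long), it applies to that type specifier. Otherwise, the const and/or volatile keyword applies to the pointer asterisk immediately to its left.

References:
Expert C Programming, Section 3.3
Online C declaration analyzer: http://cdecl.org/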

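7.4.1. Declaration Analysis Example

Consider the declaration:

char * const *(*next)();

It can be analyzed as shown in Table 5.

Table 5: Analysis of the declaration char * const *(*next)();
+------+----------------------------------------------------------------------+
| Rule | Explanation                                                          |
+------+----------------------------------------------------------------------+
| A    | First, look at the variable name "next" and note that it is directly |
|      | enclosed in parentheses.                                             |
| B.1  | So treat what is in the parentheses as a unit: "next is a pointer    |
|      | to ...".                                                             |
| B    | Then consider what is outside the parentheses, choosing between the  |
|      | prefix asterisk and the postfix parentheses.                         |
| B.2  | Rule B.2 says the function parentheses on the right bind more        |
|      | tightly, so "next is a pointer to a function returning ...".         |
| B.3  | Then process the prefix "*", giving "pointer to ...".                |
| C    | Finally, read "char * const" as a constant pointer to char.          |
+------+----------------------------------------------------------------------+

Summing up the analysis, the declaration means: "next is a pointer to a function that returns a pointer to a constant pointer to char".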
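To make the result concrete, here is a small sketch that declares such a pointer and points it at a matching function. The names first, second, names, and get_names are invented for the illustration, and the parameter list is written as (void) rather than left empty.

```c
/* Two strings and a table of constant pointers to them
   (the pointers in the table cannot be reseated). */
static char first[]  = "alpha";
static char second[] = "beta";
static char *const names[] = { first, second };

/* A function returning "pointer to constant pointer to char". */
char *const *get_names(void)
{
    return names;   /* the array decays to a pointer to its first element */
}

/* Same shape as the declaration analyzed above. */
char *const *(*next)(void) = get_names;
```

With these definitions, next()[0] is the string "alpha" and next()[1] is "beta". An assignment such as next()[0] = second; is rejected by the compiler because the pointers in the table are const.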

7.5. The Difference Between Definition and Declaration

"Definition" refers to the place where the variable is created or assigned storage; "declaration" refers to places where the nature of the variable is stated but no storage is allocated.

8. C Statements

8.1. The if Statement

Form 1 of the if statement:

if (expression)
  statement1
else
  statement2

where the else part is optional.

Form 2 of the if statement:

if (expression)
  statement
else if (expression)
  statement
else if (expression)
  statement
else if (expression)
  statement
else
  statement

where the else part is optional.

8.2. The switch Statement

The switch statement is a multi-way decision that tests whether an expression matches one of a number of constant integer values, and branches accordingly.

switch (expression) {
  case const-expr: statements
  case const-expr: statements
  default: statements
}

Each case is labeled by one or more integer-valued constants or constant expressions. If a case matches the expression value, execution starts at that case. All case expressions must be different. The case labeled default is executed if none of the other cases are satisfied. A default is optional; if it isn't there and if none of the cases match, no action at all takes place. Cases and the default clause can occur in any order.

The break statement causes an immediate exit from the switch.
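A sketch of these rules in use, loosely modeled on the character-counting idea (the function name and signature are this example's own): several case labels share one group of statements, and each group ends with break so that execution does not fall through into the next group.

```c
/* Count digits, white space, and all other characters in the string s. */
void count_chars(const char *s, int *digits, int *spaces, int *others)
{
    *digits = *spaces = *others = 0;
    for (; *s != '\0'; s++) {
        switch (*s) {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            (*digits)++;
            break;
        case ' ':
        case '\t':
        case '\n':
            (*spaces)++;
            break;
        default:
            (*others)++;
            break;
        }
    }
}
```

For the string "a1 b22\n" this reports 3 digits, 2 white-space characters, and 2 other characters.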

8.3. The for, while, and do-while Statements

C has three loop statements, illustrated in Figures 2, 3, and 4.

Figure 2: for loop in C (image c_for_loop.jpg)

Figure 3: while loop in C (image c_while_loop.jpg)

Figure 4: do while loop in C (image c_do_while_loop.jpg)
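As a rough illustration of the three loops, the hypothetical functions below each add up the integers from 1 to n. The do-while version differs in one respect: its body always runs at least once, because the condition is tested after the body.

```c
/* for: initialization, test, and increment are all in the loop header. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* while: the condition is tested before every iteration. */
int sum_while(int n)
{
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* do-while: the condition is tested after every iteration. */
int sum_do_while(int n)
{
    int total = 0;
    int i = 1;
    do {
        total += i;
        i++;
    } while (i <= n);
    return total;
}
```

All three return 10 for n = 4. For n = 0, sum_for and sum_while return 0, but sum_do_while returns 1 because its body executes once before the test.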

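8.4. The break and continue Statements

Table 6 describes break and continue in C.

Table 6: break and continue in C
+-------------------+-----------------------------------------------------------+
| Control statement | Description                                               |
+-------------------+-----------------------------------------------------------+
| break             | Terminates the loop or switch statement and transfers     |
|                   | execution to the statement immediately following the loop |
|                   | or switch.                                                |
+-------------------+-----------------------------------------------------------+
| continue          | Causes the loop to skip the remainder of its body and     |
|                   | immediately retest its condition prior to reiterating.    |
+-------------------+-----------------------------------------------------------+

Note: continue has no meaning for a switch statement itself; a continue inside a switch applies to the enclosing loop.

8.5. goto and Labels

goto and labels are local to a function: a goto can only jump to a label within the same function.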

9. C Functions

9.1. Arguments - Call by Value

In C, all function arguments are passed by value. This means that the called function is given the values of its arguments in temporary variables rather than the originals. This leads to some different properties than are seen with call by reference languages like Fortran or with var parameters in Pascal, in which the called routine has access to the original argument, not a local copy.

When the name of an array is used as an argument, the value passed to the function is the address of the beginning of the array - there is no copying of array elements.
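The consequence is easy to see with a swap function. In this sketch (both function names are invented), the first version only exchanges its private copies, while the second receives the addresses of the caller's variables, which are themselves passed by value, and modifies the originals through them.

```c
/* Receives copies: exchanging them has no effect on the caller's variables. */
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

/* Receives addresses: the caller's variables are modified through them. */
void swap_by_pointer(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

With int x = 1, y = 2;, the call swap_by_value(x, y) leaves x and y unchanged, whereas swap_by_pointer(&x, &y) leaves x == 2 and y == 1.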

9.1.1. Order of Argument Evaluation

The C99 standard does not specify the order in which function arguments are evaluated.

From C99

6.5.2.2 Function calls
...
10 The order of evaluation of the function designator, the actual arguments, and subexpressions within the actual arguments is unspecified, but there is a sequence point before the actual call.

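This clause states explicitly that the order in which function arguments are evaluated is unspecified, but that all of them are evaluated before the function is actually called (because there is a sequence point before the actual call).

9.1.2. void as the Parameter List

In C (before C23), an empty parameter list and a parameter list of void have different meanings.
Note: in C++ they mean the same thing: the function accepts no arguments.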

int func1();      // In C, declare function taking unspecified parameters
int func2(void);  // In C, declare function taking zero parameters

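Reference: http://stackoverflow.com/questions/7140045/function-pointer-declaration

9.2. Variable-Length Argument Lists

How are variable-length argument lists implemented?

Since C89, the header stdarg.h has defined one type, va_list, and three macros: va_start, va_arg, and va_end.

9.2.1. Variable-Argument Function Example 1

Variable-argument function example 1: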

#include <stdio.h>
#include <stdarg.h>

/* minprintf: minimal printf with variable argument list */
void minprintf(char *fmt, ...)
{
    va_list ap; /* points to each unnamed arg in turn */
    char *p, *sval;
    int ival;
    double dval;
    va_start(ap, fmt); /* make ap point to 1st unnamed arg */
    for (p = fmt; *p; p++) {
        if (*p != '%') {
            putchar(*p);
            continue;
        }
        switch (*++p) {
        case 'd':
            ival = va_arg(ap, int);
            printf("%d", ival);
            break;
        case 'f':
            dval = va_arg(ap, double);
            printf("%f", dval);
            break;
        case 's':
            for (sval = va_arg(ap, char *); *sval; sval++)
                putchar(*sval);
            break;
        default:
            putchar(*p);
            break;
        }
    }
    va_end(ap); /* clean up when done */
}

int main() {
    minprintf("%d", 12);
    return 0;
}

Reference: The C Programming Language, 2nd, Section 7.3

9.2.2. Variadic function example 2

Variadic function example 2:

#include <stdarg.h>
#include <stdio.h>

/* this function will take the number of values to average
   followed by all of the numbers to average */
double average ( int num, ... )
{
    va_list arguments;
    double sum = 0;

    /* Initializing arguments to store all values after num */
    va_start ( arguments, num );
    /* Sum all the inputs; we still rely on the function caller to tell us how
     * many there are */
    int x;
    for ( x = 0; x < num; x++ )
    {
        sum += va_arg ( arguments, double );
    }
    va_end ( arguments );                  // Cleans up the list

    return sum / num;
}

int main()
{
    /* this computes the average of 12.2, 22.3 and 4.5 (3 indicates the number of values to average) */
    printf( "%f\n", average ( 3, 12.2, 22.3, 4.5 ) );
    /* here it computes the average of the 5 values 3.3, 2.2, 1.1, 5.5 and 3.3 */
    printf( "%f\n", average ( 5, 3.3, 2.2, 1.1, 5.5, 3.3 ) );
}

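Reference: http://www.cprogramming.com/tutorial/c/lesson17.html

9.2.3. Passing variable arguments to another function (vprintf)

How do you pass a variable argument list on to another function?

Suppose we want to write faterror, a wrapper around printf: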

void faterror(const char *fmt, ...)
{
    va_list argp;
    va_start(argp, fmt);
    puts("Other message.");
    printf(fmt, argp);        /* WRONG */
    va_end(argp);
    exit(EXIT_FAILURE);
}

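The code above is wrong because printf expects the arguments themselves, not a va_list. The fix is to replace the printf call with vprintf(fmt, argp).

Reference: http://c-faq.com/varargs/handoff.html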
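
The same hand-off works with any of the v-variants. The sketch below uses vsnprintf so that the result lands in a buffer and can be inspected; the function name format_error and the "error: " prefix are assumptions made for this example.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: writes "error: " followed by the formatted message
   into buf. Returns the length of the full message, or -1 on failure. */
int format_error(char *buf, size_t size, const char *fmt, ...)
{
    va_list argp;
    int prefix_len, body_len;

    prefix_len = snprintf(buf, size, "error: ");
    if (prefix_len < 0 || (size_t)prefix_len >= size)
        return -1;                      /* no room for the message body */

    va_start(argp, fmt);
    /* Pass the va_list to the v-variant, never to snprintf or printf. */
    body_len = vsnprintf(buf + prefix_len, size - prefix_len, fmt, argp);
    va_end(argp);

    return body_len < 0 ? -1 : prefix_len + body_len;
}
```

For example, format_error(buf, sizeof buf, "%s failed with code %d", "open", 2) stores "error: open failed with code 2" in buf.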

9.3. Variadic macros

How can a macro accept a variable number of arguments?

C89 has no clean way to do this; C FAQ 10.26 lists some unsatisfactory workarounds.

C99 supports variadic macros directly.
C99 introduces formal support for function-like macros with variable-length argument lists. The notation ... can appear at the end of the macro "prototype" (just as it does for varargs functions), and the pseudomacro __VA_ARGS__ in the macro definition is replaced by the variable arguments during invocation.

Example: a macro with variable arguments

#define XXX(...) fun1(__VA_ARGS__)

Reference: http://c-faq.com/cpp/varargs.html
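
A slightly more concrete sketch, assuming a C99 compiler. The macro name FORMAT_INTO is made up for this example; it forwards its variable arguments to snprintf.

```c
#include <stdio.h>

/* Hypothetical macro: formats into buf, which must be an array so that
   sizeof(buf) is its capacity. __VA_ARGS__ stands for the format string
   and every argument after it. */
#define FORMAT_INTO(buf, ...) snprintf((buf), sizeof(buf), __VA_ARGS__)
```

With char buf[32];, the call FORMAT_INTO(buf, "%d-%s", 42, "ok") expands to snprintf((buf), sizeof(buf), "%d-%s", 42, "ok").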

9.4. Function Pointers

A function pointer is a variable that stores the address of a function that can later be called through that function pointer.

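References:
http://www.cprogramming.com/tutorial/function-pointers.html
http://www.newty.de/fpt/fpt.html#chapter2

9.4.1. Function names convert implicitly to function pointers

In C, a function name is implicitly converted to a pointer to the function.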

#include <stdio.h>

void fun1(void){
    printf("this is fun1\n");
}

int main()
{
    printf("%p\n", fun1);
    printf("%p\n", &fun1);     // prints the same address as the previous line
    //printf("%p\n", &&fun1);  // syntax error

    fun1();            // prints: this is fun1
    (*fun1)();         // prints: this is fun1
    (**fun1)();        // prints: this is fun1
    (***fun1)();       // prints: this is fun1
    return 0;
}

References:
http://stackoverflow.com/questions/6893285/why-do-all-these-crazy-function-pointer-definitions-all-work-what-is-really-goi
http://stackoverflow.com/questions/840501/how-do-function-pointers-in-c-work

9.4.2. Basic use of function pointers

The following example shows the basic use of a function pointer:

#include <stdio.h>
void my_func(int x)
{
  printf("%d\n", x);
}

int main()
{
  void (*foo)(int);     /* declare foo as a pointer to a function */
  foo = &my_func;       /* a function name converts to a function pointer, so foo = my_func; also works */

  /* call my_func (note that you do not need to write (*foo)(2) ) */
  foo(2);
  /* but if you want to, you may */
  (*foo)(2);

  return 0;
}

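9.4.3. Function pointer example: the last parameter of the library function qsort

The library function qsort is declared as follows (its last parameter is a function pointer):

#include <stdlib.h>

void qsort(void *base, size_t nmemb, size_t width, int (*compar)(const void *,const void *));
/* Parameters:
base    address of the first element of the array to sort
nmemb   number of elements to sort
width   size of each element in bytes
compar  pointer to a comparison function
*/

Note: because its last parameter is a function pointer, qsort is very general. It can sort a plain array of integers, or an array of structs by any field.

Example: sorting a double array in ascending order with qsort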

#include <stdio.h>
#include <stdlib.h>

int cmp(const void *x, const void *y)
{
  double xx = *(const double *)x, yy = *(const double *)y;
  if (xx < yy) return -1;    /* return a negative value when the first argument is smaller; this sorts in ascending order */
  if (xx > yy) return  1;
  return 0;
}

int main() {
  double arr[] = {9.03, 5, 1.56, 2, 0.2};
  int num = sizeof(arr)/sizeof(arr[0]);

  qsort(arr, num, sizeof(arr[0]), cmp);

  int i;
  for (i=0; i<num; i++) {
    printf("%f\n", arr[i]);
  }
  return 0;
}
/* Output:
0.200000
1.560000
2.000000
5.000000
9.030000
*/

9.4.4. Function pointer example: array of function pointers

#include <stdio.h>

void fun1() { printf("fun1\n"); }
void fun2() { printf("fun2\n"); }
void fun3(int x, int y) { printf("fun3, %d\n", x+y); }

int main() {

  /* declare and initialize an array of function pointers */
  void (*handlers[3])() = {
    fun1,
    fun2,
    (void (*)())fun3
  };

  /* call each function */
  handlers[0]();         /* can also be written (*handlers[0])(); */
  (*handlers[1])();
  handlers[2](3, 4);

  return 0;
}

9.4.5. Function pointer example: returning a function pointer

The following example shows a function that returns a function pointer:

#include <stdio.h>

float plus(float a, float b) { return a + b; }
float minus(float a, float b) { return a - b; }

float (*getFunc(const char opCode))(float, float)
{
   if(opCode == '+')
       return plus;           /* can also be written return &plus; */
   else if (opCode == '-')
       return minus;
   else
       return NULL;
}

int main() {
    float (*fp)(float, float);

    fp = getFunc('+');
    printf("%f\n", fp(1.5, 1.2));      /* prints 2.700000 */

    fp = getFunc('-');
    printf("%f\n", (*fp)(1.5, 1.2));   /* prints 0.300000 */

    return 0;
}

In the example above, getFunc returns a function pointer, but its declaration is hard to read. A typedef simplifies it to the more readable form below:

typedef float (*funcSig)(float, float);

funcSig getFunc(const char opCode) {
   if(opCode == '+')
       return plus;
   else if (opCode == '-')
       return minus;
   else
       return NULL;
}

10. The C Preprocessor

References:
"The C Programming Language, 2nd" 4.11 The C Preprocessor
"ISO&IEC-9899-1999(E)" 6.10 Preprocessing directives
The GNU C Preprocessor: https://gcc.gnu.org/onlinedocs/cpp/

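10.1. File inclusion (#include)

Any source line of the form:

#include "filename"   /* searches the directory of the current file first, then the system directories */

or

#include <filename>   /* searches the system directories only */

is replaced by the contents of the file filename.

10.2. Macro substitution (#define, #undef)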

#define name replacement-text

Each subsequent occurrence of the token name is replaced by the replacement text.
Substitutions are made only for tokens, and do not take place within quoted strings.

Names may be undefined with #undef, usually to ensure that a routine is really a function, not a macro:

#undef getchar
int getchar(void) { ... }

10.2.1. Macro with arguments

It is also possible to define macros with arguments, so the replacement text can be different for different calls of the macro.
As an example, define a macro called max:

#define max(A, B) ((A) > (B) ? (A) : (B))

Each occurrence of a formal parameter (here A or B) will be replaced by the corresponding actual argument.

The line:

x = max(p+q, r+s);

will be replaced by the line:

x = ((p+q) > (r+s) ? (p+q) : (r+s));

If you examine the expansion of max, you will notice some pitfalls. The expressions are evaluated twice; this is bad if they involve side effects like increment operators or input and output. For instance

max(i++, j++);  /* WRONG */

will increment the larger twice.
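
One common way around the double evaluation is to use a function instead of a macro. The sketch below puts the two side by side; max_int is a name chosen for this example, and static inline assumes C99.

```c
/* Macro version from the text: an argument may be evaluated twice. */
#define max(A, B) ((A) > (B) ? (A) : (B))

/* Function version: each argument is evaluated exactly once,
   before the call, so side effects happen only once. */
static inline int max_int(int a, int b) { return a > b ? a : b; }
```

Starting from i = 1 and j = 5, max_int(i++, j++) returns 5 and leaves i = 2, j = 6, whereas max(i++, j++) returns 6 and leaves j = 7. The price is that max_int works for one type only.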

10.2.2. The # operator

If a parameter name is preceded by a # in the replacement text, the combination will be expanded into a quoted string with the parameter replaced by the actual argument.

For example, a debugging print macro:

#define dprint(expr) printf(#expr " = %g\n", expr)

When this is invoked, as in

 dprint(x/y);

the macro is expanded into

 printf("x/y" " = %g\n", x/y);

and the strings are concatenated, so the effect is

 printf("x/y = %g\n", x/y);

10.2.3. The ## operator

The preprocessor operator ## provides a way to concatenate actual arguments during macro expansion.
If a parameter in the replacement text is adjacent to a ##, the parameter is replaced by the actual argument, the ## and surrounding white space are removed, and the result is rescanned.

For example, the macro paste concatenates its two arguments:

#define paste(front, back) front ## back

so paste(name, 1) creates the token name1.
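
A typical practical use of ## is generating families of similarly named functions. In the sketch below, the macro DEFINE_GETTER and the type struct point are made up for this example.

```c
struct point { int x; int y; };

/* Hypothetical macro: defines a function named get_<field>
   that returns the member of the same name. */
#define DEFINE_GETTER(field) \
    static int get_##field(const struct point *p) { return p->field; }

DEFINE_GETTER(x)   /* defines get_x, which returns p->x */
DEFINE_GETTER(y)   /* defines get_y, which returns p->y */
```

After these definitions, get_x(&p) and get_y(&p) can be called like any hand-written function.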

10.3. Conditional inclusion (#if)

The #if line evaluates a constant integer expression (which may not include sizeof, casts, or enum constants). If the expression is non-zero, subsequent lines until an #endif or #elif or #else are included.

The #ifdef and #ifndef lines are specialized forms that test whether a name is defined.
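
A short sketch of both forms. The names INT_HAS_32_BITS, BUFFER_SIZE, and buffer are assumptions made for this example; INT_MAX comes from limits.h.

```c
#include <limits.h>

/* #if evaluates an integer constant expression at preprocessing time. */
#if INT_MAX >= 2147483647
#define INT_HAS_32_BITS 1
#else
#define INT_HAS_32_BITS 0
#endif

/* #ifndef supplies a default only when the name is not already defined,
   for example by a -D option on the compiler command line. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

static char buffer[BUFFER_SIZE];
```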

10.4. Line control (#line)

A #line directive changes the file name and line number that the compiler records for the current position in the source.

A #line directive alters the results of the __FILE__ and __LINE__ predefined macros from that point on.

It is also used by tools that generate C source code, such as lex/flex and yacc/bison, so that error messages can refer to the input file rather than the (temporary) generated C code.

A line of the form:

#line number

sets the current line number.

A line of the form:

#line number "file-name"

sets both the line number and the file name.

References:
https://gcc.gnu.org/onlinedocs/cpp/Line-Control.html
http://stackoverflow.com/questions/7109540/line-keyword-in-c

10.5. Error directive (#error)

A preprocessing directive of the form:

#error error-message

causes the implementation to produce a diagnostic message that includes the specified sequence of preprocessing tokens.

10.6. Pragmas (#pragma, _Pragma)

10.6.1. #pragma directive

The #pragma directive is the method specified by the C standard for providing additional information to the compiler, beyond what is conveyed in the language itself.

10.6.2. _Pragma operator

C99 introduces the _Pragma operator. This feature addresses a major problem with #pragma: being a directive, it cannot be produced as the result of macro expansion. _Pragma is an operator, much like sizeof or defined, and can be embedded in a macro.

Its syntax is

_Pragma (string-literal)

where string-literal is destringized, by replacing all '\\' with a single '\' and all '\"' with a '"'.

For example,

_Pragma ("GCC dependency \"parse.y\"")

has the same effect as

#pragma GCC dependency "parse.y"

The same effect could be achieved using macros, for example

#define DO_PRAGMA(x) _Pragma (#x)
DO_PRAGMA (GCC dependency "parse.y")

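Note: #pragma is a directive, so it cannot be produced by macro expansion; _Pragma is an operator, so it can appear in a macro expansion.

11. Predefined macros and identifiers in C

11.1. Predefined macros

C predefines a number of macros for programs to use, as listed in table 7.

Table 7: C predefined macros
C predefined macro Description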
__DATE__ A character string literal of the form "Mmm dd yyyy"
__FILE__ Current source file (a character string literal)
__LINE__ Line number (within the current source file) of the current source line (an integer constant)
__STDC__ The integer constant 1, intended to indicate a conforming implementation
__STDC_HOSTED__ Added in C99. The integer constant 1 if the implementation is a hosted implementation or the integer constant 0 if it is not.
__STDC_VERSION__ The integer constant 199901L in C99 (201112L in C11). The macro was introduced by Amendment 1 (1995) with the value 199409L.
__TIME__ A character string literal of the form "hh:mm:ss" as in the time generated by the asctime function.

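Note: what is a hosted environment?
A hosted environment has the complete facilities of the standard C library available.

The three macros in table 8 were added in C99. They are conditionally defined: an implementation defines each one only if it supports the corresponding feature.

Table 8: Implementation-dependent predefined macros added in C99
Macro Description
__STDC_IEC_559__ The integer constant 1 if the implementation supports IEC 60559 floating-point arithmetic
__STDC_IEC_559_COMPLEX__ The integer constant 1 if the implementation supports IEC 60559 compatible complex arithmetic
__STDC_ISO_10646__ An integer constant of the form yyyymmL, indicating that values of type wchar_t follow ISO/IEC 10646 as of the given year and month

Reference: ISO&IEC-9899-1999(E) 6.10.8 Predefined macro names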

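11.2. Predefined identifiers

__func__ is an identifier predefined inside every function body (since C99); it holds the name of the enclosing function. It behaves as if the following declaration appeared immediately after the opening brace of each function definition:

static const char __func__[] = "function-name";

Example: calling the function below prints its name, myfunc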

#include <stdio.h>
void myfunc(void)
{
    printf("%s\n", __func__);
    /* ... */
}

Reference: ISO&IEC-9899-1999(E) 6.4.2.2 Predefined identifiers

12. Formatted Input/Output

Reference: ISO&IEC-9899-1999(E) 7.19.6 Formatted input/output functions

12.1. Formatted output

The printf functions provide formatted output conversion.

int fprintf(FILE *stream, const char *format, ...)

The format string contains two types of objects: ordinary characters, which are copied to the output stream, and conversion specifications.
Each conversion specification begins with the character % and ends with a conversion specifier character (see table 9). Between the % and the conversion specifier there may be, in order:

  • Zero or more flags (in any order).
  • An optional minimum field width. The field width takes the form of an asterisk (*) or a decimal integer. For the meaning of the asterisk, see section 12.1.3.
  • An optional precision that gives the minimum number of digits to appear for the d, i, o, u, x, and X conversions, the number of digits to appear after the decimal-point character for a, A, e, E, f, and F conversions, the maximum number of significant digits for the g and G conversions, or the maximum number of bytes to be written for s conversions. The precision takes the form of a period (.) followed either by an asterisk (*) or by an optional decimal integer; if only the period is specified, the precision is taken as zero. For the meaning of the asterisk, see section 12.1.3.
  • An optional length modifier.

Table 9: printf conversion specifiers
Conversion specifier Argument Type; Converted to
d, i int; signed decimal integer
o unsigned int; unsigned octal integer
x, X unsigned int; unsigned hexadecimal integer
u unsigned int; unsigned decimal integer
f double; decimal notation of the form [-]mmm.ddd, where the number of d's is specified by the precision. The default precision is 6; a precision of 0 suppresses the decimal point.
e, E double; decimal notation of the form [-]m.dddddd e±xx or [-]m.dddddd E±xx, where the number of d's is specified by the precision. The default precision is 6; a precision of 0 suppresses the decimal point.
g, G double; %e or %E is used if the exponent is less than -4 or greater than or equal to the precision; otherwise %f is used. Trailing zeros and a trailing decimal point are not printed.
a, A double; hexadecimal notation of the form [-]0xh.hhhhp±d
c int; the int argument is converted to an unsigned char.
s char *; characters from the string are printed until a '\0' is reached or until the number of characters indicated by the precision have been printed.
p void *; print as a pointer.
n int *; the argument shall be a pointer to signed integer into which is written the number of characters written to the output stream so far by this call to printf functions. No argument is converted, but one is consumed.
% no argument is converted; print a %.

The following example shows how %n is used with printf:

#include <stdio.h>
int main()
{
    int count1;
    int count2;

    printf("ABCDE%nFGHI%n\n", &count1, &count2);
    printf("First count is %d, second count is %d.\n", count1, count2);
    // the line above prints: First count is 5, second count is 9.
    return 0;
}

12.1.1. flags

Table 10 lists the flags that may appear between % and the conversion specifier in a format string.

Table 10: flags (Formatted output)
flag characters meanings
- The result of the conversion is left-justified within the field. (It is right-justified if this flag is not specified.)
+ The result of a signed conversion always begins with a plus or minus sign.
space If the first character of a signed conversion is not a sign, or if a signed conversion results in no characters, a space is prefixed to the result. If the space and + flags both appear, the space flag is ignored.
# The result is converted to an alternative form. For o conversion, it increases the precision, if and only if necessary, to force the first digit of the result to be a zero (if the value and precision are both 0, a single 0 is printed). For x (or X) conversion, a nonzero result has 0x (or 0X) prefixed to it. For a, A, e, E, f, F, g, and G conversions, the result of converting a floating-point number always contains a decimal-point character, even if no digits follow it. (Normally, a decimal-point character appears in the result of these conversions only if a digit follows it.) For g and G conversions, trailing zeros are not removed from the result. For other conversions, the behavior is undefined.
0 For d, i, o, u, x, X, a, A, e, E, f, F, g, and G conversions, leading zeros (following any indication of sign or base) are used to pad to the field width rather than performing space padding, except when converting an infinity or NaN. If the 0 and - flags both appear, the 0 flag is ignored. For d, i, o, u, x, and X conversions, if a precision is specified, the 0 flag is ignored. For other conversions, the behavior is undefined.

12.1.2. length modifier

Table 11 lists the length modifiers that may appear between % and the conversion specifier in a format string.

Table 11: length modifier (Formatted output)
length modifier characters meanings
hh Indicates the argument is a signed char or unsigned char
h Indicates the argument is a short int or unsigned short
l Indicates the argument is a long int or unsigned long int
ll Indicates the argument is a long long int or unsigned long long int
j Indicates the argument is an intmax_t or uintmax_t (defined in <stdint.h>)
z Indicates the argument is a size_t
t Indicates the argument is a ptrdiff_t (In <stddef.h>)
L Indicates the argument is a long double

To print a size_t in decimal, C99 provides %zu, where z is the length modifier and u is the conversion specifier character.
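
A small sketch that pairs several length modifiers with their argument types, assuming a C99 library. The function name format_sizes and the sample values are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats three values of differently sized types. */
int format_sizes(char *buf, size_t size)
{
    size_t        count = 42;
    long long     big   = -5000000000LL;
    unsigned char byte  = 255;

    /* z pairs with size_t, ll with long long, hh with unsigned char. */
    return snprintf(buf, size, "%zu %lld %hhu", count, big, byte);
}
```

With a large enough buffer this stores "42 -5000000000 255". A modifier that does not match the actual argument type leads to undefined behavior.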

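12.1.3. Using an asterisk for the minimum field width or precision

As described in section 12.1, the minimum field width and the precision can each be given as an asterisk instead of a decimal integer. An asterisk means the value is taken from the argument list, so it consumes one extra int argument.

The following example uses an asterisk for the precision: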

#include <stdio.h>

int main() {
    fprintf(stderr, "%.5s\n", "abcdefghijk");        // .5 sets the precision to 5

    int x = 5;
    fprintf(stderr, "%.*s\n", x, "abcdefghijk");     // .* takes the precision from the argument x
}

The program prints:

abcde
abcde

12.2. Formatted input

The scanf functions deal with formatted input conversion.

int fscanf(FILE *stream, const char *format, ...)

fscanf reads from stream under control of format, and assigns converted values through subsequent arguments, each of which must be a pointer. It returns when format is exhausted. fscanf returns EOF if end of file or an error occurs before any conversion; otherwise it returns the number of input items converted and assigned.

Note: the format strings of the scanf family resemble those of printf but differ in many ways. For example, scanf has no precision field.

Reference: ISO&IEC-9899-1999(E) 7.19.6.2 The fscanf function
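
Because the return value counts the items that were converted and assigned, comparing it with the expected count is the usual way to validate input. A minimal sketch using sscanf, which applies the same rules to a string; the function name parse_size and the "WIDTHxHEIGHT" format are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: parses text such as "640x480".
   Returns 1 on success and 0 when the input does not match. */
int parse_size(const char *text, int *width, int *height)
{
    /* sscanf returns the number of items converted and assigned,
       or EOF if the input ends before the first conversion. */
    return sscanf(text, "%dx%d", width, height) == 2;
}
```

parse_size("640x480", &w, &h) succeeds with w = 640 and h = 480, while "640" alone fails because only one item is converted.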

12.2.1. assignment suppression (*)

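To discard part of the input, mark the conversion with an asterisk (*). For example, %*c reads one character and discards it.

The following example uses %*c to skip one character:

#include<stdio.h>
int main(void)
{
    int x, y;
    if (scanf("%d%*c%d",&x,&y) != 2)      /* for the input 123/456, x becomes 123 and y becomes 456 */
        return 1;
    printf("x is %d, y is %d\n", x, y);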
    return 0;
}

12.2.2. scanset ([...], [^...])

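A scanset defines a set of characters. scanf() reads characters as long as they belong to the set and stores them in the corresponding character array.
A scanset is written as a list of characters between square brackets, and the left bracket must be preceded by a percent sign.
If the first character inside the brackets is ^, the set is inverted: it matches every character that is not listed.

The following example uses scanf() to read a string that contains spaces:

#include<stdio.h>
int main(void)
{
    char str[100];
    /* scanf("%s",str); would stop at the first space; the width 99 keeps the input within str */
    if (scanf("%99[^\n]",str) == 1)
        printf("%s\n",str);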
    return 0;
}

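13. Strings and text files

13.1. strncat

The prototype of strncat is:

char *strncat(char *dest, const char *src, size_t n);

strncat always terminates the result in dest with '\0'.
As a consequence, when src has n or more characters, it writes n+1 bytes to dest (n characters plus the final '\0').

Important: strncat is not a safe version of strcat (if dest is too small, strncat still overflows it). It only provides the operation "append at most the first n characters of src to the end of dest".

char buf[4] = "ab";
char *src = "1234567890";
strncat(buf, src, 8);     /* wrong: unsafe because buf is too small */

The following call is wrong: when src is long, it can write one byte past the end of dest.
strncat(dest, src, sizeof(dest) - strlen(dest));

A common use of strncat (safe when dest is an array, but if src is long only part of it is appended to dest):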

strncat(dest, src, sizeof(dest) - 1 - strlen(dest));
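
The same idiom can be wrapped in a helper that also reports truncation. This is a sketch, not a library function: the name append_bounded and its return convention are assumptions made for this example.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string in dest, which has room
   for dest_size bytes in total. Returns 1 if all of src fit, 0 otherwise. */
int append_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t room;

    if (used + 1 > dest_size)          /* dest is not a valid string of this size */
        return 0;
    room = dest_size - 1 - used;       /* free bytes, not counting the '\0' */
    strncat(dest, src, room);          /* writes at most room + 1 bytes */
    return strlen(src) <= room;
}
```

With char buf[8] = "ab";, appending "cd" succeeds and gives "abcd"; appending "1234567890" afterwards returns 0 and leaves "abcd123".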

13.1.1. A simple implementation of strncat

The implementation below is taken from man strncat

char*
strncat(char *dest, const char *src, size_t n)
{
    size_t dest_len = strlen(dest);
    size_t i;

    for (i = 0 ; i < n && src[i] != '\0' ; i++) {
        dest[dest_len + i] = src[i];
    }
    dest[dest_len + i] = '\0';

    return dest;
}

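13.2. strncpy, snprintf

The prototype of strncpy is:

char *strncpy(char *dest, const char *src, size_t n);

It copies at most n bytes of the string src to dest and returns dest.

Note that strncpy does not add a '\0' when src has n or more characters. For example: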

#include<stdio.h>
#include<string.h>

int main() {

  char a[11] = "aaaaaaaaaa";        /* 10 characters plus the terminating '\0' */
  strncpy(a, "01234567890abcd", 5); /* copies 5 bytes and adds no '\0' */

  printf("%s\n", a);                /* prints 01234aaaaa */
  return 0;
}

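A possibly incorrect use of strncpy (it is fine if dest was zero-initialized beforehand):
strncpy(dest, src, sizeof(dest) - 1);
This does not guarantee that dest ends with '\0' unless the last byte of dest was already 0 before the call.

A safe use of strncpy:

strncpy(dest, src, sizeof(dest));
dest[sizeof(dest)-1] = '\0';       // always terminate the last byte by hand

Notes on strncpy:
1. To prevent overflow, n should be sizeof(dest) or sizeof(dest)-1, never sizeof(src).
2. It is a safety hazard: always set the last byte of dest to '\0' by hand. strncpy pads with '\0' only when src is shorter than n, in which case it writes n-strlen(src) bytes of '\0'.
3. Performance. When dest is much larger than src, strncpy(dest, src, sizeof(dest)); fills every remaining byte with '\0', which costs time.
4. Return value. strncpy returns dest, so the caller cannot tell how many bytes were copied.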

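snprintf can also be used to copy strings.
The correct use of snprintf:

snprintf(dest, sizeof(dest), "%s", src);

Notes on snprintf:
1. Do not drop the "%s" argument and pass src as the format string: if src contains %, the behavior is undefined and the program may crash.
2. Performance. When src is much longer than dest, snprintf still scans all of src because it must return its length, which costs time.
3. Return value. If the buffer is large enough, it returns the number of characters written (not counting the '\0'); otherwise it returns the number of characters that would have been written.

Summary of strncpy and snprintf:
1. snprintf is more concise to use than strncpy.
2. The return value of snprintf tells the caller how many bytes the copy needs.
3. Both have performance costs.
4. strncpy is unsafe; snprintf is safe (it always terminates the result with '\0' when size is nonzero).

Reference: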
http://www.jb51.net/article/39994.htm

13.2.1. A simple implementation of strncpy

The implementation below is taken from man strncpy

char*
strncpy(char *dest, const char *src, size_t n){
    size_t i;

    for (i = 0 ; i < n && src[i] != '\0' ; i++) {
        dest[i] = src[i];
    }
    for ( ; i < n ; i++) {
        dest[i] = '\0';
    }

    return dest;
}

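13.2.2. Return value of snprintf

The prototype of snprintf is:

int snprintf(char *str, size_t size, const char *format, ...);

snprintf writes at most size bytes, including the final '\0', so at most size - 1 bytes of actual output. Its return value is the number of bytes that the complete output would need, not counting the '\0'.
The return value can therefore be used to detect truncation: if it is greater than or equal to the second argument size, the output was truncated.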

char abc[20] = "1234567";
printf("snprintf return %d\n", snprintf(abc, 10, "xyz_%s", "12345678"));
// the statement above prints "snprintf return 12"; 12 is the length of xyz_12345678
printf("abc is %s\n",abc);
// the statement above prints "abc is xyz_12345": only 9 bytes were stored, and the 10th byte is '\0'
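
The truncation test can be packaged as a small helper. A minimal sketch; the name copy_string and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dest, which holds size bytes.
   Returns 1 if the whole string fit, 0 if it was truncated or an error occurred. */
int copy_string(char *dest, size_t size, const char *src)
{
    int needed = snprintf(dest, size, "%s", src);

    /* needed is the length of the complete output, excluding the '\0'. */
    return needed >= 0 && (size_t)needed < size;
}
```

With char d[5];, copy_string(d, sizeof d, "abc") returns 1, while copying "abcdefgh" returns 0 and leaves "abcd" in d.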

13.3. strspn, strcspn, strpbrk

strspn stands for string span
strcspn stands for string complement span
strpbrk stands for string pointer break

Reference: http://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Search-Functions.html

13.3.1. strspn

Prototype of strspn:

size_t strspn(const char *s1, const char *s2);

The strspn() function returns the number of bytes in the initial segment of s1 which consist only of bytes from s2.

strspn example:

#include <string.h>
#include <stdio.h>
int main()
{
    printf("%zu\n",strspn("LinuxLinux","niLux"));   // prints 10
    printf("%zu\n",strspn("LinuxLinux","Lkix"));    // prints 2
    printf("%zu\n",strspn("aLinuxLinux","Lkix"));   // prints 0
    return 0;
}

13.3.2. strcspn

Prototype of strcspn:

size_t strcspn(const char *s1, const char *s2);

The strcspn() function returns the number of bytes in the initial segment of s1 which are not in the string s2.

#include <stdio.h>
#include <string.h>
int main()
{
    const char *s="Golden Global View";
    const char *r="new";
    size_t n=strcspn(s,r);
    printf("The first char of s that is also in r is: %c\n",s[n]); // The first char of s that is also in r is: e
    return 0;
}

Reference: http://baike.baidu.com/view/1028539.htm

13.3.3. strpbrk

Prototype of strpbrk:

char *strpbrk(const char *s1, const char *s2);

The strpbrk() function returns a pointer to the byte in s1 that matches one of the bytes in s2, or NULL if no such byte is found.
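
The three functions are often used together when scanning text. The sketch below is one way to combine them; the helper names first_word and first_punct and the chosen character sets are assumptions made for this example.

```c
#include <string.h>

/* Hypothetical helper: skips leading blanks, stores the start of the first
   word in *start, and returns the length of that word. */
size_t first_word(const char *text, const char **start)
{
    text += strspn(text, " \t");       /* length of the run of leading blanks */
    *start = text;
    return strcspn(text, " \t");       /* bytes before the next blank */
}

/* Returns a pointer to the first of the characters ",;." in text, or NULL. */
const char *first_punct(const char *text)
{
    return strpbrk(text, ",;.");
}
```

For the string "  hello world; bye", first_word finds the 5-byte word "hello" and first_punct returns a pointer to the ';'.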

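13.4. strtok

strtok splits a string into a sequence of tokens separated by any of the given delimiter characters.

On the first call, pass the string to split as the argument s; on each later call, pass NULL for s. Each successful call returns a pointer to the next token.

Note: strtok is not thread-safe, because it keeps its position in the string in static storage. strtok_r is the reentrant version of strtok.

13.4.1. strtok example 1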

#include <stdio.h>
#include <string.h>

int main()
{
    char str[] = "Life is like, a box of chocolate, you never, know what you're, gonna get";
    char delims[] = ",";
    char *result;

    result = strtok( str, delims );
    while(result != NULL){
        printf("%s \n", result);
        result = strtok( NULL, delims);
    }
    return 0;
}

13.4.2. strtok example 2

#include <stdio.h>
#include <string.h>

int main()
{
    char str[] = "Life is like, a box of chocolate, you never, know what you're, gonna get";
    char delims[] = ",";
    char *result;

    for( result = strtok(str, delims); result != NULL; result = strtok(NULL, delims))
        printf("%s \n", result);
    return 0;
}

13.4.3. strtok modifies the string being split

The first argument is no longer the same after strtok has been called.

Example:

#include<string.h>
#include<stdio.h>
int main(void)
{
    char input[16]="abc,d";
    char*p;
    printf("before strtok. input is %s\n", input);

    p=strtok(input,",");
    if(p)
        printf("%s\n",p);
    printf("after strtok. input is %s\n", input);

    p=strtok(NULL,",");  // if the first argument were input again instead of NULL, i.e. p=strtok(input,","), the call would return abc again
    if(p)
        printf("%s\n",p);
    printf("after strtok. input is %s\n", input);

    return 0;
}

The program prints:

before strtok. input is abc,d
abc
after strtok. input is abc
d
after strtok. input is abc

13.4.4. strtok returns only nonempty strings (skip over empty fields)

strtok silently skips consecutive empty fields.

The tokens returned by strtok() are always nonempty strings.
Thus, for example, given the string "aaa;;bbb,", successive calls to strtok() that specify the delimiter string ";," would return the strings "aaa" and "bbb", and then a NULL pointer.
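
This behavior is easy to check by counting tokens. A minimal sketch; the helper name count_tokens is made up for this example.

```c
#include <string.h>

/* Hypothetical helper: counts the tokens that strtok finds in text.
   text is modified in place, as strtok always does. */
int count_tokens(char *text, const char *delims)
{
    int count = 0;
    char *tok;

    for (tok = strtok(text, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;
    return count;
}
```

For a writable copy of "aaa;;bbb," and the delimiters ";," the result is 2: the empty field between the two semicolons and the one after the trailing comma are not reported.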

13.4.5. strsep (less portable)

strsep is better than strtok and is thread-safe. Its only drawback is that it is less portable than strtok.

The strsep() function was introduced as a replacement for strtok(), since the latter cannot handle empty fields. However, strtok() conforms to C89/C99 and hence is more portable.

13.4.6. strtok_r

strtok is not thread-safe; strtok_r is.

An example implementation of strtok_r:

#include <string.h>
/* Parse s into tokens separated by characters in delim.
   If s is NULL, the pointer saved in *save_ptr is used as
   the next starting point.  For example:
     char s[] = "-abc-=-def";
     char *sp;
     x = strtok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
     x = strtok_r(NULL, "-=", &sp);  // x = "def", sp = "" (end of string)
     x = strtok_r(NULL, "=", &sp);   // x = NULL
                                     // s = "-abc\0=-def"
*/
char *strtok_r(char *s, const char *delim, char **save_ptr) {
    char *token;

    if (s == NULL) s = *save_ptr;

    /* Scan leading delimiters.  */
    s += strspn(s, delim);
    if (*s == '\0') {
        *save_ptr = s;   /* remember the end so that later calls keep returning NULL */
        return NULL;
    }

    /* Find the end of the token.  */
    token = s;
    s = strpbrk(token, delim);
    if (s == NULL)
        /* This token finishes the string.  */
        *save_ptr = strchr(token, '\0');
    else {
        /* Terminate the token and make *SAVE_PTR point past it.  */
        *s = '\0';
        *save_ptr = s + 1;
    }

    return token;
}

Reference: http://blog.csdn.net/sjin_1314/article/details/8242098

13.4.6.1. A strtok_r variant that keeps empty fields

The following function is a variant of strtok_r that keeps empty fields:

/*
 * This function works like strtok_r, but preserves empty fields (strtok_r skips them).
 *
 * For example:
 * strtok_r:         a;;;d => "a", "d"
 * this function:    a;;;d => "a", "", "", "d"
 */
char *strtok_r_no_skip(char *str, const char *delims, char **store) {

    char *ret;

    if (str == NULL)
        str = *store;

    if (*str == '\0')
        return NULL;

    ret = str;

    str += strcspn(str, delims);

    if (*str != '\0')
        *str++ = '\0';

    *store = str;

    return ret;
}

13.5. memcpy

memcpy copies n bytes from the memory at src to the memory at dest.

#include <string.h>
void *memcpy(void *dest, const void *src, size_t n);

                                 /* returns a pointer to dest. */

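Note: if the src and dest regions overlap, the behavior of memcpy is undefined and memmove should be used instead. Some memcpy implementations happen to handle overlap correctly, but portable code must not rely on that.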
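
A typical overlapping copy is deleting characters from the middle of a string, where the tail slides left over itself. A minimal sketch; the helper name delete_chars is made up for this example.

```c
#include <string.h>

/* Hypothetical helper: deletes count bytes starting at position pos from the
   string s by sliding the tail (including the '\0') left. Source and
   destination overlap, so memmove is required rather than memcpy. */
void delete_chars(char *s, size_t pos, size_t count)
{
    size_t len = strlen(s);

    if (pos >= len)
        return;                        /* nothing to delete */
    if (count > len - pos)
        count = len - pos;             /* clamp to the end of the string */
    memmove(s + pos, s + pos + count, len - pos - count + 1);
}
```

For a buffer holding "abcdef", delete_chars(s, 1, 2) leaves "adef".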

13.5.1. A basic memcpy implementation

The following is a basic implementation of memcpy. It can produce wrong results when the src and dest regions overlap.

void * memcpy (void * dest, const void *src, size_t n) {
  char *pDest = (char *) dest;
  const char *pSrc = (const char *) src;

  size_t i=0;
  for (; i < n; i++) {
    pDest[i] = pSrc[i];
  }

  return dest;
}

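13.5.2. A memcpy implementation that handles overlapping memory

The following implementation of memcpy also works when the src and dest regions overlap (the behavior that the standard specifies for memmove).

#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uintptr_t */

void * memcpy (void * dest, const void *src, size_t n) {
  char *pDest = (char *) dest;
  const char *pSrc = (const char *) src;

  size_t i=0;
  if ( (uintptr_t)pDest < (uintptr_t)pSrc ) {        /* Copy forward */
    for (; i < n; i++) {
      pDest[i] = pSrc[i];
    }
  } else {                                           /* Copy backward */
    for (i=n; i>0; i--) {                            /* i is unsigned: count down from n, index with i-1 */
      pDest[i-1] = pSrc[i-1];
    }
  }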

  return dest;
}

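13.5.3. A faster memcpy (copying several bytes at a time)

The following example copies 4 bytes at a time and copies the remaining bytes (fewer than 4) one at a time.

// Alignment is not handled here: the uint32_t accesses assume that dest and src are suitably aligned.
// Requires <stdint.h> for uint8_t and uint32_t.
void memcpy(void* dest, void* src, int size)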
{
  uint8_t *pdest = (uint8_t*) dest;
  uint8_t *psrc = (uint8_t*) src;

  int loops = (size / sizeof(uint32_t));
  int index;
  for(index = 0; index < loops; ++index) {
    *((uint32_t*)pdest) = *((uint32_t*)psrc);
    pdest += sizeof(uint32_t);
    psrc += sizeof(uint32_t);
  }

  loops = (size % sizeof(uint32_t));
  for (index = 0; index < loops; ++index) {
    *pdest = *psrc;
    ++pdest;
    ++psrc;
  }
}

References:
http://stackoverflow.com/questions/11876361/implementing-own-memcpy-size-in-bytes
http://opensource.apple.com//source/xnu/xnu-2050.18.24/libsyscall/wrappers/memcpy.c
http://codereview.stackexchange.com/questions/41094/memcpy-implementation

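13.6. Example: trimming leading and trailing whitespace

Trimming leading and trailing whitespace from a string:

#include <ctype.h>   /* isspace */
#include <string.h>  /* strlen */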

// Note: This function returns a pointer to a substring of the original string.
// If the given string was allocated dynamically, the caller must not overwrite
// that pointer with the returned value, since the original pointer must be
// deallocated using the same allocator with which it was allocated.  The return
// value must NOT be deallocated using free() etc.
char *trimwhitespace(char *str)
{
  char *end;

  // Trim leading space (the cast avoids undefined behavior for negative char values)
  while(isspace((unsigned char)*str))
     str++;

  if(*str == 0)  // All spaces?
    return str;

  // Trim trailing space
  end = str + strlen(str) - 1;
  while(end > str && isspace((unsigned char)*end))
    end--;

  // Write new null terminator
  *(end+1) = 0;

  return str;
}

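Reference: http://stackoverflow.com/questions/122616/how-do-i-trim-leading-trailing-whitespace-in-a-standard-way

13.7. Reading and writing text files

13.7.1. Processing single characters

Common functions for processing single characters:

int fgetc(FILE *stream);
int getc(FILE *stream);         // same as fgetc, but may be implemented as a macro
int getchar(void);              // equivalent to getc(stdin)
int ungetc(int c, FILE *stream);

int fputc(int c, FILE *stream);
int putc(int c, FILE *stream);  // same as fputc, but may be implemented as a macro
int putchar(int c);             // equivalent to putc(c, stdout)

How do fgetc/fputc differ from getc/putc? getc and putc may be implemented as macros that evaluate stream more than once, so the stream argument should not have side effects.
Reference: http://stackoverflow.com/questions/14008907/fputc-vs-putc-in-c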

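13.7.2. Processing several characters or a line

Common functions for processing several characters or a whole line:

char *fgets(char *s, int size, FILE *stream); // fgets stops at a newline or at end of file; a newline that ends the read is stored in s.
                                              // fgets reads at most size-1 characters (because it appends '\0').
char *gets(char *s);      // reads a line from stdin without storing the newline; unsafe (no bounds check) and removed in C11

int fputs(const char *s, FILE *stream);
int puts(const char *s);   // writes a line to stdout and appends a trailing newline

Example: read a text file line by line and print it (assuming each line fits in the 1024-byte buffer)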

#include <stdio.h>

int main() {
    char line[1024];
    FILE *fp = fopen("filename.txt","r");
    if( fp == NULL ) {
        return 1;
    }
    while( fgets(line,sizeof(line),fp) ) {
        printf("%s",line);       /* line already ends with the newline read by fgets */
    }
    fclose(fp);
    return 0;
}

13.7.3. fflush vs fsync

fflush and fsync are declared as follows:

#include <stdio.h>
int fflush(FILE *stream);

#include <unistd.h>
int fsync(int fd);

fflush() works on FILE* , it just flushes the internal buffers in the FILE* of your application out to the OS.
fsync() works on a lower level, it tells the OS to flush its buffers to the physical media.

To close a FILE and force its contents to be written to disk immediately:

fflush(gfile);
tmp_fd = fileno(gfile);
if (tmp_fd != -1) {
    fsync(tmp_fd);
}
fclose(gfile);

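Reference: http://stackoverflow.com/questions/2340610/difference-between-fflush-and-fsync

14. Strong and weak symbols

In C, functions and initialized global variables (including those initialized to 0) are strong symbols; uninitialized global variables are weak symbols.

There are three rules for strong and weak symbols:

  1. A strong symbol name may be defined only once; otherwise the linker reports a "multiple definition" error.
  2. One strong symbol may coexist with several weak symbols of the same name; all references resolve to the strong symbol.
  3. When several weak symbols share a name, the linker chooses the one that occupies the most memory.

Reference: http://blog.csdn.net/astrotycoon/article/details/8008629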

15. Standard Library (C90)

Reference: "The C Programming Language, 2nd" Appendix B - Standard Library

15.1. Input and Output: <stdio.h>

See table 12, table 13, table 14, table 15, table 16, table 17.

For a summary of input and output functions, please refer to man stdio.

Table 12: File Operations
Function Description
FILE *fopen(const char *filename, const char *mode) fopen opens the named file, and returns a stream, or NULL if the attempt fails.
FILE *freopen(const char *filename, const char *mode, FILE *stream) freopen opens the file with the specified mode and associates the stream with it. freopen is normally used to change the files associated with stdin, stdout, or stderr.
int fflush(FILE *stream) On an output stream, fflush causes any buffered but unwritten data to be written; on an input stream, the effect is undefined. fflush(NULL) flushes all output streams.
int fclose(FILE *stream) fclose flushes any unwritten data for stream, discards any unread buffered input, frees any automatically allocated buffer, then closes the stream.
int remove(const char *filename) remove removes the named file.
int rename(const char *oldname, const char *newname) rename changes the name of a file.
FILE *tmpfile(void) tmpfile creates a temporary file of mode "wb+" that will be automatically removed when closed or when the program terminates normally.
char *tmpnam(char s[L_tmpnam]) tmpnam(NULL) creates a string that is not the name of an existing file, and returns a pointer to an internal static array. tmpnam(s) stores the string in s as well as returning it as the function value.
int setvbuf(FILE *stream, char *buf, int mode, size_t size) setvbuf controls buffering for the stream; it must be called before reading, writing or any other operation. A mode of _IOFBF causes full buffering, _IOLBF line buffering of text files, and _IONBF no buffering. If buf is not NULL, it will be used as the buffer, otherwise a buffer will be allocated. size determines the buffer size.
void setbuf(FILE *stream, char *buf) If buf is NULL, buffering is turned off for the stream. Otherwise, setbuf is equivalent to (void) setvbuf(stream, buf, _IOFBF, BUFSIZ).
Table 13: Formatted Output/Input
Function Description
int fprintf(FILE *stream, const char *format, ...) Converts and writes output to stream under the control of format.
int printf(const char *format, ...) Equivalent to fprintf(stdout, ...).
int sprintf(char *s, const char *format, ...) Same as printf except that the output is written into the string s.
int vprintf(const char *format, va_list arg) See stdarg(3).
int vfprintf(FILE *stream, const char *format, va_list arg) See stdarg(3).
int vsprintf(char *s, const char *format, va_list arg) See stdarg(3).
int fscanf(FILE *stream, const char *format, ...) Reads from stream under control of format.
int scanf(const char *format, ...) Identical to fscanf(stdin, ...)
int sscanf(const char *s, const char *format, ...) Same as scanf except input is taken from string s.
Table 14: Character Input and Output Functions
Function Description
int fgetc(FILE *stream) fgetc returns the next character of stream as an unsigned char (converted to an int), or EOF if end of file or error occurs.
char *fgets(char *s, int n, FILE *stream) fgets reads at most the next n-1 characters into the array s, stopping if a newline is encountered; the newline is included in the array, which is terminated by '\0'.
int fputc(int c, FILE *stream) fputc writes the character c (converted to an unsigned char) on stream.
int fputs(const char *s, FILE *stream) fputs writes the string s (which need not contain \n) on stream.
int getc(FILE *stream) getc is equivalent to fgetc except that if it is a macro, it may evaluate stream more than once.
int getchar(void) getchar is equivalent to getc(stdin).
char *gets(char *s) gets reads the next input line into the array s; it replaces the terminating newline with '\0'. It cannot limit the input length and was removed in C11.
int putc(int c, FILE *stream) putc is equivalent to fputc except that if it is a macro, it may evaluate stream more than once.
int putchar(int c) putchar(c) is equivalent to putc(c,stdout).
int puts(const char *s) puts writes the string s and a newline to stdout.
int ungetc(int c, FILE *stream) ungetc pushes c (converted to an unsigned char) back onto stream, where it will be returned on the next read.
Table 15: Direct Input and Output Functions
Function Description
size_t fread(void *ptr, size_t size, size_t nobj, FILE *stream) fread reads from stream into the array ptr at most nobj objects of size size.
size_t fwrite(const void *ptr, size_t size, size_t nobj, FILE *stream) fwrite writes, from the array ptr, nobj objects of size size on stream.
Table 16: File Positioning Functions
Function Description
int fseek(FILE *stream, long offset, int origin) fseek sets the file position for stream.
long ftell(FILE *stream) ftell returns the current file position for stream, or -1 on error.
void rewind(FILE *stream) rewind(fp) is equivalent to fseek(fp, 0L, SEEK_SET); clearerr(fp).
int fgetpos(FILE *stream, fpos_t *ptr) fgetpos records the current position in stream in *ptr, for subsequent use by fsetpos.
int fsetpos(FILE *stream, const fpos_t *ptr) fsetpos positions stream at the position recorded by fgetpos in *ptr.
Table 17: Error Functions
Function Description
void clearerr(FILE *stream) clearerr clears the end of file and error indicators for stream.
int feof(FILE *stream) feof returns non-zero if the end of file indicator for stream is set.
int ferror(FILE *stream) ferror returns non-zero if the error indicator for stream is set.
void perror(const char *s) perror(s) prints s and an implementation-defined error message corresponding to errno, as if by fprintf(stderr, "%s: %s\n", s, "error message").
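
Several functions from tables 12 to 17 can be combined as in the following sketch, which writes an array to a temporary file and reads it back. The function name roundtrip_ints is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write n ints to a temporary file, then read them back
 * into dst. Returns the number of ints read back, or -1 on error. */
long roundtrip_ints(const int *src, int *dst, size_t n)
{
    FILE *fp = tmpfile();               /* "wb+" stream, removed on close */
    size_t got;

    if (fp == NULL)
        return -1;
    if (fwrite(src, sizeof src[0], n, fp) != n) {
        fclose(fp);
        return -1;
    }
    rewind(fp);                         /* reposition before switching to reads */
    got = fread(dst, sizeof dst[0], n, fp);
    if (got != n && ferror(fp)) {       /* short count: error or end of file? */
        perror("fread");
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return (long)got;
}
```

A positioning call such as rewind or fseek is required between writing and reading on the same update-mode stream.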

15.2. Character Class Test: <ctype.h>

See table 18.

Table 18: Character Class Test: <ctype.h>
Function Description
isalnum(c) isalpha(c) or isdigit(c) is true
isalpha(c) isupper(c) or islower(c) is true
iscntrl(c) control character
isdigit(c) decimal digit
isgraph(c) printing character except space
islower(c) lower-case letter
isprint(c) printing character including space
ispunct(c) printing character except space or letter or digit
isspace(c) space, formfeed, newline, carriage return, tab, vertical tab
isupper(c) upper-case letter
isxdigit(c) hexadecimal digit
int tolower(c) convert c to lower case
int toupper(c) convert c to upper case
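
The <ctype.h> functions take an int whose value must be representable as an unsigned char or be equal to EOF, so a plain char that may be negative should be converted first. A minimal sketch with a hypothetical function name:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-case a string in place and return the number
 * of decimal digits it contains. */
size_t upcase_count_digits(char *s)
{
    size_t digits = 0;

    for (; *s != '\0'; s++) {
        unsigned char ch = (unsigned char)*s;  /* never pass a negative value */
        if (isdigit(ch))
            digits++;
        *s = (char)toupper(ch);
    }
    return digits;
}
```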

15.3. String Functions: <string.h>

See table 19.

Table 19: String Functions
Function Description
char *strcpy(s,ct) copy string ct to string s, including '\0'; return s.
char *strncpy(s,ct,n) copy at most n characters of string ct to s; return s. Pad with '\0's if ct has fewer than n characters.
char *strcat(s,ct) concatenate string ct to end of string s; return s.
char *strncat(s,ct,n) concatenate at most n characters of string ct to string s, terminate s with '\0'; return s.
int strcmp(cs,ct) compare string cs to string ct, return <0 if cs<ct, 0 if cs==ct, or >0 if cs>ct.
int strncmp(cs,ct,n) compare at most n characters of string cs to string ct; return <0 if cs<ct, 0 if cs==ct, or >0 if cs>ct.
char *strchr(cs,c) return pointer to first occurrence of c in cs or NULL if not present.
char *strrchr(cs,c) return pointer to last occurrence of c in cs or NULL if not present.
size_t strspn(cs,ct) return length of prefix of cs consisting of characters in ct.
size_t strcspn(cs,ct) return length of prefix of cs consisting of characters not in ct.
char *strpbrk(cs,ct) return pointer to first occurrence in string cs of any character of string ct, or NULL if none are present.
char *strstr(cs,ct) return pointer to first occurrence of string ct in cs, or NULL if not present.
size_t strlen(cs) return length of cs.
char *strerror(n) return pointer to implementation-defined string corresponding to error n.
char *strtok(s,ct) strtok searches s for tokens delimited by characters from ct.
void *memcpy(s,ct,n) copy n characters from ct to s, and return s.
void *memmove(s,ct,n) same as memcpy except that it works even if the objects overlap.
int memcmp(cs,ct,n) compare the first n characters of cs with ct; return as with strcmp.
void *memchr(cs,c,n) return pointer to first occurrence of character c in cs, or NULL if not present among the first n characters
void *memset(s,c,n) place character c into first n characters of s, return s.
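
Two entries in table 19 are easy to misuse: strtok keeps hidden state between calls, and strncpy does not terminate the destination when the source is too long. The sketch below illustrates both; the helper names split_tokens and copy_truncated are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: split line (modified in place) on any character in
 * delims. Stores up to max token pointers in out; returns the number stored. */
size_t split_tokens(char *line, const char *delims, char *out[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, delims);      /* first call: pass the string */

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, delims);        /* later calls: pass NULL */
    }
    return n;
}

/* Hypothetical helper: copy src into dst (capacity cap > 0), truncating if
 * needed and always terminating dst with '\0'. */
void copy_truncated(char *dst, const char *src, size_t cap)
{
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';                   /* strncpy alone may not terminate */
}
```

strtok treats a run of adjacent delimiters as a single separator, so empty fields are skipped.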

15.4. Utility Functions: <stdlib.h>

See table 20.

Table 20: Utility Functions
Function Description
double atof(const char *s) atof converts s to double; it is equivalent to strtod(s, (char**)NULL).
int atoi(const char *s) converts s to int; it is equivalent to (int)strtol(s, (char**)NULL, 10).
long atol(const char *s) converts s to long; it is equivalent to strtol(s, (char**)NULL, 10).
double strtod(const char *s, char **endp) strtod converts the prefix of s to double, ignoring leading white space; it stores a pointer to any unconverted suffix in *endp unless endp is NULL.
long strtol(const char *s, char **endp, int base) strtol converts the prefix of s to long, ignoring leading white space; it stores a pointer to any unconverted suffix in *endp unless endp is NULL.
unsigned long strtoul(const char *s, char **endp, int base) strtoul is the same as strtol except that the result is unsigned long and the error value is ULONG_MAX.
int rand(void) rand returns a pseudo-random integer in the range 0 to RAND_MAX, which is at least 32767.
void srand(unsigned int seed) srand uses seed as the seed for a new sequence of pseudo-random numbers. The initial seed is 1.
void *calloc(size_t nobj, size_t size) calloc returns a pointer to space for an array of nobj objects, each of size size, or NULL if the request cannot be satisfied. The space is initialized to zero bytes.
void *malloc(size_t size) malloc returns a pointer to space for an object of size size, or NULL if the request cannot be satisfied. The space is uninitialized.
void *realloc(void *p, size_t size) realloc changes the size of the object pointed to by p to size.
void free(void *p) free deallocates the space pointed to by p; it does nothing if p is NULL.
void abort(void) abort causes the program to terminate abnormally, as if by raise(SIGABRT).
void exit(int status) exit causes normal program termination.
int atexit(void (*fcn)(void)) atexit registers the function fcn to be called when the program terminates normally.
int system(const char *s) system passes the string s to the environment for execution.
char *getenv(const char *name) getenv returns the environment string associated with name, or NULL if no string exists.
void *bsearch(const void *key, const void *base, size_t n, size_t size, int (*cmp)(const void *, const void *)) bsearch searches the sorted array pointed to by base (n elements, each size bytes long) for an element matching key, using cmp to compare; it returns a pointer to a matching element, or NULL if none is found.
void qsort(void *base, size_t n, size_t size, int (*cmp)(const void *, const void *)) qsort sorts the n elements of the array pointed to by base, each size bytes long, using cmp to determine the order.
int abs(int n) abs returns the absolute value of its int argument.
long labs(long n) labs returns the absolute value of its long argument.
div_t div(int num, int denom) div computes the quotient and remainder of num/denom. The results are stored in the int members quot and rem of a structure of type div_t.
ldiv_t ldiv(long num, long denom) ldiv computes the quotient and remainder of num/denom. The results are stored in the long members quot and rem of a structure of type ldiv_t.
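
The following sketch shows two common patterns built on table 20: checked conversion with strtol (atoi gives no way to detect errors) and sorting followed by binary search. The names parse_int, cmp_int and sort_and_contains are hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal int from s. Returns 1 and stores the
 * value in *out on success; returns 0 if s has no digits, has trailing
 * characters, or is out of range for int. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')        /* no digits, or unconverted suffix */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}

/* Comparison callback shared by qsort and bsearch for int arrays. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);            /* avoids the overflow of x - y */
}

/* Hypothetical helper: sort a[0..n-1], then report whether key is present. */
int sort_and_contains(int *a, size_t n, int key)
{
    qsort(a, n, sizeof a[0], cmp_int);
    return bsearch(&key, a, n, sizeof a[0], cmp_int) != NULL;
}
```

bsearch requires the array to be sorted consistently with the same comparison function, which is why both calls share cmp_int.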

15.5. Non-local Jumps: <setjmp.h>

A goto statement can only jump within a single function, which can be called a local jump. setjmp and longjmp make non-local jumps possible.

The declarations in <setjmp.h> provide a way to avoid the normal function call and return sequence, typically to permit an immediate return from a deeply nested function call.

int setjmp(jmp_buf env)
    The macro setjmp saves state information in env for use by longjmp. The return is zero from a direct call of setjmp, and non-zero from a subsequent call of longjmp. A call to setjmp can only occur in certain contexts, basically the test of if, switch, and loops, and only in simple relational expressions.
    if (setjmp(env) == 0)
        /* get here on direct call */
    else
        /* get here by calling longjmp */

void longjmp(jmp_buf env, int val)
    longjmp restores the state saved by the most recent call to setjmp, using the information saved in env, and execution resumes as if the setjmp function had just executed and returned the non-zero value val. The function containing the setjmp must not have terminated. Accessible objects have the values they had at the time longjmp was called, except that non-volatile automatic variables in the function calling setjmp become undefined if they were changed after the setjmp call.

A setjmp/longjmp example:

/* setjmp example: error handling */
#include <stdio.h>      /* printf, scanf */
#include <stdlib.h>     /* exit */
#include <setjmp.h>     /* jmp_buf, setjmp, longjmp */

int main()
{
  jmp_buf env;
  int val;

  val = setjmp (env);
  if (val) {
    fprintf (stderr, "Error %d happened\n", val);
    exit (val);
  }

  /* code here */

  longjmp (env, 101);   /* signal an error; execution resumes at the earlier setjmp call */

  return 0;
}

The program above prints (on stderr):

Error 101 happened

Reference: http://www.cplusplus.com/reference/csetjmp/setjmp/
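
The typical use described earlier, an immediate return from a deeply nested call, can be sketched as follows. Here setjmp is the entire controlling expression of a switch, which is one of the contexts the standard permits. All names (recover_point, level1 to level3, run_nested) are hypothetical.

```c
#include <setjmp.h>

static jmp_buf recover_point;            /* hypothetical name */

/* Three nested levels; the innermost one jumps out when asked to fail. */
static void level3(int fail) { if (fail) longjmp(recover_point, 7); }
static void level2(int fail) { level3(fail); }
static void level1(int fail) { level2(fail); }

/* Returns 0 if the nested calls return normally, 7 if level3 called
 * longjmp(recover_point, 7), and -1 for any other longjmp value. */
int run_nested(int fail)
{
    switch (setjmp(recover_point)) {     /* setjmp is the whole switch expression */
    case 0:
        level1(fail);                    /* direct call: run the nested code */
        return 0;
    case 7:
        return 7;                        /* arrived here via longjmp */
    default:
        return -1;
    }
}
```

level2 and level1 never see the failure; control passes straight from level3 back to run_nested.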

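15.5.1. Implementing setjmp and longjmp

The implementation of setjmp and longjmp is platform-specific.

FreeBSD implements setjmp and longjmp for x86-64 as follows (this is the kernel version from support.S; note that its longjmp always makes setjmp return 1):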

/*****************************************************************************/
/* setjump, longjump                                                         */
/*****************************************************************************/

ENTRY(setjmp)
        movq    %rbx,0(%rdi)                    /* save rbx */
        movq    %rsp,8(%rdi)                    /* save rsp */
        movq    %rbp,16(%rdi)                   /* save rbp */
        movq    %r12,24(%rdi)                   /* save r12 */
        movq    %r13,32(%rdi)                   /* save r13 */
        movq    %r14,40(%rdi)                   /* save r14 */
        movq    %r15,48(%rdi)                   /* save r15 */
        movq    0(%rsp),%rdx                    /* get return address */
        movq    %rdx,56(%rdi)                   /* save return address */
        xorl    %eax,%eax                       /* return(0); */
        ret
END(setjmp)

ENTRY(longjmp)
        movq    0(%rdi),%rbx                    /* restore rbx */
        movq    8(%rdi),%rsp                    /* restore rsp */
        movq    16(%rdi),%rbp                   /* restore rbp */
        movq    24(%rdi),%r12                   /* restore r12 */
        movq    32(%rdi),%r13                   /* restore r13 */
        movq    40(%rdi),%r14                   /* restore r14 */
        movq    48(%rdi),%r15                   /* restore r15 */
        movq    56(%rdi),%rdx                   /* get return address */
        movq    %rdx,0(%rsp)                    /* restore return address */
        xorl    %eax,%eax                       /* return(1); */
        incl    %eax
        ret
END(longjmp)

All setjmp is doing is saving a bunch of registers including %rsp (the stack pointer) and the return address into the jmp_buf array that was passed as a parameter. All longjmp is doing is restoring those registers and the return address of the original setjmp call.

x86-64 has 16 general-purpose registers, yet the setjmp above saves only 7 of them (the callee-save registers). Why do the other registers not need to be saved and restored?
Recall that setjmp and longjmp are implemented as functions, so they follow the standard x86-64 calling convention. That convention dictates that the 7 registers above are owned by the caller (also known as callee-save), which means that setjmp, like any function, is responsible for leaving these registers as it found them when it returns. The other registers are owned by the callee, which is allowed to clobber them however it likes before returning. So there is no obligation to restore these registers before returning, and the caller cannot make any assumptions about what they will contain.

References:
http://svnweb.freebsd.org/base/head/sys/amd64/amd64/support.S?revision=249439&view=markup#l657
http://blog.reverberate.org/2013/05/deep-wizardry-stack-unwinding.html

15.5.2. setjmp and longjmp Can Cause Memory Leaks

The stack frames between the setjmp call and the longjmp call are discarded when longjmp is called, which can cause memory leaks.

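#include <setjmp.h>
#include <stdlib.h>

extern jmp_buf env;      /* filled in by an earlier setjmp(env) in a caller */

void f2(void);

void f1(void) {
    char *p = malloc(1024);

    f2();

    free(p);     /* never called, memory leak! */
}

void f2(void) {
    longjmp(env, 1);
}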
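
One way to avoid such a leak is to record the allocation somewhere the recovery code can reach and release it there. The following is only a sketch under that assumption; the file-scope pointer pending and the function run_without_leak are hypothetical names.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf env;
static char *pending;        /* hypothetical: allocation to release on error */

static void f2(void)
{
    longjmp(env, 1);         /* abandons f1's stack frame */
}

static void f1(void)
{
    pending = malloc(1024);
    f2();
    free(pending);           /* skipped when f2 calls longjmp */
    pending = NULL;
}

/* Returns 1 if control came back through longjmp, 0 otherwise. */
int run_without_leak(void)
{
    if (setjmp(env) == 0) {
        f1();
        return 0;
    } else {
        free(pending);       /* release what f1 could not */
        pending = NULL;
        return 1;
    }
}
```

The pointer has static storage duration, so its value is still well defined after the jump, unlike a modified non-volatile local in the function that called setjmp.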

15.6. Signals: <signal.h>

See table 21.

Table 21: Signals
Function Description
void (*signal(int sig, void (*handler)(int)))(int) signal determines how subsequent signals will be handled.
int raise(int sig) raise sends the signal sig to the program; it returns non-zero if unsuccessful.
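
A minimal sketch of signal and raise working together; the names got_signal, on_signal and demo_signal are hypothetical. The handler only sets a flag of type volatile sig_atomic_t, which is the one kind of object a handler can portably write to.

```c
#include <signal.h>

static volatile sig_atomic_t got_signal = 0;

/* Handler: do as little as possible, just record that the signal arrived. */
static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Hypothetical demo: install a handler for SIGINT, raise the signal, and
 * report whether the handler ran. Returns 1 on success, 0 otherwise. */
int demo_signal(void)
{
    void (*old)(int) = signal(SIGINT, on_signal);

    if (old == SIG_ERR)
        return 0;
    got_signal = 0;
    if (raise(SIGINT) != 0)          /* handler runs before raise returns */
        return 0;
    signal(SIGINT, old);             /* restore the previous handling */
    return got_signal == 1;
}
```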

15.7. Date and Time Functions: <time.h>

See table 22.

Table 22: Date and Time Functions
Function Description
clock_t clock(void) clock returns the processor time used by the program since the beginning of execution.
time_t time(time_t *tp) time returns the current calendar time or -1 if the time is not available.
double difftime(time_t time2, time_t time1) difftime returns time2-time1 expressed in seconds.
time_t mktime(struct tm *tp) mktime converts the local time in the structure *tp into calendar time.
char *asctime(const struct tm *tp) asctime converts time into a string of the form "Sun Jan 3 15:14:13 1988\n\0"
char *ctime(const time_t *tp) It is equivalent to asctime(localtime(tp))
struct tm *gmtime(const time_t *tp) gmtime converts the calendar time *tp into Coordinated Universal Time (UTC).
struct tm *localtime(const time_t *tp) localtime converts the calendar time *tp into local time.
size_t strftime(char *s, size_t smax, const char *fmt, const struct tm *tp) strftime formats date and time information from *tp into s according to fmt.

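struct tm is defined in <time.h>. The tm_gmtoff and tm_zone members shown below are BSD/GNU extensions and are not part of the C standard.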

struct tm {
	int	tm_sec;		/* seconds after the minute [0-60] */
	int	tm_min;		/* minutes after the hour [0-59] */
	int	tm_hour;	/* hours since midnight [0-23] */
	int	tm_mday;	/* day of the month [1-31] */
	int	tm_mon;		/* months since January [0-11] */
	int	tm_year;	/* years since 1900 */
	int	tm_wday;	/* days since Sunday [0-6] */
	int	tm_yday;	/* days since January 1 [0-365] */
	int	tm_isdst;	/* Daylight Savings Time flag */
	long tm_gmtoff;	/* offset from UTC in seconds */
	char *tm_zone;	/* timezone abbreviation */
};

Example 1: measuring the CPU time used by a section of code

#include <stdio.h>
#include <time.h>

int main() {
    clock_t begin = clock();
    // Do stuff
    clock_t end = clock();
    double elapsed = (double)(end - begin) * 1000.0 / CLOCKS_PER_SEC;
    printf("CPU time elapsed in milliseconds: %f\n", elapsed);
}

Reference: http://www.gnu.org/software/libc/manual/html_node/CPU-Time.html

Example 2: measuring the wall-clock time taken by a section of code

#include <stdio.h>
#include <time.h>

int main() {
    time_t start, end;
    time(&start);
    // Do stuff
    time(&end);
    double duration = difftime(end, start);
    printf("Time elapsed in seconds: %f\n", duration);
}
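
The conversion functions in table 22 are usually chained: a time_t is broken down into a struct tm, which strftime then formats. A minimal sketch, with format_utc as a hypothetical helper name:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format t as "YYYY-MM-DD HH:MM:SS" in UTC.
 * Returns the number of characters written (excluding '\0'),
 * or 0 if the conversion fails or buf is too small. */
size_t format_utc(time_t t, char *buf, size_t cap)
{
    struct tm *tp = gmtime(&t);      /* result points to static storage */

    if (tp == NULL)
        return 0;
    return strftime(buf, cap, "%Y-%m-%d %H:%M:%S", tp);
}
```

On POSIX systems, where time_t counts seconds since 1970-01-01 00:00:00 UTC, calling format_utc with a time of 0 produces exactly that date and time. The C standard itself does not fix the epoch.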

15.8. Diagnostics: <assert.h>

void assert(int expression)
If expression is zero, the assert macro will print on stderr a message, such as

Assertion failed: expression, file filename, line nnn

If NDEBUG is defined at the time <assert.h> is included, the assert macro is ignored.
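
assert is meant for catching programming errors such as violated preconditions, not for validating run-time input, because the checks vanish when NDEBUG is defined. A minimal sketch with a hypothetical function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: average of n ints.
 * Precondition: a is not NULL and n is greater than 0. */
double average(const int *a, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(a != NULL);      /* both checks disappear when NDEBUG is defined */
    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum / (double)n;
}
```

Expressions inside assert should have no side effects, since they are not evaluated at all in NDEBUG builds.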

16. C FAQs

16.1. Library Functions

16.1.1. How to Handle Regular Expressions and Wildcards in C

16.1.1.1. POSIX regex in C

Unix-like systems generally provide a library for POSIX regular expressions.

#include <sys/types.h>
#include <regex.h>

int regcomp(regex_t *preg, const char *regex, int cflags);
    /* Prepare your regex for fast processing */

int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
    /* Do the matching */

void regfree(regex_t *preg);
    /* Free your compiled regex for a new "compiling" */

size_t regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size);
    /* Retrieve some more information on why the regexec() failed */

A POSIX regex example:

#include <stdio.h>
#include <stdlib.h>     /* exit */

#include <sys/types.h>
#include <regex.h>

int main(int argc, char *argv[])
{
    regex_t regex;
    int reti;
    char msgbuf[100];

    /* Compile regular expression */
    reti = regcomp(&regex, "^a[[:alnum:]]", 0);
    if( reti ) { fprintf(stderr, "Could not compile regex\n"); exit(1); }

    /* Execute regular expression */
    reti = regexec(&regex, "abc", 0, NULL, 0);
    if( !reti ) {
            puts("Match");
    } else if( reti == REG_NOMATCH ) {
            puts("No match");
    } else {
            regerror(reti, &regex, msgbuf, sizeof(msgbuf));
            fprintf(stderr, "Regex match failed: %s\n", msgbuf);
            exit(1);
    }

    /* Free compiled regular expression if you want to use the regex_t again */
    regfree(&regex);

    return 0;
}

References:
http://www.peope.net/old/regex.html
http://www.gnu.org/software/libc/manual/html_node/Regular-Expressions.html
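
The example above only tests for a match. To extract submatches, pass a regmatch_t array and its length to regexec; each entry then holds the start and end offsets of one parenthesized group. The sketch below assumes extended syntax (REG_EXTENDED) and a hypothetical helper parse_pair for "key=value" strings with a numeric value.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: split "key=value" into key and val, each a buffer of
 * cap bytes. Returns 0 on a match, -1 otherwise. */
int parse_pair(const char *s, char *key, char *val, size_t cap)
{
    regex_t re;
    regmatch_t m[3];       /* m[0]: whole match, m[1] and m[2]: the groups */
    int rc;

    if (regcomp(&re, "^([[:alpha:]_]+)=([[:digit:]]+)$", REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, s, 3, m, 0);
    if (rc == 0) {
        /* rm_so and rm_eo are byte offsets into s */
        snprintf(key, cap, "%.*s", (int)(m[1].rm_eo - m[1].rm_so), s + m[1].rm_so);
        snprintf(val, cap, "%.*s", (int)(m[2].rm_eo - m[2].rm_so), s + m[2].rm_so);
    }
    regfree(&re);
    return rc == 0 ? 0 : -1;
}
```

Compiling the pattern on every call keeps the sketch short; code that matches repeatedly would normally compile once and reuse the regex_t.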

16.1.1.2. Wildcard Matching

In a wildcard pattern, ? matches exactly one arbitrary character and * matches zero or more arbitrary characters.

Here is a quick little wildcard matcher by Arjan Kenter:

int match(char *pat, char *str)
{
    switch(*pat) {
    case '\0':  return !*str;
    case '*':   return match(pat+1, str) || *str && match(pat, str+1);
    case '?':   return *str && match(pat+1, str+1);
    default:    return *pat == *str && match(pat+1, str+1);
    }
}

/* (Copyright 1995, Arjan Kenter) */
/* With this definition, the call match("a*b.c", "aplomb.c") would return 1. */
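
As an alternative to a hand-written matcher, POSIX systems provide fnmatch in <fnmatch.h>, which also understands bracket expressions such as [a-z]. This is a different technique from the recursive function above; the wrapper name glob_match is hypothetical.

```c
#include <fnmatch.h>

/* Hypothetical wrapper: returns 1 if name matches the shell-style pattern,
 * 0 otherwise. fnmatch itself returns 0 on a match. */
int glob_match(const char *pattern, const char *name)
{
    return fnmatch(pattern, name, 0) == 0;
}
```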

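16.1.2. Parsing Command-Line Arguments (getopt)

POSIX systems provide a getopt function for parsing command-line arguments. It expects the program's command-line arguments to follow these conventions:

  1. Each option is a single letter or digit;
  2. Every option begins with a hyphen (-).

The prototype of getopt is: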

#include <unistd.h>
int getopt(int argc, char * const argv[], const char *optstring);

extern char *optarg;
extern int optind;
extern int optopt;
extern int opterr;

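The argc and argv parameters are the same ones passed to main. The optstring parameter is a string containing all supported option characters. If an option character is immediately followed by a colon, that option takes an argument; otherwise it is a simple switch.
For example, if a command is used as: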

command [-i] [-u username] [-z] filename

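then iu:z should be passed to getopt as optstring.

getopt is normally called in a loop that ends when getopt returns -1. On each iteration getopt returns the next option it has processed. When it meets an invalid option, or an option whose required argument is missing, getopt returns a question mark (?).

A -- on the command line makes getopt stop processing and return -1. For example, to delete a file named -bar: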

$ rm -- -bar   # right
$ rm -bar      # wrong

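getopt uses four external variables:

optarg
If an option takes an argument, getopt sets optarg to point to that argument string when it processes the option.
opterr
Setting opterr to 0 stops getopt from printing a message when it meets an invalid option.
optind
The index in argv of the next string to be processed. It starts at 1 (argv[0] is skipped) and is advanced as getopt consumes arguments.
optopt
If an error occurs, getopt stores the option character that caused it in optopt (an int, not a pointer).

Note: to handle options that are not a single character (such as --version), use the getopt_long function.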

References:
Advanced Programming in the UNIX Environment, 2nd Edition. page 773.
Advanced Programming in the UNIX Environment, 3rd Edition. page 662.
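
For long options, the getopt_long function mentioned in the note above can be used roughly as follows. It is a GNU/BSD extension declared in <getopt.h>, not part of POSIX. The struct options type, the option names and the function parse_options are hypothetical and exist only for this sketch.

```c
#include <getopt.h>   /* getopt_long: GNU/BSD extension */
#include <stddef.h>

struct options {          /* hypothetical result type for this sketch */
    int verbose;
    const char *output;
};

/* Parse --verbose/-v and --output FILE/-o FILE. Returns 0 on success,
 * -1 on an unknown option or a missing argument. */
int parse_options(int argc, char **argv, struct options *opts)
{
    static const struct option long_opts[] = {
        { "verbose", no_argument,       NULL, 'v' },
        { "output",  required_argument, NULL, 'o' },
        { NULL, 0, NULL, 0 }          /* terminator */
    };
    int c;

    opts->verbose = 0;
    opts->output = NULL;
    optind = 1;                       /* start scanning at argv[1] */
    opterr = 0;                       /* suppress getopt's own messages */
    while ((c = getopt_long(argc, argv, "vo:", long_opts, NULL)) != -1) {
        switch (c) {
        case 'v': opts->verbose = 1;     break;
        case 'o': opts->output = optarg; break;
        default:  return -1;          /* '?' for errors */
        }
    }
    return 0;
}
```

Because the fourth field of each long_opts entry is the matching short option character, the long and short forms share the same switch cases.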

16.1.2.1. getopt 实例

The following program (testopt) shows a typical use of getopt:

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main (int argc, char **argv)
{
  int aflag = 0;
  int bflag = 0;
  char *cvalue = NULL;
  int index;
  int c;

  opterr = 0;
  while ((c = getopt (argc, argv, "abc:")) != -1) {
    switch (c) {
      case 'a':
        aflag = 1;
        break;
      case 'b':
        bflag = 1;
        break;
      case 'c':
        cvalue = optarg;
        break;
      case '?':
        if (optopt == 'c')
          fprintf (stderr, "Option -%c requires an argument.\n", optopt);
        else if (isprint (optopt))
          fprintf (stderr, "Unknown option `-%c'.\n", optopt);
        else
          fprintf (stderr,
                   "Unknown option character `\\x%x'.\n",
                   optopt);
        return 1;
      default:
        abort ();
    }
  }
  printf ("aflag = %d, bflag = %d, cvalue = %s\n",
          aflag, bflag, (cvalue == NULL) ? "null" : cvalue);

  for (index = optind; index < argc; index++)
    printf ("Non-option argument %s\n", argv[index]);
  return 0;
}

Sample run:

$ ./testopt -abc test file1 file2
aflag = 1, bflag = 1, cvalue = test
Non-option argument file1
Non-option argument file2

Reference: http://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Example-of-Getopt.html

16.1.2.2. Implementing getopt

Below is one implementation of getopt, from http://note.sonots.com/Comp/CompLang/cpp/getopt.html

#include <stdio.h>
#include <string.h>   /* strcmp, strchr */

/* Print "<program><message><option char>" to stderr unless opterr is 0. */
#define ERR(s, c)   if(opterr){\
    fputs(argv[0], stderr);\
    fputs(s, stderr);\
    fputc(c, stderr);\
    fputc('\n', stderr);}

int opterr = 1;
int optind = 1;
int optopt;
char *optarg;

int
getopt(int argc, char **argv, char *opts)
{
    static int sp = 1;
    register int c;
    register char *cp;

    if(sp == 1)
        if(optind >= argc ||
           argv[optind][0] != '-' || argv[optind][1] == '\0')
            return(EOF);
        else if(strcmp(argv[optind], "--") == 0) {
            optind++;
            return(EOF);
        }
    optopt = c = argv[optind][sp];
    if(c == ':' || (cp=strchr(opts, c)) == NULL) {
        ERR(": illegal option -- ", c);
        if(argv[optind][++sp] == '\0') {
            optind++;
            sp = 1;
        }
        return('?');
    }
    if(*++cp == ':') {
        if(argv[optind][sp+1] != '\0')
            optarg = &argv[optind++][sp+1];
        else if(++optind >= argc) {
            ERR(": option requires an argument -- ", c);
            sp = 1;
            return('?');
        } else
            optarg = argv[optind++];
        sp = 1;
    } else {
        if(argv[optind][++sp] == '\0') {
            sp = 1;
            optind++;
        }
        optarg = NULL;
    }
    return(c);
}

16.2. Miscellaneous

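16.2.1. How to Detect the Machine's Byte Order

Integers can be stored in two ways: little-endian (the least significant byte at the lowest address) and big-endian (the most significant byte at the lowest address).
Suppose an int is located at address 0x100 and its hexadecimal value is 0x01234567. The two layouts are:

Little-endian:
Address  Value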
0x100  0x67
0x101  0x45
0x102  0x23
0x103  0x01

Big-endian:
Address  Value
0x100  0x01
0x101  0x23
0x102  0x45
0x103  0x67

How can the machine's byte order be detected? The following code can be used:

int x = 1;
if(*(char *)&x == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

Or:

union {
    int i;
    char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

Reference: http://c-faq.com/misc/endiantest.html
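
The address tables above can be reproduced by copying the bytes of a 32-bit value in memory order. This is a small sketch using the fixed-width types from <stdint.h> (C99); the function names are hypothetical, and mixed-endian layouts are not considered.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of v, lowest address first, into out[4]. */
void bytes_in_memory_order(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, sizeof v);
}

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    unsigned char b[4];

    bytes_in_memory_order(UINT32_C(0x01234567), b);
    return b[0] == 0x67;     /* least significant byte stored first? */
}
```

On a little-endian machine the bytes come out as 0x67, 0x45, 0x23, 0x01; on a big-endian machine they come out as 0x01, 0x23, 0x45, 0x67.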

17. C Tips

17.1. Is malloc Thread-Safe?

On any modern UNIX you'll get a thread-safe malloc by default. On Windows, use the /MT, /MTd, /MD or /MDd flags to get a thread-safe runtime library.

References:
http://stackoverflow.com/questions/855763/is-malloc-thread-safe
http://www.360doc.com/content/12/0420/23/168576_205320609.shtml

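17.2. Do Not Rely on calloc to Initialize Pointers to NULL

Do not rely on calloc to initialize pointers to NULL. The C standard does not require a null pointer to be represented as all bits zero (every byte 0). In fact, some platforms have a nonzero NULL representation.

References: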
http://c-faq.com/null/machexamp.html
http://stackoverflow.com/questions/13251499/calloc-pointers-and-all-bits-zero
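
A portable alternative is to assign NULL to each pointer member explicitly. A minimal sketch; struct node and alloc_nodes are hypothetical names used only for illustration.

```c
#include <stdlib.h>

struct node {             /* hypothetical type for illustration */
    int value;
    struct node *next;
};

/* Allocate n nodes and set every member explicitly instead of relying on
 * calloc's all-bits-zero fill. Returns NULL if the allocation fails. */
struct node *alloc_nodes(size_t n)
{
    struct node *a = malloc(n * sizeof *a);
    size_t i;

    if (a == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        a[i].value = 0;
        a[i].next = NULL;   /* a true null pointer on every platform */
    }
    return a;
}
```

The assignment of NULL lets the compiler produce whatever bit pattern the platform uses for a null pointer.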

Author: cig01

Created: <2010-07-02 Fri>

Last updated: <2020-08-09 Sun>
