Type Systems for Programming Languages

Table of Contents

1. Type System

In programming languages, a type system is a collection of rules that assign a property called a type to the various construct—​such as variables, expressions, functions or modules—​that a computer program is composed of. The main purpose of a type system is to reduce bugs in computer programs by defining interfaces between different parts of a computer program, and then checking that the parts have been connected in a consistent way.

Reference:
http://en.wikipedia.org/wiki/Type_system
http://blog.steveklabnik.com/posts/2010-07-17-what-to-know-before-debating-type-systems
Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1.

1.1. 从 4 个维度描述类型系统

一般地,我们可以从 4 个维度来描述类型系统。
维度一:Static vs Dynamic Typing
静态类型和动态类型,区分的关键点为编译期或运行期确定类型(这个判断依据不够准确,参见后文):静态类型在编译期确定,动态类型在运行期确定。

维度二:Strong vs Weak Typing
强类型和弱类型,区分的关键点为运行时是否自动转换到与实际类型不符的类型:强类型要求手工类型转换,弱类型自动转换。

维度三:Latent (Implicit) vs Manifest (Explicit) Typing
隐式类型和显式类型,区分的关键点为是否要在源码中声明类型:隐式类型不需要,显式类型需要。

维度四:Nominal vs Structural Typing
名义类型和结构类型,区分的关键点为类型判定是根据标称还是根据内容:名义类型根据标称,结构类型根据内容。

说明 1:维度一和维度二更加“深刻地”区分着类型系统。
说明 2:维度三往往仅描述了一种表观现象,对于有 Type inference(类型推导)的语言来说,往往既可以显式地声明类型,也可以省略类型声明。

参考:
https://www.reddit.com/r/programming/comments/63tnv/bruce_eckel_33104_im_over_it_java/c02qx55
http://www.blogjava.net/sean/archive/2009/09/28/296825.html
http://c2.com/cgi/wiki?TypingQuadrant

2. Dynamical typing and Statical typing

静态类型是指在编译时(compile-time)检测其类型;而动态类型不在编译时检测类型,而是推迟到运行时(run-time)检测类型。
但脚本语言是不需要编译的,上面这个说法不是太准确。Scheme 在其 R7RS 规范中对动态语言的说明比较好理解:
Scheme is a dynamically typed language. Types are associated with values (also called objects) rather than with variables. Statically typed languages, by contrast, associate types with variables and expressions as well as with values.

一句话总结动态类型就是:
Variables don't have types. Only values have types.
摘自:Let Over Lambda, 2.2 Environments and extent

Reference:
http://trac.sacrideo.us/wg/wiki/R7RSHomePage
http://tratt.net/laurie/research/pubs/html/tratt__dynamically_typed_languages/

3. Strong and Weak typing

In computer programming, programming languages are often colloquially referred to as strongly typed or weakly typed. These terms do not have a precise definition, but in general a strongly typed language is more likely to generate an error or refuse to compile if the argument passed to a function does not closely match the expected type. On the other hand, a very weakly typed language may produce unpredictable results or may perform implicit type conversion.

Reference: http://en.wikipedia.org/wiki/Strong_and_weak_typing

3.1. 判断强类型还是弱类型

If simple operations do not behave in a way that one would expect, a programming language can be said to be "weakly typed". For example, consider the following program:

x = "5" + 6

Different languages will assign a different value to 'x':

  • One language might convert 6 to a string, and concatenate the two arguments to produce the string "56" (e.g. JavaScript)
  • Another language might convert "5" to a number, and add the two arguments to produce the number 11 (e.g. Perl, PHP)
  • Yet another language might convert the string "5" to a pointer representing where the string is stored within memory, and add 6 to that value to produce an address in memory (e.g. C)
  • In yet another language, the + operation might fail during execution, saying that the two operands have incompatible type (e.g. Ruby)
  • And in many compiled languages, the compiler would reject this program because the addition is ill-typed, without ever running the program (e.g. BASIC)

Languages that work like the first three examples have all been called "weakly typed" at various times, even though only one of them (the third) represents a possible safety violation.

Reference: http://en.wikipedia.org/wiki/Strong_and_weak_typing#Predictability

4. 常见语言的动态/静态类型,强/弱类型

Reference:
http://en.wikipedia.org/wiki/Category:Statically_typed_programming_languages
http://en.wikipedia.org/wiki/Category:Dynamically_typed_programming_languages

Table 1: 常见语言的动态/静态类型,强/弱类型
Language Dynamically or Statically typed Strong or Weak Appeared in
Fortran Static Strong 1957
Lisp Dynamic Strong 1958
COBOL Static Weak 1959
C Static Weak 1972
Scheme Dynamic Strong 1975
C++ Static Strong 1983
Objective-C Dynamic Weak 1983
Erlang Dynamic Strong 1986
Perl Dynamic Weak 1987
Haskell Static Strong 1990
Python Dynamic Strong 1991
Lua Dynamic Strong 1993
Java Static Strong 1995
JavaScript Dynamic Weak 1995
PHP Dynamic Weak 1995
Ruby Dynamic Strong 1995
OCaml Static Strong 1996
C# Static Strong 2000
Go Static Strong 2009

注:上表中的 Dynamic 或 Static 是从类型检测(type-checking)的角度来说的。

一般地,我们称某个语言是动态语言是指它能在运行时动态地变化和扩展。
动态语言往往是动态类型的,但也不全是。如 Java 利用 Reflection 机制可实现动态语言特性,但 Java 是静态类型的。
Dynamic programming language is a term used in computer science to describe a class of high-level programming languages which, at runtime, execute many common programming behaviors that static programming languages perform during compilation. These behaviors could include extension of the program, by adding new code, by extending objects and definitions, or by modifying the type system.
Most dynamic languages are also dynamically typed, but not all are.

Reference: https://en.wikipedia.org/wiki/Dynamic_programming_language

5. Nominal and Structural typing

5.1. Nominal type system (name-based)

In computer science, a nominal or nominative type system (or name-based type system) is a major class of type system, in which compatibility and equivalence of data types is determined by explicit declarations and/or the name of the types. Nominal systems are used to determine if types are equivalent, as well as if a type is a subtype of another.

Nominal typing means that two variables are type-compatible if and only if their declarations name the same type. For example, in C++, two struct types with different names are never considered compatible, even if they have identical field declarations.

Reference: http://en.wikipedia.org/wiki/Nominal_type_system

5.2. Structural type system (property-based)

A structural type system (or property-based type system) is a major class of type system, in which type compatibility and equivalence are determined by the type's actual structure or definition, and not by other characteristics such as its name or place of declaration. Structural systems are used to determine if types are equivalent and whether a type is a subtype of another.

As an example, OCaml uses structural typing on methods for compatibility of object types. Go uses structural typing on methods to determine compatibility of a type with an interface. C++ template functions exhibit structural typing on type arguments.

Reference: http://en.wikipedia.org/wiki/Structural_type_system

6. Duck typing

In computer programming with object-oriented programming languages, duck typing is a style of typing in which an object's methods and properties determine the valid semantics, rather than its inheritance from a particular class or implementation of an explicit interface. The name of the concept refers to the duck test, attributed to James Whitcomb Riley, which may be phrased as follows: When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

In duck typing, a programmer is only concerned with ensuring that objects behave as demanded of them in a given context, rather than ensuring that they are of a specific type.

总结:一般来说,Duck typing 往往意味着同时满足"dynamical typing"和"structural typing".

Reference:
http://en.wikipedia.org/wiki/Duck_typing
http://en.wikipedia.org/wiki/Duck_typing#Implementations

Author: cig01

Created: <2014-03-16 Sun>

Last updated: <2015-12-08 Tue>

Creator: Emacs 27.1 (Org mode 9.4)