OCaml

Table of Contents

1. OCaml 简介

OCaml, originally known as Objective Caml, is the main implementation of the Caml programming language. In recent years, many new languages have drawn elements from OCaml, most notably F# and Scala.

Caml (originally an acronym for Categorical Abstract Machine Language) is a dialect of the ML programming language family, developed at INRIA and formerly at ENS.

参考:
OCaml Manual: http://caml.inria.fr/pub/docs/manual-ocaml/
OCaml tutorials: http://ocaml.org/learn/tutorials/
OCaml Cheat Sheets: http://ocaml.org/docs/cheat_sheets.html
Companies using OCaml: http://ocaml.org/learn/companies.html
Using, Understanding, and Unraveling The OCaml Language: http://caml.inria.fr/pub/docs/u3-ocaml/index.html
Real World OCaml: https://realworldocaml.org/v1/en/html/index.html
Developing Applications With Objective Caml: http://caml.inria.fr/pub/docs/oreilly-book/html/index.html
A Concise Introduction to Objective Caml, by David Matuszek: http://www.csc.villanova.edu/~dmatusze/resources/ocaml/ocaml.html
CS 3110 Spring 2014 :: Data Structures and Functional Programming: http://www.cs.cornell.edu/courses/cs3110/2014sp/lecture_notes.php

2. Starting Off

安装好 OCaml 环境后,在终端输入 ocaml 即可启动 OCaml 交互界面。如:

$ ocaml
        OCaml version 4.01.0

# 1 + 2;;
- : int = 3
# 5 * 8;;
- : int = 40
# 5 * 8.3;;
Error: This expression has type float but an expression was expected of type
         int
# 5.0 * 8.3;;
Error: This expression has type float but an expression was expected of type
         int
# 5 *. 8.3;;
Error: This expression has type int but an expression was expected of type
         float
# 5.0 *. 8.3;;
- : float = 41.5
# exit 0;;

说明 1:在上面例子中, OCaml 提示符以井号(#)开始。
说明 2:交互界面中输入的命令需要以两个分号(;;)结束,在程序源码中不必这么做。
说明 3:执行完表示式后,先会输出结果的类型,然后再输出结果。
说明 4:浮点数和整数在 OCaml 中非常不同,不会自动转换,运算符也不同。浮点数对应的加减乘除运算符为: +., -., *., /.
说明 5:要退出 OCaml 交互界面,可以输入 exit 0;;

2.1. 注释格式

OCaml comments are delimited by (* and *), like this:

(* This is a single-line comment. *)

(* This is a
 * multi-line
 * comment.
 *)

注释可以嵌套:

(* This code is broken ...

(* Primality test. *)
let is_prime n =
  (* note to self: ask about this on the mailing lists *) XXX;;

*)

参考:http://ocaml.org/learn/tutorials/basics.html

2.2. Hello World

准备文件 hello.ml,包含如下内容:

(* file hello.ml *)
print_string "Hello world!\n";;

要执行上面程序,有很多方法。
方式一:直接解释执行:

$ ocaml hello.ml
Hello world!

也可以这样解释执行,只是会输出其它多余的信息:

$ ocaml <hello.ml
        OCaml version 4.01.0

#   Hello world!
- : unit = ()
#

方式二:用 ocamlc 编译为可执行程序(运行需要 ocaml 环境):

$ ocamlc -o hello hello.ml
l$ ls
hello     hello.cmi hello.cmo hello.ml
$ ./hello
Hello world!

上面生成的 hello 可执行程序,由/usr/bin/ocamlrun 来执行的。验证如下:

$ file hello
hello: a /usr/bin/ocamlrun script executable (binary data)
$ head -n 1 hello             # 程序hello的第一行是一个Shebang,指向/usr/bin/ocamlrun
#!/usr/bin/ocamlrun
$ ocamlrun hello
Hello world!

方式三:用 ocamlopt 编译为 native 可执行程序(无需 ocaml 环境,可独立运行):

$ ocamlopt -o hello hello.ml
$ ./hello
Hello world!

上面生成的 hello 程序在 Linux 中它是标准的 ELF 可执行程序,无需 ocaml 环境,可独立运行。

$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=346ed175c24a69545d7d28d79722b94e843102e4, not stripped

3. 变量

At its simplest, a variable is an identifier whose meaning is bound to a particular value. In OCaml these bindings are often introduced using the let keyword.

3.1. let 定义变量(不可修改的)

let 绑定的基本形式为:

let <variable> = <expr>

如:

# let x = 100;;
val x : int = 100
# x + 1;;
- : int = 101

定义变量时不用指定类型,因为有类型推导(Type Inference)。若要显式地指定类型,可以在变量名后紧接着写冒号和类型名,如:

# let x : int = 100;;
val x : int = 100

说明:let 绑定变量的值是无法修改的,是不可变类型。

# let x = 1;;
val x : int = 1
# let x = 2;;       (* 前面定义的x无法修改,但我们可以重新定义x *)
val x : int = 2

3.1.1. 局部 binding(let ... in)

let can also be used to create a variable binding whose scope is limited to a particular expression, using the following syntax:

let <variable> = <expr1> in <expr2>

This first evaluates expr1 and then evaluates expr2 with variable bound to whatever value was produced by the evaluation of expr1.

let ... in 可以限制 binding 仅在接下来的表达式中有效。
局部 binding 实例:

#let area_of_ring inner_radius outer_radius =
  let pi = acos (-1.) in
  let area_of_circle r = pi *. r *. r in
  area_of_circle outer_radius -. area_of_circle inner_radius;;
val area_of_ring : float -> float -> float = <fun>
# area_of_ring 1.0 3.0;;
- : float = 25.1327412287183449

说明:绑定 pi 和 area_of_circle 在外层环境中(toplevel)中是不可见的。

参考:https://realworldocaml.org/v1/en/html/variables-and-functions.html#variables

4. 基本类型

Expressions evaluate to values. Values can be classified according to their type:

Type Example values
int 0,1,2,-1,-2
bool true, false
float 3.141592, 1.618034
string "Hello world!","xyzzy"
char 'A','Z'

5. 函数

OCaml is a functional language: functions in the full mathematical sense are supported and can be passed around freely just as any other piece of data.

let 可以定义函数。要调用函数,使用这种形式: fun1 arg1 arg2 ,函数名和参数之间以及参数和参数之间用空格分开即可。

如:

# let add x y = x + y;;
val add : int -> int -> int = <fun>
# add 2 3;;
- : int = 5

函数调用更多的实例:

f 5 (g "hello") 3    (* f has three arguments, g has one argument *)
f (g 3 4)            (* f has one argument, g has two arguments *)

5.1. 函数的类型

在定义函数时,不用指定参数和返回值的类型,因为有类型推导(Type Inference)。定义完函数后,OCaml 会显示自动推导出来的参数和返回值的类型。如:

# let add x y = x + y;;
val add : int -> int -> int = <fun>

其中,"int -> int -> int"中的前两个 int 表示函数两个参数的类型,最后一个 int 表示函数返回值的类型。之所以推导出参数和返回值都为 int 类型,是因为在 OCaml 中 + 只能用于 int 类型。 +. 可用于浮点类型,如:

# let add x y = x +. y;;
val add : float -> float -> float = <fun>

5.2. 显式指定函数类型

我们可以显式地指定函数参数和返回值的类型。如作用于 int 类型的函数 add 也可以定义为:

let add (x: int) (y: int) : int = x + y

5.3. Polymorphic Functions (多态函数:能接受任意类型参数)

In other languages, "polymorphic" means that you have two or more functions with the same name; in OCaml it means that a single function can handle more than one type of parameter.

An example of a built-in polymorphic function is the list function List.hd; it returns the first element of any type of list.

5.3.1. 任意类型 ('a, 'b, 'c)

如果一个函数能接受任意值作为它的参数,那么这个参数的类型是什么呢?答案是任意类型:'a
如:

# let return1 x = 1;;
val return1 : 'a -> int = <fun>

如果接受多个任意值作为它的参数,则它们的类型分别表示为'a, 'b, ...

# let return1 x y z = 1;;
val return1 : 'a -> 'b -> 'c -> int = <fun>

5.3.2. 编写多态函数

OCaml functions that you write will be polymorphic if their parameters are used only with polymorphic functions and operators. For example, the following function to reverse the two components of a 2-tuple is polymorphic:

let revPair (x, y) = (y, x);;

To write a polymorphic function, you need to avoid:

  • arithmetic operators (because OCaml needs to know whether the arithmetic is integer or floating point),
  • string concatenation
  • boolean operators
  • type conversion operators.

You can use:

  • lists and the list operators hd, tl, ::, @, along with the constant []
  • the equality tests = and !=.

参考:http://www.csc.villanova.edu/~dmatusze/resources/ocaml/ocaml.html#Polymorphic%20functions

5.4. Anonymous Functions (fun)

An anonymous function is a function that is declared without being named. These can be declared using the fun keyword.

匿名函数的定义形式为: fun arg1 arg2 -> body

匿名函数实例:

# fun x -> x + 1;;
- : int -> int = <fun>
# (fun x -> x + 1) 7;;
- : int = 8
# List.map (fun n -> n * 2 + 1) [0;1;2;3;4];;
- : int list = [1; 3; 5; 7; 9]

可以把匿名函数用 let 绑定到一个变量,这就是定义一个函数。如:

# let plusone = fun x -> x + 1;;
val plusone : int -> int = <fun>

下面的函数定义形式是上面定义形式的“语法糖”:

# let plusone x = x + 1;;
val plusone : int -> int = <fun>

5.5. Currying (柯里化:把多参函数变为单参函数)

Currying is the technique of translating the evaluation of a function that takes multiple arguments (or a tuple of arguments) into evaluating a sequence of functions, each with a single argument. It was introduced by Moses Schönfinkel and later developed by Haskell Curry.

把接受多个参数的函数转换为接受单个参数的函数的过程,称为柯里化。
柯里化实例一:

let add = fun x y -> x + y
let add = fun x -> (fun y -> x + y)

柯里化后的函数 add,返回另外一个函数。

柯里化实例二:

let abs_diff = fun x y -> abs (x - y)
let abs_diff = fun x -> (fun y -> abs (x - y))

5.5.1. Currying 的好处(使函数更加通用,可特化出很多函数)

Currying is more than just a theoretical curiosity. You can make use of currying to specialize a function by feeding in some of the arguments. Here’s an example where we create a specialized version of abs_diff that measures the distance of a given number from 3:

# let dist_from_3 = abs_diff 3;;
val dist_from_3 : int -> int = <fun>
# dist_from_3 10;;
- : int = 7

The practice of applying some of the arguments of a curried function to get a new function is called partial application.

5.5.2. OCaml 能自动 Currying

在 OCaml 中不用显式地定义柯里化版本的函数,OCaml 可以自动 Currying。如:

# let add = fun x y -> x + y;;
val add : int -> int -> int = <fun>
# (add 1) 3;;
- : int = 4

5.5.3. 其它函数式语言中的 Currying

函数式语言都支持 Currying。

如:scheme 中 currying 的例子:

(define (add a b)
  (+ a b))

对应的 curried 版本为:

(define (add a)
  (lambda (b)
    (+ a b)))

Standard ML 和 Haskell 语言支持 auto currying:
The nice thing about functional languages like Standard ML or Haskell is that you can get currying "for free". You can define a multi-argument function as you would in any other language, and you automatically get a curried version of it, without having to throw in a bunch of lambdas yourself.

参考:http://stackoverflow.com/questions/36314/what-is-currying

5.6. Labeled Arguments (给参数命名,参数前加 ~)

OCaml supports labeled arguments, which let you identify a function argument by name. You can give such names to arguments in your programs, by prefixing them with a tilde ~.

如:

# let f ~x ~y = x - y;;
val f : x:int -> y:int -> int = <fun>

Labeled Arguments 的好处是在使用函数时不用记住参数的位置(即是第几个参数)。

# f 3 1;;
- : int = 2
# f ~x:3 ~y:1;;
- : int = 2
# f ~y:1 ~x:3;;     (* Labeled Arguments的顺序无关紧要 *)
- : int = 2

5.7. Optional Arguments (参数前加 ?)

函数的参数可以是可选的,在参数前加 ? 可以指定一个可选参数。如:

# let concat ?sep x y =
    let sep = match sep with None -> "" | Some x -> x in x ^ sep ^ y;;
val concat : ?sep:string -> string -> string -> string = <fun>
# concat "foo" "bar";;
- : string = "foobar"

使用可选参数时,必须明确指定可选参数的名字(和 Labeled Arguments 类似)。如:

# concat "+" "foo" "bar";;            (* 可选参数的顺序无关紧要 *)
Error: The function applied to this argument has type ?sep:string -> string
This argument cannot be applied without label
# concat ~sep:"+" "foo" "bar";;
- : string = "foo+bar"
# concat "foo" "bar" ~sep:"+";;        (* 可选参数的顺序无关紧要 *)
- : string = "foo+bar"

5.8. 递归函数 (let rec)

递归函数要用 let rec 来定义。

# let rec factorial a =
    if a = 1 then 1 else a * factorial (a - 1);;
val factorial : int -> int = <fun>
# factorial 4;;
- : int = 24

说明:使用 let 或者 let rec 定义函数的区别仅仅在于函数名的作用域不同。如果上面的函数 factorial 用 let 定义,则会尝试查找已经存在的函数名 factorial ,如果 factorial 之前没有定义过则直接报错,如果 factorial 是一个已经定义过的函数,则会使用旧的 factorial 函数来重新定义函数 factorial.

6. Pattern Matching (match ... with ...)

OCaml 中最常见的分支结构是 Pattern Matching,其形式为:

match expression with
 | pattern1 -> expression1
 | pattern2 -> expression2
 ...
 | patternN  -> expressionN

例如,下面的 if ... then .. else ... 可以改为对应的 Pattern Matching 形式。

let rec factorial a =
    if a = 1 then 1 else a * factorial (a - 1)

上面 factorial 函数对应的 Pattern Matching 形式为:

let rec factorial a =
    match a with
    | 1 -> 1
    | _ -> a * factorial (a - 1)

说明 1:第 1 个 pattern 前面的 | 符号可以省略。所以上面 factorial 函数也可写为:

let rec factorial a =
    match a with
     1 -> 1
    | _ -> a * factorial (a - 1)

说明 2:The pattern _ is special – it matches anything.

参考:http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora016.html

6.1. 更精练的匹配形式 (function)

Ocaml reserved a special keyword function that defines a function with a single argument treated as pattern match.

实例:

let rec fib i =
   match i with
      0 -> 0
    | 1 -> 1
    | j -> fib (j - 2) + fib (j - 1);;

上面函数也可以这样定义(代码更精练):

let rec fib = function
     0 -> 0
   | 1 -> 1
   | i -> fib (i - 1) + fib (i - 2);;

6.2. Combining patterns

多个模式可以用 | 组合在一起。

let is_a_vowel c = match c with
    'a' -> true
  | 'e' -> true
  | 'i' -> true
  | 'o' -> true
  | 'u' -> true
  | _ -> false ;;

上面函数可以这样定义:

let is_a_vowel c = match c with
    'a' | 'e' | 'i' | 'o' | 'u' -> true
  | _ -> false ;;

6.3. Naming a value being matched (as)

During pattern matching, it is sometimes useful to name part or all of the pattern. The following syntactic form introduces the keyword as which binds a name to a pattern.

关键字 as 使用实例:

let rec fib i =
   match i with
   (0 | 1) as i -> i
 | i -> fib (i - 1) + fib (i - 2);;

6.4. Pattern matching with guards (when)

Pattern matching with guards corresponds to the evaluation of a conditional expression immediately after the pattern is matched. The when keyword evaluates a condition immediately after the pattern match.

关键字 when 使用实例:

let rec fib i =
   match i with
      i when i < 2 -> i
    | i -> fib (i - 2) + fib (i - 1);;

6.5. Nested Pattern Matching

如何写一个嵌套的 Pattern Matching 呢?

可以像下面这样吗?

let f = function
  | 0 -> match ... with | a -> ... | b -> ...
  | 1 -> ...
  | 2 -> ...

事实上,上面的结构被处理为下面形式:

let f = function
  | 0 ->
     match ... with
     | a -> ...
     | b -> ...
     | 1 -> ...
     | 2 -> ...

这并不是我需要的形式。怎么办?用 beginend 把嵌套的 Pattern Matching 包围起来即可。如:

let f = function
  | 0 ->
     begin match ... with
     | a -> ...
     | b -> ...
     end
  | 1 -> ...
  | 2 -> ...

参考:https://ocaml.org/learn/faq.html#Patternmatching

6.6. Range patterns (..)

In patterns, OCaml recognizes the form 'c' .. 'd' as shorthand for characters that occur between 'c' and 'd' in the ASCII character set. For instance, the pattern '0'..'9' matches all characters that are digits.

let is_uppercase = function
   'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H'
 | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P'
 | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X'
 | 'Y' | 'Z' ->
    true
 | _ ->
    false;;

上面的函数可简写为下面形式:

let is_uppercase = function
    'A' .. 'Z' -> true
  | _ -> false;;

参考:http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec214

7. Data structures

7.1. Lists ([expr1; expr2; ...])

A list is a collection of elements. Here is a list of three integers:

# let x = [1; 2; 3];;
val x : int list = [1; 2; 3]

We write a list between square brackets [ and ], separating the elements with semicolons. All elements of the list must have the same type.

7.1.1. Add an element to the front (::)

The :: operator (pronounced “cons”) is used to add a single element to the front of an existing list:

# 1 :: [2; 3];;
- : int list = [1; 2; 3]

7.1.2. Deconstruct list by ::

We can also use :: to deconstruct (rather than construct) a list:

let rec length l =
  match l with
    [] -> 0
  | _::tail -> 1 + length tail

上面例子实现了求解列表长度的函数。其中, :: 用于分解一个 list,而不是构造 list。

说明 1:上面的实现可改造为更好的“尾递归”实现:

let rec length_inner l n =
  match l with
    [] -> n
  | _::t -> length_inner t (n + 1)

let length list = length_inner list 0

说明 2:用 List.length 可直接得到列表的长度。

# List.length [1; 2; 4; 8];;
- : int = 4

7.1.3. Combine two lists together (@)

The @ operator (pronounced “append”) is used to combine two lists together:

# [1; 2; 3] @ [4];;
- : int list = [1; 2; 3; 4]

7.2. Tuples ((expr1, expr2, ...))

In general, an n-tuple is an ordered sequence of n values written in parenthesis and separated by commas as (expr, expr, ..., expr). For instance, (42, "hello", true) is a 3-tuple that contains the integer 42 as its first component, the string "hello" as its second component, and the boolean value true as its third component.

Tuples 实例:

# let x = (1, true, 1.2);;
val x : int * bool * float = (1, true, 1.2)

Tuples 的类型以这种形式表示:"type1 * type2 * type3"

参考:http://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/2/tuples_records_data.html

7.2.1. unit (empty tuple ())

The empty tuple () is called unit in OCaml.

# let xx = ();;
val xx : unit = ()

7.2.2. Tuples 单参函数像多参函数

用 Tuples 做函数参数看起来像多参函数。如:

let max1 (r1, r2) : float =
  if r1 < r2 then r2 else r1

其实,max1 只是一个单参(参数类型为 tuple)函数。可以这样使用函数 max1:

max1 (3.1415, 2.718)

7.2.3. fst and snd

You can extract the first two components of a tuple by using the fst and snd operators, which retreive the first and second elements of a tuple, respectively.

前面定义的函数 max1 可重新定义为:

let max1 (pair : float * float) : float =
  if (fst pair) < (snd pair) then (snd pair) else (fst pair)

8. Define new data types

OCaml allows us to define new data types. New types are introduced using type.

8.1. Records ({field1:type1; field2:type2; ...})

Records are similar to tuples in that they are data structures for holding multiple values. A record represents a collection of values stored together as one, where each component is identified by a different field name.

The basic syntax for a record type declaration is as follows:

type <record-name> =
{ <field> : <type>;
  <field> : <type>;
  ...
}

定义 Records 的例子:

type host_info =
{ hostname  : string ;
  os_name   : string ;
  cpu_arch  : string ;
}

使用 Records 的例子:

# let my_os = { os_name = "Linux"; cpu_arch = "unknown"; hostname = "ocaml.org" };;
val my_os : host_info =
  {hostname = "ocaml.org"; os_name = "Linux"; cpu_arch = "unknown"}

参考:https://realworldocaml.org/v1/en/html/records.html

8.2. Variants

Variant lists all possible shapes for values of that type.

The basic syntax of a variant type declaration is as follows:

type <variant> =
  | <Tag> [ of <type> [* <type>]... ]
  | <Tag> [ of <type> [* <type>]... ]
  | ...

Note that variant tags must be capitalized. 注意:Tags 的第一个字母必须为大写。

定义 Variants 的例子:

type day = Sun | Mon | Tue | Wed | Thu | Fri | Sat

使用 Variants 的例子:

let int_to_day (i : int) : day =
   match i mod 7 with
     0 -> Sun
   | 1 -> Mon
   | 2 -> Tue
   | 3 -> Wed
   | 4 -> Thu
   | 5 -> Fri
   | _ -> Sat

内置的类型 bool 可以认为是这样定义的:

type bool = true | false

参考:
http://www.cs.cornell.edu/courses/cs3110/2014sp/lectures/4/variant-types-and-polymorphism.html
http://caml.inria.fr/pub/docs/manual-ocaml-400/manual003.html#toc7
https://realworldocaml.org/v1/en/html/variants.html

8.2.1. Polymorphic Variants (`xxx)

和一般的 Variants 不同,Polymorphic Variants 以反引号开始,且无需显式的 type 声明。

Polymorphic Variants 例子:

type basic_color =
  [ `Black   | `Blue | `Cyan  | `Green
    | `Magenta | `Red  | `White | `Yellow ];;

参考:http://caml.inria.fr/pub/docs/manual-ocaml-400/manual006.html

9. Option Type (None or Some x)

Options are an Ocaml standard type that can be either None (undefined) or Some x where x can be any value. Options are widely used in Ocaml to represent undefined values (a little like NULL in C, but in a type and memory safe way).

# let divide x y =
  if y = 0 then None else Some (x/y) ;;
val divide : int -> int -> int option = <fun>
# divide 4 2;;
- : int option = Some 2
# divide 1 0;;
- : int option = None

参考:https://en.wikipedia.org/wiki/Option_type

10. OCaml 实例

求斐波那契数的例子:

(* File fib.ml *)

let rec fib n =
  if n < 2 then 1 else fib (n-1) + fib (n-2);;

let main () =
  let arg = int_of_string Sys.argv.(1) in
  print_int (fib arg);
  print_newline ();
  exit 0;;

main ();;
$ ./fib 5
8
$ ./fib 10
89

11. Precedence and associativity

The table below shows the relative precedences and associativity of operators and non-closed constructions. The constructions with higher precedence come first. For infix and prefix symbols, we write “*…” to mean “any symbol starting with *”.

Table 1: Precedence and associativity
Construction or operator Associativity
prefix-symbol
. .( .[ .{
# –  
function application, constructor application, tag application, assert, lazy left
- -. (prefix)
**… lsl lsr asr right
*… /… %… mod land lor lxor left
+… -… left
:: right
@… ^… right
=… <… >… |… &… $… != left
& && right
or || right
,
<- := right
if
; right
let match fun function try

参考:http://caml.inria.fr/pub/docs/manual-ocaml/expr.html

12. Tips

12.1. fun 和 function 的区别

Basically, "function" is a more fundamental function definition, it takes only one parameter, and immediately goes into pattern matching mode.

"fun" is a shorthand for creating functions with more than one parameter.

参考:http://caml.inria.fr/pub/ml-archives/ocaml-beginners/2003/11/b8036b7a0c1d082111d7a83c8f6dbfbb.en.html

Author: cig01

Created: <2015-11-01 Sun>

Last updated: <2017-12-08 Fri>

Creator: Emacs 27.1 (Org mode 9.4)