Programming Languages, UW, Part B - Static Checking

Posted on 2023-07-19 繁/简： set

这一节可以说是整套系列课程中含金量最高的一部分，学习的时候甚至能感受到获取知识的快感......！

一方面，在我们已经建立起对 ML 与 Racket 相当认识的基础之上，这一节由浅入深的展示了 static typing system 与 dynamic typing system 之间的区别与联系。
另一方面，本节的作业是用 Racket 写一个 MUPL (made-up programming language) 的解释器。

If you want to understand something, implement it.

This article is a self-administered course note.

It will NOT cover any exam or assignment related content.

Interpreter & Compiler

We can describe a typical workflow for a language implementation as follows.

We take a string holding the concrete syntax of a program in the language. Typically this string would be the contents of one or more files.
The parser (解析器，语法分析器) gives errors if this string is not syntactically well-formed. (no misused keywords, misplaced parentheses, etc.)
If there are no such errors, the parser produces a tree that represents the program. This is called abstract-syntax tree (抽象语法树), or AST for short.
If our language includes type-checking rules, the type-checker will use this AST to either produce error messages or not.

解析器将代码解析为 AST；这是源代码语言结构的一种树形抽象表示，树上的每一个节点代表源代码中的一种结构。若语言有类型检查规则，类型检查器将对 AST 进行类型检查。若通过检查，AST 将被进一步处理。

There are basically two approaches to this rest-of-the-implementation for implementing some programming language B, or sometimes of the mix.

interpreter (解释器)：我们用另一种语言 A 编写语言 B 的解释器。解释器读入 B 语言程序的 AST 并生成结果。更好的名字 (但并非学术共识)：executor for B 或 evaluator for B.
compiler (编译器)：我们用另一种语言 A 编写语言 B 的编译器。编译器读入 B 语言程序的 AST 并生成一段与其完全等价的 C 语言 (not necessarily the language C) 程序，再利用某些已有的 C 语言实现 (pre-existing implementation for C) 生成结果。更好的名字 (但并非学术共识)：translator for B.

无论采用解释器还是编译器，我们把语言 A 称之为 metalanguage (元语言)。For compilation, 我们把语言 B 称为 source language (源语言)，语言 C 称为 target language (目标语言)。

A one-sentence sermon: Interpreter versus compiler is a feature of a particular programming-language implementation, NOT a feature of the programming language.

采用编译器或解释器是语言实现的两种方式，与语言本身无关。我们不能将使用编译器或解释器当作语言本身的 feature。

我们常说的所谓编译型语言 (compiled languages) 与解释型语言 (interpreted languages) 其实是毫无道理的：这种分类方法的出现有着很复杂的历史原因，例如 C 语言长期以来都是采用编译器实现，函数式语言通常采用解释器实现。(因此 C 语言常被称作编译型语言，而函数式语言常被称作解释型语言)。

但这并不代表我们不能编写 C 语言的解释器，或是编写函数式语言的编译器 (事实上，SML/NJ 就采用了部分编译器实现)：无论采用哪一种实现方式，都与源语言本身无关。

ML versus Racket

General semantics 上的异同：

similarities:
- have constructs that encourage a functional style.
- avoiding mutations, while allowing it where appropriate.
- using first-class closures.
- ...
differences:
- different syntax: ML's pattern matching, Racket's accessor functions for structs.
- Racket's mutiple variants of let-expressions (let, let*, letrec)
- (most important) ML has a static type system that Racket does not!

ML's static type system rejects lots of programs before running them by doing type-checking and reporting errors. To do so, ML enforces certain restrictions (e.g., all elements of a list must have the same type). As a result, ML ensures the absence of certain errors at compile time.

理解两个语言区别的最好方式是从其中一个语言的角度审视另一个语言。抛开语法的桎梏，我们站在 ML 的视角，研究它的静态类型系统会对 Racket 的程序作何反应。

1	(define (f y) (+ y (car y)))

ML 的静态类型系统可以识别到以上程序的类型错误并在编译时报告错误，阻止该程序进入运行阶段。而 Racket 只有该函数被调用时才会弹出 run-time error。

1	(define xs (list 1 #t "hi"))

然而 ML 也会 reject 一些在 Racket 视角下没有 bug 的程序；如上例，这个程序违反了 ML 的类型规则：list 中的所有元素类型应该保持一致。但这样的 list 定义在 Racket 中完全没有问题。

从这个角度看，我们能够得出这样的结论：ML is roughly a subset of Racket. Programs that run produce similar answers, but ML rejects many more programs as illegal, i.e., not part of the language.

那么，一个 ML 程序员又会如何看待 Racket 呢？他可能会认为 Racket is just ML where every expression is part of one big datatype.

datatype theType = Int of int
                 | String of string
                 | Pair of theType of theType * theType
                 | Fun of theType -> theType
                 | ... (* one constructor per built-in type *)

Racket 中的每个表达，都被某个 constructor 隐式的 (implicitly) 包装入这个 big datatype 中。例如 42 实际上是 (Int 42)；这能够使得每个表达的结果都是 theType 类型。

并且，函数将检查其参数是否有正确的 constructors (in other words, "tags")，如果不合法，弹出错误；反之将调用函数的结果也隐式的包装在对应的 theType 类内。例如，(+ a b) 将检查 a 与 b 是否有标签 Int；如果有，则返回结果 (Int a+b)。

1 2	fun car v = case v of Pair(a, b) => a \| _ => raise ... (* give some error *) fun pair? v = case v of Pair _ => true \| _ => false

这是另外一个例子：我们用 ML 来展示 Racket 中存在的 implicit run-time pattern matching。由于这种 "secret pattern-matching" 是不对程序员暴露的，Racket 提供了类型检查函数例如 pair?。

The fact that we can think of Racket in terms of theType suggests that anything you can do in Racket can be done, perhaps more awkwardly, in ML: The ML programmer could just program explicitly using something like theType definition above.

What is Static Checking?

What is usually meant by "static checking" is anything done to reject a program at compile-time.

Compile-time is the time after a program (successfully) parses but before it runs.

注意，compile-time 是一个时间概念，虽然名字中含有 compile，但它与语言实现采用的是 interpreter 还是 compiler 无关。这进一步印证了我们之前所提到的语言实现并不是语言本身的 feature 这一观念。

The most common approach to define a language's static checking is via a type system.

type system 中包含一系列 typing rules，例如 (以 ML 为背景)：每个变量都有一个类型，条件语句的分支有着相同的类型，list 中的元素类型一致等等。static checking 则检查程序是否遵循了这些规则。

注意在 ML 中，type inferencer 也是 static checking 的一部分，它对程序进行推断，判断是否存在一种可能的 type annotations，使得程序能够遵循 type system 中定义的规则。

The purpose of static checking is to reject programs that "make no sense" or "may try to misuse a language feature".

static checking 的确能够达到它设计之初的目的，但它并不是完美的；

There are errors a type system typically does not prevent (such as array-bounds errors).
There are errors a type system cannot prevent unless given more information about what a program is supposed to do. (ex. wrongly calls + instead of *)
The necessary trade-off. the static checker has to reject some programs that would not do anything wrong.

值得注意的是第三点：Racket 的动态检查 (dynamic checking, i.e., run-time checking) 通过赋予并检测值的标签实现 (即上一节提到的 secret pattern-matching)，这使得它在运行时发现程序的错误。

而静态检查则能在程序运行之前排除某些有问题的程序，这样做的代价 (trade-off) 是它将会拒绝某些实际上没有问题的程序 (例如拥有不同类型元素的 list)。

The typical points at which to prevent an error are compile-time and run-time. However, it is worth realizing that there is really a continuum of eagerness about when we declare something an error.

实际上，静态检查与动态检查并非是对立的概念，对于不同类型的错误，我们可以选择在 compile-time 将其检出，也可以选择等到 run-time 时再弹出错误；在这两个时间段之外进行处理也是一个可行但非典型的选择。

如何掌握好 static checking 的 trade-off，平衡其检出错误的效率与语言的灵活性，才是一个更加实际且有意义的命题。

Correctness: Soundness, Completeness, Undecidability

设我们想要避免的错误为 X，我们定义一个 static checker 的正确性 (correctness) 为：

A type system is sound if it never accepts a program that, when run with some input, does X, i.e., soundness prevents false negatives.
A type system is complete if it never rejects a program that, no matter what input it is run with, will not do X, i.e., completeness prevents false positives.

实际上，soundness 与 completeness 这两个术语来自逻辑学，其被广泛的应用在编程语言的理论设计中。

A sound logic proves only true things.
A complete logic proves all true things.

如果一个 type system 能够兼具 soundness 与 completeness，那么它所接受的程序集合与所有不执行 X 的程序集合是完全等价的。可惜，理论是美好的，现实是残酷的：

In modern languages, type system are sound (they prevent what they claim to) but not complete (they reject program that need not reject).

以 ML 的 static checker 为例，若我们想避免的问题 X 为 "string 类型的值作为除法运算的运算数"：

1
2
3

fun f1 x = 4 div "hi"                         (* f1 never get called *)
fun f2 x = if true then 0 else 4 div "hi"
fun f3 = if x <= abs x then 0 else 4 div "hi"

注意，上述的三个函数均不会执行 4 div "hi" 这一语句，但 ML 的 static checker 却会检出类型错误并拒绝程序运行。这也就是说，ML 并不能很好的处理假阳性程序，它的 static checker 并不是 complete 的。

It is impossible to implement a static checker that given any program in your language (a) always terminates, (b) is sound, (c) is complete.

为了确保 X 不出现，现代语言的设计者选择了 soundness 与 always terminates 而牺牲了 completeness。接受假阴性程序的后果要远大于拒绝假阳性程序的后果。

与停机问题 (halting problem) 类似，一个总能停机的 static checker 不能兼具 soundness 与 completeness 的原因也是由图灵机的不可决定性 (undecidability) 决定的。

非正式的设想一下，若我们想避免的问题 X 为程序进入死循环，那么一个既 sound 又 complete 的 static checker 本身就是对停机问题的通用解法。这样，我们把该问题的不可能性归约到了停机问题的不可能性上。

Weak Typing

接下来我们再来介绍一个语言概念，strongly typed language 与 weakly typed language。

首先要强调，语言的强/弱类型与静态/动态类型是没有必然联系的两个概念。ML 与 Racket 虽然分别采用静态与动态类型系统，但它们都可被视为是强类型语言。

对于想要阻止的问题 X，不同的语言有不同的处理方式：

static checking: 利用 static checker 在 compile-time 检出。
dynamic checking: 若 static checker 无法检出 X (语言不使用 static checker 或其 static checker 对问题 X 是 unsound 的)，则在语言实现中加入一些 dynamic checks 在 run-time 阻止 X 发生。
......
weak typing: 语言实现根本不对程序进行检查：认定 X 的出现是程序员本身的问题。若程序员本身没有发现问题，X 将会发生。(你程序员觉得是我的锅，那就是我的锅)

C/C++ 是很典型的弱类型语言：举一个非常熟悉的例子，C/C++ 中的数组越界问题。即使我们访问了不合法的数组下标如 a[-1]，C/C++ 也不会阻止程序的运行，最后程序可能会因为访问某些未申请的内存而导致崩溃。

所谓强类型语言，就是其类型系统 (type system) 中存在较多的 restrictions，阻止潜在 bug 程序的运行。而弱类型语言的 restrictions 就相对较少。

相对于强类型语言，弱类型语言省去了进行检查的时间成本，存储数据 (例如 dynamic checks 所需要记录的 tags 信息) 的空间成本；但所付出的代价是，程序的 bug 更需要程序员本身进行检出而不能依靠强类型语言中的各种检查来发现。(相信我们都遇到过出现难以 de 出的奇怪 bug，最后发现是数组越界的抓狂时刻)

因此，在实际应用中存在很多外置的 checking tools 对弱类型语言程序进行检查；这是一个折中的方案。

另外需要强调的是，语言的强/弱类型比起是一种定义，不如说是一种相对的概念，具体可以参考这个帖子，回答者讲的很好。在中文语境下对强/弱类型似乎还有另外的定义，与变量的定义与类型转换规则相关。

Extended Semantics

对于某个想要避免的问题 X，一些语言并不纠结是否将其检出的问题，它们选择扩展其 primitives 的语义。例如，传统的 + 只允许两个数字进行相加，一个 string 类型的参数是不合法的。

而在 Python 中，"Hello" + "World" 这样的表达是合法的；这是因为 Python 扩展了 + 的 semantics，使其具有了连接两个 string 类型字符串的语义。

关于是否将问题 X 所对应的本来应当视作 bug 的语句赋予新的意义这个问题见仁见智：一方面，这样的处理使得一些 bug 更难被检出。一方面，它又确实给程序员带来了很多方便。

Reference

This article is a self-administered course note.

References in the article are from corresponding course materials if not specified.

Supplementary notes:

My notes in CnBlogs. My implementation.

Is C# a strongly typed or weakly typed language?.

Course info:

Programming Languages, Part B, University of Washington, Lecturer: Professor Dan Grossman.

-----------------------------------そして、次の曲が始まるのです。-----------------------------------