# 「软件基础 - PLF」 5. Small-Step Operational Semantics

Posted by Hux on March 5, 2019

## Big-step

### Cons - not catch the essence of how program behave

not just input state get mapped to output state. but also intermediate state (which could be observed by concurrent code!)

### Cons - not technically expressive enough to express exception / crash / non-termination

1. 不停机 nontermination - we want to allow this (infinite loop is the price paid for usability)
2. 卡住 getting stuck / undefiend behaviour 未定义行为 - we want to prevent (wrong)
• WHILE_true_nonterm 仅仅表达了「程序不能再 take step」，无法与「卡住」区分
• WHILE_true 更是直接让任何「无限循环」的程序都「等价」了…而忽略了中间状态和 effect (作用)

we need a way of presenting semantics that distinguish nontermination from erroneous “stuck states”

## A Toy Language

Only Constant and Plus

1
2
3
Inductive tm : Type :=
| C : nat → tm (* Constant *)
| P : tm → tm → tm. (* Plus *)


### Big-Step

==> is really ⇓

1
2
3
4
5
6
7
---------        (E_Const)
C n ==> n

t1 ==> n1
t2 ==> n2
-------------------  (E_Plus)
P t1 t2 ==> n1 + n2


### Small-Step

single reduction step find leftmost redex

1
2
3
4
5
6
7
8
9
10
-------------------------------   (ST_PlusConstConst)
P (C n1) (C n2) --> C (n1 + n2)

t1 --> t1'
--------------------          (ST_Plus1)
P t1 t2 --> P t1' t2

t2 --> t2'
----------------------------      (ST_Plus2)
P (C n1) t2 --> P (C n1) t2'


## Relations

Check notes of rel and tactics for more details about bi-relation.

### Deterministic 确定性

a.k.a Partial Function. in terms of its right uniqueness under mathematical context, not its emphasise on partial under programming context)

1
2
Definition deterministic {X : Type} (R : relation X) :=
∀x y1 y2 : X, R x y1 → R x y2 → y1 = y2.


deterministic step can be proved by induction on derivation x --> y1

• use generalize dependent y2!
• in informal proof, we usually just take ∀ y2 by default.

### Ltac solve_by_inverts n

1
2
3
4
5
6
7
Ltac solve_by_inverts n :=
match goal with | H : ?T ⊢ _ ⇒
match type of T with Prop ⇒
solve [
inversion H;
match n with S (S (?n')) ⇒ subst; solve_by_inverts (S n') end ]
end end.


### Values 值

#### Abstract Machine 抽象机!

think of the --> relation as defining an abstract machine:

• term = state of machine 项 = 机器状态
• step = atomic unit of computation (think as assembly opcode / CPU instructrion)
• halting state = no more computation. 停机状态

execute a term t:

• starting state = t
• repeatedly use -->
• when halt, read out the final state as result of execution

Intutively, we call such (final state) terms values. Okay so the point is…this language is simple enough (no stuck state). and in this lang, value can only be Const:

1
2
Inductive value : tm → Prop :=
| v_const : ∀n, value (C n).


and we can write ST_Plus2 more elegant: well…in this lang, not really, since only one form of value to write out. in cases we have multiple form of value, by doing this we don’t have to write out any cases.

1
2
3
4
value v1
t2 --> t2'
--------------------  (ST_Plus2)
P v1 t2 --> P v1 t2'


### Strong Progress and Normal Forms 强可进性和正规式

strong progress: every term either is a value or can “make progress”

1
2
Theorem strong_progress : ∀t,
value t ∨ (∃t', t --> t').


terms that cannot make progress. for an arbitrary relation R over an arbitrary set X

normal form: term that cannot make progress (take a step) 其实我个人比较喜欢理解为「常态」或「无能量稳定态」

1
2
Definition normal_form {X : Type} (R : relation X) (t : X) : Prop :=
¬∃t', R t t'.


theorem: in this language, normal forms and values are actually the same thing.

1
2
3
Lemma value_is_nf : v, value v → normal_form step v.
Lemma nf_is_value : ∀t, normal_form step t → value t.
Corollary nf_same_as_value : ∀t, normal_form step t ↔ value t.


#### Value != Normal Form (not always)

value is a syntactic concept : it is defined by looking at the form of a term normal form is a semantic one : it is defined by looking at how the term steps.

E.g. we can defined term that can take a step as “value”: 添加一个不是 normal form 的 value

1
2
3
Inductive value : tm → Prop :=
| v_const : ∀n, value (C n)
| v_funny : ∀t1 n2, value (P t1 (C n2)). (* <--- it can actually progress! *)


1
2
3
Inductive step : tm -> tm -> Prop :=
| ST_Funny : forall n,
C n --> P (C n) (C 0)                (* <--- or a weird  *)


## Multi-Step Reduction -->* 多步规约

relation multi R: multi-step closure of R same as clos_refl_trans_1n in Rel chapter.

1
2
3
4
5
6
Inductive multi {X : Type} (R : relation X) : relation X :=
| multi_refl : ∀(x : X), multi R x x
| multi_step : ∀(x y z : X),
R x y →
multi R y z →
multi R x z.


1
2
3
4
5
6
7
8
Theorem multi_R : ∀(X : Type) (R : relation X) (x y : X),
R x y →
multi R x y.

Theorem multi_trans : ∀(X : Type) (R : relation X) (x y z : X),
multi R x y →
multi R y z →
multi R x z.


### Normal Forms Again

1
2
3
4
5
6
Definition step_normal_form := normal_form step.  (** 这个是一个「性质」 Property : _ -> Prop , 从 polymorphic 的 [normal_form] 以 [step] 实例化而来 **)
Definition normal_form_of (t t' : tm) :=          (** 是两个项之间的（i.e. 定义在 [tm] 集合上的) 二元关系, 即 t' 是 t 的正规式 **)
(t -->* t' ∧ step_normal_form t').

Theorem normal_forms_unique:                      (** single-step reduction is deterministic 可以推出 normal form is unique for a given term **)
deterministic normal_form_of.


### Normalizing 总是可正规化得 – “Evaluating to completion”

something stronger is true for this language (though not for all languages) reduction of any term t will eventually reach a normal form (我们知道 STLC 也有这个特性)

1
2
3
Definition normalizing {X : Type} (R : relation X) :=
∀t, ∃t',
(multi R) t t' ∧ normal_form R t'.


To prove this, we need lemma showing some congruence of -->*: 同余关系，不过这次是定义在 -->* 这个关系上，again，同余指的是「关系对于结构上的操作保持」

1
2
3
4
5
6
7
8
Lemma multistep_congr_1 : ∀t1 t1' t2,
t1 -->* t1' →
P t1 t2 -->* P t1' t2.

Lemma multistep_congr_2 : ∀t1 t2 t2',
value t1 →
t2 -->* t2' →
P t1 t2 -->* P t1 t2'.


Then we can prove…

1
2
Theorem step_normalizing :
normalizing step.


### Equivalence of Big-Step and Small-Step

1
2
3
4
5
Theorem eval__multistep : ∀t n,
t ==> n → t -->* C n.

Theorem multistep__eval : ∀t t',
normal_form_of t t' → ∃n, t' = C n ∧ t ==> n.    (* might be better to say value here? *)


What if we combined the lang Arith and lang Boolean? Would step_deterministic and strong_progress still holds?

Intuition:

• step_deterministic should still hold
• but strong_progress would definitely not!!
• now we mixed two types so we will have stuck terms e.g. test 5 or tru + 4.
• we will need type check and then we would be able to prove progress (which require well-typeness)
1
2
3
4
5
6
7
8
9
10
11
Theorem strong_progress :
(forall t, value t \/ (exists t', t --> t')) \/
~ (forall t, value t \/ (exists t', t --> t')).
Proof.
right. intros Hcontra.
remember (P tru fls) as stuck.   (** 类似 disprove equiv = 举一个反例就好 **)
specialize (Hcontra stuck).
destruct Hcontra as [Hcvalue | Hcprogress]; subst.
- inversion Hcvalue; inversion H.
- destruct Hcprogress. inversion H. inversion H3. inversion H4.
Qed.


## Small-Step IMP

### aexp, bexp

1
2
Inductive aval : aexp → Prop :=
| av_num : ∀n, aval (ANum n).


bexp 不需要 value 因为在这个语言里 BTrueBFalse 的 step 总是 disjointed 得，所以并没有任何复用 value predicate 的时候

### -->a, -->b

1. 大步语义要短得多
2. aexp, bexp 其实并不会出
• 「不停机」: 没有 jump 等控制流结构
• 「异常」/「卡住」: 我们在 meta-language 的 AST 里就区分了 aexpbexp，相当于主动约束了类型，所以不会出现 5 || 3 这样 type error 的 AST

### cmd, -->

• 赋值命令归约到 SKIP （和一个新的 state）。
• 顺序命令等待其左侧子命令归约到 SKIP，然后丢弃它，并继续对右侧子命令归约。

WHILE 命令的归约是把 WHILE 命令变换为条件语句，其后紧跟同一个 WHILE 命令。

## Concurrent IMP

• unpredictable scheduling (subcommands may be interleaved)
• share same memory

It’s slightly confusing here to use Par (meaning in parallel) I mean, concurrency could be in parallel but it doesn’t have to…

1
2
3
4
5
6
7
8
9
10
11
12
13
Inductive com : Type :=
| CPar : com → com → com. (* <--- NEW *)

Inductive cstep : (com * state) → (com * state) → Prop :=
(* New part: *)
| CS_Par1 : ∀st c1 c1' c2 st',
c1 / st --> c1' / st' →
(PAR c1 WITH c2 END) / st --> (PAR c1' WITH c2 END) / st'
| CS_Par2 : ∀st c1 c2 c2' st',
c2 / st --> c2' / st' →
(PAR c1 WITH c2 END) / st --> (PAR c1 WITH c2' END) / st'
| CS_ParDone : ∀st,
(PAR SKIP WITH SKIP END) / st --> SKIP / st


## A Small-Step Stack Machine 小步栈机

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Definition stack := list nat.
Definition prog  := list sinstr.

Inductive stack_step : state -> prog * stack -> prog * stack -> Prop :=
| SS_Push : forall st stk n p',
stack_step st (SPush n :: p', stk)      (p', n :: stk)
| SS_Load : forall st stk i p',
stack_step st (SLoad i :: p', stk)      (p', st i :: stk)
| SS_Plus : forall st stk n m p',
stack_step st (SPlus :: p', n::m::stk)  (p', (m+n)::stk)
| SS_Minus : forall st stk n m p',
stack_step st (SMinus :: p', n::m::stk) (p', (m-n)::stk)
| SS_Mult : forall st stk n m p',
stack_step st (SMult :: p', n::m::stk)  (p', (m*n)::stk).

(** closure of stack_step **)
Definition stack_multistep st := multi (stack_step st).


### Compiler Correctness

「编译器的正确性」= the notion of semantics preservation (in terms of observable behaviours) S = e C = s_compile e B(S) = aeval st e B(C) = functional s_execute | relational stack_multistep

1
2
3
4
5
6
7
8
Theorem s_compile_correct : forall (st : state) (e : aexp),
s_execute st [] (s_compile e) = [ aeval st e ].

(** 重要的是这个更一般的「描述了 prog 如何与 stack 交互」的定理 **)
Theorem s_execute_theorem : forall (st : state) (e : aexp) (stack : list nat) (prog : list sinstr),
s_execute st                  stack  (s_compile e ++ prog)
= s_execute st ((aeval st e) :: stack)                 prog.



1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Theorem stack_step_theorem : forall (st : state) (e : aexp) (stack : list nat) (prog : list sinstr),
stack_multistep st
((s_compile e ++ prog),                 stack)
(                prog , (aeval st e) :: stack).      (** 这里 prog 和 stack 的交互本质上和上面是一样的 **)
Proof.
unfold stack_multistep.
induction e; intros; simpl in *;        (** 证明 induction on aexp，然后利用 transivitiy、constructor 与 IH 即可，非常快 **)
try (apply multi_R; constructor);
try (
repeat (rewrite <- app_assoc);
eapply multi_trans; try apply IHe1;
eapply multi_trans; try apply IHe2;
eapply multi_R; constructor
).

Definition compiler_is_correct_statement : Prop := forall (st : state) (e : aexp),
stack_multistep st (s_compile e, []) ([], [aeval st e]).


## Aside: A normalize Tactic

Even with eapply and auto…manual normalization is tedious:

1
2
3
4
5
6
7
8
Example step_example1' :
(P (C 3) (P (C 3) (C 4)))
-->* (C 10).
Proof.
eapply multi_step. auto. simpl.
eapply multi_step. auto. simpl.
apply multi_refl.
Qed.


We could write custom Tactic Notation…(i.e. tactic macros)

1
2
3
4
5
6
7
Tactic Notation "print_goal" :=
match goal with ⊢ ?x ⇒ idtac x end.

Tactic Notation "normalize" :=
repeat (print_goal; eapply multi_step ;
[ (eauto 10; fail) | (instantiate; simpl)]);
apply multi_refl.


instantiate seems here for intros ∃?

1
2
3
4
5
6
Example step_example1''' : exists e',
(P (C 3) (P (C 3) (C 4)))
-->* e'.
Proof.
eapply ex_intro. normalize.
Qed.


But what surprise me is that we can eapply ex_intro, which leave the ∃ as a hole ?ex (unification variable).