Mosh compiler/VM のデバッグを1週間続けた話

Mosh は R7RS に対応しつつあるので ecraven/r7rs-benchmarks というベンチマークを走らせてみた。今回は速度は気にせずにR7RS Scheme 処理系としてコードを間違いなく正しく実行できるかに注力。結局57個中で3つのベンチマークで実行に失敗することが分かった。うち2つは比較的簡単に修正できた。最後の1つが手強かったので記録を残す。

スタート

失敗するのは conform というベンチマーク。期待される結果とは違うものを Mosh が返してしまう。そして実行時間がやけに短い。conform ベンチマークスクリプトr7rs-benchmarks/conform.scm)を見てみるとグラフ構造を作って何かやっているらしい。正直コードを理解しようとするのは3秒で諦めた。

この時点ではデバッグが難航する事は知る由もないのでデバッグ print を入れてなぜ期待と違うかを調べようとするがすぐに挫折。なぜならば

  • A) グラフ構造を扱っているので自分の中で自分を参照していてデバッグ print と相性が悪いこと。write/ss で shared structure を print できるがそれでも視認性が悪い。
  • B) データ構造が大きく print しきれない
  • C) そもそも何をやっているのかコードから読み取れない。

この状態で2-3時間無駄に使った。

心理的安全性

これは難航しそうだとようやく気付いたので少し落ち着くことにした。正しい状態がよく分からないのが問題なので Gauche で実行して比べることとした。次に処理内容が分からないのは良いとして、メインの処理を何となく名前から特定した。ここでようやくメインの処理にデバッグ print を入れて Gauche と比較できるようになり、ある関数で一瞬で間違った値が返っていることが分かった。

間違うポイントが分かったので勝利を確信。その関数への入力を維持したまま再現コードを小さくしていくことにした。ところがこれがかなり難しい。入力も出力もグラフなので文字列や数字を扱うのとは別の難しさがある。色々やっているうちにぐちゃぐちゃになってしまった。元に戻らなくなってしまい大反省。debug という git branch を作り少しずつ進むようにしたら急に捗ったし壊すことも無くなった。チェックポイントあるし壊しても大丈夫という心理的安全性大事。1日かけて小さなコードに絞ることができた。

(define (foo)
  (define (bar  n)
    (cond ((= n 1) #t)
          (else
           (let loop ((lst '(1)))
             (if (null? lst)
                 #t
                 (and 
                  (display "=> recusive1\n")
                  (bar 1)
                  (display "=> loop\n")                         
                  (loop (cdr lst))))))))
    (bar 0)
    )
(foo)

このコードの (bar 1) の呼び出しの後に (display "=> loop\n") が呼ばれないことが分かった。これは明らかにおかしいなぜならば (bar 1) は #t を返すから。

それで誰が悪いのか?

Scheme 処理系を書いたことのある人なら分かると思うが、これは色々怪しい。define の中に define があり named let があり末尾再帰があり。どこでバグっていてもおかしくない。

この時点でバグの原因候補は実行順に

というのを切り替えないといけない。その上この辺りを触るのは10年以上ぶりである!

コンパイラVM を調べる

最適化を OFF にすると再現しなくなったので

  • 最適化そのものが間違っている
  • 最適化で出てくるコードを正しく実行できないパスがある

あたりが怪しい。

pass2 の最適化の途中。1週間のデバッグを終えた今の目で見ればおかしいところは明らか。

($if ($asm NUMBER_EQUAL
       ($lref n[1;0])
       ($const 1))
  ($const #t)
  ($letrec ((loop[0;0] ($lambda[loop;2] (lst[2;0])
                         ($label #0
                           ($if ($asm NULL_P
                                  ($lref lst[2;0]))
                             ($const #t)
                             ($if ($call ($gref display)
                                    ($const "=> recusive1\n"))
                               ($if ($call[tail-rec] ($lref bar[2;0])
                                      ($const 1))
                                 ($if ($call ($gref display)
                                        ($const "=> loop\n"))
                                   ($call[jump] ($call[embed] ($lambda[loop;2] (lst[2;0])
                                                                label#0)
                                                  ($const (1)))
                                     ($asm CDR
                                       ($lref lst[2;0])))
                                   ($it))
                                 ($it))
                               ($it))))))
            )
    ($call[embed] ($lambda[loop;2] (lst[2;0])
                    label#0)
      ($const (1)))))

pass2 の最適化後の iform は以下の通り。ここで時間を使いすぎた。

($define () foo
  ($lambda[foo;0] ()
    ($call[embed 4 0] ($lambda[bar;2] (n[1;0])
                        ($label #0
                          ($if ($asm NUMBER_EQUAL
                                 ($lref n[1;0])
                                 ($const 1))
                            ($const #t)
                            ($call[embed 7 0] ($lambda[loop;2] (lst[2;0])
                                                ($label #1
                                                  ($if ($asm NULL_P
                                                         ($lref lst[2;0]))
                                                    ($const #t)
                                                    ($if ($call ($gref display)
                                                           ($const "=> recusive1\n"))
                                                      ($if ($call[jump 0 0] ($call[embed 4 0] ($lambda[bar;2] (n[1;0])
                                                                                                label#0)
                                                                              ($const 0))
                                                             ($const 1))
                                                        ($if ($call ($gref display)
                                                               ($const "=> loop\n"))
                                                          ($call[jump 0 0] ($call[embed 7 0] ($lambda[loop;2] (lst[2;0])
                                                                                               label#1)
                                                                             ($const (1)))
                                                            ($asm CDR
                                                              ($lref lst[2;0])))
                                                          ($it))
                                                        ($it))
                                                      ($it)))))
                              ($const (1))))))
      ($const 0))))

結局 pass2 の最適化で local call を埋め込む処理あたりで何かがおかしい事は分かるのだが。この iform がおかしいのか。後続の処理がおかしいのか分からないので後続も見る。 実際の VM instruction 列を表示してみると。ますます分からない。

  CLOSURE
  81
  0
  #f
  0
  11
  ((reproduce.scm 1) foo)
  LET_FRAME
  7
  CONSTANT_PUSH
  0
  ENTER  ;; Label #0
  1
  REFER_LOCAL_PUSH_CONSTANT
  0
  1
  BRANCH_NOT_NUMBER_EQUAL ;; if (= n 1)
  5
  CONSTANT
  #t
  LOCAL_JMP ;; goto label #1
  57
  LET_FRAME 5
  REFER_LOCAL_PUSH
  0
  DISPLAY
  1
  CONSTANT_PUSH
  (1)
  ENTER
  1
  REFER_LOCAL_BRANCH_NOT_NULL
  0
  5
  CONSTANT
  #t
  LOCAL_JMP
  38
  FRAME
  6
  CONSTANT_PUSH
  => recusive1
  REFER_GLOBAL_CALL
  display  ;; (display "=> recusive1\n")
  1        ;; # of args
  TEST     ;; (display ...) return value is true. So we skip the +1 next line and go to +2 line.
  29
  CONSTANT_PUSH ;; Come here after (display ...) call
  1
  SHIFTJ ;; adjust SP and FP
  1      ;; depth
  4      ;; diff
  0      ;; How many closure to go up?
  LOCAL_JMP ;; Jump to label #0
  -42
  TEST
  19
  FRAME
  6
  CONSTANT_PUSH
  => loop
  REFER_GLOBAL_CALL
  display
  1
  TEST
  10
  REFER_LOCAL
  0
  CDR_PUSH
  SHIFTJ
  1
  1
  0
  LOCAL_JMP
  -43
  LEAVE
  1
  LEAVE ;; label #2
  1     ;; adjust stack
  RETURN ;; return to the code (the code is taken from the top of stack). ** But we don't have the code in the stack?***
  0
  DEFINE_GLOBAL
  foo
  HALT NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP)

動的に VM で実行される様子を stack と共に。

========================
FRAME
  FP|0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
========================
REFER_GLOBAL
  FP|0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
========================
CALL
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
========================
LET_FRAME                   # LET_FRAME for lambda[foo]
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
  FP|4: 4                   # Save fp
    |5: "#(#(CLOSURE ...))" # Closure
========================
CONSTANT
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
  FP|4: 4
    |5: "#(#(CLOSURE ...))"
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
  FP|4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0                  # Constant 0
========================
ENTER                      # Adjust fp
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
========================
REFER_LOCAL               # a=(Constant 0)
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0                  # Constant 0
    |7: 0                  # Constant 0
========================
CONSTANT                   # a=(Constant 1)
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 0
========================
BRANCH_NOT_NUMBER_EQUAL   # a != stack-bottom(Constant 0)
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0                  # Discarded stack top.
========================
LET_FRAME                  # LET_FRAME for loop.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6                   # Save fp
    |8: "#(#(CLOSURE ...))" # Push closure
========================
REFER_LOCAL                 # a=(Constant 0) REALLY???
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 0                  # push (Constant 0)
========================
DISPLAY                    # Create a display and set it to closure.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"# Note stack is popped.
========================
CONSTANT                   # a=(1)
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1                  # (1)
========================
ENTER
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1                 # New FP.
========================
REFER_LOCAL               # a=(1)
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
========================
BRANCH_NOT_NULL          # Go to else.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
========================
FRAME                     # Push stuff.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
    |10: "(#(#f ...)"      # closure
    |11: 9                 # fp
    |12: 54                # pc + n
    |13: "(#(CLOSURE ...)" # Current codes
========================
CONSTANT                   # a="=> recursive1"
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
    |14: "=> recusive1\n"
========================
REFER_GLOBAL               # a=<display>
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
    |14: "=> recusive1\n"
========================
CALL                       # call a=<display>. Note codes is now body of display.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()               # display has optional-arg '()
========================
REFER_LOCAL               # a="=> recursive1\n"
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()
========================
BRANCH_NOT_NULL           # a is not null so we go to Else. But this is really weird.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()
========================
REFER_LOCAL                # a="=> recursive1"
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()
    |16: "=> recusive1\n"
========================
REFER_FREE 0               # Point codes + 6
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
    |15: ()
    |16: "=> recusive1\n"
========================
TAIL_CALL                  # call <display> and jump to #(RETURN 1 ...)
=> recusive1
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
    |9: 1
    |10: "(#(#f ...)"
    |11: 9
    |12: 54
    |13: "(#(CLOSURE ...)"
  FP|14: "=> recusive1\n"
========================
RETURN
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1                   # this is ()
========================
TEST                        # Return value of display is #t
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
========================
CONSTANT
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
    |6: 0
    |7: 6
    |8: "#(#(CLOSURE ...))"
  FP|9: 1                    # (1)
    |10: 1                   # 1
========================
SHIFTJ 1 4 0                 # Adjust frame for jump. Stack is now for bar call.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
LOCAL_JMP
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
REFER_LOCAL                # a=1
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
PUSH
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
    |7: 1
========================
CONSTANT
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
    |7: 1
========================
BRANCH_NOT_NUMBER_EQUAL    # Now 1=1. Jump to else.
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
CONSTANT #t
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
LOCAL_JMP 66
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
    |4: 4
    |5: "#(#(CLOSURE ...))"
  FP|6: 1
========================
LEAVE
    |0: "#(#(FRAME ...))"
    |1: 0
    |2: 6
    |3: "(#(FRAME ...)"
========================
RETURN
========================
HALT

これを脳内で何回も再生し iform と行ったり来たりして実行していくと (bar 1) を実行した後に local_jump したら戻ってこれないのは自明であることに気づく。この lambda 埋め込みは間違いである。それに気づけばあとは簡単で (and の途中で埋め込まれてしまったのは (bar 1) 呼び出しが末尾再帰と誤認されたのが原因。

この1行の修正でお仕事完了。