~bfiedler/website

ref: 8a096ff69f7d87fd5a31af0a8bf51205aaf89b30 website/static/typing-is-hard.html -rw-r--r-- 15.9 KiB
8a096ff6 — Ben Fiedler Add copy of typing-is-hard.ch a month ago
                                                                                
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <title>Typing is Hard</title>
  <style type="text/css">
      code{white-space: pre-wrap;}
      span.smallcaps{font-variant: small-caps;}
      span.underline{text-decoration: underline;}
      div.column{display: inline-block; vertical-align: top; width: 50%;}
  </style>
  <link rel="stylesheet" href="pandoc.css" />
  <link href="" rel="icon" type="image/x-icon"/>
</head>
<body>
<header>
<h1 class="title">Typing is Hard</h1>
</header>
<nav id="TOC">
<ul>
<li><a href="#type-checking-and-type-inference">Type Checking and Type Inference</a></li>
<li><a href="#common-terms">Common terms</a><ul>
<li><a href="#completeness">Completeness</a></li>
<li><a href="#soundness">Soundness</a></li>
<li><a href="#decidability">Decidability</a></li>
<li><a href="#hindley-milner-type-system">Hindley-Milner Type System</a></li>
<li><a href="#dependent-types">Dependent types</a></li>
</ul></li>
<li><a href="#how-hard-is-type-checking-your-favorite-language">How Hard is Type-Checking Your Favorite Language?</a><ul>
<li><a href="#c">C++</a></li>
<li><a href="#c-1">C#</a></li>
<li><a href="#elm">Elm</a></li>
<li><a href="#f">F#</a></li>
<li><a href="#go">Go</a></li>
<li><a href="#haskell">Haskell</a></li>
<li><a href="#idris">Idris</a></li>
<li><a href="#java">Java</a></li>
<li><a href="#ocaml">OCaml</a></li>
<li><a href="#ml">ML</a></li>
<li><a href="#rust">Rust</a></li>
<li><a href="#scala">Scala</a></li>
<li><a href="#swift">Swift</a></li>
<li><a href="#typescript">TypeScript</a></li>
<li><a href="#zig">Zig</a></li>
</ul></li>
<li><a href="#faq">FAQ</a><ul>
<li><a href="#where-are-python-bash-etc.">Where are Python, Bash, etc.?</a></li>
<li><a href="#what-about-unsafe-casts">What about unsafe casts?</a></li>
<li><a href="#ive-spotted-a-mistakeimprecision-what-should-i-do">I’ve spotted a mistake/imprecision, what should I do?</a></li>
</ul></li>
</ul>
</nav>
<h2 id="type-checking-and-type-inference">Type Checking and Type Inference</h2>
<p>Type checking is the process of taking a given program in some programming language and working out whether all variables and expressions have the correct types, i.e. strings are assigned to strings, arithmetic expressions involve only numbers, etc. Some languages also offer <em>type inference</em>, tasking the compiler with the task of figuring out correct types on its own. Depending on a language’s features the type checking and type inference problems range from trivial to undecidable.</p>
<h2 id="common-terms">Common terms</h2>
<h3 id="completeness">Completeness</h3>
<p>A type checker is complete if it can check every correctly typed program.</p>
<h3 id="soundness">Soundness</h3>
<p>A type checker is sound if it only accepts correctly typed programs.</p>
<h3 id="decidability">Decidability</h3>
<p>A decision problem is <strong>decidable</strong> if for any input we can compute whether the input satifies the problem in finite time. Examples of decidable problems include primality testing and boolean satisfiability. The halting problem for example is <strong>undecidable</strong>: We cannot check whether a program runs infinitely long in finite time. We are interested in the type checking and type inference problems for programming languages: Given some input program, does it type check? And given some program, which types should we assign to the expression such that it typechecks?</p>
<h3 id="hindley-milner-type-system">Hindley-Milner Type System</h3>
<p>The Hindley-Milner (HM) type system is a type system for the simple typed lambda calculus with parametric polymorphism, which is used as a base for many functional languages. Type-checking for HM is decidable and efficient, its complexity is linear in the size of the program. Type inference is more difficult than type checking, since the compiler has to solve the type constraints incurred for expressions in the program. It is <strong>decidable</strong> for the HM type system, however the problem itself is PSPACE-hard and EXPTIME-complete, meaning that in the worst case it needs at least a polynomial amount of extra space and exponential time relative to the input size. Fortunately, the type inference algorithm is linear when the nesting depth of polymorphic variables is bounded, as is the case for most applications. There exist many type inference algorithms, the best known one is the so-called <em>Algorithm W</em>. Many functional programming languages implement variants of the HM type system.</p>
<h3 id="dependent-types">Dependent types</h3>
<p>In simple terms dependent types allow types to depend not only on other types, but on values. This is best understood by an example: Normally we can only encode very coarse information, such as “x is of integer type”. Dependent types allow us to define more detailed types. For example we could then create a type “even integer”, whose only inhabitants are even integers. This is strictly more powerful than the previous setting: Dependent types in general make type inference undecidable as <a href="https://link.springer.com/content/pdf/10.1007%2FBFb0037103.pdf">can be shown</a> by reduction to the <a href="https://en.wikipedia.org/wiki/Post_correspondence_problem">Post Correspondence Problem</a>.</p>
<h2 id="how-hard-is-type-checking-your-favorite-language">How Hard is Type-Checking Your Favorite Language?</h2>
<p>Below I’ve compiled a list of languages and how hard type checking/type inference is in these languages. If you find a mistake or are missing a language please <a href="https://todo.sr.ht/~bfiedler/typing-is-hard">file an issue</a> with your language, its type checking complexity and optimally a resource that supports your claim. I do not claim completeness nor correctness of the properties shown here, it is mainly an amalgamation of blog posts I’ve been collecting.</p>
<h3 id="c">C++</h3>
<p><strong>undecidable</strong>, <a href="https://rtraba.files.wordpress.com/2015/05/cppturing.pdf">C++ Templates are Turing Complete, Todd L. Veldhuizen (2003)</a>, even its <a href="https://medium.com/@mujjingun_23509/full-proof-that-c-grammar-is-undecidable-34e22dd8b664">grammar is undecidable</a></p>
<h3 id="c-1">C#</h3>
<p><strong>unsound, undecidable</strong>, there is an <a href="https://stackoverflow.com/questions/23939168/is-c-sharp-type-system-sound-and-decidable">excellent SO answer by Eric Lippert</a> on this topic. Other fun things include <a href="https://docs.microsoft.com/en-us/archive/blogs/ericlippert/lambda-expressions-vs-anonymous-methods-part-five">a SAT solver using the C# type-checker</a></p>
<h3 id="elm">Elm</h3>
<p><strong>decidable</strong>, uses Hindley-Milner, but currently <strong>unsound</strong>, due to an <a href="https://github.com/elm/core/issues/960">interesting compiler bug</a>: <code>(String.length &quot;  &quot;) ^ (-1) : Int</code></p>
<h3 id="f">F#</h3>
<p><strong>undecidable</strong>, <a href="https://github.com/cannorin/FSharp.TypeLevel">GitHub user cannorin implemented the untyped lambda calculus in F#</a></p>
<h3 id="go">Go</h3>
<p><strong>decidable</strong>, since type inference is only used for variable initialization</p>
<h3 id="haskell">Haskell</h3>
<p>1998 standard, no extensions: <strong>decidable</strong>, variant of HM</p>
<p>2010 standard, no extensions: <strong>decidable</strong>, restriction of System F</p>
<p>with sufficient<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a> extensions: <strong>undecidable</strong>, as type checking in System F is undecidable. There exist <a href="https://www.lochan.org/keith/publications/undec.html">Turing Machines in Haskell types</a>. A simpler variant is to implement the <a href="https://mail.haskell.org/pipermail/haskell/2006-August/018355.html">SKI calculus</a>.</p>
<h3 id="idris">Idris</h3>
<p><strong>decidable</strong>, surprisingly. Idris has dependent types, but <a href="https://docs.idris-lang.org/en/latest/faq/faq.html#evaluation-at-the-repl-doesn-t-behave-as-i-expect-what-s-going-on">at compile time it will only evaluate expressions which it knows to be total (terminating and covering all inputs)</a>.</p>
<h3 id="java">Java</h3>
<p><strong>undecidable</strong>, because <a href="https://arxiv.org/pdf/1605.05274.pdf">Java Generics are Turing complete</a>. Java 5 or later is <strong>unsound</strong>, as shown by <a href="https://dl.acm.org/doi/pdf/10.1145/2983990.2984004">Amin and Tate (2015)</a></p>
<h3 id="ocaml">OCaml</h3>
<p><strong>undecidable</strong>, since we can <a href="https://caml.inria.fr/pub/old_caml_site/caml-list/1507.html">cause the type checker to loop infinitely</a></p>
<h3 id="ml">ML</h3>
<p><strong>decidable</strong>, uses Hindley-Milner</p>
<h3 id="rust">Rust</h3>
<p><strong>undecidable</strong>, both type inference (since Rust has rank-3-types) and type checking, as shown by this <a href="https://sdleffler.github.io/RustTypeSystemTuringComplete/">Smallfuck interpreter implemented using traits</a>.</p>
<h3 id="scala">Scala</h3>
<p><strong>undecidable</strong>, since it admits a <a href="https://michid.wordpress.com/2010/01/29/scala-type-level-encoding-of-the-ski-calculus/">type-level SKI calculus</a>, <strong>unsound</strong>, as shown by <a href="https://dl.acm.org/doi/pdf/10.1145/2983990.2984004">Amin and Tate (2015)</a>. Scala 2.13.3 (newest as of writing this) also exhibits the same problem.</p>
<h3 id="swift">Swift</h3>
<p><strong>undecidable</strong>, as proven <a href="https://forums.swift.org/t/swift-type-checking-is-undecidable/39024">here</a> by reduction to the <a href="https://en.wikipedia.org/wiki/Word_problem_for_groups">word problem for finitely generated groups</a><a href="#fn2" class="footnote-ref" id="fnref2"><sup>2</sup></a></p>
<h3 id="typescript">TypeScript</h3>
<p><strong>undecidable, unsound</strong>. The <a href="https://www.typescriptlang.org/docs/handbook/type-compatibility.html#a-note-on-soundness">TypeScript documentation</a> mentions its unsoundness and the <a href="https://www.typescriptlang.org/docs/handbook/type-compatibility.html#function-parameter-bivariance">motivations</a> <a href="https://www.typescriptlang.org/docs/handbook/type-compatibility.html#optional-parameters-and-rest-parameters">behind</a> them.</p>
<p>Undecidability: TypeScript’s type system was <a href="https://gist.github.com/hediet/63f4844acf5ac330804801084f87a6d4">proven Turing complete</a> until <a href="https://github.com/microsoft/TypeScript/issues/14837">they disallowed self-referential types</a>. However <a href="https://ostro.ws/2019/12/09/taking-types-too-far/">Robbie Ostrow</a> wrote a <a href="https://www.typescriptlang.org/play/#code/FAFwngDgpgBAcgQxAVwE4IDYwLwwN4wSpQBuAXPEmpjAL6iSwBaUqA9jvoceTAHalWdYA2gwAlgGcW7AIKIU6DHGQBbAEZDcMjlAAeIKHwAmkyopoB+GCFTJYFAGaZJUEeDEBRAI7IXAHlkYfUMTMwVqDAAaGAAhYIMjU3NIgD4cYBgsmCCQpLN4y0zs+LywnJgi7Js7B2Ks5wxXGDJ6mEbXd0YYAGV-ABUE0OSIpXTcAiJSCkH6UVgAeQFOPp1U+Zh+gHcOXD6lqHWPWH6AC2JYPYGdo+6AMTY0FYHzqEONu-ESS97-B7RbmIeuI9M9Pt9AbAeoI+M9gXpITBPOIAOanEBwmGIuDiZZXeHrAD0hJgADpyTAECYYJIOGxYV0xAAFAZDfIpMacSY8GYAbQARFMSPyALoCoWi4QbWTGYyBNnlUaYGKlRKKqiciZtAAMFFkUTaAEYKH0ZXL9TAWbFUjbgLRearhmYdJUYNqWjBDSLGVDkOpbAgAMYgeVlEYa5VxBXhiwYcb4HV6g3VY0VMPO1gcawCb6oD09P0B4P+FmyVIxK22+2O9ku6zuihen0wACyyAwIHEEAwYFDapjkRV0fCEbjXMTMB0yeyqbN-jbHa7PcCFf81vLOXW1eHk8zrobnu9GwAIl9xMYoH2nRzIzX1bH43gJznWNOsrOd3Xd+x8-5TyRz0vAt-XQYtSw3SsN3XO0HU-Pd6w9JsNk8b4+FZdMbzHLVql1fhBFQN9PRmOCf2sWx7A9QYMIOV0OgcJFUJLJj+htLdeSo-sR1jV1DQ9bUj2OGAAGE2AwDAkAAL3QzjMMfCdyKgQjUxEsTJP8FCjAGdIMIUyo2mqf9AIGGJtjYdIKGJGAAElHGCVCYmMM8LxgdQwBsHZ9OyPoF07bte36EzXkOdJLIWEBTlYLYpEUmBVHbXyexctyAGZKWpBBZU9GCfD8JpjJgA5tJk3SEMbEUYFCgRrH6WoYjYcLIui4TRPEkApJY5tA143AVNaqTCoqkldIACkNABKDZAwAJk4Xq1NMkKhtqGBhpmgBadJxsm1Kepa+agsWpFUHYVAZm6XFJBAKlOyQcR6QkMx9EDKBJEkL4oESi8oAgNLjEINhXvEdREtxRxcXEQxSRWmqKPEWytlgQN6TBlE0FgBqYAvCBwpsDhNBgcTUBRVgAG4VtSjbPXdSmAFYYEpw0ADZ6fSAAOFmYAAFg59bNrGibBMDbndtUtq-keVBDpG7nKd5z0Be6QN2ZFvr1NRdEpeW4b2cpmX0jl8agA">program checking the Collatz conjecture</a>, and as the <a href="http://people.cs.uchicago.edu/%7Esimon/RES/collatz.pdf">generalized form of the Collatz conjecture is undecidable</a><a href="#fn3" class="footnote-ref" id="fnref3"><sup>3</sup></a>, the TypeScript type system is undecidable as well.</p>
<h3 id="zig">Zig</h3>
<p><strong>undecidable</strong>, since evaluation of recursive functions at compile time is possible, thus requiring the compiler to solve the halting problem.</p>
<h2 id="faq">FAQ</h2>
<h3 id="where-are-python-bash-etc.">Where are Python, Bash, etc.?</h3>
<p>Type checking and type inference work primarily for statically typed languages. While there exist extensions to some dynamic languages imbuing them with static type checking these are not part of the language, and the complexity depends not on the language but the extension.</p>
<h3 id="what-about-unsafe-casts">What about unsafe casts?</h3>
<p>Some languages offer explicit unchecked casts, which are accepted by the type checker and may potentially fail at runtime. Examples are casting from <code>Object</code> to some subclass in C# and Java, casting values of type <code>interface{}</code> in Go or using <code>unsafeCoerce</code> in Haskell. I’ve chosen not to account for such casts/coercions since unchecked downcasting is an inherently unsafe operation not covered by type checker guarantees.</p>
<h3 id="ive-spotted-a-mistakeimprecision-what-should-i-do">I’ve spotted a mistake/imprecision, what should I do?</h3>
<p>Great work! Please report it on the <a href="https://todo.sr.ht/~bfiedler/typing-is-hard">official issue tracker</a> detailing what is wrong and I will try to fix it as soon as possible.</p>
<section class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Using only <code>RankNTypes</code> suffices to stump inference. Interestingly enough this is only undecidable for <code>N &gt;= 3</code>.<a href="#fnref1" class="footnote-back"></a></p></li>
<li id="fn2"><p>I think a reduction to the PCP-problem is also possible.<a href="#fnref2" class="footnote-back"></a></p></li>
<li id="fn3"><p>Thanks to reddit user /u/nckl for pointing that out.<a href="#fnref3" class="footnote-back"></a></p></li>
</ol>
</section>
</body>
</html>