#+title: "Robust Clojure: The best way to handle nil"
#+date: April 17, 2017
#+tldr: Treat it as a type, like Nothing, not a value like null.

Large Clojure codebases can become nasty, just as in any other dynamic
language. Fortunately, Clojure isn't as problematic as some other
languages because it is partially inspired by ML. True, it doesn't have
static typing, but the way Clojure treats =nil= allows us to get very
close to the ML way.

** Maybe
   :PROPERTIES:
   :CUSTOM_ID: maybe
   :ID:       c58946b6-83a9-4b3e-9692-0288baac104e
   :END:

In Haskell and other ML-ish languages, the =Maybe= type represents
either =Nothing= or =Just <some_value>=. This becomes super useful when
you need to check if a thing exists and get the value of that thing at
the same time.

For example, doing this explicitly in Clojure is cumbersome:

#+BEGIN_EXAMPLE
    (def data {:a 1 :b 2})

    (if (= nil (:a data))
      0 ; default return value
      (:a data))
#+END_EXAMPLE

Checking for =nil= comes at a cost, of course. You're accessing the map
twice, and that's a lot of boilerplate code if you'll be doing this
often. In Haskell it's much cleaner:

#+BEGIN_SRC haskell
    -- let's pretend this is a function that Maybe returns an Int
    getSomeData :: Maybe Int

    -- call the function and handle the return value
    case getSomeData of
      Nothing -> 0 -- default return value, so both branches are Ints
      Just a  -> a
#+END_SRC

We can handle both the /getting/ of the value and the /returning/ of the
value in one fell swoop.

Clojure has a function that does basically the same thing. The =get=
function will return =nil= if a key in a map isn't present:

#+BEGIN_SRC clojure
    (get {:a 1 :b 2} :a) ;=> 1
    (get {:a 1 :b 2} :c) ;=> nil
#+END_SRC

This is similar to our imaginary =getSomeData= function in the Haskell
snippet, except the =Just a= is implicit, so we don't have to extract
the value =1= every time.

** Maybe =nil=?
   :PROPERTIES:
   :CUSTOM_ID: maybe-nil
   :ID:       53d725f6-361e-4584-b6a1-b8823d51f130
   :END:

Practically speaking, =nil= is a value in Clojure because you can do
anything with it that you can do with any other value. This idea of
"everything is a value" (commonly expressed as "everything is data")
runs deep in the Lisp tradition and gives Lisp languages a lot of power.
But it also causes problems. Consider:

#+BEGIN_SRC clojure
    (+ 1 (get {:a 1 :b 2} :c))
#+END_SRC

This will evaluate to =(+ 1 nil)=, which is nonsensical and throws a
=NullPointerException= on the JVM. You can't increase nothing by =1=; if
you try to, you just end up with more of nothing!
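
To see the failure concretely, here is a minimal sketch that catches the
exception instead of letting it escape. It assumes Clojure on the JVM,
where adding a number to =nil= throws a =NullPointerException=; the
helper name =add-one-to-missing= is made up for illustration.

#+BEGIN_SRC clojure
    (defn add-one-to-missing
      "Tries to add 1 to a key that is absent from the map.
      Returns :npe when the addition blows up (hypothetical helper)."
      []
      (try
        (+ 1 (get {:a 1 :b 2} :c))
        (catch NullPointerException _
          :npe)))

    (add-one-to-missing) ;=> :npe
#+END_SRC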

** The Right Way
   :PROPERTIES:
   :CUSTOM_ID: the-right-way
   :ID:       2f2b6a3a-7b41-4519-8e0f-aa1afd22d4aa
   :END:

The simple fix is to check for =nil= just like you would check for
=Nothing=. Clojure provides the
[[https://clojuredocs.org/clojure.core/if-some][=if-some=]] macro to
make this more concise:

#+BEGIN_SRC clojure
    (if-some [it (get {:a 1 :b 2} :c)]
      (+ 1 it)
      nil)
#+END_SRC

Which is more-or-less the same in Haskell:

#+BEGIN_SRC haskell
    case getSomeData of
      Nothing -> Nothing
      Just it -> Just (1 + it)
#+END_SRC

If you remember to write all of your Clojure code like this, your
codebase will become much more robust to =nil=-related errors!

*To sum up: Always treat =nil= as if it means =Nothing=.*

If you're an intermediate Clojure programmer, then you're probably
already familiar with =if-let=/=if-some= and perhaps not impressed. The
big idea, however, is the treatment of =nil= as a type, and not as a
value, which is a subtle but important point.
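
The difference between =if-let= and =if-some= is easy to miss, so here is
a small sketch of it. The =settings= map and its keys are invented for
this example; the point is only that =if-let= tests truthiness while
=if-some= tests for =nil=.

#+BEGIN_SRC clojure
    (def settings {:verbose false}) ; hypothetical map with a present-but-false value

    ;; if-let tests truthiness, so a false value takes the else branch
    (if-let [v (:verbose settings)] [:found v] :absent)  ;=> :absent

    ;; if-some only tests for nil, so false counts as a present value
    (if-some [v (:verbose settings)] [:found v] :absent) ;=> [:found false]

    ;; a missing key is nil, which both forms treat as absent
    (if-some [v (:debug settings)] [:found v] :absent)   ;=> :absent
#+END_SRC

If =nil= is supposed to mean =Nothing=, then =if-some= is arguably the
closer match, because =false= is still a value.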

To avoid these errors once and for all, you need to stop thinking about
=nil= as a /value/. Yes, that is how Clojure treats =nil=, but that
doesn't mean that you, the programmer, must treat it as a value too. If
you come from Java or C, which represent the absence of a value as the
=null= value, then you'll have to update your mental model.

*Realize:* the concept of absence refers to a /type of thing/, not a
/value/.

While you are writing code, you should be thinking, "Is the type of this
thing /always/ an =int=, or could it be =nil=?" When doing numerical or
statistical programming, you can probably guarantee that your
algorithms return a number type. However, when you start
working with networking, or databases, or certain Java libraries, you
often lose the guarantee that functions will return concrete values
(network is down, database exploded, etc.), and then you must think
about =nil=.

** Clojure Idioms and Category Theory
   :PROPERTIES:
   :CUSTOM_ID: clojure-idioms-and-category-theory
   :ID:       27259dc1-181b-4715-ab16-c749ca4acea5
   :END:

In both Haskell and Clojure, manually checking for =nil=/=Nothing=
becomes tedious very fast, especially when you are chaining lots of
functions together. However, both languages have solutions for this:
Haskell has category theory, Clojure has idioms.

In Haskell, the "bind" operator is defined basically like this:

#+BEGIN_SRC haskell
    (>>=) m g = case m of
      Nothing -> Nothing  -- if m is Nothing, just return Nothing
      Just x  -> g x      -- otherwise, call the function g on the extracted value
#+END_SRC

Extending the above example, we can call =getSomeData= and increase it
by =1=, but only when it is even, with the following:

#+BEGIN_SRC haskell
    incIfEven :: Int -> Maybe Int
    incIfEven n =
      if even n
      then Just (n + 1)
      else Nothing

    getSomeData >>= incIfEven
#+END_SRC

Clojure has a similar idiom. We use =some->>= to thread the map through
the rest of the functions. First extract a key if it exists, then lift
the value into a vector, so we can use all of the collection-related
functions on it. This allows us to =filter= and =map= over it to
transform the data as we see fit:

#+BEGIN_SRC clojure
    (some->> {:a 2 :b 3} :a vector (filter even?) (map inc) first) ;;=> 3
    (some->> {:a 2 :b 3} :b vector (filter even?) (map inc) first) ;;=> nil
    (some->> {:a 2 :b 3} :c vector (filter even?) (map inc) first) ;;=> nil
#+END_SRC

/Voilà!/ You get the compactness of Haskell, without the overhead of
category theory :)

I kid! Category theory is great. The "bind" operator (=>>==) is very
similar to =some->>= because they both take a value from one monad and
"shove" it into the next monad. (If you have no idea what a "monad" is,
replace "monad" with "thing" and re-read that sentence.) In Haskell, the
monad is the =Maybe= type; in Clojure, the monad is implicit in the
collection interface, which is the unifying abstraction in the language.
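
To make the short-circuiting concrete, here is a rough sketch that uses
=some->= on a nested lookup. The =users= map, its keys, and the
=email-of= helper are all invented for this example; the last form is
left commented out because, on the JVM, it is expected to throw.

#+BEGIN_SRC clojure
    (require '[clojure.string :as str])

    ;; hypothetical data: :bob has no email, :eve does not exist at all
    (def users {:ann {:email "Ann@Example.com"}
                :bob {}})

    (defn email-of
      "Returns the lower-cased email for user-id, or nil if any step is nil."
      [user-id]
      (some-> users (get user-id) :email str/lower-case))

    (email-of :ann) ;=> "ann@example.com"
    (email-of :bob) ;=> nil
    (email-of :eve) ;=> nil

    ;; with plain -> the nil reaches str/lower-case, which throws on the JVM:
    ;; (-> users (get :bob) :email str/lower-case)
#+END_SRC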

** Clojure's Most Under-Appreciated Function
   :PROPERTIES:
   :CUSTOM_ID: clojures-most-under-appreciated-function
   :ID:       15cc2567-6994-483a-80af-f7203e413431
   :END:

On IRC, [[http://technomancy.us/][technomancy]] mentioned he was
surprised =fnil= wasn't in this article. I admit that I completely
forgot about =fnil=, but it's extremely useful.

=fnil= can be used in our example above like so:

#+BEGIN_EXAMPLE
    (def safe-inc (fnil inc 0))

    (safe-inc (get {:a 1 :b 2} :b)) ;=> 3
    (safe-inc (get {:a 1 :b 2} :c)) ;=> 1
#+END_EXAMPLE

In the above snippet, =safe-inc= is a function just like =(+ 1 x)= in
the earlier example, except if =x= is =nil=, then =safe-inc= will use
=0= as a default value instead. More (better) examples are available at
[[https://clojuredocs.org/clojure.core/fnil][ClojureDocs]].
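
One place where =fnil= tends to pay off is =update= on a map whose keys
may not exist yet. The following is a minimal sketch of that pattern;
=count-words= is a hypothetical helper, not something from the original
example.

#+BEGIN_SRC clojure
    (defn count-words
      "Counts occurrences of each word (hypothetical helper)."
      [words]
      (reduce (fn [counts word]
                ;; the first time a word is seen its count is nil,
                ;; so fnil substitutes 0 before inc runs
                (update counts word (fnil inc 0)))
              {}
              words))

    (count-words ["a" "b" "a"]) ;=> {"a" 2, "b" 1}
#+END_SRC

Without =fnil=, the first =inc= on a missing key would receive =nil= and
fail in the same way as =(+ 1 nil)=.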

=fnil= isn't talked about much in the Clojure community, but it is a
handy function. Use it whenever you aren't sure if a variable is =nil=
but you do know what the value /should/ be. In fact, the entire problem
of =nil= isn't discussed much at all, but it is a very important issue,
one that the Clojure community should be aware of. Hopefully this
article will at least make you aware of the problems with =nil=, and
start you down the path of thinking critically about =nil= on your own.

** We Still Have Problems
   :PROPERTIES:
   :CUSTOM_ID: we-still-have-problems
   :ID:       2df90330-1699-4c0e-889f-f8415077bdd1
   :END:

The biggest problem is that this practice of explicit =nil= handling is
a /convention/: the only thing enforcing it is your habits, and we all
know that we mere humans are fallible. Haskell's approach to =Nothing=
is thus superior because the compiler checks your work automatically,
which is nice.

A second problem is =nil= itself, which is a problem with any
dynamically-typed language. Unforeseen =nil= values can bubble up the
stack and cause a lot of headaches. One solution is to use a monad library
(discussed below), but more often than not, in everyday Clojure code, a
monad library is unnecessary.

The core problem is one of language design. Like I said above, Clojure
treats =nil= as a value, when in reality, the concept of absence refers
to a /type/: intuitively, we say "absence of a value" just like we say
"an integer of 5". Clojure, as a Lisp, made the choice to keep types an
evaluable construct, so they could be modified at runtime, instead of a
construct of compilation like Haskell. By choosing Clojure over Haskell,
you are choosing the power of metaprogramming, but with that comes the
drawbacks of dynamic typing.

The best solution to this that I've found (in dynamically-typed
languages) is to follow the single-responsibility principle: each
function should do just one thing. Then spec that function and catch the
possible =nil=-causing inputs with auto-generated tests (this is an
article for another time). If you have other solutions, please email me
and I will add your contribution here :)
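
As a very rough sketch of what "spec that function" might look like,
the snippet below uses =clojure.spec.alpha= (Clojure 1.9 or later) to
state which arguments may be =nil=. The specs =::price= and
=::discount= and the function =final-price= are hypothetical. It only
validates arguments by hand; generative testing with
=clojure.spec.test.alpha/check= would additionally need the =test.check=
library on the classpath and is not shown here.

#+BEGIN_SRC clojure
    (require '[clojure.spec.alpha :as s])

    (s/def ::price number?)                ; must always be present
    (s/def ::discount (s/nilable number?)) ; explicitly allowed to be nil

    (s/valid? ::price 10)     ;=> true
    (s/valid? ::price nil)    ;=> false
    (s/valid? ::discount nil) ;=> true

    (defn final-price
      "Subtracts an optional discount from a required price."
      [price discount]
      {:pre [(s/valid? ::price price)
             (s/valid? ::discount discount)]}
      (- price (or discount 0)))

    (final-price 10 nil) ;=> 10
    (final-price 10 3)   ;=> 7
#+END_SRC

The design choice here is to make "may be =nil=" part of the function's
stated contract, so that a =nil= price fails at the boundary rather than
somewhere deeper in the call stack.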

** Other Solutions and =nil=-Punning
   :PROPERTIES:
   :CUSTOM_ID: other-solutions-and-nil-punning
   :ID:       3c72fa62-cb45-471c-a618-7c7fa3d7f7a6
   :END:

As
[[https://blog.skyliner.io/fourteen-months-with-clojure-beb8b3e4bf00][described
by Skyliner]], chaining =if-let='s together like this is annoying:

#+BEGIN_EXAMPLE
    (if-let [x (foo)]
      (if-let [y (bar x)]
        (if-let [z (goo x y)]
          (do (qux x y z)
              (log "it worked")
              true)
          (do (log "goo failed")
              false))
        (do (log "bar failed")
            false))
      (do (log "foo failed")
          false))
#+END_EXAMPLE

It's only mildly less annoying when using cats:

#+BEGIN_EXAMPLE
    (require '[cats.core :as m])
    (require '[cats.monad.either :as either])

    @(m/mlet [x (if-let [v (foo)]
                  (either/right v)
                  (either/left))

              y (if-let [v (bar x)]
                  (either/right v)
                  (either/left))

              z (if-let [v (goo x y)]
                  (either/right v)
                  (either/left))]

      (m/return (qux x y z)))
#+END_EXAMPLE

The benefit with cats is you get fine-grained error handling for each
=left=. Read more about cats error handling and the
[[http://funcool.github.io/cats/latest/#either][Either type]].

If =some->= is out of the question, then personally I prefer the pattern
matching approach:

#+BEGIN_SRC clojure
    (require '[clojure.core.match :refer [match]]) ;; from the core.match library

    (match (foo) ;; pretend `foo` is a function that returns a map
      nil (log "foo failed")
      {:ms t :user u :data data} (do (log "User: " u " took " t " ms.")
                                     data))
#+END_SRC

The benefit is mostly the same as with =if-let=, but you can pattern
match on the return value and then jump right into the next function,
which I find myself doing quite a lot.

Of course you can always tighten this up by defining your own version of
"bind" or =some->= in Clojure:

#+BEGIN_SRC clojure
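  (defn >>= [m g]
    (if-let [x (m)] ;; `m` is a zero-argument function that may return nil
      (g x)
      (do (log (str m " failed")) ;; `name` would throw on a function, so print it as-is
          nil)))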
#+END_SRC

This is a (/very/) naïve implementation, but you get the idea. Modify to fit
your use-case.

On Reddit, tolitius [[https://www.reddit.com/r/Clojure/comments/65x15k/robust_clojure_the_best_way_to_handle_nil/dgenl5g/?utm_content=permalink&utm_medium=front&utm_source=reddit&utm_name=Clojure][suggested]] the use of =get='s optional third argument (which
I had forgotten about!) and =or=:

#+BEGIN_QUOTE
  =get= has a default value built in:

  #+BEGIN_EXAMPLE
      user=> (get {:a 1 :b 2} :b)
      2
      user=> (get {:a 1 :b 2} :c 0)
      0
  #+END_EXAMPLE

  hence

  #+BEGIN_EXAMPLE
      user=> (-> (get {:a 1 :b 2} :c 0) inc)
      1
  #+END_EXAMPLE

  In case this is a single op, such as =inc=, this would work as well:

  #+BEGIN_EXAMPLE
      user=> (-> (or nil 41) inc)
      42

      user=> (-> :c
                 {:a 1 :b 2}
                 (or 41)
                 inc)
      42
  #+END_EXAMPLE

  i.e. =or= is really handy for default values
#+END_QUOTE
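
Both tricks have edge cases worth knowing about. The following sketch
(with made-up maps) shows that the default passed to =get= only applies
when the key is missing, and that =or= falls back on any falsey value,
not just =nil=.

#+BEGIN_SRC clojure
    ;; the default only applies when the key is missing,
    ;; not when the key is present with a nil value
    (get {:a nil} :a 0) ;=> nil
    (get {:a nil} :b 0) ;=> 0

    ;; or falls back on any falsey value, so an explicit false is replaced too
    (or (get {:a false} :a) 41) ;=> 41
    (or (get {:a nil} :a) 41)   ;=> 41

    ;; some? distinguishes false from nil when that matters
    (let [v (get {:a false} :a)]
      (if (some? v) v 41)) ;=> false
#+END_SRC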

Over at Lispcast, Eric Normand argues for the [[http://www.lispcast.com/nil-punning]["=nil=-punning"]] approach, which is
fine. But I think this approach requires a confused notion of what
=nil=/=Nothing= actually means. According to Eric, =nil= is a type, a value, a
key in a map, a boolean, an empty =seq=. It seems to me that "=nil=-punning" is
really just "=nil=-confusion". It is much simpler to understand =nil= as
=Nothing=, i.e. the absence of a value (which is a type). That said,
=nil=-punning in practice ends up mostly the same as I describe above, so either
technique will work.
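
For readers who haven't met the term, here is a small sketch of what
=nil=-punning looks like in practice: many core functions accept =nil=
where a collection is expected and return something sensible. The
=describe= helper is made up for illustration.

#+BEGIN_SRC clojure
    (first nil)      ;=> nil
    (rest nil)       ;=> ()
    (count nil)      ;=> 0
    (seq [])         ;=> nil
    (conj nil 1)     ;=> (1)
    (get nil :a)     ;=> nil
    (assoc nil :a 1) ;=> {:a 1}
    (map inc nil)    ;=> ()

    ;; a common idiom that relies on (seq coll) returning nil
    ;; for both empty collections and nil itself
    (defn describe [coll]
      (if (seq coll) :non-empty :empty))

    (describe [])  ;=> :empty
    (describe nil) ;=> :empty
    (describe [1]) ;=> :non-empty
#+END_SRC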