~skin/nrdl

70e8c74f64dc2014906bbbc8d973260e5fa1cd2f — Daniel Jay Haskin a month ago ac93afa
2-3 trees, interesting
5 files changed, 112 insertions(+), 158 deletions(-)

D 2
M com.djhaskin.pcoll.asd
A src/2-3-trees.lisp
M src/main.lisp
M src/utils.lisp
D 2 => 2 +0 -99
@@ 1,99 0,0 @@

2024-07-16T19:12:23-0600

The goal is to make a persistent collections library that:

* Respects the CL standard and implements, where possible, a drop-in replacement
  for functions and semantics everyone already knows, so that old code is not
  broken and new code is easily ported.
* Is in pure Common Lisp
* Is kinda fast
* But still totally persistent/functional, thus getting the benefits of that
  genre of data structures.
* Would be nice if the implementation was as simple as possible, to help people
  comprehend it and devs to debug it.
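
A minimal sketch of what "drop-in replacement" could look like, assuming a
package that shadows a standard function and falls back to the `cl:` version
for ordinary sequences. The package name `pcoll-sketch` and the `ptree` struct
are hypothetical stand-ins, not part of this library:

```lisp
(defpackage #:pcoll-sketch
  (:use #:cl)
  (:shadow #:length)
  (:export #:length #:make-ptree))

(in-package #:pcoll-sketch)

;; Hypothetical stand-in for a persistent tree; it only caches its size.
(defstruct ptree
  (size 0 :type (integer 0)))

(defun length (sequence)
  "Like CL:LENGTH, but also accepts a PTREE."
  (typecase sequence
    ;; Persistent collection: answer from the cached size.
    (ptree (ptree-size sequence))
    ;; Anything else: defer to the standard function, so old code keeps working.
    (otherwise (cl:length sequence))))
```

Callers who shadowing-import `pcoll-sketch:length` keep writing `(length x)`,
and lists and vectors behave exactly as before.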

Non-goals:

* Needless acrobatics
* Impractical but impressive asymptotic bounds ("Constant time", but what's the
  constant?)
* Your insensibilities.

Things that mass destructively modify:

Things that modify the original and therefore make no sense:
  - delete, delete-if, delete-if-not
  - nsubstitute-if, nsubstitute-if-not
  - fill
  - map-into
  - replace

Things that don't care what the sequence is, they just need one (and therefore
traversal needs to be kinda fast, hopefully somewhat locality-respecting):
  - map
  - reduce
  - COUNT-IF, COUNT-IF-NOT
  - find-if, find-if-not (??? Again, maybe could be fast?)
  - position-if, position-if-not (ditto)
  - substitute-if, substitute-if-not
  - REMOVE-IF, REMOVE-IF-NOT

Things that might be fast:
  - elt
  - copy-seq
  - make-sequence
  - subseq
  - count
  - length
  - reverse, "modify" nreverse (but not really)
  - sort(? maybe it's a "don't care what the seq is"), stable-sort
  - search (tries?)
  - find
  - position
  - mismatch (tries?)
  - substitute, nsubstitute
  - concatenate
  - merge
  - remove
  - remove-duplicates

This just in, CL has stuff for [`subst`](http://clhs.lisp.se/Body/f_substc.htm)
and [`tree-equal`](http://clhs.lisp.se/Body/f_tree_e.htm) that we should totally
just plug into. That way, we won't have to replace the standard; we can just
plug into it.

What else we got here


`tree-equal` is interesting but the trees _must be the same shape_ so that could
be prickly, just watch it
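
A quick illustration of the shape requirement, using plain cons trees and the
default `eql` test. Both trees in the second call hold the same leaves in the
same order, but they are consed differently:

```lisp
;; Same shape, same leaves: true.
(tree-equal '(1 (2 3)) '(1 (2 3)))

;; Same leaves in the same order, different shape: false.
(tree-equal '(1 (2 3)) '((1 2) 3))
```

So two persistent trees holding equal sequences would only compare equal this
way if they happened to be balanced identically.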

`copy-tree` but I thought we were trying to avoid doing this very thing

`sublis` for making substitutions, but it's a bigger subst.

Never mind, there's not much useful here.

And now back to your regularly scheduled program.


| Function            | IHTree    | RBTree    | Finger Tree | Random Access Skew |
|---------------------|-----------|-----------|-------------|--------------------|
| `elt`               | ???       | `O(lg n)` | `O(lg n)`   | `O(lg n)`          |
| `copy-seq`          | `O(1)`    | `O(1)`    | `O(1)`      | `O(1)`             |
| `make-sequence`     | `O(1)`    | `O(1)`    | `O(1)`      | `O(1)`             |
| `subseq`            | `O(lg n)` | `O(1)`    | `O(1)`      | `O(1)`             |
| `count`             |           |           |             |                    |
| `length`            | `O(1)`    | `O(1)`    | `O(1)`      | `O(1)`             |
| `reverse`           |           |           |             |                    |
| `sort` (ish)        |           | `(        |             |                    |
| `search` (tries?)   |           |           |             |                    |
| `find`              |           |           |             |                    |
| `position`          |           |           |             |                    |
| `mismatch`          |           |           |             |                    |
| `substitute`        |           |           |             |                    |
| `concatenate`       |           |           |             |                    |
| `merge`             |           |           |             |                    |
| `remove`            |           |           |             |                    |
| `remove-duplicates` |           |           |             |                    |

M com.djhaskin.pcoll.asd => com.djhaskin.pcoll.asd +3 -4
@@ 7,14 7,13 @@
               "trivial-features"
               )
  :components
    (
     (:module "src"
    ((:module "src"
      :serial t
      :components
      ((:file "utils")
       (:file "node")
       (:file "2-3-trees")
       (:file "main"))))
  :description "Nestable Readable Document Language"
  :description "Persistent Collections"
  :in-order-to (
                (test-op (test-op "com.djhaskin.pcoll/tests"))))


A src/2-3-trees.lisp => src/2-3-trees.lisp +66 -0
@@ 0,0 1,66 @@
;;;; 2-3-trees.lisp -- Functions and data types around 2-3 trees used in finger
;;;; trees.
;;;;
;;;; SPDX-FileCopyrightText: 2024 Daniel Jay Haskin
;;;; SPDX-License-Identifier: MIT

(in-package #:cl-user)

(defpackage #:com.djhaskin.pcoll/2-3-trees
  (:use #:cl #:com.djhaskin.pcoll/utils)
  (:documentation
    "
    2-3 Tree implementation for use in finger trees.
    ")
  (:import-from #:alexandria)
  (:export
    reduce-node))

(in-package #:com.djhaskin.pcoll/2-3-trees)



(defstruct node
  "
  A 2-3 tree node.
  
  This type has an alist of annotations, a list of children, and a value.

  As an invariant for the type, a node must either have a non-empty list of
  children or a defined value, but not both. Thus, the value is consulted only
  when the node has no children.

  The list of annotations is an alist of annotations on the node, such as size
  and sorting information like largest keys.
  "
  (value nil)
  (children nil)
  (annotations nil))

(defun reduce-node
    (function tree &key key from-end (start 0) end initial-value)
  "
  This function is called by the main `reduce` pcoll function, and implements
  the `reduce` sequence API function for 2-3 trees.
  "
  ;; KEY and END may be omitted, so their types must admit NIL.
  (declare (type node tree)
           (type (or null function) key)
           (type boolean from-end)
           (type (or null integer) end)
           (type integer start)
           (type t initial-value))

  ;; 1. Bound the tree
  (when (>= start (node-size tree))
    ;; DEFUN only establishes a block named after the function, so a bare
    ;; RETURN would have no NIL block to exit.
    (return-from reduce-node
      (single-reduce function unspecified initial-value)))
  (when (>= 

  (if (= 1 (node-height tree))
      ;; Then start is less than the node's number of children
      (cond ((= (node-size tree) 
      (if (large-node-p tree)



M src/main.lisp => src/main.lisp +35 -52
@@ 2,7 2,6 @@
;;;;
;;;; SPDX-FileCopyrightText: 2024 Daniel Jay Haskin
;;;; SPDX-License-Identifier: MIT
;;;;

#+(or)
(progn


@@ 18,65 17,49 @@
  (:documentation
    "
    Fast Persistent Collections in Common Lisp

    This package attempts to implement finger trees, not just _in_ Common Lisp,
    but _for_ Common Lisp.
    
    Specifically, it defines _drop-in replacement_ functions for Common
    Lisp Sequences:
    
    http://clhs.lisp.se/Body/c_sequen.htm
    
    Functions will work as normal, only they will accept finger trees in
    addition to vectors and lists. When they are called on finger trees, they
    will not make a full, non-sharing copy as the usual docs describe; the
    result will instead share structure with other trees. Callers can
    shadowing-import these, but they don't have to do so to use them. Either
    way, there will be \"no surprises\" in the API because everyone will be
    used to it already.
    
    As a bonus, it will be easy to implement this efficiently: we need some
    form of `typecase` anyway, since finger trees are polymorphic. When the
    sequence is not a finger tree, the `otherwise` clause simply passes the
    call down to the standard version of the function.

    Other functionality, such as finit
    
    This package doesn't implement functions in that API which destructively
    modify the sequence in-place; only functions which typically return a copy
    of the sequence are implemented.

    ")
    (:import-from #:alexandria)
    )

(in-package #:com.djhaskin.pcoll)

;;; The finger trees paper: http://www.staff.city.ac.uk/~ross/papers/FingerTree.pdf
;;; This library attempts to implement finger trees, not just _in_ Common Lisp,
;;; but _for_ Common Lisp.
;;;
;;; Specifically, we're going to define _drop-in replacement_ functions for Common
;;; Lisp Sequences:
;;;
;;; http://clhs.lisp.se/Body/c_sequen.htm
;;;
;;; Callers can shadowing-import these, but they don't have to do so to use
;;; them.
;;;
;;; Either way, there will be "no surprises" in the API because everyone will be
;;; used to it already.
;;;
;;; As a bonus, it will be easy to implement this efficiently: we need some
;;; form of `typecase` anyway, since finger trees are polymorphic. When the
;;; sequence is not a finger tree, the `otherwise` clause simply passes the
;;; call down to the standard version of the function.
;;;
;;; We can't do this with _every_ function in that API though, since some
;;; functions want to destructively modify the sequence and return-a-copy
;;; analogs for these functions already exist in the API or don't make sense in
;;; a persistent setting anyways.

;;; We must *actually* make structs for nodes,
;;; since we can't use vectors as nodes because what if we are storing vectors.

(defstruct (node)
  (left nil)
  (right nil)
  (size nil)
  (height 0))

(defstruct (large-node
             (:include node))
  (middle nil))


(defun reduce-node
    (function tree key from-end start end initial-value)
  (declare (type node tree))
  ;; 1. Bound the tree
  (when (>= start (node-size tree))
    (return-from reduce-node
      (single-reduce function 'unspecified initial-value)))
  (when (>= 
;;; push


  (if (= 1 (node-height tree))
      ;; Then start is less than the node's number of children
      (cond ((= (node-size tree) 
      (if (large-node-p tree)

;;;
;;; The finger trees paper: http://www.staff.city.ac.uk/~ross/papers/FingerTree.pdf
;;;
;;;  We must *actually* make structs for nodes,
;;;  since we can't use vectors as nodes because what if we are storing vectors.


  (when (>= 

M src/utils.lisp => src/utils.lisp +8 -3
@@ 7,16 7,21 @@
    Fast Persistent Collections in Common Lisp
    ")
    (:import-from #:alexandria)
  (:export
    unspecified)
    )

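(in-package #:com.djhaskin.pcoll/utils)

;; Defined after IN-PACKAGE so the variable lives in the utils package,
;; where SINGLE-REDUCE and the export list refer to it.
(defparameter unspecified 'unspecified)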

(declaim (inline single-reduce))

(defun single-reduce (function thing initial-value)
  (if (eq initial-value 'unspecified)
      (if (eq thing 'unspecified)
  (if (eq initial-value unspecified)
      (if (eq thing unspecified)
          (funcall function)
          thing)
      (if (eq thing 'unspecified)
      (if (eq thing unspecified)
          initial-value
          (funcall function initial-value thing))))