A doc/presentation/epsg-28992.png => doc/presentation/epsg-28992.png +0 -0
A doc/presentation/plot-1000x1000.png => doc/presentation/plot-1000x1000.png +0 -0
A doc/presentation/presentation.org => doc/presentation/presentation.org +424 -0
@@ 0,0 1,424 @@
+#+TITLE: Basisregistratie Adressen en Gebouwen
+#+AUTHOR: Remco van 't Veer
+#+DATE: 2022-10-12
+#+OPTIONS: ':t ^:nil toc:nil
+
+# Use ox-beamer to export
+
+#+ATTR_LATEX: :height .66\linewidth
+[[file:plot-1000x1000.png]]
+See also: [[file:doc/plot.sh]]
+
+
+* Hi, I am Remco
+
+- I like writing software
+
+- I don't like reading documentation
+
+
+* Disclaimer
+
+- I am not a BAG expert
+
+- I am not a GIS nerd
+
+
+* What's BAG?
+
+- BRA: Basisregistratie Adressen
+
+ (base registry of addresses)
+
+- BGR: Basisgebouwenregistratie
+
+ (base registry of buildings)
+
+- BRA + BGR = BAG
+
+- BAG: Basisregistratie Adressen en Gebouwen
+
+ (base registry of addresses and buildings)
+
+- LV BAG: Landelijke Voorziening BAG
+
+ (national facility BAG)
+
+
+* Objects
+
+- NUM: Nummeraanduiding
+
+ (number designation)
+
+- OPR: Openbareruimte
+
+ (public space)
+
+- WPL: Woonplaats
+
+ (city)
+
+- VBO: Verblijfsobject
+
+ (residence)
+
+- PND: Pand
+
+ (building)
+
+- LIG: Ligplaats
+
+ (berth for a houseboat etc.)
+
+- STA: Standplaats
+
+ (pitch for a caravan etc.)
+
+
+* Model
+
+#+begin_src dot :file "presentation-model.png"
+ digraph {
+ NUM -> OPR [label="1"];
+ OPR -> WPL [label="1"];
+ NUM -> WPL [label="0..1"];
+ VBO -> NUM [label="1"];
+ VBO -> PND [label="1..*"];
+ LIG -> NUM [label="1"];
+ STA -> NUM [label="1"];
+ }
+#+end_src
+
+#+ATTR_LATEX: :height .75\linewidth
+#+RESULTS:
+[[file:presentation-model.png]]
+
+
+* Nevenadressen
+
+Primary and secondary addresses.
+
+#+begin_src dot :file "presentation-neven.png"
+ digraph {
+ NUM -> OPR [label="1" color=grey];
+ OPR -> WPL [label="1" color=grey];
+ NUM -> WPL [label="0..1" color=grey];
+ VBO -> NUM [label="1" color=grey];
+ VBO -> PND [label="1..*" color=grey];
+ LIG -> NUM [label="1" color=grey];
+ STA -> NUM [label="1" color=grey];
+ VBO -> NUM [label="0..* (neven)" color=red];
+ LIG -> NUM [label="0..* (neven)" color=red];
+ STA -> NUM [label="0..* (neven)" color=red];
+ }
+#+end_src
+
+#+ATTR_LATEX: :height .75\linewidth
+#+RESULTS:
+[[file:presentation-neven.png]]
+
+
+* Document based
+
+Each record holds the state of an object, including its validity period.
+
+
+* Number of records
+
+| Object | All | Active |
+|--------+-------+--------|
+| NUM | 12.0m | 11.5m |
+| OPR | 343k | 337k |
+| WPL | 3.8k | 3.7k |
+| VBO | 21.3m | 20.7m |
+| PND | 29.4m | 20.5m |
+| LIG | 17.7k | 16.6k |
+| STA | 50.1k | 44.5k |
+|--------+-------+--------|
+| | | |
+
+Active = current (validity period of record) *AND* an "active" status
+(not revoked, demolished etc.)
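+
+A minimal sketch of how such a flag could be derived for ~pand~; the
+column names (~begin_geldigheid~, ~eind_geldigheid~, ~status~) and the
+status values are assumptions, not taken from the actual schema:
+
+#+begin_src sql
+-- Hypothetical columns: keep records whose validity period covers
+-- today AND whose status does not mark the object as gone.
+SELECT COUNT(*)
+FROM pand
+WHERE begin_geldigheid <= CURRENT_DATE
+  AND (eind_geldigheid IS NULL OR eind_geldigheid > CURRENT_DATE)
+  AND status NOT IN ('Pand gesloopt',
+                     'Niet gerealiseerd pand',
+                     'Pand ten onrechte opgevoerd');
+#+end_src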
+
+
+* Living data
+
+- Bad data:
+
+ #+begin_example
+ SELECT COUNT(*)
+ FROM verblijfsobject
+ WHERE actief AND oppervlakte = 1;
+ count
+ -------
+ 12512
+ #+end_example
+
+- More bad data:
+
+ #+begin_example
+ SELECT COUNT(*)
+ FROM pand
+ WHERE actief AND oorspronkelijk_bouwjaar = 9999;
+ count
+ -------
+ 27
+ #+end_example
+
+- Anecdotal: I have lived in my house since 2012, but
+ "oorspronkelijk_bouwjaar" in BAG is 2014
+
+
+* Oversharing
+
+#+begin_example
+SELECT COUNT(*)
+FROM nummeraanduiding
+WHERE actief AND postcode IS NULL;
+ count
+--------
+ 983370
+#+end_example
+
+- 3199KM 5
+
+# [[https://www.openstreetmap.org/?mlat=51.96420149343773&mlon=3.9626308066261027#map=20/51.96420149343773/3.9626308066261027]]
+
+- 9143NZ 1
+
+# https://www.openstreetmap.org/?mlat=53.45932339542564&mlon=6.055841821360722#map=20/53.45932339542564/6.055841821360722
+
+
+* History of project
+
+- NLExtract CSV
+
+- Ruby -> Elixir -> Clojure
+
+- NLExtract -> geotoko.nl (€€€)
+
+- Clojure (BAG 1.0)
+
+- Clojure (BAG 2.0)
+
+
+* Source
+
+- Municipalities record BAG data
+
+- Collected by Kadaster into LV BAG
+
+ (Landelijke Voorziening BAG / national facility BAG)
+
+- BAG Extract downloadable from PDOK
+
+ (Publieke Dienstverlening Op de Kaart)
+
+
+* Documentation
+
+- Catalogus BAG 2018
+
+ https://www.kadaster.nl/-/bag-catalogus-basisregistraties-adressen-en-gebouwen
+
+- Praktijkhandleiding BAG
+
+ https://imbag.github.io/praktijkhandleiding/
+
+- XSDs are excellent
+
+ https://www.kadaster.nl/zakelijk/producten/adressen-en-gebouwen/bag-2.0-extract
+
+
+* Code overview
+
+- updater
+
+- importer
+
+- sanity check
+
+- API
+
+
+* Updater
+
+- Atom feed (updated on the 8th of every month)
+
+
+* Importer
+
+- src/straatnaam/data.clj
+
+ - updater
+
+ - sanity check
+
+- src/straatnaam/lvbag.clj
+
+ - extract
+
+ - parse
+
+ - SQL
+
+
+* A zip of zips
+
+#+begin_example
+Archive: lvbag-extract-nl.zip
+ Length Date Time Name
+ --------- ---------- ----- ----
+ 788 09-08-2022 21:37 Leveringsdocument-BAG-Extract.xml
+ 50051 09-08-2022 21:37 GEM-WPL-RELATIE-08092022.zip
+ 1550578 09-08-2022 21:34 9999LIG08092022.zip
+ 17297535 09-08-2022 21:34 9999NietBag08092022.zip
+ 3577291 09-08-2022 21:34 9999STA08092022.zip
+ 983923925 09-08-2022 21:34 9999VBO08092022.zip
+ 20923408 09-08-2022 21:35 9999WPL08092022.zip
+1677454980 09-08-2022 21:35 9999PND08092022.zip
+ 310004529 09-08-2022 21:37 9999NUM08092022.zip
+ 114721 09-08-2022 21:37 9999Inactief08092022.zip
+ 10742024 09-08-2022 21:37 9999OPR08092022.zip
+ 110996158 09-08-2022 21:37 9999InOnderzoek08092022.zip
+ --------- -------
+3136635988 12 files
+#+end_example
+
+
+* Zips of XML
+
+#+begin_example
+Archive: lvbag-extract-nl/9999PND08092022.zip
+ Length Date Time Name
+--------- ---------- ----- ----
+ 14795612 09-08-2022 11:41 9999PND08092022-000001.xml
+ 14668254 09-08-2022 11:41 9999PND08092022-000002.xml
+ 14715853 09-08-2022 11:41 9999PND08092022-000003.xml
+....
+ 14142790 09-08-2022 18:07 9999PND08092022-002049.xml
+ 14896460 09-08-2022 18:07 9999PND08092022-002050.xml
+ 10306002 09-08-2022 18:07 9999PND08092022-002051.xml
+--------- -------
+30403929905 2051 files
+#+end_example
+
+
+* XML
+
+- 10000 records per XML
+
+- clojure.data.xml/source-seq event stream
+
+- insert into Postgres in batches
+
+
+* One table per object type
+
+One-to-many relations (like nevenadres) go into link tables, built
+using ~generate_subscripts~ and the Postgres array datatype.
+
+#+begin_example
+INSERT INTO verblijfsobject_neven
+SELECT id, a[i] FROM (
+  SELECT
+    x.id AS id,
+    x.neven_ids AS a,
+    generate_subscripts(x.neven_ids, 1) AS i
+  FROM verblijfsobject x
+) AS s
+#+end_example
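+
+What ~generate_subscripts~ does can be tried on inline data; a sketch
+in which the ids and the aliases ~t~ and ~s~ are made up:
+
+#+begin_src sql
+-- One output row per array element: generate_subscripts yields the
+-- indexes 1..n of dimension 1, and a[i] picks the element at index i.
+SELECT id, a[i] AS neven_id
+FROM (
+  SELECT id, a, generate_subscripts(a, 1) AS i
+  FROM (VALUES (1, ARRAY[10, 11]), (2, ARRAY[20])) AS t (id, a)
+) AS s
+ORDER BY id, i;
+-- rows: (1, 10), (1, 11), (2, 20)
+#+end_src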
+
+
+* Geometry
+
+- Rijksdriehoekscoördinaten aka EPSG:28992
+
+ (national triangulation coordinates)
+
+#+ATTR_LATEX: :height .33\linewidth
+[[file:epsg-28992.png]]
+
+- PostGIS to the rescue
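+
+A sketch of the conversion PostGIS makes possible, using the RD
+coordinates of the sample address later in this presentation; that
+WGS 84 (EPSG:4326) is the target system is an assumption:
+
+#+begin_src sql
+-- ST_MakePoint takes (x, y), ST_SetSRID tags the point as EPSG:28992,
+-- and ST_Transform reprojects it to longitude/latitude.
+SELECT ST_X(p) AS lengtegraad, ST_Y(p) AS breedtegraad
+FROM (
+  SELECT ST_Transform(
+           ST_SetSRID(ST_MakePoint(117021, 483978), 28992),
+           4326) AS p
+) AS s;
+#+end_src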
+
+
+* Sanity checks
+
+- Never trust data from the internet!
+
+- Import into a versioned schema (aka namespace)
+
+- Record counts and lookups of some railway stations
+
+
+* Bag view
+
+| Column | Type |
+|----------------------+------------------------|
+| id | numeric |
+| openbareruimte | character varying(120) |
+| huisnummer | integer |
+| huisletter | character varying(1) |
+| huisnummertoevoeging | character varying(4) |
+| postcode | character varying(6) |
+| woonplaats | character varying(80) |
+| x | double precision |
+| y | double precision |
+| gebruiksdoel | character varying(24) |
+| nevenadres | boolean |
+| object_type | character varying(15) |
+| object_id | numeric |
+| lengtegraad | double precision |
+| breedtegraad | double precision |
+
+
+* Eerlijke WOZ extras
+
+#+begin_src sql
+SELECT
+ bag.*, vbo.oppervlakte, pnd.oorspronkelijk_bouwjaar
+FROM
+ bag
+LEFT JOIN
+ verblijfsobject vbo
+ON
+ vbo.id = bag.object_id
+LEFT JOIN
+ verblijfsobject_pand vbo_pnd
+ON
+ vbo_pnd.id = bag.object_id
+LEFT JOIN
+ pand pnd
+ON
+ vbo_pnd.pand_id = pnd.id
+#+end_src
+
+
+* Eerlijke WOZ address
+
+| id | 363200000520973 |
+| openbareruimte | Johan Huizingalaan |
+| huisnummer | 763 |
+| huisletter | A |
+| huisnummertoevoeging | |
+| postcode | 1066VH |
+| woonplaats | Amsterdam |
+| x | 117021 |
+| y | 483978 |
+| gebruiksdoel | industriefunctie |
+| nevenadres | f |
+| object_type | verblijfsobject |
+| object_id | 363010001036463 |
+| lengtegraad | 4.829889464736984 |
+| breedtegraad | 52.34240599219023 |
+| oppervlakte | 39570 |
+| oorspronkelijk_bouwjaar | 1969 |
+
+
+* Please contribute
+
+Discussion and patches to: [[mailto:~jomco/public-inbox@lists.sr.ht][~jomco/public-inbox@lists.sr.ht]]
+
+
+* Questions?
A doc/presentation/presentation.pdf => doc/presentation/presentation.pdf +0 -0