~luxferre/Bopher-NG

b8c8db95402b716982d1c84635e3be00ae38f5da — Luxferre 1 year, 8 months ago f434320
Added first gmi2map version
3 files changed, 135 insertions(+), 1 deletions(-)

M bopher-ng.sh
M tools/README-tools.md
A tools/gmi2map.sh
M bopher-ng.sh => bopher-ng.sh +2 -0
@@ 15,6 15,8 @@
# - ability to force-download pages instead of viewing regardless of their type
# - displaying type 8 entries (as per RFC1436) as the telnet:// URI scheme

shopt -s extglob # enable extended pattern matching (just to be sure)

# save the command line parameters

START_HOST="$1"

M tools/README-tools.md => tools/README-tools.md +40 -1
@@ 44,7 44,7 @@ This tool processes every line of text from the standard input to fit exactly th

As `phlow.sh` does not know anything about hyphenation rules and doesn't do any heuristics, it just breaks at whatever whitespace is the closest to the page's right edge.

Example - fitting the previous paragraph into 30-character width and prepend every line with 5 spaces (CR characters in the output are not shown):
Example - fitting the previous paragraph into 30-character width and prepending every line with 5 spaces (CR characters in the output are not shown):
```
$ echo 'This tool processes every line of text from the standard input to fit exactly the given amount of characters, optionally adding some leading and/or trailing whitespaces **after** the reflow is done. It also replaces all LF line endings with CRLF in the output to achieve maximum compatibility with legacy clients.' | bash phlow.sh 30 5
     This tool processes every     


@@ 62,4 62,43 @@ $ echo 'This tool processes every line of text from the standard input to fit ex
     legacy clients.               
```

## `gmi2map.sh`

This tool converts Gemtext documents into browsable Gophermaps. It combines the functionality of both previous tools and adds link processing and triple-backtick removal on top of them (because in Gopherspace, all text is supposed to be preformatted anyway). Unlike `gopherinfo.sh`, it doesn't care about the maximum length of the source (the reflow logic is there if you need it), but it **does** expect the entire document on the standard input and, as such, **does** add the trailing dot at the end of the generated Gophermap. That's why, for beginners, this single script might be all you need to get started with Gopher publishing.

Note that the reflow logic in `gmi2map.sh` is off by default: you have to specify the page width yourself if you need it. The full usage syntax looks like this:
```
cat [file] | gmi2map.sh [page_width] [leading_spaces] [trailing_spaces] [placeholder_char]
```

That is, the options look like those of `phlow.sh` and `gopherinfo.sh` combined, but the leading and trailing spaces only apply to the reflow, which is itself off by default.
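
The placeholder character matters because every informational line in a Gophermap still carries selector, host and port fields, even though nothing reads them. A minimal sketch of how such a line is laid out, modelled on the `infofmt` string in `gmi2map.sh`; the `info_line` helper name is hypothetical and not part of the tools:

```bash
#!/bin/bash
# Hypothetical helper (not part of gmi2map.sh): print one Gophermap info line.
# Layout: 'i' + text, TAB, placeholder, TAB, placeholder, TAB, 0, CRLF
info_line() {
  local text="$1"
  local placeholder="${2:-;}" # same default placeholder as gmi2map.sh
  printf 'i%s\t%s\t%s\t0\r\n' "$text" "$placeholder" "$placeholder"
}

info_line 'See ya!'     # placeholder defaults to ';'
info_line 'See ya!' '-' # custom placeholder character
```

Piping the output through `cat -A` (GNU coreutils) is one way to make the tabs and the CRLF line endings visible.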

Example - imagine we have this rudimentary Gemtext document in the `example.gmi` file:
```
I write short Gemtext here.

I write a bit longer Gemtext here to showcase reflow capabilities of Bopher Tools' gmi2map.sh script, because it really is cool.

=> gopher://happynetbox.com:79/1luxferre Anyway, why not publish it on Gopher?
=> https://chronovir.us Or read my blog on Web...

See ya!
```

When running `cat example.gmi | bash gmi2map.sh 67`, this file translates to the following Gophermap:

```
$ cat example.gmi | bash gmi2map.sh 67
iI write short Gemtext here.                                            ;       ;       0
i                                                                       ;       ;       0
iI write a bit longer Gemtext here to showcase reflow capabilities      ;       ;       0
iof Bopher Tools' gmi2map.sh script, because it really is cool.         ;       ;       0
i                                                                       ;       ;       0
1Anyway, why not publish it on Gopher?  luxferre        happynetbox.com 79
hOr read my blog on Web...      URL:https://chronovir.us        ;       0
i                                                                       ;       ;       0
iSee ya!                                                                ;       ;       0
.
```

Note that this tool hasn't been tested against all possible edge cases yet, so I recommend always verifying the generated maps yourself. Still, it makes creating them a lot easier than doing it by hand.
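
To make the link handling in the example above more tangible, here is a minimal sketch of how a Gemtext link target could be turned into a Gophermap line. It follows the same rules as the script (default type `1`, default selector `/`, default port 70, and anything that isn't `gopher://` becomes an `h` entry with a `URL:` selector), but the `link_line` function and the `example.org` host are hypothetical stand-ins, not the script's actual code:

```bash
#!/bin/bash
# Hypothetical helper (not part of gmi2map.sh): turn a URL and a description
# into a single Gophermap line terminated by CRLF.
link_line() {
  local url="$1" desc="$2"
  local placeholder="${3:-;}" # filler for the unused host field of 'h' entries
  if [[ "$url" == gopher://* ]]; then
    local rest="${url#gopher://}"     # drop the scheme
    local hostport="${rest%%/*}"      # host[:port], up to the first slash
    local path="${rest#"$hostport"}"  # "/<type><selector>", possibly empty
    local host="${hostport%%:*}"
    local port="${hostport:$(( ${#host} + 1 ))}" # text after "host:", if any
    local type="${path:1:1}"          # the character right after the slash
    local sel="${path:2}"             # everything after the type character
    printf '%s%s\t%s\t%s\t%d\r\n' "${type:-1}" "$desc" "${sel:-/}" "$host" "${port:-70}"
  else
    printf 'h%s\tURL:%s\t%s\t0\r\n' "$desc" "$url" "$placeholder"
  fi
}

# The two links from example.gmi, plus a bare host that relies on the defaults
link_line 'gopher://happynetbox.com:79/1luxferre' 'Anyway, why not publish it on Gopher?'
link_line 'https://chronovir.us' 'Or read my blog on Web...'
link_line 'gopher://example.org' 'Hypothetical root menu'
```

Quoting `"$hostport"` inside the expansion keeps the host from being interpreted as a glob pattern, which is a small deviation from the original script chosen for safety.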

A tools/gmi2map.sh => tools/gmi2map.sh +93 -0
@@ 0,0 1,93 @@
#!/bin/bash
# A simple helper tool to create a valid Gophermap from the Gemtext passed via standard input
#
# Usage: cat [file] | gmi2map.sh [page_width] [leading_spaces] [trailing_spaces] [placeholder_char]
# (pass 0 as the reflow width if you want to pass the other parameters but don't want to turn on reflow logic)
#
# Created by Luxferre in 2023, released into public domain

shopt -s extglob # enable extended pattern matching (just to be sure)

TARGET_WIDTH="$1"
LSPACES="$2"
TSPACES="$3"
DELIM="$4"

TAB=$'\t'
SPC=$'\x20'
CRLF=$'\r\n'

[[ -z "$TARGET_WIDTH" ]] && TARGET_WIDTH=0 # reflow off by default
[[ -z "$LSPACES" ]] && LSPACES=0
[[ -z "$TSPACES" ]] && TSPACES=0
[[ -z "$DELIM" ]] && DELIM=';'

# format strings to use in different situations:
reflowfmt="%-$(( LSPACES ))s%-${TARGET_WIDTH}s%-$(( TSPACES ))s\n" # params: empty string (padded to LSPACES), line, empty string (padded to TSPACES)
infofmt="i%s${TAB}%s${TAB}%s${TAB}0${CRLF}"        # params: line, DELIM, DELIM
gopherlinkfmt="%s%s${TAB}%s${TAB}%s${TAB}%d${CRLF}" # params: type, name, selector, host, port
extlinkfmt="h%s${TAB}URL:%s${TAB}%s${TAB}0${CRLF}" # params: name, URL, DELIM

reflow_line() { # single-line logic from phlow.sh, adapted into a function and separating by LF only
  local line="$1"
  local llen="${#line}" # get effective line length
  if (( 0 == TARGET_WIDTH || llen < TARGET_WIDTH )); then # no need to run the logic for smaller lines or if TARGET_WIDTH is 0
    printf "$reflowfmt" '' "$line" ''
    return
  fi
  local lastws=0 # variable to track last whitespace
  local cpos=0 # variable to track current position within the page line
  local pagepos=0 # variable to track the position of new line start
  local outbuf='' # temporary output buffer
  local c='' # temporary character buffer
  for ((i=0;i<llen;i++,cpos++)); do # start iterating over characters
    c="${line:i:1}" # get the current one
    if (( cpos >= TARGET_WIDTH )); then # we already exceeded the page width
      (( lastws == 0 )) && lastws=$TARGET_WIDTH # no whitespace encountered here
      printf "$reflowfmt" '' "${outbuf:0:$lastws}" '' # truncate the buffer
      outbuf=''
      pagepos=$(( pagepos + lastws ))
      cpos=0
      lastws=0
      i=$pagepos # update current iteration index from the last valid whitespace
    else # save the whitespace position if found
      [[ "$c" == "$SPC" ]] && lastws="$cpos"
      outbuf="${outbuf}${c}" # save the character itself
    fi
  done
  [[ ! -z "$outbuf" ]] && printf "$reflowfmt" '' "$outbuf" '' # output the last unprocessed chunk
}

readarray -t INPUT_LINES -d $'\n' # read the input line array (split by LF); not named LINES, which is a special Bash variable
for line in "${INPUT_LINES[@]}"; do # iterate over the read text
  line="${line%%$'\r'}" # remove a trailing CR if it is there
  if [[ "${line:0:2}" == $'=>' ]]; then # we have a linkable resource
    linkline="${line##=>*([[:blank:]])}" # remove the link signature and any leading whitespace
    linkurl="${linkline%%[[:blank:]]*}" # treat anything until the next whitespace (or the end of line) as a URL
    linkdesc="${linkline##"${linkurl}"*([[:blank:]])}" # remove the URL (quoted so it is matched literally, not as a pattern) and any other leading whitespace to get the description
    linkdesc="${linkdesc%%*([[:blank:]])}" # remove any trailing whitespace from the description
    if [[ "$linkurl" =~ ^gopher:// ]]; then # now, proceed according to the URL type (just like in Bopher-NG)
      preurl="${linkurl#gopher://}" # remove the scheme to ease parsing
      hostport="${preurl%%/*}" # extract the host+:port part (where :port is also optional)
      selpath="${preurl##$hostport}" # extract the selector+path part
      reshost="${hostport%%:*}" # extract the hostname
      resport="${hostport:(( 1 + ${#reshost} ))}" # extract the port
      restype="${selpath:1:1}" # extract the type character
      ressel="${selpath:2}" # extract the selector
      [[ -z "$resport" ]] && resport=70 # default port is 70
      [[ -z "$ressel" ]] && ressel="/" # default selector is root
      [[ -z "$restype" ]] && restype=1 # default request type is a Gophermap
      printf "$gopherlinkfmt" "$restype" "$linkdesc" "$ressel" "$reshost" "$resport"
    else
      printf "$extlinkfmt" "$linkdesc" "$linkurl" "$DELIM"
    fi
  else # we have an info line
    infoline='' 
    [[ "${line:0:3}" != $'```' ]] && infoline="$line" # ignore the preformatting togglers, pass everything else
    readarray -t reflowed_lines -d $'\n' < <(reflow_line "$infoline")
    for rline in "${reflowed_lines[@]}"; do # iterate over the reflowed line parts
      printf "$infofmt" "$rline" "$DELIM" "$DELIM"
    done
  fi
done
printf '.\r\n' # finish the Gophermap generation