~luxferre/Bopher-NG

7dfc0c44699c2fb7c838037ebaaa7cb5fe43a83f — Luxferre 1 year, 7 months ago eefa916
Added gmi2txt.sh tool
2 files changed, 101 insertions(+), 0 deletions(-)

M tools/README-tools.md
A tools/gmi2txt.sh
M tools/README-tools.md => tools/README-tools.md +29 -0
@@ 104,3 104,32 @@ iSee ya!                                                                ;       
```

Note that this tool hasn't been tested against all possible edge cases yet, so I recommend always verifying the generated maps yourself. Still, it makes creating them a lot easier than doing it by hand.

## `gmi2txt.sh`

This tool converts Gemtext documents into CRLF-terminated, preformatted plaintext files ready to be served over Gopher. It takes the same parameters as `gmi2map.sh`, minus the placeholder character: page width, leading spaces and trailing spaces. Link lines are converted to the "[description]: [url]" format and, unlike in `gmi2map.sh`, are reflowed like everything else.
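
The link conversion can be illustrated in isolation. The snippet below is only a minimal sketch, not the code from `gmi2txt.sh`: the function name `gmi_link_to_text` is made up for this illustration, and it uses plain parameter expansion patterns instead of `extglob` so that it can be pasted into any bash session.

```bash
#!/bin/bash
# gmi_link_to_text is a hypothetical helper used for illustration only.
# It turns one Gemtext link line ("=> url description") into "description: url".
gmi_link_to_text() {
  local rest url desc
  rest="${1#=>}"                           # drop the link marker
  rest="${rest#"${rest%%[![:blank:]]*}"}"  # trim leading whitespace
  url="${rest%%[[:blank:]]*}"              # the URL runs up to the first whitespace
  desc="${rest#"$url"}"                    # whatever follows the URL is the description
  desc="${desc#"${desc%%[![:blank:]]*}"}"  # trim leading whitespace
  desc="${desc%"${desc##*[![:blank:]]}"}"  # trim trailing whitespace
  printf '%s: %s\n' "$desc" "$url"
}

gmi_link_to_text '=> https://chronovir.us Or read my blog on Web'
# prints: Or read my blog on Web: https://chronovir.us
```

The real script additionally passes the resulting string through its reflow function, so long links are wrapped and padded like any other line.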

Example - imagine we have this rudimentary Gemtext document in the `example.gmi` file:
```
I write short Gemtext here.

I write a bit longer Gemtext here to showcase reflow capabilities of Bopher Tools' gmi2txt.sh script, because it really is cool.

=> gopher://happynetbox.com:79/1luxferre Visit my Gopher homepage
=> https://chronovir.us Or read my blog on Web

See ya!
```

When running `cat example.gmi | bash gmi2txt.sh 67 5`, this file translates to the following plaintext:
```
     I write short Gemtext here.                                        
                                                                        
     I write a bit longer Gemtext here to showcase reflow capabilities  
     of Bopher Tools' gmi2txt.sh script, because it really is cool.     
                                                                        
     Visit my Gopher homepage: gopher://happynetbox.com:79/1luxferre    
     Or read my blog on Web: https://chronovir.us                       
                                                                        
     See ya!                                                            
```
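
Since the padding and the CR characters are invisible in a terminal, it can help to check the result mechanically. The following is a minimal sketch under the assumption that every output line should be exactly page width plus leading plus trailing spaces wide (67 + 5 + 0 = 72 in the example above); `check_gopher_text` is a hypothetical helper and not part of Bopher Tools.

```bash
#!/bin/bash
# check_gopher_text is a hypothetical helper used for illustration only.
# It reads text from stdin and succeeds only if every line ends with CR
# (so CRLF on the wire) and is exactly $1 characters wide before the CR.
check_gopher_text() {
  local width="$1" line n=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    n=$(( n + 1 ))
    if [[ "$line" != *$'\r' ]]; then
      printf 'line %d: missing CR\n' "$n" >&2
      return 1
    fi
    line="${line%$'\r'}" # strip the CR before measuring the width
    if (( ${#line} != width )); then
      printf 'line %d: width %d, expected %d\n' "$n" "${#line}" "$width" >&2
      return 1
    fi
  done
  return 0
}

# Intended use with the example above:
#   bash gmi2txt.sh 67 5 < example.gmi | check_gopher_text 72 && echo OK
# Self-contained demonstration with one hand-made line:
printf '%-5s%-67s\r\n' '' 'See ya!' | check_gopher_text 72 && echo OK
```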

A tools/gmi2txt.sh => tools/gmi2txt.sh +72 -0
@@ 0,0 1,72 @@
#!/bin/bash
# A simple helper tool to create a formatted and CRLF-terminated plaintext from the Gemtext passed via standard input
#
# Usage: cat [file] | gmi2txt.sh [page_width] [leading_spaces] [trailing_spaces]
# (pass 0 as page_width if you want to set the other parameters without turning on the reflow logic)
#
# Created by Luxferre in 2023, released into public domain

shopt -s extglob # enable extended pattern matching, required by the *(...) patterns used below

TARGET_WIDTH="$1"
LSPACES="$2"
TSPACES="$3"

SPC=$'\x20'

[[ -z "$TARGET_WIDTH" ]] && TARGET_WIDTH=0 # reflow off by default
[[ -z "$LSPACES" ]] && LSPACES=0
[[ -z "$TSPACES" ]] && TSPACES=0

reflowfmt="%-$(( LSPACES ))s%-${TARGET_WIDTH}s%-$(( TSPACES ))s\n" # params: left padding (pass ''), line, right padding (pass '')

reflow_line() { # single-line logic from phlow.sh, adapted into a function and separating by LF only
  local line="$1"
  local llen="${#line}" # get effective line length
  if (( 0 == TARGET_WIDTH || llen < TARGET_WIDTH )); then # no need to run the logic for smaller lines or if TARGET_WIDTH is 0
    printf "$reflowfmt" '' "$line" ''
    return
  fi
  local rest="$line" # part of the line that has not been printed yet
  local chunk='' # candidate text for the current page line
  while (( ${#rest} > TARGET_WIDTH )); do # keep cutting until the remainder fits the page width
    chunk="${rest:0:TARGET_WIDTH+1}" # take one extra character, so a whitespace right after the page edge counts
    if [[ "${chunk:1}" == *"$SPC"* ]]; then # a whitespace at position 0 is not a usable break point
      chunk="${chunk%"$SPC"*}" # cut at the last whitespace
      rest="${rest:${#chunk}+1}" # continue right after that whitespace
    else # no whitespace encountered: hard break at the page width, no character is skipped
      chunk="${rest:0:TARGET_WIDTH}"
      rest="${rest:TARGET_WIDTH}"
    fi
    printf "$reflowfmt" '' "$chunk" ''
  done
  [[ -n "$rest" ]] && printf "$reflowfmt" '' "$rest" '' # output the last unprocessed chunk
}
}

readarray -t LINES -d $'\n' # read the input line array (split by LF)
for line in "${LINES[@]}"; do # iterate over the read text
  line="${line%%$'\r'}" # remove a trailing CR if it is there
  if [[ "${line:0:2}" == $'=>' ]]; then # we have a linkable resource
    linkline="${line##=>*([[:blank:]])}" # remove the link signature and any leading whitespace
    linkurl="${linkline%%[[:blank:]]*}" # treat anything until the next whitespace (or the end of line) as a URL
    linkdesc="${linkline##"$linkurl"*([[:blank:]])}" # remove the URL (quoted, so it is matched literally) and any following whitespace to get the description
    linkdesc="${linkdesc%%*([[:blank:]])}" # remove any trailing whitespace from the description
    targetlink="$(printf '%s: %s' "$linkdesc" "$linkurl")" # reformat the link
    readarray -t reflowed_lines -d $'\n' < <(reflow_line "$targetlink") # reflow the final link text
    for rline in "${reflowed_lines[@]}"; do # terminate every reflowed part with CRLF, not only the last one
      printf '%s\r\n' "$rline"
    done
  else # we have an info line
    infoline='' 
    [[ "${line:0:3}" != $'```' ]] && infoline="$line" # ignore the preformatting togglers, pass everything else
    readarray -t reflowed_lines -d $'\n' < <(reflow_line "$infoline")
    for rline in "${reflowed_lines[@]}"; do # iterate over the reflowed line parts
      printf '%s\r\n' "$rline"
    done
  fi
done # file output finished