Convert HTML to Markdown-formatted text


nim-html2text is a Nim library that converts HTML into clean, easy-to-read ASCII plain text. Better yet, that text is also valid Markdown (a text-to-HTML format).

There are many implementations of the html2text idea. This implementation was originally based on a Go library, although it has since been extended. The files in tests/files are taken from the Python library, with slight modifications to reflect implementation differences.


In sample.nim:

import html2text

echo handle("<ul><li>Item 1</li><li>Item 2</li></ul>")

Then, when you run nim c -r sample.nim, you should see:

* Item 1
* Item 2
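
The same call should work for other HTML fragments. The following is a minimal sketch, not taken from the library's documentation: the input string is made up for illustration, and it assumes that handle returns a string (the sample above only shows that its result can be passed to echo). No expected output is given because the exact formatting depends on the library version.

```nim
import html2text

# Hypothetical input: a heading followed by a paragraph with inline markup.
let html = "<h1>Release notes</h1><p>This version is <b>much</b> faster.</p>"

# Assumption: `handle` takes the HTML as a string and returns the
# Markdown-formatted text as a string.
let markdown = handle(html)

echo markdown
```

Storing the result in a variable instead of echoing it directly makes it easy to write the text to a file or pass it to other code.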


nim-html2text is still in an early stage of development. You can check on the progress by running nimble test, which will compare a large suite of HTML files against the expected Markdown output.
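
To get a rough idea of what such a comparison involves, here is a hedged sketch. It is not the project's actual test suite. It assumes that each HTML file in tests/files has an expected-output file with the same name and a .md extension, and that handle returns a string; both are assumptions made for illustration.

```nim
import std/[os, strutils]
import html2text

var failures = 0

# Assumption: inputs live in tests/files/*.html and the expected Markdown
# for each one sits next to it with a .md extension.
for htmlPath in walkFiles("tests/files/*.html"):
  let expectedPath = htmlPath.changeFileExt("md")
  if not fileExists(expectedPath):
    continue  # skip inputs that have no expected output

  let actual = handle(readFile(htmlPath))
  let expected = readFile(expectedPath)

  # Ignore leading and trailing whitespace when comparing.
  if actual.strip() != expected.strip():
    echo "MISMATCH: ", htmlPath
    inc failures

echo failures, " file(s) differed"
```

Run from the repository root, a script like this would list the files whose output differs from the expected text.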

nim-html2text is being built for eventual use in Roman.


Contributions are welcome! Please send patches, questions, requests, etc. to my public inbox.