Convert HTML to Markdown-formatted text





nim-html2text is a Nim library that converts HTML into clean, easy-to-read plain ASCII text. Better yet, that text is also valid Markdown (a text-to-HTML format).

There are many implementations of the html2text idea. This one was originally based on a Go library, although it has since been extended. The files in tests/files are taken from the Python library and have been modified slightly to reflect implementation differences.


In sample.nim:

import html2text

echo handle("<ul><li>Item 1</li><li>Item 2</li></ul>")

Then, when you run nim c -r sample.nim, you should see:

* Item 1
* Item 2


nim-html2text is still in an early stage of development. You can check on its progress by running nimble test, which converts a large suite of HTML files and compares the results against the expected Markdown output.
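
To make the idea of comparing converted output against an expected Markdown file concrete, here is a minimal sketch of such a comparison. It is not the project's actual test code: the helper names normalised and firstMismatch are hypothetical, and ignoring trailing whitespace and line-ending differences is an assumption made here for illustration.

```nim
import std/strutils

# Hypothetical helper, not part of nim-html2text: split a text into lines,
# ignoring trailing whitespace and differences in line endings.
proc normalised(text: string): seq[string] =
  for line in text.strip(leading = false).splitLines():
    result.add(line.strip(leading = false))

# Returns the 1-based number of the first line that differs,
# or 0 when the two texts match after normalisation.
proc firstMismatch(actual, expected: string): int =
  let a = normalised(actual)
  let b = normalised(expected)
  for i in 0 ..< min(a.len, b.len):
    if a[i] != b[i]:
      return i + 1
  if a.len != b.len:
    # One text is a prefix of the other; the first extra line is the mismatch.
    return min(a.len, b.len) + 1
  return 0

when isMainModule:
  # These strings stand in for a converter's output and an expected file.
  let expected = "* Item 1\n* Item 2"
  echo firstMismatch("* Item 1\n* Item 2\n", expected)  # prints 0
  echo firstMismatch("* Item 1\n* Item 3", expected)    # prints 2
```

With the library installed, the first argument would presumably be the result of handle applied to the contents of an HTML file, assuming handle returns a string as the sample above suggests.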

nim-html2text is being built for eventual use in Roman.


Contributions are welcome! Please send patches, questions, requests, etc. to my public inbox.