nim-html2text

nim-html2text is a Nim library that converts HTML into clean, easy-to-read ASCII plain text. Better yet, that ASCII happens to be valid Markdown (a text-to-HTML format).

There are many implementations of the html2text idea. This implementation was originally based on a Go library, although it has since been extended. The files within tests/files are taken from the Python library, although they have been modified slightly to reflect implementation differences.

Usage

In sample.nim:

import html2text

echo handle("<ul><li>Item 1</li><li>Item 2</li></ul>")

Then, when you run nim c -r sample.nim, you should see:

* Item 1
* Item 2
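
The same call can be applied to HTML read from disk. The following is a minimal sketch, not part of the library: convertFile is a hypothetical helper, sample.html is a made-up file name, and the code assumes that handle accepts a string and returns a string, as the sample above suggests.

```nim
import html2text

# Hypothetical helper (not provided by nim-html2text): read an HTML file
# and pass its contents to handle. Assumes handle takes and returns a string.
proc convertFile(path: string): string =
  handle(readFile(path))

when isMainModule:
  # Write the snippet from the sample above to an illustrative file,
  # then convert it.
  writeFile("sample.html", "<ul><li>Item 1</li><li>Item 2</li></ul>")
  echo convertFile("sample.html")
```

Wrapping handle this way keeps file I/O separate from the conversion itself, so the library call stays a plain string-to-string step.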

Status

nim-html2text is still in an early stage of development. You can check on the progress by running nimble test, which converts a large suite of HTML files and compares the results against the expected Markdown output.

nim-html2text is being built for eventual use in Roman.

Contributing

Contributions are welcome! Please send patches, questions, requests, etc. to my public inbox.