~rkta/blog

791528f5a4b6268e3051b81005acc43ab431a4b2 — Rene Kita 5 months ago d16e0e2
Add anti-cf article

Prevent Apache from running the cgi script.
4 files changed, 77 insertions(+), 0 deletions(-)

M .htaccess
M Makefile
A anti-cf.mc
A assets/anti-cf.cgi
M .htaccess => .htaccess +5 -0
@@ 1,4 1,9 @@
ErrorDocument 404 "Not found"

<FilesMatch "\.cgi$">
        SetHandler send-as-is
</FilesMatch>

Redirect 301 /2017/10/03/irssi-awl-setup/index.html /irssi-awl.html
Redirect 301 /2018/12/27/irssi-with-openurl-and-ssh/index.html /irssi-openurl-ssh.html
Redirect 301 /2018/12/27/wrong-permissions-on-jekyll-_site-files/index.html /jekyll-perms.html

M Makefile => Makefile +1 -0
@@ 16,6 16,7 @@ ARTS += m4fnmacro.mc
ARTS += w3m-gemini.mc
ARTS += zsh-globs.mc
ARTS += ts-bench.mc
ARTS += anti-cf.mc
# EOH

SITES  = about.mc

A anti-cf.mc => anti-cf.mc +55 -0
@@ 0,0 1,55 @@
<!-- vi: set ft=m4html : -->
define(`TITLE', `Circumventing Clownflare with w3m')dnl
define(`DATE', `2024-05-03')dnl
define(`MODIFIED', `2024-05-03T04:52:52Z')dnl
define(`LANG', en)dnl

<!DOCTYPE html>
<html lang="LANG">
include(head.mc)

<body>
include(header.mc)

<h1>TITLE</h1>
<time>DATE</time>

<p>
<b>Note:</b>
By the time this article was published, I was no longer being blocked.
</p>

<p>
Cloudflare recently started blocking w3m on every site in the StackExchange network.
Multiple users confirmed this; it happened with w3m on Debian, but not with w3m on OpenBSD.
</p>

<p>
Some debugging showed that curl still worked, so I used w3m's siteconf together with a CGI script to fetch the affected pages with curl and circumvent the blocking.
</p>

<p>
The <a href="./assets/anti-cf.cgi">script</a> looks like this:
<pre><code>`
#!/bin/sh

# Circumvent Clownflare with curl

# Put this file in one of the configured cgi-bin directories of w3m and make
# it executable.
# Add the next four lines to your ~/.w3m/siteconf, omitting the leading #
#url m!^https?://stackoverflow.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"
#url m!^https?://.*stackexchange.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"

printf "%s\n\n" "Content-Type: text/html"

# Reduce the current link to scheme and host, then fetch the requested path.
url=$(echo "$W3M_CURRENT_LINK" | sed &apos;s@\(http.\{0,1\}://[^/]*\)/.*@\1@&apos;)
curl -L "${url}/${QUERY_STRING}"
'</code></pre>
</p>

include(artfoot.mc)
</body>
</html>

A assets/anti-cf.cgi => assets/anti-cf.cgi +16 -0
@@ 0,0 1,16 @@
#!/bin/sh

# Circumvent Clownflare with curl

# Put this file in one of the configured cgi-bin directories of w3m and make
# it executable.
# Add the next four lines to your ~/.w3m/siteconf, omitting the leading #
#url m!^https?://stackoverflow.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"
#url m!^https?://.*stackexchange.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"

printf "%s\n\n" "Content-Type: text/html"

# Reduce the current link to scheme and host, then fetch the requested path.
url=$(echo "$W3M_CURRENT_LINK" | sed 's@\(http.\{0,1\}://[^/]*\)/.*@\1@')
curl -L "${url}/${QUERY_STRING}"
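
To check the URL handling without touching the network, the rewriting step can be tried in isolation. The following is only a sketch: the helper names `origin_of` and `target_url` and the example URL are made up for illustration and are not part of the commit. It assumes that w3m replaces the part of the URL matched in siteconf, so that the remaining path arrives in QUERY_STRING, which is what the last line of the script suggests.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit): rebuild the URL that
# anti-cf.cgi would hand to curl, without doing any network access.

# Reduce a URL to scheme plus host, using the same sed expression as the script.
origin_of() {
    printf '%s\n' "$1" | sed 's@\(http.\{0,1\}://[^/]*\)/.*@\1@'
}

# $1: value of W3M_CURRENT_LINK, $2: value of QUERY_STRING
target_url() {
    printf '%s/%s\n' "$(origin_of "$1")" "$2"
}

# Example values are invented for illustration.
target_url "https://stackoverflow.com/questions/1/example" "questions/1/example"
```

If the printed URL is not the page you expected, the likely culprit is the value of W3M_CURRENT_LINK or the siteconf pattern rather than curl itself.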