Serve robots.txt disallowing all robots

This overrides any robots.txt file in the proxied gemini capsule, on the
basis that the capsule's own robots.txt is intended for gemini robots
(which can be expected to follow the robots.txt companion spec) rather
than for web robots.

The main purpose of disallowing web robots, though, is to prevent them
from crawling the proxied cross-site geminispace under /x/, since web
robots won't even know to read the robots.txt files of the other
capsules proxied this way (a page proxied as /x/example.org/foo, say, is
governed by gemini://example.org/robots.txt, which a web crawler has no
way to discover).
mbays, 2021-08-25 12:08:56 +02:00 (committed by Drew DeVault)
parent 988a00f126
commit a8c54c1a32

@@ -583,6 +583,12 @@ func main() {
 			return
 		}
 
+		if r.URL.Path == "/robots.txt" {
+			w.WriteHeader(http.StatusOK)
+			w.Write([]byte("User-agent: *\nDisallow: /\n"))
+			return
+		}
+
 		req := gemini.Request{}
 		req.URL = &url.URL{}
 		req.URL.Scheme = root.Scheme
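
As a quick sanity check of the new behavior, the minimal sketch below
fetches the proxy's robots.txt over HTTP and prints the body. The listen
address localhost:8080 is an assumption for illustration, not something
this commit configures.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Only the exact path "/robots.txt" is intercepted by the proxy,
	// so request it directly. The address is an assumed default.
	resp, err := http.Get("http://localhost:8080/robots.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// With this commit applied, the body should be exactly:
	// User-agent: *
	// Disallow: /
	fmt.Print(string(body))
}

Because the handler matches "/robots.txt" before the gemini request is
constructed, the upstream capsule is never contacted for this URL.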