On Fri, Dec 25, 2020 at 11:17 AM Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> > I'll figure out a way to have the mboxes downloadable. If I understand
> > Google's documentation of robots.txt, they don't care about robots.txt: if a
> > specific URL is linked from somewhere indexable, they will index it.
> > Maybe just make one big tarball of everything?
> One big tarball would be wasteful to consume (you would have to download
> everything) and to produce (it would need, basically, «cp everything.tgz
> tmp.tgz; tar -zcf - $new >> tmp.tgz; mv tmp.tgz everything.tgz», and you
> can see that's O(#everything) rather than O(appended stuff)). I would
> rather avoid that if possible.
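For concreteness, here is a sketch of that quoted pipeline (file names are illustrative, not the archive's actual layout). The append itself is cheap because gzip permits concatenated streams and GNU tar reads them with -i (--ignore-zeros); it is the cp step, done only to publish the updated file atomically, that costs O(#everything):

```shell
# Sketch of the append-with-copy pipeline quoted above; names are
# illustrative. Requires GNU tar for -i/--ignore-zeros.
set -e
workdir=$(mktemp -d)
cd "$workdir"
echo one > old.mbox
tar -zcf everything.tgz old.mbox           # existing big tarball
echo two > new.mbox
cp everything.tgz tmp.tgz                  # O(#everything) copy
tar -zcf - new.mbox >> tmp.tgz             # O(appended stuff) append
mv tmp.tgz everything.tgz                  # atomic publish
tar -iztf everything.tgz                   # lists old.mbox and new.mbox
```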
> Not sure what to do about robots. I suppose we could set <link
> rel="canonical"> in the HTTP headers when serving the rfc822 files (example
> in <https://en.wikipedia.org/wiki/Canonical_link_element#HTTP>)?
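Following the Wikipedia example linked above, a hypothetical httpd snippet for that could look like this (the FilesMatch pattern and canonical URL are assumptions, not the archive's actual configuration):

```apache
# Hypothetical: send a Link: rel="canonical" HTTP header when serving
# raw rfc822/mbox files, pointing crawlers at the HTML view instead.
<FilesMatch "\.mbox$">
    Header set Link "<https://example.apache.org/html-view/>; rel=\"canonical\""
</FilesMatch>
```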
I thought robots.txt can exclude subdirectories, so we could just cut off
(say) the directory holding the mboxes.
I'm not too worried about Google crawling the mboxes, as they'll likely do
it just once and never again (by keeping the etag and/or mtime).
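A robots.txt along these lines would exclude such a subdirectory (the path is illustrative):

```
# Hypothetical robots.txt; /mbox/ stands in for wherever the raw
# mboxes actually live.
User-agent: *
Disallow: /mbox/
```

Note, per the point above, that Google may still index an excluded URL (without crawling it) if it is linked from an indexable page.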
> > I couldn't figure out puppet, the link was a 404 for me. I've created a
> > request in Jira and I hope someone will take a look:
> > https://issues.apache.org/jira/browse/INFRA-21230
> I think the GitHub repository is restricted to Apache committers only, so
> you'll need to enter your GitHub username on id.apache.org in order to get
> access to that URL. If you don't have a GitHub account, there ought to be
> a mirror of the repository on *.apache.org somewhere (at least, if Infra's
> following the same policy PMCs do).
Correct: committers only. And only after linking accounts via
https://gitbox.apache.org/setup/ as Nathan noted (and we forgot to mention).
If you do not have a GitHub account, or do not want one (say, because you
don't want to accept their T&Cs), then you can use the repository via
gitbox.apache.org (ask on Slack for the link; I prefer not to post it here).
Received on 2020-12-25 23:54:26 CET