Re: svn.haxx.se is going away

From: Daniel Sahlberg <daniel.l.sahlberg_at_gmail.com>
Date: Thu, 24 Dec 2020 20:38:17 +0100

On Tue, 22 Dec 2020 at 02:08, Greg Stein <gstein_at_gmail.com> wrote:

> On Mon, Dec 21, 2020 at 4:03 AM Daniel Shahaf <d.s_at_daniel.shahaf.name>
> wrote:
>
>> Daniel Sahlberg wrote on Mon, 21 Dec 2020 08:55 +0100:
>> > On Fri, 27 Nov 2020 at 19:26, Daniel Shahaf
>> > <d.s_at_daniel.shahaf.name> wrote:
>> >
>> > > Sounds good. Nathan, Daniel Sahlberg — could you work with Infra on
>> > > getting the data over to ASF hardware?
>> >
>> > I have been given access to svn-qavm and uploaded a tarball of the
>> > website (including mboxes). I'm a bit reluctant to unpack it since it
>> > takes almost 7 GB, and there is only 14 GB disk space remaining. Is it
>> > ok to unpack or should we ask Infra for more disk space?
>>
>> I vote to ask for more disk space, especially considering that some
>> percentage is reserved for uid=0's use.
>>
>
> DSahlberg hit up Infra on #asfinfra on the-asf.slack.com, and asked for
> more space. That's been provisioned now.
>

I've unpacked it in /home/dsahlberg/svnhaxx.

> >...
>
>> > The mboxes will be preserved but I don't plan to make them available for
>> > download (since they are not available from lists.a.o or
>> > mail-archives.a.o).
>>
>> Please do make them available for download. Being able to download the
>> raw data is useful for both backup and perusal purposes, and I doubt
>> the bandwidth requirements would be a problem. (Might want
>> a robots.txt entry, though?)
>>
>
> Bandwidth should not be a problem for the mboxes, but yes: a robots.txt
> would be nice. I think search engines spidering the static email pages
> might be useful to the community, but the spiders really shouldn't need/use
> the mboxes.
>

I'll figure out a way to make the mboxes downloadable. If I understand
Google's robots.txt documentation correctly, robots.txt only stops
crawling: a URL that is linked from an indexable page can still end up in
the index. Maybe just make one big tarball of everything?
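
For reference, here is a minimal sketch of what such a robots.txt entry
could look like and how to check it with Python's standard
urllib.robotparser. The /mboxes/ path, the page path and the example.org
host are placeholders I made up for illustration, not the actual layout
of the archive:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers away from the raw mboxes,
# leave the static message pages crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mboxes/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, user_agent="*"):
    """Return True if the rules above let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


# Placeholder URLs, assumed for the example only.
print(is_allowed("https://example.org/mboxes/dev-2020-12.mbox"))
print(is_allowed("https://example.org/dev/archive-2020-12/0100.shtml"))
```

This only tells us what a well-behaved crawler would fetch; it does not
address a disallowed URL still being listed because it is linked from
elsewhere.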

> I think the first thing is to get httpd up and running with the desired
> configuration. Then step two will be to memorialize that into puppet. Infra
> can assist with the latter. I saw on Slack that Humbedooh gave you a link
> to explore.
>

Since I haven't got root, I can't get any further with installing httpd on
my own. I couldn't figure out puppet either; the link gave me a 404. I've
created a request in Jira and I hope someone will take a look:
https://issues.apache.org/jira/browse/INFRA-21230

Kind regards,
Daniel
Received on 2020-12-24 20:38:38 CET
