On Mon, Dec 21, 2020 at 4:03 AM Daniel Shahaf <d.s_at_daniel.shahaf.name>
wrote:
> Daniel Sahlberg wrote on Mon, 21 Dec 2020 08:55 +0100:
> > On Fri, 27 Nov 2020 at 19:26, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> >
> > > Sounds good. Nathan, Daniel Sahlberg — could you work with Infra on
> > > getting the data over to ASF hardware?
> >
> > I have been given access to svn-qavm and uploaded a tarball of the website
> > (including mboxes). I'm a bit reluctant to unpack it since it takes almost
> > 7 GB, and there is only 14 GB disk space remaining. Is it ok to unpack or
> > should we ask Infra for more disk space?
>
> I vote to ask for more disk space, especially considering that some
> percentage is reserved for uid=0's use.
>
DSahlberg hit up Infra on #asfinfra on the-asf.slack.com, and asked for
more space. That's been provisioned now.
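
For future uploads, a rough way to check whether a tarball will fit before unpacking it. This is only a sketch: the function names and the 25% headroom are my own assumptions, not anything Infra prescribes.

```python
import shutil
import tarfile


def unpacked_size(tar_path):
    """Sum the sizes of the regular files recorded in the tarball's headers."""
    with tarfile.open(tar_path) as tar:
        return sum(member.size for member in tar if member.isfile())


def safe_to_unpack(needed, free, headroom=0.25):
    """True if `needed` bytes fit while leaving `headroom` of `free` unused.

    The 25% default is an arbitrary safety margin (an assumption).
    """
    return needed <= free * (1 - headroom)


def check(tar_path, target_dir):
    """Compare the tarball's unpacked size with the free space at target_dir."""
    needed = unpacked_size(tar_path)
    free = shutil.disk_usage(target_dir).free
    return safe_to_unpack(needed, free)
```

Usage would be something like check("website.tar", "/path/to/target"), where both arguments are placeholders.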
>...
> > The mboxes will be preserved but I don't plan to make them available for
> > download (since they are not available from lists.a.o or mail-archives.a.o).
>
> Please do make them available for download. Being able to download the
> raw data is useful for both backup and perusal purposes, and I doubt
> the bandwidth requirements would be a problem. (Might want
> a robots.txt entry, though?)
>
Bandwidth should not be a problem for the mboxes, but yes: a robots.txt
would be nice. I think search engines spidering the static email pages
might be useful to the community, but the spiders really shouldn't need/use
the mboxes.
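
To make that concrete, here is a sketch of such a robots.txt, checked with Python's standard urllib.robotparser. The /mbox/ prefix and the sample page path are hypothetical, since the final URL layout has not been decided.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: raw mboxes under /mbox/, static pages elsewhere.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mbox/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, path)
```

With those rules, allowed("/mbox/202012.mbox") is False while a static page such as allowed("/dev/2020-12/0001.html") stays True, so spiders can index the pages but skip the raw data.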
> Regarding the behaviour of the existing archives, see
> <https://mail-archives.apache.org/mod_mbox/subversion-dev/202012.mbox>
> (which used to also be available via
> https://subversion.apache.org/mail/, but nowadays that just redirects
> to a landing page ☹). I don't know whether lists.a.o has equivalent
> functionality, but then again, lists.a.o has had vendor lock-in baked
> into it from day one, so a lack of a "download raw rfc822 data" feature
> might simply be another form of that.
>
I don't know if our vendor for lists.a.o plans to do an mbox download. I
doubt they retain the data in that format. The Foundation has "all the
data", of course, going back to the mid-90s. An mbox download service might
be interesting, once we decommission the mod_mbox services.
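
On the "perusal" point: a downloaded mbox can be read with Python's standard mailbox module. A minimal sketch, assuming a local file in classic mbox format; the function name is mine.

```python
import mailbox


def subjects(mbox_path):
    """Return the Subject header of every message in an mbox file."""
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [msg.get("Subject", "(no subject)") for msg in box]
    finally:
        box.close()
```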
>...
> > 1. Install a web server. nginx? (just kidding)
>
> Apache HTTP Server would probably be a better choice since more dev_at_svn
> and Infra people are familiar with it, but it's a fair question to ask.
> (Cf. INFRA-7524)
>
Infra has no position on that. Feel free to use nginx 😁 ... but DShahaf is
correct: local support will be higher with apache httpd.
> > 2. Set up httpd.conf
> > 3. Configure a DocumentRoot where I can put the files. Doesn't seem right
> > to store them in /home
>
> Hmm. These things should all be done via puppet. I'm not sure what's
> best practice nowadays regarding writing puppet PRs and testing them,
> though.
I think the first thing is to get httpd up and running with the desired
configuration. Then step two will be to memorialize that into puppet. Infra
can assist with the latter. I saw on Slack that Humbedooh gave you a link
to explore.
Cheers,
-g
Received on 2020-12-22 02:08:31 CET