Greg Stein wrote on Wed, Nov 25, 2020 at 00:08:32 -0600:
> Hey Daniel,
>
> I think the best place for this content is on mbox-vm.a.o. That is where we
> have our permanent list archives in mbox format.
> We can then arrange to ship them off to lists.a.o. If you concur,

I concur in the sense that it'd be great to have the mboxes stored on
and served by whatever Infra uses for all other archives.
However, when I last looked at lists.a.o I was of the opinion that Infra
shouldn't use it. (Back then its permalinks weren't permanent and
couldn't be generated or dereferenced while the user was offline
*or while the external vendor was offline*. I don't know whether those
have been fixed since then.) Unless that has changed, I wouldn't like
Subversion to rely on that particular archive. Instead, there's
mod_mbox, or a static snapshot of svn.haxx.se.
(@Greg: You know I wouldn't normally have repeated the above, but
(1) you asked, and (2) the dev@ audience doesn't all know this context.)

> then I'll ask the team to get you access.

Would InfraAdmin let someone else from the PMC take point? I realize
that this is an ASF-wide box (as opposed to a PMC box) and I'm a known
entity at Infra, but I'm short on tuits.

> You can preserve all the data you want into your homedir, and we can
> sort from there.

Sounds good. Nathan, Daniel Sahlberg: could you work with Infra on
getting the data over to ASF hardware?

Note that svn-org@ doesn't have an equivalent @s.a.o list, and that, as
mentioned upthread, the post-migration (from tigris.org to apache.org)
mboxes may be in a different order than the official ones, and shouldn't
be "deduplicated".

> You indicate a desire to maintain URLs. Do you have some ideas on that?

Each individual message .shtml file contains the message-id in
a comment. We can extract the comments and build a redirector around
them. (By the way, this is basically the same exercise that Infra must
have solved back when Sebb received that CSV file from the lists.a.o
vendor, so there may be an opportunity for code reuse.) Of course, the
full rsync likely has the same info available with less scraping.

Or, as mentioned above, the .shtml files could just be preserved
statically (plus or minus an appropriate message in the list of years on
the /${listname}/ page). In fact, I'm having trouble coming up with
a reason _not_ to serve a static snapshot of the pages, even if we do
build a redirector.
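
For concreteness, here is a minimal sketch of the "extract the comments"
step in Python. It rests on assumptions that would need checking against
the real pages: that the comment looks like <!-- id="..." -->, and that
the redirect target is a URL keyed by message-id. The prefix below is
a placeholder, not a real endpoint, and all function names are made up
for illustration.

```python
import re
from pathlib import Path
from urllib.parse import quote

# ASSUMPTION: each message page carries a comment such as
# <!-- id="abc@example.org" -->; adjust the pattern to the real markup.
MSGID_COMMENT = re.compile(r'<!--\s*id="([^"]+)"\s*-->')

# ASSUMPTION: hypothetical redirect target, not a real archive endpoint.
TARGET_PREFIX = "https://archive.example.invalid/msgid/"


def extract_message_id(html):
    """Return the message-id from the first matching comment, or None."""
    match = MSGID_COMMENT.search(html)
    return match.group(1) if match else None


def build_redirect_map(root):
    """Map each .shtml path (relative to root) to the message-id it contains."""
    root = Path(root)
    mapping = {}
    for page in sorted(root.rglob("*.shtml")):
        text = page.read_text(encoding="utf-8", errors="replace")
        msgid = extract_message_id(text)
        if msgid is not None:
            mapping["/" + page.relative_to(root).as_posix()] = msgid
    return mapping


def redirect_target(msgid):
    """Build the target URL, percent-encoding the whole message-id."""
    return TARGET_PREFIX + quote(msgid, safe="")
```

build_redirect_map() run over a local copy of the tree would give the
old-path to message-id table; a redirector would then look up the
requested path and answer with redirect_target() of the result. Pages
without a matching comment are skipped, so they can be listed and
handled by hand.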
Cheers,
Daniel
Received on 2020-11-27 19:26:32 CET