#66: Restore ability to export mbox files
------------------------+------------------------------
 Reporter:  dmw         |      Owner:
     Type:  enhancement |     Status:  new
 Priority:  major       |  Milestone:  Beta version
  Version:              |   Keywords:  mbox export data
------------------------+------------------------------
Hi there,
After reading about the project on LWN, I was somewhat disappointed to see there doesn't appear to be an mbox export facility yet. As one commenter put it on the LWN story:
"Download as mbox" or alternatively "download as maildir" are basically the "liberate the community data" and "allow fast local mirrors of the archive" functionality for email mailing lists... If it doesn't have that, it is a show stopper as far as I'm concerned. Maybe a plugin can fix that, but it is best implemented well integrated with the archive web interface.
Do you have plans to implement something like this?
Thanks
#66: Restore ability to export mbox files
-----------------------------+---------------------------
 Reporter:  dmw              |       Owner:  abompard
     Type:  enhancement      |      Status:  accepted
 Priority:  major            |   Milestone:  Beta version
  Version:                   |  Resolution:
 Keywords:  mbox export data |
-----------------------------+---------------------------
Changes (by abompard):
 * status:  new => accepted
 * owner:  => abompard
Comment:
Yes, I definitely have plans to restore this feature. I just have to make sure that I escape all email addresses and that I properly stream a file that can be pretty big without putting too much load on the server. I may have to introduce some async processing, which is a bit of work if I want to keep the install procedure and the dependencies under control.
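To illustrate the streaming concern, here is a minimal sketch that serializes messages to mbox format one at a time, so the whole archive never has to sit in memory. It uses only the Python standard library; the function name `iter_mbox` is hypothetical, and this is not HyperKitty's actual implementation.

```python
from email.generator import BytesGenerator
from email.message import Message
from io import BytesIO
from typing import Iterable, Iterator


def iter_mbox(messages: Iterable[Message]) -> Iterator[bytes]:
    """Yield one mbox-formatted chunk per message (hypothetical helper)."""
    for msg in messages:
        buf = BytesIO()
        # mangle_from_=True rewrites body lines starting with "From " as
        # ">From " so they are not mistaken for message separators.
        gen = BytesGenerator(buf, mangle_from_=True)
        # unixfrom=True writes the "From " separator line before the headers.
        gen.flatten(msg, unixfrom=True)
        chunk = buf.getvalue()
        if not chunk.endswith(b"\n"):
            chunk += b"\n"
        # A blank line separates consecutive messages in an mbox file.
        yield chunk + b"\n"
```

In a Django application, a generator like this could presumably be handed to `StreamingHttpResponse`, which sends chunks as they are produced. Whether that is enough to avoid async processing depends on how expensive it is to load each message from storage.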
Comment (by sumanah):
Also may be related to https://bugs.launchpad.net/mailman/+bug/1414176 "better list-data export and import via XML, maybe web".
#66: Restore ability to export mbox files
-----------------------------+-----------------------
 Reporter:  dmw              |       Owner:  abompard
     Type:  enhancement      |      Status:  accepted
 Priority:  critical         |   Milestone:  1.1
  Version:                   |  Resolution:
 Keywords:  mbox export data |
-----------------------------+-----------------------
Changes (by sumanah):
 * priority:  major => critical
 * milestone:  Beta version => 1.1
Comment:
I agree with the commenters in https://lwn.net/Articles/596817/ that this functionality is very important.
As abompard said, it's important to properly escape the email addresses. And there is some ability for an administrator to get at the archives via https://github.com/hyperkitty/mailman-hyperkitty .
I'm moving this to the 1.1 milestone rather than 1.0 because it is slightly less crucial than other bugfixes we need in order to get a basic version of HyperKitty out this week. But it's really important, so I've bumped the priority up, and I hope we can do it very soon! Thanks for reporting this.
Comment (by berrange):
NB, a reason for providing mbox archives is that a number of projects rely on tools which download the archives and extract patches from them as part of their development workflow. For these kinds of tools to work, you really don't want to escape or obscure email addresses in the mbox archives. The tools need the full original messages with all headers and data intact, including email addresses: essentially the same as if the person had had the mails delivered directly to them.
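As a rough illustration of the kind of tool described here, the sketch below reads a downloaded mbox file with Python's standard `mailbox` module and yields the messages whose subject contains `[PATCH`. The function name `iter_patches` and the subject convention are assumptions for the example, not a description of any specific project's tooling; multipart messages are not handled.

```python
import mailbox
from typing import Iterator, Tuple


def iter_patches(mbox_path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (author, subject, body) for patch mails (hypothetical helper)."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            subject = msg.get("Subject", "")
            if "[PATCH" not in subject:
                continue
            # get_payload(decode=True) returns None for multipart messages;
            # this sketch only handles single-part mails.
            payload = msg.get_payload(decode=True) or b""
            # The From header is used as the patch author, which is why
            # an obscured address would end up in the commit metadata.
            yield msg.get("From", ""), subject, payload.decode("utf-8", errors="replace")
    finally:
        box.close()
```

The author comes straight from the `From` header and the diff from the body, so any rewriting of addresses in the archive would propagate into the extracted patch.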
Comment (by abompard):
Hey @berrange,
I'm curious as to how these tools work currently, since the only mbox archives that Mailman 2.X provides have the email addresses escaped, such as "user at domain.tld". Would this kind of escaping be OK for the tools you speak of? I do want to make the spam harvesters' job as hard as can be.
Another way to do what you're trying to do would be to subscribe an email address to the list and use POP3 or IMAP to get the emails. This way you're sure they won't be altered. Opening a POP3-enabled mailbox somewhere should be pretty easy, and POP3 is an old enough protocol that it shouldn't be too hard to convert the tools to use it: there are libraries for it in every language I know.
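The POP3 alternative could look roughly like the sketch below, which uses Python's standard `poplib` and `email` modules. The function name `fetch_messages`, the host name, and the credentials in the comment are placeholders invented for the example.

```python
import email
import poplib
from email.message import Message
from typing import Iterator


def fetch_messages(conn: poplib.POP3) -> Iterator[Message]:
    """Yield every message in the mailbox, unaltered (hypothetical helper)."""
    count, _size = conn.stat()  # (number of messages, mailbox size)
    for number in range(1, count + 1):
        # retr() returns (response, list of line bytes, octet count).
        _resp, lines, _octets = conn.retr(number)
        yield email.message_from_bytes(b"\r\n".join(lines))


# Example connection (placeholder host and credentials, not run here):
#   conn = poplib.POP3_SSL("pop.example.org")
#   conn.user("archive-subscriber")
#   conn.pass_("secret")
#   for msg in fetch_messages(conn):
#       print(msg["Subject"])
#   conn.quit()
```

This only delivers mail received after the address was subscribed, so it does not replace an archive download for historical messages.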
Comment (by berrange):
@abompard that's not the case in the Mailman archives that I deal with.
E.g., go to the libvirt list and look at the Oct 2015 mbox:
https://www.redhat.com/archives/libvir-list/
The *HTML* archives have email addresses escaped, but the mbox format archives are 100% intact as originally received, i.e. no escaping is performed on mboxes.
I understand the desire to combat spammers, so perhaps allow escaping to be configurable by the list administrator, so those projects which need this un-escaped mbox facility can still get them.
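A minimal sketch of what such a toggle could look like, using the "user at domain.tld" style of escaping mentioned earlier in this thread. The function name `render_for_export`, the `obscure_addresses` flag, and the simplified address pattern are all assumptions for illustration; they are not Mailman or HyperKitty settings.

```python
import re

# Deliberately simple pattern; real address syntax is more permissive.
ADDRESS_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def render_for_export(text: str, obscure_addresses: bool) -> str:
    """Return text for the mbox export, optionally obscuring addresses."""
    if not obscure_addresses:
        # Per-list opt-out: export the message exactly as received.
        return text
    return ADDRESS_RE.sub(r"\1 at \2", text)
```

With the flag on, a line such as `Signed-off-by: Jane Dev <jane@example.org>` becomes `Signed-off-by: Jane Dev <jane at example.org>`, which shows why patch-extraction tools need the unescaped variant.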
Comment (by abompard):
Hey @berrange, I haven't found this kind of behavior on the lists.fedoraproject.org or lists.fedorahosted.org mailing-lists. I wonder if it may be a local modification.
Since only those two domains are to be migrated soon, do you know of any lists there where people expect the original emails in the mbox archive?
hyperkitty-devel@lists.fedorahosted.org