I would like to know if anyone here knows of an archival tool or
script that can be used to back up new articles into mbox format, similar
to what exists on archive.org. I'm not much of a programmer, but before I
spend the time and effort to cobble together a bash script to do this, I
want to see if someone here already has something that can do that.
No need to re-invent the wheel, etc. Thanks.
I'm using slrn as my news agent. I simply use '#' to tag the
messages/threads I want to archive, and then press 'o' to select the mbox
file in which to save them (this could be automated further with S-Lang
macros in slrn, if one wants).
That covers archive.org-style saving, where you choose what you want to save.
However, I'm running my own INN news server using the tradspool storage
method. I want to be able to create an automated monthly archive
of every article on my server and dump that to mbox files, one for each
newsgroup. Doing it through slrn would be way too much of a headache.
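As a rough starting point for that kind of job, here is a minimal Python sketch that only finds the articles for one month. It assumes the usual tradspool layout (one directory per newsgroup with the dots turned into slashes, and one numerically named file per article) and treats each file's modification time as its arrival time, which may not hold on every server. `monthly_articles` and the spool path in the usage note are made-up names, not anything that ships with INN.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def monthly_articles(spool_root, year, month):
    """Yield (newsgroup, path) for articles last modified in the given UTC month."""
    root = Path(spool_root)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root)
        if not rel.parts:
            continue  # files directly in the spool root belong to no group
        # Assumption: comp/lang/misc on disk means the group comp.lang.misc.
        group = ".".join(rel.parts)
        # Assumption: article files are named by article number only.
        numbered = sorted((n for n in filenames if n.isdecimal()), key=int)
        for name in numbered:
            path = Path(dirpath) / name
            stamp = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
            if (stamp.year, stamp.month) == (year, month):
                yield group, path
```

Something like `for group, path in monthly_articles("/var/spool/news/articles", 2021, 8): ...` would then hand each article to whatever writes the per-group mbox file. The spool path here is a guess and depends on the local `patharticles` setting.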
Jason Evans <jsevans@mailfence.com> writes:
> I would like to know if anyone here knows about an archival
> tool or script that can be used to back up new articles into
> mbox format. Similar to what exists in archive.org. I'm not
> much of a programmer but before I spend the time and effort to
> cobble together a bash script to do this, I want to see if
> someone here already has something that can do that already.
> No need to re-invent the wheel, etc.
The archive program that comes with INN and can be configured
as a feed in newsfeeds is fairly close to what you want except
that when you configure it to store multiple messages in a
single file, it uses a custom separator rather than doing From
escaping and inserting a mailbox From.
You could run it in its default mode where it saves each
individual message to a file and then separately run some other
program to convert a directory full of files to a mailbox. I
suspect most of the Google hits for "maildir2mbox" would do it,
since a maildir is essentially a directory full of messages.
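A minimal sketch of that second step, assuming the articles have already been written one per file into a directory. It uses Python's standard `mailbox` module rather than any particular maildir2mbox script; `dir_to_mbox` and the file names in the usage note are hypothetical. When handed raw bytes, `mailbox.mbox.add()` writes a placeholder "From MAILER-DAEMON ..." separator line and escapes body lines that begin with "From ".

```python
import mailbox
from pathlib import Path


def dir_to_mbox(article_dir, mbox_path):
    """Append every regular file in article_dir to mbox_path; return the count."""
    box = mailbox.mbox(str(mbox_path))  # the file is created if it is missing
    count = 0
    try:
        # Plain name order, so a file named "10" sorts before one named "2".
        for path in sorted(Path(article_dir).iterdir()):
            if path.is_file():
                # Given raw bytes, add() writes a "From MAILER-DAEMON <date>"
                # separator and turns body lines starting with "From " into ">From ".
                box.add(path.read_bytes())
                count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

A call such as `dir_to_mbox("archive/news.software.nntp", "news.software.nntp.mbox")` would append to an existing mbox file rather than overwrite it, which suits a job that runs once a month.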
The formail tool that comes with procmail may be worth looking at in
this context too.
On 8/23/21 6:29 PM, Ted Heise wrote:
> The formail tool that comes with procmail may be worth looking
> at in this context too.
formail is a very nice tool. I use the crap out of it,
particularly in procmail recipes and commands querying
messages. But I thought that it split mbox / archives into
multiple discrete messages, not the other way around which is
my understanding of the OP's need. If I'm mistaken, please
correct me.
On Mon, 23 Aug 2021 22:24:30 -0600,
Grant Taylor <gtaylor@tnetconsulting.net> wrote:
> On 8/23/21 6:29 PM, Ted Heise wrote:
>> The formail tool that comes with procmail may be worth looking
>> at in this context too.
>
> formail is a very nice tool. I use the crap out of it,
> particularly in procmail recipes and commands querying
> messages. But I thought that it split mbox / archives into
> multiple discrete messages, not the other way around which is
> my understanding of the OP's need. If I'm mistaken, please
> correct me.
Ah Grant, I think you are correct. Thanks for setting things
straight!
I believe that formail habitually /appends/ new headers to the existing
headers. Seeing as how the From line is used as a message separator in
mbox format, it /MUST/ be the first header. I'm not sure that formail
in and of itself can /prepend/ / insert a header at the start of a
message.
formail can be used to add mail headers such as From.... You could
process through formail and then just 'cat' straight to an mbox file.
Grant Taylor <gtaylor@tnetconsulting.net> writes:
> I believe that formail habitually /appends/ new headers to the existing
> headers. Seeing as how the From line is used as a message separator in
> mbox format, it /MUST/ be the first header. I'm not sure that formail
> in and of itself can /prepend/ / insert a header at the start of a
> message.
Another problem is that the line used as a message separator is not a
header (it doesn't have a colon). It starts with "From " and is a weird
special case left over from mbox's odd legacy format.

A more subtle problem is that Usenet messages don't escape lines that
start with "From " in the body of the message, but this is mandatory when
storing messages in mbox format or a body line might be mistaken for the
start of a new message. Conventionally this is done by prepending > to
the line starting with "From ". There are other approaches, but one needs
to do something about this. The maildir2mbox programs will handle this
case (or should).
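The two points above (a separator line that is not a header, and escaping of body lines) can be sketched in a few lines of Python. This is only an illustration of the plain ">From " convention, not what any particular maildir2mbox program does; `escape_from_lines`, `mbox_entry` and the MAILER-DAEMON placeholder sender are assumptions made for the example.

```python
import time


def escape_from_lines(article: bytes) -> bytes:
    """Prefix '>' to every line that starts with 'From '."""
    # Header lines start with "From:" (colon, no space), so in a well-formed
    # article only body lines can match.
    lines = article.split(b"\n")
    return b"\n".join(b">" + ln if ln.startswith(b"From ") else ln for ln in lines)


def mbox_entry(article: bytes, sender: str = "MAILER-DAEMON", when=None) -> bytes:
    """Frame one article as an mbox entry: separator, escaped text, blank line."""
    stamp = time.asctime(time.gmtime(when))  # when=None means "now"
    separator = f"From {sender} {stamp}\n".encode("ascii")
    text = escape_from_lines(article.replace(b"\r\n", b"\n"))
    if not text.endswith(b"\n"):
        text += b"\n"
    return separator + text + b"\n"
```

Concatenating the result of `mbox_entry` for each article should give a file in which only the separator lines begin with "From ".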
I've been using alpine/pine forever. It doesn't parse for "nlFrom_" but
a line resembling ENVELOPE FROM. All these years of use, I can't say
it's ever mistaken a line in the body for a separator line.
Russ Allbery <eagle@eyrie.org> wrote:
> Grant Taylor <gtaylor@tnetconsulting.net> writes:
>> I believe that formail habitually /appends/ new headers to the
>> existing headers. Seeing as how the From line is used as a
>> message separator in mbox format, it /MUST/ be the first
>> header. I'm not sure that formail in and of itself can
>> /prepend/ / insert a header at the start of a message.
>
> Another problem is that the line used as a message separator is
> not a header (it doesn't have a colon). It starts with "From "
> and is a weird special case left over from mbox's odd legacy
> format.
>
> A more subtle problem is that Usenet messages don't escape
> lines that start with "From " in the body of the message, but
> this is mandatory when storing messages in mbox format or a
> body line might be mistaken for the start of a new message.
> Conventionally this is done by prepending > to the line
> starting with "From ". There are other approaches, but one
> needs to do something about this. The maildir2mbox programs
> will handle this case (or should).
The conventional mbox separator line is created from ENVELOPE
FROM.
If I archive an article from Usenet, it's to an mbox just so I
can read it with a mail client. If I have to do it manually, I
just copy and paste a separator line that is recognized.
"Adam H. Kerman" <ahk@chinet.com> writes:
> I've been using alpine/pine forever. It doesn't parse for "nlFrom_" but
> a line resembling ENVELOPE FROM. All these years of use, I can't say
> it's ever mistaken a line in the body for a separator line.
I agree that the chances of this being a problem given a sufficiently
picky parser are low, but they're still not zero, since there is no
protocol reason why a Usenet article cannot contain a line like:
From foo@example.com Wed Aug 25 13:33:52 2021
in, for example, a discussion of envelope From lines. :) So when
contemplating archive software that one wants to just work and not have to
think about, ideally it should cope with this.
The old Babyl format solves this problem, but alas never caught on in the
UNIX world.
From foo@example.com Wed Aug 25 13:33:52 2021
For what it's worth, I saved the above message to my pine mbox
file and got the below. So somewhere, somehow, the bare From line is
getting changed in that process.
Grant Taylor <gtaylor@tnetconsulting.net> writes:
> I believe that formail habitually /appends/ new headers to the existing
> headers. ...
who has run a news message through formail to see what happens.