Current version is 3.12 (Sep 1999)
MIMER - program to make an e-mail message MIME compliant

Description Synopsis Running How it works Configuring Examples Bugs Download Comments

Description

  This program processes e-mail messages and converts them into valid MIME (Multipurpose Internet Mail Extensions) format. It analyzes character statistics up to the second order (i.e. it calculates character-pair correlations) and writes the correct charset value into the message headers.

  This program is free and is distributed under the terms of the GNU General Public License. It is built with DJGPP, the MS-DOS port of GCC. It has been tested under MS-Windows 3.1/95/NT and under plain MS-DOS (cwsdpmi from the original DJGPP distribution is required if no external DPMI host is available).

  This program was originally developed to help Pegasus Mail for Windows handle Russian e-mail messages. Some messages (the de facto standard for Russian Internet mail is KOI8) simply lack these three lines:

 MIME-Version: 1.0
 Content-Type: text/plain; charset=koi8-r
 Content-Transfer-Encoding: 8bit
Others, mostly from Microsoft-oriented users, have these lines, but the real contents are in the ANSI codepage. There is no way to enumerate every possible error in a simple e-mail ("It is impossible to make anything foolproof because fools are so ingenious." - Murphy). MIMER is meant to make the user's life easier in such cases. It is easily configurable and can be set up to handle other languages and charsets. It is also supplied with several helper programs that make Pegasus Mail and UUPC/@ work together.
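
As a rough illustration of the simplest fix described above, the following Python sketch adds the three lines to a header block that lacks them. It is not MIMER's actual code (MIMER is built with DJGPP); the function name ensure_mime_headers and the default values are assumptions made for this example, and it assumes LF line endings and an empty line between header and body.

```python
# Hypothetical defaults, taken from the three lines quoted in the text.
MIME_DEFAULTS = [
    ("MIME-Version", "1.0"),
    ("Content-Type", "text/plain; charset=koi8-r"),
    ("Content-Transfer-Encoding", "8bit"),
]


def ensure_mime_headers(message: bytes, defaults=MIME_DEFAULTS) -> bytes:
    """Append any missing MIME header lines; the body is left untouched."""
    header, sep, body = message.partition(b"\n\n")

    # Collect the field names that are already present (case-insensitive).
    present = set()
    for line in header.split(b"\n"):
        # Continuation lines start with whitespace and carry no field name.
        if line[:1] in (b" ", b"\t") or b":" not in line:
            continue
        present.add(line.split(b":", 1)[0].strip().lower())

    extra = [
        name.encode("ascii") + b": " + value.encode("ascii")
        for name, value in defaults
        if name.lower().encode("ascii") not in present
    ]
    if extra:
        header = header + b"\n" + b"\n".join(extra)
    return header + sep + body
```

Calling the function twice on the same message changes nothing the second time, because the headers are then already present. Correcting a wrong charset value, the second case mentioned above, needs the statistical detection and is not covered by this sketch.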

Synopsis

mimer [-d] [-D] [-hcustom header] [-ooutput directory] file1 [file2 ...]

How to run

  In the simplest case you can type something like mimer.exe mail.txt. The file mail.txt will be parsed and the result will be placed into the default directory (which can be overridden with the -o option). MIMER will try to create a unique file name in the destination directory. You can specify the -d or -D switch to delete or keep the input file (mail.txt), respectively. Finally, you can use the -h option to make MIMER add a custom header line to the message. Usually it is something like "X-Developed-by-Mimer:", which can be used to mark messages that have already been converted.
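
How MIMER picks its unique file name is not described here. One common approach, shown as a Python sketch, is to try numbered names and rely on exclusive creation. The function name create_unique_file, the "msg" stem and the numbering scheme are all invented for the example; only the .cnm extension echoes the usual default mentioned in the configuration section.

```python
import os


def create_unique_file(directory, stem="msg", ext=".cnm", limit=10000):
    """Create and open a file whose name is not yet used in `directory`.

    Returns (path, open binary file object).
    """
    for n in range(limit):
        path = os.path.join(directory, f"{stem}{n:04d}{ext}")
        try:
            # Mode "x" fails if the name already exists, so two processes
            # can never end up writing to the same output file.
            return path, open(path, "xb")
        except FileExistsError:
            continue
    raise RuntimeError("no free file name in " + directory)
```

Creating the file and checking for its existence in one step avoids the gap between "does this name exist?" and "create it" that a separate check would leave open.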

How it works

  For each input file (you can use Unix-like wildcards; directory names are ignored and no directory recursion is implemented - is it necessary at all?) MIMER tries to parse all its MIME parts recursively. For each part it attempts to add (or modify) the necessary headers. No changes are made to the message body; only the message header is affected. When the message is completely developed, it is written to the destination directory under a unique name.

  If MIMER detects another instance of itself working in the same directory (this can happen in a multitasking environment), it generates a task-list file and terminates. The first MIMER instance is supposed to detect this task list and develop the messages on behalf of the terminated one.

  There is an additional short program, MimHelp, intended to improve MIMER performance in some cases. It takes exactly one argument, a file name (without wildcards). To explain what it is for, one has to know how Pegasus Mail deals with mail filtering. Suppose we have the following filtering rules for the new-mail folder (these are real rules from my own installation):
 If not expression headers matches "X-Developed-by-Mimer:" Run "F:\\EXE\\mimhelp.exe "
 If not expression headers matches "X-Developed-by-Mimer:" Delete ""
So, according to the first rule, Pegasus tries to execute MimHelp for each undeveloped new message. MimHelp, in turn, adds the message to the task list if a MIMER instance is already running, or spawns a new MIMER otherwise. MimHelp is very small, so it starts much faster than MIMER - this really matters if you have 50 new messages in your new-mail folder.
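
The hand-off between instances can be pictured with the following Python sketch. It is only an illustration of the idea: the text does not say how MIMER detects a running instance, so the lock file, the file names mimer.lck and mimer.tsk, and the function submit are all assumptions.

```python
import os

LOCK_NAME = "mimer.lck"   # hypothetical marker of a running instance
TASKS_NAME = "mimer.tsk"  # hypothetical task-list file


def submit(workdir, message_path, process):
    """Process `message_path` now, or queue it for the running instance.

    `process` is a callable that develops one message.
    Returns "queued" or "processed".
    """
    lock = os.path.join(workdir, LOCK_NAME)
    tasks = os.path.join(workdir, TASKS_NAME)
    try:
        # Exclusive creation: succeeds only for the first instance.
        open(lock, "x").close()
    except FileExistsError:
        # Someone else is working here: leave a task and terminate.
        with open(tasks, "a") as f:
            f.write(message_path + "\n")
        return "queued"
    try:
        process(message_path)
        # Pick up work left behind by instances that started meanwhile.
        while os.path.exists(tasks):
            drained = tasks + ".run"
            os.replace(tasks, drained)
            with open(drained) as f:
                queued = [line.rstrip("\n") for line in f if line.strip()]
            os.remove(drained)
            for path in queued:
                process(path)
    finally:
        os.remove(lock)
    return "processed"
```

The sketch still has a small window between the last check of the task list and the removal of the lock, during which a queued task would be missed; a real implementation has to close it.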

  MIMER contains extensive language-specific information. Russian is used here just as an example; all this burned-in information can easily be modified with MimPATCH (see the next section). MIMER keeps several character sets: koi8-r, cp-866, cp-1251 and x-bad-kw. The first charset in the list is special and is referred to below as the main charset. Each charset is defined by its name and a 256-byte conversion table from that charset to the main one. Naturally, the first conversion (main to main) should be the identity (this is not quite so for Russian, but that is a second-order effect). Another language-dependent table is the frequency table, which is used to determine the statistically most probable character set from the known list. This table is kept in the main charset and should be collected as statistics from a large text in the main codepage.
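
The two tables can be illustrated with a short Python sketch. It uses Python's built-in koi8_r, cp866 and cp1251 codecs in place of MIMER's burned-in conversion tables, leaves out x-bad-kw, and scores candidates with a smoothed sum of log pair frequencies. The scoring formula and all function names are assumptions of this example; the text only states that character-pair statistics are used.

```python
import math
from collections import Counter

MAIN = "koi8_r"                             # the main charset
CANDIDATES = ["koi8_r", "cp866", "cp1251"]  # the known list (x-bad-kw omitted)


def table_to_main(charset, main=MAIN):
    """Build the 256-byte table that converts `charset` to the main charset."""
    table = bytearray(range(256))  # 7-bit characters map to themselves
    for b in range(128, 256):
        try:
            table[b] = bytes([b]).decode(charset).encode(main)[0]
        except UnicodeError:
            table[b] = ord("?")    # no counterpart in the main charset
    return bytes(table)


def gather_pairs(sample: bytes) -> Counter:
    """Count adjacent pairs of 8-bit characters in a main-charset sample."""
    return Counter(
        (a, b) for a, b in zip(sample, sample[1:]) if a >= 128 and b >= 128
    )


def score(data: bytes, table: bytes, pairs: Counter, total: int) -> float:
    """Log-likelihood of `data` if it were written in the table's charset."""
    s = 0.0
    for a, b in zip(data, data[1:]):
        if a >= 128 and b >= 128:  # only 8-bit pairs carry information
            # Convert the pair to the main charset, then look it up;
            # "+ 1" keeps unseen pairs from producing log(0).
            s += math.log((pairs[(table[a], table[b])] + 1) / (total + 1))
    return s


def detect(data: bytes, pairs: Counter, candidates=CANDIDATES) -> str:
    """Return the candidate charset under which `data` looks most probable."""
    total = sum(pairs.values())
    return max(
        candidates,
        key=lambda cs: score(data, table_to_main(cs), pairs, total),
    )
```

In this sketch gather_pairs plays the role of the statistics collection from a large text, and detect is applied to the 8-bit text of one message part. With these codecs the table for the main charset itself comes out as the identity.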

How to configure

  All configuration information for MIMER is stored inside the executables. A special program, MimPATCH, is used to configure MIMER. It is invoked with the following syntax:

mimpatch [-i] [-d] [-D] -f file to be patched [-F file to get patch from] [-r reliable charset list] [-u unreliable charset list] [-g file to gather statistics from] [-G file to gather statistics from] [-c charset] [-C charset] [-x extension for output files] [-H path to MimHelp.EXE file] [-h custom header] [-o destination directory]

Some comments on the options. -i causes MimPATCH to print full information from the patched file. The full path to MIMER should be specified with the -f option. Options -d, -D, -o and -h are inherited from MIMER itself; here they set the default behavior. -F copies patchable data from another file, and -x sets the default extension for output files (usually .CNM). -H specifies the path to the MimHelp.exe file to be patched. With options -c and -C one can add or remove charsets from the known list. To set up MIMER for a particular language, one has to teach MIMER how to recognize it; this is done with the -g and -G options. The last two options, -r and -u, deal with the concept of "reliable" and "unreliable" charsets. One might want MIMER to trust some of the charset values found in the original message; these values should be listed after the -r (reliable) option. Alternatively, one may ask MIMER to trust all charset values except a few; these are listed after the -u (unreliable) option. Both lists are regular expressions as defined by POSIX 1003.2.

One more program comes in the same bundle: UU2PM. Its only task is to extract individual messages from the UUPC/@ mailbox and put them into the Pegasus Mail user's home directory.

Examples

mimpatch -f f:/exe/mimer.exe -i
  Shows the configurable information from f:/exe/mimer.exe.

mimpatch -f mimer.exe -F c:/datablock.bin -i
  Picks a patch from c:/datablock.bin and patches mimer.exe with it. Information is shown.

mimpatch -f mimer.exe -F c:\mimer.exe -iDh X-Developed-by-Mimer: -o E:\mail\leva -H f:\exe\mimhelp.exe
  Patches the files mimer.exe and f:\exe\mimhelp.exe, based on c:\mimer.exe, with the following modifications: input files are NOT deleted after processing, the output path is set to "e:\mail\leva", and the custom header is set to "X-Developed-by-Mimer". Information is shown.

mimpatch -f mimer.exe -c x-cyr-wry -i
  Takes the file "x-cyr-wry" (it should be a simple 256-byte array) and adds the encoding "x-cyr-wry". Information is shown.

mimpatch -f mimer.exe -g War_n_Peace.txt -i
  Takes the (large) file War_n_Peace.txt as a sample of the language. Statistics are collected and burned into mimer.exe. Information is shown.

mimpatch -f mimer.exe -r "^koi"
  Makes MIMER rely on the "koi8-r", "koi8-u", and "koi8" charsets, but not on "x-koi8-r".

mimpatch -f mimer.exe -u "iso"
  Makes MIMER think that "iso-8859-1" and "x-iso" are unreliable, while "koi8-r" is reliable.
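
The effect of the -r and -u patterns in the last two examples can be checked with a small Python sketch. Python's re module is not a POSIX 1003.2 implementation, but for patterns as simple as these it behaves the same way. The function is_reliable, the case-insensitive matching, and the behavior when no list is configured are assumptions of the example, not documented MIMER behavior.

```python
import re


def is_reliable(charset, reliable=None, unreliable=None):
    """Decide whether a declared charset value should be trusted.

    `reliable` models the -r list (trust only values that match);
    `unreliable` models the -u list (trust everything except matches).
    """
    if reliable is not None:
        return re.search(reliable, charset, re.IGNORECASE) is not None
    if unreliable is not None:
        return re.search(unreliable, charset, re.IGNORECASE) is None
    return False  # assumption: with no list configured, nothing is trusted


# The two examples above:
print(is_reliable("koi8-u", reliable="^koi"))    # True
print(is_reliable("x-koi8-r", reliable="^koi"))  # False
print(is_reliable("x-iso", unreliable="iso"))    # False
print(is_reliable("koi8-r", unreliable="iso"))   # True
```

The anchor "^" is what excludes "x-koi8-r" in the first example; without it the pattern "koi" would match anywhere in the value, just as "iso" matches inside "x-iso".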

Bugs

  Only plain (8-bit) text is used in charset recognition. Neither a base64 nor a quoted-printable decoder is implemented.

Download

 Sources for MIMER can be downloaded as one zip-file mimer_s.zip. They can be compiled with DJGPP 2.x. RHIDE project files are attached.

 Prebuilt executables are available for the Russian language. They are known to be rather good at distinguishing between the koi8-r, cp-1251, cp-866, and broken (cp1251->koi)^2 (named here "x-bad-kw") encodings. MIMER can easily be customized for other languages and charsets: the supplied program mimpatch.exe is able to gather language statistics from any huge plain-text file. Download mimer_b.zip for the complete set of mimer.exe, mimhelp.exe, mimpatch.exe, and uu2pm.exe files.

Comments?

Do you have any comments, suggestions or bug reports? Feel free to mail me.

leva
Back to Lev Melnikovsky's home page
