Quarantining Malicious Outlook Attachments
Home | John | Connie | Publications | Software | Correspondence | NtropiX | NdustriX | NformatiX | NdeX | Thanks
The following seventeen-line procmail(1) script fragment can be used to quarantine most malicious Microsoft Outlook® attachments. In the fragment, the quarantine account is represented as quarantine@somedomain.com (which should be changed to suit your system). The quarantine account could be the address of an e-mail archive and retrieval system, such as the one included in the rel distribution at NformatiX, which uses SmartList (also available from the procmail site) for the archive functionality. SmartList is configurable through procmail scripts, for flexibility and extensibility, allowing different strategies to be implemented for handling quarantined messages: for example, accepting messages if they are from a local/corporate domain, and filing (or bouncing with a delivery refusal notice) e-mail with attachments from the rest of the Internet, perhaps with user notification that a message has been quarantined.

Description and Walk Through of the Script Fragment

The list of file name extensions that are executable (or can contain executable code) can be optimized for speed by building a tree, based on the first letter of the file name extensions, which is defined in a macro for substitution:

    ext='(a(d[ep]|r[cj]|s[dmxp]|u|vi)|\
    b(a[st]|mp|z[0-9]?)|\
    c(an|hm|il|lass|md|om|(p[lp]|\+\+)?|rt|sv)|\
    d(at|e?b|ll|o[ct])|\
    e(ml|ps?|xe)|\
    g(if|z?)|\
    h(lp|t(a|ml?)|(pp|\+\+)?)|\
    i(n[cfis]|sp)|\
    j(ava|pe?g|se?|sp|tmpl)|\
    kbf|\
    l(ha|nk|og|yx)|\
    m(d[abew]|p(e?g|[32])|s[cipt])|\
    ocx|\
    p(a(tch|s)|c[dsx]|df|h(p[0-9]?|tml?)|\
    if|[lm?]|n[gm]|[po][st]|p?s)|\
    r(a[mr]|eg|pm|tf)|\
    s(c[rt]|h([bs]|tml?)|lp|ql|ys)?|\
    t(ar|ex|gz|iff?|xt)|\
    u(pd|rl|x)|\
    vb[es]?|\
    w(av|m[szd]|p(d|[0-9]?)|s[cfh])|\
    x(al|[pb]m|l[stw])|\
    z(ip|oo)\
    )'

and includes the following file name extensions:

    .ade .adp .amp .arc .arj .asd .asm .asp .asx .au .avi .bas .bat .bz
    .bz0 .bz1 .bz2 .bz3 .bz4 .bz5 .bz6 .bz7 .bz8 .bz9 .c .c++ .can .chm
    .cil .class .cmd .com .cpl .cpp .csv .crt .dat .db .deb .dll .doc
    .dot .eml .ep .eps .exe .g .gif .gz .h .h++ .hlp .hpp .hta .htm
    .html .inc .ini .inp .ins .java .jpeg .jpg .js .jse .jsp .jtmpl .kbf
    .lha .lnk .log .lyx .mda .mdb .mde .mdw .mpeg2 .mpg2 .mpg3 .msc .msi
    .msp .mst .ocx .os .ot .pas .patch .pcs .pcx .pdf .phtm .phtml .php
    .php0 .php1 .php2 .php3 .php4 .php5 .php6 .php7 .php8 .php9 .pcd
    .pif .pl .plm .png .pnm .pps .ps .pt .ram .rar .reg .rpm .rtf .s
    .scr .sct .shb .shs .shtm .shtml .slp .sql .sys .tar .tex .tgz .tif
    .tiff .txt .upd .url .ux .vb .vbe .vbs .wav .wmd .wms .wmz .wp .wp0
    .wp1 .wp2 .wp3 .wp4 .wp5 .wp6 .wp7 .wp8 .wp9 .wpd .wsc .wsf .wsh
    .xal .xbm .xls .xlt .xlw .xpm .zip .zoo

Because RFC 1521 allows the filename token and the file's name to be separated by white space, including a newline, tabs, or spaces, a procmail definition of white space that spans lines is necessary:

    ws = '[	 ]*($[	 ]+)*'

Important: Note that the white space between both sets of square brackets consists of exactly one tab (hex 09) followed by exactly one space (hex 20).

RFC 1521 defines the file's name to be enclosed in a set of double quotation marks, '"', which is inconsistent with the way procmail handles double quotes in conditional statements, requiring a macro definition for substitution:

    dq = '"'

End of line (used in conditions with variable substitution):

    eol='$'

Encrypted and signed messages (there is a potential for signatures to carry executable programs, too), applications, files, and base64 attachments declared in the e-mail headers can all carry malicious programs, so such a message should be quarantined:

    #
    :0
    * 1^0 $ ^content-type:${ws}(multipart/(mixed|alternative|\
    application|signed|encrypted))|(application/)
    * 1^0 $ ^content-disposition:${ws}attachment;${ws}.*\
    name${ws}=${ws}${dq}.*\.${ext}(\..*)?${dq}${ws}${eol}
    * 1^0 $ ^content-transfer-encoding:${ws}base64
    ! quarantine@somedomain.com

The conditional statement:

    #
    :0 BE
    * -3^0
    * 4^0 $ name${ws}=${ws}${dq}.*\.${ext}(\..*)?${dq}${ws}${eol}
    * 4^0 $ begin${ws}[0-9]+${ws}.*\.${ext}(\..*)?${ws}${eol}
    * 4^0 $ ^content-type:${ws}application/
    * 4^0 $ ^content-transfer-encoding:${ws}base64
    * 2^0 [<](!doctype|[sp]?h(tml|ead)|title|body)
    * 2^0 [<](app|bgsound|div|embed|form|i?l(ayer|ink)|img|\
    i?frame(set)?|meta|object|s(cript|tyle))
    * 2^0 =3d
    ! quarantine@somedomain.com

operates as follows:
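The file name test shared by both recipes can be tried outside procmail. The following is only an illustrative sketch in Python, not part of the script: WS, DQ, EOL, and NAME_RE are hypothetical translations of the ws, dq, and eol macros and of the "name" condition, EXT is a deliberately short subset of the ext macro, and procmail's "$" inside a regular expression (a newline) is written as \n.

```python
import re

# Hypothetical Python translations of the procmail macros (assumptions,
# not the author's code).
WS = r"[\t ]*(?:\n[\t ]+)*"   # tab/space runs, optionally continued on folded lines
DQ = '"'                      # the double quote that delimits the file's name
EOL = r"$"                    # end of line (with re.MULTILINE)

# Short illustrative subset of the ext macro; the real macro is far longer.
EXT = r"(?:exe|pif|scr|vb[es]?|bat|com)"

# name${ws}=${ws}${dq}.*\.${ext}(\..*)?${dq}${ws}${eol}
NAME_RE = re.compile(
    r"name" + WS + "=" + WS + DQ + r".*\." + EXT + r"(?:\..*)?" + DQ + WS + EOL,
    re.IGNORECASE | re.MULTILINE,   # procmail matching is case-insensitive by default
)


def names_executable(text: str) -> bool:
    """Return True if text declares a quoted file name with a listed extension."""
    return NAME_RE.search(text) is not None


folded = 'Content-Disposition: attachment;\n\tfilename="readme.txt.pif"\n'
split = 'filename=\n "invoice.exe"'
print(names_executable(folded))                    # True
print(names_executable(split))                     # True
print(names_executable('filename="minutes.odt"'))  # False
```

The optional (\..*)? group is what lets a name such as "notes.doc.exe.zip" match: the listed extension does not have to be the last one in the file's name.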
Extension

The script fragment is compatible with the Stochastic UCE Detection procmail script, which is very effective at reducing the amount of commercial e-mail received by users. Also, Microsoft® executable attachments can be detected in messages by the howto-virus.txt procmail fragment, which is available on the ReceivedIP page.

Addendum

To evaluate the relative execution speed of the regular expression search mechanism used in procmail, a simple search (using the macro extensions listed above) of a 10 MB e-mail file was made using the following procmail construct in a file:

    :0 B:
    * $ name=${dq}.*${ext}${dq}
    {
    DUMMY=true
    }
    #
    :0
    /dev/null

and the command "procmail file < e-mail_file". This was compared against egrep(1) with the following expression file:

    name=".*\.ade"
    name=".*\.adp"
    name=".*\.amp"
    name=".*\.arc"
    name=".*\.arj"
    name=".*\.asd"
    name=".*\.asm"
    name=".*\.asp"
    name=".*\.asx"
    name=".*\.au"
    name=".*\.avi"
    name=".*\.bas"
    name=".*\.bat"
    name=".*\.bz"
    name=".*\.bz0"
    name=".*\.bz1"
    name=".*\.bz2"
    name=".*\.bz3"
    name=".*\.bz4"
    name=".*\.bz5"
    name=".*\.bz6"
    name=".*\.bz7"
    name=".*\.bz8"
    name=".*\.bz9"
    name=".*\.c"
    name=".*\.c++"
    name=".*\.can"
    name=".*\.chm"
    name=".*\.cil"
    name=".*\.class"
    name=".*\.cmd"
    name=".*\.com"
    name=".*\.cpl"
    name=".*\.cpp"
    name=".*\.crt"
    name=".*\.csv"
    name=".*\.dat"
    name=".*\.db"
    name=".*\.deb"
    name=".*\.dll"
    name=".*\.doc"
    name=".*\.dot"
    name=".*\.eml"
    name=".*\.ep"
    name=".*\.eps"
    name=".*\.exe"
    name=".*\.g"
    name=".*\.gif"
    name=".*\.gz"
    name=".*\.h"
    name=".*\.h++"
    name=".*\.hlp"
    name=".*\.hpp"
    name=".*\.hta"
    name=".*\.htm"
    name=".*\.html"
    name=".*\.inc"
    name=".*\.inf"
    name=".*\.ini"
    name=".*\.isp"
    name=".*\.ins"
    name=".*\.java"
    name=".*\.jpeg"
    name=".*\.jpg"
    name=".*\.js"
    name=".*\.jse"
    name=".*\.jsp"
    name=".*\.jtmpl"
    name=".*\.kbf"
    name=".*\.lha"
    name=".*\.lnk"
    name=".*\.log"
    name=".*\.lyx"
    name=".*\.mda"
    name=".*\.mdb"
    name=".*\.mde"
    name=".*\.mdw"
    name=".*\.mpeg2"
    name=".*\.mpg2"
    name=".*\.mpg3"
    name=".*\.msc"
    name=".*\.msi"
    name=".*\.msp"
    name=".*\.mst"
    name=".*\.ocx"
    name=".*\.os"
    name=".*\.ot"
    name=".*\.pas"
    name=".*\.patch"
    name=".*\.pcd"
    name=".*\.pcs"
    name=".*\.pcx"
    name=".*\.pdf"
    name=".*\.phtm"
    name=".*\.phtml"
    name=".*\.php"
    name=".*\.php0"
    name=".*\.php1"
    name=".*\.php2"
    name=".*\.php3"
    name=".*\.php4"
    name=".*\.php5"
    name=".*\.php6"
    name=".*\.php7"
    name=".*\.php8"
    name=".*\.php9"
    name=".*\.pif"
    name=".*\.pl"
    name=".*\.plm"
    name=".*\.png"
    name=".*\.pnm"
    name=".*\.pps"
    name=".*\.ps"
    name=".*\.pt"
    name=".*\.ram"
    name=".*\.rar"
    name=".*\.reg"
    name=".*\.rpm"
    name=".*\.rtf"
    name=".*\.s"
    name=".*\.scr"
    name=".*\.sct"
    name=".*\.shb"
    name=".*\.shs"
    name=".*\.shtm"
    name=".*\.shtml"
    name=".*\.slp"
    name=".*\.sql"
    name=".*\.sys"
    name=".*\.tar"
    name=".*\.tex"
    name=".*\.tgz"
    name=".*\.tif"
    name=".*\.tiff"
    name=".*\.txt"
    name=".*\.upd"
    name=".*\.url"
    name=".*\.ux"
    name=".*\.vb"
    name=".*\.vbe"
    name=".*\.vbs"
    name=".*\.wav"
    name=".*\.wmd"
    name=".*\.wms"
    name=".*\.wmz"
    name=".*\.wp"
    name=".*\.wp0"
    name=".*\.wp1"
    name=".*\.wp2"
    name=".*\.wp3"
    name=".*\.wp4"
    name=".*\.wp5"
    name=".*\.wp6"
    name=".*\.wp7"
    name=".*\.wp8"
    name=".*\.wp9"
    name=".*\.wpd"
    name=".*\.wsc"
    name=".*\.wsf"
    name=".*\.wsh"
    name=".*\.xal"
    name=".*\.xbm"
    name=".*\.xls"
    name=".*\.xlt"
    name=".*\.xlw"
    name=".*\.xpm"
    name=".*\.zip"
    name=".*\.zoo"

using the command "egrep -is -f file e-mail_file > /dev/null". On a 433 MHz Pentium class machine, procmail took 0.229 machine seconds of CPU time, and egrep(1) took 0.432 machine seconds. Removing "name=".*\" from the file, and using fgrep(1), the time required was 0.475 machine seconds. Note that the regular expression search construction in all three cases was modified to accommodate the egrep(1) and fgrep(1) restriction that regular expressions cannot span lines.

Appendix I

There is an issue with the way Outlook parses MIME headers. The BadTrans.B worm, for example, contains the MIME e-mail header construct:

    MIME-Version: 1.0
    Content-Type: multipart/related;
    type="multipart/alternative";
    boundary="====_ABC1234567890DEF_===="

which is a violation of RFC 822, Section 3.1.1 (the last two records, type= and boundary=, have no preceding linear-white-space). A properly constructed e-mail header parser (for example, the MIME reference code used in metamail(1)) would not consider such a message to have attachments, and would potentially pass attachments that contain malicious code on to Outlook for execution. A safe (and possibly conservative) alternative is to search the body of the message for the "name" and/or "begin" tags, followed by a file name extension that can contain potentially malicious code. However, there is a substantial performance impact with the implementation: procmail's regular expression search algorithm requires about one CPU second per MByte of file size on a 466 MHz Pentium class machine to execute the fragment.

Appendix II

As outlined in Microsoft Security Bulletin (MS00-075), there was a potential for an e-mail to contain an HTML link, luring an Outlook user to execute a script containing malicious code on a rogue site. Although there has been a fix for over a year, there have been numerous reports submitted to SecurityFocus' BugTraq® mailing list that the problem has not been resolved, and that Internet Explorer® (which is called from Outlook to render HTML in an e-mail) is capable of executing malicious code disguised as jpg or gif images. This has been denied by Scott Culp, Security Program Manager, Microsoft Security Response Center, in an e-mail to the BugTraq mailing list, 29 July, 2001. A safe (and possibly conservative) approach is to search the body of HTML messages for links to images, scripts, etc. The fragment will not quarantine strict HTML 4.0 compliant messages without links. Since the body of the message is already searched as per Appendix I, the additional performance impact is minimal.
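The effect of the three HTML-related conditions can be checked by hand. The following Python sketch is an illustration only: the weights and patterns are copied from the "2^0" lines of the body recipe, the names html_score and would_quarantine are hypothetical, and it assumes procmail's weighted scoring, where a "w^0" condition adds w once if it matches and the recipe fires when the total is positive. The four "4^0" conditions are left out, so this covers only messages that contain no attachment markers.

```python
import re

# Weights and patterns copied from the three "2^0" conditions of the recipe.
HTML_CONDITIONS = [
    (2, r"[<](!doctype|[sp]?h(tml|ead)|title|body)"),
    (2, r"[<](app|bgsound|div|embed|form|i?l(ayer|ink)|img|"
        r"i?frame(set)?|meta|object|s(cript|tyle))"),
    (2, r"=3d"),
]
BASE_SCORE = -3  # the "* -3^0" line: every message starts three points down


def html_score(body: str) -> int:
    """Add each condition's weight once if its pattern occurs anywhere in body."""
    score = BASE_SCORE
    for weight, pattern in HTML_CONDITIONS:
        if re.search(pattern, body, re.IGNORECASE):
            score += weight
    return score


def would_quarantine(body: str) -> bool:
    """Assumed procmail rule: the recipe's action runs when the score is positive."""
    return html_score(body) > 0


plain = "<html><head><title>Hi</title></head><body><p>Hello</p></body></html>"
with_image = '<html><body><img src="http://example.invalid/a.gif"></body></html>'
print(html_score(plain), would_quarantine(plain))            # -1 False
print(html_score(with_image), would_quarantine(with_image))  # 1 True
```

Under these assumptions, any two of the three conditions are enough to quarantine a message, while a single one is not, which is consistent with plain HTML without links being passed through.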
Thanks

A special note of appreciation to Stephen R. van den Berg (AKA BuGless), the author of procmail, who for nine years developed and supported the procmail program (the "e-mail system administrator's crescent wrench") for the Internet community. And a special thanks to Philip Guenther, the current maintainer of procmail and moderator of the procmail mailing list, for providing the search optimization for the procmail "recipe" described above.

License

A license is hereby granted to reproduce this software for personal, non-commercial use. THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE. THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN ANY COUNTRY. So there.

Copyright © 1992-2005, John Conover, All Rights Reserved.

Comments and/or problem reports should be addressed to: