Ultimately, any file can be used maliciously. No matter how innocent and unremarkable a file format is, attackers can probably malform it in some way to make it malicious. Even plaintext files can be used maliciously. In the days of DOS, an attacker could send a victim a pure ASCII text file that, when read, formatted the user's hard drive. It worked because a driver file called Ansi.sys would convert embedded keyboard control characters into their action-based counterparts. The attacker would embed commands that remapped the user's keyboard so that the next key pressed formatted the hard drive. These types of attacks were called ANSI-bombs.
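To make the ANSI-bomb idea concrete, here is a minimal Python sketch of a scanner that flags keyboard-reassignment sequences in a supposedly plain text file. It assumes the sequence has the shape ESC [ parameter ; parameter ; ... p, with each parameter either a number or a double-quoted string. That shape is recalled from Ansi.sys documentation rather than taken from this chapter, and the function names are illustrative only.

```python
import re

# Assumed shape of an Ansi.sys key reassignment: ESC [ param ; param ; ... p,
# where each param is a decimal number or a double-quoted string.
KEY_REMAP = re.compile(rb'\x1b\[(?:\d+|"[^"]*")(?:;(?:\d+|"[^"]*"))*p')


def find_key_remaps(data: bytes) -> list[bytes]:
    """Return every byte sequence in data that looks like a key reassignment."""
    return KEY_REMAP.findall(data)


def is_suspicious_text(data: bytes) -> bool:
    """True if a supposedly plain-text payload embeds a key reassignment."""
    return bool(find_key_remaps(data))
```

Ordinary color or cursor sequences end in other letters and are not flagged, so the check is narrow by design. A real scanner would need to handle more quoting variants than this sketch does.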
Text files and text editors (such as Notepad and Wordpad) can be used maliciously to overwrite legitimate text files (e.g., Autoexec.bat, Win.ini, Hosts, etc.). Another text file trick, more common years ago but still possible today, is for attackers to create text files that, when fed through Debug.exe, create malicious executables. Because antivirus programs and other types of security scanners will often flag malicious code being copied onto a new host, attackers can sneak the malicious file by as a "harmless" ASCII text file. With this trick, the text file contains the ASCII representations of machine-language commands. Figure 5-1 shows the Qaz trojan/worm (http://securityresponse.symantec.com/avcenter/venc/data/w32.hllw.qaz.a.html) opened in Debug. The ASCII text file would look something like the two bottom middle columns of hexadecimal characters. When assembled with Debug, it would be converted into machine-language instructions, as shown at the middle top of Figure 5-1.
Figure 5-1
Prior to assembling, a hacker could copy the ASCII source file from their machine to the victim's machine. Most antivirus scanners don't even attempt to scan text files. Even if they did, because the program is only ASCII text, the scanner would read the contents only as literal characters, as if it were reading a word processing document. Once past the scanners, the hacker would type in something like the following:
Debug <qaz.txt
The outputted program would be called Qaz.com. It could then be executed on the victim's machine, or a command could be added to one of the Windows startup files to execute the program when Windows restarts. The debug method works even when the user is a non-admin. This is because all users have Read and Execute access to the Debug.exe command.
Even pure data and graphic files can be used. Twenty years ago, security experts used to say that data couldn't infect a PC. That's before macro and script viruses came along and before malicious programmers learned to malform picture files into buffer overflow exploits. If every file type can be made malicious, a better question is what file types are considered high-risk files? That is, what file types have been used maliciously or will likely be used maliciously? High-risk files must meet some basic prerequisites before the actual flaws are even considered:
The file type must be popular among a large class of users, ensuring that software that will be interpreting the file is commonly installed on a large percentage of computers.
The file type is more likely to be used for malicious purposes by attackers than by authorized users for legitimate purposes.
It must have a high incidence of malicious use, even if most of its use is legitimate.
Once those prerequisites are satisfied, the file type must contain a flaw or allow malicious use.
Files can be flawed by design or misused to cause unintended consequences.
Developers often create vulnerable file formats or allow the programs that run them to have insecure functionality.
For instance, when Microsoft's Windows Scripting Host first came out, any script file ending in .Hta, .Ws, or .Cs would automatically be executed by Wscript.exe or Cscript.exe when double-clicked or downloaded. Hta files could automatically launch in Internet Explorer and then have complete access to the system (under the user's logon credentials). When Windows Scripting Host first came out, there was no runtime message to warn the user, and scripts were allowed to modify system values and information without recourse. Zero security was built into Windows Scripting Host and there wasn't any for nearly a year. Hta and other scripting files became a hacker's dream. Visual Basic Script (.Vbs) files had similar treatment early on. Macro viruses became popular in the mid-1990s because of the same design flaw — no security risk evaluation was ever performed before the software was released.
Another common problem is that the file format contains unexpected, exploitable holes that allow malicious use. For example, hackers frequently malform graphic file formats (e.g., PNG, GIF, JPG, ICO, etc.) so that the related rendering programs error out and initiate a buffer overflow. Open the wrong picture today and your computer is the hacker's. Often the exploit is as easy as typing the wrong file size in the header. For example, a hacker could create a graphic file with a file size of 1 or 0 written in the file's header. The program that interpreted the file would read the malformed file size, somehow calculate the file size to be a negative billion value, and end up with a buffer overflow.
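The defensive counterpart to the malformed-size trick is to distrust the header until it has been checked against the data actually present. The following Python sketch illustrates this for a made-up format: the 4-byte magic value IMG0, the 32-bit little-endian length field, and the 64 MB ceiling are all assumptions chosen for the example and do not describe any real graphics format.

```python
import struct

HEADER = struct.Struct("<4sI")   # hypothetical layout: 4-byte magic, 32-bit unsigned payload length
MAX_PAYLOAD = 64 * 1024 * 1024   # arbitrary upper bound chosen for this sketch


def read_payload(blob: bytes) -> bytes:
    """Return the payload only if the declared length is sane and matches the data."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than its header")
    magic, declared = HEADER.unpack_from(blob, 0)
    if magic != b"IMG0":
        raise ValueError("unexpected magic bytes")
    actual = len(blob) - HEADER.size
    # Reject before any arithmetic or allocation is based on the declared value.
    if declared == 0 or declared > MAX_PAYLOAD or declared != actual:
        raise ValueError(f"declared length {declared} does not match {actual} bytes present")
    return blob[HEADER.size:HEADER.size + declared]
```

A file that claims a length of 0 or 1 while carrying more data is rejected outright instead of being handed to later size calculations.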
Many file formats allow other files and links to be embedded within the file. For instance, Microsoft Office files can contain links to other documents such that when the original document is opened, the linked document is downloaded and executed too. Many file types were originally fairly safe, but gained the capability to contain additional linked documents as their popularity grew (e.g., PDF, RTF, XLS, etc.). So whereas the original document type is still safe, the linked document can be used to accomplish the exploit. Many files, such as Program Information Files (.Pifs) and Scrap Shell (.Shs) files, contain links to other documents by design. A security scanning tool, if it isn't programmed to look for and enumerate links, might miss the linked document. Hackers commonly exploit program GUI skins (i.e., GUI themes). The skins were made to allow users to create and easily transfer different program look-and-feels. Unfortunately for end users, many skins are allowed to have embedded links, and the links can download or execute malicious code.
Oftentimes, though, malicious hackers take files and programs and do unexpected things with them. For example, most OS platforms offer a choice of program archivers and compression formats. In the Windows world, the most popular archive format is the Zip file format (unarchiving is natively supported by Windows XP and later, and through Winzip and Pkzip on earlier platforms). Its creators never intended the .Zip file format to be used maliciously, and probably didn't think about it being used maliciously. But today, attackers frequently use the .Zip file format to bypass security protections, including the following:
Archiving malware programs inside of .Zip files so that programs that do not open archive files and scan contained files (called recursive scanning) will not detect the malware threats. Some viruses send themselves as password-protected .Zip files. The receiver is given the password to open the malicious archive, but the virus scanner skips scanning the file because it "can't see the password" and can't open the archive.
Continually re-archiving the same file over and over again (called nesting) so that if the scanning program does open archived files, it might not de-archive the file enough times to scan the original file contents
Overwriting another legitimate file with a malicious version
Using the file format to automatically run one of the contained programs when the file is opened
Renaming other file types (e.g., MS-Word documents) to .Zip file extensions so that when they are opened in Windows Explorer, the other document type executes instead
Creating a subdirectory structure within the archive file format that, when opened, opens up dozens to millions of child directories. Some security scanners cannot or will not scan down past a certain number of subdirectories. Other attacks using this method have also created so many child subdirectories that the OS ends up out of usable space.
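Several of the tricks in this list can be looked for before an archive is ever extracted. The Python sketch below uses the standard zipfile module to report encrypted entries, excessive nesting, entries whose paths would land outside the extraction folder, and very deep directory structures. The three limits are arbitrary values picked for illustration, and the function name is hypothetical. Nested members are read fully into memory here, so a production scanner would also need to cap uncompressed size.

```python
import io
import zipfile

MAX_DEPTH = 3          # arbitrary nesting limit for this sketch
MAX_ENTRIES = 10_000   # arbitrary entry-count limit
MAX_DIR_LEVELS = 32    # arbitrary directory-depth limit


def inspect_zip(data: bytes, depth: int = 0) -> list[str]:
    """Return human-readable findings for a zip archive held in memory."""
    if depth > MAX_DEPTH:
        return [f"nesting deeper than {MAX_DEPTH} levels"]
    try:
        zf = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a valid zip archive"]
    findings = []
    with zf:
        infos = zf.infolist()
        if len(infos) > MAX_ENTRIES:
            findings.append(f"{len(infos)} entries")
        for info in infos:
            name = info.filename
            parts = name.replace("\\", "/").split("/")
            # Absolute paths or ".." components could overwrite files elsewhere.
            if name.startswith(("/", "\\")) or ".." in parts:
                findings.append(f"path escapes extraction folder: {name}")
            if len(parts) > MAX_DIR_LEVELS:
                findings.append(f"very deep path: {name[:40]}")
            # Bit 0 of the general-purpose flags marks an encrypted entry.
            if info.flag_bits & 0x1:
                findings.append(f"encrypted entry cannot be scanned: {name}")
                continue
            if name.lower().endswith(".zip"):
                findings.extend(inspect_zip(zf.read(info), depth + 1))
    return findings
```

An empty result means only that none of these particular patterns was seen. It says nothing about whether the contained files themselves are safe.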
Other flaws include header mismatches. A file may have a file extension claiming to be one type of program when it is in fact another type of program. In its "pretend state," it bypasses normal security mechanisms, but at some point in the execution path it is rendered in its intended malicious form. An example here is that any OLE2 file (i.e., most Microsoft Office documents) can be renamed to any extension (e.g., FISH.TXT), and when executed will run as the legitimate document type and attempt to open the normal MS-Office program.
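A scanner can catch this kind of mismatch by comparing a file's leading bytes with what its extension claims. The Python sketch below is a minimal illustration. The signature table covers only three well-known container types, the lists of acceptable extensions are deliberately partial, and the function names are hypothetical.

```python
from pathlib import Path

# Leading "magic" bytes for a few container formats (partial list).
SIGNATURES = {
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole2",  # OLE2 compound file (legacy Office documents)
    b"PK\x03\x04": "zip",
    b"MZ": "exe",
}

# Extensions this sketch treats as consistent with each detected type (assumed, partial).
EXPECTED = {
    "ole2": {".doc", ".dot", ".xls", ".ppt"},
    "zip": {".zip", ".jar"},
    "exe": {".exe", ".dll", ".scr", ".cpl", ".ocx", ".sys"},
}


def sniff(header: bytes):
    """Return the detected container type for the given leading bytes, or None."""
    for magic, kind in SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def extension_mismatch(path: str):
    """Return the detected type when it contradicts the extension, else None."""
    p = Path(path)
    with p.open("rb") as fh:
        kind = sniff(fh.read(8))
    if kind and p.suffix.lower() not in EXPECTED[kind]:
        return kind
    return None
```

With this check, an OLE2 document renamed to FISH.TXT is reported as "ole2", while the same bytes saved with a .doc extension pass.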
The key is that the legitimate design of the program or file type unintentionally allows malicious behavior.
Many programs allow configuration files to be created, and when the configuration file is clicked, it modifies the related system settings. For example, registry settings can be imported using .Reg files. A malicious hacker can e-mail a victim a .Reg file and if the registry edit file is clicked, it modifies the registry. .Ins files can be used to initiate unauthorized connections to Internet sites. .CER and .CTL files can be used to surreptitiously install an attacker as a trusted resource. Table 5-1 lists many common file types that will manipulate Windows or an installed setting simply by the user double-clicking on it.
File Extension | File Type | Malicious Use Details |
---|---|---|
.ade, .adp, .adn | Microsoft Access project files | Can contain auto-executing macros |
.ani | Windows Animated Cursor | Two exploits were announced by Flashsky Fangxing (flashsky@xfocus.org) on Dec. 23, 2004. First, a Windows Kernel DoS exploit: Windows XP SP2 not vulnerable, but most other Windows versions are (NT to 2003). Second, an Integer buffer overflow: most Windows versions are vulnerable (NT to 2003), caused by LoadImage API in USER32.Lib. |
.arc | File Archive File format | Older, pre-Windows file archive file format. Still used occasionally by malware to bypass computer security defenses. |
.arj | File Archive | Can be used by malware to bypass computer security defenses. Arj files can be created and unarchived using many popular programs, including Winzip. More detail on the .arj program can be found at http://filext.com/detaillist.php?extdetail=ARJ. |
.asf, .lsf, .lsx | Streaming audio or video file | Can be exploited through buffer overflows, header malformation, or dangerous scriptable content |
.atf | Symantec pcAnywhere autotransfer file | Can initiate a pcAnywhere file-transfer session |
.bas | Visual Basic (VB) class module | Can contain malicious instructions |
.bat | DOS batch file | Can contain malicious DOS command interpreter instructions. Also can contain executable .Exe code that will run even with the incorrect extension. |
.bmp | Windows Bitmap graphics file | Integer buffer overflow, announced on Dec. 23, 2004. Most Windows versions were vulnerable (NT to 2003) until patched; caused by LoadImage API in USER32.Lib. |
.cab | Microsoft cabinet archive file | Opens in Windows Explorer, IE, and can help install malicious files. Commonly used by Microsoft to install legitimate files, but could be used by malware to bypass computer security defenses. Unexpected CAB files arriving via e-mail or from untrusted web sites should not be opened. |
.cbo, .cbl, .cbm | Microsoft Interactive Training file | User= field allows an exploitable buffer overflow (SEH pointer). Microsoft Interactive Training (Orun32.exe) must be present, although it is often present by default in OEM versions of Windows XP. First exploit of this file type announced on June 14, 2005 by iDEFENSE labs. Patched by MS05-31. HK_CR\MITrain.Document\shell\open\command is related to the Orun32.exe program. |
.cer, .crt, .der | Security certificate | Can install a malicious certificate in IE to permit automatic downloading of malicious content |
.chm | Windows Compiled Help File | Windows Help Files (.hlp) can be compiled for better performance and feature sets. Malformed Compiled Help Files have been involved in many announced exploits over the years, including Microsoft Security Bulletin MS05-031. Can be opened in Internet Explorer automatically without user intervention using the ms-its: moniker. |
.cmd | Command file | Contains batch-file-like DOS interpreter script commands. Can contain malicious instructions. Also can contain executable .Exe code that will run even with the incorrect extension. |
.com | Program executable | Older, legacy DOS and 16-bit Windows executables. Still work under all Windows versions, except newer 64-bit Windows. |
.cpl | Control Panel Applet | Executable program written to run in Control Panel context. Can be infected by viruses or used by malware programs to install themselves. Example includes a Win32.Beagle variant (http://securityresponse.symantec.com/avcenter/venc/data/w32.beagle@mm!cpl.html). If located so as to be part of Control Panel, listing the contents of Control Panel can run malware code even before any specific Control Panel item is "opened." This risk is fairly unpublicized and may not have been exploited as yet; it's known by-design functionality, however. |
.css | Cascading Style Sheet | Used by IE and other browsers. Used by web developers to easily deliver a consistent look-and-feel style to a web site without having to recode the style on each web page. Has been exploited maliciously many times. |
.ctl | Certificate Trust List | Could be used by a remote attacker to trick a victim into installing the attacker as a trusted publisher |
.cur | Windows cursor graphic file | Integer buffer overflow, announced by flashsky fangxing (flashsky@xfocus.org) on Dec. 23, 2004; most Windows versions are vulnerable (NT to 2003); caused by LoadImage API in USER32.Lib. |
.dbg | Debug file | Can contain malicious machine-language instructions that can be compiled by debug.exe into malware |
.dll | Dynamic Linking Library | Most Dlls are legitimate program files containing pre-compiled library routines that other programs can call, or can contain complete programs. Have been involved in many viruses and worms. Because of Windows File Protection, most Windows system Dlls cannot be overwritten or modified by malware, but rogue Dlls can be installed. Dll code can also be run as a stand-alone executable if run via RunDLL or similar generic .Dll "launching wrappers" that are legitimate parts of Windows. The advantage to malware is that these "wrapped" Dll processes are typically named in Task Manager by the parent wrapper name. Host-based firewalls and IDSs that monitor applications only see the parent wrapper name. That's why .Dll files in particular are commonly used by trojans. |
.doc | Microsoft Office Word Document | Can contain malicious macros, scripts, objects, links, and executables. Very difficult to block because legitimate use is very common. Many malicious objects are blocked by default in recent versions of Microsoft Office. |
.dot | Microsoft Office Document Template | Can be manipulated by malware to contain malicious objects that are then added to every new document that relies on the related template file. Very commonly manipulated by early Microsoft Office macro files, but not as commonly modified by malware today. |
.dsm, .far, .it, .stm, .ult, .wma | Nullsoft WinAmp media file | Has been involved in malicious exploits |
.dun | DUN export file | Can contain malicious dial-up connection information that initiates outward calls |
.edt | Adobe Reader PDF ebook file | Involved in at least one announced exploit (www.idefense.com/application/poi/display?id=163) in 2004. If ebook functionality is not needed, it can be blocked without affecting overall Adobe Reader functionality. |
.eml, .email | Outlook Express e-mail message | Used by Nimda and many other worms. Eml files are opened in Outlook Express even when some other e-mail application is the current default. Attachments can be hidden within the file and will be clickable, and any exploits affecting Outlook Express or the Internet Explorer rendering engine will be exposed. |
.exe | Application file | Can be used to launch malicious executables |
.fav | IE Favorites list | Can be used to list malicious web sites that the user then visits |
.gif | Graphic file format | GIF stands for Graphics Interchange Format. Although normally just a picture or image data file, it has been malformed to cause improper application handling and buffer overflows. It has impacted several applications, including Microsoft Windows Messenger, which was patched to fix a GIF exploit (see Microsoft Security Bulletin MS05-022). |
.gzip, .gz, .taz, .tgz | Gzip file format | Can be used by malware to bypass computer security defenses. Very common on Unix/Linux platforms, but can also be used in Windows. See .tar also. |
.hlp | Microsoft Help File | Can be used in multiple exploits |
.ht | Hyperterminal file | Can initiate dial-up connections to untrusted hosts |
.hta | HTML application | Frequently used by worms and trojans |
.htm, .html, .dhtml, .shtml | HTML file | Can initiate an IE session and be used to automatically download and execute rogue files. All the "active content" risks, e.g., scripting, apply here as well. |
.htt | Explorer Stylesheet | Can be used/manipulated by adware/malware to display unwanted browser windows and popups. Used by the View As Web Page display attribute as a way of integrating HTML content. .htt files have the potential to make any writable network share a point of malware entry from infected systems that can see that share. |
.ico | Windows Icon graphic file | Integer buffer overflow announced on Dec. 23, 2004; most Windows versions were vulnerable (NT to 2003) until patched. Caused by LoadImage API in USER32.Lib. |
.inf | Install configuration file/security template | A Setup Information installer configuration file, it can be used to maliciously manipulate existing programs or to install new malicious programs. As a security template, it could be used to downgrade existing security permissions. |
.ini | Application configuration settings file | Can be used to maliciously change a program's default settings. Also, Desktop.ini can be used to auto-launch malicious programs. Desktop.ini and .htt files have the potential to make any writable network share a point of malware entry from infected systems that can see that share. It's a significant integration point, as it's tedious to find and check all Desktop.ini files for references that launch malware. |
.ins, .isp | Internet communication settings | Can be used to initiate Internet connections to untrusted sources |
.jar | Java archive file | Can launch Java attacks |
.jav, .java | Java applet | Can launch Java attacks |
.jpg, .jpe, .jpeg, .jfif | JPEG files | JPEG stands for Joint Photographic Experts Group. Although normally just a graphics file format, it has been malformed to cause buffer overflows in various applications. |
.js, .jse | JavaScript (encoded) file | Can contain malicious code. JSE files are encoded JavaScript files that can easily be decoded and read by Windows and IE. These files are executed by Wscript.exe, Cscript.exe, or JScript.dll. |
.lnk, .desklink | Shortcut link | Can be used to automate malicious actions |
.lzh | Archive file format | Can be used by malware to bypass computer security defenses. Used on Windows platforms, especially by game developers or Japanese programmers, but is not common. |
.mad, .maf, .mda, .mas, .mag, .mam, .maq, .mar, .mat, .mav, .maw, .mdn, .mdt, .mdx | Access module shortcut | Can carry out macro manipulation that isn't controlled by Office security settings |
.mdb, .mdbhtml | Access application or database | Can contain malicious macros |
.mde | Access database with all modules compiled and source code removed | Can contain malicious macros |
.mhtml, .mhtm | MIME HTML document | Can contain harmful commands |
.mim | MIME file | Could become a target of future MIME exploits |
.msg, .mmf | Microsoft Mail or Outlook Express item | Can carry a virus or worm |
.msh | Microsoft Shell Command file | New file format in Windows Vista, used to replace previous shell language files (e.g., .bat, .cmd, etc.). Demonstration viruses have already been developed exploiting this file format (www.f-secure.com/v-descs/danom.shtml). |
.msi, .msp | Microsoft Installer package | Can be used to install or modify software |
.mst | Visual Basic test source file | Can be used maliciously |
.nws | Outlook Express news message | Network newsgroup protocol. Can carry viruses, worms, and other malware. |
.ocx | ActiveX control | Can be used to install malicious ActiveX programs |
.oft | Outlook Template file | Outlook Template file that can contain malicious scripting or objects. Not commonly used by malware. E-mail worms and viruses can sometimes harvest legitimate e-mail addresses from OFT files. |
.ovl | Program overlay file | Commonly used by legitimate programs. Can be used to install malware, or legitimate ones can be infected by viruses. |
.pdc | Microsoft compiled script | Can contain dangerous code |
.pdf | Adobe Reader Portable Document Format | Involved in several exploits over the years. Difficult to block because of widespread legitimate use. By design, Acrobat Reader can auto-run scripts (JavaScript) within Pdf files; this feature can be disabled in Adobe Reader 7.x or later versions. |
.pif | Program information file | Can run malicious programs, and the file extension is always hidden throughout Windows by default. These files also define their own icons, as contained within the file, further assisting attempts to disguise them as "safe" file types, and potentially facilitating run-on-display "icon" exploits. |
.pl | Perl script file | Can contain rogue code |
.png | Portable Network Graphics file | PNG is an open-source graphics format with lossless compression (www.libpng.org/pub/png). Has been involved in several exploits, including multi-browser buffer overflows. Last PNG IE buffer overflow resolved by MS05-025. |
.pol | Windows Policy file | Could be used to lower security settings on Windows 9x and later machines |
.ppt, .ppa, .pot, .ppthtml, .pothtml | Microsoft Powerpoint presentation, add-in, or template file | Can contain scripted exploits |
.prf | Outlook profile settings | Can override default or trusted settings |
.pst | Outlook or Exchange personal store file | Can contain malicious attachments and be imported into Outlook or Outlook Express |
.pwl | Windows 9x password file | Could be used to overwrite legitimate passwords |
.py | Python script file | Can contain rogue code |
.rar | WinRAR archived file | Used by malware to bypass detectors that normally open zip files but don't open RAR files. Used by Beagle worm among others. See http://schmidt.devlib.org/file-formats/rar-archive-file-format.html. |
.rat | Internet Explorer content advisor ratings file | Part of Internet Explorer's content advisor rating feature. Can be installed to allow malicious web sites to be approved as secure. Also can be used on IIS web sites to pre-rate content to be delivered to visitors. If installed on IIS, could be used to execute malicious program instructions. Has been involved in a malicious buffer overflow announcement in the past. |
.rdp | Remote Desktop connection shortcut | If an end user can be tricked into running a malicious RDP file, it could execute local commands, or map a drive (should provide warning in XP Pro and later) to a remote malicious machine, giving the attacker access to local files. Currently not popularly exploited. |
.reg, .key | Registry entry file | Can create or modify registry keys |
.rtf | Rich Text Format file | Can script other attacks and contain embedded malicious links. This problem exists because MS Word will auto-run Word document macros within .RTF files, even though .RTF is supposed to be a "safe" file type for information interchange. |
.scf | Windows Explorer command | Could be used maliciously in future attacks |
.scp | DUN script | Can initiate rogue outbound connections |
.scr | Windows screen saver file | Can contain worms or trojans. Essentially, a .Scr file is an .Exe file. |
.shs, .shb | Shell scrap object | Can mask rogue programs by containing links to other programs. Shell scrap file objects can have hidden extensions even when Windows is told to display hidden file extensions. |
.slk | Excel SLK data-import file | Can contain hidden malicious macros |
.stl | Certificate Trust List (CTL) | Can induce a user to trust a rogue certificate |
.swf, .spl | Shockwave Flash object | Can be exploited |
.sys | Driver or configuration file | Used by many auto-run files, including config.sys. Can be used to install malicious programs. Legitimate .sys files can be infected by viruses. |
.tar, .taz, .tgz, .tz | Archive file format | TAR stands for Tape Archive file format. Common Linux/Unix archive file format, but is used in Windows. Can be used by malware to bypass computer security defenses. |
.url | Internet shortcut | Can connect user to malicious web site or launch a malicious action |
.uu, .uue | Archive file format | UUencode file format is used to send program files and other objects through plaintext e-mail. Used to be common across most PC platforms in the early days of the Internet, but is not common today. Can be used by malware to bypass computer security defenses. |
.vb, .vbe, .vbs | VBScript file | Can contain malicious code. VBE files are encoded VBScript files that can easily be decoded and read by Windows and IE. These files are executed by Wscript.exe, Cscript.exe, or VBScript.dll. |
.vcf | vCard file format | Used in many e-mail clients, including Outlook and Outlook Express to communicate recipient addressing details. Has been involved in a few exploits. |
.vxd, .386 | Virtual device driver | Can trick a user into saving a trojan version of a legitimate device driver |
.wbk | Word backup document | Can contain dangerous macros |
.wiz | Wizard file | Used by Microsoft to launch end-user-friendly "wizards" that walk new users through common tasks. Could be used to automate a future social engineering attack but is not a common malware vector. |
.ws, .cs, .wsf, .wsc, .sct | WSH file | Can execute malicious code |
.xla, .xlb, .xlc, .xld, .xlk, .xll, .xlm, .xlt, .xlv | Excel file types | Can contain dangerous macros and code |
.xls, .xlshtml, .xlthtml | Excel spreadsheet | Can contain dangerous macros and code |
.xml, .xsl | XML file | Likely to be the next language of choice for malicious coders |
.z | Gzip file format | Can be used by malware to bypass computer security defenses. Very common on Unix/Linux platforms, but can also be used in Windows. |
.zip | Pkzip or Winzip archive file | Can be used maliciously several ways, including: 1) Can allow malware to bypass file integrity checkers and antivirus software that does not unzip (password-protected) zip files, 2) Can contain a zip file within a zip file (several levels of nesting possible) to bypass security programs that do not do recursive scanning, 3) Can be used to auto-launch programs when the file is unzipped, 4) Can be used to overwrite other legitimate files, 5) Can be used to create an overwhelming number of directories and subdirectories, causing quota problems, low disk space, and other operating system abnormalities. The latter problem has also been used to bypass security programs that do not handle long and "deep" directory names well. |
Other files are more dangerous simply because of their file name. Windows has dozens of files that will be read and executed in a certain way just because of their name. For instance, a file called Autorun.inf located on a CD-ROM or DVD disk will be automatically executed when the removable media is first inserted. The Normal.dot file becomes the default Microsoft Word template simply because of its name. A file called Desktop.ini will take special precedence over any other file in a Windows folder when the folder is double-clicked.
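Because these files are special purely by name, a simple audit can look for them wherever they are not expected, such as on writable shares or removable media. The Python sketch below flags the three names mentioned above in a list of Windows-style paths. The function name is hypothetical, and the name set would need to be extended for real use.

```python
from pathlib import PureWindowsPath

# Names from the text whose mere presence changes Windows or Office behavior.
SPECIAL_NAMES = {"autorun.inf", "desktop.ini", "normal.dot"}


def special_names(paths):
    """Yield every path whose final component is one of the special names."""
    for raw in paths:
        # Windows file names are case-insensitive, so compare in lowercase.
        if PureWindowsPath(raw).name.lower() in SPECIAL_NAMES:
            yield raw
```

Finding one of these names is not proof of compromise, since each has legitimate uses. It only identifies files whose contents deserve a closer look.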
Note | Special thanks to Microsoft MVP Chris Quirke for this section. |