Image Spam on the Rise

By Terri Wells, Web Hosting News

November 29, 2006

Normal spam is bad enough, but spammers have come up with a trick that lets them get around many older spam filters. It's called image spam. It has spiked over the past year. What is it, and what can we do about it?

When I first started spotting stock spam in my email inbox, I didn't think much of it - until the volume started increasing. I'd see this screen that displayed apparently normal text for a second, then flash over to what looked like a formatted HTML message. My spam filters weren't catching these messages, no matter how many of them I threw in the junk box after the fact. What was going on?

It turns out that I wasn't alone, and what I took to be a formatted HTML message was in fact an image. It's the latest weapon in the modern spammer's arsenal for getting around spam filters. It's also painfully on the rise.

Image spam has been around for about four years, but it didn't start taking up serious space in inboxes until last year. Postini, a messaging management company, notes that image spam was even on the decline in 2005, from 12 percent of all spam at the beginning of the year to five percent of all spam in November 2005. December 2005, however, saw the beginning of a sharp spike (I'll explain one of the possible reasons for that spike in the next section, when I explain why image spam works so well).

While Postini places that spike at 25 percent of all spam, IT security company Sophos estimated that image spam made up about 18 percent of all spam at the beginning of 2006. Sophos thinks the problem is now much worse than Postini estimates, however. According to Carole Theriault, a senior consultant at Sophos, image spam now makes up 40 percent of all spam. "That's a big increase," Theriault notes, with a gift for understatement.

The IronPort Threat Operation Center noted that global spam has nearly doubled in the past year. In October 2005, spammers sent 31 billion pieces of unsolicited bulk email every day. As of mid-November 2006, that number had reached 61 billion. Whether we're looking at 25 percent or 40 percent of that many messages, that's an awful lot of bandwidth (an issue which I'll talk more about when I discuss the special problems involved in dealing with image spam).

Why is Image Spam So Successful?

To understand why image spam is so successful at dodging spam filters, you need to understand how conventional spam filters work. These filters analyze the content of emails, looking for certain suspicious words and phrases that are known to be associated with spammers, such as "penis enlargement," "Viagra," and "weight loss pills." Many filters are so good at this that they catch clever variations of those words as well (such as those that include misspellings, extra spaces, or unusual characters). These messages are flagged as suspicious and go into a junk folder.
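To make that concrete, here is a minimal sketch in Python of the kind of keyword matching described above. It is an illustration only, not how any particular commercial filter works: the phrase list comes from the examples in this article, and the lookalike-character table and function names are assumptions made up for the example.

```python
import re

# Hypothetical phrase list, taken from the examples in the article.
SUSPICIOUS_PHRASES = ["penis enlargement", "viagra", "weight loss pills"]

# Hypothetical table of lookalike characters that spammers swap in.
LOOKALIKES = str.maketrans(
    {"1": "i", "!": "i", "@": "a", "0": "o", "3": "e", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase, undo lookalike characters, then keep letters only."""
    text = text.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", text)


def looks_like_spam(body: str) -> bool:
    """Return True if any suspicious phrase survives normalization."""
    squashed = normalize(body)
    return any(normalize(phrase) in squashed for phrase in SUSPICIOUS_PHRASES)
```

With this sketch, `looks_like_spam("Buy V 1 @ g r a now")` is flagged because the extra spaces and odd characters are stripped before matching. It does not handle misspellings, and squashing words together can cause false positives. More to the point for this article, a message whose body is empty because all of its "text" lives inside an image gives the function nothing to match.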

The key point is that spam filters were created to deal with text messages. When they are confronted with an image, they often can't recognize it, even if it's only an image of text. So the spam filter spots nothing out of the ordinary and lets the message get through.

Actually, it's a little more complicated than that. Some spam filters grew clever enough to spot simple types of image spam. At that point, spammers came up with a fiendishly clever trick. They learned how to use a layer of text on top of a layer of a randomly generated background for each message. While humans can easily read the message and tell that it's spam, to a spam filter, each message is unique because of the changing background. Many image spam messages also vary the colors, picture sizes or font types to make them appear more like individual messages to the filters.
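A small Python sketch shows why a randomized background defeats a filter that remembers exact copies of known spam images. Everything here is a simplified stand-in: the "image" is just a byte string made of a fixed text layer plus random background bytes, and the function names and the use of a SHA-256 fingerprint are assumptions for illustration, not a description of any real product.

```python
import hashlib
import random


def make_image(text_layer: bytes, seed: int, noise_len: int = 16) -> bytes:
    """Stand-in for an image: fixed text-layer bytes plus random background bytes."""
    rng = random.Random(seed)
    background = bytes(rng.randrange(256) for _ in range(noise_len))
    return text_layer + background


def signature(image: bytes) -> str:
    """Exact-match fingerprint of the image bytes."""
    return hashlib.sha256(image).hexdigest()


text_layer = b"BUY THIS STOCK NOW"

# The filter has already seen, and fingerprinted, one copy of the spam image.
known_spam = {signature(make_image(text_layer, seed=1))}

# An identical copy is caught; a copy with a fresh background is not.
same_copy_caught = signature(make_image(text_layer, seed=1)) in known_spam
new_copy_caught = signature(make_image(text_layer, seed=2)) in known_spam
```

The human-readable part of both messages is the same, but the second copy produces a completely different fingerprint, so an exact-match list never catches up.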

It makes sense that spammers would have figured this out right around the time that image spam spiked tremendously. Postini spokeswoman Catherine Leahy said that her company "attributes this increase to spammers testing the deliverability of image spam in early 2005 and realizing that many older spam filters are helpless when messages contain no text to analyze, so the use of images helps get their spam received. Upon seeing the positive results, they converted much of their spam to image spam."

The real irony, which is not lost on the makers of spam filters, is that image spam turns a weapon of the computer security experts against them. You may have heard of CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) and even used it yourself. It's the acronym for a technique that prevents "spambots" - a type of automated web crawling program - from signing up for free services such as web hosting, email, and posting comments to blogs. It is even used to keep spambots from getting an email past a filter (in the case of SpamArrest's service, for example). CAPTCHAs show a sign-up form that displays an image of a distorted series of characters. Humans can figure them out, but spambots can't - for much the same reasons that conventional spam filters can't detect image spam.

A Closer Look at Image Spam

Craig Sprosts of anti-spam company IronPort Systems notes that much image spam seems to be coming from gangs in the United States and Russia. Most of it is trying to lure victims for pump-and-dump stock scams. You've probably seen them, promising that a particular stock will take off very soon and telling you to start buying it immediately. Once enough victims have bought in and pushed the price up, the scammer turns around and sells his own shares at a profit.

Dmitri Alperovitch, a research engineer with CipherTrust, provided some insight into how these scams work. "These are Pink Sheet stocks, traded on the OTC bulletin boards, that typically don't get a lot of volume. They're niche companies with no profit and no products, so when you see a spike from almost no trades to two or three million when the spam is sent out, you know there were a lot of people who fell for it."

Aside from the problem of not being stopped by many conventional spam filters, image spam creates a serious bandwidth issue for many companies. An increase of forty percent in the amount of conventional spam getting through would be bad enough, but these messages are images, not text, and consequently take up more space. Numbers vary, but experts have estimated that a piece of image spam is typically anywhere from three to more than seven times as large as a similar piece of text-based spam.
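The arithmetic behind that bandwidth concern can be sketched in a few lines of Python. The function below is a back-of-the-envelope calculation, not a measurement: it assumes every non-image spam message is a text message of one "unit" in size and that every image spam message is a fixed multiple of that size. The function name and parameters are made up for this example.

```python
def image_share_of_bytes(message_share: float, size_ratio: float) -> float:
    """Fraction of spam *bytes* that image spam accounts for.

    message_share: fraction of spam messages that are image spam (0 to 1).
    size_ratio: how many times larger an image message is than a text one.
    Assumes all remaining spam is text of a uniform size (1 unit each).
    """
    image_bytes = message_share * size_ratio
    text_bytes = (1.0 - message_share) * 1.0
    return image_bytes / (image_bytes + text_bytes)


# Using the figures quoted in this article: 25 or 40 percent of messages,
# each three to seven times the size of a text message.
low_estimate = image_share_of_bytes(0.25, 3)
high_estimate = image_share_of_bytes(0.40, 7)
```

Under those assumptions, image spam at 25 percent of messages and three times the size already accounts for half of all spam bytes, and at 40 percent of messages and seven times the size it accounts for roughly 82 percent.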

And spammers continue to come up with new ways to make their messages look different and slip past the spam filters. Richi Jennings, an analyst with Ferris Research, noted in June that "We're now seeing things like taking a big image and splitting it up into different sized tiles that fit together when you view the message. The size and shape of the tiles varies from message to message, so it can be difficult to spot."

Blocking Image Spam

Fortunately, when it comes to catching image spam, all is not lost. Paul Baccas, a spam and virus researcher at Sophos, notes that "We see a lot of image spam and we know which computers are sending it," so they simply block mail from those computers. "We think we catch about 80% of image spam using these conventional techniques."
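Blocking by sender is simple to express in code. The Python sketch below is a generic illustration of the idea and not Sophos's actual system: the blocked address ranges are reserved documentation addresses used as placeholders, and the names are assumptions for the example.

```python
import ipaddress

# Hypothetical blocklist. These ranges are reserved for documentation
# and stand in for "computers we know are sending image spam".
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_blocked(sender_ip: str) -> bool:
    """Return True if the connecting address falls in any blocked range."""
    addr = ipaddress.ip_address(sender_ip)
    return any(addr in network for network in BLOCKED_NETWORKS)
```

A mail server using something like this would call `is_blocked()` on the connecting address before accepting the message, so the image never has to be downloaded or analyzed at all.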

Other techniques take advantage of the fact that these are, after all, images, and many of them are scanned into a computer. That means they contain information connected to the scanner that was used, such as the number of colors or pixels it uses. Adjust the spam filter to look for those colors and numbers, and suddenly you have a new metric for deciding whether or not a message is spam.
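As a rough illustration of reading that kind of information, the Python sketch below pulls the pixel dimensions and palette size out of the header of a GIF file. It assumes the attached image is a GIF, it only looks at the fixed-layout header fields, and the function name is made up for the example. How a real filter weighs these values is not something the sources quoted here spell out.

```python
import struct


def gif_features(data: bytes) -> dict:
    """Read width, height, and global palette size from a GIF header."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Bytes 6-10: width and height (little-endian 16-bit) and a packed flags byte.
    width, height, packed = struct.unpack("<HHB", data[6:11])
    has_global_table = bool(packed & 0x80)
    # The low three bits N of the flags byte encode a palette of 2**(N + 1) colors.
    colors = 2 ** ((packed & 0x07) + 1) if has_global_table else 0
    return {"width": width, "height": height, "palette_colors": colors}
```

A filter could then compare these values against the profiles of images it has already identified as spam, treating a match as one more signal rather than as proof.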

A third technique acknowledges that, image or not, these emails contain text (in the sense of something that humans can read) - and tries to find ways for computers to recognize that text. This means using optical character recognition (OCR) techniques that can extract the text from the image. Once the text is extracted, conventional text filtering techniques can be applied to it. Sadly, that level of OCR is said to be a long way off. "You're looking at technology that is anything from 10 to 30 years away," notes Luis von Ahn of Carnegie Mellon University, one of the developers of the CAPTCHA technique.

Spam filters can also detect and block image spam by examining certain attributes of the sending computer, message envelope and headers. As a general point, Scott Petry, Postini's founder and CTO, recommends that companies pay attention to the volume of incoming messages with image attachments. If a significant portion of these messages are getting through, the IT department may want to restrict their delivery because image spam is a potentially huge bandwidth problem. "You don't want those messages to undermine the availability of data in your enterprise," Petry explained. "It might mean some grumpy users, but at least the mail server will remain up and running." In short, to combat image spam, you will need to use filters that focus on both the content and the origin of the message.
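For administrators who want to follow Petry's advice and watch the volume of messages carrying images, a minimal Python sketch using the standard library's email package might look like the following. The function names are assumptions for the example, and what counts as a "significant portion" is a policy decision the code does not make.

```python
from email.message import EmailMessage, Message
from typing import Iterable


def has_image_part(msg: Message) -> bool:
    """Return True if any MIME part of the message is an image."""
    return any(part.get_content_maintype() == "image" for part in msg.walk())


def image_fraction(messages: Iterable[Message]) -> float:
    """Fraction of the given messages that carry at least one image part."""
    batch = list(messages)
    if not batch:
        return 0.0
    return sum(has_image_part(m) for m in batch) / len(batch)


# Example: one plain-text message and one with a (fake) GIF attached.
plain = EmailMessage()
plain.set_content("Lunch at noon?")

with_image = EmailMessage()
with_image.set_content("See attached.")
with_image.add_attachment(b"GIF89a", maintype="image", subtype="gif")

sample_fraction = image_fraction([plain, with_image])
```

Tracking that fraction over time, alongside the origin checks described above, gives the IT department a number to act on before the mail server starts to struggle.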