Introduction
This is an analysis of spam received at a set of spam traps during the week of Monday February 9th 2004 (ISO week number 07, from Monday February 9th 2004 to Sunday February 15th 2004 inclusive). Most of the spam traps are expired or non-existent accounts; the remaining few have been unused for a long time and receive no legitimate email traffic. All the expired addresses have been bouncing mail for over a year, most for more than three years. None of the spam traps have been actively seeded to receive spam. For the purpose of this report, spam is defined as any message delivered to these spam traps. All times and dates are relative to UTC.
While it is impossible to say with any level of certainty how representative these numbers are of spam in general, I believe they are close enough to provide useful information to anyone interested in spam or spam prevention.
Summary
A short summary of this week's numbers.
- Total number of messages:
- 2175
- Unique message bodies:
- 2064
- Unique sender IP addresses:
- 1635
- Average message size:
- 12501
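The four summary figures can be derived from a simple pass over the delivered messages. The following is a minimal sketch, not the report's actual tooling: the `summarize` function, the tuple layout, and the use of a SHA-256 digest to decide when two bodies are identical are all assumptions made for illustration, as is measuring size in bytes.

```python
import hashlib

def summarize(messages):
    """Summarize an iterable of (client_ip, body_bytes, size_in_bytes) tuples.

    Assumption: two bodies count as the same when their SHA-256 digests match.
    """
    total = 0
    total_size = 0
    bodies = set()
    ips = set()
    for client_ip, body, size in messages:
        total += 1
        total_size += size
        bodies.add(hashlib.sha256(body).hexdigest())
        ips.add(client_ip)
    return {
        "messages": total,
        "unique_bodies": len(bodies),
        "unique_ips": len(ips),
        # Rounded to a whole number, like the figure in the summary.
        "average_size": round(total_size / total) if total else 0,
    }
```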
Client DNS
Reverse DNS
The client IP address is looked up in DNS at the time of delivery. The following shows the number of clients with and without published reverse DNS information, and the number of clients with matching forward and reverse DNS information.
- Clients without reverse DNS:
- 609 (28%)
- Clients with reverse DNS:
- 1566 (72%)
- Clients with matching forward and reverse DNS:
- 1436 (66%)
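The three categories above correspond to a forward-confirmed reverse DNS check: look up the PTR name for the client address, then resolve that name and see whether the original address is among the results. The sketch below assumes IPv4 and uses Python's standard `socket` resolver calls; the `classify_client` name and its injectable `reverse` and `forward` parameters are hypothetical, added so the logic can be exercised without network access.

```python
import socket

def classify_client(ip, reverse=socket.gethostbyaddr,
                    forward=socket.gethostbyname_ex):
    """Return 'none', 'reverse-only' or 'matching' for a client IPv4 address."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        hostname = reverse(ip)[0]
    except OSError:
        return "none"  # no published reverse DNS
    try:
        # gethostbyname_ex returns (hostname, aliases, addresses).
        addresses = forward(hostname)[2]
    except OSError:
        return "reverse-only"  # the reverse name does not resolve
    return "matching" if ip in addresses else "reverse-only"
```

Under this scheme, the "with reverse DNS" figure is the sum of the `reverse-only` and `matching` results, which is why the matching count (1436) is a subset of the reverse DNS count (1566).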
Originating Domain
The 10 most frequent domains, where the domain is the client's reverse DNS name excluding the host name (the leftmost label). Only clients with matching forward and reverse DNS information are considered. Count is the number of messages delivered from clients within the given domain. Percentage is relative to the number of messages delivered from clients with matching forward and reverse DNS information.
| Domain | Count | Percentage |
|---|---|---|
| client.comcast.net | 97 | 6.8% |
| dyn.optonline.net | 32 | 2.2% |
| sheck-buy.com | 26 | 1.8% |
| cable.mindspring.com | 17 | 1.2% |
| dsl.pltn13.pacbell.net | 17 | 1.2% |
| dip.t-dialin.net | 16 | 1.1% |
| ne.client2.attbi.com | 16 | 1.1% |
| client.mchsi.com | 14 | 1% |
| alcatel.com | 13 | 0.9% |
| bb.netvision.net.il | 13 | 0.9% |
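
Dropping the leftmost label and tallying the result is straightforward to sketch. The function names below are hypothetical, and the hostnames in the usage note are made-up examples rather than names taken from the data.

```python
from collections import Counter

def originating_domain(hostname):
    """Drop the leftmost label (the host name) from a reverse DNS name."""
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[1:]) if len(labels) > 1 else labels[0]

def top_domains(hostnames, limit=10):
    """Return (domain, count, percentage) rows, most frequent first."""
    counts = Counter(originating_domain(h) for h in hostnames)
    total = sum(counts.values())
    return [(d, n, round(100 * n / total, 1))
            for d, n in counts.most_common(limit)]
```

For example, `originating_domain("c-1-2-3-4.client.comcast.net")` yields `client.comcast.net`. The percentage is taken over the hostnames passed in, so passing only clients with matching forward and reverse DNS reproduces the denominator described above.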
Client Countries
Client Country Distribution
Client country is determined using the WebHosting.Info ip-to-country data. Data is updated monthly if there are updates available. This shows the top 15 countries for delivering clients. Count is the number of messages delivered from clients in the given country. Percentage is relative to the total number of messages received in the reporting period.

| Country | Count | Percentage |
|---|---|---|
| United States | 1045 | 48% |
| China | 155 | 7.1% |
| Republic Of Korea | 136 | 6.3% |
| Canada | 88 | 4% |
| Brazil | 64 | 2.9% |
| France | 51 | 2.3% |
| Netherlands | 48 | 2.2% |
| United Kingdom | 36 | 1.7% |
| Israel | 36 | 1.7% |
| Germany | 33 | 1.5% |
| Italy | 32 | 1.5% |
| India | 31 | 1.4% |
| Spain | 30 | 1.4% |
| Poland | 26 | 1.2% |
| *unknown* | 25 | 1.1% |
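
An IP-to-country lookup of this kind usually amounts to a search over sorted address ranges. The sketch below assumes the data has already been loaded as `(first_address, last_address, country)` tuples with addresses as integers; that layout, the function names, and the placeholder country codes in the usage note are assumptions for illustration, not a description of the WebHosting.Info file format.

```python
import bisect
import ipaddress

def build_index(ranges):
    """Sort (first, last, country) ranges and precompute the start addresses."""
    ordered = sorted(ranges)
    starts = [r[0] for r in ordered]
    return starts, ordered

def country_for(ip, index):
    """Return the country for an IPv4 address, or 'unknown' if no range covers it."""
    starts, ordered = index
    value = int(ipaddress.IPv4Address(ip))
    # Find the last range whose first address is <= value.
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and ordered[pos][0] <= value <= ordered[pos][1]:
        return ordered[pos][2]
    return "unknown"
```

Addresses that fall outside every range end up in the *unknown* row, as in the table above.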
Senders
Claimed Sender Domains
The domain name claimed in the envelope sender address. It is commonly accepted that a significant share of all spam is sent with a forged sender address, so the following list is useless for identifying message origin. It is included here to show spammers' domain preferences when forging sender addresses.
| Domain | Count | Percentage |
|---|---|---|
| yahoo.com | 108 | 5% |
| | 95 | 4.4% |
| hotmail.com | 48 | 2.2% |
| msn.com | 32 | 1.5% |
| aol.com | 27 | 1.2% |
| sheck-buy.com | 26 | 1.2% |
| syndicatesales.biz | 19 | 0.9% |
| gamesandspecialofferstoo.com | 12 | 0.6% |
| juno.com | 12 | 0.6% |
| cisco.com | 11 | 0.5% |
Clients
A port scan is performed on all clients delivering messages to spam traps. The scan is started as soon as possible after message delivery, usually within a few seconds.
Please note that where the client is located behind a router or firewall doing NAT, the target of the scan may be the router or firewall rather than the sending client.
Services On Unfiltered Ports
Services are identified by nmap's version detection feature. The following lists the 15 most common services found on unfiltered ports. The count is the number of distinct IP addresses; the percentage is relative to the total number of distinct IP addresses seen within the reporting period.
| Service | Count | Percentage |
|---|---|---|
| Microsoft Windows msrpc | 709 | 43.4% |
| Microsoft Windows UPnP | 468 | 28.6% |
| Microsoft Windows XP microsoft-ds | 225 | 13.8% |
| OpenSSH | 127 | 7.8% |
| Apache httpd | 126 | 7.7% |
| Microsoft mstask | 102 | 6.2% |
| Microsoft Terminal Service | 66 | 4% |
| Microsoft Distributed Transaction Coordinator | 65 | 4% |
| Microsoft IIS webserver | 57 | 3.5% |
| Sendmail smtpd | 56 | 3.4% |
| Microsoft ESMTP | 46 | 2.8% |
| ISC Bind | 44 | 2.7% |
| KaZaA client | 40 | 2.4% |
| Microsoft DNS | 34 | 2.1% |
| Microsoft ftpd | 33 | 2% |
Client OS Vendor
Client operating systems are identified by nmap's TCP/IP fingerprinting feature. The following lists the 10 most common client operating system vendors. In cases where TCP/IP fingerprinting did not successfully identify the client's operating system, it is listed as *unknown*.

| Vendor | Count | Percentage |
|---|---|---|
| Microsoft | 892 | 54.6% |
| *unknown* | 463 | 28.3% |
| Linux | 149 | 9.1% |
| FreeBSD | 69 | 4.2% |
| Turtle | 19 | 1.2% |
| Sun | 7 | 0.4% |
| IBM | 6 | 0.4% |
| Apple | 4 | 0.2% |
| Cisco | 4 | 0.2% |
| Smoothwall | 4 | 0.2% |
Time Distribution
Day of Week
Distribution of messages over days of the week. The count shows the number of messages received on each day; the percentage is relative to the total number of messages received within the reporting period.

| Day | Count | Percentage |
|---|---|---|
| Mon | 366 | 16.8% |
| Tue | 386 | 17.7% |
| Wed | 395 | 18.2% |
| Thu | 297 | 13.7% |
| Fri | 254 | 11.7% |
| Sat | 207 | 9.5% |
| Sun | 270 | 12.4% |
Time of Day
Distribution of messages over time of day. 00 describes the hour from 00:00 to 01:00. The count shows the number of messages received within each hour; the percentage is relative to the total number of messages received within the reporting period.

| Hour | Count | Percentage |
|---|---|---|
| 00 | 88 | 4% |
| 01 | 97 | 4.5% |
| 02 | 111 | 5.1% |
| 03 | 87 | 4% |
| 04 | 106 | 4.9% |
| 05 | 62 | 2.9% |
| 06 | 97 | 4.5% |
| 07 | 79 | 3.6% |
| 08 | 99 | 4.6% |
| 09 | 73 | 3.4% |
| 10 | 84 | 3.9% |
| 11 | 86 | 4% |
| 12 | 93 | 4.3% |
| 13 | 104 | 4.8% |
| 14 | 86 | 4% |
| 15 | 89 | 4.1% |
| 16 | 70 | 3.2% |
| 17 | 95 | 4.4% |
| 18 | 95 | 4.4% |
| 19 | 93 | 4.3% |
| 20 | 89 | 4.1% |
| 21 | 94 | 4.3% |
| 22 | 107 | 4.9% |
| 23 | 91 | 4.2% |
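
Both time tables come down to converting each delivery time to UTC and counting by weekday and by hour. A minimal sketch, assuming delivery times are available as Unix epoch seconds (the input format and the `time_distribution` name are assumptions):

```python
from collections import Counter
from datetime import datetime, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def time_distribution(epoch_seconds):
    """Count messages per weekday and per hour of day, both in UTC."""
    by_day, by_hour = Counter(), Counter()
    for ts in epoch_seconds:
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        by_day[DAYS[moment.weekday()]] += 1   # weekday() is 0 for Monday
        by_hour[f"{moment.hour:02d}"] += 1    # "00" covers 00:00 to 01:00
    return by_day, by_hour
```

Passing an explicit `tz=timezone.utc` matters here: without it, `fromtimestamp` uses the local time zone of the machine running the report, which would shift both tables.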
Size Distribution
Message Size Distribution
Message size distribution over all messages. Count is the number of messages within the given size range. The percentage is relative to the total number of messages received within the reporting period.

| Size | Count | Percentage |
|---|---|---|
| 1.5KiB > size >= 0.5KiB | 624 | 28.7% |
| 2.5KiB > size >= 1.5KiB | 344 | 15.8% |
| 30.5KiB > size >= 29.5KiB | 296 | 13.6% |
| 3.5KiB > size >= 2.5KiB | 177 | 8.1% |
| 4.5KiB > size >= 3.5KiB | 126 | 5.8% |
| 31.5KiB > size >= 30.5KiB | 122 | 5.6% |
| 5.5KiB > size >= 4.5KiB | 93 | 4.3% |
| 32.5KiB > size >= 31.5KiB | 79 | 3.6% |
| 0.5KiB > size >= 0 | 46 | 2.1% |
| 6.5KiB > size >= 5.5KiB | 41 | 1.9% |
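
The size ranges are 1 KiB wide and centred on whole KiB values, apart from the first, which covers 0 up to 0.5 KiB. Assuming sizes are recorded in bytes (an assumption; 1 KiB is 1024 bytes), a message can be assigned to its range as sketched below. The `size_bucket` name is hypothetical.

```python
def size_bucket(size_bytes):
    """Return the table's range label for a message size given in bytes."""
    # Adding half a KiB before dividing rounds to the nearest whole KiB,
    # so bucket k covers [k - 0.5, k + 0.5) KiB.
    k = (size_bytes + 512) // 1024
    if k == 0:
        return "0.5KiB > size >= 0"
    return f"{k}.5KiB > size >= {k - 1}.5KiB"
```

For instance, a message of exactly 512 bytes falls in `1.5KiB > size >= 0.5KiB`, while one of 511 bytes falls in `0.5KiB > size >= 0`.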
DNS Block Lists
These numbers show the presence of the sending client in a DNS block list at the time of message delivery. The selection of DNS block lists may be updated on a monthly basis.
Senders In DNS Block Lists
Sending clients present in DNS block list at the time of delivery. The count is the total number of messages delivered from clients listed in the respective block list. The percentage is relative to the total number of messages received within the reporting period.
| DNS block list | Count | Percentage |
|---|---|---|
| bl.spamcop.net | 1282 | 58.9% |
| xbl.spamhaus.org | 987 | 45.4% |
| cbl.abuseat.org | 985 | 45.3% |
| dul.dnsbl.sorbs.net | 781 | 35.9% |
| sbl.spamhaus.org | 383 | 17.6% |
| dnsbl.ahbl.org | 361 | 16.6% |
| socks.dnsbl.sorbs.net | 250 | 11.5% |
| http.dnsbl.sorbs.net | 245 | 11.3% |
| spam.dnsbl.sorbs.net | 132 | 6.1% |
| relays.visi.com | 106 | 4.9% |
| smtp.dnsbl.sorbs.net | 13 | 0.6% |
| misc.dnsbl.sorbs.net | 12 | 0.6% |
| relays.ordb.org | 7 | 0.3% |
| zombie.dnsbl.sorbs.net | 5 | 0.2% |
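
A DNS block list is queried by reversing the octets of the client's IPv4 address, appending the list's zone, and looking up the resulting name: an answer means the address is listed, and a failed lookup means it is not. The sketch below follows that convention. The function names and the injectable `resolve` parameter are hypothetical, and treating any answer as a listing is a simplification, since some lists use different return addresses to signal different conditions.

```python
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a block list zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the block list publishes a record for the address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        return False  # lookup failed, so the address is not listed
    return True
```

As an example, checking 192.0.2.99 against a zone named `bl.example.org` looks up `99.2.0.192.bl.example.org`.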
DNS Block List Groups
To investigate overlap of the different DNS block lists and the effectiveness of combinations of block lists, the following shows the number of messages delivered from clients listed in at least one of the group's block lists. The set of groups may be updated on a monthly basis.
- all:
- 1699 of 2175 (78.1%)
- bl.spamcop.net, cbl.abuseat.org and dul.dnsbl.sorbs.net:
- 1569 of 2175 (72.1%)
- bl.spamcop.net and dul.dnsbl.sorbs.net:
- 1484 of 2175 (68.2%)
- bl.spamcop.net and cbl.abuseat.org:
- 1392 of 2175 (64%)
- *.dnsbl.sorbs.net:
- 1119 of 2175 (51.4%)
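A group count is the number of messages whose client appeared in at least one of the group's lists, so a message listed in several of them is still counted once. A minimal sketch, assuming the per-message listings are kept as sets of block list names (the data layout and the `group_hits` name are assumptions), with shell-style wildcards used to express groups such as `*.dnsbl.sorbs.net`:

```python
from fnmatch import fnmatch

def group_hits(message_listings, patterns):
    """Count messages listed in at least one block list matching the group.

    message_listings: one set of block list names per message.
    patterns: block list names or wildcard patterns forming the group.
    """
    return sum(
        1
        for listed in message_listings
        if any(fnmatch(name, p) for name in listed for p in patterns)
    )
```

With this layout, the "all" group can be expressed as the single pattern `*`, which matches any message listed anywhere.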
SpamAssassin
All messages are filtered through SpamAssassin and reports are generated on hit rates excluding Bayesian classification scores.
SpamAssassin Without Bayesian Classifier
SpamAssassin hits when disregarding the result of the Bayesian classifier. The count is the number of messages for the given score range. The percentage is relative to the total number of messages received within the reporting period.

| Hits | Count | Percentage |
|---|---|---|
| 0 > hits | 4 | 0.2% |
| 5 > hits >= 0 | 1164 | 53.5% |
| 10 > hits >= 5 | 477 | 21.9% |
| 15 > hits >= 10 | 281 | 12.9% |
| 20 > hits >= 15 | 161 | 7.4% |
| hits >= 20 | 88 | 4% |
Distributed Checksum Clearinghouse
DCC Matches
Number of matches for the three matching algorithms used by DCC. The DCC servers are queried at the time of delivery. The Body, Fuz1 and Fuz2 columns show the number of messages matched in the count range for their respective algorithm. The Highest column shows the number of hits from the algorithm returning the highest match count. The percentage is relative to the total number of messages received within the reporting period.

| Range | Body | Fuz1 | Fuz2 | Highest |
|---|---|---|---|---|
| 25 >= count | 2067 (95%) | 1502 (69.1%) | 729 (33.5%) | 637 (29.3%) |
| 50 >= count > 25 | 1 (0%) | 14 (0.6%) | 17 (0.8%) | 19 (0.9%) |
| 75 >= count > 50 | 3 (0.1%) | 22 (1%) | 24 (1.1%) | 24 (1.1%) |
| 100 >= count > 75 | 0 (0%) | 32 (1.5%) | 16 (0.7%) | 16 (0.7%) |
| count > 100 | 104 (4.8%) | 605 (27.8%) | 1389 (63.9%) | 1479 (68%) |
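
The Highest column can be read as follows: for each message, take the largest of the three match counts and place it in a range in the same way as the individual counts. The sketch below assumes each count is available as an integer; the function names are hypothetical.

```python
def dcc_range(count):
    """Map a DCC match count to the range labels used in the table."""
    for upper, label in (
        (25, "25 >= count"),
        (50, "50 >= count > 25"),
        (75, "75 >= count > 50"),
        (100, "100 >= count > 75"),
    ):
        if count <= upper:
            return label
    return "count > 100"

def dcc_row(body, fuz1, fuz2):
    """Return the range label per column for one message, including Highest."""
    counts = {"Body": body, "Fuz1": fuz1, "Fuz2": fuz2}
    counts["Highest"] = max(body, fuz1, fuz2)
    return {column: dcc_range(n) for column, n in counts.items()}
```

Because Highest takes the maximum, a message lands in the lowest Highest range only when all three algorithms report 25 or fewer matches, which is why that cell (637) is smaller than any of the individual columns in the same row.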
About
On the Origin Of Spam is published as weekly and monthly reports by B. Johannessen in the hope that it will be useful to the anti-spam community.