Open Bug 335008 Opened 18 years ago Updated 2 years ago

Spam filtering should use source IP, reverse DNS info, as data for Bayesian filter

Categories

(Thunderbird :: General, enhancement)

Tracking

(Not tracked)

People

(Reporter: usenet, Unassigned)

References

(Blocks 1 open bug)

Details

Currently, some countries are disproportionate sources of spam, yet most users do not receive any legitimate E-mails from those countries. The most significant octet of an IPv4 address correlates well with the registry that issued it (ARIN, RIPE, APNIC, etc.) and, to a lesser degree, with large ISPs. Thus, the most significant octet can be used as a _very approximate_ geocoding hint.
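
As a rough illustration of the idea above, the first octet could be turned into an ordinary token for the filter to learn on. This is only a sketch: the function name and the `src-octet:` token prefix are made up for this example and are not taken from Thunderbird's code.

```python
import ipaddress


def first_octet_token(ip_text):
    """Return a token such as 'src-octet:203' for an IPv4 address, else None.

    The 'src-octet:' prefix is a hypothetical naming choice for this sketch.
    """
    try:
        addr = ipaddress.ip_address(ip_text.strip())
    except ValueError:
        # Not a parseable IP address at all.
        return None
    if addr.version != 4:
        # The octet/registry correlation described here is IPv4-specific.
        return None
    # addr.packed is the 4-byte big-endian form; byte 0 is the most
    # significant octet.
    return "src-octet:%d" % addr.packed[0]
```

The token carries no meaning by itself; any spam/ham weight would come only from the user's own training data.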

Similarly, reverse IP lookup data for source IPs could help discriminate by ISP sources, or even by the fact that reverse lookups were either missing or bogus.
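
A possible shape for the reverse-lookup features is sketched below. The token names (`rdns:missing`, `rdns:unconfirmed`, `rdns-suffix:`) and the injectable `lookup` / `forward` callables are assumptions made for illustration. Taking the last two labels of the hostname is only a crude stand-in for "ISP", and it is wrong for suffixes such as `co.uk`.

```python
import socket


def system_reverse_lookup(ip_text):
    """Reverse-resolve an IP with the system resolver; None if it fails."""
    try:
        return socket.gethostbyaddr(ip_text)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return None


def rdns_tokens(ip_text, lookup=system_reverse_lookup, forward=None):
    """Turn reverse DNS data for ip_text into a list of filter tokens.

    lookup:  callable mapping an IP string to a hostname or None.
    forward: optional callable mapping a hostname to a list of IP strings;
             when given, a PTR name that does not resolve back to ip_text
             is flagged as unconfirmed ("bogus").
    """
    host = lookup(ip_text)
    if not host:
        return ["rdns:missing"]
    labels = host.rstrip(".").lower().split(".")
    # Crude approximation of the operator's domain: the last two labels.
    tokens = ["rdns-suffix:" + ".".join(labels[-2:])]
    if forward is not None and ip_text not in forward(host):
        tokens.append("rdns:unconfirmed")
    return tokens
```

Passing the resolver in as a parameter keeps the feature extraction testable without network access, and would let a client cache or rate-limit lookups.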

These sorts of imprecise hints are exactly the kind of information that Bayesian filtering is designed to exploit, in the (common) case that certain countries or ISPs are more, or less, likely to be sources of spam or ham, from the viewpoint of a particular user.

(For any of this to work, header analysis will first need to be robust against most common spoofing attacks.)
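
One simple way to limit spoofing, sketched here under assumptions, is to read the source IP only from a `Received` header written by a server the user trusts, and ignore everything below it, since a sender can forge older headers freely. The `trusted_hops` parameter (how many of the newest `Received` headers come from the user's own provider) and the function name are hypothetical, and the regular expression only handles the common bracketed IPv4 form.

```python
import email
import re

# Matches the "[203.0.113.7]" form that many MTAs write in Received headers.
_BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def handoff_ip(raw_message, trusted_hops=1):
    """Return the IPv4 address recorded by the last trusted hop, or None.

    Received headers are prepended, so index 0 is the newest. The header at
    index trusted_hops - 1 was written by the outermost trusted server and
    records the external host that connected to it.
    """
    msg = email.message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if trusted_hops < 1 or len(received) < trusted_hops:
        return None
    match = _BRACKETED_IPV4.search(received[trusted_hops - 1])
    return match.group(1) if match else None
```

How many hops are trustworthy depends on the user's mail setup, so a real implementation would need some way to configure or learn that number.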

Since these would only be two features among many, and would in any case depend on Bayesian learning, using this source IP information will not cause whole countries' E-mail to be marked as spam or not spam; rather, it will only tip the balance in edge cases where the mail is already questionable, and only where geographical / ISP information is already useful.
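
The "tip the balance" claim can be checked with a toy naive-Bayes combination in log-odds form. This is not Thunderbird's actual scoring formula, and the per-token probabilities are invented for illustration. It only shows that one weakly spammy token moves a borderline message noticeably but barely moves a message the other tokens already call ham.

```python
import math


def spam_probability(token_probs, prior=0.5):
    """Combine per-token spam probabilities, assuming independent tokens.

    Each p in token_probs is P(spam | token) learned from training;
    values must lie strictly between 0 and 1.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for p in token_probs:
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))


# A hypothetical source-IP token that is mildly associated with spam.
SOURCE_TOKEN = 0.7

borderline = [0.6, 0.4]        # body tokens cancel out
clear_ham = [0.05, 0.1, 0.1]   # body tokens strongly say ham

borderline_with_ip = spam_probability(borderline + [SOURCE_TOKEN])
clear_ham_with_ip = spam_probability(clear_ham + [SOURCE_TOKEN])
```

With these made-up numbers the borderline message goes from 0.5 to about 0.7, while the clearly legitimate message stays well below 0.01.
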
Isn't this a task for the SMTP mail server of your ISP, not for the mail client? The last few headers will be the ones from your ISP anyway.
Maybe it is, but there might be something Thunderbird can do about it.
Assignee: mscott → nobody
Blocks: junktracker
Severity: normal → S3