[Maia-users] Fuzzy OCR scores....
Robert LeBlanc
rjl at renaissoft.com
Thu Apr 12 21:39:27 PDT 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
David Sims wrote:
> Can someone explain how Fuzzy OCR scores work? I have a MM 1.0.0 with
> Fuzzy OCR v2.3b running and here is what the amavis.log shows:
>
> Content analysis details: (8.4 points, 5.0 required)
>
> pts rule name description
> ---- ---------------------- --------------------------------------
> -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
> [score: 0.0000]
> 11 FUZZY_OCR BODY: Mail contains an image with common spam
> text inside
> Words found:
> "alert" in 1 lines
> "alert" in 1 lines
> "trade" in 1 lines
> "banking" in 1 lines
> "enlarge" in 2 lines
> "erectile" in 1 lines
> "expand" in 1 lines
> "patch" in 1 lines
> (9 word occurrences found)
As explained in the wiki:
> It's useful to understand how the plugin assigns its score value to
> the FUZZY_OCR rule. The rule is only triggered if there are at least
> focr_counts_required word matches (default: 2) in the image. At that
> point, the rule's score becomes focr_base_score + focr_add_score for
> every additional word match (default: 4 + 1/word after the second
> match). At default values, then, two matching words would score a
> total of 4 points; three matching words would score 5 points; four
> would score 6 points, etc. Feel free to adjust these values to your
> tastes.
In other words, the total score for the FUZZY_OCR rule is:

  focr_base_score + focr_add_score * (word hits - focr_counts_required)

The defaults are focr_base_score = 4, focr_add_score = 1, and
focr_counts_required = 2. If you have these set to their default
values, your score in this example would be 4 + 1 * (9 - 2) = 11,
which is what it reported.
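
For anyone who wants to check the arithmetic against their own settings,
here is a minimal Python sketch of the formula above. It is not the
plugin's actual code; the function name and the choice to return 0 when
the rule does not trigger are assumptions made for illustration.

```python
def fuzzy_ocr_score(word_hits, base_score=4.0, add_score=1.0, counts_required=2):
    """Estimate the FUZZY_OCR score from the number of word occurrences.

    Parameter defaults mirror the documented defaults for focr_base_score,
    focr_add_score and focr_counts_required. Returning 0.0 below the
    threshold is an assumption standing in for "rule not triggered".
    """
    if word_hits < counts_required:
        return 0.0
    # Base score, plus add_score for each match beyond the required count.
    return base_score + add_score * (word_hits - counts_required)


if __name__ == "__main__":
    # The message in this thread reported 9 word occurrences.
    print(fuzzy_ocr_score(9))
```

With the defaults, two matches give 4, three give 5, and the nine
occurrences in this example give 11, matching the log.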
> but if I look at the mail in MM's web page I see in the header:
>
> 1.000 AWL From: address is in the auto white-list
> 1.000 FUZZY_OCR Mail contains an image with common spam text inside
> -2.599 BAYES_00 Bayesian spam probability is 0 to 1%
Since you're using Maia 1.0.0, the scores for dynamically-scored rules
like AWL and FUZZY_OCR will always appear as "1.000" in the mail viewer.
This has no effect on the actual score used for discriminating ham from
spam, however; SpamAssassin sees the true values. Upgrading to Maia
1.0.2 will fix this display issue in the mail viewer.
> In any event, the mail is passed as non-spam.... Why?? I have disabled
> SA's AWL factory and do not load the AWL program... I also emptied the AWL
> table in the Maia DB.... What else??
The AWL rule is apparently triggering, despite your efforts to disable
it. Make sure you've deleted (or commented out) references in your
local.cf file to "auto_whitelist_factory", "user_awl_dsn",
"user_awl_sql_username", and "user_awl_sql_password". You should also
set "use_auto_whitelist 0" to disable the non-SQL version of the
feature, and delete (or comment out) the "loadplugin
Mail::SpamAssassin::Plugin::AWL" line from your v310.pre file. Restart
amavisd-maia after making these changes.
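
To double-check that nothing AWL-related is still active, a rough Python
sketch like the following can scan the text of a config file for the
settings listed above. The function name is hypothetical, and the parsing
is deliberately simple (first word of each uncommented line); it is not
how SpamAssassin itself reads its configuration.

```python
# Directives named in this message; an uncommented occurrence suggests
# the SQL-based AWL is still configured.
AWL_DIRECTIVES = (
    "auto_whitelist_factory",
    "user_awl_dsn",
    "user_awl_sql_username",
    "user_awl_sql_password",
)
AWL_PLUGIN = "Mail::SpamAssassin::Plugin::AWL"


def find_active_awl_lines(config_text):
    """Return (line_number, line) pairs for uncommented AWL-related settings."""
    hits = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are inactive
        tokens = line.split()
        if tokens[0] in AWL_DIRECTIVES:
            hits.append((number, line))
        elif tokens[0] == "loadplugin" and tokens[1:2] == [AWL_PLUGIN]:
            hits.append((number, line))
        elif tokens[0] == "use_auto_whitelist" and tokens[1:2] != ["0"]:
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Made-up sample content; substitute the text of your own
    # local.cf or v310.pre.
    sample = "\n".join([
        "# auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList",
        "use_auto_whitelist 0",
        "loadplugin Mail::SpamAssassin::Plugin::AWL",
    ])
    for number, line in find_active_awl_lines(sample):
        print(number, line)
```

Any line it reports is a candidate to delete or comment out before
restarting amavisd-maia.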
- --
Robert LeBlanc <rjl at renaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFGHwl+GmqOER2NHewRAoobAKCNLMSCt+lZHTPSSikblvt5NpI4UQCdG4GK
Xrza5Xw3x+hda0yrVeFp4AQ=
=b+5l
-----END PGP SIGNATURE-----