[Maia-users] newbie problems
Dale Carstensen
dlc at lampinc.com
Sun Dec 3 06:07:26 PST 2006

Ah. I see that maiadbtool.pl is in svn trunk, but not in 1.0.1. Thanks.

I tried to see where the Bayes data is, and found that the tables in
the maia database with bayes and awl in the table names have zero rows.
Hmm. Then I looked here and there, for instance with "locate bayes", and
eventually in /var/amavisd/.spamassassin I found some files that must
be the data. The file names are auto-whitelist, bayes_seen and bayes_toks.
The auto-whitelist has a date of mid-Friday afternoon (it's Sunday morning
now) and the other two have a date of about 5 minutes ago.

Running locate was useless, by the way; "find /var -name '*bayes*'" was
the key. Either the locate database skips /var, or it does cover /var
and missed these files because of the dot. Come to think of it, I'm not
sure locate finds names beginning with "." or their descendants.

So then I thought I would see what's really there, maybe which words
contribute what score. The "perldoc sa-learn" documentation says the
data can be viewed in human-readable format via the --backup option.
It also mentions a --dump option. Of course, I had to add another
option, --dbpath=/var/amavisd/.spamassassin, too.

I don't know what comes from "seen" and what comes from "toks" in the
--dump output, though it's similar to the --backup output, where the
first column is "s" or "t". For the "t" lines in --backup, and for
all the lines in --dump, the fourth column looks like a date in seconds
since January 1, 1970, and the fifth column is 10 hex digits.
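
A minimal sketch of reading those columns, assuming the --backup layout is what the description above suggests: whitespace-separated lines where a "t" line carries a spam count, a ham count, a last-access time in seconds since 1970, and the token. The names TokenRecord and parse_token_line are made up for illustration, and the sample line is invented, not real data. As far as I know the 10 hex digits are a truncated hash of the original word, so they cannot be turned back into text.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class TokenRecord(NamedTuple):
    spam_count: int        # times the token was seen in spam (assumed column 2)
    ham_count: int         # times the token was seen in ham (assumed column 3)
    last_access: datetime  # column 4, seconds since 1970-01-01 UTC
    token: str             # column 5, the 10 hex digits


def parse_token_line(line: str) -> Optional[TokenRecord]:
    """Parse one "t" line of --backup output; return None for any other line."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "t":
        return None
    _, spam, ham, atime, token = fields
    when = datetime.fromtimestamp(int(atime), tz=timezone.utc)
    return TokenRecord(int(spam), int(ham), when, token)


# Invented sample line, only to show the call.
print(parse_token_line("t\t12\t0\t1165154846\t0a1b2c3d4e"))
```

To try it on real data, save the --backup output to a file and call parse_token_line on each line; the "s" lines and any header lines simply come back as None.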

Maybe the bayes data is reasonable?? Maybe not. I did feed all those
miserably mis-classified false negatives back through, after all,
hundreds of them.

Then, via "od -c" I see that auto-whitelist indeed has bad places
like chello.nl and hundreds of other domains that definitely do
not belong in any whitelist.

So I guess I could develop my own large samples of ham and spam,
and feed them through sa-learn with the appropriate options, --dbpath
chief among them, and get a decent Bayesian database.
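
Such a retraining run could be scripted roughly as follows. This is only a sketch: the corpus directories are hypothetical, the helper names are made up, and it assumes the sa-learn options --dbpath, --ham and --spam behave as described in "perldoc sa-learn". By default it only prints the commands, so nothing is learned unless run=True is passed.

```python
import shlex
import subprocess
from typing import List

# Directory where the bayes_seen and bayes_toks files were found.
DB_PATH = "/var/amavisd/.spamassassin"


def sa_learn_command(kind: str, corpus_dir: str, db_path: str = DB_PATH) -> List[str]:
    """Build the sa-learn argument list for a directory of ham or spam messages."""
    if kind not in ("ham", "spam"):
        raise ValueError("kind must be 'ham' or 'spam'")
    return ["sa-learn", "--dbpath=" + db_path, "--" + kind, corpus_dir]


def learn(kind: str, corpus_dir: str, run: bool = False) -> None:
    """Print the command; execute it only when run is True."""
    cmd = sa_learn_command(kind, corpus_dir)
    print(shlex.join(cmd))
    if run:
        subprocess.run(cmd, check=True)


# Hypothetical corpus locations, dry run only.
learn("ham", "/home/dale/corpus/ham")
learn("spam", "/home/dale/corpus/spam")
```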

My question is, can I get any useful representation of the Bayes
data? These hex strings defy interpretation. Well, maybe I just
need to push messages through spamd and see what gets returned --
hmm, how to just do the Bayes part and get the actual score???

And another question: Is this how Maia normally uses the Bayes
and AWL features in SpamAssassin?

Dale

>Dale Carstensen wrote:
>> I'm working on a long reply with Robert LeBlanc's first reply fully
>> quoted. But for now, I have a simple (I hope) question, so I'll
>> put a short quote here, then the question, then a bigger quote for
>> context.
>>
>>> start by fixing the
>>> internal_networks and trusted_networks settings in your local.cf file.
>>> Then you'll want to wipe your Bayes and AWL databases to start fresh.
>>
>> OK. Now, how do I wipe the Bayes and AWL databases without just
>> starting over completely?
>
>options to maiadbtool.pl
>
> --clear-bayes : empty the Bayes database
> --clear-awl : empty the AWL database