Retry timeout exceeded – Exim greylist problem

This article relates to Exim 4 running in a WHM/cPanel environment under CentOS, but it may apply to other configurations too.

You may find instances where a local user tries to send mail to a host that operates greylisting, and the messages never get to the recipient. You see entries like this in the exim_mainlog:

2011-11-10 15:14:05 1ROWKK-0003I1-Ia <= (FredBlogs) [] P=esmtp S=7852 id=!&!AAAAAAAAAAAYAAAAAAAAAEDCVk4NrhRJjsshyvaOnAfCgAAAEAAAAOV7jpjiT51Jm/ T="FW: test" for
2011-11-10 15:14:06 1ROWKK-0003I1-Ia == <> R=lookuphost T=remote_smtp defer (-44): SMTP error from remote mail server after RCPT TO:<>: host []: 451 Greylisted, please try again in 223 seconds
2011-11-10 15:14:06 1ROWKK-0003I1-Ia ** retry timeout exceeded
2011-11-10 15:14:06 1ROWKK-0003I1-Ia Completed
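
To see how many messages are affected, something like the following rough sketch can pull out the message IDs that were deferred with a greylisting response and then bounced with "retry timeout exceeded". The function name greylist_timeouts is made up for this example, and it assumes the standard main log layout shown above, where the third field is the message ID.

```shell
#!/bin/sh
# Hypothetical helper: reads Exim main log lines on stdin and prints the IDs
# of messages that were deferred by a greylisting response and then failed
# with "retry timeout exceeded".
greylist_timeouts() {
    awk '
        # "==" marks a deferral; remember the message ID (field 3).
        / == / && /defer/ && /[Gg]reylist/ { deferred[$3] = 1 }
        # "**" marks a failed delivery; report each matching ID once.
        / \*\* / && /retry timeout exceeded/ {
            if (($3 in deferred) && !seen[$3]++) print $3
        }
    '
}
```

On a cPanel box this would typically be run as greylist_timeouts < /var/log/exim_mainlog (the log path is an assumption; adjust it to wherever your main log lives).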

Possible simple reasons for messages failing due to greylisting

Queue Interval time

I’m assuming here that you have a sensible queue retry interval set on the exim command line (the -q switch). You can check this by running

ps aux | grep exim

and checking the output…

/usr/sbin/exim -bd -q15m

The -q15m above means the queue is running every 15 minutes. In a WHM/cPanel environment you should set this in the Tweak Settings > Mail section.

If your retry interval is too long, you may miss the greylist window and get greylisted again on the retry, so the message eventually fails.
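
If you want to pick the interval out of the command line automatically, here is a small sketch. The function name queue_interval is invented for this example, and it only understands the usual -q<time> form, optionally with the q, i, f and l modifier letters in front of the time.

```shell
#!/bin/sh
# Hypothetical helper: print the queue-run interval (the time after -q)
# from an Exim command line passed as the first argument.
queue_interval() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        sed -n 's/^-q[qifl]*\([0-9][0-9smhdw]*\)$/\1/p'
}

queue_interval '/usr/sbin/exim -bd -q15m'
```

On a live Linux server you could feed it the real command line with queue_interval "$(ps -o args= -C exim | head -n 1)".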

Max load queue runner sleep factor

By default, exim will not run the queue if the Linux system load average goes above 3.00. On a modern server with a dozen CPU cores this is a patently silly value; it should really be set to at least the number of cores on the machine. The exim config variable concerned is deliver_queue_load_max.

You can override the value in the default exim.conf file by adding this to the first box at the top of the advanced exim config screen in WHM:

deliver_queue_load_max = 12
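
As a rough sketch of the "at least the number of cores" rule, the snippet below prints a config line sized to the machine. The function name suggest_load_max is made up for this example, and nproc (from GNU coreutils) is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: print a deliver_queue_load_max line sized to the
# number of CPU cores. An explicit core count may be passed as $1;
# otherwise nproc is used.
suggest_load_max() {
    cores=${1:-$(nproc)}
    printf 'deliver_queue_load_max = %s\n' "$cores"
}

suggest_load_max
```

To see the value exim is currently using, run exim -bP deliver_queue_load_max.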

More tricky reasons for messages failing

Exim uses a number of hints databases. On a cPanel server, these live under /var/spool/exim (in the db subdirectory).

The first thing to do is check what exim thinks the next retry will be for your failed message:

# exinext
Route:<> error -44: SMTP error from remote mail server after RCPT TO:<>: host []: 451 Greylisted, please try ag
  first failed: 03-Nov-2011 10:17:20
  last tried:   16-Nov-2011 09:23:21
  next try at:  16-Nov-2011 17:23:21
  past final cutoff time

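In this instance I know the message was only sent on 16th Nov, so something must be wrong for exim to think the first failure was on 3rd Nov. Retry hints are recorded per destination (host or address) rather than per message, so a stale record left over from earlier failures to the same destination would likely produce exactly this: the destination is already "past final cutoff time", and a new message is failed on its first deferral.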
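
A quick sanity check on the exinext dates above shows how far past the cutoff this record is. This is only a sketch: elapsed_seconds is a made-up helper, it relies on GNU date, and the 4 day cutoff is an assumption taken from the retry rule in Exim's stock default configuration, so check the retry section of your own exim.conf.

```shell
#!/bin/sh
# Hypothetical helper: seconds between two exinext-style timestamps.
# Requires GNU date; both timestamps are interpreted as UTC.
elapsed_seconds() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( end - start ))
}

first_failed='03-Nov-2011 10:17:20'
last_tried='16-Nov-2011 09:23:21'
cutoff_days=4   # assumption: final cutoff of the retry rule in force

elapsed=$(elapsed_seconds "$first_failed" "$last_tried")
if [ "$elapsed" -gt $(( cutoff_days * 86400 )) ]; then
    echo "past final cutoff: $(( elapsed / 86400 )) whole days since first failure"
fi
```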

My first try was to run

# exim_tidydb -t 7d /var/spool/exim retry

This removed a whole bunch of retry data from the database, but to no avail: exim still treated any message going to this domain as having first failed on 3rd Nov.

I then decided to rip out the data for this domain directly using exim_fixdb. The man entry for exim_fixdb is a bit dry, and doesn’t really tell you how to identify the record keys, but it’s actually quite easy once you find out how!

First, search the database for your suspect domain:

# exim_dumpdb /var/spool/exim retry | grep <>
<> -44 13133 SMTP error from remote mail server after RCPT TO:<>: host []: 451 Greylisted, please try ag

The key to the hints database record is the first field of that output line: <>
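
If there are several records for the domain, a sketch like this lists just the keys. The function name retry_keys is invented for this example, example.com is a placeholder domain, and it assumes the keys contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads exim_dumpdb retry output on stdin and prints
# the record key (first field) of every line that mentions the pattern
# given as $1.
retry_keys() {
    grep -F -- "$1" | awk '{ print $1 }' | sort -u
}
```

Usage would be along the lines of exim_dumpdb /var/spool/exim retry | retry_keys example.com, with each key printed on its own line ready to paste into exim_fixdb.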

So, now just run exim_fixdb and enter that key at the > prompt to display the record:

# exim_fixdb /var/spool/exim retry
Modifying Exim hints database /var/spool/exim/db/retry
16-Nov-2011 09:23:21
0 error number: -44 SMTP error from remote mail server after RCPT TO:<>: host []: 451 Greylisted, please try ag
1 extra data:   13133
2 first failed: 03-Nov-2011 10:17:20
3 last try:     16-Nov-2011 09:23:21
4 next try:     16-Nov-2011 17:23:21
5 expired:      yes
> d

The d command deletes the record most recently displayed. Now run exinext again:

# exinext
No retry data found for

That’s it – any messages sent to the remote domain should now retry properly again.



