queued for message checks: retry timeout exceeded

jeremymcs
Quite often I get this error and have to remove the retry db and restart Exim. It happens in both versions, community and enterprise.
The problem is that the bounce is immediate. If it times out, shouldn't it retry?

---

A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed:

 [hidden email]
   queued for message checks: retry timeout exceeded


--
Jeremy McSpadden
Flux Labs, Inc | http://www.fluxlabs.net | Endless Solutions
Office : 850-250-5590x101 | Cell : 850-890-2543 | Fax : 850-254-2955

_______________________________________________
http://pledgie.com/campaigns/12056

Re: queued for message checks: retry timeout exceeded

Charles DeVault
A little late with the response, but I was waiting to see if I could
trigger this behavior, and I just did.  It certainly didn't look like a
corrupt db to me, so I have a test install of Baruwa 2.0 that I've been
steadily pumping mail through every two minutes via a cron job on a
remote machine, hoping to trigger this.

Relevant log entries (pruned a bit):

root@calcium:~# exigrep 'retry timeout exceeded' /var/log/exim/main.log
2013-10-24 11:07:30 1VZPJi-0005nK-Ld <= [hidden email] H=mercury.spiritone.com [xxx.xxx.xxx.xxx] P=esmtp S=459

2013-10-24 11:07:30 1VZPJi-0005nK-Ld == [hidden email] R=message_checks defer (-1): queued for message checks

2013-10-24 11:07:30 1VZPJi-0005nK-Ld ** [hidden email]: retry timeout exceeded

2013-10-24 11:07:31 1VZPJi-0005nK-Ld Completed


And:

root@calcium:/var/spool/exim.in/db# exim_dumpdb /var/spool/exim.in/ retry

R:spiritone.com -1 0 queued for message checks
10-Oct-2013 11:06:35  24-Oct-2013 11:09:35  24-Oct-2013 17:09:35 *


And:

root@calcium:~# exinext spiritone.com
Route: spiritone.com error -1: queued for message checks
   first failed: 10-Oct-2013 11:06:35
   last tried:   24-Oct-2013 11:09:35
   next try at:  24-Oct-2013 17:09:35
   past final cutoff time
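
The trailing "*" on the timestamp line of the exim_dumpdb record marks a retry record whose final cutoff has been exceeded, which is the same condition exinext reports as "past final cutoff time". A minimal sketch for spotting such records is below. The function name list_expired_retry_keys is made up for illustration, and the sketch assumes the two-line record layout shown above (a key line starting with R: or T:, followed by a line of three timestamps).

```shell
#!/bin/sh
# Hypothetical helper (not part of Exim): read exim_dumpdb retry output on
# stdin and print the key of every record whose timestamp line ends in "*".
list_expired_retry_keys() {
    awk '
        # First line of a record: the key starts with R: (routing) or T: (transport).
        $1 ~ /^[RT]:/ { key = $1; next }
        # Next non-blank line holds the three timestamps; "*" is the last field
        # when the final cutoff has been exceeded.
        key != "" && NF > 0 {
            if ($NF == "*") print key
            key = ""
        }
    '
}
```

It would be fed from the same command used above, for example: exim_dumpdb /var/spool/exim.in retry | list_expired_retry_keys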


The first instant bounce occurred exactly 14 days after the "first
failed" time from exinext, and "queued for message checks" is being
treated as a failure.
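
The 14-day figure can be checked against the timestamps quoted above. This is only a sketch: it assumes GNU date (for the -d option) and that the exinext and main.log times are in the same time zone. The variable names are illustrative.

```shell
#!/bin/sh
# Timestamps copied from the exinext output and the main.log excerpt above.
# Assumes GNU date (-d) and that both times are in the same time zone.
first_failed=$(date -u -d '2013-10-10 11:06:35' +%s)
bounced=$(date -u -d '2013-10-24 11:07:30' +%s)

elapsed=$(( bounced - first_failed ))
echo "$(( elapsed / 86400 )) days, $(( elapsed % 86400 )) seconds since first failure"
```

The bounce in the log excerpt comes just under a minute past the 14-day mark. That would be consistent with a retry rule whose final cutoff is 14 days, though the retry configuration itself is not shown in this thread.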


Removing /var/spool/exim.in/db/retry and
/var/spool/exim.in/db/retry.lockfile resolves the problem.  I did not
have to restart Exim.  (For those feeling squeamish about this, from the
docs: 'Deleting any of Exim’s hints files is always safe; that is why
they are called “hints”.')

I'm considering having the retry files removed daily via a cron job.  If
the exim.in queue only feeds messages to MailScanner, this shouldn't
(theoretically) cause any problems with out-of-control queues.
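
A minimal sketch of what such a cron-driven cleanup could look like. The function name purge_retry_hints is hypothetical, and the spool directory is passed in as an argument rather than hard-coded. Note that this discards every retry hint for that spool, not only the stuck record.

```shell
#!/bin/sh
# Hypothetical helper: delete the retry hints file and its lock file from an
# Exim spool directory. The argument is the spool directory, e.g.
# /var/spool/exim.in from the messages above.
purge_retry_hints() {
    spool_dir=${1:?usage: purge_retry_hints SPOOL_DIR}
    # -f keeps this quiet when the files are already gone.
    rm -f -- "$spool_dir/db/retry" "$spool_dir/db/retry.lockfile"
}
```

A daily cron script would then call: purge_retry_hints /var/spool/exim.in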

Part 10 ("Long-term Failures") of this chapter of the Exim
specification seems to be relevant:
http://www.exim.org/exim-html-current/doc/html/spec_html/ch-retry_configuration.html

Now that I have a retry db file that can trigger this, I'm perfectly
happy to provide any further information y'all may need.

-charles



On 10/07/2013 11:37 AM, Jeremy McSpadden wrote:

> Quite often I get this error and have to remove the retry db and
> restart Exim. It happens in both versions, community and enterprise.
> The problem is that the bounce is immediate. If it times out,
> shouldn't it retry?


Re: queued for message checks: retry timeout exceeded

jeremymcs
Yeah, I've written a small bash script to purge the retry db files outside of the normal exim-tidy cron job. I've noticed it happens if the receiving server is down or not responding for a period of time. It happens to a few domains from time to time.

--
Jeremy McSpadden
Flux Labs, Inc | http://www.fluxlabs.net | Endless Solutions
Office : 850-250-5590x101 | Cell : 850-890-2543 | Fax : 850-254-2955

On Oct 24, 2013, at 1:43 PM, "charles" <[hidden email]> wrote:

> A little late with the response, but I was waiting to see if I could
> trigger this behavior, and I just did.  It certainly didn't look like a
> corrupt db to me, so I have a test install of Baruwa 2.0 that I've been
> steadily pumping mail through every two minutes via a cron job on a
> remote machine, hoping to trigger this.
