Emails Stuck in Mail Queue

Started by Darkness7148, June 27, 2025, 12:55:38 PM

oskar866

Funnily enough, I was about to start a new topic about mail getting stuck in the queue when I saw this one.  I've not yet upgraded to 2.1.6 and am still on 2.1.4, but I'm getting the same problems.  Searching this forum shows a history of very similar questions, but with no sure answer.  I won't hijack this topic, but I will follow it; a lot of people seem to be having the same problem.

Liviu Lalescu

With SMF-2.1.4 the notification emails worked very well for me. Only the recent updates to 2.1.5 and 2.1.6 brought me this delay problem.

gkawa

I can't say if it happened in 2.1.4. I noticed it after the upgrade. I've been trying different things to find the reason, but I'm stuck. According to the log, the delay is typically two hours, with some exceptions below that. One email was dispatched immediately, in less than one second, with the same priority as the others! And the Daily Digests are all being delayed by more than 10 hours. I don't understand that, since there's no way to tell them apart from the information in the mail queue, except for the priority.



Darkness7148

I can confirm too that it's working.

gkawa

I'm trying it now, and it seems to be working. It's going to take a while because I tried a newsletter to 200 users. So far, it's sending at a rate of 10 per page refresh (it's set for that).

But I don't understand the problem. It's a system to protect the mail server from being overloaded if there are a lot of failed emails. I'm not having any.

The other question, this change has to be reverted before any update, right?

petewadey

Will there be an official update to resolve this? Now it's a known fault.

Sesquipedalian

Quote from: petewadey on July 01, 2025, 11:28:46 AM
Will there be an official update to resolve this? Now it's a known fault.
Yes. However, the official fix might not be the same as the changes described in the comments on the GitHub bug report.
I promise you nothing.

Sesqu... Sesqui... what?
Sesquipedalian, the best word in the English language.

shawnb61

Quote from: gkawa on July 01, 2025, 10:22:41 AM
But I don't understand the problem. It's a system to protect the mail server from being overloaded if there are a lot of failed emails.

My understanding... 

Mail servers have been changing behavior, & SMF needs to adapt to that.  In the past, mail servers acted as a relay...  SMF would give it an email, the server would acknowledge receipt, and SMF's job was done.  If the mail server couldn't send it, only the mail server knew.  To look at failed emails, you'd likely need to use a web client on your host.  (If you haven't done so, go check it out...  You may have 20 years of failed mail attempts in there, and a couple hundred replies where folks have replied to your forum's email...)

So, the old model was:
- SMF: here take this; Host: got it, I'll see what I can do; SMF: thank you, I'm done now!

SMF never really had to deal with items left in the queue that couldn't be sent, because all it had to do was pass it on to the mail server.  SMF's mail queue existed mainly to slow down bulk mailings, to avoid pissing off your host.

But we are seeing some hosts abandon this relay model...  As SMF connects to the mail server, it's being checked at that time.  The new model is:
- SMF: here take this; Host: let me check... gimme a minute... Nope! Recipient not found!; SMF: WHAT?!?!?

SMF wasn't designed to handle real-time responses from mail servers in this fashion.  Entries were getting stuck in the queue forever, sometimes clogging queue processing.  But some hosts are acting this way (& we will likely see more going forward), so SMF has to adapt.  And sometimes the hosts just say: "Recipient's mail server thinks we're sending too many emails!", expecting SMF (i.e., not the mail server as in the past) to hold on & retry later.  Again, new behavior SMF must adapt to.

2.1.5 included some changes to handle this.  The SMF mail queue changed from being solely 'not sent yet' to a mix of 'try again later' and 'not sent yet'. 
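
A rough sketch of what a queue mixing 'not sent yet' and 'try again later' entries could look like. This is not SMF's actual code (SMF is written in PHP); the names `QueuedMail` and `process_queue`, the batch size, the attempt limit and the doubling delay are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedMail:
    recipient: str
    attempts: int = 0      # 0 means 'not sent yet'
    next_try: float = 0.0  # earliest time (in seconds) the entry may be tried


def process_queue(queue, send, now, batch_size=10, max_attempts=5, base_delay=600):
    """Try up to batch_size due entries; return the entries that stay queued.

    `send` is a hypothetical callable that returns True when the mail
    server accepted the message and False when it refused it.
    """
    remaining = []
    tried = 0
    for mail in queue:
        if tried >= batch_size or mail.next_try > now:
            remaining.append(mail)  # batch is full, or entry is not due yet
            continue
        tried += 1
        if send(mail):
            continue                # accepted by the mail server: done
        mail.attempts += 1
        if mail.attempts >= max_attempts:
            continue                # repeated failure: drop the entry
        # 'try again later': double the wait after every failed attempt
        mail.next_try = now + base_delay * 2 ** (mail.attempts - 1)
        remaining.append(mail)
    return remaining
```

The point of the sketch is only that a refused entry is pushed into the future instead of being retried on every pass, so it cannot clog the entries behind it.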

The new queue logic has been on THIS site for a while.  And THIS site sees a tremendous volume of email, so it looked like things were good.

Quote from: gkawa on July 01, 2025, 10:22:41 AM
The other question, this change has to be reverted before any update, right?
Any time you tweak the core SMF code for an issue such as this, you likely need to revert it before the official patch comes out. 
A question worth asking is born in experience & driven by necessity. - Fripp

gkawa

Quote from: shawnb61 on July 01, 2025, 11:54:16 AM
Any time you tweak the core SMF code for an issue such as this, you likely need to revert it before the official patch comes out.
Thanks. I'll keep that in mind.

By the way, I really really really really like the "old model". Any chance to have that as an option?
I understand that for most forum admins it's a problem. But I check my mail server regularly, mostly because it's my work server and it's shared with the forum. If SMF can't send emails, I have bigger fish to fry...

shawnb61

Quote from: gkawa on July 01, 2025, 12:29:46 PM
By the way, I really really really really like the "old model". Any chance to have that as an option?

This isn't an SMF choice - it's how some host mail servers behave.  So SMF must support both the old asynchronous relay model & the new synchronous one.

SMF can definitely speed up with a tweak or two.  Probably back to where you can't tell the difference.

KittyGalore

Quote from: peter_mein on July 01, 2025, 09:34:22 AM
Hello
There seems to be a solution here
I tried it out and it works.

https://github.com/SimpleMachines/SMF/issues/8712

Yes, I can confirm that since I made those code changes, the problem is fixed for me too.  I was able to clear out the mail queue.
SMF Curve 2.0x

gkawa

Quote from: shawnb61 on July 01, 2025, 11:54:16 AM
But we are seeing some hosts abandon this relay model...  As SMF connects to the mail server, it's being checked at that time.  The new model is:
- SMF: here take this; Host: let me check... gimme a minute... Nope! Recipient not found!; SMF: WHAT?!?!?
Got it! I misunderstood the situation.

SleePy

@shawnb61,
You hit the nail on the head with the explanation.  More and more modern mail servers are performing real-time checks as we are trying to hand off the mail, and the current mail code assumes it was a failure.  So it just keeps trying and never sends the mail.  This builds up, and mail stops flowing as SMF didn't have a way to handle this.
We could update the code in SMF to just try and ignore responses, but then legitimate failures could result in mail not being sent.  Think of your mail server being down while SMF just throws the mail out there and doesn't wait for a response.  That is what we did before, but back then the mail server would effectively accept the message and become responsible for trying to send it.  Depending on the mail server config, it would send a failure message back to the sender's inbox, and the suggested setup is to have those forwarded to an admin (all of this is done in the mail server setup, not in SMF).

To adapt, SMF will need to do some more intelligent handling of emails.  I have plans for 3.0 to handle this, and to add an extra, currently unused column so that we can adapt the code in the future without being held back by not being able to change the database.

The worst part about the failures would be the DoS that SMF would effectively start.  When the mail queue started to pile up, it would keep trying to send emails, and if you just upped your mail queue size thinking you're simply busier, your system would keep trying to send.  That could result in other remote servers thinking you're performing some sort of attack and throttling or blocking you, after which you can no longer send any emails to that domain.  Running your own mail server, you learn that your domain and IP reputation are very important, and you take care of everything to ensure you follow the rules, or the big giants will block your domain and won't care about unblocking you.  Those forms typically do nothing, and the blocks were time-based.  With some "AI", I suspect they may try to reduce the time they block you.
Jeremy D ~ Site Team / SMF Developer ~ GitHub Profile ~ Join us on IRC @ Libera.chat/#smf ~ Support the SMF Support team!

shawnb61

Quote from: SleePy on July 01, 2025, 09:12:13 PM
The worst part about the failures would be the DoS that SMF would effectively start.  When the mail queue started to pile up, it would keep trying to send emails, and if you just upped your mail queue size thinking you're simply busier, your system would keep trying to send.  That could result in other remote servers thinking you're performing some sort of attack and throttling or blocking you, after which you can no longer send any emails to that domain.  Running your own mail server, you learn that your domain and IP reputation are very important, and you take care of everything to ensure you follow the rules, or the big giants will block your domain and won't care about unblocking you.

Reading this, I can't help but think that massive sites like SMF, which have decades' worth of old registrations & emails, are sending a massive volume of email to nowhere...

I bet a lot of those are for email accounts that are no longer valid.

I wonder if, upon seeing a 5xx response code, we should inactivate email for that user - unset the opt-in flag.  And never even *try* to send to them again.

I bet the volume of email being sent would drop. 

A lot.

Aleksi "Lex" Kilpinen

If we could make it so that it's retried once, like a week or so later, and then act if it still fails, even better. Just to make sure it's not just a system malfunction or something. (Only saying this because my own email address has been dropped in places I really wish it hadn't, when Microsoft was having temporary issues...)
Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

How you can help SMF

shawnb61

Quote from: Aleksi "Lex" Kilpinen on July 02, 2025, 01:24:11 PM
If we could make it so that it's retried once, like a week or so later, and then act if it still fails, even better. Just to make sure it's not just a system malfunction or something.

The recent change does most of this already.  It slows down resends, & ultimately drops repeated fails. 

The good thing is that the mail transports generally report 4xx codes for those worth retrying.  5xx is normally dead altogether (although 552 is mailbox full, which might be temporary...).  A 550 normally means the mailbox doesn't exist.  Though it can mean that the sender (you/your host) has been blacklisted.
https://www.mailersend.com/blog/smtp-codes
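
As a minimal sketch of that 4xx/5xx split (the function name `classify_smtp_reply` and the three labels are made up for illustration, and treating 552 as retryable is a judgment call based on the "might be temporary" remark above, not anything SMF does):

```python
def classify_smtp_reply(code: int) -> str:
    """Map an SMTP reply code to 'ok', 'retry' or 'permanent'."""
    if 200 <= code < 400:
        return "ok"         # 2xx success, 3xx intermediate (e.g. 354 start mail input)
    if 400 <= code < 500:
        return "retry"      # transient failure: worth trying again later
    if code == 552:
        return "retry"      # mailbox full: assumed here to be possibly temporary
    if 500 <= code < 600:
        return "permanent"  # e.g. 550: mailbox doesn't exist, or sender is blocked
    raise ValueError(f"not an SMTP reply code: {code}")
```

Note that a 550 caused by blacklisting says something about the sender rather than the recipient, so a real implementation would probably want to look at the reply text as well as the number.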

Maybe opt out the 550s with repeated fails.  TBH - I'd opt out all of 5xx.  If they ever fix things they can opt back in.
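
Something along these lines, as a sketch only: the `BounceTracker` class, its threshold of three and its in-memory storage are hypothetical, and a real version would have to live in SMF's PHP code and database.

```python
from collections import defaultdict


class BounceTracker:
    """Opt an address out after repeated permanent (5xx) failures."""

    def __init__(self, threshold=3):
        self.threshold = threshold        # assumed number of 5xx replies tolerated
        self.failures = defaultdict(int)  # address -> consecutive 5xx count
        self.opted_out = set()

    def record(self, address, code):
        """Record the SMTP reply code received for one delivery attempt."""
        if 200 <= code < 300:
            self.failures.pop(address, None)  # a success resets the count
        elif 500 <= code < 600:
            self.failures[address] += 1
            if self.failures[address] >= self.threshold:
                self.opted_out.add(address)   # stop sending to this address
        # 4xx replies are transient, so they neither count nor reset

    def may_send(self, address):
        return address not in self.opted_out
```

Usage would be to call `record()` after every attempt and check `may_send()` before queueing anything new; the user can be opted back in by removing the address from `opted_out`.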

That's actually a benefit to this new synchronous mode we're starting to see...

Under the old relay model, the only way to see if emails are getting rejected is to log on to your host's webmail.  Your forum has no idea which emails are invalid.  If you want to clean those up, you must do so by hand.

gkawa

Quote from: shawnb61 on July 02, 2025, 01:37:27 PM
TBH - I'd opt out all of 5xx.  If they ever fix things they can opt back in.
I'd do that. A mailbox full is as good as dead. I'm sure that person has bigger problems than missing forum notifications. And, most likely, it's an abandoned mailbox.

While checking on this issue, I found out that most daily digest subscriptions are from users who haven't logged into the forum in ages. And we have little to no traffic at all, fewer than 30 users are active every day. I can imagine that the problem is huge for popular forums.

CRM 114

Quote from: peter_mein on July 01, 2025, 09:34:22 AM
Hello
There seems to be a solution here
I tried it out and it works.

https://github.com/SimpleMachines/SMF/issues/8712
For me, this did not work.

Sending a test mail (SMTP) works though.
German Wet Shaving Forum: www.gut-rasiert.de/forum

Oldiesmann

Quote from: CRM 114 on July 02, 2025, 02:31:54 PM
Quote from: peter_mein on July 01, 2025, 09:34:22 AM
Hello
There seems to be a solution here
I tried it out and it works.

https://github.com/SimpleMachines/SMF/issues/8712
For me, this did not work.

Sending a test mail (SMTP) works though.

What didn't work about it? It's worked for me on two different forums (my birthday was two days ago, so the birthday notification mail got stuck in the queue on both of them, and I've since gotten a topic reply notification email from one of them as well). After making the change, if you go to the mail queue and click "Send queue now", it should send all the emails that are stuck in the queue (though I don't recommend doing this if you have a ton of emails in the queue; I only had 3 to 5 messages in the queue on each).
