Until a few weeks ago, a post to this list came back to my mailbox within a few minutes. Now I am seeing delays.
I spot-checked a couple of posts (one of mine, one from someone else via Yahoo). The two machines that take the longest to pass messages through are:
lists01.pubmisc.prod.ext.phx2.redhat.com ---> 2 mins
mx01.util.phx2.redhat.com ---> 7-10 mins
gene/
Mail Lists wrote:
Until a few weeks ago, a post to this list came back to my mailbox within a few minutes. Now I am seeing delays.
I spot-checked a couple of posts (one of mine, one from someone else via Yahoo). The two machines that take the longest to pass messages through are:
lists01.pubmisc.prod.ext.phx2.redhat.com ---> 2 mins
mx01.util.phx2.redhat.com ---> 7-10 mins
I noticed it, too. One of the last servers in the chain took 26 minutes to pass on an email I posted back on the 22nd. It's been slowly improving since then.
I wonder if it's related to the mailserver changeover?
<deleted>
Someone explain to me why it matters, please?
Mail Lists wrote:
On 12/27/2009 01:09 PM, Marc Wilson wrote:
<deleted>
Someone explain to me why it matters, please?
Coz if the server can only process N messages per day - there will be a backup problem ... and messages will eventually get deleted before they get mailed out.
That help?
Of course a given message, by default, would have to remain in the queue for 4320 minutes before it is declared non-deliverable. Somehow I think the list admins would notice message delays and queue depths that would result in messages becoming non-deliverable and take the appropriate action.
On Sun, 2009-12-27 at 11:36 -0500, Mail Lists wrote:
Until a few weeks ago, a post to this list came back to my mailbox within a few minutes. Now I am seeing delays.
Greylisting, perhaps. If something has changed, the learnt whitelist might no-longer be in effect.
Tim:
Greylisting, perhaps. If something has changed, the learnt whitelist might no-longer be in effect.
Mail Lists:
No, I don't believe so - there is no delay on the incoming MX ... only on the list server and the outgoing MX.
Your ISP's or within the list server servers'?
Headers from your email, as I received it (but abbreviated), below:
Received: from localhost; Mon, 28 Dec 2009 16:13:43 +1030
Envelope-to: tim@localhost;
Delivery-date: Mon, 28 Dec 2009 16:27:35 +1100
Received: from server for tim@localhost (single-drop); Mon, 28 Dec 2009 16:13:43 +1030 (CST)
Received: from mx1-phx2.redhat.com by external mail ; Mon, 28 Dec 2009 16:27:35 +1100
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com ; Mon, 28 Dec 2009 00:20:06 -0500
Received: from int-mx05.intmail.prod.int.phx2.redhat.com ; Mon, 28 Dec 2009 00:17:02 -0500
Received: from mx1.redhat.com ; Mon, 28 Dec 2009 00:16:57 -0500
Received: from s3.sapience.com ; Mon, 28 Dec 2009 00:16:46 -0500
Received: from mail.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
Received: from lap1.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
I can see a delay in the middle, but only a few minutes. That could well be normal processing times.
And something odd within my LAN; some 14 minutes going back and forth in time. All our PCs are NTP synchronised, and timezones are set right (Adelaide, South Australia), so it's not a local clock issue.
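One way to check the per-hop gaps in headers like these is to convert every timestamp to UTC before subtracting. The following is a minimal sketch, assuming the abbreviated `Received: from HOST ... ; DATE` layout shown above; the function names `parse_hops` and `hop_delays` are made up for illustration. Each gap is labelled with the "from" host of the later header, which is roughly the time spent on or leaving that host, provided no hops were dropped when the headers were abbreviated.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# Abbreviated headers copied from the message above, newest first.
HEADERS = """\
Received: from localhost; Mon, 28 Dec 2009 16:13:43 +1030
Received: from mx1-phx2.redhat.com by external mail ; Mon, 28 Dec 2009 16:27:35 +1100
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com ; Mon, 28 Dec 2009 00:20:06 -0500
Received: from int-mx05.intmail.prod.int.phx2.redhat.com ; Mon, 28 Dec 2009 00:17:02 -0500
Received: from mx1.redhat.com ; Mon, 28 Dec 2009 00:16:57 -0500
Received: from s3.sapience.com ; Mon, 28 Dec 2009 00:16:46 -0500
Received: from mail.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
Received: from lap1.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
"""


def parse_hops(text):
    """Return (from_host, utc_time) pairs, oldest hop first."""
    hops = []
    for line in text.splitlines():
        if not line.startswith("Received:"):
            continue
        # The date follows the last semicolon on the line.
        route, _, stamp = line.rpartition(";")
        sender = route.split()[2]  # token after "Received: from"
        when = parsedate_to_datetime(stamp.strip()).astimezone(timezone.utc)
        hops.append((sender, when))
    hops.reverse()  # headers are listed newest first
    return hops


def hop_delays(hops):
    """Seconds between consecutive stamps, labelled by the later 'from' host."""
    return [
        (later[0], int((later[1] - earlier[1]).total_seconds()))
        for earlier, later in zip(hops, hops[1:])
    ]


for host, seconds in hop_delays(parse_hops(HEADERS)):
    print(f"{seconds:5d} s  {host}")
```

For this sample the gaps come out at 184 s for lists01, 449 s for mx1-phx2 and 968 s for the final local hop, and none of them is negative. Compared in UTC, the 16:27:35 +1100 stamp is earlier than the 16:13:43 +1030 stamp, so the apparent 14-minute step backwards may simply come from reading two different UTC offsets as wall-clock times.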
Tim wrote:
Headers from your email, as I received it (but abbreviated), below:
Received: from localhost; Mon, 28 Dec 2009 16:13:43 +1030
Envelope-to: tim@localhost;
Delivery-date: Mon, 28 Dec 2009 16:27:35 +1100
Received: from server for tim@localhost (single-drop); Mon, 28 Dec 2009 16:13:43 +1030 (CST)
Received: from mx1-phx2.redhat.com by external mail ; Mon, 28 Dec 2009 16:27:35 +1100
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com ; Mon, 28 Dec 2009 00:20:06 -0500
Received: from int-mx05.intmail.prod.int.phx2.redhat.com ; Mon, 28 Dec 2009 00:17:02 -0500
Received: from mx1.redhat.com ; Mon, 28 Dec 2009 00:16:57 -0500
Received: from s3.sapience.com ; Mon, 28 Dec 2009 00:16:46 -0500
Received: from mail.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
Received: from lap1.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
I can see a delay in the middle, but only a few minutes. That could well be normal processing times.
And something odd within my LAN; some 14 minutes going back and forth in time. All our PCs are NTP synchronised, and timezones are set right (Adelaide, South Australia), so it's not a local clock issue.
You say it isn't a local clock issue, yet the time zones are flipping within your LAN. AFAIK, Adelaide is GMT+1030 in summer time. The only time I've seen time zone incorrectness like this was when some systems, at the office I worked at, had some UID's that would alter the TZ environment variable. Made troubleshooting time sensitive transactions a real bitch.
I'm especially not fond of systems that use alpha designations for time zones. CST, is that "Central Standard Time (USA)", "China Standard Time", or ? :-)
On Mon, 2009-12-28 at 17:23 +0800, Ed Greshko wrote:
You say it isn't a local clock issue, yet the time zones are flipping within your LAN. AFAIK, Adelaide is GMT+1030 in summer time. The only time I've seen time zone incorrectness like this was when some systems, at the office I worked at, had some UID's that would alter the TZ environment variable. Made troubleshooting time sensitive transactions a real bitch.
Yeah, our timezones are GMT+9.5 normally, or GMT+10.5 in summer time (which is now). A half hour difference, but the headers show the time flipping by 14 minutes, as well. And, it's all on the same computer. Grrrrrr!
Locally, it was fetchmail getting mail from my host, dropping it into my personal (local) mailbox, and I read it through dovecot on the same box.
I'm especially not fond of systems that use alpha designations for time zones. CST, is that "Central Standard Time (USA)", "China Standard Time", or ? :-)
Me either, I've made the same argument on other mailing lists, in the past.
Tim wrote:
On Mon, 2009-12-28 at 17:23 +0800, Ed Greshko wrote:
You say it isn't a local clock issue, yet the time zones are flipping within your LAN. AFAIK, Adelaide is GMT+1030 in summer time. The only time I've seen time zone incorrectness like this was when some systems, at the office I worked at, had some UID's that would alter the TZ environment variable. Made troubleshooting time sensitive transactions a real bitch.
Yeah, our timezones are GMT+9.5 normally, or GMT+10.5 in summer time (which is now). A half hour difference, but the headers show the time flipping by 14 minutes, as well. And, it's all on the same computer. Grrrrrr!
Weird....
Wondering if "hwclock -r" returns a correct time...or if there is a difference between the hardware clock and the system clock.
Tim:
Yeah, our timezones are GMT+9.5 normally, or GMT+10.5 in summer time (which is now). A half hour difference, but the headers show the time flipping by 14 minutes, as well. And, it's all on the same computer. Grrrrrr!
Ed Greshko:
Weird....
Wondering if "hwclock -r" returns a correct time...or if there is a difference between the hardware clock and the system clock.
It shouldn't, and it doesn't. The hardware clock is set to GMT on that machine, too. It's not dual-boot, and it's rarely ever rebooted or shut down. Everything was set up for the least annoyance.
Tim wrote:
Ed Greshko:
Weird....
Wondering if "hwclock -r" returns a correct time...or if there is a difference between the hardware clock and the system clock.
It shouldn't, and it doesn't. The hardware clock is set to GMT on that machine, too. It's not dual-boot, and it's rarely ever rebooted or shutdown. Everything was set up for the least annoyances.
Yes, it "shouldn't" (one of those famous last words). Still, if you have processes reporting times that are off by 14 minutes it would be nice to track down the culprit.
Of "interest" is 14 is "around" half the difference between GMT+1030 and GMT+1100. :-)
On 12/28/2009 02:44 AM, Tim wrote:
Your ISP's or within the list server servers'?
List server - your posting confirms it.
It clearly shows the delay, exactly as the original post said: 2-3 mins in int-mx05 ... and 7-10 mins in lists01-xxx
Received: from mx1-phx2.redhat.com by external mail ; Mon, 28 Dec 2009 16:27:35 +1100
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com ; Mon, 28 Dec 2009 00:20:06 -0500
Received: from int-mx05.intmail.prod.int.phx2.redhat.com ; Mon, 28 Dec 2009 00:17:02 -0500
Below shows no delay from poster to redhat mx:
Received: from mx1.redhat.com ; Mon, 28 Dec 2009 00:16:57 -0500
Received: from s3.sapience.com ; Mon, 28 Dec 2009 00:16:46 -0500
Received: from mail.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
Received: from lap1.prv.sapience.com ; Mon, 28 Dec 2009 00:16:45 -0500
If each message takes 15 mins to process that would be a maximum of 96 messages per day outgoing ... sounds like a potential problem no? Doesn't sound like normal processing time to me ... but what do I know.
2009/12/28 Mail Lists lists@sapience.com:
If each message takes 15 mins to process that would be a maximum of 96 messages per day outgoing ... sounds like a potential problem no?
Only if message delivery was a blocking, serial process - it isn't.
Even if it took a message 15 minutes, to get from input to output, there's nothing that says you can't be submitting a constant stream of messages up to the bandwidth of the input pipe - it would take 15 minutes for your stream to get to the output (high latency!), but once it got there it could be spitting out messages at the same rate as the input.
It would however mean that if the delay holds true, then you'd be able to get about 50 back/forth exchanges on any thread/topic a day - which is certainly more than most threads get on this particular list.
-- Sam
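The arithmetic behind the two views above can be sketched in a few lines. This is only an illustration under stated assumptions: the 15-minute latency is the figure from the thread, while the one-message-per-minute arrival pattern and all function names are hypothetical.

```python
LATENCY_MIN = 15            # per-message transit time discussed in the thread
MINUTES_PER_DAY = 24 * 60


def serial_capacity(latency_min):
    """Messages per day if each message had to finish before the next started."""
    return MINUTES_PER_DAY // latency_min


def round_trips_per_day(latency_min):
    """Back-and-forth exchanges per day when every reply waits for delivery."""
    return MINUTES_PER_DAY // (2 * latency_min)


def pipelined_deliveries(arrival_minutes, latency_min):
    """Delivery times when messages overlap in flight: each one simply
    comes out latency_min after it went in."""
    return [t + latency_min for t in arrival_minutes]


# Hypothetical load: one message submitted per minute for an hour.
arrivals = list(range(60))
deliveries = pipelined_deliveries(arrivals, LATENCY_MIN)
span = deliveries[-1] - deliveries[0] + 1   # minutes during which output flows

print(serial_capacity(LATENCY_MIN))         # 96
print(round_trips_per_day(LATENCY_MIN))     # 48
print(deliveries[0], deliveries[-1])        # 15 74
print(len(deliveries) / span)               # 1.0
```

In the pipelined case the first message still appears 15 minutes late, but the output rate matches the input rate of one message per minute. The 96-per-day ceiling only applies if delivery is strictly serial, and 48 round trips is where "about 50" exchanges per thread per day comes from.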
On 12/28/2009 11:36 AM, Sam Sharpe wrote:
It would however mean that if the delay holds true, then you'd be able to get about 50 back/forth exchanges on any thread/topic a day - which is certainly more than most threads get on this particular list.
-- Sam
I think of it more simply (not per poster but for the server as a whole) - there is a pipe which can urinate outgoing messages at some rate - the number of desired outgoing is
Num_Out = num_subscribers x num_posts per unit of time ..
once the faucet/tap is on full (whatever the max capability of the outgoing servers is), that's the max outflow rate of messages. If Num_Out ever exceeded the max flow rate of the servers there would be a bit of a problem.
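As a rough sketch of that formula, the snippet below compares the desired outflow with a maximum outflow rate and reports how fast a backlog would grow. Every number here (subscriber count, posting rate, server capacity) is a made-up assumption for illustration; only the list maintainers would know the real figures.

```python
def desired_outflow(num_subscribers, posts_per_hour):
    """Num_Out = num_subscribers x num_posts per unit of time."""
    return num_subscribers * posts_per_hour


def backlog_after(hours, num_subscribers, posts_per_hour, max_out_per_hour):
    """Messages still queued after `hours` at a constant load.
    The backlog only grows when desired outflow exceeds capacity."""
    surplus = desired_outflow(num_subscribers, posts_per_hour) - max_out_per_hour
    return max(0, surplus) * hours


# Hypothetical figures: 5000 subscribers, 10 posts per hour.
print(desired_outflow(5000, 10))            # 50000 deliveries wanted per hour
print(backlog_after(24, 5000, 10, 60000))   # 0: capacity exceeds demand
print(backlog_after(24, 5000, 10, 40000))   # 240000: queue grows by 10000 per hour
```

As long as capacity stays above Num_Out the queue drains and added delay costs nothing in throughput. The queue only builds, and messages only start ageing in it, once demand exceeds the maximum outflow rate.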
Depending how many outgoing MX servers etc it may be a total non-issue or could be a problem looming - only the server maintainers can really see the load.
However, if things go slow enough there is a potential problem that the lists may overwhelm the outgoing MX servers.
The observation I had is simply that the delay (which is in the outgoing MX) has increased from 1-2 mins to 10 possibly higher. It is the change in the delay that I thought was of interest.
What to do with that observation is up to the list/mx maintainers I'd imagine.
2009/12/28 Mail Lists lists@sapience.com:
However, if things go slow enough there is a potential problem that the lists may overwhelm the outgoing MX servers.
The observation I had is simply that the delay (which is in the outgoing MX) has increased from 1-2 mins to 10 possibly higher. It is the change in the delay that I thought was of interest.
That's the point I was trying to make. Higher latency does not necessarily mean throughput goes down, it may in fact go up - That's fundamental to a lot of engineering disciplines.
Let's take the example of mail-servers sending out similar messages, to keep this on-topic. We have two possible delivery methods (there are more, obviously):
A) We can send out each message in the order it is queued, one message at a time.
B) We can hold each message a while and wait to see if the same message comes in for a different recipient on the same mailserver - and then send them both in the same SMTP transaction.
Strategy (A) has the lowest latency. Strategy (B) has a higher latency, as it will take a "while" to fill the buffer and pop out the same message. Ultimately with millions of messages in the pipeline, Strategy (B) will have a higher throughput, because it is more efficient to queue multiple mails together.
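To make the difference between the two strategies concrete, here is a small sketch that counts SMTP transactions for the same queue under each one. It assumes the recipient's domain is a good enough stand-in for "the same mailserver", which is a simplification. The queue contents and function names are hypothetical.

```python
from collections import defaultdict


def transactions_one_at_a_time(queue):
    """Strategy (A): one SMTP transaction per queued (message, recipient)."""
    return len(queue)


def transactions_batched(queue):
    """Strategy (B): one SMTP transaction per (message, destination domain),
    carrying every recipient of that message at that domain."""
    groups = defaultdict(list)
    for message_id, recipient in queue:
        domain = recipient.rsplit("@", 1)[1]
        groups[(message_id, domain)].append(recipient)
    return len(groups)


# Hypothetical queue: two list posts fanned out to a handful of subscribers.
queue = [
    ("post-1", "alice@example.org"),
    ("post-1", "bob@example.org"),
    ("post-1", "carol@example.net"),
    ("post-2", "alice@example.org"),
    ("post-2", "bob@example.org"),
]

print(transactions_one_at_a_time(queue))  # 5
print(transactions_batched(queue))        # 3
```

The batched version needs fewer transactions for the same deliveries, at the cost of holding messages until the batch is assembled. That is the latency-for-throughput trade described above.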
So my point being, an increase in Latency is not something to be concerned about.
What intrigues me is that people seem to have some kind of expectation of "immediate" email delivery. Last I looked, email wasn't defined as a reliable transmission method with any kind of time guarantee.
-- Sam
Mail Lists <lists <at> sapience.com> writes:
On 12/28/2009 02:46 PM, Sam Sharpe wrote:
Good points ...
All very well - but the fact remains that for a user like me the list is the primary place to discuss Fedora issues, fixes, workarounds etc., and I would like to see a timely server response - it certainly did not use to be this slow.
Maybe when the lists move to their new servers around 9th January 2010, the servers will be tuned to deal with posts in a timely fashion?
On Wed, 2009-12-30 at 11:16 +0000, Mike Cloaked wrote:
All very well - but the fact remains that for a user like me the list is the primary place to discuss Fedora issues, fixes, workarounds etc., and I would like to see a timely server response - it certainly did not use to be this slow.
Maybe when the lists move to their new servers around 9th January 2010, the servers will be tuned to deal with posts in a timely fashion?
I'm not convinced that the few minutes delay in question is really going to cause you such a big problem. How fast do you expect people to type replies to your question?
On Mon, 2009-12-28 at 10:29 -0500, Mail Lists wrote:
It clearly shows delay exactly as original post said 2-3 mins in int-mx05 ... and 7-10 mins in lists01-xxx
I managed to miss seeing the additional delay.
If each message takes 15 mins to process that would be a maximum of 96 messages per day outgoing ... sounds like a potential problem no? Doesn't sound like normal processing time to me ... but what do I know.
The only way we'll know what's happening is if we (or you) ask whoever's actually in charge of those machines if they know the reason.
For all we know, that point in the chain could be where this list and many others start to come together, and it's got a very large workload. Or it could be the point that processes and destroys masses of spam. Or it's doing some other heavy processing as well as being a mail server. But we'll never know by guessing.