Wazoo

[Resolved] Reporting System failure mode

11 posts in this topic

As noted on the Reporting System Status graphic/link at the top right of the page, the Reporting System has a major issue, with the error messaging pointing to an internal network issue. Ellen has dropped by and left a note advising that she has opened a ticket with engineering to get the problem resolved.

The system appears to still be accepting submittals, but is not able to generate and/or provide an actual parser output to build an outgoing Report. There is not enough feedback to make any suggestion as to the usefulness of continuing to submit your spam at the moment. Your call.

Yes, the system will accept your spam and parse it, and then it fails when trying to update the tables necessary for sending the spam reports -- operations, engineering and a DBA are all working on it. I have no time estimate as to how long it will take to resolve. If you are submitting by copy/paste, I do not think continuing to submit spam at this time is productive; if you are mailing in the spams, wait to click the link to finish reporting until after the problem is fixed.

I'll reply to this thread when I have more information (as soon as I figure out how to mark the thread so I can find it again :-)

Ellen

****UPDATE*****

DBA and engineering have located the cause of the problem and have taken the system offline to work on it.

News to follow as it becomes available.

Ellen

**** Update 2 *****

Engineering/Ops are working on the resolution. This is going to take quite a while to remedy. I do not have a restoration time yet but I wanted to let you know that we will not be back online in the next few hours.

Ellen

...Thanks, Ellen, appreciate the update!

********UPDATE 3 *************

We anticipate that the system will be back up around midnight to 3AM EDT - sorry I can't be more specific than that but engineering will want to process a large backup before turning the website back on and it is not known at this time how long that will take. Please note that it could be later than 3AM if the backlog is not processing as fast as we think it will or if new issues are encountered.

Regarding any spams that you may have submitted prior to the system going into maintenance mode -- if you submitted by email and have the return email with the links, go ahead and try the links. However, remember that any spams that you received today during the day will be stale by tomorrow, so I would just delete them and not worry about it. It is more important to submit the new spams than the older ones ....

This will be my last update for today. After the system comes up if you notice any major problems please write to deputies <at> admin.spamcop.net with as much information as possible -- the tracking url if there is one, what exactly you were doing, how you submitted the spam, a small copy/paste snippet of the error message from the website (if there is one) etc.

And many thanks for your patience during this long outage!

Ellen

SpamCop

**** UPDATE 4 *********

Engineering is satisfied with the state of the process queues and has taken the system out of maintenance mode.

Ellen

SpamCop

Tagging as Resolved, un-Pinning the Topic.

http://zeta.cesmail.net/pipermail/scspamco...une/008928.html

From: RW <nobody[at]spamcop.net>
Newsgroups: spamcop
Subject: Re: "Sorry, failed to get reportid from database, will not send." errors.
Date: Thu, 25 Jun 2009 00:12:28 -0600
Message-ID: <h1v4ga$20k$1[at]news.spamcop.net>
NNTP-Posting-Date: Thu, 25 Jun 2009 06:12:26 +0000 (UTC)

Farelf wrote:
> Farelf wrote:
> ...
>>
>> It appears we're back in business. Processing status again
>> showing/indicated at http://www.spamcop.net/spamgraph.shtml?spamstats
>> [Reminder, when this .spamcop.net page is unavailable the independent
>> one(s) at .forum.spamcop.net can be checked in the event of purely
>> local/user effects.]
>>
>> Ellen posted in the forums that the system is out of maintenance
>> mode/backlog processing. She earlier posted in:
>> http://forum.spamcop.net/forums/index.php?...ost&p=71979
>>
> ...
> Though I admit some concern over the downward trend in the spamgraph
> over the past minutes.

If you look at the history though, the drop is within normal range for
this time of day. Unlike other outages, there was no backlog to contend
with. Submissions through the day, including traps, were lost, not delayed.

The problem and solution were fairly simple, but it did require
engineering to spend quite a bit of time today writing new code,
rebuilding databases, etc. The bottom line is SpamCop reached its limit
as programmed.

SpamCop is a collection of perl scripts with the reportid field being a
32-bit integer data type. When SpamCop reached report number
4294967295, it couldn't count any higher. That was its limit.

The solution was to rewrite the code to allow BIGINT in 64-bit, but that
meant rebuilding the databases, tables, etc. That's what took all the
time, but well under the 24 hour estimate.

Richard
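
For anyone curious about the arithmetic behind Richard's explanation, here is a minimal sketch in Python. It is not SpamCop's Perl code or database schema. The helper names `next_report_id` and `can_allocate` are hypothetical, and the assumption that the widened column is a signed 64-bit BIGINT (as in common SQL databases) is mine, not something stated in the post.

```python
# Largest value an unsigned 32-bit field can hold: the report number
# quoted in the post.
UINT32_MAX = 2**32 - 1   # 4294967295

# Largest value of a signed 64-bit integer (assumed BIGINT width).
INT64_MAX = 2**63 - 1


def next_report_id(current: int, max_value: int) -> int:
    """Return current + 1, or raise if the field cannot hold it.

    Hypothetical helper that only models a fixed-width counter.
    """
    if current >= max_value:
        raise OverflowError(f"counter exhausted at {current}")
    return current + 1


def can_allocate(current: int, max_value: int) -> bool:
    """True if one more ID fits in a field with the given maximum."""
    try:
        next_report_id(current, max_value)
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    last_id = 4294967295
    print("fits in 32 bits:", can_allocate(last_id, UINT32_MAX))
    print("fits in 64 bits:", can_allocate(last_id, INT64_MAX))
```

With the 32-bit maximum no further ID can be handed out after 4294967295, while the 64-bit maximum leaves room for the next ID, 4294967296, and a great many more.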

So, SpamCop has reported so much spam that the reports officially became numberless! I was wondering whether we might get close to such a limit at some point.

I recall a Perl app that I did for work where I had to sort events into time order by time_t numbers; the problem was that the garden-variety Perl sort statement is lexical (not numerical), so when we hit time_t = 1,000,000,000 some time ago (in September 2001, if I recall), my sorts suddenly spectacularly failed. I was very proud when I tracked down this bug, and was ticked off that nobody else seemed interested in something that could have been at least as nasty as Y2K was supposed to be.

-- rick
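
The sorting failure Rick describes is easy to reproduce. Perl's default `sort` compares values as strings, and a numeric sort needs an explicit comparison such as `sort { $a <=> $b }`. The sketch below shows the same effect in Python rather than Perl, with made-up timestamp values; it is only an illustration, not Rick's application.

```python
# Made-up time_t values, kept as strings, straddling the 10-digit rollover.
stamps = ["999999998", "1000000001", "999999999", "1000000000"]

# String (lexical) comparison: "1..." sorts before "9...", so the
# 10-digit values end up ahead of the 9-digit ones.
lexical = sorted(stamps)

# Numeric comparison: convert each value to an int for ordering.
numeric = sorted(stamps, key=int)

if __name__ == "__main__":
    print("lexical:", lexical)
    print("numeric:", numeric)
```

While every timestamp has the same number of digits, the two orderings agree, which is why this kind of bug can sit unnoticed until the digit count changes.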

> ...I was very proud when I tracked down this bug, and was ticked off that nobody else seemed interested in something that could have been at least as nasty as Y2K was supposed to be.

As mentioned to Richard in the NGs - http://xkcd.com/571/ (...don't forget to declare sheepCount as a long int.). Evidently you are the inspiration, Rick :D
