
snover

Members
  • Content Count

    10

Community Reputation

0 Neutral

About snover

  • Rank
    Member
  1. Oh, I know, and that’s where I understand the justification to be. However, RFC 2822 was written in 2001, when there was enough memory to deal with more than 1kB of header data at a time. Since they were changing other more integral things, such as the datetime representation, I am surprised that this limitation was not removed. But when the code is written to comply with the spec, and the spec says MUST, it is the spec that enforces the limit. I guess it depends on what the problem is that you’re concerned about. Accept-Language is all about preferred language and doesn’t have anything to do with keyboard layout, by the way. This is actually a US English keyboard. If you’re trying to resolve the problem with bungled characters in edits, fix Invision Power Board to be consistent with what character set it reports and uses. Quick edit seems to assume ISO-8859-1 which breaks all the non-ASCII UTF-8 characters. If you run the pages through a validator you’ll find there’s a lot lacking in terms of standards-compliance in IPB anyway, but that’s neither here nor there. This is extremely OT so feel free to send a PM about it or something if you want to continue the discussion.
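To illustrate the kind of character-set mismatch described in the post above, here is a minimal Python sketch. It assumes the edit form receives UTF-8 bytes and decodes them as ISO-8859-1; the actual IPB code path is not shown in this thread, so the mechanism is only a guess, and `misdecode` and `repair` are hypothetical helper names.

```python
def misdecode(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled: str) -> str:
    """Reverse the mistake; this only works if no bytes were lost on the way."""
    return garbled.encode("iso-8859-1").decode("utf-8")


original = "café"
garbled = misdecode(original)
print(garbled)          # the single non-ASCII character turns into two characters
print(repair(garbled))  # round-trips back to the original text
```

Plain ASCII text survives the mismatch unchanged, which is why only the non-ASCII characters appear bungled.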
  2. snover

    URL Shrinking Website != Spam

    If you wanted me to search for SpamCop FAQ, you should have said that instead of SpamCop Forum. The FAQ is exactly what I’m talking about in regards to too much information, though. It’s 11 pages long!
  3. From URL Shrinking Website != spam, and suggested to be posted here so that it does not get lost: The spam is already being processed for links. Only a couple of extra steps would be needed to report the actual spamvertised site instead of just the redirection service:
     1. Match the link domain against a list of known shortening domains.
     2. Optionally, opportunistically check an unknown domain to see whether it is a URL shortening service if the domain part is <= 9 characters or so. This would save anyone from having to add domains to the list manually, but it may add too much overhead or garbage to the lookup table, slowing things down.
     3. Do a HEAD request and cache the resulting Location header.
     Unlike the problem of rotating DNS records, shortening service URLs are stable (a given short URL keeps pointing at the same target), so they really only need to be looked up once. Obviously, this proposed method won’t catch every kind of redirect, but it ought to take care of most of the shortening services. I know that SpamCop does not consider spamvertised site reporting to be a particularly important or high-priority thing, but I would love to see this nevertheless if attitudes ever change in this regard.
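The three steps above could be sketched roughly as follows in Python. This is not SpamCop's code, only an illustration under assumptions: the seed list `KNOWN_SHORTENERS` is hypothetical (king.cd comes from the thread, the other entries are examples), "domain part" is read as the whole hostname, and `ShortLinkResolver`, `head_location` and the `fetch` parameter are names invented for this sketch.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical seed list; a real deployment would maintain its own.
KNOWN_SHORTENERS = {"king.cd", "tinyurl.com", "bit.ly"}
MAX_GUESS_LEN = 9  # "<= 9 characters or so", read here as hostname length


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; the 3xx then surfaces as an HTTPError


def head_location(url: str, timeout: float = 5.0) -> Optional[str]:
    """Send one HEAD request and return its Location header, if any."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.headers.get("Location")
    except OSError:  # URLError, timeouts, connection failures
        return None


class ShortLinkResolver:
    def __init__(self, fetch: Callable[[str], Optional[str]] = head_location):
        self._fetch = fetch
        self._cache: Dict[str, Optional[str]] = {}

    def resolve(self, url: str) -> Optional[str]:
        """Return the redirect target of a suspected short URL, or None."""
        host = (urlsplit(url).hostname or "").lower()
        known = host in KNOWN_SHORTENERS            # step 1
        guess = 0 < len(host) <= MAX_GUESS_LEN      # step 2 (optional heuristic)
        if not (known or guess):
            return None
        if url not in self._cache:                  # step 3: look up only once
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

Passing the fetcher in as a parameter keeps the lookup testable without network access, for example `ShortLinkResolver(fetch=lambda url: "http://example.org/")`. The cache is keyed on the full URL, so a failed lookup is also remembered; whether that is desirable would depend on how the real lookup table is managed.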
  4. snover

    URL Shrinking Website != Spam

    I don’t know if it’s really worth mentioning or not, but I had a really, really hard time finding that page. I couldn’t find it by following your instructions; even a search for the word “spamvertized” only got me there thanks to a link in “FAQ Entry: The Link Analysis Process”. (The danger of “important” topics, especially when they are not organised really well, is that as you keep adding things, eventually there is too much and the relevant bits get lost.) Well, in the very specific case of resolving URL shortening services, it’s fortunately not a particularly vexing problem to solve, at least from a programming perspective. The spam is already being processed for links. Only a couple of extra steps would be needed:
    1. Match the link domain against a list of known shortening domains.
    2. Optionally, opportunistically check an unknown domain to see whether it is a URL shortening service if the domain part is <= 9 characters or so. This would save anyone from having to add domains to the list manually, but it may add too much overhead or garbage to the lookup table, slowing things down.
    3. Do a HEAD request and cache the resulting Location header.
    Unlike the problem of rotating DNS records, shortening service URLs are stable (a given short URL keeps pointing at the same target), so they really only need to be looked up once. Obviously, this proposed method won’t catch every kind of redirect, but it ought to take care of most of the shortening services. Anyway, I completely understand if it’s not something anyone wants to implement, as turetzsr said. But it was a thought.
  5. I really, really hope that people don’t make changes to RFC documents and repost them on the Internet. Though, in a twisted way, that would be a rather clever scheme for really confusing some people. (I was looking at the faqs.org copy, for reference.) I guess I really don’t understand why they would add confusion by writing “998/78” there instead of just “78”. In a section called “Long header fields”, no less. Clearly, the people writing the specs aren’t Aspergian enough for me. So then I was correct in thinking that this is where the spec indicates that the 998-character limit applies to folded headers. I understand now, though I still think it’s not terribly clear. (Nor do I think it’s particularly smart for the spec to enforce an arbitrary maximum length on the data portion of unstructured header fields, but I can at least understand the reasoning behind it.) UTF-8, because your Web server returns “Content-Type: text/html; charset=UTF-8”. HTML 4.0 § 5.2.2 states that the HTTP header takes precedence over meta http-equiv. Cheers,
  6. snover

    URL Shrinking Website != Spam

    Is there a reason that SpamCop doesn’t make HEAD requests against the spamvertised URLs to ensure that the original host also gets informed of the spam instead of just the redirection service? I’ve been receiving “Critical Microsoft Update” spam all day that uses king.cd to redirect to rapidshare.com, but SpamCop only reports to king.cd (and Comcast doesn’t seem to like the reports; I keep getting form responses saying that they need more information). (Refs:)
    http://members.spamcop.net/sc?id=z31906593...19d5a53b7dd242z
    http://members.spamcop.net/sc?id=z31922196...3b5bef2f4dc5ccz
    http://members.spamcop.net/sc?id=z31922212...b316f04b65e337z
    http://members.spamcop.net/sc?id=z31931161...33c440a5eff2bbz
    http://members.spamcop.net/sc?id=z31931828...0f3d872a0b0120z
  7. I don't want to seem argumentative, I am just trying to understand. The actual RFC says: From my reading, this says that folding headers is a workaround for the 998 character limitation. Unless your argument is that "Each header field should be treated in its unfolded form for further syntactic and semantic evaluation" means that the rule in 2.2.1 applies ... but in such a case, why would they have said that folding was to deal with the character limitation?
  8. I’m confused about what you are talking about. This is a pretty standard SpamAssassin 3 installation and as far as I can tell, none of the X-spam-* lines are longer than 84 characters. I thought maybe you were talking about the total header length, but I read RFC 2822 and as far as I can tell it only specifies that a single line can be a maximum of 998 characters — nothing specific about the total length of a header as long as it’s been line split properly, and nothing about a 255-character limit anywhere. Interestingly, despite the error message auto-response from their system, I got a response back from a human (a human!) that said they had terminated the spamming account. So it seems that it wasn’t a fatal error.
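For anyone wanting to check their own headers against the limits discussed in the post above, here is a minimal Python sketch. It measures each physical line against the 998-character (MUST) and 78-character (SHOULD) limits from RFC 2822 section 2.1.1, and shows unfolding as described in section 2.2.3. It only measures; it does not settle which form the limit is meant to apply to. The function names and the sample `X-Spam-Report` value are made up for illustration.

```python
import re
from typing import List, Tuple

HARD_LIMIT = 998  # MUST, RFC 2822 section 2.1.1 (excluding the CRLF)
SOFT_LIMIT = 78   # SHOULD, same section


def check_lines(raw_header: str) -> List[Tuple[int, int, str]]:
    """Return (line_no, length, severity) for each physical line over a limit."""
    problems = []
    for n, line in enumerate(raw_header.split("\r\n"), start=1):
        if len(line) > HARD_LIMIT:
            problems.append((n, len(line), "MUST"))
        elif len(line) > SOFT_LIMIT:
            problems.append((n, len(line), "SHOULD"))
    return problems


def unfold(raw_header: str) -> str:
    """Unfold a header: drop any CRLF that is followed by a space or tab."""
    return re.sub(r"\r\n(?=[ \t])", "", raw_header)


# Made-up sample: one header field folded across two physical lines.
folded = "X-Spam-Report: " + "a" * 60 + "\r\n\t" + "b" * 60
print(check_lines(folded))   # no physical line exceeds either limit
print(len(unfold(folded)))   # the unfolded field is longer than 78 characters
```

A field can therefore pass the per-line check while its unfolded form is much longer, which is the distinction the posts in this thread are arguing about.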
  9. Ah, yes. Here you are: http://www.spamcop.net/sc?id=z3185225336ze...bb5f923c6a1daaz
  10. I recently reported some spam that contained a Hotmail return address in the body. When I reported it to Microsoft through SpamCop, I received this weird response from “Micorsoft [sic] Customer Support <report_spam[at]css.one.microsoft.com>”: Since the original message was a very short, typical multipart message with only text and HTML parts, I am perplexed. Has anyone else had trouble recently with reporting spam to Microsoft? Regards,