Character-set issues


Wazoo


A bit off-topic, but .... what character-set are you using .. what is the character that seems to have been mis-handled by this application? Did you notice whether it was displayed correctly prior to your edit?

FYI: It isn't necessary to quote the entire message you are replying to; especially mine. Most visitors to this site find reading my submissions once is quite enough; mercifully.

In all fairness, that post was edited quite a while back .. I'm not sure how/why you saw it showing your 'complete' post ... then again, I've been working hard elsewhere, on other servers, maybe the timing and my memory is bad ...????

Moderator Edit: Off-Topic stuff extracted from http://forum.spamcop.net/forums/index.php?showtopic=7598 and made into its own Topic .....


A bit off-topic, but .... what character-set are you using .. what is the character that seems to have been mis-handled by this application? Did you notice whether it was displayed correctly prior to your edit?

It did display correctly in "Preview" and prior to my petite edit. I didn't re-read it, and therefore didn't see the quotation marks hashing. The Font window just says "Font".

In all fairness, that post was edited quite a while back .. I'm not sure how/why you saw it showing your 'complete' post ... then again, I've been working hard elsewhere, on other servers, maybe the timing and my memory is bad ...????

I'm not sure how long your "quite a while back" would compare to my "quite a while"; during which while I had the post up on my screen, but otherwise engaged "on malitia", peppering some racoons' bottoms with some 17 mm blunts to persuade them to frolic yonder. They have a partiality to the skylight in the master bedroom; especially if the TV gets left on when the missus falls asleep. I reckon they got suckered into becoming a Nielsen Family; or something. They're cute as the dickens to look up at, but they dug a hole in the roof near the ensuite skylight and it's going to cost me a couple of $K's to put a new roof on come Spring. Let them get their own d**n TV.


It did display correctly in "Preview" and prior to my petite edit. I didn't re-read it, and therefore didn't see the quotation marks hashing. The Font window just says "Font".

Well, actually, I was asking about "your character-set selection" on your system/web-browser, etc. This is currently driving me a bit bonkers .... where the actual 'problem' is occurring ... some old posts have 'some' characters re-mapped ... I was just asked about 'new' posts and couldn't say that I'd seen the issue (outside of some posts going bad after an edit .. but had to go with that this was 'jist me' as usual) .. and yet, here's a 'new' post .. that also seems to fall under the 'only after an edit' scenario ....

I'm not sure how long your "quite a while back" would compare to my "quite a while"; during which while I had the post up on my screen, but otherwise engaged

Agreed with that bit of issue .. no idea of the timing .. remotely connected to three different servers, trying to find, compare, edit, move various files .... this system is on a KVM switch, so jumping between systems to continue the various troubleshooting/repair/whatever is in progress on those other 'local' systems .... keeping track of time just isn't in the cards <g>

Yeah, the obvious thing is that I need to somehow stop trying to handle these dozens of things at the same time .....


Even further off topic -

...peppering some racoons' bottoms with some 17 mm blunts to persuade them to frolic yonder. ...
Sure hope you meant .17 inch - 17 mm is definitely a bit of overkill, I mean you could drag out the old Ross (7.7 mm - ".303") and that would *still* be overkill IMO.

…pm your system/web-browser…

Well I’ll be a monkey’s uncle; or a racoon’s kit … somewhere or other my character-set got re-set to “Unicode (UTF8)†from “Western (ISO-8859-1) I’m unaware of ever touching that setting, much less needing to, since I installed F-Fox.

I wonder if that last Moz update could have had something to do with it? That would explain a few other wrinkles I’ve had during the last couple of weeks…. which drove me to start using Netscape and OE from time to time.
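The garbling in this post is consistent with UTF-8 bytes being read back as a Western single-byte charset. Here is a minimal Python sketch of that effect, assuming that is what happened here (the thread never confirms it) and assuming the "Western (ISO-8859-1)" label is treated as Windows-1252, as browsers commonly do. The function name `misread_utf8_as_cp1252` is made up for illustration.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    Bytes that Windows-1252 leaves undefined (such as 0x9D) are dropped,
    which is an assumption about how the display layer treats them.
    """
    return text.encode("utf-8").decode("cp1252", errors="ignore")


# Curly apostrophe, left and right double quotes, and the ellipsis character.
for ch in ["\u2019", "\u201c", "\u201d", "\u2026"]:
    print(repr(ch), "->", repr(misread_utf8_as_cp1252(ch)))
```

Each of these characters is three bytes in UTF-8, so each turns into a three-character run starting with "â€". The right double quote is the exception: its third byte, 0x9D, has no character in Windows-1252, which would explain why the closing quote above shows only two garbage characters.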

Agreed with that bit of issue .. no idea of the timing .. remotely connected to three different servers, trying to find, compare, edit, move various files .... this system is on a KVM switch, so jumping between systems to continue the various troubleshooting/repair/whatever is in progress on those other 'local' systems .... keeping track of time just isn't in the cards

Nice try; even plausible. You’re getting old too; eh?


Even further off topic - Sure hope you meant .17 inch - 17 mm is definitely a bit of overkill, I mean you could drag out the old Ross (7.7 mm - ".303") and that would *still* be overkill IMO.

Riiiiiight Steve;

That sure explains why it took me so long to hammer the load into the breech. It was impossible to read the specs on the ammo box in the dark. It's a good thing I was using "air" instead of NC1140 smokeless; eh?

WOT: I see the Aussies are at the Brits this W/E at the Perth Pitch.. One more "Test" to go and it's "Up yer flue with a Kangaroo; eh 'myte'?"


Well I’ll be a monkey’s uncle; or a racoon’s kit … somewhere or other my character-set got re-set to “Unicode (UTF8)” from “Western (ISO-8859-1) I’m unaware of ever touching that setting, much less needing to, since I installed F-Fox.

I wonder if that last Moz update could have had something to do with it? That would explain a few other wrinkles I’ve had during the last couple of weeks…. which drove me to start using Netscape and OE from time to time.

Leaving this intact ... I actually pulled up your post to edit out all the extra whitespace and immediately noted the character change problem ..... want to see if this quoted-copy remains as seen .....

Nice try; even plausible. You’re getting old too; eh?

Between the age and working in too many 'new' areas at the same time ....

For example;

three different servers that JT has allowed me access to all have different vintages and flavors of Linux installed ... none of the storage paths are the same ... different software in use, what is the 'same' is also of different vintages .....

Code-whacking now is in Python, which I still haven't sorted out yet .... the code isn't all that bad, but trying to 'translate' file calls has me whipped pretty good, again pointing out that /var/lib/xxx is /usr/local/xxx on another system, just to offer a small inconsistency <g>

This Forum app (basically PHP) showing these issues on this 'new' server, trying to work out an issue in the MailMan application (basically Python) to get the newsgroup (INN) archives (built by Python scripts) in a new location, which is now kicking me hard due to an issue with Postfix, which then led me to a posting on a List issue that dealt with these same character-set issues, but was explained as (probably) being an issue of the e-mail and the user's MUA encoding configurations ..... and now trying to apply/use that information in trying to figure out the web-browser/Forum-app interaction issues ..... that may or may not be caused by this server configuration ....

Old age and poor vision isn't all that's involved right now <g>

Edit: wondering why the edit of your next post didn't show character-set (change) issues ... danged moving targets .....


Wazoo;

Kidding aside; I don't know how you do it. I feel like a poseur half the time throwing out acronyms and cadged phrases that I barely understand. I spent half an hour earlier trying to figure out what the heck an "IMAP" host is supposed to be.

Just to be clear, before I sent my last sub, I changed my charset back to Western. If that is what you meant when you indicated you didn't see any change.


...It was impossible to read the specs on the ammo box in the dark. It's a good thing I was using "air" instead of NC1140 smokeless; eh?
Yep - mind you, 'bout the most dangerous thing you can do is take out a 12 Ga shotgun with a pocketful of mixed 12 & 20 Ga shells. That's the *reverse* situation, load a 12 Ga shell on top of the 20 that's dropped down the bore ... natural selection in action.
...WOT: I see the Aussies are at the Brits this W/E at the Perth Pitch.. One more "Test" to go and it's "Up yer flue with a Kangaroo; eh 'myte'?"
Social game at Lilac Hill tomorrow then the third test starts at the WACA next week. We don't count our chickens before they're hatched but some hint of optimism is in the air.

Edit: wondering why the edit of your next post didn't show character-set (change) issues ... danged moving targets .....

I just happened on the Nov 22 Announcement about the charset issue. I didn't look at it before because I didn't see how it could involve me.

To the best of my recollection, all my font settings were in Western (ISO) so I was surprised to see at your prompting that F-Fox was in Unicode. Perhaps you have an idea about whether Moz's last update had anything to do with that.

More importantly; which of the 2 are optimum for SC specifically, and other forums and sites generally? I did re-set mine to Western and the other little peculiarities with other sites that I mentioned when browsing in F-Fox seem to have gone away; although it is still a bit soon to be sure. In my case with the alt-edits in SC, the hashing seemed to occur when I had used quotation marks in the non-edited portion of the text.

JFTR: I always have "Enable emoticons?" un-checked.


No idea about 'optimized' .... I spent quite a lot of time trying to compare this, that, and the other .. trying to come up with 'exactly' what was different between the two servers ... confounded primarily because 'almost everything' was 'different' <g> .... So, branched out into more extended research, found that the symptom is one of the downsides of trying to 'internationalize' things.

In the 'good old days' .. one would handle these issues with the basic hardware configuration .. such as loading a specific character map and replacing the print-head on a printer, for instance. Generally, that worked 'well' (?). Trying to accomplish this with software generally works when limiting user input to a specific range, say within the office ..... However, throwing this 'functionality' into the mix where user input comes from all over, those 'external' systems being configured in a myriad of different ways, all feeding into a bit of code developed under somewhat controlled conditions, but put into the wild and now expected to work with code developed elsewhere by someone else under totally different conditions, and all ending up being installed under some other bit of code developed elsewhere by someone else who made other decisions under other conditions ..... and all being placed under the control of some fool without a clue .... kind of amazing that some of this stuff works at all, eh? <g>

That said, thanks for looking through your details, posting back with some answers. Yes, you've definitely helped to narrow down where and why it's happening. However, how to solve it is really kicking me. That I find that others with more experience, knowledge, skills are also a bit confounded takes some of the pressure off, yet .... (also in answer to another of your questions) ... I have this mental/attitude problem in that not having/knowing the solution to a question drives me crazy .... so the search continues.

The easy answer is to take 'everything' back to only 'recognizing' plain English (or at least that 'default' Windows scenario of iso-8859 or some such) ... but the way of the 'future' does include the i18n stuff ..... yet noting that spammers have long used that to their advantage already ... just a definite pain all the way around ....


It seems to me that a big part of the problem is that we can not see what is actually happening.

It appears that all characters are actually stored as character codes, which in reality are stored in binary, but all we see is the translated displayed character.

One of the most common conversion problems seems to be when a user inserts a single or double quote pair (separate character codes for left and right quote marks). These appear to me to be different from the character code used by the standard US English keyboard, which has a single key (unshifted for a single quote, shifted for a double quote) for quote marks.

Is there a character map available for the forum default font for both charset=UTF-8 on the new server and the charset used on the old server?
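For what it's worth, the mapping depends on the charset rather than the font, and a small Python sketch can print the relevant rows. The charset of the old server is not stated anywhere in this thread, so ISO-8859-1 and Windows-1252 are included only as likely candidates. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode name, and byte encodings of one character."""

    def hex_bytes(encoding: str):
        try:
            return " ".join(f"{b:02x}" for b in ch.encode(encoding))
        except UnicodeEncodeError:
            return None  # the character does not exist in this charset

    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf-8": hex_bytes("utf-8"),
        "iso-8859-1": hex_bytes("iso-8859-1"),
        "cp1252": hex_bytes("cp1252"),
    }


# Keyboard quotes first, then the left/right "curly" pairs.
for ch in ['"', "'", "\u201c", "\u201d", "\u2018", "\u2019"]:
    print(describe(ch))
```

The keyboard quotes come out as a single identical byte in all three charsets, so they survive any mix-up. The curly quotes are three bytes each in UTF-8, one byte each in Windows-1252, and have no slot at all in strict ISO-8859-1, which is why they are the first characters to break when the two ends disagree.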


Among other things going on today, I was looking up some old data, came across some posted MySQL results, posted within the [ code ] brackets so as to keep the whitespace showing .... most of that whitespace stuff (assume spaces for the most part) has been converted to "?" .... I can find no rhyme or reason for this. These posts weren't edited, haven't been 'touched', yet the data in the MySQL database has actually been changed. As it happens, this is what I've been searching through for the last couple of hours .. and truthfully, coming up with zero answers.


  • 2 weeks later...

Wazoo;

A one-time observation that might reflect some technical matter you would understand.

I posted to: http://forum.spamcop.net/forums/index.php?...iew=getlastpost

an hour or so ago and had the devil of a time after I tried to effect an edit. It still isn't quite right, but it's close enough for me to leave it be.

What I did try, and got joy from, was doing it as a Complete Edit instead of a Quick Edit.

If it makes any sense to you, it seems to me that in the Quick Edit application some "English" punctuations get interpreted as, or by, a program intended for a different language; like Spanish.


Invision Power Services > Bug Tracker > IP.Board > Fixed

http://forums.invisionpower.com/index.php?...g_title_id=4281

Gotta love the differing concepts of 'support'

Matt fixed this (so we hope at least).

Of course, nothing about what the fix was, when / where it might show up for others to use .... but, be so happy that it was fixed (somewhere) ...????


Wazoo, you been working on the edits? Out of nowhere, the "This post has been edited by ..." came back (been missing for some time, apart from your own - Admin - edits). Just wondering if this has also fixed the character morphing in quick edit and wherever. A test seems to show apostrophes, dashes etc. are now stable...

[Well, "edited by" isn't evident here & now with my own post but it appeared in http://forum.spamcop.net/forums/index.php?...ost&p=52826 ]


Wazoo, you been working on the edits?

I'll be honest. I really haven't done squat here or in/on a half-dozen other support forums in a few days. I have checked in here every now and then, basically to make sure that it was still running, with no issues. Thankfully, it's been amazingly slow, considering that enough time has passed that all those new xmas computer (users) should have been on-line for a bit.

That attitude thing going on, issues down the street, ... a computer got dropped off with a CD that the owner couldn't remove .. turned out to be Clapton doing some acoustic blues ... reminded me of just how out of practice I've become, not really able to remember the last time I actually pulled the acoustic out of its case (not even going to talk about the electric, it's so buried at present) ....

Had thoughts of loading up a tossed-away WinBook-XL (233MHz P-III, 80M RAM) into a linux box ..... up to something over a dozen different distros of different flavors ... only three of them actually making it to a 'seeing a prompt' stage, one actually bringing up a GUI, none actually able to fire up the PCMCIA wireless card to actually make a connection .... lots of time lost there for sure ....

Out of nowhere, the "This post has been edited by ..." came back (been missing for some time, apart from your own - Admin - edits).

[Well, "edited by" isn't evident here & now with my own post but it appeared in http://forum.spamcop.net/forums/index.php?...ost&p=52826 ]

My guess, you either did a 'quick edit' or didn't see the checkbox in a full edit .... either way, the flow would go that the poster had edited that post, which left the 'Edited by:' line in the post .. your edit then 'updated' that line ..... the only way 'around' this is to do a 'full' edit and remove the checkbox flag before saving ....

Just wondering if this has also fixed the character morphing in quick edit and wherever. A test seems to show apostrophes, dashes etc. are now stable...

The more searching I do on the issue, the more it becomes apparent that it's happening all over. The drive to internationalize everything has left things in quite a mess. The 'randomness' seen here I have (in my opinion) isolated to an issue of just where your browser has been / what it's seen just prior to doing an edit of a posting made by someone else with the same 'conditional background' ..... if 'everything' prior to the edit was seen as iso-8859-x stuff, things flow along gracefully. If one of the two actions occurred after the browser had been 'adjusted' to display a utf-8 page, then the data saved by the post/edit of that post also gets 'adjusted' .....

I was chasing down a Microsoft employee's Blog the other day, all hosted on a Microsoft server, discussing the packaged 'virtual PC' (ass-backwards in my opinion, allowing one to use a 'current XP with IE7 installed' system to also then retrograde back to also then run IE6 on the same system to check for web-page issues .... where the real issue is web-pages that worked fine for IE6 but are 'broken' under IE7 ...???) .. anyway, imagine my surprise to see Blog entries (assumedly plain text stuff other than the actual and real other-than-English posts) displaying some of these same issues ... quote marks, commas, etc. showing up as utf-8 character-string definitions .... in 'some' posts, most having no issue at all ....
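The post-then-edit scenario described above can be mimicked in a few lines of Python. This is only a sketch under assumptions: the function names `submit` and `render` are invented for illustration, the two charsets are taken to be UTF-8 and Windows-1252, and nothing here is claimed to be what the forum software actually does.

```python
def submit(text: str, browser_charset: str) -> bytes:
    """Bytes a browser would send if it encoded the form in browser_charset."""
    return text.encode(browser_charset)


def render(stored: bytes, page_charset: str) -> str:
    """Text a reader sees when the stored bytes are decoded as page_charset."""
    return stored.decode(page_charset, errors="replace")


original = "it\u2019s"  # contains a curly apostrophe

# Both ends agree: the post comes back unchanged.
consistent = render(submit(original, "utf-8"), "utf-8")

# Browser sent UTF-8, page is later shown as Windows-1252: garbled.
mismatched = render(submit(original, "utf-8"), "cp1252")

# Editing and re-saving the garbled text under the same mismatch makes it worse.
edited_again = render(submit(mismatched, "utf-8"), "cp1252")

print(repr(consistent))
print(repr(mismatched))
print(repr(edited_again))
```

As long as both ends agree on the charset, the text round-trips cleanly whichever charset it is. The damage appears only when the save and the display disagree, and each further edit under that mismatch lengthens the garbage, which fits the 'only after an edit' pattern reported earlier in the thread.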


Thanks for the update, mate. I'm dusting off my abacus, not a lot of chaos to be found between the "heaven" and "earth" stones.

Heh! Another one of those nasty war stories there <g> ....

Picture me calling the Chinese, Japanese, Korean, etc. embassies (all somewhat 'local' at the time in the Washington D.C. area) asking what the difference was between a Chinese, Japanese, Korean, etc. abacus .... all because the 30+ pound Electrical/Electronic Engineering Dictionary specifically called out the 'Chinese abacus' in a definition .... something I was told finally got changed a couple of revisions down the road ...


...asking what the difference was between a Chinese, Japanese, Korean, etc. abacus ....
:lol:

What, you missed the Russian one? Recall an Isaac Asimov article yonks ago about traditional Russian binary counting (how to count to 1024 decimal on your fingers, more if you use trinary). Didn't know the Russians ever needed the abacus until Wikipedia came along (Я не культурн).


:lol:

What, you missed the Russian one? Recall an Isaac Asimov article yonks ago about traditional Russian binary counting (how to count to 1024 decimal on your fingers, more if you use trinary). Didn't know the Russians ever needed the abacus until Wikipedia came along (Я не культурн).

Steve;

Give Wazoo a break. He'll never find "yonks" in a "Yank" dictionary. It's just us'n Commonwealth folk what uses a "Concise Oxford Dictionary" et.al. libr. such as what's been out since the 70's that gets to see an entry for it.

rod


...He'll never find "yonks" in a "Yank" dictionary. ...
Dictionary not required - a couple of episodes of AbFab is more than enough of an education in the Sloane Ranger dialect. Or did the Poms only inflict AbFab on the colonies?

What, you missed the Russian one?

The 'Asian' embassy contacts I could 'cover' due to the Dictionary listing. Hitting the Soviet embassy from within that secure facility would probably have set off a few too many alarms <g>


Dictionary not required - a couple of episodes of AbFab is more than enough of an education in the Sloane Ranger dialect. Or did the Poms only inflict AbFab on the colonies?

Sorry to break the news to you, but the BBC couldn't fob AbFab off on us either. I had to google it to find out what the heck you were dithering on about. It might have been available on satellite or some premium plus cable pack, but as far as I know the MMM didn't carry it.

"Sloanies" haven't made much of/any impact up here either. Now if you want to hear "Valley Girls" patois, all you have to do is walk through any chichi campy burb on our west coast that can pretend to having a private school in it and you'd think you were downwind of Brentwood High. Fer shur and like eeeew!


Yesterday ... I saw rooster's Linear post #21 .... way too much extra vertical whitespace ... Cyrillic characters displayed just fine .... Quick Edit to remove the white-space .. saved the post .. Cyrillic characters went to crap ....

Today, came in here due to a 'new' post ... Linear post #21 is showing Cyrillic characters just fine ....

I am so lost on what is actually going on when and where ....


Archived

This topic is now archived and is closed to further replies.
