17889 - Changing character set reloads the page from web.

Reporter

Description

•

25 years ago

When a site (any site) has finished loading, pick any character set from the View menu other than the current one. The page will start loading from the net again. I think this is not needed, since the whole page source is already in memory. P.S. I hope this belongs to i18n; if not, apologies.

Frank Tang

Updated

•

25 years ago

Assignee: ftang → buster

Frank Tang

Comment 1

•

25 years ago

buster, we simply do a reload of that page. Does reload in webshell get the data from the net? Do we have a cache now?

buster

Updated

•

25 years ago

Status: NEW → RESOLVED

Closed: 25 years ago

Resolution: --- → REMIND

Target Milestone: M15

buster

Comment 2

•

25 years ago

SeaMonkey has no cache yet. I'll mark this REMIND, so that once a cache is in place we can verify that it is used correctly.

Jesse Ruderman

Comment 3

•

25 years ago

This is similar to bug 6119 (view-source also reloads from the server).

Masatoshi Kimura [:emk]

Comment 4

•

25 years ago

Reopening since the cache has arrived, and changing the milestone since M15 is out.

Status: RESOLVED → REOPENED

Resolution: REMIND → ---

Target Milestone: M15 → M16

buster

Comment 5

•

25 years ago

Moving out to M18 and adding the perf keyword. I may try to find another owner for this.

Status: REOPENED → ASSIGNED

Keywords: perf

Target Milestone: M16 → M18

buster

Comment 6

•

24 years ago

gordon, can you verify that this is no longer an issue? I don't know how to tell if a page is being loaded from net or cache on my connection here at the office. I think the things to verify are: 1) change charset 2) view source 3) edit page

Assignee: buster → gordon

Status: ASSIGNED → NEW

OS: Linux → All

Matthew T (active 1999-2002)

Comment 7

•

24 years ago

This is not just perf, it's a data-loss issue. Real-world example ... A Japanese customer at the Internet cafe started up IE 5, logged in to a Japanese Web-mail site, noticed that the Japanese text was displaying as gibberish, but didn't know how to change it ... so she started composing a new message anyway. Came the time to attach a file, all the button text was gibberish and she couldn't remember which button was the `Attach' button, so she asked for help. When we changed the encoding to Japanese, IE reloaded the page. Her entire message (entered into a TEXTAREA on the page, as is usual for Web-mail accounts) was lost, with no way to retrieve it. Ouch. Relying on the cache is probably not a good idea. Mozilla should never reload the page on a change of character set, even if the page's caching information says it should always be reloaded, and even if the disk cache is set to zero. In this way, changing character set is similar to save, print, view source, etc.

URL: Any

Keywords: perf

Hardware: PC → All

Matthew T (active 1999-2002)

Updated

•

24 years ago

Depends on: 40867

gordon

Comment 8

•

24 years ago

Sounds like the problem is higher level than the cache.

Assignee: ftang

Frank Tang

Updated

•

24 years ago

Status: NEW → ASSIGNED

Teruko Kobayashi

Updated

•

24 years ago

Keywords: nsbeta3

Frank Tang

Comment 9

•

24 years ago

nsbeta3+ per bug meeting P2

Keywords: perf

Priority: P3 → P2

Whiteboard: nsbeta3+

Frank Tang

Comment 10

•

24 years ago

How can we verify whether it is reloaded from the net or from the cache? momoi- do you have a web server somewhere whose log we can check to tell?

Frank Tang

Comment 11

•

24 years ago

How can we verify whether it is reloaded from the net or from the cache? momoi- do you have a web server somewhere whose log we can check to tell? Can you teach teruko/blee how to view the server log?

Katsuhiko Momoi

Comment 12

•

24 years ago

Yes. It's quite easy to tell from the server log whether reloading leads to a new access to the server. blee already has access to a server and I'll be happy to assist. If a more subtle form of access record is needed, that can be arranged, too.

Frank Tang

Comment 13

•

24 years ago

blee- can you try nsbeta2 with some page that has a META charset, and see whether it accesses the server once or twice?

Whiteboard: nsbeta3+ → nsbeta3+, waiting for QA result from blee.

Teruko Kobayashi

Updated

•

24 years ago

QA Contact: teruko → blee

Teruko Kobayashi

Comment 14

•

24 years ago

Changed QA contact to blee@netscape.com.

Frank Tang

Comment 15

•

24 years ago

Using nsbeta2:
1. If I hit a page with a META charset, it reloads, hitting the HTTP server twice.
2. If I change the encoding, it reloads, hitting the HTTP server again.
3. If I hit the reload button, it reloads, hitting the HTTP server again.
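The server-log check discussed in comments 10 to 15 can be approximated locally with a tiny HTTP server that counts requests. This is only a sketch: `CountingHandler`, `start_server`, `fetch`, and the sample page are hypothetical names invented for illustration, not anything the QA team used.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Sample page that declares its charset only via META, like the test case above.
PAGE = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=Shift_JIS"></head>'
        b'<body>test</body></html>')


class CountingHandler(BaseHTTPRequestHandler):
    hits = 0  # class-level counter shared by all requests

    def do_GET(self):
        type(self).hits += 1
        self.send_response(200)
        self.send_header("Content-Type", "text/html")  # deliberately no charset
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet; the counter is the "log"


def start_server():
    """Start the counting server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server):
    """Fetch the page once, bypassing any configured proxy."""
    host, port = server.server_address
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/") as resp:
        return resp.read()


if __name__ == "__main__":
    server = start_server()
    fetch(server)
    fetch(server)
    print("server hits so far:", CountingHandler.hits)
    server.shutdown()
    server.server_close()
```

To test a browser instead of `fetch`, one would keep the server running, load the printed address, change the encoding, and compare `CountingHandler.hits` before and after.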

Matthew T (active 1999-2002)

Comment 16

•

24 years ago

Erm, a gentle reminder: the fix for this bug should have nothing to do with the cache. The page should not be reloaded from anywhere -- Net, cache, whatever -- otherwise you will have problems with DOM stuff (for example, form elements you have filled in, before changing the encoding, being inadvertently cleared). That's why this bug is dependent on bug 40867.

Dan Rosen

Comment 17

•

24 years ago

This bug is closely related to bug 6119, in which viewing source or saving reloads from the web. Not marking as a dup, but this and 6119 are likely to have the same fix.

Frank Tang

Comment 18

•

24 years ago

When the user selects a "character set", the user is saying: the character set the browser is currently using is not the right one; change it to THIS charset. When that happens, it WILL and it HAS to reload, because we need to reconvert the data from bytes to Unicode BEFORE we pass it to the parser. No matter how we fix it, it requires a RELOAD. The question is whether we reload from the cache or from the net. If some code has been applied to the document through the DOM, it has to be redone.

Matthew Thomas:
> A Japanese customer at the Internet cafe started up IE 5, logged in to a
> Japanese Web-mail site, noticed that the Japanese text was displaying as
> gibberish, but didn't know how to change it ... so she started composing
> a new message anyway. Came the time to attach a file, all the button text
> was gibberish and she couldn't remember which button was the `Attach' button,
> so she asked for help. When we changed the encoding to Japanese, IE reloaded
> the page. Her entire message (entered into a TEXTAREA on the page, as is
> usual for Web-mail accounts) was lost, with no way to retrieve it. Ouch.

The story tells us:
1. If the user corrects her mistake when she first sees the garbage, she won't have this problem.
2. If people delay correcting the mistake, they pay a huge price.
3. If s/he pays a huge price this time, s/he won't make the same mistake twice, and s/he will remember how to do it for her/his whole life.
4. The current SeaMonkey behavior is no worse than IE.

Suggest nsbeta3-

Whiteboard: nsbeta3+, waiting for QA result from blee.

Dan Rosen

Comment 19

•

24 years ago

Frank: Matthew's story is fun, but it's not the worst case behavior of this bug. The worst case is (for example) when, at the end of a big important transaction (such as money transfer, stock transactions, etc.) a page is displayed as the result of your cgi POST transaction. You need to change the charset, but in doing so, you re-execute the POST because the document is reloaded. This doubles your transaction, which you *really* don't want to do. As you noted, there are two places we can get documents from: the cache, and their original source. Necko does this for us. What we need is what's being discussed in bug 40867 (see Bill Law's comments on 2000-8-24), which is a way besides the cache for Necko to hang on to the current document. See also bug 6119 which describes several other symptoms of this same problem. You might want to reassign this or mark as a DUP of either 40867 or (better) 6119.

Matthew T (active 1999-2002)

Comment 20

•

24 years ago

I don't think this is a dup. Once 40867 is fixed, it may be with a new function or whatever that the fix for this bug needs to call to get the source when it reconverts the bytes to Unicode. You may want to note in bug 40867 that the source needs to be retained at the byte level, not the Unicode level. I can understand this getting nsbeta3- because it's `no worse than IE', though the use of IE as a measuring stick for whether bugs get +ed or -ed is mildly galling.

Jaime Rodriguez, Jr.

Comment 21

•

24 years ago

Marking as nsbeta3+ per I18N Bug Triage.

Whiteboard: [nsbeta3+]

Frank Tang

Comment 22

•

24 years ago

> The worst case is (for example) when, at the end of a big important transaction
> (such as money transfer, stock transactions, etc.) a page is displayed as the
> result of your cgi POST transaction. You need to change the charset, but in
> doing so, you re-execute the POST because the document is reloaded. This
> doubles your transaction, which you *really* don't want to do.

I don't understand this. If the transaction is important, the server should send the charset in the HTTP Content-Type header instead of leaving it blank and letting the user switch it. If the server sends it correctly, the user won't even need to change the character set. The server can easily fix both the display problem and this issue by sending charset= in the HTTP header.

> You may want to note in bug 40867 that the
> source needs to be retained at the byte level, not the Unicode level.

For this particular bug, we should cache at the byte level, not at the Unicode level, because switching the view encoding means we reinterpret the bytes as Unicode. I think we should minus this bug.

Phil Peterson

Comment 23

•

24 years ago

PDT agrees this bug could be nsbeta3-

Frank Tang

Comment 24

•

24 years ago

put (consider to cut) into the status whiteboard

Whiteboard: [nsbeta3+] → [nsbeta3+](consider to cut)

Frank Tang

Comment 25

•

24 years ago

[nsbeta3-] per i18n bug meeting

Whiteboard: [nsbeta3+](consider to cut) → [nsbeta3-]

Ari Pollak

Comment 26

•

24 years ago

*** Bug 53724 has been marked as a duplicate of this bug. ***

Cyril Bortolato

Comment 27

•

24 years ago

*** Bug 55725 has been marked as a duplicate of this bug. ***

Frank Tang

Comment 28

•

24 years ago

Giving this bug to nhotta since it is browser related. Marking as Moz0.9 and P3. We should decide what we want to do with this bug.

Assignee: ftang → nhotta

Status: ASSIGNED → NEW

Keywords: nsbeta3 → intl

Priority: P2 → P3

Whiteboard: [nsbeta3-]

Target Milestone: M18 → mozilla0.9

nhottanscp

Updated

•

24 years ago

Target Milestone: mozilla0.9 → Future

Andreas Becker

Comment 29

•

24 years ago

Changing QA Contact to andreasb.

QA Contact: blee → andreasb

Andreas Becker

Comment 30

•

24 years ago

Changing QA contact to ylong@netscape.com.

QA Contact: andreasb → ylong

Andreas Becker

Comment 31

•

24 years ago

*** Bug 74043 has been marked as a duplicate of this bug. ***

Hixie (not reading bugmail)

Updated

•

23 years ago

Whiteboard: [Hixie-P3] (HTTP)

Frank Tang

Comment 32

•

23 years ago

*** This bug has been marked as a duplicate of 82244 ***

Status: NEW → RESOLVED

Closed: 25 years ago → 23 years ago

Resolution: --- → DUPLICATE

Danny

Comment 33

•

23 years ago

I can still reproduce the problem reported by the original reporter when Mozilla is in "online" mode (i.e. "view"->"character coding"->[select a different charset] reloads from network). However, when Mozilla is in "offline" mode, it seems to happily re-use the data from cache. Does the fix for bug 82244 completely fix this bug? Or is there a new bug filed for the "online" reload behavior? Win2k, build ID:2001072703 trunk

nhottanscp

Comment 34

•

23 years ago

reopen

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

nhottanscp

Comment 35

•

23 years ago

Reassign to ftang.

Assignee: nhotta → ftang

Status: REOPENED → NEW

rpotts (gone)

Updated

•

23 years ago

Depends on: 90722

Frank Tang

Updated

•

23 years ago

Status: NEW → ASSIGNED

Frank Tang

Comment 36

•

23 years ago

shanjian- can you help to look at this one?

Assignee: ftang → shanjian

Status: ASSIGNED → NEW

Target Milestone: Future → mozilla0.9.6

Shanjian Li

Comment 37

•

23 years ago

I don't understand why this bug got reopened. Changing the charset will reload the page. If the cache is available, we will use it; otherwise, we reload the page from the web. It might look possible to redo charset conversion without downloading data from the website, but in fact we need big arch change to do that. That is because the raw data is not kept anywhere except the cache, and round-trip conversion is usually not possible. Don't try to reopen this bug. File a new bug if you believe something does not work as expected, but changing the character set does need to reload from the web (treat the network cache as a fast way to access the web; the network cache is transparent to the other components inside the browser).
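The claim that round-trip conversion is usually not possible can be illustrated with a small experiment. It assumes the decoder substitutes U+FFFD for undecodable bytes, which is a common choice but not necessarily what the Mozilla converters of the time did; `round_trips` is a hypothetical helper name.

```python
def round_trips(raw: bytes, charset: str) -> bool:
    """Decode with `charset` (replacing bad bytes), re-encode, compare."""
    text = raw.decode(charset, errors="replace")
    return text.encode(charset, errors="replace") == raw


# Shift_JIS bytes, as a server might send them without a charset label.
raw = "日本語".encode("shift_jis")

if __name__ == "__main__":
    for guess in ("latin-1", "utf-8", "ascii"):
        print(guess, "round-trips:", round_trips(raw, guess))
```

A single-byte charset that assigns a character to every byte value, such as ISO-8859-1, does round-trip. A wrong guess of UTF-8 or ASCII does not, because the invalid bytes have already been collapsed into replacement characters by the time the text reaches the DOM.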

Status: NEW → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → WONTFIX

Yuying Long

Comment 38

•

23 years ago

Marking as verified per Shanjian's comments.

Status: RESOLVED → VERIFIED

Hixie (not reading bugmail)

Comment 39

•

23 years ago

> It might look possible to redo charset conversion without downloading data
> from website, but in fact we need big arch change to do that.

Incompetency and/or laziness are not valid reasons for WONTFIXing a bug. REOPENing. If you are unable or don't want to fix it, reassign it to nobody. This is a valid bug. Just like with View Source, we should *never* reload from the network unless the user has requested it.

Status: VERIFIED → REOPENED

Resolution: WONTFIX → ---

Shanjian Li

Comment 40

•

23 years ago

reassign to nobody.

Assignee: shanjian → nobody

Status: REOPENED → NEW

Boris Zbarsky [:bzbarsky]

Updated

•

23 years ago

Keywords: helpwanted, nsCatFood

Asa Dotzler [:asa]

Comment 41

•

23 years ago

0.9.6 is long gone. -> 0.9.7

Target Milestone: mozilla0.9.6 → mozilla0.9.7

Matthew T (active 1999-2002)

Comment 42

•

23 years ago

Nobody is nobody@mozilla.org. Accept no cheap imitations. Reassigning.

Assignee: nobody → nobody

Markus Hübner

Comment 43

•

23 years ago

unfortunately this won't make it for 0.9.8 -> 0.9.9

Target Milestone: mozilla0.9.7 → mozilla0.9.9

Frank Tang

Comment 44

•

23 years ago

reassign to ftang and future

Assignee: nobody → ftang

Target Milestone: mozilla0.9.9 → Future

Frank Tang

Updated

•

23 years ago

Status: NEW → ASSIGNED

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 45

•

23 years ago

A neat idea would be to walk the DOM and change all text nodes/attribute values of the parsed document when the encoding is changed. This assumes two things:
1. That the chars that affect parsing, <>="'&; etc., are the same in all encodings.
2. That it's possible to losslessly decode and then encode a text stream through all charsets.

I don't know if these assumptions are true for most or all charsets. However, if it works, it would fix both the reload problem fully and the problem described in comment #7.

Jean-Marc Desperrier

Comment 46

•

23 years ago

> 1. That the chars that affect parsing, <>="'&; etc, are the same in all
> encodings

True for most encodings. Not true for UTF-7. Kill UTF-7. All characters from "@" upward are in danger with SJIS. The ones you list are below that, but I remember not being able _at all_ to display some pages in SJIS before changing the encoding, so there must be some case where it creates problems.

> 2. that it's possible to losslessly decode and then encode a textstream though
> all charsets

To achieve the intended purpose, every conversion from a local charset to Unicode that encounters illegal characters should encode them as special, reserved sequences that are guaranteed not to be displayed, not to be later interpreted as normal Unicode, and that can be reversed to get back the original value. In fact, only 256 of those are needed. The expansion coefficient would be very bad, but we don't actually care in this case.

I think this would be useful for some other bugs. There's, for example, the bug with non-7-bit characters in newsgroup names, where the name is systematically interpreted as ISO-8859-1 but can be something else. The Unicode is reconverted to ISO-8859-1 when sending the data to the news server, but if there was originally a character that is illegal in ISO-8859-1, it will be lost. I'm sure there are some other cases for that.
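The reversible-escape idea in comment 46 resembles what Python later standardized as the `surrogateescape` error handler (PEP 383). The sketch below uses that handler to show the principle; it is not how Mozilla's converters work, and `decode_reversibly` and `encode_back` are hypothetical helper names.

```python
def decode_reversibly(raw: bytes) -> str:
    """Decode as UTF-8, mapping each undecodable byte to a reserved code point."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_back(text: str) -> bytes:
    """Reverse decode_reversibly, restoring the original bytes exactly."""
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    raw = "日本語".encode("shift_jis")  # not valid UTF-8
    text = decode_reversibly(raw)
    print([hex(ord(c)) for c in text])
    print("bytes recovered:", encode_back(text) == raw)
    # With the original bytes back, the correct charset can be applied.
    print(encode_back(text).decode("shift_jis"))
```

Python reserves U+DC80 to U+DCFF for this purpose, so 128 code points rather than 256, since bytes below 0x80 are always valid in UTF-8. Multi-byte charsets with ASCII-range trail bytes would need a more careful scheme.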

Andrew Hagen

Comment 47

•

23 years ago

Bug 40867 is now fixed. Fixing this bug should be possible now. Nominating for Mozilla 1.0.1.

Keywords: mozilla1.0.1

Frank Tang

Comment 48

•

22 years ago

> 1. That the chars that affect parsing, <>="'&; etc, are the same in all
> encodings

Not true for ISO-2022-JP either.
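The counterexamples to assumption 1 raised in comments 46 and 48 can be checked with Python's standard codecs, used here only as a stand-in for the browser's converters. The helper names are hypothetical.

```python
def utf7_hidden_markup():
    """UTF-7 can spell '<' and '>' without using those bytes at all."""
    raw = b"+ADw-b+AD4-"
    return raw, raw.decode("utf-7")


def first_kanji_containing(byte_value: int, codec: str = "iso2022_jp"):
    """Find a kanji whose encoded form contains the given ASCII-range byte."""
    for code_point in range(0x4E00, 0x9FA6):
        ch = chr(code_point)
        try:
            encoded = ch.encode(codec)
        except UnicodeEncodeError:
            continue  # not representable in this codec
        if byte_value in encoded:
            return ch, encoded
    return None


if __name__ == "__main__":
    raw, text = utf7_hidden_markup()
    print(raw, "decodes to", text)

    # ISO-2022-JP: a byte equal to '<' (0x3C) inside a two-byte kanji.
    print(first_kanji_containing(ord("<")))

    # Shift_JIS: the trail byte of this kanji is 0x5C, the ASCII backslash.
    print("表".encode("shift_jis"))
```

In each case a parser that scans for ASCII markup bytes before, or independently of, charset decoding would see different markup than one that decodes first, which is why re-decoding text nodes in place is not safe for these encodings.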

Uri Bernstein (Google)

Comment 49

•

20 years ago

Is this bug report good for Firefox as well, or is there a separate one for Firefox (or should I submit one)?

Frank Tang

Comment 50

•

20 years ago

what a hack. I have not touched Mozilla code for 2 years. I didn't read these bugs for 2 years. And they are still there. Just closing them as won't fix to clean up.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago → 20 years ago

Resolution: --- → WONTFIX

Travis Chase

Comment 51

•

20 years ago

Mass re-assigning bugs that Frank Tang closed on March 1st. Spam is his fault. Mass re-open to follow.

Assignee: ftang → nobody

Travis Chase

Comment 52

•

20 years ago

Mass re-open of bugs Frank Tang closed with no good reason. Spam is his fault, not my own.

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Travis Chase

Comment 53

•

20 years ago

Reassigning Franks old bugs to Jungshik Shin for triage - Sorry for spam

Assignee: nobody → jshin1987

Status: REOPENED → NEW

Daniel B.

Updated

•

20 years ago

Blocks: 288462

Reinout van Schouwen

Comment 54

•

19 years ago

The fact that this is a real reload causes the page to not jump back to the position it was at when the character set change was initiated. This is also a problem for embedders.

Simon Montagu :smontagu

Updated

•

19 years ago

Blocks: 336109

Grey Nicholson (they/them)

Updated

•

18 years ago

Flags: blocking1.9?

Keywords: dataloss

Boris Zbarsky [:bzbarsky]

Comment 56

•

17 years ago

So... bug 336109 is not really this bug, as this bug was filed. It's a bug about us not persisting layout state, which is ON PURPOSE for charset reloads: see the CVS blame for why.

Boris Zbarsky [:bzbarsky]

Comment 57

•

17 years ago

Point is, dupping it here makes sure that issue will never get addressed. Comment 54 is about bug 336109, not this bug. We restore scroll position just fine on loads from the network if we want.

Simon Montagu :smontagu

Comment 58

•

17 years ago

I don't follow the last two comments. Apart from the totally uninformative "see the CVS blame for why", if not persisting layout state is on purpose, does that mean that bug 336109 WONTFIX, or does the issue need to get addressed?

Boris Zbarsky [:bzbarsky]

Comment 59

•

17 years ago

This bug is about us sometimes pulling updated data from the server on charset reload. Bug 336109 is about not preserving scroll position. The reason we don't is that a single mechanism is used for scroll position restoration and form control state restoration. And we don't want to restore the text for controls that had text by default, since it'll have been decoded with the wrong charset. This came up recently in some bug where I did the CVS archeology to dig this stuff up.... So the point is, the scroll/cursor state not being restored is just a bug that needs fixing. The form control state not being restored needs serious thought, because it's not clear what the right fix would be. Perhaps we can consider just restoring it and hoping the "value changed" checks we now do in content will prevent the wrong-charset-used-initially thing from biting. But no matter what, all of that is not this bug.

Boris Zbarsky [:bzbarsky]

Comment 60

•

17 years ago

Bug 391632 is what I was thinking. So in fact, bug 336109 might be a duplicate of bug 134911. Should fix the latter and then retest. ;)

Simon Montagu :smontagu

Comment 61

•

17 years ago

Attached patch: experiment (deleted)

FWIW, this is how I tried to fix this by doing the same as view-source does.

Simon Montagu :smontagu

Updated

•

17 years ago

No longer blocks: 336109

Damon Sicore (:damons)

Comment 62

•

17 years ago

Not a regression. -'ing.

Flags: blocking1.9? → blocking1.9-

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: amyy → i18n

Henri Sivonen (:hsivonen) (away from Bugzilla until 2023-09-11)

Comment 64

•

9 years ago

The Text Encoding menu is unused in 99.99% Firefox sessions. It's not worthwhile to tweak cache behavior for such a rarely used feature.

Status: NEW → RESOLVED

Closed: 20 years ago → 9 years ago

Resolution: --- → WONTFIX

Henri Sivonen (:hsivonen) (away from Bugzilla until 2023-09-11)

Comment 65

•

9 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #64)
> 99.99%

Not a rhetorical percentage but an actual telemetry reading.

Alain Knaff

Comment 66

•

9 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #64)
> The Text Encoding menu is unused in 99.99% Firefox sessions. It's not
> worthwhile to tweak cache behavior for such a rarely used feature.

Cache not being used for various excuses seems to be a common thread in many long-standing firefox issues. Even if some of these only happen in 0.01% of the sessions, taken together, they surely amount to more than that. And a behavior can become annoying even if it happens much less often than 100% of the time. So please fix these cache&history issues once and for all.

It might also be worthwhile comparing the time needed to fix it, with the time taken discussing it... If it takes more time (collectively) coming up with excuses and diversions why it shouldn't be fixed than it would take to just fix it, then that means it's worthwhile fixing.