Open Bug 69687 Opened 24 years ago Updated 2 years ago

Spellchecker: Multiple language design

Categories

(Core :: Spelling checker, enhancement, P3)

enhancement

People

(Reporter: lizal, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: parity-chrome)

Attachments

(3 files)

Sometimes it is handy to check spelling during composition. Therefore, I suggest including a "Spell now" button in the mail composition toolbar. Also, I do not think that it is wise to have "Check spelling before sending" as a global preference/option. This setting could be different for the different email accounts used (my case). If it must stay as a global preference, then I would prefer to add an option (or option menu) for spelling in the toolbar, "Do not spellcheck this message", to override the global pref setting. Also, what about spellchecking in different languages? Reason: the Mail/News design was not made well enough for cases where people use multiple languages for emails and multiple accounts. That usually requires different settings for accounts or, at least, for individual emails. Consequently, for different languages different iso-pages could be appropriate in the composition. Solutions: Proposition 1 (minimalistic): allow the global setting to be overridden while composing an email. Proposition 2 (maximalistic): change the design to allow a global setting that could be overridden by an account setting, and the account setting could be overridden by an individual email setting during composition. (This proposition would probably require morphing/splitting this report according to the "level" of the accepted solution(s).)
Since Moz. does not have speller (and from the comment I got impression that it would never be implemented), the checkbox in Preferences/MailandNewsgroups/MessageComposition/ComposingMessage named "Check spelling before sending" is redundant and misleading. Therefore the checkbox (option) should be removed. (that's the way I got to the original comment - consistency of options across different menus) Changing summary to "Redundant Spelling CheckBox" from Missing Spell now button/multiple language design and reopening, guessing the severity to be trivial.
Severity: enhancement → trivial
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Summary: Missing "Spell now" button/multiple language design → Redundant Spelling CheckBox?
Okay, marking NEW although it's grayed out in the latest builds.
Status: UNCONFIRMED → NEW
Ever confirmed: true
> Since Moz. does not have speller (and from the comment I got impression
> that it would never be implemented), the checkbox in
> Preferences/MailandNewsgroups/MessageComposition/ComposingMessage
> named "Check spelling before sending" is redundant and misleading.
No, it's not! First, one can download the spellchecker.xpi from Netscape and use it with Mozilla. Second, see bug 56301: there are people working on an open-source spellchecker for Mozilla (that hopefully would use the existing spellchecker interface). It's true that there is no reason for that preference to be there when there is no spellchecker installed. That issue is addressed in bug 84607 ("Hide spellchecker pref"). I am reverting the summary to what it was originally.
Severity: trivial → enhancement
Summary: Redundant Spelling CheckBox? → Missing "Spell now" button/multiple language design
Depends on: spellchecker
The "spell" button does exist in recent builds.
Summary: Missing "Spell now" button/multiple language design → Multiple language design
Summary: Multiple language design → Spellchecker: Multiple language design
Depends on: 58615
reassign to varada
Assignee: ducarroz → varada
Blocks: 119232
Just a note in case there will ever be a spellchecker in Mozilla. Or see it as a UI suggestion for NS 7 (don't know the interface NS 7 has now). Preference settings per account whether to check by default and what language to use. A language setting per address book entry. Note: there is the issue of contradicting settings (use the majority of header entries? With To: weighted higher than Cc:?) A language setting per newsgroup. Again we have an issue, but on Usenet you can safely (?) assume a crosspost is in English. A setting per mail. Really cool would be a word-processor-like setting per paragraph.
Depends on: 180346
taking all of varada's bugs.
Assignee: varada → sspitzer
QA Contact: esther → core.spelling-checker
I want to emphasize, as a Swedish user, the possible benefit of having the address book remember the language for each person so that I won't have to switch between languages all the time.
Product: MailNews → Core
For languages with non-US-ASCII charsets, autodetection is easy. I vote for per-word or per-paragraph autodetection, because I have a lot of mixed English/Russian emails.
Only per-word autodetect. Why do it halfway? Language can change from one word to the next.
This is what I have noticed going on so far with the spell checker in regard to multiple language design. I compose emails in Swedish or English, sometimes within the same email. Sorry about the lengthy comment.
1) A preference per account with override per email (proposition 2 by the original poster), while a good start, will not be very helpful for people who have only one account. A workaround for that would be an additional preference on a per-contact basis (comment 8 and 10). The order of preferences would be: current email overrides contact, overrides account, overrides global. The preference for each level could also be "no preference", in which case it reverts to the next level, or at the end to current functionality. That way the user can choose whether the spell checking language is per account, contact or email. However, what should the logic be if multiple recipients (with different settings) are selected for the same email? The solution with weighted selection of language based on frequency on the To and CC line is not good because it will seem arbitrary to the user. Perhaps default to the preference for the contact that was first added on the To line? A button in the toolbar (proposition 1 by the original poster) would help. There could be a global preference to show that button, or it could become visible by default when you have multiple dictionaries installed. Of course, none of this will work very well if you first compose your email, then spell check, and then add the recipients.
2) If I add a word to my "personal dictionary", it's added in with all other words in the same list regardless of which language is being checked at the time. This is also how Microsoft Outlook does it. Having one list per language would be better.
3) When replying to an email in Swedish, the word "on" in the section "from N N, on m/d/y:" is caught by the spell checker as misspelled. Not a problem, it's easy to add "on" to the personal dictionary. Just a multi-language experience thing that is hard to know about if you don't use multiple languages on a day-to-day basis.
4) Language-sensing (the ultimate fantasy for multi-language spell checking). The office programs do a word-for-word check, and IMHO that's how it should be done. Microsoft Outlook does not do this either unless you use Word to compose emails. Sounds like a hard thing to put together, but it would be beautiful. I agree with comment 12, and bug 234143.
Off-topic: I just have to add this: Thunderbird kicks ass, and I'm just amazed that I can write this and it might actually be read by the programmer who is working on this feature. For almost all other significant software products, dream on! Thanks!
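The override order described in point 1 (current email overrides contact, overrides account, overrides global, with "no preference" falling through to the next level) can be sketched as follows. This is only an illustration in Python, not Mozilla code: the function name `resolve_spell_language`, its parameters, and the use of `None` for "no preference" are all assumptions made for the example. For multiple recipients it follows the suggestion of using the contact that was first added on the To line.

```python
from typing import Optional, Sequence


def resolve_spell_language(
    email: Optional[str],
    to_contacts: Sequence[Optional[str]],
    account: Optional[str],
    global_default: Optional[str],
) -> Optional[str]:
    """Pick a spell-check language, most specific level first.

    Every argument is a language tag such as "sv-SE", or None for
    "no preference" (hypothetical convention for this sketch).
    to_contacts holds the per-contact preference of each To recipient,
    in the order they were added.
    """
    # Assumption taken from the comment: only the first To contact counts.
    first_contact = to_contacts[0] if to_contacts else None

    for candidate in (email, first_contact, account, global_default):
        if candidate is not None:
            return candidate
    # Nothing set anywhere: the caller keeps its current behaviour.
    return None
```

With this ordering, a user who never sets anything but a per-contact language still gets the right dictionary as soon as the first recipient is known, which is the single-account case the comment is worried about.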
Just now, I emailed my brother (in Swedish) and the spell checker didn't work seamlessly because there was an error message (in English) as part of the email. This makes me vote for word-for-word style which would bypass a lot of the preference-ickiness anyway.
*** Bug 227610 has been marked as a duplicate of this bug. ***
Component: MailNews: Composition → Spelling checker
OS: other → All
*** Bug 234143 has been marked as a duplicate of this bug. ***
Depends on: 300077
*** Bug 301927 has been marked as a duplicate of this bug. ***
*** Bug 303079 has been marked as a duplicate of this bug. ***
*** Bug 253984 has been marked as a duplicate of this bug. ***
*** Bug 313238 has been marked as a duplicate of this bug. ***
*** Bug 357973 has been marked as a duplicate of this bug. ***
Blocks: 334320
If the user has multiple dictionaries installed, couldn't one just guess the language by looking, which dictionary produces less misspelled words? I would suggest one guess after e.g. 10-15 words. afterwards, the user has the possibility, to change the used dictionary.
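A rough sketch of that guess, in Python rather than in the actual spellchecker code: count, for the first few words, how many each installed dictionary does not know, and pick the dictionary with the fewest misses. The function name `guess_dictionary`, the plain word sets standing in for real dictionaries, and the `min_words` threshold of 10 are assumptions for illustration only.

```python
import re
from typing import Dict, Optional, Set


def guess_dictionary(
    text: str,
    dictionaries: Dict[str, Set[str]],
    min_words: int = 10,
) -> Optional[str]:
    """Return the name of the dictionary with the fewest unknown words.

    Returns None while fewer than min_words words have been typed, so
    the caller keeps whatever dictionary is currently selected.
    """
    # Runs of letters only; digits and underscores are not words here.
    words = [w.lower() for w in re.findall(r"[^\W\d_]+", text)]
    if len(words) < min_words or not dictionaries:
        return None

    sample = words[:min_words]
    misses = {
        name: sum(1 for w in sample if w not in vocab)
        for name, vocab in dictionaries.items()
    }
    # On a tie, min() keeps the dictionary that was listed first.
    return min(misses, key=lambda name: misses[name])
```

The guess is made once on a fixed sample, so the user's later manual choice is never overridden by this function; that part would be up to the caller.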
I would also vote for (optional) combined spell-checking, i.e., when a word is counted as correctly spelled if it is present in at least *one* of the installed dictionaries and misspelled if it is absent in *all* dictionaries. This option should be easy to implement.
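The rule is small enough to write down. The following Python sketch uses plain word sets in place of real dictionaries; the names `is_misspelled` and `misspelled_words` are made up for the example and do not correspond to any Mozilla interface.

```python
from typing import Dict, Iterable, List, Set


def is_misspelled(word: str, dictionaries: Dict[str, Set[str]]) -> bool:
    """A word is misspelled only if it is absent from every dictionary."""
    lowered = word.lower()
    return not any(lowered in vocab for vocab in dictionaries.values())


def misspelled_words(
    words: Iterable[str], dictionaries: Dict[str, Set[str]]
) -> List[str]:
    """Return the words that no dictionary accepts, in input order."""
    return [w for w in words if is_misspelled(w, dictionaries)]
```

Note that with no dictionaries at all every word counts as misspelled, because `any()` over an empty collection is false.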
(In reply to comment #31)
> If the user has multiple dictionaries installed, couldn't one just guess the
> language by looking, which dictionary produces less misspelled words? I would
> suggest one guess after e.g. 10-15 words. afterwards, the user has the
> possibility, to change the used dictionary.
That's a good idea, but should ideally be implemented per paragraph, or for any selection that the user makes, i.e. allow multiple languages within the same email body. See how it's implemented for MS Word. That is (I think) the functionality to try to emulate. In MS Word, it works similarly to making text bold, italic, etc., along with an auto-sensing function (which mostly works).
(In reply to comment #32)
> I would also vote for (optional) combined spell-checking, i.e., when a word is
> counted as correctly spelled if it is present in at least *one* of installed
> dictionaries and misspelled if it is absent is *all* dictionaries.
> This option should be easy to implement.
This might be a useful option if you have two very dissimilar languages (e.g. English-Chinese, English-Russian) installed, but not for English and Swedish (and many other combinations). There are too many similar words.
(In reply to comment #33)
> That's a good idea, but should ideally be implemented per paragraph, or for any
> selection that the user makes, i.e. allow multiple languages within the same
> email body. See how it's implemented for MS Word. That is (I think) the
> functionality to try to emulate. In MS Word, it works similar to making text
> bold, italics, etc. along with an auto sensing function (which mostly works).
MS Word uses the currently selected keyboard group to "detect" the language and embeds that information into the text. For various reasons this is not possible under X11.
> (In reply to comment #32)
> > I would also vote for (optional) combined spell-checking,
> ...
> This might be a useful option if you have two very dissimilar languages (e.g.
> English-Chinese, English-Russian) installed, but not for English and Swedish
> (and many other combinations). There are too many similar words.
That is a workable solution for many users who have such a setup (two or more languages with completely different character sets).
Two comments -- bug 405162 is a variation on this theme. The intent was that if I had multiple dictionaries loaded, instead of the standard red underline of a misspelled word, it might have a different indicator to tell me it was in an alternate dictionary. Perhaps a user might even assign different colors if they have more than one additional dictionary.
For what it's worth -- MS Word (at least the copy I have, 2002) is not the best example to follow for *autodetection*. While it might get it right if my entire document was in one language, even mixed paragraphs confused it -- let alone mixed words. MS Word does support marking each word in a different language, but don't expect it to autodetect and set the language on a word level.
Out of curiosity, I decided to look at the output of a webpage with mixed US English, French and UK English. For this web text:
Do you speak French?
Parlez-vous français ?
Do you speak English?
Parlez-vous Anglais ?
How are you today?
Comment allé vous ?
Do you like the colour of the theatre?
--------------------------------------------------------------------
MS's 'filtered web source' (skipping headers and such):
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<body lang=EN-US>
<div class=Section1>
<p class=MsoNormal>Do you speak French?</p>
<p class=MsoNormal><span lang=FR>Parlez-vous français&nbsp;? </span></p>
<p class=MsoNormal>&nbsp;</p>
<p class=MsoNormal>Do you speak English?</p>
<p class=MsoNormal><span lang=FR>Parlez-vous Anglais&nbsp;?</span></p>
<p class=MsoNormal>&nbsp;</p>
<p class=MsoNormal>How are you today? </p>
<p class=MsoNormal><span lang=FR>Comment allé vous ?</span></p>
<p class=MsoNormal>&nbsp;</p>
<p class=MsoNormal><span lang=EN-GB>Do you like the colour of the theatre?</span></p>
<p class=MsoNormal><span lang=EN-GB>&nbsp;</span></p>
</div>
</body>
-------------My filtered idea without MS-class code------:
(my comments preceded by '#')
<body lang=EN-US> # this could be set from the user's "default" lang?
<p>Do you speak French?</p>
<p><span lang=FR>Parlez-vous français&nbsp;? </span></p>
<p>&nbsp;</p>
<p>Do you speak English?</p>
<p><span lang=FR>Parlez-vous Anglais&nbsp;?</span></p>
<p>&nbsp;</p>
<p>How are you today? </p>
<p><span lang=FR>Comment allé vous ?</span></p>
<p>&nbsp;</p>
<p><span lang=EN-GB>Do you like the colour of the theatre?</span></p>
<p><span lang=EN-GB>&nbsp;</span></p>
</body>
============
Is "span" needed if the language spans the entire paragraph? Note: on the last two paragraphs, it closed the span, then needed to reopen it within the next paragraph. If the 'lang' attribute could be added to the paragraph in the same way it is added to the body tag, it might make for shorter and cleaner code. One could still use the <span> tag within a paragraph if one needed to change language definitions within the same paragraph, but at least one could set a default for the paragraph, which I would think would be the more common case.
But merging lang into <p>:
<body lang=EN-US>
<p>Do you speak French?</p>
<p lang=FR>Parlez-vous français&nbsp;? </p>
<p>&nbsp;</p>
<p>Do you speak English?</p>
<p lang=FR>Parlez-vous Anglais&nbsp;?</p>
<p>&nbsp;</p>
<p>How are you today? </p>
<p lang=FR>Comment allé vous ?</p>
<p>&nbsp;</p>
<p lang=EN-GB>Do you like the colour of the theatre? <br> &nbsp;</p>
# mixed line example
<p lang=EN-US>Well, as the French say "<span lang=FR>C'est la vie</span>".</p>
</body>
--------------------------------------
I don't know about "folding" bug 405162 into this bug. It's not really a "dup", as the aims of this enhancement are farther reaching, I think, than what I suggested in the other, but certainly, depending on how this feature is implemented, the suggested/desired functionality in 405162 might just "fall out" of the implementation for this one...
I would suggest a UI approach like the one Evolution has. In my opinion it's really user-friendly! See screenshot here: http://img369.imageshack.us/img369/5093/spellpe4.png
Please, review the following draft patch which enables multi language spell checking (actually, spell-in-all-available-languages). Use user_pref("spellchecker.alldicts", true); to enable this functionality. Additionally, see bug 471799 which can block this patch (so I added a workaround for en-US). Suggestions are welcome.
Attached patch proposed draft patch (deleted) — Splinter Review
This only implements a loop around all the installed dictionaries. While a big improvement in itself, this is not enough for a spelling checker. When a word exists in 1 language, it doesn't mean it's correctly spelled in the language of the surrounding paragraph (where it might need to be spelled differently). I routinely use 4 languages, and this is one of the mistakes I make the most (I normally select the language of the email to force the spellchecker into the correct language). But I repeat that it is already a big improvement, so it should not be rejected.
(In reply to comment #42)
> When a word
> exists in 1 language, it doesn't mean it's correctly spelled in the language of
> the surrounding paragraph (where it might need to be spelled differently).
In the case of plain text (where there is no <span lang="en-US"> or <p lang="en-US">) it's hard to determine the language of a word, so spell checking in Mozilla uses one global language. To be 100% sure, you need to set the language manually (and have the ability to do so), or try to guess it by scanning surrounding words or using system information (like the keyboard language, which is a bad idea, especially in X.org). So I suggest using this loop-around-all-installed-dictionaries approach as a starting point and improving it if somebody volunteers. ;)
(In reply to comment #40)
> Please, review the following draft patch which enables multi language spell
> checking (actually, spell-in-all-available-languages).
You should request review from someone for this patch or it can get lost (see https://developer.mozilla.org/en/Getting_your_patch_in_the_tree ). Unfortunately, it's not clear who owns /extension/spellcheck nowadays (mscott is gone); you could probably try David Bienvenu.
This would be a nice addition to Thunderbird too; you should probably also try to involve David Asher.
Unfortunately, this patch is not very usable :( I have three dictionaries and Firefox freezes for a few seconds when I paste 10-20 words into a textarea from another window. Verifying the words sequentially takes too much time.
Well, I believe this is an area where MoMo should focus its development efforts; after all, composing messages is the most critical area of a mailer! The current Thunderbird implementation is really primitive compared to the one in Mail.app.
A few hours ago I filed bug 476623 because, when I searched for automatic language detection, this bug didn't come up. I guess that's because it drifted from its original, more general goal (multiple language design) to this more specific one (automatic language detection). We should probably consider changing the title. Anyhow, I noticed that nobody has reported two extensions that implement automatic language detection in Thunderbird and Firefox reasonably well: http://en.design-noir.de/mozilla/dictionary-switcher-tb/ http://en.design-noir.de/mozilla/dictionary-switcher/ It'd be a good idea to contact the author and ask him if he wants to work on this bug.
Compare bug 481884.
The good thing about free software is that one project can do something and another project can reuse it. Solving such a bug would be rather more urgent for a project such as OpenOffice.org. Mozilla guys can stay focused on making the best Web client and HTML reader, while OO.o guys can stay focused on making the best document editor. So this request should be opened in OO.o's request tracker if it isn't already. When OO.o guys implement a clever multi-language spell checker, Mozilla can just integrate their code.
(In reply to comment #51)
> When OO.o guys implement a clever multi-language spell checker,
> Mozilla can just integrate their code.
OOo used to have this spell-check-in-all-available-dictionaries feature but it was removed in 3.0.1. See http://www.openoffice.org/issues/show_bug.cgi?id=69451 for details. IMHO, the most "clever" way to set the language(s) is to leave this choice to users, just like it's done in Evolution (see http://img369.imageshack.us/img369/5093/spellpe4.png).
FWIW, I'm not sure the "check all languages" is going to work well: A Unix distro may have 10 or more dictionaries installed by default. I don't think it makes sense to check all my words against Chinese and Italian, if I can only write German and English. I think the user should be able to explicitly select the languages (several), and we check against (all) those.
The spellchecker handles the words one by one for each dictionary. This is a linear process. I think that spellchecker should be parallel.
I suggest putting this into the right-click context menu, similar to Firefox. I currently use Firefox 3.1b3 ("Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1b3) Gecko/20090305 Firefox/3.1b3") and this is working great there.
Can't check spelling for 2 or more languages. Please fix this bug. It's not an enhancement - it's a bug.
Bump. There is already a checkbox UI for the spell checker (in the right-click menu). What needs to be implemented is the backend and the ability for more than one checkbox to be checked at a time.
I would suggest the possibility of defining a list where a user can add websites and choose the dictionaries that he/she wants to be selected automatically on a specific page. For example:
https://bugzilla.mozilla.org: English (US)
http://web.de/: German (DE)
And so on... It would be the best solution, I guess.
The top-level domain should be helpful for this too: .de is obviously German, .co.il is Hebrew, and so on. Of course there are some international domains, at which point adding a setting like the one suggested in comment 65 would be helpful.
Another solution would be for Firefox simply to remember which dictionary was last selected on a page, by keeping a history of it.
Site specific dictionaries won't work for places like gmail.com for example, where I write e-mail in both English and Swedish all the time. Which dictionary to use has to be deduced from the actual text.
(In reply to comment #66)
> The Top-level domain should be helpful for this to.
>
> .de is obviously german
> .co.il is hebrew.
> and so on.
No way to make this a default. Who says that everyone on a .co.il domain has to use Hebrew by default? Other languages are used in that country too, you know. And what about multilingual countries, like .be and .ca? Actually, assuming a single language without asking the user beforehand is illegal in Belgium; that's why most websites (except the monolingual ones) will prompt you. Or at the very least, the switch to a different language should be on the first page that you see. And I haven't even started on .com or .org ...
> Of course there are some international domain at which point adding a setting
> like suggested in Comment 65 would be helpful.
(In reply to comment #67)
> Another solution would be, that Firefox simply memorize, which dictionary was
> selected at last on a page, by keeping a history about it.
The last dictionary used on a page can easily be stored per host (like the zoom level) and selected automatically. We just need a good way to indicate the dictionary that is currently used, as it might switch from website to website (I often open the context menu just to take a peek at the selected language). Of course it doesn't help people who keep switching languages as in comment 68. They need an automatic way to recognize the language, which is probably out of Mozilla's league (even Google is often incorrect). Selecting multiple dictionaries at the same time isn't going to be much help.
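Storing the last-used dictionary per host could look roughly like the sketch below. It is plain Python with an in-memory mapping, not the browser's actual content-preference storage; the class name `HostDictionaryMemory` and its methods are invented for the illustration.

```python
from typing import Dict
from urllib.parse import urlsplit


class HostDictionaryMemory:
    """Remember the last dictionary the user picked on each host."""

    def __init__(self, default_dictionary: str) -> None:
        self._default = default_dictionary
        self._by_host: Dict[str, str] = {}

    def remember(self, url: str, dictionary: str) -> None:
        # Keyed by host name only, so every page of a site shares one
        # setting (the same granularity as the zoom-level comparison).
        host = urlsplit(url).hostname
        if host:
            self._by_host[host] = dictionary

    def lookup(self, url: str) -> str:
        host = urlsplit(url).hostname
        if host is None:
            return self._default
        return self._by_host.get(host, self._default)
```

For example, after `remember("http://web.de/", "de-DE")`, a later `lookup("http://web.de/mail")` returns `"de-DE"`, while an unseen host falls back to the default. As noted in the thread, this does nothing for sites such as webmail where one host is used for several languages.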
Firefox being able to remember the last dictionary used per site would still be an improvement
Thunderbird's spell checker only allows a single language to be selected at a time. On the other hand, Evolution has an elegant way to allow the user to select multiple languages for spell-checking:
1. In Evolution's options, we have a list of all the installed dictionaries. We then select which dictionaries we want, using checkboxes;
2. When we are composing an e-mail using spell-check, Evolution uses all the selected dictionaries to find a match for words;
3. When we right-click on a marked word, we get a menu for each of the selected languages, each with its own suggestions.
This would be a simple, yet great addition to Thunderbird, since many of us users are multilingual and use terms in another language (mostly English) in our normal writing.
Also, with Midori (a WebKitGTK+ browser) you can set multiple dictionaries at once, which is rather convenient actually.
(In reply to comment #13)
> ...
> 2) If I add a word to my "personal dictionary", it's added in with all other
> words in the same list regardless of which language is being checked at the
> time. This is also how Microsoft Outlook does it. Having one list per language would be better.
> ...
When I look in my Thunderbird personal dictionary, most of the words are place or personal names, which would be valid whatever language I'm writing an email in. What's needed, when adding a word to the personal dictionary, is to specify 'this language' or 'all languages'. Also, I sometimes misspell these places and names, but the words in my personal dictionary are not suggested in the context menu. They should be, subject to the language(s) I'm spellchecking in. And finally, the ability to specify a common personal dictionary between Fx and TB (and differing profiles) would be nice.
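A personal dictionary with a 'this language' or 'all languages' scope, whose entries also show up as suggestions, might be modelled like this. It is a Python sketch only: the class `PersonalDictionary`, its method names and the `"*"` marker for "all languages" are assumptions, and the real persdict file format is not considered here.

```python
import difflib
from typing import Dict, List, Optional, Set


class PersonalDictionary:
    """User word list where each word is valid for one or all languages."""

    ALL_LANGUAGES = "*"  # hypothetical marker for the 'all languages' scope

    def __init__(self) -> None:
        self._words: Dict[str, Set[str]] = {}

    def add(self, word: str, language: Optional[str] = None) -> None:
        """Add a word for one language, or for all when language is None."""
        scope = language if language is not None else self.ALL_LANGUAGES
        self._words.setdefault(scope, set()).add(word)

    def _visible(self, language: str) -> Set[str]:
        shared = self._words.get(self.ALL_LANGUAGES, set())
        return shared | self._words.get(language, set())

    def accepts(self, word: str, language: str) -> bool:
        # Case-sensitive on purpose: most entries are proper names.
        return word in self._visible(language)

    def suggest(self, word: str, language: str, limit: int = 3) -> List[str]:
        """Offer close personal-dictionary entries valid for this language."""
        candidates = sorted(self._visible(language))
        return difflib.get_close_matches(word, candidates, n=limit)
```

Words added without a language are accepted and suggested everywhere, while a word added for one language stays invisible when another language is being checked.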
Check this for checking several languages simultaneously https://bugzilla.mozilla.org/show_bug.cgi?id=660506
It looks like most problems that users report with repeatedly choosing ONE spell checking language would be satisfactorily solved if the user could once for all setup SEVERAL dictionaries to be used SIMULTANEOUSLY for the languages he can write, among those installed on his system. That is especially needed for e-mail since many people write them in several languages (and that's the way Evolution does it). It seems fairly simple to do, see https://bugzilla.mozilla.org/show_bug.cgi?id=676500
I made the suggestion in Bug 334320 to have one single spelling language per document, but to have it automatically selected depending on recipient/newsgroup. Having dictionaries for several languages active simultaneously hides mistakes: "connexion" (in French) misspelled as "connection" and going unnoticed if an English dictionary is also active, or distorsion/distortion...
(In reply to Bertrand Denoix from comment #81)
> I made the suggestion in Bug 334320 to have one single spelling language per
> document, but to have it automatically selected depending on
> recipient/newsgroup. Having dictionaries for several languages active
> simultaneously hides mistakes: "connexion" (in French) misspelled as
> "connection" and going unnoticed if an English dictionary is also active, or
> distorsion/distortion...
We have to consider that we write in a "default main language" and tend to insert words from other languages into our writing. That means the main dictionary and the alternate languages to keep active have to be specified for the document we are writing. When spell checking the document, the main language dictionary has to be used. If the spellchecker encounters a word not contained in the main language dictionary, it should look in the alternate language dictionaries. If finally no corresponding word is found, then it should propose the nearest words from the main language dictionary first and then the nearest words from the alternate language dictionaries (indicating the corresponding language dictionary for each proposed word). In other words, the spellchecker should respect the fact that the default main dictionary takes precedence over the alternate language dictionaries. A second, more advanced spell-checking mode might handle the case where we write entire sentences or blocks of text in different languages. This should be a later implementation.
The concept of "main language" doesn't prevent false positives if the misspelling isn't one in other languages. IMHO mixing languages is a Bad Idea. And if you have some "computable rule" to define a "main language", then you can apply it to be the single language. I do my mail/newsgroups in two languages, but I rarely mix them in the same post.
Regarding the previous comments: a single dictionary hides mistakes too, more abundantly than cross-language checking does, such as "its" and "it's", or "suis" and "suit", or even just French singular vs plural. It's up to the user to accept the small additional risk and to balance it against the many advantages. English and French are not the only languages: there is no risk of mistaking телефон for telephone, and it would be a pity to deny multi-language checking to Russian-speaking people because someone thinks others should not be allowed to use it optionally. I used multi-language checking to my utmost satisfaction with Eudora for more than 10 years and I miss it a lot. I'm not sure I have a "default main language", but it's a nice idea to have them ordered. I often mix phrases of English, French and Russian, and words even more; I just did it. In any case, the permanent multi-language user setting shouldn't be modified, just overridden. Overrides should be made with the user's permission. I have little belief in automatic choice (eBay deciding in what language a Frenchman must write to a Chinese person, and Firefox following that decision blindly). It's worth having a look at how Evolution does it and getting "inspiration" from it. It presents suggestions well separated by language.
The proposal of comment 82 sounds good. To prevent the effects mentioned in c83/84, words which come from other dictionaries could be marked in a different colour (e.g. green). Also, the main language could be determined by a statistical analysis: the dictionary which contains the highest percentage of the words found should be considered the main dictionary for the given text.
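The main-dictionary-first lookup described here can be sketched in a few lines of Python. Everything named below is hypothetical (`check_word`, word sets used as dictionaries), and `difflib.get_close_matches` merely stands in for a real suggestion engine such as the one in hunspell.

```python
import difflib
from typing import Dict, List, Optional, Sequence, Set, Tuple


def check_word(
    word: str,
    main: str,
    alternates: Sequence[str],
    dictionaries: Dict[str, Set[str]],
    limit: int = 3,
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Check a word against the main dictionary, then the alternates.

    Returns (language, []) for the first dictionary that knows the word,
    or (None, suggestions) where suggestions are (language, word) pairs
    with the main language listed first.
    """
    lowered = word.lower()
    order = [main, *alternates]

    for language in order:
        if lowered in dictionaries[language]:
            return language, []

    suggestions: List[Tuple[str, str]] = []
    for language in order:
        candidates = sorted(dictionaries[language])
        for match in difflib.get_close_matches(lowered, candidates, n=limit):
            suggestions.append((language, match))
    return None, suggestions
```

Because the function reports which dictionary accepted the word, a front end could still flag "connection" in a French-main document as coming from the English alternate rather than silently passing it, which is one way to soften the objection quoted at the top of the comment.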
Well, whatever we do (if anything), a machine-usable rule must be found. Mixing together English, Russian, Greek, Hebrew, Tamazigh (aka Berber), Arabic, Hindi, Chinese, North-Korean and Cherokee wouldn't be a problem because of the different scripts they use. In the case of just French, English and Dutch it would probably be much more delicate due to the common Latin script and the large number of cognates spelt just a little differently. In reply to Bertrand Denoix, comment #83: I often mix languages in a single web page, see e.g. http://users.skynet.be/antoine.mechelynck/ , and in view of the result I wouldn't call it a Bad Idea™; in a single email or web form it also happens, albeit less frequently. But OTOH I shun the false security of automated spell checkers and rely on my own head. ;-) I don't know where my sense of orthography comes from but I suppose I must count myself lucky.
Do you see any problem with the statistical approach? Use the dictionary which matches most of the words in a given email. Maybe it could be combined with a default setting per recipient. I guess most people won't have more than 3 or 4 different dictionaries in use, so it should not be a problem. The only problem I see is if the system gets into a messed-up state, as on my Ubuntu installation, where all flavours of the same language show up: e.g. ES-CO, ES-ES, ES-...
Handling different flavours (flavors) of the same language needs some thought. My OpenSUSE packaged build of TB comes with the EN-US dictionary. I normally add EN-GB and FR and want to spell in British English or French. But I can't (easily) get rid of EN-US - perhaps I should be able to disable it. EN-US and EN-GB are alternatives: if I select EN-GB I want 'flavor' to be marked as misspelt - but whether I would correct it would depend on whom I was emailing or the NG I was posting to. But in the case of the FR dictionaries I suspect that they're different subsets of the same language - if you have both classique and modern should you be able to select both and treat it as a single language? I don't know. Often I have to run without the EN-GB dictionary - e.g. if it's incompatible with a new version of TB (or Fx). If I add a word into persdict (see my comment #77) I would ideally want to be able to specify 'valid for all EN' so that it would still also be valid for EN-GB once I installed it. With alternative flavours like EN-US and EN-GB both installed, an automatic 'select the language that fits best' process could easily get it wrong on the basis of one spelling mistake or a slight difference in the actual words included in the two dictionaries. But so long as it warned me - 'TB thinks you're writing in US English. Change dictionary?' - that wouldn't matter.
Dictionaries shouldn't be installed in an application but in hunspell. That makes 45 installations instead of 3 as explained in an example from http://www.papou.byethost9.com/notes/Ubuntu_spelling.html (any comment welcome at its e-mail address, as well as WOT votes, thanks)
One issue is not (only) a matter of languages, but of dictionaries. A spell checker's most wanted feature, they say, is multi-dictionary and hunspell has made it. But, in Mozillan terms, it is not "exposed" through a UI, which is what is suggested by bug 676500. The UI allows the user to flag the different dictionaries he wants to use. An additional dictionary may be used for additional words, obviously, such as computer science or chemical terms. And BTW, it's a hint to the purists to add such dictionaries containing English words rather than the whole of English. Unfortunately, the discussion always returns to languages. I'm in fact guilty of introducing bug 676500 badly (and Bugzilla even more so, for not allowing description updates). Multi-dictionary can be done fairly easily and, once it works, we can discuss more effectively how to use languages. One key point is not to interfere with those who prefer to use their set of languages. For example, to the question "are there really different French, Belgian, Swiss etc... languages? Or Spanish !!! ;-)" The answer is NO, but YES there are words that are used only in Canada, Belgium or Switzerland and ... in France. But, in fact, Linux makes a single French dictionary containing all the words specific to those countries. Exactly the opposite of the design. And the result is that the debate can be long whether a Belgian word should be included in a worldwide dictionary whereas it would be accepted without much discussion for a Belgian supplement dictionary to be used by those who choose to accept Belgian words. These kinds of discussions about languages are beyond Mozilla's scope, but exposing the multi-dictionary UI is a chicken-and-egg prerequisite for those discussions to make any sense. If one thinks that Mozilla plays an important role in poultry raising ;-)
@Dave Royal Installing a dictionary shouldn't be confused with using it. There is no such thing as enabling and disabling one. Unless you speak in plugin terms and it could become clearer if you installed dictionaries in hunspell instead. See the article on my site. While spelling, you choose to use one among those that are installed by marking its box in the installed list. You can't have both en_GB and en_US used at the same time unless several boxes could be marked, which is presently only possible with Evolution. Same for Dicollecte dictionaries, you'll have to wait for Mozilla's multi-dictionary spelling to use both classique and moderne, but "Classique & Réforme" has all the words with both old and new spelling. Given N subsets of a language (classique, moderne, French, Canadian, Belgian, Swiss, Luxembourg), multi-dictionary can do with N=7 dictionaries, but Mozilla, which uses one dictionary at a time, would need a separate merged dictionary for each combination: up to 2^N - 1 = 127 of them. I think that the best way to have an idea of what automatic language detection would look like is to have some text automatically translated by Google Translate.
(In reply to bugzilla from comment #85) > The proposal of comment 82 sound good, to prevent the effects mentioned in > c83/84 words which came from other dictionaries could be marked in a > different colour (e.g. green). Also the main language could be determined by > a statistical analysis. The dictionary which contains the highest percentage > of found words should be considered the main dictionary for the given text ---- Isn't this pretty much what was suggested in comment #35? I.e. is this conversation going around in circles, or why don't we use a statistical approach for now, while adding support for <span lang=en_US...> to allow for more accurate checking? I don't see why the 'lang' tag (isn't there already some tag for this?) couldn't be applied to any HTML element. At worst, we could always add it as a separate tag, but it seems that understanding a tag ('moz-lang'?) that would apply to any content in its 'child text', with inner tags overriding outer tags, would be the best approach. Mixing languages in one doc is VERY common, especially if you speak English, as English is almost entirely composed of words from other languages (~50-70% Latin and Greek, much from French, Spanish and others), though a large number (if not the majority) of the simple roots, in the 1000 most commonly used words, are probably Scandinavian/Norse/North Germanic. But that's the exact point/problem: I'll use words from foreign languages and try to go for the original spelling, with an accent, and get slapped down by spell-check for throwing in the accent. Bleh!
In TB 9.0 with multiple language add-ins, the spell checker works intermittently. At times it is grayed out. This is not an issue in TB 8.0.
There's an add-in which auto-detects the language of an email and switches the spell checker accordingly: https://addons.mozilla.org/de/thunderbird/addon/dictionary-switcher-for-thunde/
(In reply to Robert G. Siebeck from comment #95) > There's an add-in which auto-detects the language of an email and switches > the spell checker accordingly: > > https://addons.mozilla.org/de/thunderbird/addon/dictionary-switcher-for- > thunde/ Thanks, but unfortunately it doesn't work since TB8 according to https://addons.mozilla.org/de/thunderbird/addon/dictionary-switcher-for-thunde/reviews/343488/
(In reply to bugzilla from comment #96) > Thanks, but unfortunately it doesn't work since TB8 according to > https://addons.mozilla.org/de/thunderbird/addon/dictionary-switcher-for- > thunde/reviews/343488/ Well, I just installed it with Thunderbird 15.0.1, and it works like a charm. Of course, you have to set the property 'extensions.dictionary-switcher.autodetect' to 'true'.
(In reply to Martin Kutschker from comment #8) > A language setting per address book entry. Note: there is the issue of > contradicting settings (use the majority of header enttries? With To: > weighted > higher than Cc:?) I'm glad I found the duplicate of the issue I was about to create hidden here, however, it's been 11 years since this was first reported, and it is still not implemented in Thunderbird. I propose the following default implementation, enablable by an experimental configuration setting: IF composing AND spellcheck-language-setting-changed associate recipient with language IF composing AND recipient HAS spellchecklanguage auto-select spellcheck language from recipient That's all I need, really. I compose emails to people who speak different languages regularly, and it's nice to not have to change this setting every time I compose a mail.
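The two rules proposed above can be sketched as a small in-memory store. This is only an illustration: the class and method names are hypothetical, and nothing here corresponds to an existing Thunderbird API or preference.

```python
class RecipientLanguageMemory:
    """Remembers the spellcheck language last used for each recipient."""

    def __init__(self):
        self._by_recipient = {}

    def on_language_changed(self, recipients, language):
        # Rule 1: while composing, a manual language change is associated
        # with every current recipient.
        for address in recipients:
            self._by_recipient[address.lower()] = language

    def on_compose(self, recipients, default):
        # Rule 2: when composing, auto-select the language remembered for
        # the first recipient that has one; otherwise keep the default.
        for address in recipients:
            language = self._by_recipient.get(address.lower())
            if language is not None:
                return language
        return default
```

Taking the first recipient with a remembered language is the simplest possible policy; the weighting of To: over Cc: raised in comment #8 is deliberately left out of this sketch.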
Thanks Kenney for your support of multilingualism. We need tools that allow people to be multilingual, and avoid as much as possible "monolinguism as default". Thanks in advance to any progress on that topic!
You're welcome, but I wouldn't call myself a supporter for any -ism whatsoever, especially not of dividing language as this merely disrupts communication. I'm merely of the opinion that good tools evolve to suit the need of their users. It's an interesting topic though, the idea that not everyone speaks the same language. If they did, we wouldn't be talking about this issue. And, I thank you as well for any progress!
Any system is OK as long as it does not conflict with Bug 676500 and Bug 481884 which request to be able to select multiple dictionaries to be used at the same time for spell checking. Letting the user select all the languages he's able to type and permanently checking them simultaneously is indeed straightforward and the simplest method to deal with spell checking. Any method that would prevent typing one of those languages anywhere or force him to type a language he does not want is wrong. I think that all methods that try to select a single dictionary automatically will fail compared to that obvious simplicity. They will certainly fail to select English + medical dictionary.
Agreed, for exactly those reasons.
As a suggestion for the UI: right now, changing the language requires 5 mouse clicks (or 2 clicks and 4 key presses). Given that people generally only use a handful of languages, it might be more efficient to add toolbar buttons, which reduces the number of required mouse clicks to 1 - that is, until multiple dictionary spell checking is implemented.
I am one of the people responsible for the Hunspell support for the Dutch language. I also write messages in multiple languages and find https://addons.mozilla.org/en-US/firefox/addon/dictionary-switcher/ very handy. Allowing all installed spell checkers to work at once is a bit problematic since, for example, I have installed English (UK), English (US), Dutch, German (DE), German (CH), French (FR) and French (CH). As we are working towards a similar construction, Dutch + Dutch jargon (like English + medical), I support the idea of multiple language support. So perhaps users should be able to create spell checker profiles in which certain languages (or better, spell checking dicts) are combined. This could also solve excluding, in my case, languages such as French (LU), German (LI) and Dutch (BE). Apart from what is possible now (with and without add-ons), would it be an idea to create user stories on https://etherpad.mozilla.org/multi-lingual-spell-checker to make an inventory of the desirable functionality as a start for the design?
See my comment on 2011-09-16 16:52:16 PDT about the pitfalls of having several directories active together (at least when they correspond to different languages). I still think that in the Mozilla software, the destination (web site, correspondent, newsgroup) is sufficient to determine the language to use.
I also agree not to have multiple dictionaries of different languages active at the same time. The manpage of Hunspell describes how to load multiple dictionaries at the same time, but is only to add e.g. English medical dictionary. I have renamed the MoPad to https://etherpad.mozilla.org/multi-dictionary-spell-checker
There seem to be three issues, so far, that are not really the same, though very related. 1) the original issue of this bug: associating a spellcheck language with recipient email addresses 2) spell-checking multiple languages in a single document. 3) automatically detect the spell-check language(s) to use. "Multi-lingual" has a slightly different interpretation in each case: in 1), it simply means to support changing the dictionary, which is implemented. It can also mean to be able to associate multiple dictionaries to a (combination of) recipient email addresses. Yet the multi-lingual aspect has nothing to do with remembering a preference - only the kind of preference to remember: a single language, or an array of languages. In 2), Multi-lingual refers to using multiple languages in a single email/document. So far at least two use cases are presented: a) using specific domain language extensions, such as a medical dictionary; b) using different, incompatible languages such as Dutch and English. In 3), multi-lingual applies to an algorithm to determine which languages to use based on matches for each installed dictionary. These issues can be developed in parallel and thus decoupled from each other. Whether a single language, or multiple languages are associated with a recipient email address makes no difference for issue 1), as all it is concerned with is remembering a preference parameterized by the recipient(s). Right now, there exists a 'Country' field in the address book, that doesn't seem to be used to determine the dictionary; a simple CSV field with dictionary codes (en_US etc) would suffice. The issue can get tricky quickly, but complicating things by wanting to do it all at once is probably one of the reasons that this issue has been open for over a decade ;-)
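As a rough illustration of issue 1), the sketch below assumes a hypothetical per-contact address book field holding comma-separated dictionary codes, and resolves contradicting settings by weighting To: recipients higher than Cc: ones, as floated in comment #8. The function names and the weights are assumptions, not an existing Thunderbird feature.

```python
from collections import Counter


def parse_dictionary_field(field):
    """Split a CSV field such as 'en_US, fr_FR' into a list of codes."""
    return [code.strip() for code in field.split(",") if code.strip()]


def choose_dictionaries(to_fields, cc_fields, to_weight=2, cc_weight=1):
    """Return the dictionary codes with the highest weighted vote."""
    votes = Counter()
    for field in to_fields:
        for code in parse_dictionary_field(field):
            votes[code] += to_weight
    for field in cc_fields:
        for code in parse_dictionary_field(field):
            votes[code] += cc_weight
    if not votes:
        return []
    top = max(votes.values())
    # Ties are all returned, which only helps once several dictionaries
    # can be active at the same time.
    return sorted(code for code, count in votes.items() if count == top)
```

Because the result is a list, the same code serves both the single-language and the multi-dictionary case, which is the decoupling argued for above.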
> --- Comment #104 from Pander <pander@users.sourceforge.net> 2013-12-21 02:36:33 PST --- > I am one of the people responsible for the Hunspell support for the Dutch > language. I also write messages in multiple languages and find > https://addons.mozilla.org/en-US/firefox/addon/dictionary-switcher/ very handy. Glad to meet you, but I'm not sure if you work for Hunspell (it seems so) or Mozilla. Anyway, what I wrote to Hunspell matches what you wrote here, see below. > Allowing all installed spell checkers to work at once is a bit problematic > since, for example, I have installed English (UK), English (US), Dutch, German > (DE), German (CH), French (FR) and French (CH). As we are working towards on a > similar construction as Dutch + Dutch jargon (like English + medical) I support > the idea of multiple language support. I have the same environment :-) Plus Russian, in which I can manage to write some sensible text without a dictionary but not without a spelling checker. Add Italian and Spanish, but that's mostly for fun. I have weeded out unnecessary menu entries as follows: http://www.papou.byethost9.com/notes/Ubuntu_spelling/ > So perhaps users should be able to create spell checker profiles in which > certain languages (or better spell checking dicts) are combined. This could > also solve excluding, in my case languages such as French (LU), German (LI) and > Dutch (BE). This is what I had very awkwardly suggested in Hunspell bug 209 and feature-request 37. I have just reissued https://sourceforge.net/p/hunspell/bugs/239/ It does exactly what you suggest at very low programming effort. It would be nice if you or someone drew the attention of the Hunspell developers to the high value-for-sweat ratio of this suggestion. Multi-dictionary checking without application support Multi-dictionary checking is a great feature but too few applications support it. An environment variable such as MYHUNDICTS=dict,dict2,... 
could be used to initialize the dictionary list before "-d dict3,dict4,..." or application requests add their own dictionaries to it. The user could specify his multi-dictionary choices either with a permanent session wide variable or with a per application command such as env MYHUNDICTS=dict,dict2,... command ... No modification to the application is needed to achieve multi-dictionary that way.
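A minimal sketch of how such a merge could behave, written in Python for illustration only. MYHUNDICTS is the variable proposed in this comment and is not, as far as this sketch assumes, something Hunspell reads today; `effective_dictionaries` is a hypothetical helper name.

```python
import os


def effective_dictionaries(cli_dicts="", environ=None):
    """Merge the proposed MYHUNDICTS list with an application's '-d' list.

    Dictionaries from the environment come first, then those requested by
    the application; duplicates are dropped while keeping the order.
    """
    environ = os.environ if environ is None else environ
    merged = []
    for source in (environ.get("MYHUNDICTS", ""), cli_dicts):
        for name in source.split(","):
            name = name.strip()
            if name and name not in merged:
                merged.append(name)
    return merged
```

For example, with MYHUNDICTS=en_US,fr_FR in the session and an application asking for de_DE,en_US, the checker would end up with en_US, fr_FR and de_DE.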
> I still think that in the Mozilla software, the destination (web site, correspondent, newsgroup) is suficient to determine the language to use. Supposing that some eBay client types a message to some seller. What will be the spelling language? How will it be determined? eBay seem unable to tell themselves in their warning. And if some language helper tag is required in HTML, are we sure that it will be used when we notice that Google Translate does not use the HTML language tags to determine (multiple) source languages? And if the software determines that the recipient's language is Chinese, will anyone be forced to type Chinese or isn't it better to allow trying English, Russian and German, 3 languages I used to write to them? OK, it will not support eBay and other forms. Just the mailing lists. In the case of talk-be@osm.org what will the language be when French, Dutch and German are allowed but they very wisely use English too? > See my comment on 2011-09-16 16:52:16 PDT about the pitfalls of having several directories active together (at least when they correspond to different languages). That opinion is usual and replied in several places. I have been using French+ English checking to my utmost satisfaction during more than 10 years with good old Eudora. There are enough homophones within French itself (sot(s), saut(s), seau(x), sceau(x), ...) or English (its, it's (ahem!)...) to make the addition of English a benign matter compared to the advantage. In addition, mixing English and Russian is perfectly safe and no one should disallow that because French clashes with English. In fact, the bottom line is that, as I said, using a language mix and selecting a single one shouldn't interfere so that no one should try to prevent the adepts of the first method to use it while the adepts of the second one continue to research for another 12 years how to do it practically.
Actually the kind of websites where this assumption doesn't hold is webmail. The decision on language isn't made by the software, it's made by the user; the software just remembers which language I want to use on a given site, in a given newsgroup, or with a given correspondent. Of course this isn't perfect, but it's already vastly better than being wrong 50% of the time. The problem isn't homophones in the same language. These would require some heavy-duty contextual usage statistics or, for the numerous homophonic verb conjugations that we have in French, some grammatical analysis. The problem is accepting a foreign spelling which is not correct in the language actually used (like accepting the English form "connection" in a French text where it should be spelled "connexion"). Would one use both UK and US English dictionaries at the same time? I'm not disallowing anything, I'm just pointing out that having active dictionaries for several languages will not work in many cases, especially when the languages are close to each other. So it cannot be the universal solution.
(In reply to Bertrand Denoix from comment #110) [...] > Would one use both UK > and US English dictionaries at the same time? [...] Perhaps — when part of the text was quotes from incoming mail (maybe quoted from several messages), in order to have both "color" and "colour" etc. appear with no wavy underline.
For contextual checking, you have to use LanguageTool http://www.languagetool.org/ (for which there is a Thunderbird add-on) or LightProof http://extensions.openoffice.org/en/project/lightproof-grammar-checker-development-framework (which can use rules from LanguageTool). Spell checking based on Hunspell cannot take into account anything beyond a space, which for some languages is a bigger problem than for others. This is so by design, hence support for and integration of grammar/style checking is needed to catch these spelling mistakes.
Sorry, but what is the issue when the user wants to have many dictionaries performing the spell check at the same time? This is one of the most useful features of Evolution. Not all installed dictionaries have to be selected; the user should be able to select which ones are to be used. And it depends on which dictionaries you mix: Russian, Hebrew, Greek and English can be mixed without problems to the best of my knowledge. And if I decide to mix German, English and Spanish, I know that I might miss errors, but this is a risk I want to take. Even the hunspell manual has a sample of the syntax mixing English and German (http://www.manpagez.com/man/1/hunspell/). My question is: what is the problem with implementing this as an option that the user decides whether to enable or not?
> My question is: what is the problem of implementing this as an option that the user > decides whether to enable it or not? None. We just need somebody to implement it. There's been lots of ideas and even more opinions posted here. I think we established that there is a real need for this. Now we just need somebody to step up to do the work. Any takers? If you do it, please do it in a clean way, otherwise the patch will not go in.
@Pablo: I don't think anyone wants to forbid having several directories active together. This can be useful. But this is only something useful to people who need/want several active dictionaries at the same time and are ready to cope with the induced misses. This is not a solution for people who want a single dictionary or just the dictionaries for a given language, because the combination of languages they use would lead to too many misses when using all dictionaries together.
> Now we just need somebody to step up to do the work. Any takers? > If you do it, please do it in a clean way, otherwise the patch will not go > in. Thanks for the reply, Ben. I’m afraid I cannot code. So, I’d have to learn first :-). > @Pablo: I don't think anyone wants to forbid having several directories > active together. This can be useful. But this is only something useful to > people who need/want several active dictionaries at the same time and are > ready to cope with the induced misses. This is not a solution for people who > want a single dictionary or just the dictionaries for a given language, > because the combination of languages they use would lead to too many misses > when using all dictionaries together. Thanks for your reply, Bertrand. I must admit that I don’t know what several simultaneously active directories are or why they are required. (I have no idea how they differ from several simultaneously active dictionaries.) But you tell me that my approach would be useful only for several simultaneously active directories. Well, this would be a solution for this case. I don’t know what the problem could be when only a single dictionary is used. As far as I understand, this is the way it works now. And when combining many dictionaries from the same language, if the combination could lead to problematic results, then either the combination isn’t actually desired (sorry, but I cannot imagine a mix of English, medical English, geological English and legal English that leads to errors) and the user should enable only one dictionary, or the combination can have all selected dictionaries simultaneously active. I may be missing something, but I think with the approach I’m suggesting it is possible to get a partial solution.
I meant *dictionaries* (a lapse no dictionary will catch :))
My two cents: running spell check with multiple dictionaries at the same time seems to have many drawbacks (for example mixing "connection" and "connexion" in French and English, etc.) but what about providing spell check on sub-parts of the email body instead? I see two possible solutions: - the user selects by hand which parts of the message should be spell checked with the English dictionary, which part with the French dictionary, etc. This could seem tedious, but most of the time, language change occurs only in clearly separated paragraphs; - the spell checking algorithm does a "smart job" by detecting at the paragraph-level which parts of the message are in English (and should be checked with the English dictionary), which parts are in French (and should be checked with the French dictionary), etc. Of course, these solutions could be potentially complex to implement, and are more CPU-intensive than the current single-language spell checking. Hope this helps. My best wishes to all for the holiday season!
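The second option (paragraph-level detection) could look roughly like the sketch below. It assumes paragraphs are separated by blank lines and that each language is represented by a small set of known words; the word lists and function names are invented for illustration, and a real implementation would query the installed dictionaries instead.

```python
import re

# Toy vocabularies standing in for real dictionaries (hypothetical data).
WORDS = {
    "en": {"see", "you", "on", "monday", "at", "the", "office"},
    "fr": {"à", "lundi", "au", "bureau", "bonne", "semaine"},
}


def guess_language(paragraph, words=WORDS):
    """Return the language whose vocabulary matches the most tokens, or None."""
    tokens = re.findall(r"[^\W\d_]+", paragraph.lower())
    hits = {lang: sum(t in vocab for t in tokens) for lang, vocab in words.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


def languages_by_paragraph(body):
    """Split a message body on blank lines and tag each paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    return [(guess_language(p), p) for p in paragraphs]
```

Each paragraph would then be handed to the checker with its own dictionary. Very short paragraphs (a greeting, a signature) are where such a guess is most likely to go wrong.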
YES many people have stated that mixing dictionaries is an INVALID solution because of language confusion. This is why bug 676500 was closed with WONTFIX without any discussion, but they later understood and changed their mind. Today, I have written a message in interspersed English+French+Russian. Very usually, I write French including a lot of English terms. There are often tri- or bilingual messages in Belgium (e.g. talk-be@osm.org) Not switching dictionaries between messages is a valid user desire. I read that multi-dictionary is a key feature of spelling checkers. Mixed dictionaries checking is a very VALID need and means. It's OPTIONAL. I'm glad to hear good sense here. I can't code it either but I can bring ideas. The user interface is just a matter of selecting several languages: a dictionary set. hunspell can use multiple dictionaries but it should be checked how well it works. Evolution is well done, if not a source, an inspiration for code. This dictionary set should be remembered as the DEFAULT ONE. User configuration, such as for mailing lists, can temporarily activate alternative dictionary sets. User permission must be required for any software features to override these user choices with any sort of other automatic language detection. LET'S KEEP IT SIMPLE, at least in a first phase. User satisfaction is the most effective way to experiment if more features are necessary. Best season greetings to you all.
I have another suggestion to improve spell-checking: usually, when you write to someone, it's in one language. What annoys me is that Thunderbird doesn't know that when I write to chris@somewhere.nz it's in English, and when I write to stephane@quelquepart.fr it's in French. So what about having in the contact *directory* a field "language", that could be used for choosing which dictionnary to use by default. I keep switching manually. I write to stephane = automatically switch to the French dictionnary. I write to chris = the English one
>So what about having in the contact *directory* a field "language", that could be used for choosing which dictionnary to use by default. Or, for those who have understood the subject matter, "... for choosing which dictionar*ies*" (and with a single "n", especially for those who use spell checkers ;-) ) (this would allow me to set English+French+Dutch+German for my Belgian mailing lists) This said, I would first prefer something setting which of TEXT or HTML+TEXT I must use (instead of just HTML) for some recipients, because, while I can remember which of Stéphane or Chris speaks English or French, I never remember who does not support HTML or wants HTML+TEXT, or I just goof.
For sure it cannot go on like this: http://i.imgur.com/VspfJBl.png :) BTW, Blackberry 10 can handle two languages per message just fine, with autocorrect and spellcheck, without complicated setting up per message. You just set system wide two languages. I have no problems with this, maybe because the languages I use are sufficiently different.
Depends on: 1059835
(In reply to André Pirard from comment #121) > >So what about having in the contact *directory* a field "language", that could be used for choosing which dictionnary to use by default. > > Or, for those who have understood the subject matter, "... for choosing > which dictionar*ies*" (and with a single "n", especially for those who use > spell checkers ;-) ) > (this would allow me to set English+French+Dutch+German for my Belgian > mailing lists) > > This said, I would first prefer something setting which of TEXT or HTML+TEXT > I must use (instead of just HTML) for some recipients, because, while I can > remember which of Stéphane or Chris speaks English or French, I never > remember who does not support HTML or wants HTML+TEXT, or I just goof. It seems that this has already been noticed, but no bug has been created. Now, with Nightly 38.0a1 (2015-01-15) on Windows 8.1, if you're connected to Zimbra, a mail service provided by Free, spell check is done in English, whereas before it adapted itself automatically to French. Now you can't even download the French dictionary. How to reproduce: Prerequisites: Have a French account with Zimbra 1) Log into the Zimbra French account 2) Click on "Nouveau" 3) Write "bonjour les amis" Actual: flagged as a spelling error. Expected: No error found. Note: I don't know if spell check is done by the browser, Zimbra, or both. But before, it worked well.
Bug confirmed. I was talking with a friend on Framasphere, in French. He had spell check. I don't.
I've a few remarks regarding previous comments: * Having the website include "lang" or anything alike won't work. Websites can't always determine the language the user will type in. Examples: webmail, social networks, web-based IM. And of course, multilingual websites. * On Thunderbird, a single contact may not have a single language. I've a bilingual friend with whom we speak in English or Spanish at random, both over the internet and in person. Suggestions: * Make the language selection menu check-boxes instead of select-ones. * If most of a text spell-checks OK in a certain language, but another portion of it checks OK in a second language, then yellow-underline that second part, to indicate something like "this was spell-checked as OK, but not in the same language as the rest of the text". This, of course, requires that Firefox spell-checks individual bits of text separately. For example: If 90% of a text matches language_a, and 10% matches language_b, then underline that 10% in yellow.
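The yellow-underline suggestion can be sketched as a three-way classification of each word. This assumes every selected dictionary is available as a plain set of words; the function name and the "ok" / "yellow" / "red" labels are illustrative only.

```python
import re


def classify_words(text, dictionaries):
    """Label each word 'ok', 'yellow' or 'red'.

    'ok'     : accepted by the main language (the one matching most words)
    'yellow' : rejected by the main language but accepted by another one
    'red'    : rejected by every selected dictionary
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    hits = {
        code: sum(t in vocab for t in tokens)
        for code, vocab in dictionaries.items()
    }
    main = max(hits, key=hits.get)
    labels = []
    for token in tokens:
        if token in dictionaries[main]:
            labels.append((token, "ok"))
        elif any(token in vocab for code, vocab in dictionaries.items() if code != main):
            labels.append((token, "yellow"))
        else:
            labels.append((token, "red"))
    return main, labels
```

This works word by word, so a word that exists in both languages (such as "en" in the Spanish list versus "in" in the English one) is simply counted for whichever dictionary holds it.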
One comment. Sometimes it is helpful to detect the language from the active language in the Windows Language Bar. As an example, many sites have search fields where different languages can be used. Another example: with two tabs open, I type one language in the first tab and another language in the second.
Basic common sense says French for the French, English for the English, Spanish for the Spanish, etc... If we want better, we should have automatic language detection, like in Google Translate. But it's heavy work for devs. Too heavy, I think.
Language detection, also for short messages such as headers and single paragraphs, can be done with https://github.com/shuyo/language-detection
As the other add-on linked above didn't work for me, I've co-developed a new one: https://addons.mozilla.org/en-US/firefox/addon/automatic-dictionary-switcher/, it guesses the language and sets the spell dictionary as soon as there are 25 characters at least. With more than 30-35 characters the guess should almost always be correct. Except showing feedback in the GUI this was easy to develop, I can only encourage people to add this to core. There are Javascript libraries that solve this problem, e.g. https://github.com/wooorm/franc. The only problem is it doesn't work well for very short texts.
(In reply to Daniel Naber from comment #130) > As the other add-on linked above didn't work for me, I've co-developed a new > one: > https://addons.mozilla.org/en-US/firefox/addon/automatic-dictionary-switcher/ > , it guesses the language and sets the spell dictionary as soon as there are > 25 characters at least. Works nicely, thanks. Though I must mention that one of the goals that I see in solving this issue is to be able to use multiple dictionaries at the same time, to allow a multi-lingual user to craft text containing more than one language that will be spell checked, a problem which your extension does not address. I often find myself writing highly technical texts in a Hebrew context and that often contains many English words. The current "best" practice is to either have all English marked as spelling errors, all Hebrew words marked as error, or not to use spell checking. :-(
I contacted the author and the reason is: " because of this bug which we cannot fix without a fix from Firefox: https://github.com/danielnaber/firefox-dict-switcher/issues/21 " So please upvote https://bugzilla.mozilla.org/show_bug.cgi?id=1240536
I'm told Chrome switches languages automatically by default. Maybe there's code that we could borrow?
Whiteboard: [parity-chrome]
However, being a web browser with a similar architecture and compatible license likely makes Chromium the best candidate for borrowing code from. That's certainly where I'd start looking. (FWIW, I'm the author of the very old Dictionary Switcher add-on which was hacked together in JS using our limited spellchecker API and I don't think that's suitable for adoption.) That might also make this a good mentored project if we can find a suitable mentor. Maybe Ehsan?
Flags: needinfo?(ehsan)
I think an 80% solution could easily be developed as an add-on, if WebExtensions were able to switch the spelling dictionary.
(In reply to Daniel Naber from comment #137) > I think an 80% solution could easily be developed as an add-on, if > WebExtensions were able to switch the spelling dictionary. Performance was the major issue with my add-on. Our spellchecker JS API is not made for this use case. There might be smart ways around this, but figuring this out could well be more effort than following Chrome's implementation. We shouldn't waste time on a 80% solution (on top of the 16 years this bug has already been open) when doing it the right way is easier.
> I'm told Chrome switches languages automatically by default. I cannot find that under Chrome 56 on Ubuntu. It has an option to accept all languages, though. So if you mix German and English, both will get accepted. This should be easy to implement, but there are drawbacks: if someone has a lot of dictionaries installed, this could cause a slowdown. Also, if you write an English sentence, make a typo and that typo is by chance a German word, it will not be noticed. Maybe this has all been discussed, I lack the time to read through 16 years of bug history...
Attached image chrome-spellchecking.png (deleted) —
(In reply to Dão Gottwald [:dao] from comment #134) > I'm told Chrome switches languages automatically by default. No it doesn't. I see lots of confusion around what Chrome does; let me sum up Chrome's spellchecking approach/UX with a few words and a screenshot, it's dead simple: 1. User picks the languages they speak. This is done in-preferences, assisted by a link in the spell-checking context menu (see attached screenshot) that opens the preferences pane at Chrome's language settings page (chrome://settings/languages). For example, to Chrome I speak "English (United States), French". 2. Spellchecker does its job as usual and *without in-field language-detection logic*, but does it using a single meta-dictionary (called "All your languages" in the contextual menu, see attached screenshot), which is the *union* of the dictionaries associated with user-picked languages. TL;DR: see the attached screenshot where I fired up Chrome, picked a textarea, and typed two sentences: - One in English, one in French - Both with typos in the two final words of the first sentence. - Both with a few words from the other language casually used. --> Chrome didn't do any magic, it just spellchecked everything using its "All your languages" meta-dictionary, accepting anything that looks like French OR English. I'd *love* to see Firefox adopt Chrome's UX. To my eyes, the convenience benefits (for those who have read the book, its "Don't Make Me Think"-ness [1]) of this UX largely exceed the costs associated with its few caveats rightfully listed above by Daniel Naber: - "If someone has a lot of dictionaries installed, this could cause a slowdown" First, I'm not sure that's true with a proper implementation. And if it's true, a. how bearable is the slowdown, b. is the use case common? - "If you write an English sentence, make a typo and that typo is by chance a German word, it will not be noticed." 
True, but I'll take that slight cost over the cost of having to tediously pick the current language each time I'm typing in a field. I suspect many users would too; that's only my bias, but the fact that Chrome went for this supports it. - Lastly, one caveat not mentioned by Daniel is the tolerance of such an algorithm to foreign-language text. To me, 1. The Spanish user insisting on having their anglicisms corrected can still manually unset the "All your languages" dictionary and pick "Spanish". 2. And even omitting this, such tolerance seems perfectly okay. We are not in the context of LibreOffice helping our Spanish author avoid anglicisms in their Spanish text! We are more certainly in the context of a multilingual user typing text in some forum textarea! Let's help them proof their blurb in the most effortless way possible :) Ideally, a UX analysis would tell us which proportion of multilingual users don't / rarely benefit from spell checking because they never / rarely bother setting the current language, ending up with fully red-wiggle-underlined text and getting along with it. [1] https://www.sensible.com/dmmt.html
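The "All your languages" meta-dictionary described in the comment above can be illustrated with a minimal sketch. This is not Chrome's code: the `DICTIONARIES` word sets are toy data and the function names are hypothetical, chosen only to show the union behavior and the false-friend caveat.

```python
# Toy word lists standing in for real dictionaries (hypothetical data).
DICTIONARIES = {
    "en": {"he", "has", "a", "nice", "hat", "and", "is", "fast"},
    "de": {"er", "hat", "und", "ist", "schnell"},
}


def is_misspelled(word, selected_languages):
    """Return True only if no selected dictionary accepts the word."""
    lowered = word.lower()
    for lang in selected_languages:
        if lowered in DICTIONARIES[lang]:
            return False  # any one dictionary accepting the word is enough
    return True


def misspelled_words(text, selected_languages):
    """List the words in text that none of the selected dictionaries accept."""
    return [w for w in text.split() if is_misspelled(w, selected_languages)]


sentence = "He has a nice hat und is fast"
# With only English selected, the stray German word is flagged.
print(misspelled_words(sentence, ["en"]))
# With both languages selected, the same typo goes unnoticed.
print(misspelled_words(sentence, ["en", "de"]))
```

The second call shows the cost of the union approach: a typo that happens to be a valid word in another selected language is accepted.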
Thanks for the summary, this is very helpful. I agree that a meta-dictionary would be a big improvement over what we have today, and being able to borrow Chrome's implementation would be a significant help. If performance was a big problem with that approach, Chrome probably wouldn't have shipped this. So I still think this would be our best shot at moving this bug forward.
I was actually looking at the Chromium source code for their spell checking support a few weeks ago for another reason and I encountered the code that does the spell checking. Ronan is right in comment 140. What they do is essentially loop over your selected languages and try them out one by one: <https://cs.chromium.org/chromium/src/components/spellcheck/renderer/spellcheck.cc?l=278&rcl=8b3c17d1e53fade1111ec0fe9aed8a6f891d5331> until they find a language which accepts the word as correctly spelled, otherwise they consider the word as misspelled. (They also have the notion of skippable words, I haven't looked closely to see what that means.) I don't actually think we can really borrow any of this code since our glue code to hunspell is nothing like Chromium's, but the technique that they use is very simple and probably works pretty well in practice, so I think we should just do that. In terms of mentoring someone, right now is unfortunately a really bad time for me since my plate is full with high priority projects. :( But as a multi-lingual person myself I really care about this and would really like to see it fixed. Perhaps Masayuki has more free time?
Flags: needinfo?(ehsan) → needinfo?(masayuki)
(In reply to :Ehsan Akhgari from comment #142) > I was actually looking at the Chromium source code for their spell > checking support a few weeks ago for another reason and I encountered the > code that does the spell checking. Ronan is right in comment 140. What > they do is essentially loop over your selected languages and try them out > one by one: > <https://cs.chromium.org/chromium/src/components/spellcheck/renderer/spellcheck.cc?l=278&rcl=8b3c17d1e53fade1111ec0fe9aed8a6f891d5331> > until they find a language which accepts the word as correctly spelled, > otherwise they consider the word as misspelled. (They also have the notion > of skippable words, I haven't looked closely to see what that means.) Thanks for looking at Chromium and confirming this :) But isn't it what :rail was proposing too (8 years ago, you might have missed it :D ) in comment 41 / attachment 355624 [details] [diff] [review] ? At the time, testers reported an unbearable performance regression; do we have an idea of what Chromium does better, or is it just bigger CPUs that make this approach okay today?
> At the time, testers reported unbearable performance regression; do we have an idea of what Chromium does better, or is it just bigger CPUs that makes this approach okay today? hunspell is very slow when it comes to generating suggestions, so when a user pastes text with a lot of errors (or what the spell checker considers errors), this can become very slow *if* the suggestions are created directly when checking the text. One way to solve this could be to generate the suggestions only when the user needs them, i.e. when opening the context menu.
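The idea of deferring suggestions until the context menu opens can be sketched as follows. This is an illustration only, not Firefox or Hunspell code: `LazySpellChecker` is a hypothetical class, the word list is toy data, and the standard library's `difflib.get_close_matches` stands in for the expensive suggestion step.

```python
import difflib

# Toy word list standing in for a real dictionary (hypothetical data).
WORDS = ["spelling", "checker", "dictionary", "language", "suggestion"]


class LazySpellChecker:
    def __init__(self, words):
        self._words = set(words)
        self._suggestion_cache = {}
        self.suggest_calls = 0  # counts how often the expensive step runs

    def check_text(self, text):
        """Cheap pass: flag unknown words without computing suggestions."""
        return [w for w in text.split() if w.lower() not in self._words]

    def suggestions_for(self, word):
        """Expensive pass, run on demand (e.g. when a context menu opens)."""
        key = word.lower()
        if key not in self._suggestion_cache:
            self.suggest_calls += 1
            self._suggestion_cache[key] = difflib.get_close_matches(
                key, sorted(self._words), n=3
            )
        return self._suggestion_cache[key]


checker = LazySpellChecker(WORDS)
flagged = checker.check_text("speling cheker dictionary")
print(flagged, checker.suggest_calls)  # no suggestions computed yet
print(checker.suggestions_for("speling"))
```

Pasting a long text with many flagged words then only costs lookups; the suggestion work is paid once per word the user actually asks about.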
Design for a new version of Hunspell, which is faster, is being worked on. We recently filed a MOSS application.
(In reply to :Ehsan Akhgari from comment #142) > In terms of mentoring someone, right now is unfortunately a really bad time > for me since my plate is full with high priority projects. :( But as a > multi-lingual person myself I really care about this and would really like > to see it fixed. Perhaps Masayuki has more free time? Hmm, I'm struggling with web-compat issues around editor, remaining issues around e10s with text input and improving UI Events implementation. So, I don't have much time to widen my working area. However, as a module owner of editor, of course, if somebody writes a patch and requests review from me (if there is nobody else to review), I'll review it. # Looks like the patch is too old and not conforming to our coding rules...
Flags: needinfo?(masayuki)
....and for the newbies, any solution on how to get those dictionaries in other languages?
Frankly, spelling in "all languages" doesn't make sense since errors will go undetected. It will certainly be OK for orthogonal language pairs, but not for similar languages. Take: German: Er is rod and er had vile Gutter, six mistakes, all correct words in English. Correct would be: Er ist rot und er hat viele Gatter. Englisch: He las a nice hat und ist fasst, five mistakes using correct German words: Englisch, las, und, ist, fasst. Correct would be: English: He has a nice hat and is fast (talking about someone on a race bike). The only useful thing would be to imitate office software and store a language attribute against a paragraph, span, etc. and then spellcheck the section based on that.
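The per-span language attribute proposed above can be illustrated with a small sketch. Everything here is hypothetical: the `(language, text)` span representation, the `check_spans` helper, and the toy word sets are assumptions made for illustration, not an existing editor API.

```python
# Toy word lists standing in for real dictionaries (hypothetical data).
DICTIONARIES = {
    "en": {"he", "has", "a", "nice", "hat", "and", "is", "fast"},
    "de": {"er", "ist", "rot", "und", "hat", "viele", "gatter"},
}


def check_spans(spans):
    """Check each (language, text) span against its own dictionary only.

    Returns the (language, word) pairs that were flagged.
    """
    flagged = []
    for lang, text in spans:
        dictionary = DICTIONARIES[lang]
        for word in text.split():
            if word.lower() not in dictionary:
                flagged.append((lang, word))
    return flagged


spans = [
    ("de", "Er ist rot und er hat viele Gatter"),
    ("en", "He has a nice hat und is fast"),
]
# "und" is valid in the German span but flagged in the English one.
print(check_spans(spans))
```

Because each span is checked against a single dictionary, a word that is valid only in the other language is still reported, at the cost of the text having to carry a language tag.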
Agreed that mixing language does not bring advantages, only troubles. It is up to the editing software to mark individual pieces of text with what language they are in. LibreOffice is a nice example for this for selecting different languages for parts of texts. HTML forms and e-mails in HTML could also use a similar approach. The only justification to load an additional dictionary is to load the medical dictionary of the same language extra. It was anticipated that more of these special dictionaries would appear, such as a juridical one, etc. That is also how the underlying spell checking software has been made, but not to mix languages. However, at the moment only English and German medical dictionaries are available and one for Dutch in the making. Perhaps a German chemical one based on http://repo.or.cz/wortliste.git/blob/HEAD:/arzneiwirkstoffnamen and http://repo.or.cz/wortliste.git/blob/HEAD:/arzneiwirkstoffnamen-supplement if somebody would make that, but that is it at the moment justifying multiple dictionaries. Since it is very risky, better not offer this mixed dictionary function to end users. For now, I prefer using simply https://addons.mozilla.org/en-US/firefox/addon/dictionary-switcher/
(In reply to Pander from comment #149) > Agreed that mixing language does not bring advantages, only troubles. Depends on the languages in question. Languages that use radically different scripts will not clash (i.e. there will be no cases of one word being correct in one language and being misspelled in another). Not all people on this planet use the Latin alphabet or its close derivatives. So this option might be useful for some. > It is > up to the editing software to mark individual pieces of text with what > language they are in. It might be helpful to keep an eye on the keyboard layout that was in use when a particular character was typed. This way the browser will be able to automatically identify the language currently in use. That said, some people might be typing in two different languages using one layout. I don't have the statistics to prove or disprove that such an arrangement is widely used, though I'd eyeball it as "rare". For such cases a UI for choosing text language would indeed be needed.
Using language detection software might be more accurate. We used one a few years back but see there is more now such as https://stanbol.apache.org/docs/trunk/components/enhancer/engines/langidengine.html and https://www.tutorialspoint.com/tika/tika_language_detection.htm Might be good to do a re-inventory of what is currently available.
(In reply to LRN from comment #150) > It might be helpful to keep an eye on the keyboard layout that was in use > when a particular character was typed. This way the browser will be able to > automatically identify the language currently in use. Doesn't really work. Among Latin alphabet users, many people have a keyboard for their local language, and still have a significant ratio of mail in English. Conversely, expats use keyboards available in their country of residence to type in their native language. In which case the spell checker is often handy to add the relevant accented characters. And for multi-language countries, this is a mess: https://en.wikipedia.org/wiki/QWERTZ
> LibreOffice is a nice example for this for selecting different languages for parts of texts LibreOffice is a horrible example. In fact, you can hardly do worse than that. I routinely write emails in 3 different languages. They are all mixed. In some cases, I even write 2 different languages to the same person, depending on who else is or was involved in the thread, or depending on the subject. I cannot specify the language for each email, that costs too much time, and I'd rather ignore the spell-checker altogether. The only solution is to select once, as a configuration, a number of languages that I know how to write, and then to allow all words in these languages. I agree that we cannot allow *all* languages that TB knows. But we must allow all languages that the user knows. If we can detect the language that the user writes, so that he doesn't have to manually and explicitly configure it, all the better. But that's a nice to have.
FWIW, Android keyboard allows several languages at once.
Depends on: 1402822
Android keyboard is more used for auto completion, that is slightly different from spell checking.
Just switched from Google Chrome to Firefox, after yesterday's release, and the first thing I notice is the different behavior regarding spell checking. As a user who regularly writes in Dutch, Spanish and English, it is very annoying to constantly switch dictionaries. While Chrome's approach is not perfect (it spell checks against all dictionaries), it still seems better to me than the current behavior. Maybe you could consider adding a flag to enable spell checking against all dictionaries as a temporary solution. I guess I am not the first one facing this problem.
I regularly write Italian and English. I moved from Chrome to Firefox yesterday, happy about the Quantum project, and was surprised to find that I will miss this feature, which I use regularly in Chrome.
Please help revive or upgrade: - https://addons.mozilla.org/en-US/thunderbird/addon/dictionary-switcher-for-thunde/ - https://addons.mozilla.org/en-US/thunderbird/addon/dictionary-switcher/ - https://addons.mozilla.org/en-US/firefox/addon/dictionary-switcher/ I am one of the people reimplementing Hunspell (the spell checking engine used by Mozilla products and services). Multiple languages will not be supported in the engine, at least not in the near future, also because of false friends and other dangers of mixing language support. At the moment, the best way to go is to reactivate one of these add-ons, preferably in a way that it remembers the language used before for a domain (and high level path) for HTML forms in Firefox, or for the (sending and) receiving e-mail address when writing in Thunderbird.
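Remembering the last language per domain and high-level path, as suggested above, could look roughly like this. It is a sketch under assumptions: `DictionaryMemory` is a hypothetical class, "high level path" is taken to mean the first path segment, and nothing here is an existing add-on or WebExtensions API.

```python
from urllib.parse import urlparse


class DictionaryMemory:
    """Remember the last dictionary used per (host, first path segment)."""

    def __init__(self, default_language):
        self._default = default_language
        self._by_site = {}

    @staticmethod
    def _key(url):
        parts = urlparse(url)
        # Assumption: the "high level path" is the first path segment.
        first_segment = parts.path.strip("/").split("/")[0]
        return (parts.hostname, first_segment)

    def remember(self, url, language):
        self._by_site[self._key(url)] = language

    def language_for(self, url):
        return self._by_site.get(self._key(url), self._default)


memory = DictionaryMemory("en-US")
memory.remember("https://example.org/forum/thread/1", "de-DE")
print(memory.language_for("https://example.org/forum/thread/2"))
print(memory.language_for("https://example.org/blog/"))
```

For Thunderbird, the same lookup could be keyed on the recipient address instead of the URL.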
However a faster implementation of Hunspell would allow us to loop over Hunspell suggestions just like comment 144 suggested and achieve the same result right?
Okay, so it seems like it is really necessary for Firefox to keep track of the language each word is written in. The way LibreOffice does. > LibreOffice is a horrible example. In fact, you can hardly do worse than that. If you change the language by opening the character format dialog, then yes, it's absolutely awful. However, if you're swapping languages in LibreOffice by clicking the language indicator field in the status bar (at the bottom of the window), it's passable. To explicitly switch languages faster than that, in Firefox, you'd need to have either a language switching widget floating around the text input cursor, and/or a hotkey for language switching. That's why it's better to augment it with implicit switching. Most people don't mix more than two or three languages at one time, and don't know more than 5. Point is, the suggestion above, that the user should be able to set up the list of languages that (s)he actually knows how to write in, is also a good one; that will narrow the search space. This would naturally go along with the list of installed dictionaries (i.e. it should be possible to install a dictionary, but not use it). Also, Firefox can (and should) track the input flow. Unless you're switching from one language to another on every word, you'd likely be writing in one language continuously, then switching to another and writing some more, then switching again. Thus Firefox can make a good guess about the language of a word that is currently being typed in - it is likely in the same language as the previous word ("previous" being "to the left" for LTR systems and "to the right" for RTL systems). Characters are also important. When Firefox sees a 'ß' typed in, the language that is being used is *likely* not English anymore. Same goes with keyboard layout - every time it changes, there's a good chance that the user is switching to a different language. 
Pasting is just a special case of typing some text at a very fast rate, so that shouldn't be a problem. None of the above is 100% foolproof, but a combination of these heuristics (properly weighted, and ultimately subservient to explicit language switching done by the user) should yield reasonably good results, helping a lot of users. People who would fall into all the corner cases not covered by this will have to make do with addons (and language-handling code should be written in such a way that it's easy to override its behaviour with an addon). I know that I'm asking for a lot, but this bug was filed 17 years ago...if there was an easy fix for it, it would have been fixed by now.
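A weighted combination of the heuristics listed in this comment might be sketched as below. The weights, the character table, and the `guess_language` function are all hypothetical and untuned; the sketch only shows how the signals could be combined while keeping an explicit user choice authoritative.

```python
# Hypothetical weights; real values would need tuning on actual usage data.
WEIGHTS = {"previous_word": 1.0, "characters": 2.0, "keyboard_layout": 1.5}

# Characters that hint at a language (toy table, far from complete).
DISTINCTIVE_CHARS = {
    "de": set("äöüß"),
    "pl": set("ąćęłńśźż"),
    "en": set(),
}


def guess_language(word, candidates, previous_language=None,
                   keyboard_language=None, explicit_language=None):
    """Guess the language of a word from weighted hints."""
    # An explicit choice by the user always overrides the heuristics.
    if explicit_language is not None:
        return explicit_language
    scores = {lang: 0.0 for lang in candidates}
    for lang in candidates:
        if lang == previous_language:
            scores[lang] += WEIGHTS["previous_word"]
        if lang == keyboard_language:
            scores[lang] += WEIGHTS["keyboard_layout"]
        hints = DISTINCTIVE_CHARS.get(lang, set())
        if any(ch in hints for ch in word.lower()):
            scores[lang] += WEIGHTS["characters"]
    # Ties resolve to the first candidate, i.e. the user's preferred order.
    return max(candidates, key=lambda lang: scores[lang])


print(guess_language("Straße", ["en", "de"], previous_language="en"))
print(guess_language("hat", ["en", "de"], previous_language="de"))
```

Restricting `candidates` to the languages the user has configured keeps the search space small, as the comment suggests.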
Just to add my two cents and share some user experience... Constant-manual-language-switching is getting so annoying, that I started to consider either turning spell checks off completely in Firefox, or switching to Chrome. I was trying to deal with it, until I was using mainly Polish and English (+ some Spanish or German from time to time). But 3 years ago I met my future wife from Slovakia, and this year we got married. Switching between 3 languages is really tedious work... Plus I'm still learning Slovak, and she is learning Polish, so in communication between us we mix all 3 languages quite frequently if not constantly... ;]
Forgive me if I'm repeating someone else, but I might suggest the following options for spell-check at compose/edit time (or before sending). #1) allow 1 or more spell dictionaries to be used throughout the entire document. Popup indicates dictionary source. Say foo is a word in Spanish and English. User has word: 'faex'. Popup could say: fax (Eng) faux (French, Eng-import) foe (Eng) etc ... #2) -- This option only seems practical when composing in a tag-supporting language (HTML, or more strictly, HTML5, XML, RTF(?), etc). Add a "language" attribute with similar syntax & semantics as a font attribute, though the user could choose to send the tagged and text-equivalent formats. If the user sends in text-only, specifics of language in tags are lost in the sent copy, but it could be a choice (as is done now with HTML and text) to send both versions. Note: the tag in #2 _could_ be restricted to some subset of HTML tags (only P, or div or span...etc). #1 is likely most compatible w/existing email, though #2 has strong benefits if the editor was shipped standalone as a primitive document compose tool.
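Option #1, a popup that labels each suggestion with its dictionary, can be sketched as follows. The word lists are toy data, `suggestions_by_language` is a hypothetical helper, and `difflib.get_close_matches` from the standard library stands in for a real suggestion engine.

```python
import difflib

# Toy word lists standing in for real dictionaries (hypothetical data).
DICTIONARIES = {
    "en": ["fax", "foe", "face"],
    "fr": ["faux", "faire"],
}


def suggestions_by_language(word, languages):
    """Collect suggestions per language so the UI can label their source."""
    grouped = {}
    for lang in languages:
        matches = difflib.get_close_matches(word, DICTIONARIES[lang], n=3)
        if matches:
            grouped[lang] = matches
    return grouped


# Render the popup entries as "suggestion (language)".
for lang, words in suggestions_by_language("faex", ["en", "fr"]).items():
    for suggestion in words:
        print(f"{suggestion} ({lang})")
```

Keeping the results grouped rather than merged lets the menu show them in separate sections per language.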
Let me repeat myself to prevent users abandoning Thunderbird and switching to Chrome. What "L A" explains in #1 is basically what Evolution does (so, code is available). It presents the suggestions by language. But one should try hard to implement such a feature at the Hunspell level, to avoid repeating the same modification for every application that uses spell-checking (even this "bug" is unsure whether it is about Thunderbird or Firefox, and the same holds for other text processors). The set of languages used fits nicely in a permanent environment variable. A user will be glad to have Hunspell accept/correct, at any time, all the languages he knows. That has nothing to do with tagging the language of a document, or of a paragraph, or of a phrase, or of a word for text mixing languages at those levels, like computer or other science texts mixing English with another language. Auto-recognizing the language is inappropriate if based on words containing mistakes that one tries to correct, and is complete nonsense with those fine mixes of languages. Good luck Thunderbird and Firefox.
Mass bug change to replace various 'parity' whiteboard flags with the new canonical keywords. (See bug 1443764 comment 13.)
Keywords: parity-chrome
Whiteboard: [parity-chrome]
Priority: -- → P3
There is a KDE library for automatic language detection: https://github.com/KDE/sonnet This will be useful in thunderbird.
As someone who used to use Chrome, this is one of two features that I miss. I thought it wasn't that hard.

I think that suggestions from different languages should appear separated in the context menu. (see screen-shot of Chrome)

Priority P3? For non English natives this is probably the most annoying missing feature of Firefox.

Chrome can mix and match languages in the same sentence without having to install extra extensions or dictionaries.

Hello!
There is an extension that makes Firefox automatically recognize the language I am writing in: https://github.com/kimsey0/FirefoxAutoDict
Would it be possible to include such a functionality inside Firefox? It would be already a big, big improvement.

Also, the dictionaries download page of Firefox is so confusing.

Unfortunately, Firefox provides a subpar experience for people who write in more than one language, which is quite a lot. This scenario is perfectly handled by Chrome and Safari, so I think it could be feasible for Firefox to improve it.

I have to agree with others here. It seems this is not really that interesting for the Mozilla team (many of them native speakers, I reckon), but for us foreigners, who frequently type in multiple languages (at least two, but usually three or four), it is a big inconvenience to always need to switch the language manually. And when we combine languages (such as writing in one language with some captions/quotes in English), we simply have no way to have it change automatically as we write.

I know, probably no one from Mozilla likes a comparison with Google and Chromium, but after a couple of years of using Chromium alongside FF, Chromium's automatic language switching for spell-checking has worked, honestly, flawlessly.

And what's more, this is not one of those advanced spell-checks that would catch syntactic errors such as combining singular and plural ("one languages") and the like; this is simply dictionary data matching. I think this should be implemented.

By the way, as Matt suggested,
https://github.com/kimsey0/FirefoxAutoDict
works, however it is only 50% there: it will switch languages depending on what language you start typing in, but after that it will not switch to another one automatically anymore. Still better than nothing, I guess...

I am still using chromium based browsers because ff doesn't want to fix it, wth

Thanks for the https://github.com/kimsey0/FirefoxAutoDict lads

That is merely a patch; I hope the Mozilla team will not take it as meaning this is fixed. This is very much not fixed. I am 100% sure this is a top-3 essential feature for non-English speakers, or for anyone who speaks multiple languages and wants to be sure the spelling is correct.

I don't know if this is the same but spell checking seems to be failing. I write an email in English and have the British English dictionary selected and all the words are underlined! Still maybe this is a Firefox issue as all my words are being underlined here too! But that is whether I select British or American English. In Thunderbird if I select American all the red lines disappear.:(

@musiquegraeme: What you see is unrelated. Please file a new bug, with exact reproduction steps in a new profile.

(In reply to musiquegraeme from comment #175)

> I don't know if this is the same but spell checking seems to be failing. I write an email in English and have the British English dictionary selected and all the words are underlined! Still maybe this is a Firefox issue as all my words are being underlined here too! But that is whether I select British or American English. In Thunderbird if I select American all the red lines disappear.:(

I guess that this has been fixed by bug 1671764 if you use Firefox.

I'm working on some patches for Bug 1402822 to support multiple dictionaries. I'll take this bug too, because there may be further work required once those land.

Assignee: nobody → dminor
Depends on: 1761425
Depends on: 1763123
Severity: normal → --
Assignee: dminor → nobody