Wikipedia talk:Naming conventions (use English)/Archive 1

Accents

About (for German umlauts -> letter+e; for French, dropped accents): what does that mean? French accents are perfectly acceptable and accepted in a title, for all I know (e.g. Coup d'état). I don't know about umlauts, but shouldn't at least the claim about French accents be removed? --FvdP 22:16 Feb 19, 2003 (UTC)

That bit is very unclear, and seems to contradict the bit that says that "Western European accents are acceptable". I see nothing wrong with using accents and umlauts, but I think redirects should be put in place at the same title with the accents removed, so that people typing names without accents into the URL box end up at the right place, and so that they don't accidentally create a new article under such a name without realising that an article already exists. -- Oliver P. 22:26 Feb 19, 2003 (UTC)

Moved from Wikipedia:Village pump:

Maybe we should update the somewhat off topic note in Wikipedia:Naming conventions (use English) "Use Latin-1 (ISO 8859-1) for the title of an article. Note not UTF-8 nor 7 bit ascii. While Western European accents are acceptable, non-Western European accents need to be dropped. (for German umlauts -> letter+e; for French, dropped accents). (Based on the post from Brion)" Docu 07:12 Feb 28, 2003 (UTC)

I've moved the section from the article and pasted in full here for review and updating. -- sannse 11:46 Mar 2, 2003 (UTC)
Use Latin-1 (ISO 8859-1) for the title of an article. Note not UTF-8 nor 7 bit ascii. While Western European accents are acceptable, non-Western European accents need to be dropped. (for German umlauts -> letter+e; for French, dropped accents). (Based on the post from Brion)
If you believe that the bit in parentheses (just above this paragraph) is about what redirects should be created, then it makes sense. In that case, however, it's still incomplete; we should also have redirects from German with the diæreses dropped (in addition to redirects from German with the umlauts spelled with Es), since this is also sometimes found in English. -- Toby 03:36 Mar 3, 2003 (UTC)

Current practice (discussion now moved to Wikipedia talk:Special characters) seems to be to use Latin-1 spellings for the articles and redirect from basic-ASCII spellings, i.e. (for German umlauts -> letter+e; for French, dropped accents).

Sample: Resume would redirect to Résumé

As not everybody necessarily has all characters on their keyboard, I suppose it's OK to create articles with basic ASCII and have someone else move them later. Wikipedia:Special characters should answer these questions. That article needs updating though, as it was written when the Wikipedia software couldn't handle accents at all. Docu 12:37 Mar 2, 2003 (UTC)
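The redirect practice described above (Latin-1 title, redirects from plain-ASCII spellings) can be sketched roughly as follows. This is only an illustration, not how the Wikipedia software works: the function names and the umlaut table are hypothetical, and letters with no Unicode decomposition (such as Ø or æ) are left untouched.

```python
import unicodedata

# German convention mentioned above: umlaut -> base letter + "e".
# (Hypothetical helper table, limited to the umlauts themselves.)
GERMAN_UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def strip_diacritics(title):
    """Drop combining marks after canonical decomposition (é -> e, Ó -> O)."""
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def umlauts_to_e(title):
    """Spell German umlauts with a trailing e (ö -> oe)."""
    return "".join(GERMAN_UMLAUTS.get(ch, ch) for ch in title)


def redirect_candidates(title):
    """Return plain-ASCII spellings that could redirect to the accented title."""
    candidates = []
    for variant in (strip_diacritics(title), strip_diacritics(umlauts_to_e(title))):
        if variant != title and variant not in candidates:
            candidates.append(variant)
    return candidates


print(redirect_candidates("Résumé"))
print(redirect_candidates("Kurt Gödel"))
```

Both spellings are produced for German names because, as noted earlier in this thread, English sources sometimes drop the umlaut and sometimes spell it with an e.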


Not using accents is crazy in some contexts. In the Irish language, a word's very meaning and pronunciation can depend on an accent (usually a fada, as in é, pronounced e fada). Drop the fada and you change the word, its meaning and its pronunciation. It is OK to create a fada-less redirect, but under no circumstances could you possibly put the main article under what would be a wrongly spelled word. For example, Ireland's fifth president is Cearbhall Ó Dálaigh. Drop the fadas and you might as well put him in as Karol Odawle for all the sense it would make. A redirect from Cearbhall O'Dalaigh would be OK, but the main article would have to be spelt in the correct manner. JtdIrL 03:47 Mar 3, 2003 (UTC)

So use diacritics in the Irish Wikipedia. If his name cannot be written in the English alphabet, he probably isn't encyclopedic enough to deserve an article in the English Wikipedia anyway, so we don't need to worry about the title. The English one should be the article title; the foreign diacriticals are okay as redirects. Not having the English version in the title also makes it less likely that it will appear in the article itself—and one thing I can never figure out is all the fools who want to hide this information from people using various search engines by not even including the normal English alphabet versions of various words. Gene Nygaard 22:09, 4 October 2005 (UTC)
And I bet this conversation thought itself dead after 2½ years... That comment about search engines is actually a very interesting point. While I doubt that Gene's suggestion of moving all articles with ñs, ás, ês, and ös in their titles to the stripped-down equivalent locations would receive consensus support, perhaps we should institute a policy to include the stripped-down name somewhere in the article -- remmed out, or in a dummy link, somewhere where Google and its ilk could find it. Hajor 22:45, 4 October 2005 (UTC)
Google is very smart about diacritics. Searching (from Canada) for Cearbhall O'Dalaigh gives me 715 results, with the top hit being the English Wikipedia page Cearbhall Ó Dálaigh. Searching for Cearbhall Ó Dálaigh gives me 21,100 results, the second one being the WP page. Searching for Cearbhall O Dalaigh with the diacritics stripped gives me 13,400 results, the WP page still at number two. Clearly, using the precise name as the title hasn't hidden anything from Google. For some reason Google turns off these smarts if you're searching from the USA (at least it did the last time I discussed this with Wikipedians who were in the States).
Hajor, Google tends to ignore any text that's not visible on the page, because it's used by search engine spammers. Inbound links from quality sites score high though, so having those redirects on Wikipedia is important. Of course different spellings that are commonly used should appear near the top of the page, and will also contribute.
By the way, GN, the name-calling is out of place. Michael Z. 2005-10-4 23:29 Z
Valid point about Google ignoring hidden text; wasn't aware of that. And though I remember reading something about the different ways it handles diacrits between its various localized front ends, I don't recall the actual reason for it. I did just run a test, however, searching for /enrique-bolanos/ -- our Enrique Bolaños page came up No. 24 using google.com, No. 16 using google.com.mx, No. 20 on google.ca, and No. 45 on google.de (but the w:de article was placed at No. 1). So, it's certainly not broken; but anything we could do to get those results down to single figures would be worth considering. Hajor 00:03, 5 October 2005 (UTC)
Not only are there differences in various implementations of Google, but there are many other search engines as well. Nobody knows nor should be expected to know the precise details of how all these different engines work, and it is often impossible to determine exactly how they do work. Nobody tells us all the details of the workings of their search engines. And furthermore, the page search functions on browsers are usually strictly literal. Gene Nygaard 18:22, 7 October 2005 (UTC)

Naming policy

I find it somewhat inconsistent. E.g. some insist that Danzig should be used instead of the Polish Gdansk whenever the majority of the population was German or when the city was part of German states. OTOH, throughout the encyclopedia only Vilnius is used instead of Wilno. The same goes for L'viv and Lviv, used almost consistently instead of Lwow or Lwów. Does that mean that the policy is to use German names whenever possible and local names in other cases? szopen

There's been a lot of discussion of this issue on one of the mailing lists recently, do a search on "Polish" in the Wikipedia-l archive for October and you'll get some of it. It seems it is a recurring subject, but doesn't yet have a page of its own. Perhaps we should have one in the Meta? Andrewa 14:17, 4 Nov 2003 (UTC)
See Wikipedia:Naming conventions (use English). We use what the place is best known as in English. It isn't an issue of whether it's a German or Polish or whatever name. Angela
This is exactly the problem with what the "English" name is. Germans keep saying that the German name is the "English" name. Also, a few have insisted on inserting German names into articles on places, e.g. Warsaw, which were temporarily under German government. I am not saying this is right or wrong. I don't care much, in fact. But I want consistency. If such a policy is adopted, then Brandenburg would also have to include the name Brenna, Vilnius Wilno, L'viv Lwow etc. szopen
I don't think there is a problem. The "English" name is what the majority of English-speakers use, not what people want English speakers to use. Daniel Quinlan 23:51, Nov 4, 2003 (UTC)
Well, it is a problem if it is hard to determine what English speakers currently use. Maybe Danzig/Gdansk is a good example, maybe not, but in any case: Danzig is the form of the name you encounter when you read books on World War II, where disputes over Danzig and a Danzig corridor ultimately ignited the war in Europe. In a historical context the town is mostly not re-named to its Polish equivalent Gdansk. On the other hand, the Solidarność movement was of course ignited in Gdansk - not in Danzig.
But usually, the issue of what's English usage is rather uncontroversial, and it is more an issue of courtesy or political correctness whether names in other languages should be listed as more or less equal alternatives. In hot spots, as currently on Polish Wikipedia pages, this easily leads to long lists of names in half a dozen different languages.
--Ruhrjung 03:18, 5 Nov 2003 (UTC)

FWIW, this isn't only a Polish issue either. Ukrainians want to use Kyiv as the name of the city usually known in English as Kiev, considering the latter, as a transliteration from Russian, to be an offensive relic of Russian imperialism. However, Kiev is by far the more common English name, so there is disagreement over whether "common, but possibly offensive" or "official, but very uncommon" should take precedence. See also Talk:Kolkata for previous discussion of Calcutta vs. Kolkata and related. --Delirium 23:57, Nov 4, 2003 (UTC)

Exactly. Those are Kiev and Calcutta. The preferred local names can and should be noted, but the articles should be placed where common English usage dictates — we cannot please everyone, so common is safer (and much more friendly to users) than figuring out what is least offensive to everyone. Note that English usage often does eventually change to conform to the whims of other countries. For example, Peking became Beijing (although that was an issue of transliteration, perhaps not imperialism, although some might interpret Peking as being an imperialist name too). Daniel Quinlan 00:37, Nov 5, 2003 (UTC)
Well, since that article is located at Kolkata, this is currently somewhat inconsistent. That one's a little more tricky though, because it is called Kolkata commonly in India, which has English as an official language (and ~200m English-speakers). --Delirium 01:53, Nov 5, 2003 (UTC)
Good point! Kolkata is perhaps more correct as per Wikipedia:Manual of Style. I think India counts as an English-speaking nation. Daniel Quinlan 07:25, Nov 5, 2003 (UTC)

Any tips on naming conventions for non-English organisations? There is temptation by some to translate them where familiar words are used, not just transliterate. Compare Parliament of Sweden with Tweede Kamer or consider Partij van de Arbeid vs alternatives such as "Labour Party (Netherlands)" "Labour Party of the Netherlands", or Front National vs "National Front". Consider also Nederlands Instituut voor Oorlogsdocumentatie vs a translated "Netherlands Institute for War Documentation". ( 20:19, 16 Dec 2003 (UTC)

Another example is Justice and Development Party (Turkey). ( 11:49, 20 Dec 2003 (UTC)
See also Talk:Socialist Party. For Swiss parties, the names are consistent with those used elsewhere, probably by the parties themselves [1] and the World Factbook. -- User:Docu

I'd like to propose a change. We are supposed to use the commonest form in English. How do we know which is the commonest? The Wiki method seems to be to use a crude Google search. Shouldn't we use the form most commonly used by intelligent, knowledgeable, literate speakers of English? Plenty of morons say "Milano" (mangling the pronunciation of course) but the English name is Milan, for example.

Secondly, some Wikipedians will accept the English name for a city, but not for the surrounding area directly named after it. E.g. Seville and its province. I see no logic in this whatsoever. Don't we need an express policy on this? —Chameleon 01:54, 12 May 2004 (UTC)

Good or Bad?

I found that more and more foreign-made words are being poured into the English WP as entries. I don't know if these words make sense to English speakers, and I am wondering whether it's good or bad for the development of the English WP. --Yacht (talk) 10:17, Jul 17, 2004 (UTC)

Examples, please. David Remahl 10:31, 17 Jul 2004 (UTC)

I can't give you accurate examples, because English is not my mother tongue; I just come across some words that I don't think are normal English words, or that look foreign (like Führer, Fribytaren på Östersjön, Yuri, Yuzu etc.). I don't know if they are already widely used in English, or just neologisms (aren't there any corresponding English words for them? I don't even know how to read them). I am just worried that this may happen: speakers of every language create their own synonym entries in English.--Yacht (talk) 16:16, Jul 17, 2004 (UTC)

I believe the policy is to use the English name if it exists and it is more popular. Otherwise, the local name is used with any appropriate redirects. Dori | Talk 16:39, Jul 17, 2004 (UTC)

You could make a case that a large part of English is stolen foreign words! 'I', 'found', 'word' all come to Old English from proto-Germanic roots, 'foreign' and 'pour' are from Old French, 'made' comes to English from West Germanic, 'entry' is from Middle French. Wiki of course is Hawaiian, 'pedia' probably from Greek, misunderstood by Latin scholars. You were 'wondering', which comes to us through Old English from proto-Germanic, whether this affects the 'development' (a French word). Interestingly, no one seems to know where 'bad' comes from, but 'good' is another proto-Germanic word. I wouldn't let it keep you up at night. PS. Yacht is from Norwegian! Mark Richards 17:17, 17 Jul 2004 (UTC)

There's quite a well-known and oft-quoted saying: The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary. [2]. Führer is a commonly understood word from the events of the 1930s and 40s; the second example is the title of a Swedish book which I'm unfamiliar with, though the article gives a translation; Yuri I've not come across, while Yuzu apparently is a Japanese fruit, so it's not surprising that there isn't an English name for it. I thought Yacht came from Dutch! [3] -- Arwel 18:33, 17 Jul 2004 (UTC)

Could be Dutch - probably one of them stole it from the other! It's actually good to have the vigor which comes from importing words; it helps to keep language vibrant. The point about mapping Latin grammar is also good: the Victorians thought that Latin should be the model for languages. In Latin you CAN'T split an infinitive, so in English you shouldn't? Go figure. Mark Richards 18:49, 17 Jul 2004 (UTC)

The purity of English is one thing, while what I am concerned about is another. I hope people don't think the English WP is a romanized version of their own languages, and create thousands of entries in their romanized languages that should be in English (just as I created an entry Nie Zi for Crystal Boys, but later learned there is an English name for it, so I deleted Nie Zi, which may make no sense to English speakers). About Yacht, I think it's a German word, and that's why I chose it. :) --Yacht (talk) 18:03, Jul 19, 2004 (UTC)

In fact, Yacht is a German word, because German has pimped the cribhouse whore that is English so much that it catches its vocabulary like STDs.
BTW: It's never wrong to create a redirect from the original language term. -- Mkill 20:18, 3 December 2005 (UTC)

The problem with anglicisations

The problem is when there is no anglicisation of a term, the anglicisation is really bad, or the anglicisation has fallen out of favour (for political/historical/cultural reasons). Also, there may be two, or a number of competing anglicisations - which may lead to a preference for the most locally appropriate term.

All in all, you'll have to put up with, and expect, some non-English terms. Besides, as noted above, plenty of non-English terms have been absorbed into our vocabulary. (Coup d'état isn't exactly English!)

zoney  talk 19:40, 23 Aug 2004 (UTC)

I don't think anybody would object to using other names if there is no anglicisation, or it has fallen out of favour. These days you won't find many people insisting on Peking rather than Beijing, and I don't think Wikipedia should. It also doesn't mean that our preferred usage is set in stone for all time. In twenty years time it may be that most English-speakers understand 'Kolkata' rather than 'Calcutta' (sorry if I have the spelling wrong for the first). DJ Clayworth 19:48, 23 Aug 2004 (UTC)

English names or local names for universities ?

(I have posted a nearly identical paragraph on Talk:List_of_colleges_and_universities; please tell me politely if this double posting is wrong. I shall learn through my mistakes.)

I had thought of editing some pages related to French universities and institutions, and I feel that some coherence should first be given to the choice of name to be used for each of these.

Reading this discussion page (notably the paragraph about parliaments and political parties), I am rather more puzzled than I was before.

If I browse through the various per-country university lists, I can only conclude that there has been no common policy as to what to do with university names, e.g.

  • for Germany or Italy, the English name is systematically used, with a redirect from the local name;
  • for Belgium, the English name redirects to the local name, in both French-speaking and Dutch-speaking Belgium.

The page List_of_colleges_and_universities_starting_with_U is a special mess, with its mix of translated names (e.g. "university of Modena") and native names (e.g. "universita degli Studi di Pavia").

I think that some more precise instructions would be valuable to future editors. --French Tourist 21:54, 30 Aug 2004 (UTC)

Conventions for transliteration

Name your pages in English and place the native transliteration on the first line of the article

Many languages can be transliterated (romanized) by more than one system (e.g. pinyin, Wade-Giles, and others for Chinese). Is this article a good place to keep track of the conventions that are used on Wikipedia?

Is it understood that some names that have an established spelling may not be transliterated using the chosen convention? Do any languages use more than one system? With the impending adoption of UTF-8, is it better to just use the native text and IPA?

I'll start a list here from what I know; please add any systems that seem to be commonly accepted.

Michael Z. 23:53, 2004 Sep 20 (UTC)

Transliteration methods used on Wikipedia

  • Russian
  • Ukrainian (see Romanization of Ukrainian)
    • Many personal names are rendered in English phonetically. This is usually very close to transliteration by the BGN/PCGN system, which is quite intuitive to pronounce for English speakers. Some names have traditional spellings that come from other languages, like Polish, German transliteration, etc.
    • For geographic names in Ukraine, the Ukrainian National system is used. For historic reasons, many names are also presented in Russian, Polish, etc.
    • Linguistics topics often use "scholarly", or "scientific transliteration".

Diacritic marks in article titles

Hi, I'm trying to work out what standard policy should be towards diacritic marks in article titles. There doesn't seem to be a standard policy at the moment - or if there is, this page doesn't really give any clear guidance. I do see some discussion above here, but I'm not clear if rough consensus was reached.

I am aware that Wikipedia:Naming conventions (technical restrictions) says we are restricted to using "ISO Latin-1" in article titles - I'm asking about whether we should use those diacritic marks that are included in that character set.

On the one hand, there seems to be an argument that we should file the articles without the diacritic marks, since i) this is the English Wikipedia, and ii) most English speakers are oblivious to diacritic marks, and usually don't know how to type them.

On the other hand, the policy here for transliterations is that unless an Anglicized form is in common use in the English-speaking world, use the straight transliteration. So extending this would say that for things which are commonly written in the English-speaking world without the diacritic marks, write them without, and for things which aren't well known, use the diacritics.

I was originally of the first school (actually, originally originally I was of the second school, feeling we should always file articles under the full correct names, since redirects are always available, but I got borged into the policy here), but am now moving towards the second one. What does everyone else think?

Of course, one should always add a redirect from the other form (to prevent duplicate article creation and aid linking, if nothing else).

Anyway, please speak up, and let's work something out and write it down! Noel (talk) 11:41, 16 Nov 2004 (UTC)

I'd really like to hear from people about this, but so far nothing. If nobody objects, I will go ahead and edit the page to contain the policy I outlined above - use Latin-1 diacriticals for those names which are commonly written in English with them, others without. Noel (talk) 00:07, 19 Nov 2004 (UTC)
I think the policy is (and ought to be) to list things under the "English" name (i.e. without diacritics where they are not commonly used in English - e.g. Zurich not Zürich, although it seems to have gone back to Zürich at the moment - and with diacritics where they are - e.g. déjà vu not deja vu - and with diacritics where there is no common English usage, and with redirects from one to the other in any event). -- ALoan (Talk) 10:14, 16 Nov 2004 (UTC)
I'm all for using the correct name, as much as is technically possible. Redirects should handle any problem I can think of for finding articles. There may be some authoring problems to shake out, but hopefully they don't come up too often.
* Users might think all diacritics are acceptable, and start moving articles to Unicode names. I think this can be a pain to fix when it happens.
* We might start having heated debates about which is the correct English name for an article.
Does anyone have any idea when we're moving to UTF-8? That will resolve the technical issue once and for all (at least until someone tries to use Etruscan and Klingon names).
Michael Z. 02:36, 2004 Nov 19 (UTC)
No idea on removing the technical limitation to Latin-1 in article titles, but anyway that is orthogonal to the issue here. (I.e. Latin-1/UTF-8 only affects what kind of diacritic marks can be used in titles, not whether or not we should use them in En:, which is what I'm trying to get straight here.) However, your point is worth remembering - when I add notes on this topic to the article, I will add a warning against using UTF-8, with a link to Wikipedia:Naming conventions (technical restrictions).
As to the "what's the correct name" arguments, yes that's a problem, and it's one of my big problems with the current policy. (Were it up to me, the article would always be at the full correct name, with redirects as useful, but...) However, we seem to mostly do OK with this issue on spellings, so I imagine we can handle the diacritic ones too. Noel (talk) 13:59, 19 Nov 2004 (UTC)

This issue has been debated elsewhere, for instance on Wikipedia:Village pump (policy). Perhaps it ought to be given its own discussion page, as it will no doubt come up again. / Uppland 14:37, 19 Nov 2004 (UTC)


I think there is a widespread and common Wikipedia practice, even if not written, to leave characters in article names which are present in character set "ISO Latin-1".

See these examples of existing Wikipedia titles from various European languages:

La bohème, Résumé, Crêpe, Château, Déjà vu, Ménage à trois (French), Hans Christian Ørsted, Søren Kierkegaard, Tor Nørretranders (Danish); Václav Havel, Emil Zátopek (Czech), Kurt Gödel (Czech-Austrian), Ernst Thälmann, Max Müller (German), Béla Bartók, Ferenc Mádl, Ernö Rubik (Hungarian), Niccolò Machiavelli (Italian), José Saramago, Diogo Cão, Luís de Camões, Nuno Gonçalves (Portuguese), Ildefons Cerdà, Antoni Gaudí, Salvador Dalí, Benito Pérez Galdós, Luis Buñuel (Spanish); Sabiha Gökçen, Turgut Özal (Turkish).

I don't really see why it is a question. It is sure, however, that it should be written down explicitly because there is a minority among names (up to 5% to my experience) which don't follow this practice.

--Adam78 23:08, 19 Nov 2004 (UTC)

For Slovenian, Croatian, Bosnian and Serbian names (these peoples use languages that are written in modified Latin), it has been a fairly common practice to use the name without the Latin2 diacritics in the article title on en:, but use the same diacritics in the article text. Occasionally the same issue applies to Macedonian and Bulgarian names, which are originally Cyrillic but can use Latin2 to transliterate. The letters š/Š and ž/Ž are causing some confusion because Latin1 includes them (cf. titles and redirects at Miroslav Krleža or Vinko Žganec), but that's two out of three that are missing so they're generally avoided. The "wrongtitle" template has brought some more attention to the issue, but I don't think it should be used for this purpose. --Joy [shallot] 22:43, 18 Jan 2005 (UTC)
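Which letters a title can carry depends on exactly which charset is meant. As a rough check only: Python's `latin-1` codec is strict ISO 8859-1, while `cp1252` is the Windows superset that is often treated as "Latin-1" in practice. Whether the wiki software of the time behaved like either codec is an assumption here, and `fits` is a hypothetical helper name.

```python
def fits(title, charset):
    """Return True if every character of the title exists in the given charset."""
    try:
        title.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Compare strict ISO 8859-1 with the Windows-1252 superset.
for title in ("Résumé", "Miroslav Krleža", "Vinko Žganec", "č ć đ"):
    print(title, fits(title, "latin-1"), fits(title, "cp1252"))
```

Under these two codecs, š and ž are only available in the Windows-1252 superset, and č, ć and đ are in neither, which may explain part of the confusion described above.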

sure, it's easier to drop the diacritics, and I created many diacritic-less article titles myself. I just don't see a reason to remove the correct diacritics if somebody went to the trouble of inserting them. dab () 09:00, 19 Jan 2005 (UTC)
I do. Since I wrote the major part on some Croatian folks with diacritics in their surnames (for instance, Miroslav Krleža), intentionally without diacritics (Miroslav Krleza originally) in the title page so that the page may be "visible" Google/Yahoo-wise around the world for those who don't have haceks on their keyboards, I strongly object to the change of the name-form in the title (not in the body of the text). In short, various Slavic and Germanic names and surnames are more visible to the English speaking/reading world that way. Of course, in other national wikipedias (Croatian in my case) the names do appear as they are in original form, and this is the Cro wiki case. But in the case of En wiki I strongly disagree with haceks in the title pages that, essentially, obstruct the accessibility and visibility of articles. Mir Harven 16:57, 22 Jan 2005 (UTC)
One could also argue it's unfair to people with č, ć and đ in their names that their names can't be written exactly properly in the title, while those with š and ž can. Mir Harven also has a point with regard to visibility - search engines are more likely to pick up our article if it's got both the title with and the title without the diacritics shown in an article. For Latin2 names, it's even advisable to have a version without the diacritics (even one day when we get UTF-8 on en:, at least in redirects) because that will prevent them from being dropped from English-language searches. --Joy [shallot] 23:26, 25 Jan 2005 (UTC)

After further discussion at Wikipedia:Naming conventions (technical restrictions)#charset issues, I created Template:Titlelacksdiacritics. Please follow up there, and at Template talk:Titlelacksdiacritics. --Joy [shallot]

Since the Mediawiki 1.5 upgrade also happened to introduce UTF8 support, this can be put ad acta. We've been converting the vast majority of pages using {{titlelacksdiacritics}} to the proper title and dropping the template. --Joy [shallot] 29 June 2005 17:29 (UTC)

Proposal

See earlier discussion: Wikipedia talk:Naming conventions (use English)/diacritics

The project page guidelines are fine. This is about the "less clearcut" cases.

Spelling of non-English terms on Wikipedia

See Transliteration for background on the concepts transliteration vs. transcription.

Convention: Name your pages in English and place the native transliteration on the first line of the article unless the native form is commonly used in English.

  • Geographical names (cities, rivers etc.)
    • if there is a common English spelling, this is to be used for the article title. examples are Cologne (vs. Köln) and Rome (vs. Roma).
    • If a name (or a name's romanization) is sometimes spelled with diacritics, and sometimes stripped of diacritics, it is likely the more academic publications that include diacritics. If at least some major publications can be shown to have a policy of including diacritics, the article title should be the version with diacritics, but article texts are free to use either version (e.g. Zürich; see Talk:Zürich).
    • Chinese tone diacritics? (Shang Ti)
  • Concepts
    • use native (or romanized, in the case of non-Latin native spelling) spelling: reichsfrei
    • if the term is common as a loanword in English, use the commonly used English spelling (doppelganger, mantra, robot,... cotton)

Comments

Please address specific points of the proposal (The "Zürich" case is the most borderline/controversial imho). dab 15:12, 20 Nov 2004 (UTC)

I would prefer:

If a name (or a name's romanization) is sometimes spelled with diacritics, and sometimes stripped of diacritics, editors are encouraged to amend the diacritics in the text. Using characters of the ISO 8859-1 character set is similarly encouraged for article titles.

As I see it, the critical issue is whether a policy should encourage or discourage usage of Latin-1 characters. I could live with both, but this might be a good moment to express this a tad more clearly.

--Johan Magnus 16:59, 20 Nov 2004 (UTC)

I would prefer:

All names should be in the 26 letters of the English alphabet. If the page is about a topic which can be spelt with the Latin alphabet with diacritics, then the name with diacritics should be shown on the first line in the format:
  • English Alphabet (language name:Foreign Alphabet)
Eg
  • Zurich (German:Zürich)
This rule should be used for all names unless it is universally spelt in English using diacritics. After the first line the primary author of the page is free to use which ever version they feel most comfortable with.
To allow search engines to find all pages in Wikipedia using an English speaker's standard keyboard, on pages which use a word which can be spelt with diacritics, at least one visible instance of the word on the page must be spelt without diacritics. For the rest of the page, secondary authors should follow the lead of the primary author and not start an edit war over the use or non-use of diacritics.

--Philip Baird Shearer 14:30, 21 Nov 2004 (UTC)
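The first-line format and the "at least one visible plain spelling" rule proposed above could be checked mechanically along these lines. This is only a sketch of the proposal, not an existing Wikipedia tool; all function names are hypothetical, and the diacritic stripping only handles letters that decompose into a base letter plus combining marks.

```python
import unicodedata


def ascii_form(name):
    """Strip combining marks after canonical decomposition (ü -> u, ñ -> n)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def first_line(english, language, native):
    """Build the proposed opening: English Alphabet (language name:Foreign Alphabet)."""
    return f"{english} ({language}:{native})"


def has_plain_spelling(page_text, name):
    """Check that the page shows the name at least once without diacritics."""
    return ascii_form(name) in page_text


opening = first_line("Zurich", "German", "Zürich")
print(opening)
print(has_plain_spelling(opening + " is a city in Switzerland.", "Zürich"))
```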

I don't understand:

a topic which can be spelt with the Latin alphabet with diacritics, then the name with diacritics should be shown on the first line

Inside articles, UTF works fine (we're only restricted to Latin-1 in article names), so there's no reason not to give the full proper name on the first line (as you suggest), in whatever script. Yes, it should also give a Romanization (which may use characters which are not in the Latin-1 set - a lot of Japanese names are like that, with the macrons on o and u, not available in Latin-1), so people know how to say it.

However, this is all somewhat afield from article titles, which is what I enquired about. When it comes to titles, I think "universally" is a bit much - if the preponderant use is in either direction (with or without diacrits), then go that way, I would say.

Noel (talk) 02:26, 22 Nov 2004 (UTC)

Noel, for most languages which are not based on the Latin alphabet there are well known rules for transliterating into the 26 letters of the English language. Whether one uses Beijing or Peking is not something I am addressing, because in such cases there is no "Latin alphabet with diacritics" issue and I for one am not debating the Beijing or Peking issue. I am addressing the issue of other languages which use the Latin/Roman based alphabet with diacritics which are not used in English. In the case of page names, unless the name is (near) universally spelled with diacritics in English, the name should be in the 26 letters of the English language (please note there is no mention here of character sets). (The reason for "near universally" is that with words like Zurich, even where the "preponderant" usage is clear, some will argue that it is not, and that leads to edit wars, eg Zurich.) In the case of page names, if the word is also known to some in English with diacritics then the word with diacritics should also be shown eg:
  • El Nino (Spanish:El Niño)
The technology used to display the word in non-English letters (or English letters for that matter) is not my primary concern, and as technology changes this may change. It makes sense to use the lowest common denominator which will do the job, and if you were to say to me that is ISO 8859-1 then it would seem sensible to recommend that. If some cases need a fancier solution then use the simplest solution which fits, along the lines of Occam's razor. This is an extension of my basic argument, which is a cultural one not a technical one, but it is the application of the same idea. Philip Baird Shearer 14:59, 22 Nov 2004 (UTC)

This is about article titles. Since there will be wikilinks in the article texts, it is also about article texts (piped links notwithstanding, it will be most common practice to just link to the article title). I would finally like PBS to come clear on his opinion: You would like to avoid all non-ascii titles, for any article type? I do believe there will be a strong consensus against this, but can we vote on this, just to get it out of the way (or, if it is decided to go this way, we can tell the software people that the UTF-8 transition is not required)? This will mean we will not only have Zurich, Catalhuyuk, but also Kurt Godel, Bekesy Gyorgy, Albrecht Durer, Neue Zurcher Zeitung and so on. Note that this would be a major change compared to the de-facto practice on WP. The de-facto practice on WP is that we give the name in its native spelling if possible in Latin-1, with a redirect that is stripped of any diacritics.

PBS, I will not reply to any argument involving (a) search engines or (b) keyboards or (c) grammar again. It's simply not the issue. I have pointed out very often that these are entirely beside the point without your showing the slightest inclination towards even recognizing the existence of my objections.

Logically, we will proceed as follows:

  1. decide if we want to disallow non-ascii letters in titles altogether (which I believe is PBS's position)
  2. if we decide to allow non-ascii titles, we should decide if non-ascii titles should be used in redirects exclusively (i.e. the article itself will always be at an ascii-only title) — this could also be a position of PBS's, I am not sure.
  3. if we decide to allow articles with non-ascii titles, we would have to decide if we want to restrict them to Latin-1. This used to be a technical restriction, and I see no reason to keep it up once it is possible to have UTF-8 titles.
  4. finally, if we make it this far, we can agree on when to use native spellings which may or may not include diacritics. The general idea will be preponderance. My proposal above attempts to give a clearer definition of such preponderance. Mere google hits are not a good idea (especially for more obscure subjects, WP mirrors will make for the majority of hits, and will be difficult to filter out. Also, we do not want to give random blogs weight equal to reputed publications. It will be a question of figuring out which spelling is preponderant in serious, encyclopedic or scientific English publications).

can we get points 1-2 out of the way now, please, and then either scrap this proposal and go to a (imho, archaic 1970s-style) "ascii only" policy, or decide to make use of 21st century's UTF and concentrate on the (granted, difficult) policy discussion of how to handle them? dab 09:51, 22 Nov 2004 (UTC)
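The de-facto practice described above (native spelling where Latin-1 allows it, plus a redirect stripped of diacritics) could be sketched roughly as follows. This is only an illustration: the function names are made up here, and it assumes Python's standard unicodedata module rather than anything the wiki software actually does. Letters that have no Unicode decomposition, such as the "ł" in Piłsudski, are left alone by this approach and would need a hand-made mapping.

```python
import unicodedata


def strip_diacritics(title: str) -> str:
    """Drop combining marks after canonical decomposition (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fits_latin1(title: str) -> bool:
    """True if every character of the title exists in ISO 8859-1."""
    try:
        title.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


for name in ["Zürich", "Kurt Gödel", "Pāṇini"]:
    # Article would sit at `name` if Latin-1 allows it; the redirect is the stripped form.
    print(name, fits_latin1(name), strip_diacritics(name))
```

Under these assumptions "Zürich" and "Kurt Gödel" are usable as Latin-1 titles with "Zurich" and "Kurt Godel" as redirects, while "Pāṇini" is not a valid Latin-1 title at all.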

KEEP IT SIMPLE. If there is near universal usage of funny foreign squiggles when using a name with funny foreign squiggles then use it otherwise use English letters. So "all" is not what I am advocating. Your "preponderant" argument is too convoluted and open to misinterpretation and misunderstanding. Look at Zurich as an example. My rule is unambiguous.
You say that you do not want to hear technical arguments, but in the second point you number you rabbit on about "ASCII". There are good technical reasons for not using letters which do not appear on typical English language keyboards. However this is not my main argument. My main argument is cultural. dab, you have not answered the question about what your mother tongue is. For a moment imagine that you are not from a country with four official languages with a large number who have a working knowledge of English. Instead imagine you are a child in an inner city council estate in any medium sized town in Britain, or the projects of any large American city, or a fisherman who lives in Invercargill in New Zealand, or a farmhand's wife who lives near Abilene, Texas, or a child in Sapporo, Japan (who is learning English at school). Keeping names to the ENGLISH alphabet makes the Wikipedia more accessible to ALL those who may choose to use this encyclopaedia. The Sun newspaper uses about 500 words to describe the world. This encyclopaedia should be accessible to people who read nothing else. Using the format:
  • El Nino (Spanish:El Niño)
covers the education remit of any Encyclopaedia. Suppose I do not know how to spell El Nino. It might be "ell neno". If I use Google to do the search it will correct my spelling, but it will not find the Wikipedia page El Niño (other than through the one external link which has el-nino in the link name). Why limit access with complicated rules when "keep it simple" helps people by making Wikipedia more accessible? So here are my suggestions:
  1. unless an article name is near universally spelt in English articles using diacritics, then the article name on the page must be spelt first using the 26 letters of the English alphabet. If the article name is sometimes spelt in English with diacritics, it should be shown with diacritics on the first line of the article eg:
    • El Nino (Spanish:El Niño)
  2. On the rest of the page it can be spelt with or without diacritics at the discretion of the primary author.
  3. On pages where it is near universally spelt in English articles using diacritics, then the word must also appear in the 26 letters of the English alphabet on the first line.
  4. The page link name may or may not be in the 26 English letters providing there is at least one redirect which is.
  5. On other pages there must be at least one[a] visible instance of the article name spelt without diacritics. On the rest of the page it can be spelt with or without diacritics at the discretion of the primary author. I am not sure how to define this, but the 26 letter version must be visible in the text as in [[Zurich|Zurich]] not [[Zurich|Zürich]]. Perhaps someone else can suggest some words to cover this.
[a] note that "the one" does not have to be the first one, it could for example, be in a "see also" section at the bottom. Philip Baird Shearer 14:59, 22 Nov 2004 (UTC)

for your reference: ASCII; Diacritic; ISO 8859-1; UTF-8 (if I'm using a term you don't understand, look it up rather than accusing me of 'rabbiting on' or obscurantism). It seems that we do not entirely disagree. It is rather a question of "nearly universal" vs. "preponderance". I agree that diacritic-less spellings that are frequently found in reputable publications should be used. I do not think that google hits should be decisive. Ok. let's find some common ground. Do you agree that personal names should be treated differently (prefer native spelling) from geographical names? We may need guidelines for cases where other encyclopedias can be shown to support either spelling (as researched by Jallan on Talk:Zurich). I agree that we could go either way in the Zurich case, but I insist that Pāṇini (personal name) is useful. dab 15:20, 22 Nov 2004 (UTC)

Actually, Google seems to be smart enough to find "el Niño" when I search for "el nino" [at google.ca]. For me, the second hit only has the name spelt with tilde over the "n", on the page and in the window title. I'm sure the technology will only improve.
Note: http://www.google.com.au http://www.google.co.nz http://www.google.co.uk http://www.google.co.za all work the same way, they differentiate on diacritics eg a search on "Zaire" and "Zaïre" return different pages. http://www.google.ie http://www.google.ca seem to be set up as bilingual (one of which uses diaeresis). Germany's http://www.google.de returns similar results to google.ca and google.ie. So it is a perceived cultural difference by Google, not a technical one. Philip Baird Shearer 16:54, 19 Jan 2005 (UTC)
To me it seems that missing diacritics in names are like using hyphens for dashes, underlining for italics, using SAMPA for IPA, only more common than most of these. It's a poorer version of the real name, routinely used out of convenience, ignorance, or technological limitations, even in professional pursuits like journalism. Similarly, Germans often write "oe" for "ö", and "ss" for "ß". It's acceptable in most applications, but being an encyclopedia, WP ought to endeavour to be academically correct about such things.
"Ñ" is not a "Spanish letter". N is a letter in the English alphabet set of Latin letters (or 'alphabet') conventionally used for the English language (sometimes described as 'English-language' or, loosely, 'English'), whether it has a tilde over it or not. If we can display the tilde over a name where it's used, then we should. [updated for literalists Michael Z. 18:27, 2004 Nov 22 (UTC)]
By the way, UTF does not work inside article text on the English Wikipedia. UTF characters must be stored using Latin or numeric entities. Microsoft's CP-1252 extensions to ISO-8859-1 (including curly quotes and dashes) will work in some browsers, but these characters won't display correctly in all cases, and eventually will be destroyed when someone makes an edit with a browser that sticks to the ISO standard.
Michael Z. 16:17, 2004 Nov 22 (UTC)
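The CP-1252 point can be checked with a few lines of Python. A minimal sketch, assuming Python's built-in "cp1252" and "latin-1" codecs stand in for what browsers of the time did; the sample string and variable names are only illustrative:

```python
import unicodedata

curly = "\u201cquoted\u201d"  # text wrapped in curly double quotes

# CP-1252 assigns the curly quotes to bytes 0x93 and 0x94.
cp1252_bytes = curly.encode("cp1252")

# Read as strict ISO 8859-1, those same bytes are C1 control codes, not quotes.
as_latin1 = cp1252_bytes.decode("latin-1")
print(cp1252_bytes, unicodedata.category(as_latin1[0]))

# The curly quotes themselves have no ISO 8859-1 encoding at all.
try:
    curly.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```

This is why such characters survive only as long as every editor's browser treats the page as CP-1252.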
It seems to depend on which country you are in as to what the defaults on Google are, see Talk:Zürich for more on this. In the UK it differentiates the two. So the searches do not return the same thing. Philip Baird Shearer
I was asked by Johan Magnus to join the discussion here, but I'm afraid it is pointless. As far as I know the English wikipedia is the only wikipedia out there that does not support the ISO 8859-2 encoding. That is, barely any of the Central European characters are supported. Because of that we've been using the no-diacritics work-around for quite some time now. I must say it is a pain in the back to have to write almost every internal link twice (for instance [[Jozef Pilsudski|Józef Piłsudski]]). It is also absurd that the leading wikipedia is the most backward in terms of character encoding support. However, I have no idea what could be done about it.
The problem is the same with article titles. It is not a problem to place a redirect from the "non-diacritical" version of the name to the proper name. However, unless the English wiki is upgraded nothing can be done.
Anyway, when this happens - please let me know, I'd be happy to join this discussion. Regards - Halibutt 17:05, Nov 22, 2004 (UTC)
please look at Wikipedia:Naming_conventions_(technical_restrictions)#Latin-1, where it is clearly stated that en: will support UTF. The point of this discussion is to come up with a policy of what is desirable, provided the technical limitations are lifted (i.e. we are looking into the (near) future here). UTF redirects are already possible, btw (Pāṇini), but as a title, it will be mangled ([4]). dab 17:39, 22 Nov 2004 (UTC)
I'm not sure when the change to UTF-8 will happen - see the comments at WP:RfD#July 25. The reason it comes out mangled is that the relevant Unicode characters are two-byte characters, which the En: site interprets as two separate characters. Noel (talk) 18:33, 22 Nov 2004 (UTC)
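The mangling can be reproduced outside the wiki. A minimal sketch, assuming the title arrives as UTF-8 bytes and is then read back as Latin-1, one character per byte (the variable names are illustrative; in UTF-8, "ā" takes two bytes and "ṇ" takes three):

```python
title = "Pāṇini"

# What a UTF-8 browser would send for the title.
utf8_bytes = title.encode("utf-8")

# What a Latin-1 site sees: every byte becomes its own character.
mangled = utf8_bytes.decode("latin-1")
print(len(title), len(utf8_bytes), len(mangled))

# Nothing is lost, so the mangling can be undone by reversing the two steps.
recovered = mangled.encode("latin-1").decode("utf-8")
print(recovered == title)
```

The six-letter name turns into nine Latin-1 characters, several of them unprintable, which matches the mangled title dab links to.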
PS: If the name wasn't written out in non-diacritic letters in the article, I'd have no idea what it is, because on my browser, the third character (ṇ) shows up as a nice little square box, just like all the other Unicode characters it doesn't understand. Noel (talk) 18:43, 22 Nov 2004 (UTC)
This sometimes happens when the display font is missing the character in question, even though the system software and web browser are technically capable of displaying it (and automatically substitute another font in most cases). I've updated my monobook.css style sheet (at User:whomover/monobook.css) to use 'arial unicode ms' as the first choice font. My Mac's 'lucida grande' looks better, but doesn't include italics. Michael Z. 19:11, 2004 Nov 22 (UTC)
I am well aware of why it happens. And obviously my browser didn't "substitute another font", otherwise I wouldn't have seen that little box, right? My point is that the average English-speaking person who comes here for information is likely to run into the same issue. So people can stick correct diacritics all over the place, and all it will do is result in a lot of people seeing nothing but a little [] (don't know the Unicode for a box) all through the article. Noel (talk) 00:53, 23 Nov 2004 (UTC)
Which OS/browser are you using? Under Windows 2000/XP, both IE6SP1 and Mozilla Firefox 1.0 have no problem with editing and displaying Unicode characters. The 'common' diacritics are included in standard Western ISO set ('Western Europe and United States' language) that is installed in ANY localization of Windows by default, so there should be no problem with ISO 8859-1 characters in article names.
The symbols that could possibly display as a box would be non-Western scripts, i.e. Arabic, Armenian, Baltic, Central Europe, Cyrillic, Georgian, Greek, Hebrew, Indic, Japanese, Korean, Simplified Chinese, Thai, Traditional Chinese, Turkic and Vietnamese. But even if they display as boxes, they are stored and even copied and pasted to the text editing box just like any Unicode character and then converted by Wiki to decimal Unicode references (ex. А), so there's no chance to lose them in the edition (with the annoying exception of Unicode accent mark, ́ which reverts to ' and in wrong position when pasted, but this can be dealt with and other universal diacritical marks copy/paste just fine).
And if you do want to see what these characters look like, just go to Control Panel, Regional Settings, check additional languages and provide the installation CD. All the needed fonts and pre-Unicode conversion tables will be copied in a minute. --DmitryKo 08:00, 13 Mar 2005 (UTC)
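The conversion to decimal Unicode references mentioned above can be imitated in Python. This is a sketch under the assumption that the target encoding is Latin-1; it uses the standard "xmlcharrefreplace" error handler and html.unescape, not the wiki's own code, and the function name is hypothetical:

```python
import html


def to_decimal_refs(text: str) -> str:
    """Keep Latin-1 characters as they are; replace the rest with &#NNNN; references."""
    return text.encode("latin-1", errors="xmlcharrefreplace").decode("latin-1")


stored = to_decimal_refs("Zürich, \u0410")  # U+0410 is CYRILLIC CAPITAL LETTER A
print(stored)

# A browser (or html.unescape) turns the references back into characters.
print(html.unescape(stored))
```

Because the reference is plain ASCII, it survives an edit even in a browser that cannot display the character itself.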
btw, there is no "English alphabet", (yes, the link is blue, but incorrectly so, c.f. Talk:English alphabet) unless you mean the Anglo-Saxon runes. N is a letter of the Latin alphabet, and Ñ is a letter of the Latin alphabet with a Diacritic. dab 17:44, 22 Nov 2004 (UTC)
"There is no English alphabet" LOL. Philip Baird Shearer
Yes, Philip, you know - it's really the Latin alphabet - full of all those letters like W and U that you see in Latin inscriptions all over the place on Roman monuments, etc. (How well I remember them from the 5 years of Latin I took!) Noel (talk) 00:53, 23 Nov 2004 (UTC)

KEEP IT SIMPLE use the same rule for people and places:

  • Panini (ancient Hindu Pāṇini)

Apart from anything else one would end up with inconsistencies like Baron "X von Zürich" born in Zurich which is an unnecessary complication.

  • "reputable publications" and "preponderance" are too vague and open to interpretation eg Zurich/Zürich. Keep the rules simple: unless nearly universal. Otherwise the tabloid press in Britain may call some French man a fraudster without using diacritics on his name, but because he contributed one article to a "reputable publication" some people will insist that the name goes in with diacritics. Don't bother replying that tabloids in Britain are reputable... ;-)
  • We could end up with two people with the same name one spelt with diacritics who is mentioned in "reputable publications" and his son, with the same name, who is not. That would be silly. Keep it simple "unless nearly universal".
  • Also what about place and personal names which are translated for the first time? Keep the rules simple, use the 26 letters in the English alphabet followed by the word with diacritics.

BTW To make a point, rabbit means talk, any inner-city London kid could tell you that, as could most other children in England, but they might not know about diacritics! Keep the rules as simple as possible. Philip Baird Shearer 18:11, 22 Nov 2004 (UTC)

I know 'to rabbit' is Cockney, from the Monty Python Lumberjack sketch. thank you. Panini is just as "ancient Hindu" a name as Pāṇini, it's just spelled more lazily (and it's certainly not English, just because you wrote it in the "English alphabet"). dab 18:17, 22 Nov 2004 (UTC)
'two people'? You mean two articles on the same person? This already happens anyway, and one is turned into a redirect when discovered, as 'appened with Naram-sin recently.
I'm getting tired of this. I rest my case, 'oping for other voices arguing some of my points. Including that it is a bad idea to try to "keep it simple:". dab 18:21, 22 Nov 2004 (UTC)

I agree with Philip Baird Shearer. I'm sick of being told that writing "Zurich" instead of "Zürich" is "lazy" or "unacademic" - I write it that way because that's its English name, and I have both common usage and the Oxford Manual of Style to back me up on that. I see no reason to make an exception to the "use English" rule on an English-language encyclopaedia just for diacritics, which are just as much an inter-language variable as spelling and pronunciation. Proteus (Talk) 13:25, 23 Jan 2005 (UTC)

sure, if it was as clear-cut as you think. E.g. nobody suggests changing Rome to Roma. Clearly, the English spelling is Rome, so Rome it is. Zurich is more difficult, as has been pointed out by people on that talk page for ages, with some publications choosing Zurich, and others, including the Encyclopædia Britannica and the Columbia Encyclopedia, choosing Zürich. We are here to delineate the names that have no English spelling, and Zurich is a borderline case. I am happy with Zurich, Nuremberg, Prague, etc., but we need to find guidelines here that are applicable universally (see the examples in the suggestion). dab () 18:29, 23 Jan 2005 (UTC)
And very few (if any) Wikipedians expect you to use characters you don't have on your keyboard. The issue is rather that when someone else has changed Malmo to Malmö, there should be a guideline to tell if this change is an incremental step on the road to the ideal, or if it's not. /Tuomas 17:40, 24 Jan 2005 (UTC)
It gets more complicated. I wrote a page on the Second Battle of Zurich where all the references to Zurich were changed to Zürich. The justification for this was "For consistency with Zürich. Get consensus for a change to "Zurich" and I'll be happy to move these battles. Gdr 19:54, 2004 Nov 3 (UTC)" See Talk:Second Battle of Zürich. This is what triggered my interest in this subject. It annoys me that native English speakers in many countries will not find my work if they search with Google, because someone has decided that because the Zurich page has been moved to Zürich the Battle pages should be moved as well. My views on "keep it simple" (strip diacritics and add the word with diacritics on the first line if appropriate) are expressed above, but a possible compromise would be to use the same rule as is used for UK and US English: Keep whatever is used by the primary author. Philip Baird Shearer 19:21, 24 Jan 2005 (UTC)

Wikipedia:Naming conventions (use English) is fine for just about all non-geographical and non-personal subjects - for example, no-one should be bothered to list a German or Russian spelling in Aluminum, Automobile etc. But I'm fine when a name or place 1) contains accented/diacritic Latin symbols, as long as the language is from the Indo-European family, or 2) is transliterated in accordance with modern phonetically based guidelines, even when they conflict with commonly accepted names, unless there are specific guidelines, as for monarchical/noble titles. Both Hermann Göring and Moskva, mentioned in other sections, are perfect examples.

I personally see no point of keeping Moscow spelling, which originates from ancient name of grad Moskov and was never used in Russian language since the first England embassy under Alexey Mikhailovich. I don't expect the article to be moved under Moskva right now, although I believe the common spelling would gradually change to a phonetically correct one and so will the title. Witness the power of redirects - everyone can still hyperlink to Moscow in their articles, but when someone clicks it he arrives at Moskva.

I also don't have a problem with accented German, French etc. characters. There's little phonetical difference between Göring, Goering or Goring, so I don't see why we shouldn't use the national name. As an example, some articles on Russian persons with a commonly accepted name, such as Khruschev, have already been transliterated according to the Wikipedia guidelines with pre-existing versions listed at the beginning, and I've yet to see any problems. Witness the power of redirects again - searches or hyperlinks by common name lead to a properly transliterated name, no pain!

Similarly, I hate hearing Churchill's name spelled as Chyerchile (Черчиль) in Russian, while Chyorchil (Чёрчил) would be phonetically correct; Hudson spelled as Goodzone (Гудзон) instead of Khadson (Хадсон); Paris spelled as Paridge (Париж) instead of Paree etc.

To summarize, I'm all for phonetical transcriptions in both personal and geographic names (with IPA notation of native name) and the use of ISO 8859-1 in titles, but redirects from common English names so that they could be properly linked. GOD BLESS REDIRECTS!!! --DmitryKo 22:48, 12 Mar 2005 (UTC)

I'm not that bothered, but my hackles rise when someone claims that Wikipedia should strive for "correctness". Take the recent Göring → Goering move suggestion. I'm in two minds about this. Usually I think we should just leave the article where it was created, because quite often both the native spelling and the anglicised spelling are common in English speaking text. Here however you have a name that is almost universally spelled as Goering by English speakers. In that case my feeling is that we should put it where the English speaker expects to find it and keep a redirect for people who know, and will use, the native spelling. To say that the native spelling is "correct" in English is simply wrong. And most English language keyboards and browsers don't make the accents and diacritics easily accessible and English speakers cannot learn the accenting conventions of every language in which names of people and places are likely to be presented, so it really isn't a good idea to keep accented names except where the accented version is used almost universally by English speakers (Mallarmé and Fauré, for instance, may be examples where the accented form is more common in English). --Tony Sidaway|Talk 13:51, 21 Jan 2005 (UTC)

I agree that if Goering was overwhelmingly more common, it would be a case like Charlemagne where English use trumps native spelling. But, is it? It seems both are common enough. Interestingly, on Hitler_has_only_got_one_ball#Variants we have Göring next to Goebbels... dab () 14:09, 21 Jan 2005 (UTC)
  1. Original version of the article you cite "Goering has two but very small". Somebody "corrected" it.
  2. Google: "about 233,000 English pages for goering"
  3. Google: "about 37,400 English pages for göring"

Thus Göring is outnumbered more than 6 to 1 by Goering on English pages according to Google. --Tony Sidaway|Talk 14:40, 21 Jan 2005 (UTC)

Funny, Google gives me "about 293,000 English pages for Göring" and "about 292,000 English pages for Goering". Yes, I know Google works differently in the USA and UK. But English is an official language here. Why do people keep claiming that Google search results prove something, but they rely on the way Google works specifically in two or three countries, and reject results in the other 150 or so? I'd like to see the policy state that for use in resolving disputes, only Google results in the USA are valid. Or else drop the whole pretence that they prove anything anyway. Michael Z. 2005-01-21 17:12Z

Yes, I know Google works differently in the USA and UK.
If you really are aware of this, then I fail to see your point. You clearly are aware that sometimes Google makes a translation between one spelling and another. However the British google (http://www.google.co.uk) gives a good picture of the sheer weight of numbers on the side of the Goering spelling, because it distinguishes explicitly between the spellings. Don't take my word for it, go and visit the individual pages and see for yourself. --Tony Sidaway|Talk 03:52, 22 Jan 2005 (UTC)
Google hits are a fine tool for getting a very rough idea. In this case, they show that 'Göring' is less common than Goering, but still reasonably frequent. Any other conclusions, especially about 'correctness' may not be drawn from this. dab () 17:20, 21 Jan 2005 (UTC)

Michael, what you are saying is: let's set up a search that does not differentiate between the two words and then say that they have the same usage! Using Google as set up in Australia, New Zealand, South Africa, and the UK (and probably other English speaking countries), they all differentiate, and all show that the difference in word usage is about five or six to one. Secondly, dab, you and I disagree about correctness (see above), but in the past you have said that authoritative references should be taken into account. A primary source for much of what is written uses Goering, and the references at the end of the article are also two to one in favour of Goering, with the dissenter being David Irving, who is hardly a credible source, so I assume you agree with using Goering. I would put the article under Hermann Goering with a first line of:

Hermann Wilhelm Goering (German:Hermann Göring)...

If Goering was not the most popular entry in English then I would put the article under Hermann Goring

Hermann Wilhelm Goring (German:Hermann Göring)...

--Philip Baird Shearer 18:15, 21 Jan 2005 (UTC)

What you are saying is let's set up a search that differentiates between the word as represented by two technologies, and then say that they have different usage!
Actually, I'm saying that I'm tired of seeing Google results misused and misquoted a dozen different ways. And now you're telling me that the way Google has configured search for the U.S. is correct and the way they've set it up for most of the world is wrong.
In one instance, discussing whether an article should be named Old Russian language or Old Ruthenian language, a user claimed that a Google search for 'old+russian' proved overwhelmingly that the former was correct. After I pointed out that thousands of pages about old Russian vodka or old bartenders who mix a black Russian didn't say anything about 10th century languages, the user, and others continued to insist that Google had proved their point.
This is just one frustrating example of many that I've seen personally. Google's methodology is untransparent, and the number of occurrences of a word, phrase, or cluster of words (each of which may give wildly different results) usually has zero correlation to complex and different questions like "what is the most common current usage?" or "what is the most historic usage?" of something. But still, people see numbers, which look very authoritative, and they become convinced that something's been proven.
Google searches shouldn't be endorsed as a method for determining most popular usage, or anything else. Michael Z. 2005-01-21 21:32 Z

Nobody's talking about endorsing any one method. The fact is that Goering is by about six-to-one the most common spelling in this case, however, and this is easily shown by using tools such as Google. While the tool isn't endorsed, this doesn't mean we ignore the facts just because Google is a good method for demonstrating them. If instead I go to my newspaper site, The Guardian, I find hundreds of Goerings and one Göring (from an article written by an actor who played the role in a play in which the German spelling was used). --Tony Sidaway|Talk 19:16, 22 Jan 2005 (UTC)




[Moving this here from the other Göring discussion. / Uppland 20:04, 21 Jan 2005 (UTC)]

I would like to suggest a partial compromise which would work in anglicizing some names: I suggest that in those cases where a person has actually lived and worked in an English-speaking country and can be shown to have regularly used an English form of his or her name (i.e. not the occasional use inhibited by an English typewriter), the English form of the name would be fine. This has no relevance for the Göring article, but would solve the Masaryk article as T. G. Masaryk actually lived in England for some time. I checked the Times Digital Archive and got one hit for Tomas Masaryk, one for Tomás G. Masaryk, but 18 for Thomas Masaryk which is a more fully anglicized form, including the obituary from 1937. (There are more hits for Masaryk: "President Masaryk", "Professor Masaryk" etc.). Interestingly enough, the obituary for Eduard Beneš in 1948 spells Beneš in the Czech orthography (not plain Benes), but mentions Masaryk with the English Thomas.

I don't think we should expect any consistent application of Czech orthography from the typographers at The Times, but the last example shows that the English press did not consistently strip diacritics from foreign names (the article also mentions a couple of accented Czech place-names), and also suggests that The Times, at least on this occasion, seems to have made the distinction I am proposing here between a person who had himself used his name in an English form and somebody who had not. It also shows that Tomas Masaryk, without diacritics, is really the worst alternative, being neither Czech nor English. / Uppland 18:34, 20 Jan 2005 (UTC)

Does the Goering example mean that the majority of humans have two names: one in their native language and one in English? Halibutt 20:11, Apr 7, 2005 (UTC)

Tucson

I don't see how these guidelines aren't followed for Tucson... the only thing that seems to be missing is a redirect at Cuk Ṣon. I'll leave it to Node ue to make one :) --Joy [shallot]

The native name is not given on the first line of the article, as prescribed. --Node 06:55, 22 Mar 2005 (UTC)
Well, I know a lot of people who are native to Tucson who call it Tucson, but ignoring that, the O'odham placename is given in the first paragraph, which has been my suggestion for all of these locations: giving the non-English placename a line in the first paragraph, explaining its significance (most people won't understand why Cuk Ṣon is listed after Tucson, or why Vaṣai S-veṣonĭ is listed after Scottsdale with little context.) The RFC (as found on the Tucson talk page) in Tucson wound up in favor of including the name as a line in the first paragraph. This is precisely what is prescribed, as "The body of such an article, preferably in its first paragraph, should list all of the other names by which the subject is known, so those too can be searched for." (This was how it was when an anonymous user, presumably node, placed the Tucson statement). I feel this should be applied to many other place names in the area, as mentioned on Talk:Tucson, Arizona, but node_ue has run a successful campaign of reversion-until-I'm-too-tired-of-it-to-care. His inclusion of this false information in this policy is just another example of the frustration. Anyway, I'm removing it, since it's wrong, as I've explained. kmccoy (talk) 23:54, 8 Jun 2005 (UTC)

Time to discard this policy

It's time to discard this policy. We say "Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form", but this is just not being followed. Almost every time an argument comes up over it, the policy loses out.

Look at Wurttemberg, Riksdag, Goering, Tweede Kamer, Zurich (that one's particularly ludicrous - walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich"), etc, etc, etc, etc, etc. (The latest one is Spion Kop.)

I'm tired of getting in arguments with people, trying to apply this policy, only to have it ignored. It wouldn't be so upsetting if we just said "use the local name, with a redirect from the common English version, and mention the English version in the opening sentence"; I would quite happily go along with that.

But it's really trying to have the policy say one thing, and then do something different in practice. It's time to document reality, which is that every page is done on a case-by-case basis, and it doesn't matter what name is most common in the English-speaking world. Unless people are prepared to actually follow this policy, and rename Zurich and all the rest, I'm going to change the page to give the policy actually followed in practice, which is "no uniform policy".

And no, this is not "disrupting Wikipedia to make a point". I am dead serious about changing this policy, because it's not what we are doing. Noel (talk) 19:31, 1 Feb 2005 (UTC)

There are a committed bunch of people who are desperate to turn this from the "English Wikipedia" into the "International Wikipedia (which, by the way, happens to be written mostly in English)", who contest every move to an English name. I'm sick of being told by people who aren't native speakers of my language that I'm using it wrongly (or even that the language itself is somehow wrong), and that every time I mention a foreign person or place by using its English name (or with English spelling, which normally leaves out diacritics) I'm an ignorant buffoon who should learn to be more "international". However, I'm convinced that the vast majority of the userbase of this encyclopaedia supports the current policy (and I suspect that a well publicised poll asking if we should use the English names of things in the titles of their articles would massively confirm it), so I see no need to give in to a committed minority just because they're far more aware than most of the existence of WP:RM. There should, however, be a change at RM requiring people to specify which policy supports their vote, so that votes like "use official name" and "that's how it's spelt in its own country" can just be ignored. Proteus (Talk) 20:02, 1 Feb 2005 (UTC)
Another example of forced internationalism on the en.wikipedia is on the page Ionian Islands, which (until I changed it just now (and who knows how long the change will last!)) used only Kerkyra for Corfu in their list of names despite a Google hit rate for Corfu of 1 million to 60k for Kerkyra in English, but on the de:Ionische Inseln page only de:Korfu is listed, as it is with French, in Dutch it is "Korfoe (Kerkyra)". In English we benefit with a much larger article presumably because Greeks have contributed, but why force Kerkyra on English speakers? Philip Baird Shearer 14:00, 3 Feb 2005 (UTC)
It's fashionable to simplify English words and leave out the diacritics, but that doesn't make the e in café a "foreign" letter. It's still a Latin E in an English word. Although these orthographic forms are rarely used these days, attaché, naïve, and coördination are all English words.
I think it is more than a fashion to strip diacritics (besides the fashion in en.wikipedia is to put them in). I think it is the process by which a borrowed word becomes anglicized. This is also seen in the way that the word is pluralised. For example What is the plural of Virus? (This is not the place to debate this, but an example of a word which is being absorbed into English). This is of course not unique to English and happens in any dynamic language. Philip Baird Shearer 13:03, 8 Feb 2005 (UTC)
actually, I've never seen coordination accented at all. The other two I have seen.
It's usually written co-ordination these days, to indicate the syllable break; my spell checker complains about un-hyphenated coordination. I think the New Yorker magazine still uses the old form—I consider that a bit of an archæic affectation, but still English. Michael Z. 2005-02-8 03:32 Z
The name Zürich, or Zurich, isn't "English". It's a German name, whether you simplify the orthography, or not. Some non-English place names are English names, like Moscow, Rome, and Bombay. The policy as written, is aimed at names like these, and at names in languages in a different writing system, hence the instructions to transliterate which is not necessary in a Latin-alphabet language. It's also written to cover a lot more than proper nouns like place names.
I am in Manitoba, Canada, where English is an official language. There are municipalities in my province called Taché and St François Xavier. Are you going to tell me that these spellings are incorrect? They're not "foreign" places. You can write them Tache and St Francois Xavier, if you like, but that wouldn't make them any more or less English.
Obviously, you interpret the situation differently. So then let's just disagree. Or go ahead and hold a poll to settle this, if you think that would be profitable (but if you are seriously proposing that we ignore votes of people who don't share your personal world-view, then I'd just as soon see someone else do the counting). But please stop using the oh-so-injured tone, and please stop trying to label Wikipedians as non-native speakers or "internationals". The English-language Wikipedia is international. People from English-speaking countries don't have some sort of special privilege here, and we never will. Here: I'm a native English-speaker, and I disagree with you; so you can now stop blaming someone else for your woes.
Michael Z. 2005-02-1 21:34 Z

You appear to be replying to Proteus, but I would very much like to hear your response to my original point.

The page says "use the most commonly used English version of the name for the article" (my emphasis). I don't think there's much question that the average person-in-the-street in the English-speaking world would spell it "Zurich", not "Zürich", but the article is at "Zürich", nonetheless (after lengthy debate, too). So clearly the policy described in the page does not describe the actual current state of affairs. I think this page should document our actual policy. Therefore, it needs to be changed to say something like "it's decided on a page-by-page basis".

(BTW, the Bombay page is now at Mumbai. Rather ironic, therefore, that you should have picked that as an example of a page to name using the "English name"....)

I care far less what policy we pick than that we pick one, document it accurately, and stick with it. Noel (talk) 22:20, 1 Feb 2005 (UTC)

It's at Zürich because of the people who were involved in the discussion. The vast majority of users won't even have been aware that the discussion was taking place. (16 vs 14 hardly demonstrates encyclopaedia-wide consensus. I mean, pages on VfD regularly get more votes than that.) Proteus (Talk) 22:25, 1 Feb 2005 (UTC)

If you read the rest of the sentence quoted above, it says "unless the native form is more commonly used in English than the anglicized form". That is a good policy, and I think it's usually followed. "Mumbai" follows it pretty well; Mumbai is now the normal name of the city (for example, on the BBC website), a change which I can remember happening in the last few years. It's a good policy and we should stick to it. (Mumbai is now no longer an example on this page). DJ Clayworth 22:35, 1 Feb 2005 (UTC)

Okay, I didn't explain that as well as I could have, and perhaps I reacted a bit strongly based on other debates in the past.
The way I see it, Zürich and Zurich are the same spelling, but with a finer difference in orthography, or typography. It's acceptable to omit the diaeresis in casual writing, but I prefer the 'full' version of the name when it's being presented in an authority like WP. The difference is like the difference between typewriting and typesetting. In an email (or Wikitext) you might type vertical "quotation marks" and typist's apostrophes, double-hyphens -- for parenthetical dashes -- type four hyphens for a divider, and omit diacritics in some foreign-language names. But in a printed encyclopedia (and in the ideal Wikipedia renderer) these same entities would appear as “curly quotes” and typographer’s apostrophes, long dashes—of the en or em dash variety—horizontal rules, words like attaché and naïve, perhaps encyclopædia and fœtid, and Latin-alphabet foreign names would retain their diacritics, tildes, cedillas, etc. (incidentally, some transliteration systems for languages like Arabic and Chinese require diacritics)
Wikipedia:Manual of Style#Use straight quotation marks and apostrophes --Philip Baird Shearer 08:10, 4 Feb 2005 (UTC)
(Bombay and Mumbai was a bad example, because they're not the same name in different languages, but two different names. One was the Portuguese name "Good Bay", the other a Hindi name after a goddess. We used to use one, now we've switched to the other.)
The gist is that I feel this example does reflect the policy ("Languages like Spanish or French should need no transliteration"). I just disagree with you that removing diacritics makes a name English.
I do agree that the policy could be made a bit more specific in some ways. On the other hand, there will always be debatable or edge cases, and the Wikipedia way seems to be to hash out the individual cases until consensus is reached. Michael Z. 2005-02-2 02:33 Z

Have you read anything I have written above about "keep it simple" and the reason for doing so? It is not incorrect to drop funny foreign squiggles when writing a word in English. The examples you give are just as correct if they are spelt as Attache, Naive, Encyclopaedia. An example I used before was El Nino which used to show up because one of the external links had the word "el-nino" as part of its link name. Now it does not, so that page will not be found by the many people in many English speaking countries. What is the point of producing pages which limit access to them by English speaking people because some Wikipedia editors like funny foreign squiggles? Why not use the format for such pages as:

El Nino (Spanish:El Niño)

Using this format would fulfil both ease of access for most English people and an educational remit of an Encyclopaedia. At the moment there is often nothing to indicate why funny foreign squiggles are appearing on a word. If such a format was used then at least someone not familiar with the word would learn that it was German, Spanish, French etc, which is not something most pages do at the moment.

At the very least people who like funny foreign squiggles on words could respect the primary author and leave words alone if they appear without them.

I wonder why is it that there is a debate on umlauts on Zurich in the en.wikipedia and not in fr.wikipedia?

I do not think it is time to discard the policy but to strengthen and to modify it so that both usages are included. See my 5 point suggestion under "Keep it simple" in the #Comments above. Philip Baird Shearer 03:30, 3 Feb 2005 (UTC)

It's not easy to take you seriously or reply respectfully, when you insist on referring to diacritics in the English and other languages as "funny foreign squiggles". Michael Z. 2005-02-3 16:18 Z
When diacritics are used on a foreign word in a sentence in that foreign language then they have a significant meaning and are not "funny foreign squiggles". Philip Baird Shearer 08:10, 4 Feb 2005 (UTC)

I'm tired too, Jnc. The reason we're not making progress is that people still refuse to make the distinction between transliteration and language. Dropping diacritics is a question of (English) orthography, not one of the English language. Zurich is an English orthography for Zürich, but that doesn't make Zurich an Anglo-Saxon name.

Ionian Islands on the other hand is the English name for Ionia Nesia, which is the "English" diacritic-less orthography of the transliteration of Greek Iónia Nēsiá, in the Greek alphabet Ιόνια Νησιά. Because there is an English name, in this case, we do not use Ionia Nesia. This (and nothing else) is what we mean by "use English".

Can we at least agree to separate these cases, and argue about two unrelated policies, one about English, and the other about diacritics? I just refuse to continue to respond to comments that argue diacritics are un-English, and therefore "no diacritics" equals "English". So, no, Zürich is no breach of this policy, and "El Nino" is not "English for El Niño". dab () 15:31, 3 Feb 2005 (UTC)

I would like to, but it is part of a continuum: Kerkyra for Corfu, Göring for Goering, Zürich for Zurich. What about a compromise that the first version of the word must be diacritic free and after that it can be either depending on the primary author? BTW the English spelling checker I am using (in Mozilla) throws up every one of the italic words as an error but not the English equivalents. It has no suggestions for Kerkyra and suggests umlaut stripped versions for the other two. Philip Baird Shearer 08:10, 4 Feb 2005 (UTC)

I agree that the policy should be changed into 'Use Latin names and redirect from common English names' for both personal and geographic names in foreign-related topics. Naming an article in accordance with local phonetics wouldn't stop anyone from using its English counterpart for both referencing and hyperlinking, but it should provide a better insight into the culture of a particular region. Wiki is not paper and not an ASCII-based database of the past. I can't see why a (misspelled) anglicized name is better than a native name, because there are Mighty Redirects™.

A single potential trouble with diacritical marks is that every Unicode symbol can be combined with diacritic post-nominal codes, #769 to #869. This means that acute accent letters Á and Á look the same, but the corresponding Unicode character strings are different - in the first case, it's a single character Á but the second is a two-character combination of A&#769;. The Wiki software should be updated to convert such double-character cases into a single-character symbol and also account for these differences when searching. Besides that, I don't see any other technical problem with diacritic or national symbols in article names, as long as they are Latin-based. --DmitryKo 11:13, 13 Mar 2005 (UTC)
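
The difference between the single-character and two-character forms described above can be illustrated with a minimal sketch in Python. It assumes only the standard-library unicodedata module and uses Unicode normalization form NFC to compose the pair into one character. The helper name canonical_title is hypothetical, and nothing here is claimed to be how the Wiki software actually treats titles.

```python
import unicodedata


def canonical_title(title: str) -> str:
    """Hypothetical helper: compose base letter + combining mark pairs (NFC)."""
    return unicodedata.normalize("NFC", title)


# One code point: LATIN CAPITAL LETTER A WITH ACUTE
precomposed = "\u00c1"
# Two code points: "A" followed by COMBINING ACUTE ACCENT (decimal 769)
combined = "A\u0301"

# The raw strings render alike but compare as different.
print(precomposed == combined)
# After normalization both forms compare as equal.
print(canonical_title(precomposed) == canonical_title(combined))
# Length before and after composing the two-character form.
print(len(combined), len(canonical_title(combined)))
```

Normalizing both the stored title and the search string in the same way would be one way to make the two forms match when searching; whether that suits a given piece of software is a separate question.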

this policy has its uses. It is intended to simply mean: Use the English name if there is any. It has nothing to do with transliteration, and this whole discussion was only possible because many people don't know the difference between English language and ASCII. The policy is useful, for example, to prescribe Pope John II instead of Pope Giovanni II, and Charlemagne or Charles the Great instead of Karl der Grosse. It should be made clear that these are the cases within the scope of this policy. dab () 12:21, 13 Mar 2005 (UTC)
The cases you mentioned are not geographical or personal names, but rather courtesy/noble titles and commonly accepted (nick)names. If you saw my reply in the Comments section, I specifically excluded these cases, at least for now.
That said, as more and more foreign personalities with no commonly anglicized name are added to the Wikipedia, it would put a great stress on some of the established English names. For example, one would have a hard time explaining why composer Nikolay Rimsky-Korsakov is properly transliterated from Russian Николай, but Tsar Nikolay I has to bear a Nicholas anglicization, why John Paul II article doesn't mention the official Latin/Italian names of the Pope, and most of all, why Charlemagne is more English than Charles the Great... OK, just kidding on the last one.
The universal adoption of a phonetic name of a person or place as in its original language should be the permanent solution; all of the other common names could be easily mentioned and redirected from to avoid confusion. Again, Wikipedia is not a printed-text encyclopædia of 1900s which could not have color, sound, animation, hypertext, multiple languages etc., not to mention search capabilities and millions of free editors eager to verify any article of their interest. --DmitryKo 14:24, 13 Mar 2005 (UTC)

Does this and it also:

  1. Adds the original language information.
  2. Adds a link to the article in the language if someone is interested.
  3. Most importantly it keeps the common English diacritic free version in the text.

This last one is critical because when someone moves the article to a name like El Niño they also change every instance of the name to that version, and if the first person does not, a later editor does. This means that the page in the English Wikipedia does not show up in a search unless the search engine wraps "O" "OE" and Ö, which not all English search engines do (e.g. Google.co.nz or Google.co.uk). People who are not English or have a good grasp of English need to understand that for the vast majority of native English speaking people, diacritics are meaningless and are not used when searching for a word.

Using redirects for "El Nino" to a page called "El Niño" is not sufficient because the redirect "El Nino" does not show up in a search external to Wikipedia. For this to happen the text must be embedded in the page. Philip Baird Shearer 15:53, 13 Mar 2005 (UTC)
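
What it means for a search to treat "El Nino" and "El Niño" as the same string, or as different ones, can be sketched roughly as follows. This is an illustration only, assuming Python's standard-library unicodedata module; the function names fold_diacritics and same_when_folded are hypothetical, and this is not a description of how Google or any other search engine actually works.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Hypothetical folding: decompose, drop combining marks, ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for characters that are not combining marks
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def same_when_folded(a: str, b: str) -> bool:
    """True if the two strings match once diacritics and case are ignored."""
    return fold_diacritics(a) == fold_diacritics(b)


print(same_when_folded("El Nino", "El Niño"))   # folded forms match
print("El Nino" == "El Niño")                   # exact comparison does not
print(same_when_folded("Zurich", "Zürich"))     # folded forms match
print(same_when_folded("Zuerich", "Zürich"))    # the "ue" spelling is not covered
```

A matcher that compares exact strings behaves like the second line, and one that folds diacritics behaves like the first. The German letter+e spelling would need a separate rule, which this sketch deliberately leaves out.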

I can't see any critical problems you're talking about (as long as alternative English spellings are explicitly mentioned in the article).
On Google.com, El Niño gets 3,380,000 hits and El Nino gets 1,640,000 (1,240,000 and 1,280,000 for English language only respectively); the most relevant site for El Niño is http://www.elnino.noaa.gov; and the 48th relevant one is http://en.wikipedia.org/wiki/La_nina. If you search en.wikipedia, you'll get to the El Niño article as most relevant using both variants.
Is my Google smarter than yours? --DmitryKo 18:23, 13 Mar 2005 (UTC)

Which Google are you using? http://www.google.com.au http://www.google.co.nz http://www.google.co.uk http://www.google.co.za all work the same way: they differentiate on diacritics, e.g. a search on "El Nino" and "El Niño" return different pages. http://www.google.ie http://www.google.ca seem to be set up as bilingual (one of which uses diaeresis). Germany's http://www.google.de returns similar results to google.ca and google.ie. So it is a perceived cultural difference by Google, not a technical one. Using google.co.uk:

  • about 2,870 English pages for "El Nino" +wikipedia
  • about 4,220 English pages for "El Niño" +wikipedia

With +wikipedia the "El Niño" page is only picked up because the HTTP link in the external link "NOAA explanation" includes the string "el-nino-story". If I use another one like "Second battle of Zurich", UK Google does not pick up the main Wikipedia page (because all instances of Zurich have been changed to Zürich):

  • about 74 English pages for "Second battle of Zurich" +wikipedia
  • about 3 English pages for "Second battle of Zürich" +wikipedia

-- Philip Baird Shearer 02:36, 14 Mar 2005 (UTC)

I was never able to get the numbers you cite. I'm usually searching with non-localized Google, http://www.google.com/ncr (sets up a "no redirect" cookie), but whatever version I chose, the search results were all the same:

                           El Niño      El Nino
  non diacritic-aware
    Web                  3,380,000    1,640,000
    English              1,240,000    1,640,000
    en.wikipedia.org            51           26
  diacritic-aware
    Web                  3,000,000    2,970,000
    English              2,180,000    2,150,000
    en.wikipedia.org            71           71

                  Battle of Zurich    Battle of Zürich
  non diacritic-aware
    Web                    212,000              61,500
    English                206,000              27,900
    en.wikipedia.org            29                  18
  diacritic-aware
    Web                    212,000              66,100
    English                206,000              29,900
    en.wikipedia.org            23                  13

Now you've got to tell what it all means and how you make any decision based on these numbers.
As a side note, I've added the most common English names to both El Niño and Second Battle of Zürich, as per my suggestion, this should effectively clear these names for search. --DmitryKo 11:19, 14 Mar 2005 (UTC)

Just try using http://www.google.co.uk and don't set any defaults, because most users will not do so. Then run the test.

The trouble with adding the common English names is that you have to do it and the chances are that pedants will remove them again, arguing that it is not the most common English version (see the Zurich talk page). Besides, if it is the most common English version why not use it for the Page name?

If we could agree that the format should be El Nino (Spanish:El Niño) the problem would be "fixed". Your format does almost the same but it does not include the additional information that it is a Spanish word and that there is an article in the native language for those who are interested: See Battle of Hurtgen Forest for an example. Philip Baird Shearer 13:47, 14 Mar 2005 (UTC)

It's not about articles in foreign languages, it's about phonetical correctness of foreign names, i.e. whether to anglicize names or stay with native spelling; the articles are still written in plain English.
Even written as El Nino, this word should be spelled El Niño (El Ninho) because it's Spanish. I don't know Spanish language, but I do know how to spell a diacritic ñ in Spanish words; for those who can't, there's little difference, they would just treat the ñ as n with funny foreign squiggles. And a historical section should explain why and when this word was borrowed from Spanish, that's way more useful than linking to an article in a foreign language (not to mention that many articles are already Interwikified)
Likewise, Goering/Goring should be spelled as Göring and Zurich should be spelled as Zürich just because they aren't English names; again, diacritics give a hint on pronunciation and those who don't know how to spell them will easily revert to English counterparts.
You are very dogmatic stating "this word should be spelled El Niño". Who are you to say that El Nino is wrong? As borrowed words get absorbed into English they tend to lose their funny foreign squiggles. Common English usage is Goering. It was used in the war crime trials and has been used ever since. Why should Zurich be spelt with an umlaut in the English version of wikipedia but not in the French version? Most English speaking people would not include the umlaut when writing Zurich in English. To use Göring is an affectation for most people because they have no idea what an umlaut does. The vast majority would not know the German rule of changing "Ü" to "UE"; to them Goering and Goring are not the same word, although they could guess that Goring and Göring were. There are several advantages of the link in the text over in the side bar. But one is that it explicitly links the article to the native version of the word, else who is to say if El Nino is from Български, Česky, Dansk or Bahasa Sunda? Philip Baird Shearer
Who are you to say that El Nino is wrong?
The one who speaks English and finds it improper to anglicize the writing of a Spanish word but maintain the Spanish spelling. And I don't even work for U.S. Government (http://www.elnino.noaa.gov).
The vast majority would not know the German rule of changing "Ü" to "UE"; to them Goering and Goring are not the same word
A hyperlink to the Umlaut does the trick.
who is to say if El Nino is from Български, Česky, Dansk or Bahasa Sunda
An explanation section is better than a plain link to the Spanish-language article. BTW, recent studies conclude that it's actually borrowed from Polish, pl:El Niño. Ha ha.
DmitryKo 08:53, 15 Mar 2005 (UTC)
Speaking of your Hurtgen Forest example, I just can't see why Hürtgenwald was anglicized and Buchenwald was not. It doesn't seem there was a consistent policy, so applying these precedents is not consistent either; we just need to establish a new and carefully thought-out policy on naming foreign persons and places.
It was named in English because 24,000 American casualties make it one of the biggest American battles ever. There does not have to be a consistent policy for a name like "Battle of Hurtgen Forest"; that is the name used in military histories and is common usage. Philip Baird Shearer
That's what I'm talking about. The policy of the past was to translate; the policy of yesterday was to transliterate; the policy of tomorrow is to maintain native phonetics. Life would be much easier if we didn't have to remember alternative spellings for every foreign name or person; maintaining native orthography is not an uncommon solution. DmitryKo 08:53, 15 Mar 2005 (UTC)
As for Google searches, I don't see the point of using "+wikipedia"; you should either use the "Search" button in the left pane and choose the Wikipedia option over WWW, or specify the domain explicitly by either adding site:en.wikipedia.org to the search string or supplying the corresponding parameter in the Advanced Search dialog. DmitryKo 17:11, 14 Mar 2005 (UTC)
The use of Wikipedia was to reduce the search, that is all. The search should be done from outside Wikipedia so that you can try the standard Google search engines used in different English speaking countries, from an ordinary Google search engine prompt outside Wikipedia. Use "Site:en.wikipedia.org" if you prefer it to +wikipedia. Philip Baird Shearer 18:53, 14 Mar 2005 (UTC)
"+wikipedia" is not equivalent to "site:en.wikipedia.org" or "Wikipedia" search option. DmitryKo 08:53, 15 Mar 2005 (UTC)
Let's care more about correctness, and less about Google's technical shortcomings. The latter are likely to improve, but bad habits are harder to get rid of. --Johan Magnus 17:35, 14 Mar 2005 (UTC)
Are you saying it is more correct to use diacritics in English? If so what makes you think that they are more than an affectation when used in English? BTW It is not a shortcoming of Google, google.de or google.ca will both wrap Zurich, Zürich and Zuerich into one word. I guess it is a device used by Google to reflect cultural differences. I do not understand why so many people are so attached to funny foreign squiggles which are meaningless to most English speakers and which hide many Wikipedia pages from their main audience (native English speakers). Also, why is this forced onto the English version and not the other versions of Wikipedia? Eg the French:Zurich, Spanish:Zúrich, Italian:Zurigo, Portuguese:Zurique, etc? Not one uses the German spelling so why should the English version be any different? Philip Baird Shearer 18:53, 14 Mar 2005 (UTC)
Not one uses the German spelling so why should the English version be any different?
Because there aren't many German-speaking contributors who know Italian compared to those who know English. And English editors are so nice and non-ignorant, even more so than Polish editors. He he.
--DmitryKo 08:53, 15 Mar 2005 (UTC)
I do not understand why so many people are so attached to funny foreign squiggles which are meaningless to most English speakers and which hide many Wikipedia pages from their main audience (native English speakers) — Philip, you just don't want to listen, or learn, do you? Have a look at Encyclopedia. That's right, we are here to impart knowledge. Are you suggesting we redirect Diacritic to Funny Foreign Squiggle, since diacritic is (*shudder*) a Greek word? At least do me a favour and refrain from reiterating the entirely void "hiding" argument. That was dispelled about five minutes after you'd first brought it up. You are needlessly polarizing this discussion. We are here to decide a technicality, namely how to transcribe a few borderline cases. We are not here to discuss a reckless ASCIIzation of WP because non-ascii symbols are "meaningless to our audience". dab () 09:49, 15 Mar 2005 (UTC)

This is supposed to be an ENGLISH encyclopedia not an International one. Diacritic may have originated as a Greek word but it is used in English as an English word. If you were to read the diacritic page you would see that there are no diacritics in modern English unless it is on a borrowed word which had not been fully integrated in to English. One of the steps of integration is to strip off the diacritics. So please explain how That was dispelled about five minutes after you'd first brought it up.

It is not a technical issue, it is a cultural one. As Noel said in the second paragraph of this article, walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich". Ask them to spell El Nino and very few would write El Niño. Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form

If the words are first spelt without diacritics and then explained in brackets where the borrowed word comes from and alternative spellings after that, then if it is used with or without diacritics throughout the rest of the article it has been defined at the top and is self referencing. Several other things have been achieved.

  • The vast majority of English speakers would be able to find the word or phrase; (Either directly or with a search engine).
  • The original source of the word would be covered (educational remit).
  • It would be systematic and consistent because it works for words which are transliterated from languages with other character sets (not just Latin alphabet, e.g. Prague Offensive).
  • Doing this is a simple rule which brings some consistency to articles. "Keep it simple" (Occam's razor) is normally the best way to go. The current introduction to the Second battle of Zurich reads: The Second Battle of Zürich (also known as The Second Battle of Zurich) happened on 25th and 26th of September 1799. The person who moved it from "The Second Battle of Zurich", ignoring common usage (Internet and English military history sources) and primary author, has now forced someone else to stick a sticking plaster on the first line, which is inelegant to say the least. Philip Baird Shearer 11:33, 15 Mar 2005 (UTC)


"The vast majority of English speakers would be able to find the word or phrase": this is the "argument" I'm referring to as "dispelled after five minutes". Finding an article is a technical question, not a cultural one, and the problem is solved via redirects, so let's not bring that up again. Let's also not get wound up with the Zurich case, which has become the prototypical borderline case: Yes I agree it could be at Zurich, both solutions have been convincingly argued, including use of diacritics in major English language publications. Both would be right, ok? I'm not objecting to Zurich. What I'm objecting to is your apparent insistence that it is not possible to write about non-English names in the English language. Get it? Names, not ordinary words. Philip is not an English name. Æðelflæð and Byrhtnoþ, on the other hand, are, very much, English names. After all this time, you still confuse orthography with language? This policy is not about diacritics, it is about cases like Giovanni vs. John, or Roma vs. Rome, so why don't we just argue about these? As for "walking up to people in the street and asking them to spell something", are you serious?? So why am I spending time fixing factual errors in articles if what we're aiming at is not the correct information, but the most widespread misconception? dab () 12:14, 15 Mar 2005 (UTC)

I agree that I should not have written "vast majority of English speakers"; it should have been "many English speakers". At a technical level, it is not just with Google (www.google.nz etc), but with other search engines as well, e.g. "Ask Jeeves" (a search on "First Battle of Zürich" puts Wikipedia top of the list; it does not find the article using "First Battle of Zurich").
Second, to repeat a point, it is not just a technical issue, it is also a cultural one: to write and use Zurich in English is not wrong. The Second battle of Zurich is a very good example of the problems with the conversion of names to names with diacritics. The move from Second Battle of Zurich broke at least three Wikipedia guidelines (and probably more):
  • Primary author.
  • Common usage.
  • Use English.
If the original article had been written using Zürich then there would be some position to defend, but it was moved regardless of guidelines. So what is the point of having guidelines if they are not going to be supported? Philip Baird Shearer 14:00, 15 Mar 2005 (UTC)
Again, let's not talk about Zurich, I granted you that Zurich would be fine. As for "not being able to find the article", where exactly do you end up, clicking on Zurich? My version of Wikipedia seems to take me to the right article, right away. The "primary author's" usage is to be observed if you're just making a few additions. Once an article has been reworked many times, whatever the first stub-author may have written ceases to be relevant. "Use English" I grant you, provided you do mean the English language, not ASCII, and not the so-(incorrectly)-called "English alphabet", or the Anglo-Saxon Futhorc. So it boils down to "Common usage". We have been through that, and both usages are common in English in the Zurich case. We may need to discuss a policy on which usage is preferable in cases where academic usage is different from popular usage. This will be a different policy from this one. Here we are arguing about English versions of names. And I grant you, again, that Zurich is a borderline case and may be considered an anglicized name (although it's really just the French spelling). How about mentioning Zurich in the policy to illustrate the line where disagreement seems to become possible? dab () 11:35, 16 Mar 2005 (UTC)

Before the system went into read-only mode yesterday, I had typed out a long reply to Dbachmann's last post. But on reflection (apart from stressing that I was talking about using search engines like "Ask Jeeves" outside Wikipedia) I am just repeating things which have already been written higher up the page. So I suggest that Dbachmann and I give it a rest and let some others who are interested in the subject contribute. Philip Baird Shearer 13:48, 17 Mar 2005 (UTC)

I've just come to this page at your suggestion on another talk page, and briefly read through the comments. Clearly you have a strong anti-diacritic viewpoint, but I don't understand why this is such a big issue. I am of the view that redirects are there to cater for the different ways people may express a term (i.e. the "write down how you'd express it" rationale above).
I don't see this issue being terribly different to the British vs. American English rule. Following the "average user" line of logic we should only write in American English because that is the prevalent strain population-wise; however, Wikipedia policy is that you can use either, or the most appropriate for geographically specific articles. I don't see how one can reconcile a "no diacritics at all costs" policy with the general spirit of the spelling rule. Indeed the latter part of that inclines me to believe keeping diacritics is more in line with this, and is at best suited to a neutral policy.
Diacritics are still used in English written expression, and there is no technical problem or confusion involved in using them in Wikipedia either (because variants can use redirects). I don't see how someone accessing "Zurich", which is then redirected to "Zürich", will be confused by the diacritic in the name. --kjd 13:49, 31 Mar 2005 (UTC)

I have nothing against having the word with diacritics on the first line of an article if some people spell it that way, providing that there is also a diacritic-free version of the word or phrase. Indeed I would like to change this guideline to recommend that both versions must appear on the first line of an article for words or phrases which can be spelt with diacritics. What I object to is the removal of all references to a diacritic-free version of a word in the text of a page because a word with diacritics is "correct" and one without is "incorrect". If the problem were like AE v. CE English then it would be solved by primary author, but in numerous cases copy-editors ignore the initial diacritic-free version and change it to include a diacritic version. (and I am sure that some copy-editors do it the other way around, but I have not seen any articles so modified as they have stayed around for long). Philip Baird Shearer 15:02, 31 Mar 2005 (UTC)

I have changed initial diacritic-free forms in articles to diacritic versions in the past. I think stuff like "Zürich (also spelled Zurich)" or "Zurich (also spelled Zürich)" is unnecessary because it should be blindingly obvious to the English-speaking reader that this Zürich thing the article is talking about is probably the same as the Zurich they're thinking of (or vice versa). Since having both forms is redundant, IMO it's better to have the one form be the form with diacritics since that conveys more information than a stripped version. Redirects take care of anyone who tries to find the article with the non-diacritic spelling. You could make an argument that having the non-diacritic form appear at least once is necessary to feed search engines, but Wikipedia is a non-commercial site, our goal should be to be the most correct and informative source possible, not to get the highest search engine rankings so we can get more non-existent ad revenue or something. DopefishJustin (・∀・) 21:15, Apr 5, 2005 (UTC)
I hear you, but I have no problem with "Zürich (in English also Zurich, which is also the French spelling)": I think it sums it up nicely without being guilty of SEO. It is important to make the distinction that this is about different spellings of the same name, not different names in different languages. dab () 10:14, 6 Apr 2005 (UTC)
It is not "blindingly obvious" to all search engines, all the time. Using the Wikipedia search engine (run by Yahoo?), neither a search for "vaagaa" nor for "vaga" will find the Vågå stave church article, though in this case the latter "vaga" search will find a redirect at Vaga stave church. If that redirect didn't exist, the "vaga" search would have been no more successful in finding this article than the "vaagaa" search (for which no redirect has yet been made) was. Gene Nygaard 18:10, 7 October 2005 (UTC)

PBS' last edit

That last edit implies that "English name" and "Anglicized spelling" are synonyms. Confusing, because that's not necessarily correct.

The repetition of two similar but subtly different conventions is also confusing. E.g., are titling and naming a page the same or different? Michael Z. 2005-04-7 14:32 Z

The repetition was a mistake. I had put it in when editing so that I could make sure what was there was as close as possible to what was there before recent edits. But a "FEED ME" demand from children distracted me. I hurried (they can be very demanding and distracting), I did not write what I wanted to, and I forgot that I had the old version in the same edit block when I saved it. Sorry.

"Take two" is now in place. Let's discuss that, as it is an attempt to put in place what was there before the recent changes, which were not agreed upon on this talk page before they were made. Philip Baird Shearer 16:39, 7 Apr 2005 (UTC)

Looks better. Does "anglicized form" refer to the "English name", the "Latin transliteration", or "either English name or Latin transliteration"? Michael Z. 2005-04-7 17:26 Z