Talk:Dotted and dotless I in computing

From Wikipedia, the free encyclopedia

Pronunciation

Can someone please explain in layman's terms how the dotless I is pronounced? No IPA stuff, and no scientific terms about which parts of the mouth should move where, thanks. JIP | Talk 10:02, 27 July 2005 (UTC)

There's a sound sample at close back unrounded vowel. Muntfish 10:11, 27 July 2005 (UTC)
The dotless I sound is the same sound as the ae sound in the name "Michael". Hope this answer, arriving 2+ years late, helps! ;-) Incidentally, if I were to write Michael in Turkish as it sounds, it would be "Maykıl". Todd (talk) 19:35, 23 February 2008 (UTC)
The dotless I sound isn't quite the same as the ae in Michael. But to express the difference would require IPA stuff and scientific terms about which parts of the mouth should move where, to all of which the OP seems to be allergic. 72.141.211.91 (talk) 02:14, 10 April 2017 (UTC)

Frankly, this is generally a problem with Wikipedia. Why the convention is to use the International Phonetic Alphabet on English Wikipedia pages, when the English phonetic alphabet would be far more understandable to the vast majority of readers, is totally without logical explanation. Actually, it's easily explained: people feel smart when they use the IPA, and don't really care about whether these entries are actually useful. This current runs throughout Wikipedia. -- Preceding unsigned comment added by 66.28.250.194 (talk) 00:01, 9 December 2009 (UTC)

The IPA is an international standard. And what is the English phonetic alphabet? Can it really be used for languages other than English? Thorn0 (talk) 16:24, 29 July 2015 (UTC)
There is no such thing as the "English phonetic alphabet". There are various attempts at approximating the sounds of other languages using English spelling conventions (e.g. "bru-NAI" for the pronunciation of Brunei), but they all fail somewhere and they're completely illogical for speakers of anything other than English. I agree with Thorn0. IPA is an international standard, and it is the most useful for the "vast majority of users". There are certain sarcastic comments I could add about narcissism here, but I'll save them. 72.141.211.91 (talk) 02:14, 10 April 2017 (UTC)

Casing

Why do the Turkish dotted and dotless letters I cause problems on Turkish computers? --84.61.43.168 14:02, 5 October 2005 (UTC)

Most codepages did make Turkish difficult. What should have been done was to encode six letters:

LATIN CAPITAL LETTER I
LATIN SMALL LETTER I
LATIN CAPITAL LETTER DOTLESS I (Turkish, Azerbaijani)
LATIN SMALL LETTER DOTLESS I (Turkish, Azerbaijani)
LATIN CAPITAL LETTER I WITH DOT (≡I+combining dot above)
LATIN SMALL LETTER I WITH DOT (≡i+combining dot above)

This would have made representation of multilingual texts much easier, and would have done away with any language-specific case mapping. --Typhlosion 18:14, 24 October 2005 (UTC)

But encoding homographs in a traditional code page would have meant losing other letters or symbols and would have introduced all the issues associated with homographs (text that looks identical but isn't, having to make sure you enter the correct one of a set of homographs, etc.). Plugwash

Open questions

Why does neither Unicode nor ISO-8859-9 have separate code points for the English and Turkish small dotted I? --84.61.35.152 09:12, 28 March 2006 (UTC)

Presumably whoever standardised them considered them to be the same letters. Plugwash 18:12, 28 March 2006 (UTC)
Precisely. Similarly, there is no special code point for Turkish lowercase dotted I. It's just a lowercase I. -- Jmabel | Talk 03:14, 3 April 2006 (UTC)
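
For reference, a minimal Java sketch of the point made in the replies above: the four letters share only four code points between them, and the Turkish lowercase dotted i is the ordinary U+0069. The class and method names (`CodePointDemo`, `describe`) are made up for illustration and are not taken from any discussion here.

```java
import java.text.Normalizer;

public class CodePointDemo {
    // Hypothetical helper: formats the first code point of s as "U+XXXX".
    static String describe(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        // \u0131 is the small dotless i, \u0130 is the capital I with dot above.
        String[] letters = {"i", "I", "\u0131", "\u0130"};
        for (String letter : letters) {
            System.out.println(letter + " " + describe(letter));
        }
        // The capital dotted I decomposes canonically into I + combining dot above (U+0307).
        String decomposed = Normalizer.normalize("\u0130", Normalizer.Form.NFD);
        System.out.println(describe(decomposed) + " followed by "
                + describe(decomposed.substring(1)));
    }
}
```

Note that the small dotless i (U+0131) has no canonical decomposition, so normalization does not relate it to any of the other three.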

How can a small capital dotted i be represented in OpenType? --88.76.248.120 10:29, 9 March 2007 (UTC)

Font size should be independent of what character is being represented. Or am I missing something? - Jmabel | Talk 06:49, 29 March 2007 (UTC)
And it looks like it works fine:
       İ İ
       İ İ
Jmabel | Talk 06:53, 29 March 2007 (UTC)

JDK 6 fixes for dotless-I

Specific issues in the JDK relating to improper treatment of 'I' and 'i' in Turkish locale have been fixed in JDK 6. But it is misleading to say that this was a single bug which is now fixed. These usages were simply usages of the basic API, similar to those in any Java software. The basic problem is that the methods String.toLowerCase() and String.toUpperCase() without arguments exhibit behavior depending on the default locale, which in practice means that they work consistently for everyone except Turkish users. Developers must be explicitly aware of this danger and should use the variants of these methods which take a Locale argument - Locale.getDefault() if that is what is really intended, or (for example) Locale.US for simple ASCII conversions.

--66.30.204.182 02:56, 15 April 2007 (UTC)
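
A minimal sketch of the advice above, assuming a standard JDK: passing an explicit Locale makes the result independent of the machine's default locale. The class and helper names (`TurkishCaseDemo`, `lowerAscii`, `lowerTurkish`, `upperTurkish`) are hypothetical and exist only for this illustration.

```java
import java.util.Locale;

public class TurkishCaseDemo {
    // Turkish locale, built from its language tag.
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Hypothetical helper: lowercasing for ASCII keywords and identifiers,
    // which must not depend on the default locale.
    static String lowerAscii(String s) {
        return s.toLowerCase(Locale.US);
    }

    // Hypothetical helpers: case conversion for text known to be Turkish.
    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }

    static String upperTurkish(String s) {
        return s.toUpperCase(TURKISH);
    }

    public static void main(String[] args) {
        System.out.println(lowerAscii("TITLE"));   // I becomes dotted i
        System.out.println(lowerTurkish("TITLE")); // I becomes dotless i (U+0131)
        System.out.println(upperTurkish("title")); // i becomes dotted capital I (U+0130)
    }
}
```

The no-argument `"TITLE".toLowerCase()` behaves like `lowerTurkish` when the default locale is Turkish and like `lowerAscii` elsewhere, which is why comparisons such as `s.toLowerCase().equals("title")` fail only for Turkish users.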

Writing a lowercase dotless i with a non-Turkish keyboard

Does someone know if there is a way of quickly entering a lowercase dotless i (or an uppercase dotted i) with a non-Turkish keyboard (possibly by using ALT or ALT GR plus a combination of numbers)?

For example, if you hold the ALT key, type 225 with the numeric keypad and then release ALT, you will get the German letter ß (Eszett), no matter if your keyboard is German or not. The same applies to the Spanish ñ (ALT+164) or the Swedish å (ALT+134). Is there any way to type Turkish letters similarly? I think that would be worth knowing.

On my keyboard (UK), AltGr+Shift+i produces (slightly counter-intuitively) a lower-case dotless i (ı). An upper-case over-dotted I can be produced by the dead-key combination AltGr+QuestionMark, followed by the letter I (İ). 82.36.26.70
Oh, and incidentally, try AltGr+s some time when you're in a Germanic mood. 82.36.26.70 -- Preceding signed but undated comment was added at 17:17, 26 September 2007 (UTC)
I get Ą for alt+164 and ß with alt+225 and ć from alt+134. PiotrGrochowski000 (talk) 10:49, 17 March 2015 (UTC)
I use WinCompose. I type "Alt . i" to get "ı" and "Alt . I" to get İ. Other non-ASCII characters are generally similarly easy to type. Jordan Brown (talk) 22:07, 8 June 2022 (UTC)

Official Crimean Tatar

Is there an official script for Crimean Tatar?

Please see Talk:Crimean Tatar language#Is there an "official" script?.

Thanks. --Amir E. Aharoni 16:34, 5 August 2007 (UTC)

French

RÉPUBLIQUE D'HAÏTİ (sic), twice, with no tittles in other words (because of the Ï?). Anothername (talk) 12:26, 30 November 2008 (UTC)

Other software packages with the same bug

I just found out about this common problem through a Dota 2 issue. It is a fairly well-known game and widely used software (at least from my perspective). I'm not sure whether it should be included in the list, though, so I'm asking first. :) The relevant link is https://github.com/ValveSoftware/Dota-2/issues/41.

English

As this is en.wikipedia.org, it should be pointed out that dotted capital "I" and dotless lowercase "i" are not present in the English language. 104.228.101.152 (talk) 23:49, 26 October 2018 (UTC)

Proposed split

This article currently deals with two different letters: Latin letter İi and Latin letter Iı, which are fundamentally distinct from both each other and the base Latin letter Ii, despite Unicode's merging of the lowercase and uppercase variants of the İi and Iı with those of Ii respectively. The letters also have different Wikidata entries. For these reasons, I propose a content split of the article with Dotted and dotless I being converted into a disambiguation page with a short description, and two new articles, Dotless I and İ being created. – anlztrk (talk) 07:11, 15 October 2021 (UTC)

Although the three letters are different, there is considerable room for confusion between them, due to the overlap in shapes. While each of the three letters can have its own article, this article can be about the confusion caused by the matching shapes and lack of distinction in Unicode and other character coding systems. John Sauter (talk) 11:41, 15 October 2021 (UTC)

Merge with Dotless I?

This article and Dotless I overlap considerably, yet are not even linked to each other. It seems to me that clarity would be best served by considering the whole dotted-and-dotless-I question on one page. It's not like it's so large that it would be unwieldy, and it would avoid cases like mine where I found one article and not the other and so missed some key information. Jordan Brown (talk) 22:17, 8 June 2022 (UTC)

Alternatively, Dotless I and Dotted I could discuss the characters proper, and push all discussion of the various encoding and mapping conflicts to Dotted and dotless I, with prominent links in both directions. Jordan Brown (talk) 01:42, 9 June 2022 (UTC)
I don't see why we even need to have a Dotted and dotless I article about both characters in the first place. Only having the articles for Dotless I and İ would suffice. – anlztrk (talk) 10:13, 14 June 2022 (UTC)
Speaking as somebody who is working on software support for these cases... splitting them makes understanding the problems worse. The software issues are all about how i, ı, İ, and I relate to one another; they are not specific to any one of the letters. From a linguistic perspective it makes some sense to separate Latin/Roman i/I from Turkic ı/I and Turkic i/İ, but from a software perspective they are all tied together and it doesn't make sense to talk about them separately. Naïve case mapping using the Unicode tables will end up equating different combinations of the four depending on the details. If you do case-insensitive mapping by folding to upper case, you fold both i and ı to I, and so i, ı, and I are all equivalent and İ is separate. If you do it by folding to lower case, you fold both I and İ to i, and so I, İ, and i are all equivalent and ı is separate. Which of the three articles would describe that problem? These are not problems associated with any one of the letters; they are about the relationships between them in various languages. As I think about it more, I think my "Alternatively" suggestion above is better: have the three articles that each describe their respective letter, and have them each have an "Issues in computing" section with a one-sentence summary and a link to the Dotted-and-dotless article. Jordan Brown (talk) 15:34, 14 June 2022 (UTC)
I've moved and edited the article so that it deals with issues in computing specifically. – anlztrk (talk) 13:42, 28 June 2022 (UTC)
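
The folding behaviour described in this thread can be reproduced with a short Java sketch. It assumes that `Character.toUpperCase` and `Character.toLowerCase` apply the simple, locale-independent one-to-one mappings from the Unicode tables; the class and method names (`FoldDemo`, `equalByUpper`, `equalByLower`) are made up for illustration.

```java
public class FoldDemo {
    // Hypothetical helper: case-insensitive comparison by folding to upper case.
    static boolean equalByUpper(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    // Hypothetical helper: case-insensitive comparison by folding to lower case.
    static boolean equalByLower(char a, char b) {
        return Character.toLowerCase(a) == Character.toLowerCase(b);
    }

    public static void main(String[] args) {
        char dotlessSmall = '\u0131';  // small dotless i
        char dottedCapital = '\u0130'; // capital I with dot above

        // Upper-case folding groups i, dotless i and I; dotted capital I stays separate.
        System.out.println(equalByUpper('i', dotlessSmall));
        System.out.println(equalByUpper('I', dottedCapital));

        // Lower-case folding groups I, dotted capital I and i; dotless i stays separate.
        System.out.println(equalByLower('I', dottedCapital));
        System.out.println(equalByLower('i', dotlessSmall));
    }
}
```

Neither folding direction separates all four letters on its own, which is the relationship problem the thread describes; a comparison meant for Turkic text would need locale-aware rules rather than either of these helpers.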