New York State Identification and Intelligence System

The New York State Identification and Intelligence System Phonetic Code, commonly known as NYSIIS, is a phonetic algorithm devised in 1970 as part of the New York State Identification and Intelligence System (now a part of the New York State Division of Criminal Justice Services). It is reported to be 2.7% more accurate than the traditional Soundex algorithm.[1]

Procedure

The algorithm, as described in Name Search Techniques,[2] is:

  1. If the first letters of the name are
    'MAC' then change these letters to 'MCC'
    'KN' then change these letters to 'NN'
    'K' then change this letter to 'C'
    'PH' then change these letters to 'FF'
    'PF' then change these letters to 'FF'
    'SCH' then change these letters to 'SSS'
  2. If the last letters of the name are[3]
    'EE' then change these letters to 'Y␢'
    'IE' then change these letters to 'Y␢'
    'DT' or 'RT' or 'RD' or 'NT' or 'ND' then change these letters to 'D␢'
  3. The first character of the NYSIIS code is the first character of the name.
  4. In the following rules, a scan is performed on the characters of the name. This is described in terms of a program loop. A pointer is used to point to the current position under consideration in the name. Step 4 is to set this pointer to point to the second character of the name.
  5. Considering the position of the pointer, only one of the following statements can be executed.
    1. If blank then go to rule 7.
    2. If the current position is a vowel (AEIOU), then: if the letters at the current position are 'EV', change them to 'AF'; otherwise change the current position to 'A'.
    3. If the current position is the letter
      'Q' then change the letter to 'G'
      'Z' then change the letter to 'S'
      'M' then change the letter to 'N'
    4. If the current position is the letter 'K' then if the next letter is 'N' then replace the current position by 'N' otherwise replace the current position by 'C'
    5. If the current position points to the letter string
      'SCH' then replace the string with 'SSS'
      'PH' then replace the string with 'FF'
    6. If the current position is the letter 'H' and either the preceding or following letter is not a vowel (AEIOU) then replace the current position with the preceding letter.
    7. If the current position is the letter 'W' and the preceding letter is a vowel then replace the current position with the preceding letter.
    8. If none of these rules applies, then retain the current position letter value.
  6. If the current position letter is equal to the last letter placed in the code then set the pointer to point to the next letter and go to step 5.
    The next character of the NYSIIS code is the current position letter.
    Increment the pointer to point at the next letter.
    Go to step 5.
  7. If the last character of the NYSIIS code is the letter 'S' then remove it.
  8. If the last two characters of the NYSIIS code are the letters 'AY' then replace them with the single character 'Y'.
  9. If the last character of the NYSIIS code is the letter 'A' then remove this letter.
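
The procedure above can be sketched in Python as follows. This is an illustrative reading of the listed rules, not a reference implementation. The function name nysiis and the table names are hypothetical. The sketch assumes that input is reduced to the letters A-Z, that the blank (␢) written by rule 2 simply shortens the name, that the end of the name counts as a non-vowel in rule 5.6, and that rules 7 and 9 never remove the first character of the code.

```python
VOWELS = set("AEIOU")

# Rule 1 and rule 2 tables; within each table the first match wins.
PREFIXES = [("MAC", "MCC"), ("KN", "NN"), ("K", "C"),
            ("PH", "FF"), ("PF", "FF"), ("SCH", "SSS")]
SUFFIXES = [("EE", "Y"), ("IE", "Y"), ("DT", "D"), ("RT", "D"),
            ("RD", "D"), ("NT", "D"), ("ND", "D")]


def nysiis(name: str) -> str:
    # Assumption: ignore anything that is not an ASCII letter.
    s = "".join(c for c in name.upper() if "A" <= c <= "Z")
    if not s:
        return ""
    for old, new in PREFIXES:                 # rule 1
        if s.startswith(old):
            s = new + s[len(old):]
            break
    for old, new in SUFFIXES:                 # rule 2 (blank dropped)
        if s.endswith(old):
            s = s[:-len(old)] + new
            break

    w = list(s)                               # working copy, edited in place
    code = [w[0]]                             # rule 3
    i = 1                                     # rule 4
    while i < len(w):                         # rule 5.1: end of name stops the scan
        c, prev = w[i], w[i - 1]
        nxt = w[i + 1] if i + 1 < len(w) else ""
        if c in VOWELS:                       # rule 5.2
            if c == "E" and nxt == "V":
                w[i:i + 2] = ["A", "F"]
            else:
                w[i] = "A"
        elif c == "Q":                        # rule 5.3
            w[i] = "G"
        elif c == "Z":
            w[i] = "S"
        elif c == "M":
            w[i] = "N"
        elif c == "K":                        # rule 5.4
            w[i] = "N" if nxt == "N" else "C"
        elif w[i:i + 3] == ["S", "C", "H"]:   # rule 5.5
            w[i:i + 3] = ["S", "S", "S"]
        elif w[i:i + 2] == ["P", "H"]:
            w[i:i + 2] = ["F", "F"]
        elif c == "H" and (prev not in VOWELS or nxt not in VOWELS):
            w[i] = prev                       # rule 5.6
        elif c == "W" and prev in VOWELS:
            w[i] = prev                       # rule 5.7
        if w[i] != code[-1]:                  # rule 6: skip repeated letters
            code.append(w[i])
        i += 1

    # Assumption: rules 7 and 9 leave a one-letter code untouched.
    if len(code) > 1 and code[-1] == "S":     # rule 7
        code.pop()
    if code[-2:] == ["A", "Y"]:               # rule 8
        code[-2:] = ["Y"]
    if len(code) > 1 and code[-1] == "A":     # rule 9
        code.pop()
    return "".join(code)


print(nysiis("Knight"))     # NAGT
print(nysiis("MacIntosh"))  # MCANT
```

Under this reading, 'KNIGHT' first becomes 'NNIGHT' by rule 1; the scan then maps 'I' to 'A', replaces 'H' with the preceding 'G', and drops repeated letters, giving 'NAGT'.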

References

  1. ^ Rajkovic, P.; Jankovic, D. (2007), "Adaptation and Application of Daitch-Mokotoff Soundex Algorithm on Serbian Names" (PDF), XVII Conference on Applied Mathematics, Novi Sad, Serbia, archived from the original (PDF) on August 27, 2011
  2. ^ Taft, R. L. (1970), "Name Search Techniques", New York State Identification and Intelligence System, Albany, New York
  3. ^ "Unicode Character 'BLANK SYMBOL' (U+2422)".

External links