# Soundex (Rexx)

Other implementations: C | Rexx | Sed

This is an implementation of the SOUNDEX phonetic-matching algorithm. The goal of the algorithm is to encode names with the same pronunciation to the same value, allowing them to be compared without regard to spelling variations. It works best for English surnames — there are other algorithms (e.g., NYSIIS) and variations on SOUNDEX that work better for other names.

While true to the definition of SOUNDEX, this version is not the exact algorithm published so many years ago in Knuth or patented by Russell and Odell. That algorithm was heavily loop-based and didn't perform well in an interpreted language. This version produces the same results much more quickly, relying on some REXX-isms like `Space()` and `Translate()`.

## Algorithm

1. We begin by accepting the first letter of the name as it is. This is the only letter in the input that will appear as a letter in the output.

```<<Take first letter>>=
Result = Left(Source, 1)
```

2. Next, we convert each character in the name into a numeral that groups similar-sounding letters together. The groupings are as follows. Note that the original Soundex algorithm doesn't use "0" or "7". We use "0" to mark letters that we will eventually delete but must retain for the moment as separators. Anything which must be deleted but is not a separator is marked with a "7" then immediately removed. We do this using the REXX `Translate(Input, ToTable, FromTable, PadChar)` function, which will replace all characters in `Input` that are in `FromTable` with the corresponding characters in `ToTable`, and will replace any characters not in `FromTable` with `PadChar`.

NumeralLetters
0A, E, I, O, U, Y
1B, F, P, V
2C, G, J, K, Q, S, X, Z
3D, T
4L
5M, N
6R
7H, W
```<<Convert letters to numerals>>=
SoundexNum = "01230127022455012623017202"
/* ABCDEFGHIJKLMNOPQRSTUVWXYZ */
Source = Translate(Source, SoundexNum, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "7")
```

Now take out the sevens.

```<<Remove all 7s>>=
Source = Space(Translate(Source, " ", "7"), 0)
```

3. We then replace sequences of the same numeral with a single instance followed by the requisite number of zeroes.

```<<Make multiple digits single>>=
Do I = 1 to Length(Source)
Do J = I+1 to Length(Source) While Substr(Source, I, 1) = Substr(Source, J, 1)
Source = Left(Source, J-1) || "0" || Substr(Source, J+1)
End
End
```

4. Almost done, we now drop the first numeral (i.e., the first letter which we handled at the start) and remove any zeroes in the rest. The `Translate(Source, " ", "0")` function will return `Source` with any zeroes replaced with spaces, and the `Space(..., 0)` function will delete any spaces in that result.

```<<Remove zeroes after first position>>=
Result = Result || Space(Translate(Substr(Source, 2), " ", "0"), 0)
```

5. Lastly, we pad the result on the right to six characters in length using zeroes. Unlike previous zeroes, these will survive into the result.

```<<Fill on right with zeroes>>=
Result = Left(Result, 6, "0")
```

## As a function

With a little code to pull this all together into a nice function, we're done:

```<<Soundex Function>>=
/* ------------------------------------------------------------------------ */
/* code = Soundex(name)                                                     */
/*                                                                          */
/* Compute and return the SOUNDEX code corresponding to the specified name. */
/* ------------------------------------------------------------------------ */
Soundex: Procedure
Source = Arg(1)
Take first letter
Convert letters to numerals
Remove all 7s
Make multiple digits single
Remove zeroes after first position
Fill on right with zeroes
Return Result
```

## Main program for testing

The following main program will repeatedly prompt the user to supply a name and compute and display its SOUNDEX equivalent until cancelled by the user.

```<<soundex_test.rexx>>=
Do Forever
Call LineOut , "Enter the name: "
Name = LineIn()
If Name = "" then Leave
Say "SOUNDEX value = " || Soundex(Name)
End
Exit 0
Soundex Function
```