Convert integer to words (QBASIC)
From LiteratePrograms
Here is a function for converting a number into words. It works for British English, US English and French; so far it doesn't work for other languages. Please note that my French is pretty rusty, so I don't guarantee that the current algorithm produces valid French in all cases. The assistance of a fluent French speaker in adding more test cases would be very much appreciated. Luckily, such a person has already added some test cases to the program and some helpful observations to the discussion page. The issues raised will be dealt with shortly.
The function is split into a "run once" initialisation portion followed by the part which calculates the actual words required to do the job. These portions will be discussed in a little more detail later in the article.
<<definition>>=
FUNCTION Num2Lang$ (aNumber AS LONG, aLang AS STRING)
    <<variable declarations>>
    <<initialisation call>>
    <<function body>>
    <<initialisation block>>
END FUNCTION
Two types of scope are used for the variables in this function. DIMensioned variables are automatically reinitialised to their default values whenever the function is called. STATIC variables retain their value between calls to the function. They are not strictly necessary in this function but their use makes the function faster by avoiding the need to set up the vocabulary arrays each time the function is called.
<<variable declarations>>=
STATIC dLang AS STRING, dLog10 AS DOUBLE
STATIC dUnits() AS STRING, dTens() AS STRING, dPowers() AS STRING
DIM dBuffer AS STRING, dDigitGroup AS LONG
DIM dTensGroup AS LONG, dPowersGroup AS LONG
DIM dRange AS INTEGER
Note that QBASIC doesn't allow for the sizing of a static array at the time of its declaration. This has to be done later.
Basically the vocabulary arrays are loaded with data before use on the first call to the function and then just used on subsequent calls. The only case in which these arrays will need to be reloaded is when the language used is changed.
Finally, if you have misgivings about the use of the GOSUB command in this piece of code, please read on. Its use is discussed at a later point in this article.
<<initialisation call>>=
IF dLang <> aLang THEN
    GOSUB Num2LangInit
    dLang = aLang
END IF
After the soup and salad we come to the meat of the function. In the two (and a half) languages covered so far we only really need to deal with three cases: zero to nineteen; twenty to ninety-nine; everything else. The specification for this task (the bottles of beer song) implies that we only need to deal with positive whole numbers, and I have arbitrarily decided that I only want to deal with 32-bit signed integers (the LONG type in BASIC), so "everything else" means any whole number between one hundred and 2^31 - 1 (2,147,483,647). It would be trivial to handle negative numbers and not too much work to handle decimals, but there's no need in this case. However, with an eye to bugs or future expansion, a fourth case has been added to handle numbers outside the range.
<<function body>>=
SELECT CASE aNumber
    <<first case>>
    <<second case>>
    <<third case>>
    <<other cases>>
END SELECT
Num2Lang = dBuffer
EXIT FUNCTION
The first case contains a highly idiosyncratic set of numbers with little or no pattern and the easiest way to handle it is via the pre-initialised lookup table, dUnits.
<<first case>>=
CASE 0 TO 19
    dBuffer = dUnits(INT(aNumber))
The second case is pretty straightforward for English, but handling French adds three minor complications. Firstly, some ten-words, "septante" for instance, don't exist in Parisian French, although they do in other dialects such as Belgian French. In those cases the base twenty system has to be used, starting from the previous existing ten-word. Hence the first IF/END IF section in the following code. Secondly, "quatre-vingts" doesn't need an "s" at the end when followed by units and otherwise needs to be treated differently from the "-ante" words. Thirdly, umpty-one values have to be treated specially by adding the word "et" in between the tens and the units. Hence the second IF/END IF section and its internal cases.
<<second case>>=
CASE 20 TO 99
    dTensGroup = INT(aNumber / 10)
    dBuffer = dTens(dTensGroup)
    IF dBuffer = "" THEN
        dTensGroup = dTensGroup - 1
        dBuffer = dTens(dTensGroup)
    END IF
    dDigitGroup = aNumber - dTensGroup * 10
    IF dDigitGroup > 0 THEN
        IF LEFT$(aLang, 2) = "fr" AND RIGHT$(dBuffer, 1) = "s" THEN
            dBuffer = LEFT$(dBuffer, LEN(dBuffer) - 1)
        END IF
        IF LEFT$(aLang, 2) <> "fr" OR dDigitGroup MOD 10 <> 1 THEN
            dBuffer = dBuffer + "-"
        ELSEIF aLang = "fr" AND dTensGroup = 8 THEN
            dBuffer = dBuffer + "-"
        ELSE
            dBuffer = dBuffer + " et "
        END IF
        dBuffer = dBuffer + Num2Lang(dDigitGroup, aLang)
    END IF
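To make the three French complications easier to follow in isolation, here is a small Python sketch of the twenty-to-ninety-nine rules. It is an illustration under assumptions rather than a transliteration of the QBASIC: the names `UNITS`, `TENS`, `REGIONAL` and `french_20_to_99` are hypothetical, and it recognises the base twenty case by looking at the tens word itself instead of testing the language code and tens digit as the chunk above does.

```python
UNITS = ["zero", "un", "deux", "trois", "quatre", "cinq", "six", "sept",
         "huit", "neuf", "dix", "onze", "douze", "treize", "quatorze",
         "quinze", "seize", "dix-sept", "dix-huit", "dix-neuf"]

# Tens words that exist in Parisian French; 7 and 9 are deliberately missing.
TENS = {2: "vingt", 3: "trente", 4: "quarante", 5: "cinquante",
        6: "soixante", 8: "quatre-vingts"}

# Regional overrides, mirroring the "fr-be" and "fr-ch" cases.
REGIONAL = {
    "fr-be": {7: "septante", 8: "octante", 9: "nonante"},
    "fr-ch": {7: "septante", 8: "huitante", 9: "nonante"},
}


def french_20_to_99(n, lang="fr"):
    """Spell 20..99 in French following the three rules described above."""
    if not 20 <= n <= 99:
        raise ValueError("expected a number from 20 to 99")
    tens_words = {**TENS, **REGIONAL.get(lang, {})}
    tens = n // 10
    if tens not in tens_words:      # rule 1: no ten-word, fall back to base twenty
        tens -= 1
    word = tens_words[tens]
    rest = n - tens * 10            # 0..19 once the fallback has been applied
    if rest == 0:
        return word
    vigesimal = word.startswith("quatre-vingt")
    if vigesimal:
        word = "quatre-vingt"       # rule 2: drop the plural "s" before units
    if rest % 10 == 1 and not vigesimal:
        return word + " et " + UNITS[rest]   # rule 3: "et" for the umpty-ones
    return word + "-" + UNITS[rest]
```

For example, `french_20_to_99(71)` gives "soixante et onze", while `french_20_to_99(71, "fr-be")` gives "septante et un".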
The third case has the most complex code. The basic idea is to identify which range the number falls into (thousands, millions, etc.) then use recursive calls to get the text for groups of three digits. That simple picture is clouded a little by the first range, the hundreds, which are treated a little differently in US English from other English variants. It's also complicated by French, which uses the phrases "cent" and "mille" rather than "un cent" or "un mille" for 100 and 1,000 and has rules on when to use plurals for powers of ten and when not to.
Also note the addition of .4 to the number when calculating the range. This shouldn't have been necessary but a floating point approximation error leads to the wrong value being calculated for 100 if it isn't present. The calculation worked for all other values, even without the addition but them's the breaks.
<<third case>>=
CASE 100 TO 2147483647
    dRange = INT(LOG(aNumber + .4) / dLog10)
    IF dRange > 3 THEN dRange = INT(dRange / 3) * 3
    dPowersGroup = INT(aNumber / 10 ^ dRange)
    IF aLang = "fr" AND dPowersGroup = 1 AND dRange < 5 THEN
        dBuffer = ""
    ELSE
        dBuffer = Num2Lang(dPowersGroup, aLang)
    END IF
    dBuffer = LTRIM$(dBuffer + dPowers(dRange))
    dDigitGroup = aNumber - dPowersGroup * 10 ^ dRange
    IF LEFT$(aLang, 2) = "fr" AND (dPowersGroup = 1 OR dDigitGroup > 0) THEN
        IF RIGHT$(dBuffer, 1) = "s" THEN
            dBuffer = LEFT$(dBuffer, LEN(dBuffer) - 1)
        END IF
    END IF
    IF dDigitGroup > 0 THEN
        dBuffer = dBuffer + " "
        IF dDigitGroup < 100 AND aLang = "en-uk" THEN
            dBuffer = dBuffer + "and "
        END IF
        dBuffer = dBuffer + Num2Lang(dDigitGroup, aLang)
    END IF
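The range calculation is the part most likely to surprise, so here is a hedged Python sketch of just that step. The function name `split_leading_group` is hypothetical; the arithmetic follows the chunk above: take the base 10 logarithm (natural log divided by the log of 10) of the number plus 0.4, truncate it, round exponents above three down to a multiple of three, then split the number into a leading group and a remainder.

```python
import math

LOG10 = math.log(10)   # natural log of 10, used to convert to base 10


def split_leading_group(n):
    """Split n (>= 100) into (exponent, leading group, remainder).

    The small 0.4 offset guards against a quotient that lands a hair
    below a whole number for exact powers of ten; it is far too small
    to push 999 up to 1000, so the truncated exponent stays correct.
    """
    if n < 100:
        raise ValueError("expected a number of at least 100")
    exponent = int(math.log(n + 0.4) / LOG10)
    if exponent > 3:
        exponent = (exponent // 3) * 3     # thousands, millions, billions
    power = 10 ** exponent
    group = n // power                     # value passed to the first recursive call
    rest = n - group * power               # value passed to the second recursive call
    return exponent, group, rest
```

With 1958 this yields exponent 3, group 1 and remainder 958, which is why the British English result starts "one thousand" and then recurses on 958. An integer-only alternative (repeatedly dividing by ten) would avoid floating point altogether; the logarithm is kept here only because that is what the article uses.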
Finally a default case was added during development to handle cases which hadn't been handled yet. If the code is extended to handle negative or floating point numbers at some time in the future this might come in handy again, so it has been left. At the moment it will catch negative numbers and produce a "reasonable" answer which will at least indicate that there is a problem in the input.
<<other cases>>=
CASE ELSE
    dBuffer = LTRIM$(STR$(aNumber))
Now we have the initialisation code for the function. It basically loads arrays with the vocabulary required for the current language. It also sets the LOG10 constant. This is required because QBASIC's built-in LOG function deals in natural logarithms and we actually need base 10 logarithms to identify the right powers-of-ten word.
Just a word on the use of GOSUB and a label here. Many people recoil in horror from the GOSUB command nowadays with some vague fear that it is the GOTO command in disguise and that therefore its use is "unstructured". In fact it has been removed altogether from the latest incarnation of BASIC, VB.NET and that is a pity. There is no doubt that GOSUB in the wrong hands can be misused badly. However it has at least one legitimate use and that use is the provision of structuring within a function or subroutine where the creation of extra functions or subroutines to carry out that structuring would be overkill. That is how it has been used here. While it could have been replaced altogether in this function, its use makes the code more readable than it would otherwise have been and thus its use is justified.
<<initialisation block>>=
Num2LangInit:
    REDIM dUnits(19), dTens(9), dPowers(9)
    SELECT CASE LEFT$(aLang, 2)
    CASE "fr"
        dUnits(0) = "zero": dUnits(10) = "dix": dTens(0) = "": dPowers(0) = ""
        dUnits(1) = "un": dUnits(11) = "onze": dTens(1) = "": dPowers(1) = ""
        dUnits(2) = "deux": dUnits(12) = "douze": dTens(2) = "vingt": dPowers(2) = " cents"
        dUnits(3) = "trois": dUnits(13) = "treize": dTens(3) = "trente": dPowers(3) = " mille"
        dUnits(4) = "quatre": dUnits(14) = "quatorze": dTens(4) = "quarante": dPowers(4) = ""
        dUnits(5) = "cinq": dUnits(15) = "quinze": dTens(5) = "cinquante": dPowers(5) = ""
        dUnits(6) = "six": dUnits(16) = "seize": dTens(6) = "soixante": dPowers(6) = " millions"
        dUnits(7) = "sept": dUnits(17) = "dix-sept": dTens(7) = "": dPowers(7) = ""
        dUnits(8) = "huit": dUnits(18) = "dix-huit": dTens(8) = "quatre-vingts": dPowers(8) = ""
        dUnits(9) = "neuf": dUnits(19) = "dix-neuf": dTens(9) = "": dPowers(9) = " milliards"
    CASE "en"
        dUnits(0) = "zero": dUnits(10) = "ten": dTens(0) = "": dPowers(0) = ""
        dUnits(1) = "one": dUnits(11) = "eleven": dTens(1) = "": dPowers(1) = ""
        dUnits(2) = "two": dUnits(12) = "twelve": dTens(2) = "twenty": dPowers(2) = " hundred"
        dUnits(3) = "three": dUnits(13) = "thirteen": dTens(3) = "thirty": dPowers(3) = " thousand"
        dUnits(4) = "four": dUnits(14) = "fourteen": dTens(4) = "forty": dPowers(4) = ""
        dUnits(5) = "five": dUnits(15) = "fifteen": dTens(5) = "fifty": dPowers(5) = ""
        dUnits(6) = "six": dUnits(16) = "sixteen": dTens(6) = "sixty": dPowers(6) = " million"
        dUnits(7) = "seven": dUnits(17) = "seventeen": dTens(7) = "seventy": dPowers(7) = ""
        dUnits(8) = "eight": dUnits(18) = "eighteen": dTens(8) = "eighty": dPowers(8) = ""
        dUnits(9) = "nine": dUnits(19) = "nineteen": dTens(9) = "ninety": dPowers(9) = " billion"
    CASE ELSE
        dUnits(0) = "0": dUnits(10) = "0": dTens(0) = "": dPowers(0) = ""
        dUnits(1) = "1": dUnits(11) = "1": dTens(1) = "1": dPowers(1) = ""
        dUnits(2) = "2": dUnits(12) = "2": dTens(2) = "2": dPowers(2) = ""
        dUnits(3) = "3": dUnits(13) = "3": dTens(3) = "3": dPowers(3) = ""
        dUnits(4) = "4": dUnits(14) = "4": dTens(4) = "4": dPowers(4) = ""
        dUnits(5) = "5": dUnits(15) = "5": dTens(5) = "5": dPowers(5) = ""
        dUnits(6) = "6": dUnits(16) = "6": dTens(6) = "6": dPowers(6) = ""
        dUnits(7) = "7": dUnits(17) = "7": dTens(7) = "7": dPowers(7) = ""
        dUnits(8) = "8": dUnits(18) = "8": dTens(8) = "8": dPowers(8) = ""
        dUnits(9) = "9": dUnits(19) = "9": dTens(9) = "9": dPowers(9) = ""
    END SELECT
    SELECT CASE LEFT$(aLang, 5)
    CASE "fr-be"
        dTens(7) = "septante"
        dTens(8) = "octante"
        dTens(9) = "nonante"
    CASE "fr-ch"
        dTens(7) = "septante"
        dTens(8) = "huitante"
        dTens(9) = "nonante"
    CASE ELSE
        REM Do nothing
    END SELECT
    dLog10 = LOG(10)
    RETURN
The next piece of the file is a scaffold which you can use to test the Num2Lang function. When there are so many ways that things can go wrong, it's important to automate the testing process so that the same tests are run every time.
The floating point approximation error discussed above demonstrates the need for comprehensive testing. There was no logical error in the code before the "+ .4" was added to it. Nevertheless the function did not return the correct result when the input value was 100, so the cause had to be identified and a workaround created. Comprehensive unit testing will find this sort of error where logic and code writing skills will not.
Note: the test results are now formatted according to the TAP format (see http://en.wikipedia.org/wiki/Test_Anything_Protocol).
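For readers who have not met TAP before, the shape of the output is simple: a plan line of the form "1..N", then one "ok" or "not ok" line per test. The Python sketch below shows that shape only. The names `tap_report` and `fake_num2lang` are hypothetical, the stand-in converter knows just two words, and the fixed-width padding and on-screen delays of the QBASIC scaffold are left out.

```python
def tap_report(cases, func):
    """Return TAP lines for (lang, number, expected) cases run through func."""
    lines = ["1..%d" % len(cases)]                 # the TAP plan line
    for index, (lang, number, expected) in enumerate(cases, start=1):
        got = func(number, lang)
        status = "ok" if got == expected else "not ok"
        line = "%s %d - %s: %d '%s'" % (status, index, lang, number, got)
        if got != expected:
            line += " (Expected '%s')" % expected  # diagnostic for failures
        lines.append(line)
    return lines


def fake_num2lang(number, lang):
    """Hypothetical stand-in for Num2Lang that only knows 0 and 1."""
    return {0: "zero", 1: "one"}.get(number, str(number))


if __name__ == "__main__":
    demo_cases = [("en-uk", 0, "zero"), ("en-uk", 2, "two")]
    for line in tap_report(demo_cases, fake_num2lang):
        print(line)
```

Because the stand-in does not know the word for 2, the second demo case is reported as "not ok" together with the expected value, which is the same convention the scaffold uses for failures.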
<<unit tests>>=
DECLARE FUNCTION Num2Lang$ (aNumber AS LONG, aLang AS STRING)

DIM mTestCount AS INTEGER
DIM mStatus AS STRING
DIM mTest AS STRING
DIM mLang AS STRING
DIM mNumber AS LONG
DIM mExpected AS STRING
DIM mGot AS STRING
DIM mDelay AS SINGLE
DIM mTimer AS SINGLE

mTestCount = 0
RESTORE TestCases
DO
    READ mTest
    IF mTest = "" THEN
        EXIT DO
    ELSE
        READ mLang, mNumber, mExpected
        mTestCount = mTestCount + 1
    END IF
LOOP

CLS
PRINT "1.." + LTRIM$(STR$(mTestCount))

mTestCount = 0
RESTORE TestCases
mStatus = ""
DO WHILE INKEY$ <> CHR$(27)
    READ mTest
    IF mTest = "" THEN
        EXIT DO
    ELSE
        mTestCount = mTestCount + 1
        READ mLang, mNumber, mExpected
        mGot = Num2Lang$(mNumber, mLang)
        mStatus = "ok" + STR$(mTestCount)
        IF mExpected <> mGot THEN
            mStatus = "not " + mStatus
        END IF
        mStatus = mStatus + " - " + LEFT$(mLang + ": " + LTRIM$(STR$(mNumber)) + SPACE$(15), 15) + "'" + mGot + "'"
        IF mExpected <> mGot THEN
            mStatus = mStatus + " (Expected '" + mExpected + "')"
        END IF
        PRINT mStatus
        IF mExpected = mGot THEN
            mDelay = .2
        ELSE
            mDelay = 2
        END IF
        mTimer = INT(TIMER * 10)
        DO WHILE mDelay > 0
            IF mTimer <> INT(TIMER * 10) THEN
                mTimer = INT(TIMER * 10)
                mDelay = mDelay - .1
            END IF
        LOOP
    END IF
LOOP
SYSTEM

TestCases:
DATA "*","en-uk",0,"zero"
DATA "*","en-uk",1,"one"
DATA "*","en-uk",9,"nine"
DATA "*","en-uk",10,"ten"
DATA "*","en-uk",11,"eleven"
DATA "*","en-uk",19,"nineteen"
DATA "*","en-uk",20,"twenty"
DATA "*","en-uk",21,"twenty-one"
DATA "*","en-uk",100,"one hundred"
DATA "*","en-uk",101,"one hundred and one"
DATA "*","en-us",101,"one hundred one"
DATA "*","en-uk",1000,"one thousand"
DATA "*","en-uk",1001,"one thousand and one"
DATA "*","en-uk",1958,"one thousand nine hundred and fifty-eight"
DATA "*","fr",10,"dix"
DATA "*","fr",11,"onze"
DATA "*","fr",21,"vingt et un"
DATA "*","fr",22,"vingt-deux"
DATA "*","fr",29,"vingt-neuf"
DATA "*","fr",60,"soixante"
DATA "*","fr",61,"soixante et un"
DATA "*","fr",62,"soixante-deux"
DATA "*","fr",70,"soixante-dix"
DATA "*","fr-be",70,"septante"
DATA "*","fr-be",71,"septante et un"
DATA "*","fr",71,"soixante et onze"
DATA "*","fr",79,"soixante-dix-neuf"
DATA "*","fr",80,"quatre-vingts"
DATA "*","fr-be",80,"octante"
DATA "*","fr-ch",80,"huitante"
DATA "*","fr",81,"quatre-vingt-un"
DATA "*","fr-be",81,"octante et un"
DATA "*","fr",82,"quatre-vingt-deux"
DATA "*","fr",90,"quatre-vingt-dix"
DATA "*","fr-be",90,"nonante"
DATA "*","fr",99,"quatre-vingt-dix-neuf"
DATA "*","fr",100,"cent"
DATA "*","fr",101,"cent un"
DATA "*","fr",900,"neuf cents"
DATA "*","fr",999,"neuf cent quatre-vingt-dix-neuf"
DATA "*","fr",1000,"mille"
DATA "*","fr",1100,"mille cent"
DATA "*","fr",100000,"cent mille"
DATA "*","fr",200000,"deux cents mille"
DATA "*","fr",200025,"deux cents mille vingt-cinq"
DATA "*","fr",1000000,"un million"
DATA "*","fr",1000100,"un million cent"
DATA "*","fr",2000000,"deux millions"
DATA "*","fr",2003201,"deux million trois mille deux cent un"
DATA "*","fr",2000000000,"deux milliards"
DATA "*","fr",2000003201,"deux milliard trois mille deux cent un"
DATA "*","fr",300,"trois cents"
DATA "*","fr",301,"trois cent un"
DATA ""
<<NUM2LANG.BAS>>=
<<unit tests>>
<<definition>>