bitwix

Tangential comments about Software Development

Tuesday, June 27, 2006

Regex, Roman Numerals and TDD

I read a blog on Roman Numeral Conversion and decided to do the reverse, roman to integer conversion. I wanted to use Regular Expressions as I thought they would allow a neat treatment. And I wanted to use Test Driven Development (TDD) to ensure that I had written what I thought I had written.

I started with a simple test, and a simple class (in that order). After one and a half iterations, the code was like this. By one and a half iterations, I mean one working test, one failing test.
[TestFixture]
public class RomanToIntTester
{
[Test]
public void TestZero()
{
RomanToInt r2i = new RomanToInt();

Assert.AreEqual(0, r2i.Convert( "" ) );
}
[Test]
public void TestOne()
{
RomanToInt r2i = new RomanToInt();

Assert.AreEqual(1, r2i.Convert("I"));
}
}

public class RomanToInt
{
public RomanToInt()
{
}
public int Convert(string str)
{
return 0;
}
}


I chose not to work through tests for II = 2, III = 3, IV = 4, but instead to write the daddy of all tests, using Excel to do the repetitive coding for me. I set up a 4,000 row spreadsheet, and put the following formulae into the cells on row 4000 -
A4000 =A3999+1
B4000 =ROMAN(A4000,0)
C4000 =CONCATENATE("Assert.AreEqual(",A4000,", r2i.Convert(""",B4000,"""));")

I then copied these into rows 1 to 3999, and then typed 0 into cell A1. Column C was now my test code. I copied it into my test class and had the following -

[Test]
public void TestZeroTo3999()
{
RomanToInt r2i = new RomanToInt();
Assert.AreEqual(0, r2i.Convert(""));
Assert.AreEqual(1, r2i.Convert("I"));
Assert.AreEqual(2, r2i.Convert("II"));
...
Assert.AreEqual(3997, r2i.Convert("MMMCMXCVII"));
Assert.AreEqual(3998, r2i.Convert("MMMCMXCVIII"));
Assert.AreEqual(3999, r2i.Convert("MMMCMXCIX"));
}


I then set about the task. I wrote tests then code to remove say "IV" and replace it with "4". I used a Regex Replace call for this. Then tests and code to replace all two-letter values - IV, IX, XL, XC, CD and CM - with a single digit or a letter not otherwise used - "F" for "XL" (forty), "N" for "XC" (ninety), "A" and "B" for "CD" and "CM".
    Assert.AreEqual("9", r2i.ReplaceAll("IX"));
Assert.AreEqual("MBN9", r2i.ReplaceAll("MCMXCIX"));


I wrote tests and a function to count the number of occurences of a character and return that count multiplied by a base value.
    Assert.AreEqual(3, r2i.RepeatCountAndMultiply("XXXVIII", "I", 1));
Assert.AreEqual(5, r2is.RepeatCountAndMultiply("XXXVIII", "V", 5));
Assert.AreEqual(30, r2i.RepeatCountAndMultiply("XXXVIII", "X", 10));

All that remained was to implement the Convert function in full, as a call to ReplaceAll followed by incrementing a value with successive calls to RepeatCountAndMultiply. And guess what? The TestZeroTo3999 test passed first time. Does the smugness come through in the tone of this post? I quickly
refactored the class and tests to make most of the functions protected, and therefore deriving the tester class from the implementation class (is this a good idea?) and I considered myself done.

Here's the conversion class in full.
namespace MusingsLib
{
public class RomanToInt
{
public RomanToInt()
{
}
public int Convert(string str)
{
int retval = 0;
str = ReplaceAll(str);
retval += RepeatCountAndMultiply(str, "I", 1);
retval += RepeatCountAndMultiply(str, "4", 4);
retval += RepeatCountAndMultiply(str, "V", 5);
retval += RepeatCountAndMultiply(str, "9", 9);
retval += RepeatCountAndMultiply(str, "X", 10);
retval += RepeatCountAndMultiply(str, "F", 40);
retval += RepeatCountAndMultiply(str, "L", 50);
retval += RepeatCountAndMultiply(str, "N", 90);
retval += RepeatCountAndMultiply(str, "C", 100);
retval += RepeatCountAndMultiply(str, "A", 400);
retval += RepeatCountAndMultiply(str, "D", 500);
retval += RepeatCountAndMultiply(str, "B", 900);
retval += RepeatCountAndMultiply(str, "M", 1000);
return retval;
}

protected string ReplaceAll(string str)
{
str = ReplaceRomanWithAlternative(str, "IV", "4");
str = ReplaceRomanWithAlternative(str, "IX", "9");
str = ReplaceRomanWithAlternative(str, "XL", "F");
str = ReplaceRomanWithAlternative(str, "XC", "N");
str = ReplaceRomanWithAlternative(str, "CD", "A");
str = ReplaceRomanWithAlternative(str, "CM", "B");
return str;
}

protected int RepeatCountAndMultiply(string inputstr, string roman, int baseval)
{
Regex r = new Regex( roman );
return r.Matches(inputstr).Count * baseval;
}

protected string ReplaceRomanWithAlternative(string inputstr, string roman, string alt)
{
Regex r = new Regex( roman );
return r.Replace(inputstr, alt);
}
}
}


Reference : www.differentpla.net/node/58