`Math::Roman` using Unicode characters

#!/usr/bin/perl
#
# Tests on Roman numbers modules
# Test on Unicode characters and the W3C specification

use strict;
use warnings;
use Math::Roman;
use Convert::Number::Roman;
# U+2160 = 0o20540 = 8544, U+2170 = 0o20560 = 8560
Math::Roman::tokens( "\x{2160}",  1, "\x{2161}",  2, "\x{2162}",  3, "\x{2163}",  4,
                     "\x{2164}",  5, "\x{2165}",  6, "\x{2166}",  7, "\x{2167}",  8, "\x{2168}",  9,
                     "\x{2169}", 10, "\x{216A}", 11, "\x{216B}", 12, "\x{2169}\x{2162}", 13,
                     "\x{2169}\x{2163}", 14, "\x{2169}\x{2164}", 15, "\x{2169}\x{2165}", 16,
                     "\x{2169}\x{2166}", 17, "\x{2169}\x{2167}", 18, "\x{2169}\x{2168}", 19,
                     "\x{2169}\x{2169}",  20, "\x{2169}\x{216A}", 21, "\x{2169}\x{216B}", 22,
                     "\x{2169}\x{2169}\x{2162}", 23,
                     "\x{2169}\x{2169}\x{2163}", 24,
                     "\x{2169}\x{2169}\x{2164}", 25,
                     "\x{2169}\x{2169}\x{2165}", 26,
                     "\x{2169}\x{2169}\x{2166}", 27,
                     "\x{2169}\x{2169}\x{2167}", 28,
                     "\x{2169}\x{2169}\x{2168}", 29,
                     "\x{2169}\x{2169}\x{2169}", 30,
                     "\x{2169}\x{2169}\x{216A}", 31,
                     "\x{2169}\x{2169}\x{216B}", 32,
                     "\x{2169}\x{2169}\x{2169}\x{2162}", 33,
                     "\x{2169}\x{2169}\x{2169}\x{2163}", 34,
                     "\x{2169}\x{2169}\x{2169}\x{2164}", 35,
                     "\x{2169}\x{2169}\x{2169}\x{2165}", 36,
                     "\x{2169}\x{2169}\x{2169}\x{2166}", 37,
                     "\x{2169}\x{2169}\x{2169}\x{2167}", 38,
                     "\x{2169}\x{2169}\x{2169}\x{2168}", 39,
                     "\x{2169}\x{216C}",  40, "\x{216C}",  50, "\x{2169}\x{216D}",  90, "\x{216D}",  100,
                     "\x{216D}\x{216E}", 400, "\x{216E}", 500, "\x{216D}\x{216F}", 900, "\x{216F}", 1000);

for my $n (1..3999) {
  my $cnr = Convert::Number::Roman->new($n);
  my $mr  = Math::Roman->new($n);
  if ($cnr->convert ne $mr) {
    print "Error 1 for $n\n";
  }
  if ($mr->as_number != $n) {
    print "Error 2 for $n\n";
  }

}

For the argument to tokens, the initial idea was to list only the single-character Unicode numerals from "I" to "XII", with their values 1 to 12. It did not work: 13 was converted to \x{216B}\x{2160}, that is, "XII" + "I", instead of \x{2169}\x{2162}, that is, "X" + "III". So I had to add an explicit pair for 13. Then the same problem arose with 14, then 15, and so on, up to 39. Only from 40 onwards could I at last leave some holes in the table.
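The "XII" + "I" result is what a greedy conversion would produce: at each step, take the token with the largest value that still fits in the remainder. The sketch below illustrates this under that assumption. It does not call Math::Roman, and greedy_roman and code_points are hypothetical helper names introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Math::Roman: build a numeral by always
# taking the token with the largest value that still fits in the remainder.
sub greedy_roman {
    my ($n, %value_of) = @_;
    # Tokens sorted by decreasing value
    my @tokens = sort { $value_of{$b} <=> $value_of{$a} } keys %value_of;
    my $result = '';
    for my $token (@tokens) {
        while ($n >= $value_of{$token}) {
            $result .= $token;
            $n      -= $value_of{$token};
        }
    }
    return $result;
}

# Hypothetical helper: show a string as a list of code points,
# so that the output stays plain ASCII.
sub code_points {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $str;
}

# Only the single-character tokens "I" to "XII" (U+2160 to U+216B)
my %one_to_twelve = map { ( chr(0x215F + $_), $_ ) } 1 .. 12;

# 12 fits first, then 1 remains: "XII" + "I"
my $thirteen = greedy_roman(13, %one_to_twelve);
print code_points($thirteen), "\n";   # U+216B U+2160

# With an explicit token for 13, the result is "X" + "III"
my $fixed = greedy_roman(13, %one_to_twelve, "\x{2169}\x{2162}" => 13);
print code_points($fixed), "\n";      # U+2169 U+2162
```

The same reasoning applies to every value from 14 to 39, which is why each of them needs its own entry as long as "XI" and "XII" are in the table.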

For the sake of comparison, here is a script in which I decided not to use \x{216A} for "XI" and \x{216B} for "XII". As in the first script, the output of Math::Roman is compared with the output of Convert::Number::Roman, but only after "XI" has been replaced by "X" + "I" and "XII" by "X" + "II" in the latter. As you can see, the token table is much simpler and much more concise.

#!/usr/bin/perl
#
# Tests on Roman numbers modules
# Test on Unicode characters and the W3C specification, while ignoring the characters for XI and XII

use strict;
use warnings;
use Math::Roman;
use Convert::Number::Roman;
# U+2160 = 0o20540 = 8544, U+2170 = 0o20560 = 8560
Math::Roman::tokens( "\x{2160}",  1, "\x{2161}",  2, "\x{2162}",  3, "\x{2163}",  4,
                     "\x{2164}",  5, "\x{2165}",  6, "\x{2166}",  7, "\x{2167}",  8, "\x{2168}",  9,
                     "\x{2169}",  10,
                     "\x{2169}\x{216C}",  40, "\x{216C}",  50, "\x{2169}\x{216D}",  90, "\x{216D}",  100,
                     "\x{216D}\x{216E}", 400, "\x{216E}", 500, "\x{216D}\x{216F}", 900, "\x{216F}", 1000);

for my $n (1..3999) {
  my $cnr = Convert::Number::Roman->new($n);
  my $cnr_str = $cnr->convert;
  $cnr_str =~ s/\x{216A}/\x{2169}\x{2160}/g;
  $cnr_str =~ s/\x{216B}/\x{2169}\x{2161}/g;
  my $mr  = Math::Roman->new($n);
  if ($cnr_str ne $mr) {
    print "Error 1 for $n\n";
  }
  if ($mr->as_number != $n) {
    print "Error 2 for $n\n";
  }

}
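The two substitutions in this script can be isolated in a small function, which makes the normalization step easier to check on its own. This is only a sketch: decompose_xi_xii is a hypothetical name, and the sample input is the 21 entry from the token table of the first script.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite the single-character "XI" and "XII"
# as "X" + "I" and "X" + "II", leaving everything else untouched.
sub decompose_xi_xii {
    my ($str) = @_;
    $str =~ s/\x{216A}/\x{2169}\x{2160}/g;   # XI  -> X + I
    $str =~ s/\x{216B}/\x{2169}\x{2161}/g;   # XII -> X + II
    return $str;
}

# 21 written as "X" + "XI" becomes "X" + "X" + "I"
my $twenty_one = decompose_xi_xii("\x{2169}\x{216A}");
```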

Note that these two scripts can read and produce all valid numbers from 1 to 3999. But they do not catch invalid numbers. This is left as a (painful) exercise to the reader.
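As a possible starting point for that exercise, here is a minimal sketch of a validator. It makes several assumptions: it only covers the decomposed notation of the second script (no \x{216A} or \x{216B}), it only accepts values from 1 to 3999, and is_valid_roman is a hypothetical function, not something provided by Math::Roman or Convert::Number::Roman.

```perl
use strict;
use warnings;

# Hypothetical validator for the decomposed notation (no XI, no XII).
# Each group follows the usual additive and subtractive rules.
my $VALID_ROMAN = qr/
    \A
    (?= . )                    # reject the empty string
    \x{216F}{0,3}              # thousands: up to three M
    (?: \x{216D}\x{216F}       # hundreds: CM,
      | \x{216D}\x{216E}       #   or CD,
      | \x{216E}? \x{216D}{0,3} )   #   or optional D and up to three C
    (?: \x{2169}\x{216D}       # tens: XC,
      | \x{2169}\x{216C}       #   or XL,
      | \x{216C}? \x{2169}{0,3} )   #   or optional L and up to three X
    [\x{2160}-\x{2168}]?       # units: one character from I to IX
    \z
/x;

sub is_valid_roman {
    my ($str) = @_;
    return $str =~ $VALID_ROMAN ? 1 : 0;
}

my $ok_14      = is_valid_roman("\x{2169}\x{2163}");                   # X + IV
my $bad_order  = is_valid_roman("\x{2163}\x{2169}");                   # IV + X
my $bad_repeat = is_valid_roman("\x{2169}\x{2169}\x{2169}\x{2169}");   # four X
```

Supporting the single-character "XI" and "XII" as well would need extra alternatives in the tens and units groups, which is probably where the painful part begins.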

© 2011 Jean Forget and Les Mongueurs de Perl. The code in this file is free software; you can redistribute it or modify it under the same terms as Perl 5.10.0. The text is under a Creative Commons license, with attribution and no derivative works.
