Property based testing

I had a delightful experience with property based testing this morning! Property based testing is a form of unit testing where instead of manually writing the tests based on examples, we generate the tests based on some property. This started at the end of the last century with QuickCheck, a Haskell program by Koen Claessen and John Hughes. It's taken a while to catch on, but these days you can get some sort of QuickCheck library for just about every programming language there is.

This morning, I was doing the Clojure Challenge at the end of the PurelyFunctional.tv Newsletter:

This week’s challenge

Paul Cipher

Here’s an interesting cipher.

Treat all letters as uppercase, and convert them to uppercase if needed. The first alphabetical character of the string will not change. All subsequent alphabetical characters are shifted toward Z by the alphabetical position of the preceding alphabetical character. Non-alphabetical characters are left as-is.

Your task is to write an encoder and decoder for this cipher

Examples

(encode "") ;=> ""
(encode "a") ;=> "A"
(encode "hello") ;=> "HMQXA"
(encode "newsletter") ;=> "NSBPEQYNYW"
(encode "1 hug") ;=> "1 HCB"

(decode "") ;=> ""
(decode "1") ;=> "1"
(decode "HMQXA") ;=> "HELLO"

Note that you should always be able to decode a string that was encoded and get back the original string uppercased.

I love Clojure, but I decided to try it in Rust. We could start by writing those examples as example based unit tests.

  #[cfg(test)]
  mod tests {
      use super::*;

      #[test]
      fn test_encode() {
          assert_eq!(encode(""), "");
          assert_eq!(encode("a"), "A");
          assert_eq!(encode("hello"), "HMQXA");
          assert_eq!(encode("newsletter"), "NSBPEQYNYW");
          assert_eq!(encode("1 hug"), "1 HCB");
        
      }

      #[test]
      fn test_decode() {
          assert_eq!(decode(""), "");
          assert_eq!(decode("1"), "1");
          assert_eq!(decode("HMQXA"), "HELLO");
      }
  }

Now write encode and decode routines:

  use std::char;

  fn encode(s: &str) -> String {
      let mut prev = 0;

      s.to_ascii_uppercase()
          .chars()
          .map(|ch| {
              if ch.is_ascii_uppercase() {
                  let curr = 1 + ch as u32 - 'A' as u32;
                  let succ = 1 + (26 + curr + prev - 1) % 26;
                  prev = curr;
                  char::from_u32(succ - 1 + 'A' as u32).unwrap()
              } else {
                  ch
              }
          })
          .collect()
  }

  fn decode(s: &str) -> String {
      let mut prev = 0;

      s.chars()
          .map(|ch| {
              if ch.is_ascii_uppercase() {
                  let curr = 1 + ch as u32 - 'A' as u32;
                  let succ = (26 + curr - prev) % 26;
                  prev = succ;
                  char::from_u32(succ - 1 + 'A' as u32).unwrap()
              } else {
                  ch
              }
          })
          .collect()
  }

These pass all the tests…hooray! Ship it!

But what if there aren't enough tests? Instead of thinking of more examples, property based testing generates examples based on a property. Often, the hard part about property based testing is coming up with a useful property. It's not always obvious what a property of the thing you want to test is. But the end of the problem description practically cries out for property based testing!

Note that you should always be able to decode a string that was encoded and get back the original string uppercased.

This is a property! Here it is written with Rust's proptest library:

  use proptest::prelude::*;
  proptest! {
      #[test]
      fn encode_then_decode(s in r"\PC*") {
          prop_assert_eq!(decode(&encode(&s)), s.to_ascii_uppercase());
      }
  }

When we add that to our lib.rs and run cargo test again, we get a failure. Moreover, we get a minimized version of the failing test case.

thread 'encode_then_decode' panicked at 'Test failed: attempt to subtract with overflow;
minimal failing input: s = "z"

Ah, a pesky fencepost error! This code works correctly for "a" through "y," but fails for "z". It's supposed to be 26, but it's 0. If we change this line

let succ = (26 + curr - prev) % 26;

in that decode routine to

let succ = 1 + (26 + curr - prev - 1) % 26;

then all should be well.

Now all the tests pass! But we're not done. We should also record this example by adding it to our unit tests.

          assert_eq!(decode("Z"), "Z");

Here is the final lib.rs

  use std::char;

  fn encode(s: &str) -> String {
      let mut prev = 0;

      s.to_ascii_uppercase()
          .chars()
          .map(|ch| {
              if ch.is_ascii_uppercase() {
                  let curr = 1 + ch as u32 - 'A' as u32;
                  let succ = 1 + (26 + curr + prev - 1) % 26;
                  prev = curr;
                  char::from_u32(succ - 1 + 'A' as u32).unwrap()
              } else {
                  ch
              }
          })
          .collect()
  }

  fn decode(s: &str) -> String {
      let mut prev = 0;

      s.chars()
          .map(|ch| {
              if ch.is_ascii_uppercase() {
                  let curr = 1 + ch as u32 - 'A' as u32;
                  let succ = 1 + (26 + curr - prev - 1) % 26;
                  prev = succ;
                  char::from_u32(succ - 1 + 'A' as u32).unwrap()
              } else {
                  ch
              }
          })
          .collect()
  }

  #[cfg(test)]
  mod tests {
      use super::*;
      #[test]
      fn test_encode() {
          assert_eq!(encode(""), "");
          assert_eq!(encode("a"), "A");
          assert_eq!(encode("hello"), "HMQXA");
          assert_eq!(encode("newsletter"), "NSBPEQYNYW");
          assert_eq!(encode("1 hug"), "1 HCB");
      }
      #[test]
      fn test_decode() {
          assert_eq!(decode(""), "");
          assert_eq!(decode("1"), "1");
          assert_eq!(decode("HMQXA"), "HELLO");

          // proptest found this one!
          assert_eq!(decode("Z"), "Z");
      }
  }

  use proptest::prelude::*;
  proptest! {
      #[test]
      fn encode_then_decode(s in r"\PC*") {
          prop_assert_eq!(decode(&encode(&s)), s.to_ascii_uppercase());
      }
  }

I guess the moral of the story is, if you've got an obvious property staring you in the face, write a property test for it!