martes, 25 de septiembre de 2018

A Fast Functional Algorithm to Generate Random Latin Squares

Introduction


In this blog post, I will describe briefly an algorithm I have developed in the Scala language, which is useful to generate Random Latin Squares (LSs). In fact, it is the functional, stateless and recursive version of another Java algorithm that I had done as part of my master thesis, back in 2015.
 I called this algorithm "the SeqGen (sequential-generation type) with a replacement chain conflict resolution strategy" or more succinctly "SeqGen with replacement chain", or just "the replacement chain algorithm". Its main advantage is that is faster than (by a factor of around 2) the Jacobson & Matthews' (J&M) algorithm, the most widely known and accepted algorithm in the area. On the other hand, its main disadvantage is that it is not uniformly distributed (as the J&M's), but its distribution is acceptable under certain conditions, as it is approximately uniform.


LS Definition


A Latin Square (LS) of order n, is a matrix of n x n symbols (to simplify just numbers from 1 to n), in which every symbol appears exactly once in each row and column. For example:

1 2 4 3 
2 4 3 1
4 3 1 2
3 1 2 4

is a random LS of order 4.

LSs are used for many applications, for example for games, combinatorics and some cryptographic applications.

A remarkable property is that L(n), the number of different LSs of order n, is only known when n<=11. For larger orders, the number L(n) is unknown up to now, and it is so high that only upper and lower bounds can be given:



The "random"-LS part means that the numbers are "mixed" as much as possible along the matrix as if they were a face of a dice in each position. An algorithm is "uniformly distributed" if every possible LS has the same probability of being generated by the algorithm.


The Scala algorithm


We will explain a generator class, where the algorithm is divided into different methods. 

We start with an empty matrix, a matrix full of 0s or nulls, as you prefer. We begin by row with index 0, extracting a random number one at a time. When we finish the row 0, we can continue with row 1, in order. As the matrix is immutable, we need a function that takes a partially completed LS with n rows filled and a row index and returns a new LS with n+1 rows completed. There is a theorem that says that this can always be done, row by row, up to the last row.

How can we model a computation like this in FP? This was the first "eureka" moment: using the foldlLeft function and the empty matrix as the base element. This is the "main" method to generate an LS:


override def generateLS(): LatinSquare[Int] = {
  val ls0 = VectorLatinSquare.getEmptyMatrix(order)

  val rows = (0 to (order - 1)).toList
  rows.foldLeft (ls0) { (lsi, i) => this.generateRow(i, lsi) }
}


With the 0-th index, beginning with the empty matrix ls0, we generate row 0. Similarly, with the index i-th, and an LS filled up to the row i-1, we generate row i.

The getEmptyMatrix auxiliary method is simply:

def getEmptyMatrix(order: Int) = {
  val vector = Vector.tabulate(order, order)
                       { (i, j) => RandomUtils.getNullElem() }
  new VectorLatinSquare(vector)

}

Where we generate a Vector full of a special element called "null" element. Then we create and return a vector-based implementation of an LS.

We continue with the generateRow method:

def generateRow(i: Int, partialLatinSquare: VectorLatinSquare[Int]): VectorLatinSquare[Int] = {
  val emptyRow = Vector()
  val row = this.generateRowWithPartialRow(i, 0, partialLatinSquare, emptyRow)
  partialLatinSquare.setRow(i, row)

}

This method is just the wrapper and initialization for a recursive method:

protected def generateRowWithPartialRow(i: Int, j: Int, partialLS: ImmutableLatinSquare[Int], partialRow: Vector[Int]): Vector[Int] = {

  if (partialRow.size == partialLS.order) {
    partialRow
  } else {
    val elem = this.generateElem(i, j, partialLS, partialRow)
    if (!partialRow.contains(elem)) {
      this.generateRowWithPartialRow(i, j + 1, partialLS, partialRow :+ elem)
    } else { //conflict: elem is in partialRow
      val emptyPath = Vector()
      val partialRowWithElem = partialRow :+ elem
      val availInRow = partialLS.allNumsVector.diff(partialRow)
      val idxOfFirstElem = partialRow.indexOf(elem) //search for the root of the problem
      val newPartialRow = ControlStructures.retryUntilSucceed(
                            this.makeRoomForElem(elem, idxOfFirstElem, partialLS, partialRowWithElem, availInRow, emptyPath)
                               )
      //now elem fits in position (i,j), continue with column (j + 1)
      this.generateRowWithPartialRow(i, j + 1, partialLS, newPartialRow)
    }//conflict
  }//not the last elem

}//method

Here we have a lot to explain. But let us explain by cases. The first condition checks if the row is completed (the base case), that is, if its length is equal to the order of the LS. If this condition holds, we return the completely generated row.

Then, if the row is incomplete, we try to generate the element in position (i,j) randomly, possibly causing a conflict (repetition) in that position with previously generated elements in the same row. If a conflict is created (there is not another possibility in that position) we trigger the resolution strategy called "replacement chain". Basically, we make room for the element elem, taking into account, that this chain of replacements can fail. So, we repeat the procedure until we can accommodate the element elem without repetitions. In that case, we can continue the generation of the element in position (i, j + 1) recursively.

We highlight here the creation of a new control structure retryUntilSucceed that repeats an operation in case of an exception, extending in a way the Scala language:

import util._

object ControlStructures {

  def retryUntilSucceed[T](fn: => T): T = {
    util.Try { fn } match {
      case Success(x) => x
      case Failure(e) => retryUntilSucceed(fn)
    }
  }
... }

Just for the sake of completeness, we list here the generateElem method, using set operations like diff (for set subtract), intersect (for set intersection). It returns a new random element that can be positioned at the position (i, j), using the java.SecureRandom class, that is secure for cryptographic applications:

private def generateElem(i: Int, j: Int, partialLS: ImmutableLatinSquare[Int], partialRow: Vector[Int]): Int = {

  val availInRow = partialLS.allNumsVector.diff(partialRow)
  val availInCol = partialLS.availInCol(j)
  val availInPos = availInRow.intersect(availInCol)
  if (availInPos.isEmpty) {
    RandomUtils.randomChoice(availInCol) //there is a conflict here, allow a repetition in row but never in column j
  } else {
    RandomUtils.randomChoice(availInPos)
  }

}

Last, here is the algorithm that makes the replacement chain, that is, the sequence of replacements to make room for the selected element:

private def makeRoomForElem(elem: Int, prevIdx: Int, partialLS: ImmutableLatinSquare[Int], currentRow: Vector[Int], availInRow: Vector[Int], path: Vector[Int]): Vector[Int] = {

  //calculate available to choose at idxOld
  val availInCol = partialLS.availInCol(prevIdx)
  val available = (availInCol).diff(path).diff(Vector(elem)) //must be in available in col but not in path, and must not use elem
  
  if (available.isEmpty) {
    // Path no good, begin again
    throw new Exception("No good path... failure")
  }

  //choose a new element for inxOld position
  val newElem = RandomUtils.randomChoice(available);
  val idxNew = currentRow.indexOf(newElem); //index of this newElem before replacement because it will be repeated
  //make the replacement:
  val newRow = currentRow.updated(prevIdx, newElem) //replace and generate a repetition of elem newElem
  //store in path
  val newPath: Vector[Int] = path :+ newElem
  //remove from available in row
  val newAvailInRow = availInRow.diff(Vector(newElem))
  if (idxNew == -1) { //elem is now available in row and there are no repetitions
    return newRow
  } else {
    return this.makeRoomForElem(elem, idxNew, partialLS, newRow, newAvailInRow, newPath)
  }

}

In each recursive call, we take the previously generated and repeated element, and replace it by some other possible element in that position, taking care of not repeating elements in the column of that element. The base case of the recursion is when the updated row does not contain the newElem element, and the generation can proceed from that point.


Conclusion


We have seen a complete algorithm to generate Random LSs. The algorithm is very efficient and behaves better than the most widely known algorithm of J&M's. Although the Java iterative version of the replacement chain algorithm is more efficient, the Scala recursive version is more clear and less error-prone, composable, more robust and easier to test.

Ignacio Gallego Sagastume
Blue Montag Software
@igallegosagas