Memoization is a program optimisation technique that stores
the results of a computationally expensive function in a cache and then returns
values from that cache if the function is called again with the same input(s).
Wikipedia has
the term coined by (a personal hero) Donald Michie in 1968 although I can’t
find the word in the original paper (dated 1967) [link: https://saltworks.stanford.edu/assets/kr445kk4694.pdf ] so it probably came
into being as the idea was applied.
I had previously used the technique in JavaScript but that
was a while ago and it took me a few moments to even remember the precise term –
let alone work out how to do the trick in C#.
The archetypal example of memoization in JavaScript is a
function to calculate Fibonacci numbers and I found this one on http://stackoverflow.com/
var fibonacci = (function () {
var memo = [0, 1];
var fib = function (n) {
var result = memo[n];
if (typeof result !== 'number') {
result = fib(n - 1) + fib(n - 2);
memo[n] = result;
}
return result;
};
return fib;
}());
The var fibonacci contains a function that can be used to
calculate a result. It will return a cached value for any previously requested
calculation and will recursively call itself to calculate a new value – which will
often be returned from the cache (well certainly if a sequence of values are
requested).
The cache takes the form of an array (memo[]) and this is wrapped
in a “closure” so that it remains accessible to the inner function fib().
Closure, recursion and memorization in one function – has all the dangerous signs of an
interview question.
I wanted a function in C# to look-up a subset of values
previously stored in a database (in this instance identity integers) with a
high likelihood that individual values would be repeatedly requested over the
run of the process but without being able to sort the requests in any
meaningful way. So assuming that database reads are relatively slow, memorization
looked like the technique to try.
My requirement is a simple case – I have a function into
which I pass a string as a key and the function does a database lookup and
retunes a long integer. So let us start there
private long lookUpDimension(string dkey)
{
long idVal = 0;
try
{
// code to access the databse and do the lookup
}
return idVal;
}
Just a simple function (well after all the database access
code is removed for clarity). Point being – it is just an ordinary function –
takes a single argument and returns the output.
The class containing that look-up function also has two
static functions to work some magic.
private static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> function)
{
return Memoize(function, new Dictionary<TArg, TResult>());
}
private static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> function, IDictionary<TArg, TResult> cache)
{
return key => cache.ContainsKey(key) ? cache[key] : (cache[key] = function(key));
}
So – what is this Func<TArg, TResult> stuff?
Func is a predefined Delegate, this one takes a single
argument in and outputs a single result. There are more defined (up to 8
inputs) and we could express the second one (2 inputs) thus:
public delegate TResult Func<in T1, in T2, out TResult>(
T1 arg1,
T2 arg2
)
But Func<T1, T2, TResult> is syntactic sugar of the
right sort.
My code needed to declare something like this
private Func<string, long> memLookup = null;
which declares a delegate instance with a string input
and a long integer output
and then the following code needs to be executed:
memLookup = Memoize<string, long>(lookUpDimension);
sometime when the class is being created perhaps or
certainly before the memorized function is required. The class then exposes a
method like:
public long LookupDimValue(string dkey)
{
return memLookup(dkey);
}
Because memLookup() now “contains” a memoized version of the original lookUpDimension() function taking a string input and returning a long integer.
One note of caution about the sort of memorized function I
have used in this example. If the function fails to locate a database record
corresponding to the given key then it returns a value indicating that (a zero).
One option might be to then insert a new record but the memorized function will
“remember” that the key is not valid and continue to return the original
response. The insert would therefore have to be managed from within the memorized
function and the relevant value for the new record returned.
Just a thought (added 11/10/15)
What is the simplest C# code to match the JavaScript function we started with?
If you are wondering if memoization (at least in some form) is an effective technique then try writing a fibonacci function without and give it a run - as the input n increases the response time can be considerable.
And more (added 13/10/2015)
C#6 now allows a different form for initialising a Dictionary
which is bound to be useful when the assignment can add value (just a thought).
Just a thought (added 11/10/15)
What is the simplest C# code to match the JavaScript function we started with?
Dictionary<int, long> memo = new Dictionary<int, long>() { { 0, 0 }, { 1, 1 } };
private long fib(int n)
{
long result;
if (!memo.TryGetValue(n, out result))
{
result = fib(n - 1) + fib(n - 2);
memo.Add(n, result);
}
return result;
}
Fewer lines but less "functional" as it makes use of a Dictionary object that is external to the function. Memoization would often be a better approach.If you are wondering if memoization (at least in some form) is an effective technique then try writing a fibonacci function without and give it a run - as the input n increases the response time can be considerable.
And more (added 13/10/2015)
C#6 now allows a different form for initialising a Dictionary
Dictionary<int, long> memo = new Dictionary<int, long>
{
[0] = 0,
[1] = 1
};
which is bound to be useful when the assignment can add value (just a thought).
No comments:
Post a Comment