09. October 2013

A Lattice class

Suppose you need a 7-dimensional lattice data structure. An array is great for one-dimensional data, but once you add dimensions things quickly become difficult to manage. Here’s a data structure that takes away some of that headache:
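As a rough illustration of the idea (not the class from the full post), an N-dimensional lattice can be kept in one flat list, with a stride per axis turning a coordinate tuple into a single offset. The names `Lattice`, `shape`, `strides`, and `cells` are assumptions made for this sketch.

```python
class Lattice:
    """N-dimensional grid stored in one flat list (row-major order).

    Hypothetical sketch: names and layout are illustrative assumptions.
    """

    def __init__(self, shape, fill=None):
        self.shape = tuple(shape)
        # strides[i] = how many flat slots one step along axis i skips
        self.strides = []
        step = 1
        for size in reversed(self.shape):
            self.strides.insert(0, step)
            step *= size
        # after the loop, step equals the total number of cells
        self.cells = [fill] * step

    def _offset(self, coords):
        if len(coords) != len(self.shape):
            raise IndexError("expected %d coordinates" % len(self.shape))
        offset = 0
        for c, size, stride in zip(coords, self.shape, self.strides):
            if not 0 <= c < size:
                raise IndexError("coordinate %r out of range" % (c,))
            offset += c * stride
        return offset

    def __getitem__(self, coords):
        return self.cells[self._offset(coords)]

    def __setitem__(self, coords, value):
        self.cells[self._offset(coords)] = value


# Usage: a 7-dimensional lattice with two cells along each axis
grid = Lattice((2,) * 7, fill=0)
grid[1, 0, 1, 0, 1, 0, 1] = 42
```

Storing everything in one list keeps the bookkeeping in a single place: adding a dimension only changes the shape tuple, not the nesting depth of the code that reads and writes cells.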

more

27. September 2013

Using Shared Strings to Reduce Memory Usage

As of Excel 2007, files are saved in the Open XML format. This format consists of a group of XML files and assets, which are zipped up and given the .xlsx extension. It’s a lot more readable from other programs than an old-fashioned .xls file.

One technique used to reduce the file size is a shared strings table. Each string stored in a spreadsheet is given a numeric index, and that index is stored in the XML file instead of the string itself. In general, if a string is reused frequently, the overhead of the shared string map is paid off by the savings from storing only string indices.
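The indexing idea can be sketched in a few lines. This is only an in-memory illustration of the mapping, not the Open XML file layout; the `SharedStrings` class and its `intern` and `lookup` methods are hypothetical names.

```python
class SharedStrings:
    """Maps each distinct string to a small integer index.

    Hypothetical sketch of the concept; it does not read or write .xlsx files.
    """

    def __init__(self):
        self.strings = []    # index -> string
        self.index_of = {}   # string -> index

    def intern(self, text):
        """Return the index for text, adding it to the table if it is new."""
        idx = self.index_of.get(text)
        if idx is None:
            idx = len(self.strings)
            self.strings.append(text)
            self.index_of[text] = idx
        return idx

    def lookup(self, idx):
        """Return the string stored at a given index."""
        return self.strings[idx]


# Usage: cells hold indices, and each distinct string is stored once
table = SharedStrings()
cells = [table.intern(s) for s in ["yes", "no", "yes", "yes"]]
```

Here four cells end up sharing two stored strings. If every string were unique, the table would add overhead rather than save space, which is the trade-off described above.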

more

09. September 2013

A timer class

It’s often useful to time parts of your applications. In environments without access to profiling tools like xdebug, it is necessary to roll your own. Here’s one that relies heavily on calls to microtime. Unfortunately, making many thousands of calls to microtime takes a significant amount of time on its own.
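A minimal sketch of such a timer follows, written in Python and using `time.perf_counter` in place of PHP's microtime. The `Timer` class and its `start`, `stop`, and `totals` names are assumptions for illustration, not the class from the full post.

```python
import time


class Timer:
    """Accumulates elapsed wall-clock time per label.

    Hypothetical sketch; perf_counter stands in for PHP's microtime.
    """

    def __init__(self):
        self.totals = {}     # label -> total seconds measured so far
        self._started = {}   # label -> clock reading when start() was called

    def start(self, label):
        self._started[label] = time.perf_counter()

    def stop(self, label):
        # pop() raises KeyError if stop() is called without a matching start()
        elapsed = time.perf_counter() - self._started.pop(label)
        self.totals[label] = self.totals.get(label, 0.0) + elapsed
        return elapsed


# Usage: time one section of code
timer = Timer()
timer.start("sum")
total = sum(range(100000))
seconds = timer.stop("sum")
```

Each start and stop pair costs two clock reads, so wrapping a very hot inner loop this way adds measurable overhead of its own, as noted above.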

more

16. August 2013

A Bloom Filter in C#

A Bloom filter is a probabilistic data structure for checking whether a given entry does not occur in a set. It is meant to be quite fast, and it is used to skip costly queries when it can be determined that no results will be returned. E.g., if you could turn this:

costly_lookup(key)

Into this:

if (!cheap_check_that_key_isnt_there(key)) {
    costly_lookup(key)
}

Then that’s a win.
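To make the cheap check concrete, here is a small sketch in Python rather than C#. The class name, the default sizes, and the choice of seeded SHA-256 digests as hash functions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Bit array plus several hash functions.

    Hypothetical sketch: a False answer is definite, a True answer only
    means the key may be present.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by prefixing the key with a seed
        for seed in range(self.num_hashes):
            data = ("%d:%s" % (seed, key)).encode("utf-8")
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # Any unset bit proves the key was never added
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Usage: skip the costly lookup when the filter rules the key out
seen = BloomFilter()
seen.add("alice")
if seen.might_contain("alice"):
    pass  # the costly lookup would go here
```

A filter like this can report a key as possibly present when it was never added, but it never reports an added key as absent, so using it as a guard only ever skips lookups that would have returned nothing.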

more