Locality of Reference

In my studies of Computer Science I learned about the principle of locality which might be better called locality of reference. Two basic kinds of locality are temporal locality (locality in time) and spacial locality (locality in space). The theory basically says that similar things will group together. We see the same principle many places in daily life and other disciplines.

Some examples of locality include:

  • people speaking the same language tend to group together,
  • wheat fields being on land close together,
  • forests having many of the same species of tree,
  • minerals like gold being in high concentrations in certain areas,

In data compression, locality is important to reduce context windows to be small enough to fit in memory, to reduce context windows to reduce processing. A context is a kind of locality. A window is a term meaning the area being looked at (or evaluated). Many compression algorithms use a sliding window. A sliding window is a view of several blocks of data that shifts such that when one block is done being evaluated, the window moves one block.

Why am I talking about locality of reference in data compression? On April 14th, 2004 (2004/04/14), I wrote the following note to myself:

  • duplicate files
  • principle of locality
    • files by time
    • files by directory
    • files by type

n(n-1)=0(n^2) Every file in front of every other file can be done in parallel.

This means that for file ordering in an archive, there are some short cuts to finding the optimal order that can take advantage of multi-processor systems. Now checks can also take better advantage of increased parallelism and faster random access provided by solid state disk drives (SSD’s).

In the above note, O(n^2) is Big O notation for order n squared. That means for every extra unit of input, the processing time roughly takes twice as long. This is a simple exponential curve.

For further references look up “locality of reference”, “sliding window” compression, “parallel processing”, and “Big O notation”. Also see my evolving nots on data compression including some future information on “Drew Daniels’ compression magic”.

Drew Daniels

Comments are closed.