Wednesday, September 19, 2012

Systematic Resampling

The idea is simple, but I don't know why I took so much of time to understand it. I am still not sure if I got it right. Anyway, let me explain what I understood.

Please refer to this page to understand the basics of systematic sampling. The starting point is chosen at random and then the other points are selected at regular intervals. 

Let us assume that we have population X[N] with N elements. The population has an associated weight vector W[N]. We need to sample Y[n] of n elements out of X using systematic sampling. The algorithm that I could understand from Section 8.6 of [1] is as follows:
  1. Compute the sum of all weights $ \displaystyle S = \sum_i^N W_i $
  2. Compute the selection interval I = S/n ; 
  3. To initiate the random selection process, select a uniform random number R in the open interval (0, I].
  4. The n samples are given as : R, R+I, R+2I, ... , R+(n-1)I

void systematic_resampling1(gsl_rng *r, double *dest, size_t m, double *src, size_t n, double *w)
{
  if(m > n)
  {
    cerr << "choose m <= n" << endl;
    exit(-1);
  }

  float sum_w = 0.0;
  for(size_t i = 0; i < n; ++i)
    sum_w += w[i];

  // selection interval
  float I = sum_w / m ;  

  //generate a random number between 0 and I
  float u = gsl_rng_uniform_pos(r) * I;

  size_t j = 0;
  while(j < m)
  {            
    float C = 0.0;
    for(size_t i = 0; i < n; ++i)
    {
      C = C + w[i];      

      if(u <= C)
      {
        dest[j] = src[i];
        j = j + 1;
        u = u + I;
      }
    }
  }
}

The other Algorithm [2], that seems to achieve the same is as follows:

void systematic_resampling2(gsl_rng *r, double *dest, size_t m, double *src, size_t n, double *w)
{
  if(m > n)
  {
    exit(-1);
  }
  float sum_w = 0.0;
  for(size_t i = 0; i < n; ++i)
    sum_w += w[i];

  //generate a random number between 0 and 1
  float u = gsl_rng_uniform_pos(r);

  int j = 0;
  float C = 0.0;
  for(size_t i = 0; i < n; ++i)
  {
    C = C + w[i]/sum_w;      
    if(C > 1.0)
      C = 1.0;

    while((u+j)/m <= C)
    {
      dest[j] = src[i];
      j = j + 1;
    }
  }
}



Reference:

[1]  Kirk M. Wolter, "Introduction to Variance Estimation", Statistics for Social and Behavioral Sciences series, Springer, 2nd Edition, 2007.

[2] Murray Pollock, "Introduction to Particle Filtering", Algorithms & Computationally Intensive Inference Reading Group Discussion Notes, 2010. 

No comments:

 
pre { margin: 5px 20px; border: 1px dashed #666; padding: 5px; background: #f8f8f8; white-space: pre-wrap; /* css-3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* Internet Explorer 5.5+ */ }