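James Arvo wrote a note on sampling from the Haar measure on SO(3) using the following prescription. Choose x1, x2 and x3 independently from uniform[0,1] (x3 must be nonnegative for the square roots below to be real). Let R = [ cos(2 pi x1) sin(2 pi x1) 0; -sin(2 pi x1) cos(2 pi x1) 0; 0 0 1], a rotation about the z-axis, let v = [ cos(2 pi x2) sqrt(x3); sin(2 pi x2) sqrt(x3); sqrt(1-x3)], a unit vector, and let H = I - 2 v v^T, the Householder reflection through the plane orthogonal to v. Then M = -HR is a matrix sampled from the Haar measure on SO(3). Since det(H) = -1 and det(-I) = -1 in three dimensions, det(M) = 1 as required.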
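
The prescription above can be sketched in Python with NumPy. This is a minimal illustration rather than Arvo's own code: the function name arvo_rotation is hypothetical, and all three variates are drawn from uniform[0,1] so that both square roots are real.

```python
import numpy as np

def arvo_rotation(rng):
    """Draw one 3x3 rotation matrix following the prescription above."""
    # Three independent uniform[0,1] variates.
    x1, x2, x3 = rng.uniform(0.0, 1.0, size=3)

    # R: rotation about the z-axis by the angle 2*pi*x1.
    c, s = np.cos(2.0 * np.pi * x1), np.sin(2.0 * np.pi * x1)
    R = np.array([[c, s, 0.0],
                  [-s, c, 0.0],
                  [0.0, 0.0, 1.0]])

    # v: unit vector built from x2 and x3.
    phi = 2.0 * np.pi * x2
    v = np.array([np.cos(phi) * np.sqrt(x3),
                  np.sin(phi) * np.sqrt(x3),
                  np.sqrt(1.0 - x3)])

    # H: Householder reflection; -H @ R has determinant +1.
    H = np.eye(3) - 2.0 * np.outer(v, v)
    return -H @ R

rng = np.random.default_rng(0)
M = arvo_rotation(rng)
print(np.allclose(M.T @ M, np.eye(3)), np.linalg.det(M))
```

The final line checks that M is orthogonal and prints its determinant, which should be 1 up to rounding.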

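Now suppose f(x) is a bounded density with respect to the Haar measure on SO(3), and let c be a constant with f(x) <= c for all x. We can draw x from the uniform distribution as above, draw u independently from uniform[0,1], and accept x as a sample from f if u < f(x)/c, rejecting it otherwise. This is the standard rejection sampling method. The scaling by c is necessary: a density with respect to the normalized Haar measure integrates to 1, so it exceeds 1 somewhere unless it is identically 1. On average a fraction 1/c of the proposals is accepted, so c should be as close to sup f as possible.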
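
A minimal sketch of rejection sampling on SO(3) follows, with the acceptance test scaled by an upper bound on the density, which the method requires whenever f can exceed 1. The names uniform_rotation, rejection_sample and example_density are hypothetical. The density f(M) = 1 + trace(M) is chosen purely for illustration: it is nonnegative, bounded by 4, and integrates to 1 against the normalized Haar measure because the trace averages to zero over SO(3).

```python
import numpy as np

def uniform_rotation(rng):
    """Haar-distributed rotation via the Arvo prescription."""
    x1, x2, x3 = rng.uniform(0.0, 1.0, size=3)
    c, s = np.cos(2.0 * np.pi * x1), np.sin(2.0 * np.pi * x1)
    R = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
    phi = 2.0 * np.pi * x2
    v = np.array([np.cos(phi) * np.sqrt(x3),
                  np.sin(phi) * np.sqrt(x3),
                  np.sqrt(1.0 - x3)])
    return -(np.eye(3) - 2.0 * np.outer(v, v)) @ R

def example_density(M):
    """Illustrative density with respect to normalized Haar measure."""
    return 1.0 + np.trace(M)

def rejection_sample(density, bound, n, rng):
    """Draw n rotations from `density`, assuming density(M) <= bound."""
    samples = []
    while len(samples) < n:
        M = uniform_rotation(rng)      # proposal from the uniform law
        u = rng.uniform(0.0, 1.0)      # independent acceptance variate
        if u * bound < density(M):     # same test as u < f(M) / bound
            samples.append(M)
    return np.array(samples)

rng = np.random.default_rng(1)
S = rejection_sample(example_density, bound=4.0, n=500, rng=rng)
print(S.shape, np.trace(S, axis1=1, axis2=2).mean())
```

Under the uniform distribution the mean trace is 0, while under this example density it is 1, so the printed mean gives a quick sanity check that the accepted samples are biased toward small rotation angles.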

Given a sample x1, …, xN of data points on SO(3), we can estimate f(x) using a variety of density estimation techniques. One example is Pelletier's kernel density estimator for Riemannian manifolds, which yields a smooth density that can be evaluated at any point; another is a maximum entropy density. To evaluate the Pelletier estimate at a given x, one needs the Riemannian distance from x to each of the data points. On SO(3) with the bi-invariant metric this distance is the rotation angle of x^T xi.
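
The sketch below illustrates a Pelletier-style estimator on SO(3). Several choices here are assumptions and not taken from the text: the Epanechnikov kernel, the names geodesic_distances and pelletier_density, and the use of the rotation angle as the Riemannian distance. For that metric the volume density function is (sin(d/2)/(d/2))^2 and the total volume is 8 pi^2, so the result is multiplied by 8 pi^2 to express it as a density with respect to the normalized Haar measure.

```python
import numpy as np

def geodesic_distances(x, data):
    """Rotation angle between x (3x3) and each rotation in data (N x 3 x 3)."""
    traces = np.einsum('nij,ij->n', data, x)           # trace(data[n].T @ x)
    cosines = np.clip((traces - 1.0) / 2.0, -1.0, 1.0)  # guard rounding error
    return np.arccos(cosines)

def pelletier_density(x, data, bandwidth):
    """Kernel density estimate at x with respect to normalized Haar measure."""
    if not 0.0 < bandwidth < np.pi:
        raise ValueError("bandwidth must lie in (0, pi), the injectivity radius")
    d = geodesic_distances(x, data)
    t = d / bandwidth

    # Epanechnikov kernel on [0, 1], normalized to integrate to 1 over R^3.
    kernel = np.where(t < 1.0, 15.0 / (8.0 * np.pi) * (1.0 - t**2), 0.0)

    # Volume density (sin(d/2) / (d/2))^2; np.sinc(z) = sin(pi z) / (pi z).
    vol_density = np.sinc(d / (2.0 * np.pi)) ** 2

    # Estimate with respect to Riemannian volume, averaged over the data.
    riemannian = np.mean(kernel / (bandwidth**3 * vol_density))

    # Convert to a density with respect to normalized Haar measure.
    return 8.0 * np.pi**2 * riemannian

identity = np.eye(3)
quarter_turn = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
data = np.stack([identity, quarter_turn])
print(geodesic_distances(identity, data))
print(pelletier_density(identity, data, bandwidth=1.0))
```

The bandwidth is restricted to values below pi so that the kernel support stays inside the region where the geodesic from a data point is unique. In practice it would be tuned to the data, for example by cross-validation.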

The data points of interest to us are the relative twists, pivoted on amino acids, that arise from the twist decomposition of approximately 30,000 protein structures. The above considerations can be used to produce statistical models of twist sequences for proteins.