SciPy: distribution de von Mises sur un demi-cercle?

Aug 17 2020

J'essaie de trouver la meilleure façon de définir une distribution von-Mises enroulée sur un demi-cercle (je l'utilise pour dessiner des lignes sans direction à différentes concentrations). J'utilise actuellement vonmises.rvs () de SciPy. Essentiellement, je veux être capable de mettre, disons, une orientation moyenne de pi / 2 et d'avoir la distribution tronquée à pas plus de pi / 2 de chaque côté.

Je pourrais utiliser une distribution normale tronquée, mais je perdrai le wrapping des von-mises (disons si je veux une orientation moyenne de 0)

J'ai vu cela dans des documents de recherche sur la cartographie des orientations des fibres, mais je ne peux pas comprendre comment l'implémenter (en python). Je ne sais pas trop par où commencer.

Si mon von Mesis est défini comme (à partir de numpy.vonmises):

np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa))

avec:

mu, kappa = 0, 4.0

x = np.linspace(-np.pi, np.pi, num=51)

Comment pourrais-je le modifier pour utiliser un enroulement autour d'un demi-cercle à la place?

Quelqu'un avec une certaine expérience dans ce domaine pourrait-il offrir des conseils?

Réponses

1 SeverinPappadeux Aug 17 2020 at 17:57

Il est utile d'avoir un échantillonnage CDF inverse numérique direct, cela devrait fonctionner très bien pour une distribution avec un domaine borné. Voici un exemple de code, la création de tableaux PDF et CDF et l'échantillonnage à l'aide de la méthode CDF inverse. Pourrait être optimisé et vectorisé, bien sûr

Code, Python 3.8, x64 Windows 10

import numpy as np
import matplotlib.pyplot as plt
import scipy.integrate as integrate

def PDF(x, μ, κ):
    return np.exp(κ*np.cos(x - μ))

N = 201

μ = np.pi/2.0
κ = 4.0

xlo = μ - np.pi/2.0
xhi = μ + np.pi/2.0

# PDF normaliztion

I = integrate.quad(lambda x: PDF(x, μ, κ), xlo, xhi)
print(I)
I = I[0]

x = np.linspace(xlo, xhi, N, dtype=np.float64)
step = (xhi-xlo)/(N-1)

p = PDF(x, μ, κ)/I # PDF table

# making CDF table
c = np.zeros(N, dtype=np.float64)

for k in range(1, N):
    c[k] = integrate.quad(lambda x: PDF(x, μ, κ), xlo, x[k])[0] / I

c[N-1] = 1.0 # so random() in [0...1) range would work right

#%%
# sampling from tabular CDF via insverse CDF method

def InvCDFsample(c, x, gen):
    r = gen.random()
    i = np.searchsorted(c, r, side='right')
    q = (r - c[i-1]) / (c[i] - c[i-1])
    return (1.0 - q) * x[i-1] + q * x[i]

# sampling test
RNG = np.random.default_rng()

s = np.empty(20000)

for k in range(0, len(s)):
    s[k] = InvCDFsample(c, x, RNG)

# plotting PDF, CDF and sampling density
plt.plot(x, p, 'b^') # PDF
plt.plot(x, c, 'r.') # CDF
n, bins, patches = plt.hist(s, x, density = True, color ='green', alpha = 0.7)
plt.show()

and graph with PDF, CDF and sampling histogram

1 JohanC Aug 17 2020 at 14:45

You could discard the values outside the desired range via numpy's filtering (theta=theta[(theta>=0)&(theta<=np.pi)], shortening the array of samples). So, you could first increment the number of generated samples, then filter and then take a subarray of the desired size.

Or you could add/subtract pi to put them all into that range (via theta = np.where(theta < 0, theta + np.pi, np.where(theta > np.pi, theta - np.pi, theta))). As noted by @SeverinPappadeux such changes the distribution and is probably not desired.