Fitting linecuts using the lmfit package

Import functions

from nxs_analysis_tools.datareduction import load_transform, Scissors
from nxs_analysis_tools.fitting import LinecutModel
from nxs_analysis_tools.datasets import cubic

from lmfit.models import GaussianModel, LinearModel
import numpy as np
import matplotlib.pyplot as plt

# Download example data and save cache directory as sample_directory
sample_directory = cubic(temperatures=[15])

Load data

data = load_transform(f'{sample_directory}/cubic_15.nxs')
data:NXdata
  @axes = ['Qh', 'Qk', 'Ql']
  @signal = 'counts'
  Qh = float64(100)
  Qk = float64(150)
  Ql = float64(200)
  counts = float64(100x150x200)

Perform linecuts

s = Scissors(data=data)
linecut = s.cut_data(center=(0,0,0), window=(0.1,0.5,0.1))

The LinecutModel class

The workhorse of the fitting algorithm here is the LinecutModel class which wraps both the Model and ModelResult classes of the lmfit package, among other information.

lm = LinecutModel(data=linecut)

Create lmfit model

Use the .set_model_components() method to set the model to be used for fitting the linecut. The model_components parameter must be a Model, CompositeModel, or list of Model objects.

# Using a list of Model objects
lm.set_model_components([GaussianModel(prefix='peak'), LinearModel(prefix='background')])

# Using a CompositeModel
lm.set_model_components(GaussianModel(prefix='peak') + LinearModel(prefix='background'))

Upon setting the model, the .params attribute is initialized with a Parameters object which holds the parameters for all components of the model.

lm.params

Performing an initial guess

Use the .guess() method to perform an initial guess, which will overwrite any changes made to the parameter values and constraints set in the .params attribute.

lm.guess()
[Parameters([('peakamplitude', <Parameter 'peakamplitude', value=482528.7250417659, bounds=[-inf:inf]>), ('peakcenter', <Parameter 'peakcenter', value=-7.401486830834377e-17, bounds=[-inf:inf]>), ('peaksigma', <Parameter 'peaksigma', value=0.05033557046979864, bounds=[0.0:inf]>), ('peakfwhm', <Parameter 'peakfwhm', value=0.11853120805369124, bounds=[-inf:inf], expr='2.3548200*peaksigma'>), ('peakheight', <Parameter 'peakheight', value=3824355.5717666983, bounds=[-inf:inf], expr='0.3989423*peakamplitude/max(1e-15, peaksigma)'>)]),
 Parameters([('backgroundslope', <Parameter 'backgroundslope', value=9.271792676819166e-11, bounds=[-inf:inf]>), ('backgroundintercept', <Parameter 'backgroundintercept', value=404986.4717240573, bounds=[-inf:inf]>)])]

To view the guessed initial values, we can inspect the .params attribute.

lm.params

Visualize the initial guesses

lm.plot_initial_guess()
../_images/e18a1b06909690deb69e327fabcbeab7ca3a61fd972aa194505bdc9160ce3fa6.png

Set parameter constraints

All constraints are set by accessing the individiual parameters through the .params attribute. See https://lmfit.github.io/lmfit-py/constraints.html for more information on mathematical constraints allowed by lmfit.

# Constrain the peak amplitude to be positive
lm.params['peakamplitude'].set(min=0)

# Constrain the range of the peak center
lm.params['peakcenter'].set(min=-0.1, max=0.1)

If we inspect the .params attribute again, we should see that the constraints are implemented.

lm.params

Perform the fit

The .fit() method here automatically assumes the parameter values and constraints currently stored in the .params attribute and feeds them to the .fit() method from lmfit.

lm.fit()

Visualize the fit

Once the fit is complete, we can visualize the fit and the residuals using the .plot_fit() method.

lm.plot_fit()
../_images/43b8500f9abdfc323b3e4bd6a5b998c78d68e5b2af5962f52ceddc7aa65bcdf5.png
[[Model]]
    (Model(gaussian, prefix='peak') + Model(linear, prefix='background'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 25
    # data points      = 50
    # variables        = 5
    chi-square         = 4.1415e-10
    reduced chi-square = 9.2033e-12
    Akaike info crit   = -1265.84087
    Bayesian info crit = -1256.28075
    R-squared          = 1.00000000
##  Warning: uncertainties could not be estimated:
    peakcenter:           at initial value
    backgroundslope:      at initial value
[[Variables]]
    peakamplitude:        407704.502 (init = 482528.7)
    peakcenter:          -1.7472e-14 (init = -7.401487e-17)
    peaksigma:            0.04987484 (init = 0.05033557)
    backgroundslope:      4.3073e-09 (init = 9.271793e-11)
    backgroundintercept: -4.3230e-07 (init = 404986.5)
    peakfwhm:             0.11744628 == '2.3548200*peaksigma'
    peakheight:           3261174.59 == '0.3989423*peakamplitude/max(1e-15, peaksigma)'
<Axes: xlabel='x', ylabel='y'>

The number of x values used to evaluate the fit curve can be set using the numpoints parameter, and any keyword arguments here will be passed to the .plot() method of the underlying ModelResult object.

lm.plot_fit(numpoints=1000)
../_images/ca23f90c367a60929803f93134f82098afa84bd82e1b28396a624dd67e8b32e4.png
[[Model]]
    (Model(gaussian, prefix='peak') + Model(linear, prefix='background'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 25
    # data points      = 50
    # variables        = 5
    chi-square         = 4.1415e-10
    reduced chi-square = 9.2033e-12
    Akaike info crit   = -1265.84087
    Bayesian info crit = -1256.28075
    R-squared          = 1.00000000
##  Warning: uncertainties could not be estimated:
    peakcenter:           at initial value
    backgroundslope:      at initial value
[[Variables]]
    peakamplitude:        407704.502 (init = 482528.7)
    peakcenter:          -1.7472e-14 (init = -7.401487e-17)
    peaksigma:            0.04987484 (init = 0.05033557)
    backgroundslope:      4.3073e-09 (init = 9.271793e-11)
    backgroundintercept: -4.3230e-07 (init = 404986.5)
    peakfwhm:             0.11744628 == '2.3548200*peaksigma'
    peakheight:           3261174.59 == '0.3989423*peakamplitude/max(1e-15, peaksigma)'
<Axes: xlabel='x', ylabel='y'>

Access the fit values

The x and y values of the fit are stored in the .x (identical to the raw x values) and the .y_fit attributes.

lm.x
array([-0.493289, -0.473154, -0.45302 , ...,  0.45302 ,  0.473154,
        0.493289], shape=(50,))
lm.y_fit
array([-4.344241e-07, -4.343373e-07, -4.342467e-07, ..., -4.303441e-07,
       -4.302613e-07, -4.301746e-07], shape=(50,))

The residuals are stored within the ModelResult class of the lmfit package, which is stored in the .modelresult attribute of the LinecutModel class.

lm.modelresult.residual
array([4.344241e-07, 4.343374e-07, 4.342507e-07, ..., 4.303481e-07,
       4.302614e-07, 4.301746e-07], shape=(50,))

If .plot_fit() has already been called, then the evaluated data composing the fit curve can be found in .x_eval and .y_eval.

lm.x_eval
array([-0.493289, -0.492301, -0.491313, ...,  0.491313,  0.492301,
        0.493289], shape=(1000,))
lm.y_eval
array([-4.344241e-07, -4.344198e-07, -4.344156e-07, ..., -4.301832e-07,
       -4.301789e-07, -4.301746e-07], shape=(1000,))

If .plot_fit() has not been called, then evaluating the fit can be done in the standard lmfit way through the ModelResult object for either the total model or its individual components.

# Evaulate the total model
lm.modelresult.eval(x=np.linspace(-0.5,0.5,1000))
array([-4.344530e-07, -4.344487e-07, -4.344444e-07, ..., -4.301544e-07,
       -4.301500e-07, -4.301457e-07], shape=(1000,))
# Evaulate only the background component
lm.modelresult.components[1].eval(x=np.linspace(-0.5,0.5,1000))
array([-0.5     , -0.498999, -0.497998, ...,  0.497998,  0.498999,
        0.5     ], shape=(1000,))

You may also opt to evaluate all components at the same time, but separated by their individual contribution.

# Evaluate all components separately
evaluated_components = lm.modelresult.eval_components(x=np.linspace(-0.5,0.5,1000))

Using this approach, a dict is generated where the contribution of each component is labeled by its prefix.

evaluated_components
{'peak': array([4.892499e-16, 5.981719e-16, 7.310488e-16, ..., 7.310488e-16,
        5.981719e-16, 4.892499e-16], shape=(1000,)),
 'background': array([-4.344530e-07, -4.344487e-07, -4.344444e-07, ..., -4.301544e-07,
        -4.301501e-07, -4.301457e-07], shape=(1000,))}
evaluated_components['peak']
array([4.892499e-16, 5.981719e-16, 7.310488e-16, ..., 7.310488e-16,
       5.981719e-16, 4.892499e-16], shape=(1000,))
evaluated_components['background']
array([-4.344530e-07, -4.344487e-07, -4.344444e-07, ..., -4.301544e-07,
       -4.301501e-07, -4.301457e-07], shape=(1000,))

These can then be plotted together to show their individual contributions. See below for an example.

plt.plot(lm.x, lm.y, '.')
plt.fill_between(np.linspace(-0.5,0.5,1000), evaluated_components['peak'], alpha=0.5, label='peak')
plt.fill_between(np.linspace(-0.5,0.5,1000), evaluated_components['background'], alpha=0.5, label='background')
plt.legend()
<matplotlib.legend.Legend at 0x7608b538af50>
../_images/e5cd516ca761092130b88f0b0d8fff2789929208f76959d6daac17e63743489a.png

Viewing the fit report

lm.print_fit_report()
[[Model]]
    (Model(gaussian, prefix='peak') + Model(linear, prefix='background'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 25
    # data points      = 50
    # variables        = 5
    chi-square         = 4.1415e-10
    reduced chi-square = 9.2033e-12
    Akaike info crit   = -1265.84087
    Bayesian info crit = -1256.28075
    R-squared          = 1.00000000
##  Warning: uncertainties could not be estimated:
    peakcenter:           at initial value
    backgroundslope:      at initial value
[[Variables]]
    peakamplitude:        407704.502 (init = 482528.7)
    peakcenter:          -1.7472e-14 (init = -7.401487e-17)
    peaksigma:            0.04987484 (init = 0.05033557)
    backgroundslope:      4.3073e-09 (init = 9.271793e-11)
    backgroundintercept: -4.3230e-07 (init = 404986.5)
    peakfwhm:             0.11744628 == '2.3548200*peaksigma'
    peakheight:           3261174.59 == '0.3989423*peakamplitude/max(1e-15, peaksigma)'

Performing a simple linecut with no customization

The .fit_peak_simple() method offers a quick but rudimentary way to fit a peak, using a pseudo-Voigt peak shape with a linear background.

lm.fit_peak_simple()
lm.plot_fit()
../_images/bf575dfe0344def6849f5ab27dd3aec7f9ed36db8f92173e70cea8852d0d6ba9.png
[[Model]]
    (Model(pvoigt, prefix='peak') + Model(linear, prefix='background'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 239
    # data points      = 50
    # variables        = 6
    chi-square         = 7.1310e-11
    reduced chi-square = 1.6207e-12
    Akaike info crit   = -1351.80009
    Bayesian info crit = -1340.32795
    R-squared          = 1.00000000
##  Warning: uncertainties could not be estimated:
    peakcenter:           at initial value
    peakfraction:         at boundary
    backgroundslope:      at initial value
[[Variables]]
    peakamplitude:        407704.502 (init = 603160.9)
    peakcenter:          -8.6824e-14 (init = -7.401487e-17)
    peaksigma:            0.05872314 (init = 0.05033557)
    peakfraction:         3.0309e-13 (init = 0.5)
    peakfwhm:             0.11744628 == '2.0000000*peaksigma'
    peakheight:           3261174.43 == '(((1-peakfraction)*peakamplitude)/max(1e-15, (peaksigma*sqrt(pi/log(2))))+(peakfraction*peakamplitude)/max(1e-15, (pi*peaksigma)))'
    backgroundslope:      9.0210e-17 (init = 9.271793e-11)
    backgroundintercept: -9.1841e-12 (init = 404986.5)
<Axes: xlabel='x', ylabel='y'>