This tutorial explains how to extract magnetic parameters from experimental EPR spectra by fitting an EasySpin simulation to the experimental spectra using least-squares fitting techniques. EasySpin's function esfit contains all the necessary machinery and is very easy to use.
This tutorial contains the following topics:To get the most out of the EasySpin fitting features, you should work through this entire tutorial and study the examples.
EasySpin's fitting function is esfit and can be called with up to seven parameters:
esfit('pepper',spc,Sys0,Vary,Exp)
esfit('pepper',spc,Sys0,Vary,Exp,SimOpt)
esfit('pepper',spc,Sys0,Vary,Exp,SimOpt,FitOpt)
Here is what the various parameters mean
'pepper' indicates that the simulation function pepper is to be used. Alternatively, 'garlic' or 'chili' can be given here.
spc is the array of experimental spectral data. For the fitting, the magnetic field values are not needed.
Sys0 is a structure collecting magnetic parameters of the spin system. The parameters in this structure are used as first guesses in the fitting procedure.
Vary is a structure similar to Sys0 containing
the search ranges of the parameters that should be fitted.
Exp is a structure containing the experimental parameters for the simulation function.
SimOpt contains setting for the spectral simulation function.
FitOpt contains settings for the least-squares fitting function.
Before you can fit parameters to your experimental spectrum, you have to import the data into MATLAB. There are several ways to do this, depending in which format your data are stored in.
If your spectrum was acquired by a Bruker spectrometer, it is most
likely either in ESP format (having file extensions .par and .spc)
or in BES3T format (with file extensions .DSC and .DTA). Both formats
are handled by EasySpin's function eprload. Here is an example: Suppose your data are stored in the files dpph.DSC and dpph.DTA. To load them
into MATLAB, type
[spc,Params,B] = eprload('dpph.DTA');
This returns the spectral data in spc, all the parameters
stored in the files are in the structure Params, and B contains the magnetic field values.
Often, experimental spectral data come in text files, containing
two columns of data, where the first column is the magnetic
field and the second column contains the spectral intensity.
To load such a file into MATLAB, use the function textread:
[B,spc] = textread('dpph.txt','%f %f');
textread is a very flexible function that can accomodate
many different text file formats. E.g. it can skip header lines if told to do so, it can accomodate comment lines, etc. See the MATLAB documentation for more details.
The fitting function needs to know where to start looking for
an optimal fit. This starting point is given as a spin system in the third input parameter to esfit. For example, if the spectrum is that of a simple S=1/2 with orthorhombic g and an isotropic linewidth, one would start with
Sys0.g = [1.99, 2.07, 2.11]; Sys0.lw = 1;
Some algorithms use this set of parameters as the start set, but other use it only to specify the center of the search range.
Next, esfit has to be told what parameters of the spin system given in Sys0 should be fitted, and by how much the can be varied during the least-squares fitting.
This information is given in the fourth input parameter to esfit. If only the linewidth should be fitted, and it should be varied by approximately +-0.3 mT, use
Vary.lw = 0.3;
If the second principal value of the g tensor and the linewidth should be fitted simultaneously, use
Vary.g = [0 0.02 0]; Vary.lw = 0.3;
In essence, all the fields in Vary must have the same names as those in Sys0, and any non-zero value indicates that that parameter should be fitted by varying it approximately by plus minus the given amount. Zero values for any parameter exclude it from the fitting.
It is advisable not to vary more than about 4 parameters at the same time, as the efficiency of the fitting algorithms decreases with the number of fitting parameters.
A special note for fitting slow-motion EPR spectra with chili: Do not use tcorr or Diff for fitting the correlation time or the diffusion tensor, but rather the logarithmic forms logtcorr and logDiff.
Of course, the fitting function must also be provided with the experimental parameters for the spectral simulations. These are given as the fifth parameter, a structure Exp which is directly forwarded to the simulation functions
pepper,
garlic, or
chili. For details, see the documentation of these functions.
A minimal example would be
Exp.mwFreq = 9.5; Exp.Range = [200 400];
The micrwave frequency, the field range and all other experimental parameters must correspond to the ones used in acquiring the spectral data.
Now we are all set to call the fitting function:
esfit('pepper',spc,Sys0,Vary,Exp);
The least-squares fitting is started, and a figure window pops up that displays the experimental spectrum and the current and best fits so far. In addition, information about the fitting is also displayed in the command window.
When the fitting terminates (which we discuss further down), esfit prints the results
to the command window. If esfit is called with an output parameter
result = esfit('pepper',spc,Sys0,Vary,Exp);
it returns the results of the fitting in result.
If you want to pass settings to the simulation function, collect them in an additional stucture SimOpt and pass them as
sixth parameter:
esfit('pepper',spc,Sys0,Vary,Exp,SimOpt);
It is possible to provide a structure with settings for the fitting function. If we call this structure FitOpt, the syntax is
esfit('pepper',spc,Sys0,Vary,Exp,SimOpt,FitOpt);
The possible settings set in this last structure are the topic of the rest of this tutorial.
The performance of the fitting depends crucially on two things: the choice of the optimization algorithm and the choice of the target function. Let's have a look at each of them in turn.
EasySpin provides optimization algorithms that are in widespread use: (1) the Nelder/Mead downhill simplex method, (2) the Levenberg/Marquardt algorithm, (3) Monte Carlo random search, (4) a genetic algorithm, and (5) a systematic grid search.
The first two are local search algorithms, which start from a given starting set of parameters and try to work their way down a nearby valley of the parameter space to find the minimum. Both methods are quite fast, although there are some differences in general performance between them: The downhill simplex is somewhat slower than Levenberg/Marquardt, but it is more robust in the sense that it does not get stuck in a local minimum as easily as Levenberg/Marquardt.
The latter three are global search methods: they do not have a single starting parameter set, but use many, distributed over the entire parameter search space. The Monte Carlo method simply performs a series of random trial simulations and picks the best one. It is very inefficient. The systematic grid search is better: It covers the parameter space with a grid and then does simulations for each knot of the grid, in random order. Thus, no point is simulated twice, and the method is more efficient than the Monte Carlo search. However, if the minimum is between two grid points, it will never be found.
The third global method is a genetic algorithm: It makes simulations for several, let's say N, parameter sets (called a population), computes the fitting error for all of them (called the fitness) and then proceeds to generate N new parameter sets from the old ones using mechansims like mutation, cross-over and reproduction. This way, a new generation of parameter sets is (pro-)created, just like in biological evolution. The benefit of this algorithm is that if a good parameter is encountered, it is likely to propagate down the generations and across the population.
To select one of the algorithms, specify it in the Method field of the fitting options
FitOpt.Method = 'simplex'; % for Nelder/Mead downhill simplex FitOpt.Method = 'levmar'; % for Levenberg/Marquardt FitOpt.Method = 'montecarlo'; % for Monte Carlo FitOpt.Method = 'genetic'; % for the genetic algorithm FitOpt.Method = 'grid'; % for grid search
and then supply this options structure as seventh parameter to the fitting function, for example
esfit('pepper',spec,Sys0,Vary,Exper,[],FitOpt);
If you don't specify the algorithm, EasySpin uses the downhill simplex by default.
The second important setting is the choice of the target function. esfit computes the error
of the simulated spectrum using the root-mean-square-deviation (rmsd, ie the square root of the mean of the square of
the deviations), where the deviation is the difference between the experimental and the simulated spectrum.
Fitting speed can often by signifantly increased, however, if one used not the spectra directly, but their integrals
or similar transforms. EasySpin supports several settings here 'fcn' (use spectrum as is), 'int'
(integral), 'dint' (double integral), 'diff' (first derivative), 'fft' (Fourier
transform). The setting string is simply appended to the 'Method' field, after a space:
FitOpt.Method = 'simplex int'; % simplex algorithm with integrals for rmsd FitOpt.Method = 'genetic fcn'; % genetic algorithm with spectra as is FitOpt.Method = 'levmar dint'; % Levenberg/Marquardt with double integral
Usually, 'fcn' is an excellent choice, but in the case of many lines 'int' is much better - provided
the baseline in the experimental spectrum is good. The other settings ('dint', 'diff', and 'fft')
do not have general applicability.
Each algorithm has some internal parameters that can be used to fine-tune its performance. In general, it is not necessary to fiddle with those parameters at all. For more details, see the documentation of esfit.
Normally, EasySpin just scales the simulated spectrum in a way that the the maximum absolute value coincides with the maximum
absolute value of the experimental spectrum. However, if the experimental spectrum is very noisy, this is not ideal, and the
best scaling of simulated spectrum is better determined by least-squares fitting. The scaling method can be set in the field
FitOpt.Scaling. The most common settings are
FitOpt.Scaling = 'maxabs'; % scaling so that the maximum absolute values coincide FitOpt.Scaling = 'lsq'; % determine scaling via least-squares fitting
A simple mock example of this would be
Sys.g = 2; Sys.lw = 2; Exp.mwFreq = 9.5; Exp.Range = [335 345];
[x,y] = pepper(Sys,Exp); y = 3*addnoise(y,2);
Sys.g = 2.001; Sys.lw = 1.8; Vary.g = 0.002; Vary.lw = 0.4;
FitOpt.Scaling = 'maxabs';
esfit('pepper',y,Sys,Vary,Exp,[],FitOpt);
Change FitOpt.Scaling to see how the two scaling methods differ. For other scaling methods, see the documentation
of esfit.
With EasySpin it is easily possible to perform so-called hybrid least-squares fittings, where one optimization algorithm is used to locate a potential minimum, and another one is used to refine the parameters at that minimum. The first of these two steps often employs algorithms that are able to locate minima globally and do not get stuck easily in a local minimum. The disadvantage of these methods is that they are often slow. The second step closes in on the minimum by using a much faster local method. There are two ways to perform such a two-stage fitting: using the graphical interface, or writing your own code.
If one of the global methods (Monte Carlo, grid search, or genetic algorithm) is used, the graphical plot during the execution of
esfit contains a button labelled local. Whenever that button is pressed, EasySpin takes the current best
parameter set and performs a simplex minimization. Once that is done, it goes back and continues with the global search. This way,
you can decide to narrow down a minimum whenever you think the global search has produced a parameter set that could be close to one.
Alternatively, you can write code that does two- or multistage fitting. Let's look at an example with a two-stage fitting using genetic
algorithm followed by Levenberg/Marquardt. This can be set up by calling esfit twice with different settings in FitOpt.
% first stage: genetic algorithm
FitOpt.Method = 'genetic fcn';
Sys1 = esfit('pepper',y,Sys0,Vary,Exp,[],FitOpt);
% second stage: Levenberg/Marquardt
FitOpt.Method = 'levmar int';
Sys2 = esfit('pepper',y,Sys1,Vary,Exp,[],FitOpt)
Of course, you'll probably have to change some of the termination criteria for the two algorithms so that the genetic search narrows down a minimu only coarsely, and the local search can then much more efficiently finalize the fit by refining the parameters.
esfit stops the fitting when it thinks it has reached a minimum in the error function, when it has taken more
than a given amount of time, if the set number of simulations are reached, or if the user interrupts it.
Let's have a look at these possibilities in turn.
esfit considers a local least-squares fit using the simplex or the Levenberg/Marquardt algorithm to be converged
if the change in parameters from one simulation iteration to the next falls below a certain threshold and/or if
the change in the error from iteration to iteration falls below another threshold. Both thresholds have pre-set
values, but can be adjusted by supplying appropriate fields in the FitOpt structure:
FitOpt.TolFun = 1e-3; % termination tolerance for error change FitOpt.TolStep = 1e-3; % termination tolerance for parameter step
The global methods terminate also if the maximum number of simulations are reached: the Monte Carlo search does only a pre-set
number of simulations (FitOpt.N), the grid search stops if all the grid points are simulated (see the option
FitOpt.GridSize), and the genetic algorithm stops at the latest after a certain number of generations (see
FitOpt.PopulationSize and FitOpt.Generations).
In a field FitOpt.maxTime, the fitting function can be told to terminate after a given amount of time, even if the
fitting did not converge in terms of TolStep and TolFun.
This can be useful when running several fittings overnight from a script.
FitOpt.maxTime = 60; % maximum time, in minutes
Yet another way to terminate the fitting requires that the graphical display is on. In the figure window, there is a button
labelled "stop". esfit terminates, when this button is pressed
and returns the best fit so far.