## Wednesday, January 13, 2021

### Taking advantage of Ruby in QSoas

First of all, let me all wish you a happy new year, with all my wishes of health and succes. I sincerely hope this year will be simpler for most people as last year !

For the first post of the year, I wanted to show you how to take advantage of Ruby, the programming language embedded in QSoas, to make various things, like:

• creating a column with the sum of Y values;
• extending values that are present only in a few lines;
• renaming datasets using a pattern.

#### Summing the values in a column

When using commands that take formulas (Ruby code), like apply-formula, the code is run for every single point, for which all the values are updated. In particulier, the state of the previous point is not known. However, it is possible to store values in what is called global variables, whose name start with an $ sign. Using this, we can keep track of the previous values. For instance, to create a new column with the sum of the y values, one can use the following approach: QSoas> eval$sum=0
QSoas> apply-formula /extra-columns=1 $sum+=y;y2=$sum


set-meta a ${1} 1  This script takes a single argument, the value of $$a$$, generates the appropriate dataset, sets the meta-data a and writes the data about the largest (and only in this case) peak to the output file. Let's now run this script with 1 as an argument: QSoas> @ do-one.cmds 1 This command generates a file out.dat containing the following data: ## buffer what x y index width left_width right_width area generated.dat max 2.002002002 0.541340590883 100 3.4034034034 1.24124124124 2.162162162161.99999908761 This gives various information about the peak found: the name of the dataset it was found in, whether it's a maximum or minimum, the x and y positions of the peak, the index in the file, the widths of the peak and its area. We are interested here mainly in the x position. Then, we just run this script for several values of $$a$$ using run-for-each, and in particular the option /range-type=lin that makes it interpret values like 0.5..5:80 as 80 values evenly spread between 0.5 and 5. The script is called run-all.cmds: output peaks.dat /overwrite=true /meta=a run-for-each do-one.cmds /range-type=lin 0.5..5:80 V all /style=red-to-blue The first line sets up the output to the output file peaks.dat. The option /meta=a makes sure the meta a is added to each line of the output file, and /overwrite=true make sure the file is overwritten just before the first data is written to it, in order to avoid accumulating the results of different runs of the script. The last line just displays all the curves with a color gradient. It looks like this: Running this script (with @ run-all.cmds) creates a new file peaks.dat, whose first line looks like this: ## buffer what x y index width left_width right_width area a The column x (the 3rd) contains the position of the peaks, and the column a (the 10th) contains the meta a (this column wasn't present in the output we described above, because we had not used yet the output /meta=a command). Therefore, to load the peak position as a function of a, one has just to run: QSoas> load peaks.dat /columns=10,3 This looks like this: Et voilà ! To train further, you can: • improve the resolution in x; • improve the resolution in y; • plot the magnitude of the peak; • extend the range; • derive the analytical formula for the position of the peak and verify it ! [1] this is not exactly true. For instance, some commands like unwrap interpret the sr meta-data as a voltammetric scan rate if it is present. But this is the exception. #### About QSoas QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code there (or clone from the GitHub repository) and compile it yourself, or buy precompiled versions for MacOS and Windows there. ## Wednesday, November 11, 2020 ### Solution for QSoas quiz #1: averaging spectra This post describes the solution to the Quiz #1, based on the files found there. The point is to produce both the average and the standard deviation of a series of spectra. Below is how the final averaged spectra shoud look like: I will present here two different solutions. #### Solution 1: using the definition of standard deviation There is a simple solution using the definition of the standard deviation: $$\sigma_y = \sqrt{<y^2> - {<y>}^2}$$ in which $$<y^2>$$ is the average of $$y^2$$ (and so on). So the simplest solution is to construct datasets with an additional column that would contain $$y^2$$, average these columns, and replace the average with the above formula. For that, we need first a companion script that loads a single data file and adds a column with $$y^2$$. Let's call this script load-one.cmds: load${1}
apply-formula y2=y**2 /extra-columns=1
flag /flags=processed

When this script is run with the name of a spectrum file as argument, it loads it (replaces ${1} by the first argument, the file name), adds a column y2 containing the square of the y column, and flag it with the processed flag. This is not absolutely necessary, but it makes it much easier to refer to all the spectra when they are processed. Then to process all the spectra, one just has to run the following commands: run-for-each load-one.cmds Spectrum-1.dat Spectrum-2.dat Spectrum-3.dat average flagged:processed apply-formula y2=(y2-y**2)**0.5 dataset-options /yerrors=y2  The run-for-each command runs the load-one.cmds script for all the spectra (one could also have used Spectra-*.dat to not have to give all the file names). Then, the average averages the values of the columns over all the datasets. To be clear, it finds all the values that have the same X (or very close X values) and average them, column by column. The result of this command is therefore a dataset with the average of the original $$y$$ data as y column and the average of the original $$y^2$$ data as y2 column. So now, the only thing left to do is to use the above equation, which is done by the apply-formula code. The last command, dataset-options, is not absolutely necessary but it signals to QSoas that the standard error of the y column should be found in the y2 column. This is now available as script method-one.cmds in the git repository. #### Solution 2: use QSoas's knowledge of standard deviation The other method is a little more involved but it demonstrates a good approach to problem solving with QSoas. The starting point is that, in apply-formula, the value $stats.y_stddev corresponds to the standard deviation of the whole y column... Loading the spectra yields just a series of x,y datasets. We can contract them into a single dataset with one x column and several y columns:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra

After these commands, the current dataset contains data in the form of:
lambda1	a1_1	a1_2	a1_3
lambda2	a2_1	a2_2	a2_3
...

in which the ai_1 come from the first file, ai_2 the second and so on. We need to use transpose to transform that dataset into:
0	a1_1	a2_1	...
1	a1_2	a2_2	...
2	a1_3	a2_3	...

In this dataset, values of the absorbance for the same wavelength for each dataset is now stored in columns. The next step is just to use expand to obtain a series of datasets with the same x column and a single y column (each corresponding to a different wavelength in the original data). The game is now to replace these datasets with something that looks like:
0	a_average
1	a_stddev

For that, one takes advantage of the $stats.y_average and $stats.y_stddev values in apply-formula, together with the i special variable that represents the index of the point:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1

Then, all that is left is to apply this to all the datasets created by expand, which can be just made using run-for-datasets, and then, we reverse the splitting by using contract and transpose ! In summary, this looks like this. We need two files. The first, process-one.cmds contains the following code:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1
flag /flags=processed

The main file, method-two.cmds looks like this:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra
transpose
expand /flags=tmp
run-for-datasets process-one.cmds flagged:tmp
contract flagged:processed
transpose
dataset-options /yerrors=y2

Note some of the code above can be greatly simplified using new features present in the upcoming 3.0 version, but that is the topic for another post.

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code and compile it yourself or buy precompiled versions for MacOS and Windows there.

## Thursday, October 22, 2020

### QSoas tips and tricks: generating smooth curves from a fit

Often, one would want to generate smooth data from a fit over a small number of data points. For an example, take the data in the following file. It contains (fake) experimental data points that obey to Michaelis-Menten kinetics: $$v = \frac{v_m}{1 + K_m/s}$$ in which $$v$$ is the measured rate (the y values of the data), $$s$$ the concentration of substrate (the x values of the data), $$v_m$$ the maximal rate and $$K_m$$ the Michaelis constant. To fit this equation to the data, just use the fit-arb fit:
QSoas> l michaelis.dat
QSoas> fit-arb vm/(1+km/x)
After running the fit, the window should look like this:
Now, with the fit, we have reasonable values for $$v_m$$ (vm) and $$K_m$$ (km). But, for publication, one would want to generate "smooth" curve going through the lines... Saving the curve from "Data.../Save all" doesn't help, since the data has as many points as the original data and looks very "jaggy" (like on the screenshot above)... So one needs a curve with more data points.

Maybe the most natural solution is simply to use generate-buffer together with apply-formula using the formula and the values of km and vm obtained from the fit, like:

QSoas> generate-buffer 0 20
QSoas> apply-formula y=3.51742/(1+3.69767/x)
By default, generate-buffer generate 1000 evenly spaced x values, but you can change their number using the /samples option. The two above commands can be combined to just one call to generate-buffer:
QSoas> generate-buffer 0 20 3.51742/(1+3.69767/x)
This works, but it is quite cumbersome and it is not going to work well for complex formulas or the results of differential equations or kinetic systems...

This is why to each fit- command corresponds a sim- command that computes the result of the fit using a "saved parameters" file (here, michaelis.params, but you can also save it yourself) and buffers as "models" for X values:

QSoas> generate-buffer 0 20
QSoas> sim-arb vm/(1+km/x) michaelis.params 0
This strategy works with every single fit ! As an added benefit, you even get the fit parameters as meta-data, which are displayed by the show command:
QSoas> show 0
Dataset generated_fit_arb.dat: 2 cols, 1000 rows, 1 segments, #0
Flags:
Meta-data:	commands =	 sim-arb vm/(1+km/x) michaelis.params 0	fit =	 arb (formula: vm/(1+km/x))	km =	 3.69767
vm =	 3.5174
They also get saved as comments if you save the data.

Important note: the sim-arb command will be available only in the 3.0 release, although you can already enjoy it if you use the github version.

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code and compile it yourself or buy precompiled versions for MacOS and Windows there.

## Wednesday, October 7, 2020

### QSoas quiz #1 : averaging spectra

Here is the first QSoas quiz ! I recently measured several identical spectra in a row to evaluate the noise of the setup, and so I wanted to average all the spectra and also determine the standard deviation in the absorbances. Averaging the spectra can simply be done taking advantage of the average command:
QSoas> load Spectrum*.dat /flags=spectra
QSoas> average flagged:spectra

However, average does not provide means to make standard deviations, it just takes the average of all but the X column. I wanted to add this feature, but I realized there are already at least two distinct ways to do that...

#### Quiz

Your task is to determine the average and standard deviations of the three spectra located there, (Spectrum-1.dat, Spectrum-2.dat and Spectrum-3.dat). There are at least two ways:
• One that relies simply on average and on apply-formula, and which requires that you remember how to compute standard deviations.
• One that is a little more involved, that requires more data manipulation (take a look at contract for instance) and relies on the fact that you can use statistics in apply-formula (and in particular you can use y_stddev to refer to the standard deviation of $$y$$), but which does not require you to know exactly how to compute standard deviations.
To help you, I've added the result in Average.dat. The figure below shows a zoom on the data superimposed to the average (bonus points to find how to display this light red area that corresponds to the standard deviation !).
I will post the answer later. In the meantime, feel free to post your own solutions or attempts, hacks, and so on !

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code or buy precompiled versions for MacOS and Windows there.

## Wednesday, September 23, 2020

### Tutorial: analyze Km data of CODHs

This is the first post of a series in which we will provide the readers with simple tutorial approaches to reproduce the data analysis of some of our published papers. All our data analysis is performed using QSoas. Today, we will show you how to analyze the $K_m$ experiments we used to characterize the behaviour of an enzyme, the Nickel-Iron CO dehydrogenase IV from Carboxytothermus hydrogenoformans. The experiments we analyze here are described in much more details in the original publication, Domnik et al, Angewandte Chemie, 2017. The only things you need to know for now are the following:
• the data is the evolution of the current over time given by the absorbed enzyme;
• at a given point, a certain amount of CO (50 µM) is injected, and its concentration decreases exponentially over time;
• the enzyme has a Michaelis-Menten behaviour with respect to CO.
This means that we expect a response of the type: $$i(t) = \frac{i_m}{1 + \frac{K_m}{[\mathrm{CO}](t)}}$$ in which $$[\mathrm{CO}](t) = \begin{cases} 0, & \text{for } t < t_0 \\ C_0 \exp \frac{t_0 - t}{\tau}, & \text{for }t\geq t_0 %> \end{cases}$$

To begin this tutorial, first download the files from the github repository (direct links: data, parameter file and ruby script). Start QSoas, go to the directory where you saved the files, load the data file, and remove spikes in the data using the following commands:

QSoas> cd
QSoas> l Km-CODH-IV.dat
QSoas> R

##### First fit
Then, to fit the above equation to the data, the simplest is to take advantage of the time-dependent parameters features of QSoas. Run simply:
QSoas> fit-arb im/(1+km/s) /with=s:1,exp
This simply launches the fit interface to fit the exact equations above. The im/(1+km/s) is simply the translation of the Michaelis-Menten equation above, and the /with=s:1,exp specifies that s is the result of the sum of 1 exponential like for the definition of $[\mathrm{CO}](t)$ above. Then, load the Km-CODH-IV.params parameter files (using the "Parameters.../Load from file" action at the bottom, or the Ctrl+L keyboard shortcut). Your window should now look like this:
To fit the data, just hit the "Fit" button ! (or Ctrl+F).
##### Including an offset
The fit is not bad, but not perfect. In particular, it is easy to see why: the current predicted by the fit goes to 0 at large times, but the actual current is below 0. We need therefore to include an offset to take this into consideration. Close the fit window, and re-run a fit, but now with this command:
QSoas> fit-arb im/(1+km/s)+io /with=s:1,exp
Notice the +io bit that corresponds to the addition of an offset current. Load again the base parameters, run the fit again... Your fit window show now look like:
See how the offset current is now much better taken into account. Let's talk a bit more about the parameters:
• im is $$i_m$$, the maximal current, around 120 nA (which matches the magnitude of the original current).
• io is the offset current, around -3nA.
• km is the $$K_m$$, expressed in the same units as s_1, the first "injected" value of s (we used 50 because the injection is 50 µM CO). That means the value of $$K_m$$ determined by this fit is around 9 nM ! (but see below).
• s_t_1 is the time of the injection of CO (it was in the parameter files you loaded).
• Finally, s_tau_1 is the time constant of departure of CO, noted $$\tau$$ in the equations above.
##### Taking into account mass-transport limitations
However, the fit is still unsatisfactory: the predicted curve fails to reproduce the curvature at the beginning and at the end of the decrease. This is due to issues linked to mass-transport limitations, which are discussed in details in Merrouch et al, Electrochimica Acta, 2017. In short, what you need to do is to close the fit window again, load the transport.rb Ruby file that contains the definition of the itrpt function, and re-launch the fit window using:
QSoas> ruby-run transport.rb
QSoas> fit-arb itrprt(s,km,nFAm,nFAmu)+io /with=s:1,exp
Load again the parameter file... but this time you'll have to play a bit more with the starting parameters for QSoas to find the right values when you fit. Here are some tips:
• the curve predicted with the current parameters (use "Update" to update the display) should "look like" the data;
• apart from io, no parameter should be negative;
• there may be hints about the correct values in the papers...
A successful fit should look like this:
Here you are ! I hope you enjoyed analyzing our data, and that it will help you analyze yours ! Feel free to comment and ask for clarifications.