This content is auto-generated from the notes site. There may be formatting issues as it is transcribed.
If you are using iolite 4, you’ve probably noticed references to something called ‘python’ in the user interface. Python is a general purpose interpreted programming language widely used for many things, and it has become particularly popular among scientists for data processing and analysis. iolite 4 has a built-in python interpreter that you can use for a variety of tasks, and one of the main goals of iolite notes is to educate our users on the various ways we can harness its power to do something interesting.
More about python¶
Like all programming languages, python has a specific syntax. To acquaint yourself with python syntax you can refer to any number of external resources.
Libraries, packages and friends¶
iolite 4 is built with Qt, and a great deal of that functionality is available to you through iolite’s use of PythonQt. More information about the Qt and PythonQt application programming interface (API) and functionality is readily available online.
Python is known for its many packages, and iolite is distributed with a number of useful data science packages.
It is also possible to add your own packages to be used with iolite’s python interpreter. More on that later!
The python “API”¶
The API is described in the online iolite documentation. A few basics are outlined below to give you an idea how it works.
Getting a selection from a selection group:
    sg = data.selectionGroup('G_NIST610')
    s = sg.selection(0)
Doing something to each selection:
    sg = data.selectionGroup('G_NIST610')
    for s in sg.selections():
        doSomethingWithSelection(s)  # Where the function doSomethingWithSelection has been defined elsewhere
Getting the data and time arrays from a channel:
    c = data.timeSeries('U238')
    d = c.data()
    t = c.time()
Note: if you try to get a selection group or channel that does not exist, python will raise an exception, e.g.

    Traceback (most recent call last):
      File "…", line 1, in <module>
    RuntimeError: std::runtime_error: [DataManager::timeSeries] no matches for …
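Because a missing name raises rather than returning an empty value, a script that probes for optional channels may want to catch the exception. The following is a minimal sketch, assuming the failure surfaces in python as a RuntimeError as in the traceback above. The `_FakeData` class is a hypothetical stand-in so the snippet runs outside iolite, and `time_series_or_none` is an illustrative helper, not part of iolite's API.

```python
class _FakeData:
    """Hypothetical stand-in for iolite's `data` object (for this sketch only)."""

    def __init__(self, channels):
        self._channels = channels

    def timeSeries(self, name):
        if name not in self._channels:
            # Mimics the error text shown in the note above
            raise RuntimeError('[DataManager::timeSeries] no matches for %s' % name)
        return self._channels[name]


def time_series_or_none(data, name):
    """Return the named channel, or None if the lookup raises RuntimeError."""
    try:
        return data.timeSeries(name)
    except RuntimeError:
        return None


data = _FakeData({'U238': 'U238 channel'})
found = time_series_or_none(data, 'U238')      # Channel exists
missing = time_series_or_none(data, 'Pb999')   # Channel does not exist
print(found, missing)
```

Inside iolite you would drop the stand-in class and pass the real `data` object to the helper.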
Integration in iolite¶
We have integrated python in several places, not all of which are obvious, so let’s review the various ways python can be used now.
The Python Console is a place to type in a quick command or short series of commands that you are not likely to use often. It can be accessed from the Tools menu as Python Console or with the keyboard shortcut CTRL+SHIFT+P (PC) or CMD+SHIFT+P (Mac).
The python workspace is a place to write small (or large!) reusable scripts that are not suitable to be written as a plugin (see below). It can be accessed from iolite’s toolbar on the left side of the main window.
We have also included the ability to write plugins or add ons for iolite by including specific metadata in your python file and putting the file in the location configured in iolite’s preferences.
Note that by default these paths are within the main program folder (i.e. in Program Files for Windows and /Applications/iolite4.app for Mac). This is convenient, but problematic (see below). If you want to keep iolite up to date (recommended) and have your own add ons and/or reference material files, you should change these paths to a separate folder (e.g. in your Documents).
Python plugin types that require extra metadata include importers, data reduction schemes (DRS), QA/QC and user interface (UI) plugins. The metadata generally looks something like this:
    #/ Type: DRS
    #/ Name: U-Pb Python Example
    #/ Authors: Joe Petrus and Bence Paul
    #/ Description: Simple U-Pb with downhole fractionation corrections
    #/ References: Paton et al., 2010 G3
    #/ Version: 1.0
    #/ Contact: email@example.com
In addition to the metadata, certain functions are also required and vary depending on the plugin type. This metadata and the required functions allow iolite to parse these files and insert them into the user interface as appropriate. For example, a DRS needs to have functions runDRS and settingsWidget. To learn more about plugins and their format you can visit our examples repository on github.
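To make the metadata format concrete, here is a rough sketch of how `#/ Key: Value` header lines could be collected into a dictionary. This is not how iolite itself reads plugin files (that is not described here); `parse_metadata` and the shortened example script are hypothetical and only illustrate the layout of the header.

```python
def parse_metadata(script_text):
    """Collect '#/ Key: Value' lines into a dict (illustrative parser only)."""
    meta = {}
    for line in script_text.splitlines():
        line = line.strip()
        if not line.startswith('#/'):
            continue  # Ordinary code or comment line, not metadata
        key, sep, value = line[2:].partition(':')
        if sep:  # Only keep lines that actually contain a colon
            meta[key.strip()] = value.strip()
    return meta


# A shortened, hypothetical DRS file: metadata plus the two required functions
example = """#/ Type: DRS
#/ Name: U-Pb Python Example
#/ Version: 1.0

def runDRS():
    pass

def settingsWidget():
    pass
"""

meta = parse_metadata(example)
print(meta)
```

A check like this can be handy for spotting a typo in a header before dropping the file into the add ons folder.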
There are a few more places that python can be used in iolite. These include custom export scripts (e.g. if you want to output certain columns in a certain format and do not want to fuss with it after exporting) and processing templates (e.g. if you want to check for outliers within selection groups and move them to a different group, or to perform some custom QA/QC). The calculator tool also uses python to evaluate each expression, so python syntax and its benefits apply there too.
- Since some parts of iolite depend on python, if the path to python packages is not configured properly in iolite’s preferences, iolite may not be able to start. If that happens, you can hold down shift while starting iolite to reset some of the preferences to their default values.
- Since the paths for python-based add ons are within the program folder by default, you may encounter problems when trying to create a new add on from the iolite interface because the operating system may prevent you from creating new files in those locations.
- Additionally, if you have modified files in the default paths be warned that these files will be lost when updating!
Manipulating selections from python¶
A reminder about selections¶
A selection in iolite is simply a named (and grouped) period of time. Data processing in iolite revolves around groups of these selections and groups of different types. A selection’s period of time is defined by a start time and an end time. Of course things can be more complicated when you throw in linked selections and other properties, but for now let’s consider a few ways we can manipulate selections with python.
A reminder about python¶
Python is a programming language that you can write add ons and other scripts in for iolite 4. To read more about python and the various ways you can use it in iolite, see here.
Adjusting all the selections in a group¶
    # First we get the desired selection group by name and assign it to a variable called 'sg':
    sg = data.selectionGroup('GroupName')

    # Then we use a for loop to iterate over all the selections in the group:
    for s in sg.selections():
        s.startTime = s.startTime.addSecs(2)
Creating selections based on some arithmetic¶
    from iolite.QtCore import QDateTime

    # Establish a start time using a specific format
    start = QDateTime.fromString('2020-02-06 12:00:00.000', 'yyyy-MM-dd hh:mm:ss.zzz')

    # Create a group named "TestGroup" of type "Sample"
    group = data.createSelectionGroup('TestGroup', data.Sample)

    duration = 20  # Selection duration in seconds
    gap = 2  # Gap between selections in seconds
    N = 30  # Number of selections to create

    # Create the selections according to the parameters above:
    for i in range(0, N):
        this_start = start.addSecs(i*(duration + gap))
        this_end = this_start.addSecs(duration)
        data.createSelection(group, this_start, this_end, 'Sel%i'%(i))
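If you want to preview the times such a loop would produce before creating anything in your session, the same arithmetic can be sketched with python's standard `datetime` module. This is only a dry run outside iolite: it uses `datetime` and `timedelta` in place of QDateTime, and the `windows` list is a hypothetical placeholder for the selections.

```python
from datetime import datetime, timedelta

start = datetime(2020, 2, 6, 12, 0, 0)
duration = 20  # Selection duration in seconds
gap = 2        # Gap between selections in seconds
N = 30         # Number of selections

windows = []
for i in range(N):
    # Each selection starts one (duration + gap) step after the previous one
    this_start = start + timedelta(seconds=i * (duration + gap))
    this_end = this_start + timedelta(seconds=duration)
    windows.append(('Sel%i' % i, this_start, this_end))

print(windows[0])    # First selection: 12:00:00 to 12:00:20
print(windows[-1])   # Last selection: 12:10:38 to 12:10:58
```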
Split selection into several sub selections¶
    s = data.activeSelection()  # Get the active selection
    dest_group = data.createSelectionGroup('Split', data.Sample)  # Create a group called 'Split'
    N = 20  # Set the number of selections to split into

    # Calculate the duration of each sub-selection in seconds
    dur = (s.endTime.toMSecsSinceEpoch() - s.startTime.toMSecsSinceEpoch())/(1000.0*N)

    ps = None  # Used to keep track of previous selection

    # Work out the start and end time for each sub-selection and add it to the group
    for i in range(0, N):
        if not ps:
            this_start = s.startTime
        else:
            this_start = ps.endTime

        # addMSecs expects a whole number of milliseconds
        this_end = this_start.addMSecs(int(1000.0*dur))
        ps = data.createSelection(dest_group, this_start, this_end, 'Split%i'%(i))
Downsampling and exporting time series data¶
We recently had an email asking about exporting time series data that had been smoothed by averaging. It turns out this is something very easy to do with iolite 4’s built in python interpreter. Before continuing, if you are not familiar with iolite’s python functionality you may want to check out this post first.
Step by step¶
Getting access to a channel’s data and time in iolite via python is as easy as:
    d = data.timeSeries('U238').data()
    t = data.timeSeries('U238').time()
This gets us the data/time as NumPy arrays. Now downsampling this data by averaging can be done as follows:
    import numpy as np
    ds = np.mean(d.reshape(-1, 4), 1)
where the 4 means we will average 4 points together. However, this assumes that the length of your data array is divisible by 4. To make things a bit more generic, we could write it as follows:
    def downsample_for_export(data_array, n):
        end = n * int(len(data_array)/n)
        return np.mean(data_array[:end].reshape(-1, n), 1)

    ds = downsample_for_export(d, 4)
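A small worked example may help show what the trimming does. The sketch below runs the same function on a made-up array of ten points with n = 4: only the first eight points are used, and the two leftover points at the end are dropped.

```python
import numpy as np

def downsample_for_export(data_array, n):
    end = n * int(len(data_array) / n)   # Largest multiple of n that fits
    return np.mean(data_array[:end].reshape(-1, n), 1)

d = np.arange(10, dtype=float)           # Made-up data: 0, 1, ..., 9
ds = downsample_for_export(d, 4)
print(ds)                                # [1.5 5.5]
```

The first output value is the mean of 0 to 3 and the second is the mean of 4 to 7; the values 8 and 9 do not contribute.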
To demonstrate that this works, we can plot the before and after for a specific selection:
Saving the data is also easy using NumPy (delimited, e.g. csv) or pandas (many formats, including Excel):
    np.savetxt("/Users/name/Desktop/file.csv", ds, delimiter=",")

    import pandas as pd
    pd.DataFrame(data=ds).to_excel("/Users/name/Desktop/file.xlsx", index=False)
Now suppose we want to be able to specify the amount of time for each data point rather than the number of data points. We could do that by inspecting the time array:
    tdelta = np.diff(t).mean()
    n = int(time_per_point/tdelta)
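One thing to watch with this conversion is that `int()` truncates, so a requested time shorter than the sampling interval gives n = 0, which would then fail in the downsampling step. The sketch below is one possible guard, not the original approach: rounding to the nearest integer and clamping to at least 1 are assumptions, and `points_per_bin` is a hypothetical helper name.

```python
import numpy as np

def points_per_bin(t, time_per_point):
    """Number of points to average so each output point spans ~time_per_point seconds."""
    tdelta = np.diff(t).mean()                  # Mean spacing between points (s)
    n = int(round(time_per_point / tdelta))     # Nearest whole number of points
    return max(1, n)                            # Never average fewer than 1 point

# Made-up time array: 41 points spaced 0.25 s apart, starting at an arbitrary epoch time
t = 1580990400.0 + 0.25 * np.arange(41)

n = points_per_bin(t, 1.0)        # 1 s per point at 0.25 s spacing
n_small = points_per_bin(t, 0.1)  # Shorter than the spacing, clamped to 1
print(n, n_small)
```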
If we wanted to do it for every channel without needing to specify their names, we could use a for loop as follows:
    for channel in data.timeSeriesList():
        ds = downsample_for_export(channel.data(), n)
        ...
Lastly, we can also bring in some Qt dependencies to make things a bit more user friendly:
    from iolite.QtGui import QFileDialog, QInputDialog

    time_per_point = QInputDialog.getDouble(None, "Time per point", "Time per point", 1)
    filename = QFileDialog.getSaveFileName()
Putting it all together¶
Combining everything above into one neat little script that will ask us what we want for the time per point and where to save the data looks like this:
    import numpy as np
    import pandas as pd
    from iolite.QtGui import QFileDialog, QInputDialog

    def downsample_for_export(channel, time_per_point):
        d = channel.data()
        t = channel.time()
        tdelta = np.diff(t).mean()
        n = int(time_per_point/tdelta)
        end = n * int(len(d)/n)
        return (np.mean(t[:end].reshape(-1, n), 1), np.mean(d[:end].reshape(-1, n), 1))

    tpoint = QInputDialog.getDouble(None, "Time per point", "Time per point", 1)

    columns = ["Time"]
    data_ds = []

    for channel in data.timeSeriesList():
        t_ds, d_ds = downsample_for_export(channel, tpoint)
        if not data_ds:
            data_ds.append(t_ds)  # Store the downsampled time once, as the first column

        columns.append(channel.name)
        data_ds.append(d_ds)

    filename = QFileDialog.getSaveFileName()
    # DataFrame.from_items was removed in pandas 1.0; building from a dict
    # and passing columns explicitly keeps the same column order
    df = pd.DataFrame(dict(zip(columns, data_ds)), columns=columns)
    df.to_excel(filename, index=False, sheet_name="Downsampled data")
Note that now my downsample_for_export function returns a tuple of time and data so that the downsampled time can be used for plotting purposes. Also note that time in iolite is stored as “time since epoch” in seconds. See here for more info, but it is essentially the number of seconds since 1970-01-01 00:00:00.000 UTC.
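If you want to sanity check a time value, "seconds since epoch" can be converted to a readable UTC date with python's standard library. This is a small sketch with made-up example values; `epoch_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds):
    """Convert seconds since 1970-01-01 00:00:00 UTC to a timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

zero = epoch_to_utc(0)
example = epoch_to_utc(1580990400)
print(zero.isoformat())      # 1970-01-01T00:00:00+00:00
print(example.isoformat())   # 2020-02-06T12:00:00+00:00
```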
How I made the plot¶
    import matplotlib.pyplot as plt
    import numpy as np
    import matplotlib.dates as mdate

    s = data.selectionGroup('Demo').selection(0)
    U = data.timeSeries('U238').dataForSelection(s)
    t = data.timeSeries('U238').timeForSelection(s)
    t = mdate.epoch2num(t)

    def downsample_for_export(data_array, n):
        end = n * int(len(data_array)/n)
        return np.mean(data_array[:end].reshape(-1, n), 1)

    plt.clf()
    fig, ax = plt.subplots()
    plt.plot_date(t, U, "-", label='before')
    plt.plot_date(downsample_for_export(t, 4), downsample_for_export(U, 4), "-", label='after')

    date_formatter = mdate.DateFormatter("%H:%M:%S")
    ax.xaxis.set_major_formatter(date_formatter)
    fig.autofmt_xdate()
    plt.legend()
    plt.savefig('/Users/japetrus/Desktop/demo.png')
Accessing session data from 3rd party software¶
In iolite 3, sessions were saved as Igor Pro “packed experiment” pxp files. This was convenient because iolite 3 was built on top of Igor Pro. The main downside to this was that Igor Pro pxp files cannot be easily read by most 3rd party software. In iolite 4, sessions are saved as files with an “io4” extension. However, this extension is just for show as the data format of the file is actually HDF5. This is super convenient because many scientific data analysis and plotting packages support the HDF5 format.
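A quick way to convince yourself that a session file really is HDF5, whatever its extension, is to look at its first bytes: HDF5 files normally begin with a fixed 8-byte signature. The sketch below checks only offset 0 (the HDF5 format also permits the signature at later offsets when a user block is present), and the file it tests is a fake one written on the spot for demonstration, not a real iolite session.

```python
import os
import tempfile

HDF5_SIGNATURE = b'\x89HDF\r\n\x1a\n'   # First 8 bytes of a typical HDF5 file

def looks_like_hdf5(path):
    """Return True if the file starts with the 8-byte HDF5 signature."""
    with open(path, 'rb') as f:
        return f.read(8) == HDF5_SIGNATURE

# Demonstration with throwaway files (hypothetical names, not real sessions)
with tempfile.TemporaryDirectory() as tmp:
    fake_session = os.path.join(tmp, 'session.io4')
    with open(fake_session, 'wb') as f:
        f.write(HDF5_SIGNATURE + b'\x00' * 8)

    text_file = os.path.join(tmp, 'notes.txt')
    with open(text_file, 'wb') as f:
        f.write(b'just some text')

    demo_result = looks_like_hdf5(fake_session)
    text_result = looks_like_hdf5(text_file)

print(demo_result, text_result)
```

If the check passes, any HDF5-aware tool should be able to open the file; in python, for example, the h5py package provides `h5py.File(path, 'r')`, assuming it is installed in your environment.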
As an example, let’s go through how you can load data from an iolite 4 session file into Igor Pro to be plotted.
Making Igor Pro aware of HDF¶
Igor Pro comes with all the tools to enable HDF interaction but it does not enable them by default. To enable Igor’s HDF functionality (for Igor Pro 8 64bit, others may differ):
- Go to the Help -> Show Igor Pro Folder menu item.
- Go to the Help -> Show Igor Pro User Files menu item.
- Navigate to “More Extensions (64-bit)/File Loaders” in the Igor Pro folder, find “HDF5-64.xop” and copy it (or make a shortcut to it).
- Paste “HDF5-64.xop” in the “Igor Extensions (64-bit)” folder in the User Files folder.
- Navigate to “Wavemetrics Procedures/File Input Output” in the Igor Pro folder, find “HDF5 Browser.ipf” and copy it (or make a shortcut to it).
- Paste “HDF5 Browser.ipf” in the “Igor Procedures” folder in the User Files folder.
- Restart Igor Pro.
That’s it. You should now be able to access the HDF data browser by going to the Data -> Load Waves -> New HDF5 Browser menu item in Igor Pro.
Importing some data¶
When you activate the “New HDF5 Browser” menu item, you will be greeted by a dialog that looks as follows:
To start, we click the highlighted “Open HDF5 File” button and select the iolite 4 session file we want to get some data from. Once the file is loaded, you can navigate the various groups on the left and datasets on the right. Once you have found a channel you want to plot, you can select it and click the “Load Dataset” button highlighted below.
You may need to repeat this process a few times to get all the channels you want loaded into Igor Pro. Also note that some channels use “IndexTime”, which you can find in the “Special” group.
Adjusting the time¶
Igor and iolite handle time a bit differently. If it matters to you that the time axis is the true value, you’ll want to adjust it as follows (e.g. in the Command Window - Ctrl+J or Cmd+J):
IndexTime += Date2Sec(1970, 1, 1)
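The reason for this adjustment is that Igor Pro counts seconds from 1904-01-01, whereas iolite counts from 1970-01-01. The size of the shift can be worked out with a few lines of python; the helper name `unix_to_igor` below is hypothetical and only illustrates the arithmetic.

```python
from datetime import datetime

IGOR_EPOCH = datetime(1904, 1, 1)   # Igor Pro's date/time zero
UNIX_EPOCH = datetime(1970, 1, 1)   # Zero for "time since epoch" in iolite

# 24107 days between the two epochs, expressed in seconds
offset = int((UNIX_EPOCH - IGOR_EPOCH).total_seconds())

def unix_to_igor(seconds):
    """Shift seconds-since-1970 onto Igor's seconds-since-1904 scale."""
    return seconds + offset

print(offset)            # 2082844800
print(unix_to_igor(0))   # 2082844800
```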
Making a plot¶
Now that you have the channel and time data loaded, you can make a plot in the usual Igor Pro way. You can start by going to the Windows -> New Graph menu item and selecting the time as X and the channel as Y. I chose one of the final age outputs to plot and made some adjustments to the style:
Installing additional python packages¶
iolite comes with many useful python packages, but we cannot anticipate everything our users might want to use python for in iolite. If your great idea depends on additional python packages that we do not include, here is a quick overview of one way you can install those packages.
Typically the Python Package Index is the best place to start if you know which package you’re looking for. You would start by searching for the package, clicking the Download files link on the left side under Navigation and downloading the appropriate file for your operating system and python version.
This is where we encounter the first possible complication. Python packages that are pure python (no machine/operating system specific code) are easy and will normally be provided as a single .whl or zip file. Python packages that are not pure python are a bit trickier and require you to know which version of python you are using (version 3.6 is embedded in iolite at the time of writing but may change in the future) and which operating system / architecture you are using. As an example for the latter scenario, let’s look at the files available for SciPy. The format generally looks like:

    scipy-VERSION-cpPYTHONVERSION-cpPYTHONVERSIONm-OSINFO.whl
Where VERSION is the package version (1.4.1 as I write this), PYTHONVERSION is 36 and the OSINFO might be something like win_amd64 or macosx_10_6_intel. So, for this particular example, the files we would want to download would be scipy-1.4.1-cp36-cp36m-win_amd64.whl for PC and scipy-1.4.1-cp36-cp36m-macosx_10_6_intel.whl for Mac.
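To check at a glance whether a downloaded file matches your python version and platform, the file name can be split into its parts. The sketch below follows the standard wheel naming scheme (name, version, optional build tag, python tag, ABI tag, platform tag); `parse_wheel_filename` is a hypothetical helper and the second file name is made up for illustration.

```python
from collections import namedtuple

WheelName = namedtuple('WheelName', 'name version python_tag abi_tag platform_tag')

def parse_wheel_filename(filename):
    """Split a .whl file name into its dash-separated components."""
    if not filename.endswith('.whl'):
        raise ValueError('Not a .whl file: %s' % filename)
    parts = filename[:-len('.whl')].split('-')
    if len(parts) == 6:
        del parts[2]   # Drop the optional build tag
    if len(parts) != 5:
        raise ValueError('Unexpected wheel file name: %s' % filename)
    return WheelName(*parts)

info = parse_wheel_filename('scipy-1.4.1-cp36-cp36m-win_amd64.whl')
pure = parse_wheel_filename('example_pkg-1.0-py3-none-any.whl')  # Made-up pure python wheel
print(info)
print(pure)
```

A platform tag of `any` together with an ABI tag of `none` indicates a pure python wheel, which is the easy case described above.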
Note that this technique does not do any dependency resolution. For example, if package A depends on package B and you’ve installed A as above, you will also need to install package B.
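Since dependencies are not resolved for you, it can be useful to check which modules python can actually find before relying on a freshly copied package. The sketch below uses the standard library's `importlib.util.find_spec`; `missing_modules` is a hypothetical helper, and the last name in the example list is deliberately made up so that it is reported as missing.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that python cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]

result = missing_modules(['json', 'zipfile', 'surely_not_installed_pkg_xyz'])
print(result)
```

Running something like this with a package's known dependencies in the list gives a quick indication of what still needs to be installed.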
As an example, let’s install SymPy, a symbolic mathematics package. When we search the Python Package Index and go to the SymPy download files there are two versions: a .whl file and a .tar.gz. A .whl file is a python package format that is essentially a zip file, and the .tar.gz is also a compressed archive, but one not commonly used on Macs and PCs. If we download the .whl version, we could simply rename it to .zip, extract it and copy it to our python site-packages. The location of your site-packages can be found in iolite’s preferences. However, do note that you can also add a location outside of the application installation directory if you do not want these packages replaced every time you update iolite.
Alternatively, a simple python script runnable from the python workspace can be used to install .whl files:
    from iolite.QtGui import QFileDialog, QMessageBox
    import zipfile
    import sys

    whl_file = QFileDialog.getOpenFileName()
    # Use the first entry on sys.path that points at a site-packages folder
    site_path = [p for p in sys.path if 'site-packages' in p][0]

    button = QMessageBox.question(None, 'Install Wheel', 'Are you sure you want to install:\n%s\nto\n%s?'%(whl_file, site_path))

    if button == QMessageBox.Yes:
        with zipfile.ZipFile(whl_file, 'r') as zip_ref:
            zip_ref.extractall(site_path)

        QMessageBox.information(None, 'Install Wheel', 'You will need to restart iolite before you can use the new package')
Now if you try to import the sympy package in iolite, for example:
    from sympy import *
    # or
    from sympy import expr
you will likely see an error as follows:
    ModuleNotFoundError: No module named 'mpmath'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "…", line 1, in <module>
      File "/Applications/iolite4.app/Contents/Frameworks/python3.6/site-packages/sympy/__init__.py", line 21, in <module>
        raise ImportError("SymPy now depends on mpmath as an external library. "
    ImportError: SymPy now depends on mpmath as an external library. See https://docs.sympy.org/latest/install.html#mpmath for more information.
If you examine the error message, you’ll see that sympy also depends on a package called mpmath. We can again use the Python Package Index to find and download mpmath. In this case, they only provide a .tar.gz, so you would need to find a way to extract that on your operating system. Since the archive is not in the .whl format, we cannot simply copy the whole thing into our site-packages. However, by inspecting the contents, it is apparent that this is again a pure python package and we can copy the mpmath folder within the archive to iolite’s site-packages path.
Once that is complete, we can check that sympy is working via iolite’s console (or workspace):
    from sympy import *
    x = Symbol('x')
    print(integrate(1/x, x))  # outputs log(x)
And that is it! If you have any questions or if you are having a hard time installing a package, please send us an email.
Transforming laser data into channels and results¶
Using laser logs to help sort out what’s what in your data files is a significant time saver. Now, with iolite 4’s python API (see here and here for more info) you can do even more with your laser data!
Accessing laser data in python¶
The key is data.laserData(). This function returns a python dictionary or dict for short. That dict has keys corresponding to each log file that has been imported. The values associated with the file keys are also python dicts, but this time there are several keys, such as ‘x’, ‘y’, or ‘width’, and those keys are associated with data arrays.
As an example, we can get the laser’s time and x position as follows:
    laser_data = data.laserData()

    # Since I don't want to write out the log file path, we can look that up
    log_file = list(laser_data.keys())[0]  # The first log file (0 = first)

    x = laser_data[log_file]['x']
    x_time = x.time()
    x_data = x.data()
    # Now you can do stuff with that data!
This is great, but the laser log’s time is not the same as our normal data channel time. To get the laser data on the same time frame as our normal channels we can use NumPy to interpolate it on to our index time.
    import numpy as np

    index_time = data.timeSeries('TotalBeam').time()
    x_data_ontime = np.interp(index_time, x_time, x_data)
    # Now you can do stuff directly comparing laser data and channel data!
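To see what the interpolation does, here is a tiny example with made-up numbers in place of real laser log and channel data. Two behaviours of `np.interp` are worth knowing: values between log entries are interpolated linearly, and index times outside the range of the log simply take the first or last log value. It also expects the log times to be in increasing order.

```python
import numpy as np

# Made-up laser log: three x positions recorded at 0, 10 and 20 s
x_time = np.array([0.0, 10.0, 20.0])
x_data = np.array([100.0, 200.0, 150.0])

# Made-up index time, extending slightly beyond the log on both sides
index_time = np.array([-5.0, 0.0, 5.0, 15.0, 20.0, 25.0])

x_data_ontime = np.interp(index_time, x_time, x_data)
print(x_data_ontime)   # [100. 100. 150. 175. 150. 150.]
```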
We can take things one step further and actually create channels for each of the laser data arrays as follows:
data.createTimeSeries('Laser_x', data.Intermediate, index_time, x_data_ontime)
The beauty of this approach is that now the laser data can be treated like any other channel in iolite. You can visualize the data in the Time Series view, and you can also export results for selections on those channels.
Putting it all together¶
    import numpy as np

    ld = data.laserData()
    index_time = data.timeSeries('TotalBeam').time()
    laser_params = ['x', 'y', 'state', 'width', 'height', 'angle']

    for param in laser_params:
        laser_time = np.array([])
        laser_data = np.array([])

        # We will iterate through each of the main laserData keys and
        # concatenate the arrays to handle the case where multiple logs
        # have been imported
        for ld_key in ld.keys():
            laser_time = np.concatenate([laser_time, ld[ld_key][param].time()], axis=0)
            laser_data = np.concatenate([laser_data, ld[ld_key][param].data()], axis=0)

        interp_data = np.interp(index_time, laser_time, laser_data)
        data.createTimeSeries('Laser_%s'%(param), data.Intermediate, index_time, interp_data)
Having a look at some laser data¶
    import matplotlib.pyplot as plt

    # Get data
    x = data.timeSeries('Laser_x')
    y = data.timeSeries('Laser_y')

    fig, ax = plt.subplots()

    # Plot X (time is shifted so that it starts at 0)
    ax.plot(x.time() - x.time()[0], x.data(), 'b-')

    # Plot Y on right
    ax2 = fig.add_subplot(111, sharex=ax, frameon=False)
    ax2.yaxis.tick_right()
    ax2.yaxis.set_label_position('right')
    ax2.plot(y.time() - y.time()[0], y.data(), 'r-')

    # Labels
    ax.set_xlabel('Time (s)')
    ax.set_ylabel('X (um)')
    ax2.set_ylabel('Y (um)')
    ax.set_xlim( (0, 10000) )

    fig.tight_layout()
    plt.savefig('/Users/japetrus/Desktop/laserdata.png')

The laser x and y results for each selection in a group can also be plotted against each other:

    import matplotlib.pyplot as plt

    # Get data
    laserX = data.timeSeries('Laser_x')
    laserY = data.timeSeries('Laser_y')
    group = data.selectionGroup('G_NIST610')

    x = [data.result(s, laserX).value() for s in group.selections()]
    y = [data.result(s, laserY).value() for s in group.selections()]

    plt.plot(x, y, 'o')
    plt.xlabel('X (um)')
    plt.ylabel('Y (um)')
    plt.savefig('/Users/japetrus/Desktop/xy.png')
Comparing splines - part 1¶
Curve fitting plays an important role in iolite’s data reduction pipeline. In order to get accurate results we must have an accurate representation of how our backgrounds and sensitivities are evolving with time. iolite 3’s automatic spline was very good at providing a smooth but not too smooth representation of our data. We wanted to recreate that functionality in iolite 4, but since we no longer have access to Igor Pro’s fitting functions we had to go back to the drawing board.
Scanning the Igor Pro documentation regarding smoothing splines, one can discover that their implementation is based on “Smoothing by Spline Functions”, Christian H. Reinsch, Numerische Mathematik 10, but it also had some iolite special sauce. Hoping to keep our splining relatively consistent from iolite 3 to 4, especially the automatic spline, this algorithm seemed like a good place to start. It turns out that this algorithm is quite popular (> 2500 citations!). One adaptation of this algorithm that also adds generalized cross validation for the automatic choice of smoothing parameter was published in “Algorithm 642: A fast procedure for calculating minimum cross-validation cubic smoothing splines”, M.F. Hutchinson, Transactions on Mathematical Software 12, and it is this algorithm on which iolite 4’s automatic spline is based.
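For orientation, the cubic smoothing spline problem and the generalized cross validation (GCV) criterion mentioned above are commonly written as below. This is the usual textbook form in its unweighted version; the notation ($t_i$, $y_i$, $\lambda$, $A(\lambda)$) is an assumption chosen for illustration and is not taken from iolite's source, which may weight or scale things differently.

```latex
% Smoothing spline: trade off fidelity to the n measurements (t_i, y_i)
% against roughness, controlled by the smoothing parameter \lambda \ge 0
\hat{g}_\lambda = \arg\min_{g} \; \sum_{i=1}^{n} \bigl( y_i - g(t_i) \bigr)^2
  + \lambda \int \bigl( g''(t) \bigr)^2 \, dt

% GCV score: A(\lambda) is the matrix mapping the measurements y
% to the fitted values \hat{g}_\lambda(t_i)
\mathrm{GCV}(\lambda) =
  \frac{\tfrac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \hat{g}_\lambda(t_i) \bigr)^2}
       {\bigl( 1 - \operatorname{tr} A(\lambda) / n \bigr)^2}
```

With $\lambda = 0$ the spline passes through every point, while a very large $\lambda$ approaches a straight line fit. An automatic choice of smoothing picks the $\lambda$ that minimizes the GCV score.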
Comparing with iolite 3¶
iolite 4 comes with the igor python module bundled so you can read data from pxp files (i.e. iolite 3) in python. We can make use of this module to extract iolite 3 spline data for comparison with iolite 4 after importing an iolite 3 pxp session into iolite 4. See below for one way of doing that.
    from igor import packed
    import numpy as np
    import matplotlib.pyplot as plt

    # Specify the group and channel to check:
    group_name = 'G_NIST612'
    channel_name = 'Fe57'

    # Figure out which pxp was imported into iolite 4:
    f = data.importedFiles()[0].filePath()
    if not f.endswith('pxp'):
        raise RuntimeError('No iolite 3 experiment open?')

    # Load that pxp using the igor python module
    # (load returns (records, filesystem); we want the filesystem):
    d = packed.load(f)[1]

    # Get the spline data and time for the group and channel specified for iolite 3:
    splines_folder = d['root'][b'Packages'][b'iolite'][b'integration'][b'Splines']
    i3spline = splines_folder[b'%b_%b'%(channel_name.encode('utf-8'), group_name.encode('utf-8'))].wave['wave']['wData']
    i3t = np.copy(splines_folder[b'TimeWave_%b'%(group_name.encode('utf-8'))].wave['wave']['wData'])
    i3t -= i3t[0]  # Adjust the time so it starts at 0

    # Get the spline data and time for the group and channel specified for iolite 4:
    i4spline = data.spline(group_name, channel_name)
    i4data = np.copy(i4spline.data())
    i4t = np.copy(i4spline.time())
    i4t -= i4t[0]  # Adjust the time so it starts at 0

    # Interpolate the iolite 4 data to be on the same time frame as iolite 3's spline:
    i4_3t = np.interp(i3t, i4t, i4data)

    # Calculate a percent difference between the two
    i4_3t = 100*(i4_3t-i3spline)/i4_3t

    # Plot it:
    plt.clf()
    plt.plot(i3t, i3spline, label='iolite 3')
    plt.plot(i4t, i4data, label='iolite 4')
    #plt.plot(i3t, i4_3t) # to plot percent diff instead
    plt.xlim( (0, 10000) )
    plt.legend()
    plt.xlabel('Time (s)')
    plt.ylabel('%s - %s Spline %% difference'%(group_name, channel_name))
    plt.savefig('/home/japetrus/Desktop/spline_compare.png')
This is fine, but the two splines are almost perfectly on top of each other, so it is easier to see the difference if we plot a percent difference. This requires only a few small changes to the above script to achieve. The output looks as follows where you can see a mere 0.04 % difference between the two.
Comparing different spline types¶
Sometimes it is also nice to visualize how several different spline types match up with the measured data on which they’re based. The script below is one example of how you can do that.
    import matplotlib.pyplot as plt
    import numpy as np

    # Specify the group and channel to do the comparison for:
    group_name = 'G_NIST612'
    channel_name = 'Fe57'
    group = data.selectionGroup(group_name)

    # Make a list of spline types to compare:
    stypes = ['Spline_NoSmoothing', 'Spline_AutoSmooth', 'MeanMean', 'LinearFit']

    # Collect the measurement data into lists:
    x = [s.startTime.toMSecsSinceEpoch()/1000 for s in group.selections()]
    y = [data.result(s, data.timeSeries(channel_name)).value() for s in group.selections()]
    yerr = [data.result(s, data.timeSeries(channel_name)).uncertaintyAs2SE() for s in group.selections()]

    # Plot the measurement data as error bars:
    plt.errorbar(x, y, yerr=yerr, xerr=None, fmt='ko', label='Measurements')

    # For each of the spline types, get the spline and plot it
    for st in stypes:
        group.splineType = st
        s = data.spline(group_name, channel_name)
        plt.plot(s.time(), s.data(), label=st)

    # Finish up with the plot:
    plt.legend()
    plt.xlim( np.min(x)-100, np.max(x) + 100 )
    plt.ylabel('Spline for %s and %s (CPS)'%(group_name, channel_name))
    plt.xlabel('Time since epoch (s)')
    plt.savefig('/home/japetrus/Desktop/spline_test.png')