qmi.data.datastore

Routines for data storage.

Classes

DataFolder(folder_path[, label, date_str, ...])

A DataFolder represents a collection of files from a single measurement.

DataStore(basedir)

A DataStore represents a collection of stored data.

class qmi.data.datastore.DataFolder(folder_path: str, label: str | None = None, date_str: str | None = None, time_str: str | None = None)

A DataFolder represents a collection of files from a single measurement.

A DataFolder typically exists within a DataStore. In this case the DataFolder is identified by a label, date code and time code.

The contents of a DataFolder may consist of:
  • any number of data files from the measurement;

  • a QMI configuration file and/or additional configuration files;

  • data files resulting from analysis;

  • plotted images.

write_config(config: Any) → None

Write QMI configuration to a file in the data folder.

copy_file(filename: str) → None

Copy an existing file into the data folder.

Parameters:

filename – Path to existing file to be copied to the data folder.

Raises:

FileExistsError – If the data folder already contains a file with the same name.
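
Because copy_file raises FileExistsError on a name clash, a caller copying several files may want to decide what happens on a duplicate. The following is a minimal sketch, not part of QMI: the helper name copy_files_skip_existing is hypothetical, and "folder" is assumed to be a DataFolder instance (or any object with the same copy_file method).

```python
from typing import Iterable


def copy_files_skip_existing(folder, filenames: Iterable[str]) -> list[str]:
    """Copy each file into the data folder; return the paths that were skipped.

    Hypothetical helper: `folder` is assumed to expose copy_file(filename)
    as documented above.
    """
    skipped = []
    for filename in filenames:
        try:
            folder.copy_file(filename)
        except FileExistsError:
            # The data folder already holds a file with this name; keep it.
            skipped.append(filename)
    return skipped
```

Returning the skipped paths instead of silently ignoring them lets the caller log or rename the duplicates.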

write_dataset(ds: DataSet, file_format: str = 'hdf5', overwrite: bool = False, backend: str = 'h5py') → None

Write the specified DataSet to a new or existing file in the data folder.

The file name will be determined from the name of the DataSet.

Parameters:
  • ds – DataSet instance to write.

  • file_format – File format specification: “hdf5” selects the HDF5 format (default); “text” selects a space-separated text format.

  • overwrite – Allow user to overwrite an existing dataset. Default is False.

  • backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.

Raises:
  • ValueError – Dataset name is invalid.

  • ValueError – Invalid HDF5 file backend.

  • OSError – If the data folder already contains a file with the same name.

  • QMI_UsageException – Dataset name already exists and overwrite not allowed.

read_dataset(name: str, backend: str = 'h5py') → DataSet

Read a DataSet from the data folder.

The file name and format will be determined from the name of the DataSet and the contents of the data folder.

Parameters:
  • name – Name of the dataset.

  • backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.

Returns:

Instance loaded from the data folder.

Return type:

DataSet

Raises:
  • ValueError – Dataset name is invalid.

  • ValueError – Invalid HDF5 file backend.

  • FileNotFoundError – If the data folder does not contain the specified dataset.

make_hdf5file(name: str, backend: str = 'h5py') → h5py.File | h5netcdf.File

Create a new HDF5 file in the data folder.

An error occurs if the specified file already exists.

Parameters:
  • name – Base name of the HDF5 file, without the extension “.h5” or “.hdf5”.

  • backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.

Returns:

A file object representing the new HDF5 file.

Return type:

h5py.File | h5netcdf.File

open_hdf5file(name: str, write_mode: bool = False, backend: str = 'h5py') → h5py.File | h5netcdf.File

Open an existing HDF5 file in the data folder.

Parameters:
  • name – Base name of the HDF5 file, without the extension “.h5” or “.hdf5”.

  • write_mode – True to open the file in read/write mode, False to open the file in read-only mode.

  • backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.

Returns:

A File object representing the HDF5 file. See http://docs.h5py.org/ for information on how to use this object.

add_dataset_to_file(hdf5_file: h5py.File | h5netcdf.File, ds: DataSet, root_attrs: dict[str, str | int | float | complex | ndarray | integer | list[int | float | complex | str] | tuple[int | float | complex | str, ...]] | None = None) → None

Add a dataset and optional file attributes to an existing HDF5 file.

Parameters:
  • hdf5_file – The file instance to which the dataset is added.

  • ds – A QMI dataset instance that is to be added.

  • root_attrs – A dictionary of attributes that should be written in the file root. Default is None.

Raises:

QMI_UsageException – If an attribute with the dataset name already exists in the file.

class qmi.data.datastore.DataStore(basedir: str)

A DataStore represents a collection of stored data.

A DataStore instance can potentially contain data from many different types of measurements, performed at different times under different conditions.

A DataStore instance corresponds to a folder in the file system which contains the actual stored files. The DataStore instance provides a convenient interface to store and access the data.

The file system structure of the DataStore is as follows:

<basedir>/<date_str>/<time_str>_<label>/<measurement_file>

where

<date_str> is an 8-digit string in YYYYmmdd format; <time_str> is a 6-digit string in HHMMSS format.

In other words, the DataStore base directory contains a separate sub-directory for each date. Each of these date sub-directories contains a separate sub-directory for each measurement, labeled by a time code and label for the measurement. Each of the measurement subdirectories contains any number of files related to the measurement.
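
The layout above can be illustrated with a few lines of Python. This is only a sketch of the naming scheme, not QMI code: the helper measurement_folder_path and the example values are hypothetical, and PurePosixPath is used so the result looks the same on every platform.

```python
from pathlib import PurePosixPath


def measurement_folder_path(
    basedir: str, date_str: str, time_str: str, label: str
) -> PurePosixPath:
    """Build <basedir>/<date_str>/<time_str>_<label> (hypothetical helper)."""
    return PurePosixPath(basedir) / date_str / f"{time_str}_{label}"


# Example values are made up for illustration.
example = measurement_folder_path("/data", "20240131", "142530", "rabi_scan")
print(example)  # /data/20240131/142530_rabi_scan
```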

make_folder(label: str, timestamp: float | None = None, date_str: str | None = None, time_str: str | None = None) → DataFolder

Create a new DataFolder with a unique name within the DataStore.

Optionally, either “timestamp” or both “date_str” and “time_str” may be specified to determine the date and time codes of the folder name. When none of these is specified, the current date and time are used.

Parameters:
  • label – Short label describing the measurement. This label will be part of the directory name in the file system. It should not contain whitespace or special characters.

  • timestamp – Optional POSIX timestamp to use for the folder name.

  • date_str – Optional date code to use for the folder name. If specified, it must be a string of 8 digits in YYYYmmdd format.

  • time_str – Optional time code to use for the folder name. If specified, it must be a string of 6 digits in HHMMSS format.

Returns:

New DataFolder instance.

Raises:

FileExistsError – If the DataFolder already exists.
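
To make the relationship between a POSIX timestamp and the two codes concrete, here is a minimal sketch. It is not the QMI implementation: the helper names are hypothetical, and the time zone is an assumption (UTC is used so the example is reproducible; the library may well use local time).

```python
import re
from datetime import datetime, timezone

DATE_RE = re.compile(r"[0-9]{8}")  # YYYYmmdd
TIME_RE = re.compile(r"[0-9]{6}")  # HHMMSS


def codes_from_timestamp(timestamp: float, tz=timezone.utc) -> tuple[str, str]:
    """Convert a POSIX timestamp to (date_str, time_str). UTC is an assumption."""
    moment = datetime.fromtimestamp(timestamp, tz=tz)
    return moment.strftime("%Y%m%d"), moment.strftime("%H%M%S")


def check_codes(date_str: str, time_str: str) -> None:
    """Raise ValueError unless the codes are well-formed and denote a real time."""
    if not DATE_RE.fullmatch(date_str):
        raise ValueError(f"date_str must be 8 digits (YYYYmmdd): {date_str!r}")
    if not TIME_RE.fullmatch(time_str):
        raise ValueError(f"time_str must be 6 digits (HHMMSS): {time_str!r}")
    # strptime also rejects impossible values such as month 13 or hour 25.
    datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")
```

Checking the codes before calling make_folder gives an early, readable error instead of a malformed directory name.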

get_folder(label: str, date_str: str, time_str: str) → DataFolder

Open the DataFolder with the specified label, date code and time code.

Parameters:
  • date_str – Date code of the DataFolder.

  • time_str – Time code of the DataFolder.

  • label – Label of the DataFolder.

Returns:

The matching DataFolder.

Raises:

FileNotFoundError – If the specified DataFolder does not exist.

get_folder_from_path(path: str) → DataFolder

Open the DataFolder with the specified path in the filesystem.

Parameters:

path – Path to the DataFolder. The path may be either an absolute path, or a relative path from the DataStore base directory, or a relative path from the current directory.

Returns:

The matching DataFolder instance.

Raises:

FileNotFoundError – If the specified DataFolder does not exist.

list_folders(label: str | None = None) → list[DataFolder]

Return a list of DataFolder items in the DataStore.

This function may be slow when used on a large DataStore.

Parameters:

label – Optional folder label. When specified, only folders with a matching name are returned.

Returns:

List of matching DataFolder items.

Return type:

list[DataFolder]
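
The note about speed follows from the directory layout: listing every folder means visiting every date sub-directory. The sketch below shows such a scan using only the standard library. It is not how QMI implements list_folders; the names FolderEntry and scan_folders are hypothetical, and treating "matching" as exact label equality is an assumption.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional

DATE_DIR_RE = re.compile(r"[0-9]{8}")            # <date_str>
MEAS_DIR_RE = re.compile(r"([0-9]{6})_(.+)")     # <time_str>_<label>


class FolderEntry(NamedTuple):
    date_str: str
    time_str: str
    label: str
    path: Path


def scan_folders(basedir: str, label: Optional[str] = None) -> list[FolderEntry]:
    """Walk <basedir>/<date_str>/<time_str>_<label> and collect the entries."""
    entries = []
    for date_dir in sorted(Path(basedir).iterdir()):
        if not (date_dir.is_dir() and DATE_DIR_RE.fullmatch(date_dir.name)):
            continue  # not a date sub-directory
        for meas_dir in sorted(date_dir.iterdir()):
            match = MEAS_DIR_RE.fullmatch(meas_dir.name)
            if not (match and meas_dir.is_dir()):
                continue  # stray file or unrelated directory
            time_str, found_label = match.groups()
            if label is None or found_label == label:
                entries.append(
                    FolderEntry(date_dir.name, time_str, found_label, meas_dir)
                )
    return entries
```

The cost grows with the total number of date and measurement directories, which is why restricting a search by date, where possible, is cheaper than a full listing.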

find_latest_folder(label: str, date_str: str | None = None) → DataFolder | None

Find the most recent matching DataFolder item.

Parameters:
  • label – Folder label to search for.

  • date_str – Optional date code to restrict the search. When not specified, the most recent matching folder from any date is returned.

Returns:

Most recent matching DataFolder, or None if no matching DataFolder exists.
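
Because both codes are fixed-width and zero-padded, comparing them as plain strings orders folders chronologically. The following sketch uses that property to pick the most recent entry from a list of (date_str, time_str, label) tuples. It illustrates the idea only; latest_entry is a hypothetical helper and says nothing about how QMI performs the search.

```python
from typing import Iterable, Optional

Entry = tuple[str, str, str]  # (date_str, time_str, label)


def latest_entry(
    entries: Iterable[Entry], label: str, date_str: Optional[str] = None
) -> Optional[Entry]:
    """Return the most recent entry with the given label, or None."""
    candidates = [
        entry
        for entry in entries
        if entry[2] == label and (date_str is None or entry[0] == date_str)
    ]
    # YYYYmmdd and HHMMSS strings sort lexicographically in time order.
    return max(candidates, key=lambda entry: (entry[0], entry[1]), default=None)
```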