qmi.data.datastore
Routines for data storage.
Classes
DataFolder – A DataFolder represents a collection of files from a single measurement.
DataStore – A DataStore represents a collection of stored data.
- class qmi.data.datastore.DataFolder(folder_path: str, label: str | None = None, date_str: str | None = None, time_str: str | None = None)
A DataFolder represents a collection of files from a single measurement.
A DataFolder typically exists within a DataStore. In this case the DataFolder is identified by a label, date code and time code.
- The contents of a DataFolder may consist of:
any number of data files from the measurement;
a QMI configuration file and/or additional configuration files;
data files resulting from analysis;
plotted images.
- write_config(config: Any) None
Write QMI configuration to a file in the data folder.
- copy_file(filename: str) None
Copy an existing file into the data folder.
- Parameters:
filename – Path to existing file to be copied to the data folder.
- Raises:
FileExistsError – If the data folder already contains a file with the same name.
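The documented behaviour of copy_file (copy into the folder, refuse if the name is already taken) can be illustrated with the standard library alone. This is a minimal sketch, not QMI's implementation: the helper name copy_into_folder is hypothetical, and it assumes the target name is the base name of the source path.

```python
import os
import shutil
import tempfile


def copy_into_folder(folder_path, filename):
    """Copy `filename` into `folder_path`, refusing to overwrite.

    Hypothetical helper that mimics the documented copy_file behaviour.
    """
    # Assumption: the copy keeps the base name of the source file.
    target = os.path.join(folder_path, os.path.basename(filename))
    if os.path.exists(target):
        raise FileExistsError(f"{target} already exists")
    shutil.copy(filename, target)
    return target


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "notes.txt")
    with open(source, "w") as handle:
        handle.write("hello")
    folder = os.path.join(tmp, "folder")
    os.mkdir(folder)

    copied = copy_into_folder(folder, source)
    first_copy_ok = os.path.isfile(copied)

    # A second copy of the same file name must be rejected.
    try:
        copy_into_folder(folder, source)
        second_copy_refused = False
    except FileExistsError:
        second_copy_refused = True
```

The existence check and the copy are two separate steps here, so the sketch does not guard against another process creating the file in between.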
- write_dataset(ds: DataSet, file_format: str = 'hdf5', overwrite: bool = False, backend: str = 'h5py') None
Write the specified DataSet to a new or existing file in the data folder.
The file name will be determined from the name of the DataSet.
- Parameters:
ds – DataSet instance to write.
file_format – File format specification: “hdf5” selects HDF5 format (default); “text” selects a space-separated text format.
overwrite – Allow user to overwrite an existing dataset. Default is False.
backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.
- Raises:
ValueError – Dataset name is invalid.
ValueError – Invalid HDF5 file backend.
OSError – If the data folder already contains a file with the same name.
QMI_UsageException – Dataset name already exists and overwrite not allowed.
- read_dataset(name: str, backend: str = 'h5py') DataSet
Read a DataSet from the data folder.
The file name and format will be determined from the name of the DataSet and the contents of the data folder.
- Parameters:
name – Name of the dataset.
backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.
- Returns:
DataSet instance loaded from the data folder.
- Return type:
DataSet
- Raises:
ValueError – Dataset name is invalid.
ValueError – Invalid HDF5 file backend.
FileNotFoundError – If the data folder does not contain the specified dataset.
- make_hdf5file(name: str, backend: str = 'h5py') h5py.File | h5netcdf.File
Create a new HDF5 file in the data folder.
An error occurs if the specified file already exists.
- Parameters:
name – Base name of the HDF5 file, without the extension “.h5” or “.hdf5”.
backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.
- Returns:
A File object representing the new HDF5 file.
- Return type:
h5py.File | h5netcdf.File
- open_hdf5file(name: str, write_mode: bool = False, backend: str = 'h5py') h5py.File | h5netcdf.File
Open an existing HDF5 file in the data folder.
- Parameters:
name – Base name of the HDF5 file, without the extension “.h5” or “.hdf5”.
write_mode – True to open the file in read/write mode, False to open the file in read-only mode.
backend – Select backend for HDF5 file format. Options are “h5py” (default) and “h5netcdf”.
- Returns:
A File object representing the HDF5 file. See http://docs.h5py.org/ for information on how to use this object.
- add_dataset_to_file(hdf5_file: File | File, ds: DataSet, root_attrs: dict[str, str | int | float | complex | ndarray | integer | list[int | float | complex | str] | tuple[int | float | complex | str, ...]] | None = None) None
Add a dataset, and optionally file attributes, to an existing HDF5 file.
- Parameters:
hdf5_file – The file instance to add the dataset to.
ds – A QMI dataset instance that is to be added.
root_attrs – A dictionary of attributes that should be written in the file root. Default is None.
- Raises:
QMI_UsageException – If an attribute with the dataset name already exists in the file.
- class qmi.data.datastore.DataStore(basedir: str)
A DataStore represents a collection of stored data.
A DataStore instance can potentially contain data from many different types of measurements, performed at different times under different conditions.
A DataStore instance corresponds to a folder in the file system which contains the actual stored files. The DataStore instance provides a convenient interface to store and access the data.
- The file system structure of the DataStore is as follows:
<basedir>/<date_str>/<time_str>_<label>/<measurement_file>
- where
<date_str> is an 8-digit string in YYYYmmdd format; <time_str> is a 6-digit string in HHMMSS format.
In other words, the DataStore base directory contains a separate sub-directory for each date. Each of these date sub-directories contains a separate sub-directory for each measurement, labeled by a time code and label for the measurement. Each of the measurement subdirectories contains any number of files related to the measurement.
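The layout above can be made concrete with a short sketch that composes and parses a measurement directory path. The helper names measurement_dir and split_measurement_dir are hypothetical and not part of QMI; only the `<basedir>/<date_str>/<time_str>_<label>` pattern is taken from the description.

```python
import os
import re
from datetime import datetime

# <time_str>_<label>: six digits, an underscore, then the label.
LEAF_PATTERN = re.compile(r"(\d{6})_(.+)")


def measurement_dir(basedir, label, when):
    """Compose <basedir>/<date_str>/<time_str>_<label> (hypothetical helper)."""
    date_str = when.strftime("%Y%m%d")  # 8 digits, YYYYmmdd
    time_str = when.strftime("%H%M%S")  # 6 digits, HHMMSS
    return os.path.join(basedir, date_str, f"{time_str}_{label}")


def split_measurement_dir(path):
    """Recover (date_str, time_str, label) from a measurement directory path."""
    head, leaf = os.path.split(os.path.normpath(path))
    date_str = os.path.basename(head)
    match = LEAF_PATTERN.fullmatch(leaf)
    if match is None or not re.fullmatch(r"\d{8}", date_str):
        raise ValueError(f"not a measurement directory: {path}")
    return date_str, match.group(1), match.group(2)


example_path = measurement_dir("/data", "rabi", datetime(2024, 3, 5, 14, 7, 9))
example_parts = split_measurement_dir(example_path)
```

Because the time code has a fixed width, the label is everything after the first underscore, so labels that themselves contain underscores still parse unambiguously in this sketch.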
- make_folder(label: str, timestamp: float | None = None, date_str: str | None = None, time_str: str | None = None) DataFolder
Create a new DataFolder with a unique name within the DataStore.
Optionally, either “timestamp” or both “date_str” and “time_str” may be specified to determine the date and time codes of the folder name. When neither is specified, the current date and time are used.
- Parameters:
label – Short label describing the measurement. This label will be part of the directory name in the file system, so it should not contain whitespace or other characters that are awkward in file names.
timestamp – Optional POSIX timestamp to use for the folder name.
date_str – Optional date code to use for the folder name. If specified, it must be a string of 8 digits in YYYYmmdd format.
time_str – Optional time code to use for the folder name. If specified, it must be a string of 6 digits in HHMMSS format.
- Returns:
New DataFolder instance.
- Raises:
FileExistsError – If the DataFolder already exists.
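The rule for choosing the date and time codes (a timestamp, or both codes, or neither) can be sketched as follows. This is only an illustration of the documented argument handling, not QMI's code: resolve_codes is a hypothetical name, and the use of local time and of ValueError for invalid combinations are assumptions.

```python
import re
import time


def resolve_codes(timestamp=None, date_str=None, time_str=None):
    """Return (date_str, time_str) for a folder name (hypothetical helper)."""
    if timestamp is not None:
        if date_str is not None or time_str is not None:
            # Assumption: mixing both styles is treated as an error.
            raise ValueError("give either timestamp or date_str/time_str")
        # Assumption: the POSIX timestamp is rendered in local time.
        local = time.localtime(timestamp)
        return time.strftime("%Y%m%d", local), time.strftime("%H%M%S", local)

    if date_str is None and time_str is None:
        # Neither specified: fall back to the current date and time.
        local = time.localtime()
        return time.strftime("%Y%m%d", local), time.strftime("%H%M%S", local)

    if date_str is None or time_str is None:
        raise ValueError("date_str and time_str must be given together")
    if not re.fullmatch(r"\d{8}", date_str):
        raise ValueError("date_str must be 8 digits in YYYYmmdd format")
    if not re.fullmatch(r"\d{6}", time_str):
        raise ValueError("time_str must be 6 digits in HHMMSS format")
    return date_str, time_str


explicit_codes = resolve_codes(date_str="20240305", time_str="140709")
current_codes = resolve_codes()
```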
- get_folder(label: str, date_str: str, time_str: str) DataFolder
Open the DataFolder with the specified date code, time code and label.
- Parameters:
date_str – Date code of the DataFolder.
time_str – Time code of the DataFolder.
label – Label of the DataFolder.
- Returns:
The matching DataFolder.
- Raises:
FileNotFoundError – If the specified DataFolder does not exist.
- get_folder_from_path(path: str) DataFolder
Open the DataFolder with the specified path in the filesystem.
- Parameters:
path – Path to the DataFolder. The path may be either an absolute path, or a relative path from the DataStore base directory, or a relative path from the current directory.
- Returns:
The matching DataFolder instance.
- Raises:
FileNotFoundError – If the specified DataFolder does not exist.
- list_folders(label: str | None = None) list[DataFolder]
Return a list of DataFolder items in the DataStore.
This function may be slow when used on a large DataStore.
- Parameters:
label – Optional folder label. When specified, only folders with a matching name are returned.
- Returns:
List of matching DataFolder items.
- Return type:
list[DataFolder]
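To see why listing can be slow on a large store, consider a sketch that walks the two directory levels of the layout and filters by label. It is not QMI's implementation: list_measurement_dirs is a hypothetical helper, it returns plain tuples rather than DataFolder objects, and it assumes that a "matching" label means exact equality.

```python
import os
import re
import tempfile


def list_measurement_dirs(basedir, label=None):
    """Return (date_str, time_str, label) tuples found under basedir."""
    found = []
    for date_str in sorted(os.listdir(basedir)):
        date_dir = os.path.join(basedir, date_str)
        # Skip anything that is not an 8-digit date directory.
        if not re.fullmatch(r"\d{8}", date_str) or not os.path.isdir(date_dir):
            continue
        for leaf in sorted(os.listdir(date_dir)):
            match = re.fullmatch(r"(\d{6})_(.+)", leaf)
            if match is None or not os.path.isdir(os.path.join(date_dir, leaf)):
                continue
            # Assumption: the label filter is an exact comparison.
            if label is None or match.group(2) == label:
                found.append((date_str, match.group(1), match.group(2)))
    return found


# Build a tiny store in a temporary directory and list it.
with tempfile.TemporaryDirectory() as basedir:
    for parts in [("20240101", "120000_rabi"),
                  ("20240102", "090000_ramsey"),
                  ("20240102", "100000_rabi"),
                  ("scratch", "unrelated")]:
        os.makedirs(os.path.join(basedir, *parts))

    all_dirs = list_measurement_dirs(basedir)
    rabi_dirs = list_measurement_dirs(basedir, label="rabi")
```

Every date directory is scanned regardless of the label filter, which is where the cost comes from when a store holds many measurements.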
- find_latest_folder(label: str, date_str: str | None = None) DataFolder | None
Find the most recent matching DataFolder item.
- Parameters:
label – Folder label to search for.
date_str – Optional date code to restrict the search. When not specified, the most recent matching folder from any date is returned.
- Returns:
Most recent matching DataFolder, or None if no matching DataFolder exists.
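Because the date and time codes are fixed-width digit strings, comparing them as strings gives chronological order. The sketch below uses that property to pick the most recent match from a list of (date_str, time_str, label) tuples. The function latest_entry and the tuple representation are hypothetical; QMI itself returns DataFolder objects.

```python
def latest_entry(entries, label, date_str=None):
    """Return the most recent (date_str, time_str, label) tuple, or None."""
    matches = [
        entry for entry in entries
        if entry[2] == label and (date_str is None or entry[0] == date_str)
    ]
    # YYYYmmdd and HHMMSS sort chronologically when compared as strings.
    return max(matches, key=lambda entry: (entry[0], entry[1]), default=None)


entries = [
    ("20240101", "120000", "rabi"),
    ("20240102", "090000", "ramsey"),
    ("20240102", "100000", "rabi"),
    ("20240102", "083000", "rabi"),
]

latest_rabi = latest_entry(entries, "rabi")
latest_rabi_first_day = latest_entry(entries, "rabi", date_str="20240101")
latest_missing = latest_entry(entries, "echo")
```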