General Utilities
These utilities are used by the pipeliner for basic tasks such as nice-looking on-screen display, checking file names, and getting directory and file names.
- class pipeliner.utils.DirectoryBasedLock(dirname: str | os.PathLike[str] = '.relion_lock', timeout=60.0)
Bases:
object
A lock based on the creation and existence of a directory on the file system.
The interface is almost the same as Python’s standard multiprocessing.Lock, except for some changes related to timeout behaviour:
There is a default timeout of 60 seconds when acquiring the lock (rather than the default None value, with corresponding infinite timeout, that is used by multiprocessing.Lock). This is for compatibility with previous RELION locking timeout behaviour.
A timeout for use when entering a context manager can be set when the lock object is created. Note that this value is ignored if the acquire() method is called directly. If there is a timeout waiting to acquire the lock when entering a context manager, a TimeoutError is raised.
The principle of this lock is that directory creation is an atomic operation provided by the file system, even in (most, modern) networked file systems. If several processes try to create the same directory at the same time, only one will succeed and the rest will get an error. Therefore, we can use this as a locking primitive, acquiring the lock if we successfully create the directory and releasing it by deleting the directory afterwards.
The lock directory name can be set if required. For compatibility with RELION, the default directory name is “.relion_lock”.
- acquire(block=True, timeout=60.0)
Acquire a lock, blocking or non-blocking.
With the block argument set to True (the default), the method call will block until the lock is in an unlocked state, then set it to locked and return True.
With the block argument set to False, the method call does not block. If the lock is currently in a locked state, return False; otherwise set the lock to a locked state and return True.
When invoked with a positive, floating-point value for timeout, block for at most the number of seconds specified by timeout as long as the lock cannot be acquired. The default is 60.0 seconds; note that this is different from the default timeout in multiprocessing.Lock.acquire().
Invocations with a negative value for timeout are equivalent to a timeout of zero. Invocations with a timeout value of None set the timeout period to infinite. The timeout argument has no practical implications if the block argument is set to False and is thus ignored.
- Returns:
True if the lock has been acquired or False if the timeout period has elapsed.
- Raises:
Various possible errors from os.mkdir(), including FileNotFoundError or PermissionError.
- release()
Release the lock.
This can be called from any thread, not only the thread which has acquired the lock.
When the lock is locked, reset it to unlocked, and return. If any other threads are blocked waiting for the lock to become unlocked, allow exactly one of them to proceed.
When invoked on an unlocked lock, a RuntimeError is raised.
There is no return value.
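The locking principle and the acquire/release semantics described above can be sketched in a few lines. This is a minimal illustration, not the pipeliner implementation: the class name `SketchDirLock`, the default directory name `.sketch_lock`, and the `poll_interval` parameter are all hypothetical, and the real class may poll, clean up, or report errors differently.

```python
import os
import time


class SketchDirLock:
    """Hypothetical stand-in for a directory-based lock (not pipeliner code)."""

    def __init__(self, dirname=".sketch_lock", timeout=60.0):
        self.dirname = dirname
        self.timeout = timeout  # only used when entering a context manager

    def acquire(self, block=True, timeout=60.0, poll_interval=0.05):
        # A negative timeout is treated as zero; None means wait forever.
        if timeout is not None and timeout < 0:
            timeout = 0.0
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            try:
                # Directory creation is atomic: only one caller can succeed.
                os.mkdir(self.dirname)
                return True
            except FileExistsError:
                if not block:
                    return False
                if deadline is not None and time.monotonic() >= deadline:
                    return False
                time.sleep(poll_interval)  # assumed polling strategy

    def release(self):
        try:
            os.rmdir(self.dirname)
        except FileNotFoundError:
            raise RuntimeError("release() called on an unlocked lock") from None

    def __enter__(self):
        if not self.acquire(timeout=self.timeout):
            raise TimeoutError(f"timed out waiting for {self.dirname}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
```

Because the lock state lives on the file system, two separate objects (or two separate processes) pointing at the same directory name contend for the same lock. Other errors from os.mkdir(), such as PermissionError, propagate to the caller unchanged in this sketch.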
- pipeliner.utils.check_for_illegal_symbols(check_string: str, string_name: str = 'input', exclude: str = '') str | None
Check a text string doesn’t have any of the disallowed symbols.
Illegal symbols are !*?()^/#<>&%{}$.”’ and @.
- Parameters:
- Returns:
An error message if any illegal symbols are present
- Return type:
str | None
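As a rough illustration of this kind of check, the sketch below scans a string for disallowed characters and returns an error message or None. It is not the pipeliner code. The function name, the message wording, and the reading of `exclude` as "symbols to leave out of the check" are assumptions. The character set is copied from the list above using plain ASCII quotes, and whether the full stop belongs to that set is also an assumption.

```python
from typing import Optional

# Assumed character set, read off the list above with plain ASCII quotes.
ILLEGAL_SYMBOLS = "!*?()^/#<>&%{}$.\"'@"


def find_illegal_symbols_sketch(
    check_string: str, string_name: str = "input", exclude: str = ""
) -> Optional[str]:
    """Return an error message if banned characters are present, else None."""
    # Assumption: characters listed in `exclude` are allowed for this call.
    banned = set(ILLEGAL_SYMBOLS) - set(exclude)
    found = sorted(set(check_string) & banned)
    if not found:
        return None
    # Hypothetical message format; the real wording is not documented above.
    return f"Illegal symbol(s) in {string_name}: {' '.join(found)}"
```

Returning a message rather than raising lets the caller decide whether a bad name is fatal or just worth a warning.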
- pipeliner.utils.clean_job_dirname(dirname: str) str
Makes sure a pipeline job_dir name is valid and in the right format
- Parameters:
dirname (str) – The dirname to check
- Returns:
The correctly formatted dirname
- Return type:
str
- Raises:
ValueError – If the dir name cannot be formatted correctly
- pipeliner.utils.clean_jobname(jobname: str) str
Makes sure job names are in the correct format
Job names must have a trailing slash, cannot begin with a slash, and have no illegal characters
- pipeliner.utils.compare_nested_lists(a_list: list, e_list: list, tolerance: float = 0.0)
Compare two nested lists, allowing for a tolerance on float values
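A tolerance-aware comparison of nested lists can be sketched recursively. This is an illustration only: the function name is hypothetical, and since the return convention of the real function is not stated above, this version simply returns a bool.

```python
def nested_lists_match_sketch(a_list, e_list, tolerance=0.0):
    """Return True if two (possibly nested) lists match within a tolerance."""
    both_sequences = isinstance(a_list, (list, tuple)) and isinstance(
        e_list, (list, tuple)
    )
    if both_sequences:
        # Same length, and every pair of elements must match recursively.
        return len(a_list) == len(e_list) and all(
            nested_lists_match_sketch(a, e, tolerance)
            for a, e in zip(a_list, e_list)
        )
    if isinstance(a_list, float) or isinstance(e_list, float):
        # Floats are compared with an absolute tolerance.
        try:
            return abs(a_list - e_list) <= tolerance
        except TypeError:
            return False  # e.g. a float compared against a string or list
    # Everything else (ints, strings, None, ...) must be exactly equal.
    return a_list == e_list
```

With the default tolerance of 0.0 this behaves like an exact comparison, which matches the default in the signature above.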
- pipeliner.utils.convert_relative_filename(filename: str) str
Convert a filename that is relative to the project to just its name
For example:
../../my_dir/my_file.txt -> my_dir/my_file
/my_dir/my_file.txt -> my_dir/my_file
~/my_dir/my_file.txt -> my_dir/my_file
- Parameters:
filename –
- Returns:
The part of the file path that is not relative to the project
- Return type:
str
- pipeliner.utils.count_file_lines(filename: str) int
Fast and efficient count of number of lines in a file
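One common way to count lines without loading a whole file into memory is to read it in binary chunks and count newline bytes. The sketch below shows that technique; it is not necessarily how the pipeliner function works, and the function name and `chunk_size` parameter are hypothetical.

```python
def count_lines_sketch(filename: str, chunk_size: int = 1 << 20) -> int:
    """Count newline characters in a file, reading it in fixed-size chunks."""
    count = 0
    with open(filename, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
    # Note: a final line with no trailing newline is not counted here.
    return count
```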
- pipeliner.utils.date_time_tag(compact: bool = False) str
Get a current date and time tag
It can return a compact version or one that is easier to read
- pipeliner.utils.decompose_pipeline_filename(fn_in: str) Tuple[str, int, str]
Breaks a job name into usable pieces
Returns everything before the job number, the job number as an int, and everything after the job number. Set up for paths up to 20 directories deep. The 20-directory limit comes from the RELION code but is no longer really necessary.
- Parameters:
fn_in (str) – The job or file name to be broken down in the format: <jobtype>/jobxxx/<filename>
- Returns:
- Return type:
Tuple[str, int, str]
- Raises:
ValueError – If the input file name is more than 20 directories deep
- pipeliner.utils.file_in_project(filename: str) bool
Check that a file is part of the project
Not done with os.path.abspath(file).startswith(project_dir) because this causes errors during testing
- pipeliner.utils.find_common_string(input_strings: List[str]) str
Find the common part of a list of strings starting from the beginning
- Parameters:
input_strings (list) – List of strings to compare
- Returns:
The common portion of the strings
- Return type:
str
- Raises:
ValueError – If input_strings contains fewer than 2 strings
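Finding the shared leading portion of several strings can be done with the standard library. The sketch below is one way to get the documented behaviour, including the ValueError for short inputs. The function name and the error message are hypothetical, and the real implementation may differ.

```python
import os
from typing import List


def common_prefix_sketch(input_strings: List[str]) -> str:
    """Return the longest prefix shared by all the strings."""
    if len(input_strings) < 2:
        raise ValueError("need at least two strings to compare")
    # os.path.commonprefix compares character by character, not path by path.
    return os.path.commonprefix(input_strings)
```

Note that the comparison is character-wise, so `Import/job001/a` and `Import/job002/b` share `Import/job00`, not `Import/`.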
- pipeliner.utils.format_string_to_type_objs(in_str: str) str | int | float | bool | None
Returns Int, Float, Bool, and None Objects from strings
Any number with a decimal point, in scientific notation, or ‘NaN’ will return a float
Any other number will return an int
‘False’ or ‘false’ returns False
‘True’ or ‘true’ returns True
‘None’ returns a NoneType object
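The conversion rules above can be sketched as a short chain of checks. This is an illustration under assumptions, not the pipeliner code: the function name is hypothetical, strings matching none of the rules are assumed to be returned unchanged, and Python's own `int()` and `float()` parsers accept a few extra spellings (for example `inf` or `1_000`) that the real function may treat differently.

```python
from typing import Union


def parse_scalar_sketch(in_str: str) -> Union[str, int, float, bool, None]:
    """Convert a string to bool, None, int, or float where the rules apply."""
    if in_str in ("True", "true"):
        return True
    if in_str in ("False", "false"):
        return False
    if in_str == "None":
        return None
    try:
        return int(in_str)  # plain integers such as "42" or "-7"
    except ValueError:
        pass
    try:
        return float(in_str)  # decimals, scientific notation, and "NaN"
    except ValueError:
        return in_str  # assumption: anything else stays a string
```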
- pipeliner.utils.get_file_size_mb(file: str | Path) float
Get the size of a file in MB, rounded to 2 decimal places
- Parameters:
file (str) – The file to check
- pipeliner.utils.get_job_number(job_name)
Get the job number from a pipeliner job as an int
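Assuming job names follow the `<jobtype>/jobxxx/` pattern mentioned elsewhere in this section, the number can be pulled out with a regular expression on each path component. The function name and the error raised for a missing job component are hypothetical, and the real function may behave differently on malformed input.

```python
import re


def job_number_sketch(job_name: str) -> int:
    """Extract the numeric part of the first 'jobNNN' path component."""
    for part in job_name.split("/"):
        match = re.fullmatch(r"job(\d+)", part)
        if match:
            return int(match.group(1))  # int() drops the leading zeros
    # Assumed error behaviour; not documented above.
    raise ValueError(f"no jobNNN component found in {job_name!r}")
```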
- pipeliner.utils.get_job_runner_command() List[str]
Get the full command to run the job_runner.py script.
- pipeliner.utils.get_job_script(name: str) str
Get the full path to a job script file.
- Returns:
The job script file, if it exists.
- Raises:
FileNotFoundError – if the named job script cannot be found.
- pipeliner.utils.get_pipeliner_root() Path
Get the directory of the main pipeliner module
- Returns:
The path of the pipeliner
- Return type:
Path
- pipeliner.utils.get_python_command() List[str]
Get the command to launch the current Python interpreter.
Note that the command is returned as a list and might include some arguments as well as the command itself.
- pipeliner.utils.get_regenerate_results_command() List[str]
Get the full command to run the regenerate_results.py script.
- pipeliner.utils.increment_file_basenames(files: List[str]) Dict[str, str]
Increment the base names of files if there are duplicates
e.g. Import/job001/myfile, Import/job002/myfile
- pipeliner.utils.launch_detached_process(command: List[str]) None
Run the given command as a detached process.
The process is started in a new session and with all file handles set to null, to ensure it keeps running in the background after the parent Python process exits.
- Parameters:
command (List[str]) – The commands to execute
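The detaching behaviour described above maps onto standard `subprocess.Popen` options. The following is a minimal sketch of that approach, not the pipeliner implementation; the function name is hypothetical, and `start_new_session` only takes effect on POSIX systems.

```python
import subprocess
from typing import List


def launch_detached_sketch(command: List[str]) -> None:
    """Start a command in its own session with all standard streams nulled."""
    subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: the child calls setsid()
    )
    # The Popen object is deliberately dropped; the child is never waited on.
```

Putting the child in a new session separates it from the parent's controlling terminal, so signals such as a terminal hang-up aimed at the parent do not reach it.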
- pipeliner.utils.make_pretty_header(text: str, char: str = '-=', top: bool = True, bottom: bool = True)
Make nice looking headers for on-screen display
- pipeliner.utils.print_nice_columns(list_in: List[str], err_msg: str = 'ERROR: No items in input list')
Takes a list of items and makes three columns for nicer on-screen display
- pipeliner.utils.run_subprocess(*args, **kwargs) CompletedProcess
- pipeliner.utils.smart_strip_quotes(in_string: str) str
Strip the quotes from a string in an intelligent manner
Remove leading and ending ‘ and “ but don’t remove them internally
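A simple way to remove only the outer quotes is `str.strip` with a set of quote characters. The sketch below illustrates the idea. The function name is hypothetical, and how the real function handles repeated or mismatched outer quotes is not stated above; this version removes them all.

```python
def strip_outer_quotes_sketch(in_string: str) -> str:
    """Remove single and double quotes from both ends, leaving inner ones."""
    # str.strip removes any run of the given characters at either end only.
    return in_string.strip("'\"")
```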
- pipeliner.utils.str_is_hex_colour(in_string, allow_0x: bool = False) bool
Test that a string is a hexadecimal colour code
Valid codes consist of a # symbol or ‘0x’ followed by exactly six hexadecimal digits (0-9 or a-f, lower or upper case).
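The rule above translates directly into a regular expression. This sketch is one possible implementation and not necessarily the one used by the pipeliner. The function name is hypothetical, and the `0x` prefix is assumed to be lower case only, as written above.

```python
import re


def is_hex_colour_sketch(in_string: str, allow_0x: bool = False) -> bool:
    """Check for '#' (or optionally '0x') followed by exactly six hex digits."""
    prefix = r"(?:#|0x)" if allow_0x else r"#"
    # fullmatch rejects strings with extra leading or trailing characters.
    return re.fullmatch(prefix + r"[0-9a-fA-F]{6}", in_string) is not None
```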
- pipeliner.utils.touch(filename: str)
Create an empty file
- Parameters:
filename (str) – The name for the file to create
- pipeliner.utils.truncate_number(number: float, maxlength: int) str
Return a number with no more than x decimal places but no trailing 0s
This is used to format numbers in exactly the same way that RELION does. For example, with maxlength 3: 1.2000 -> 1.2, 1.0 -> 1, 1.23 -> 1.23. RELION commands are happy to accept numbers with any number of decimal places or trailing 0s. This function is just to maintain continuity between RELION and pipeliner commands.
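The formatting described above can be approximated with fixed-point formatting followed by stripping trailing zeros. This is a sketch only: the function name is hypothetical, and it rounds to `maxlength` decimal places, whereas the real function may truncate instead.

```python
def truncate_number_sketch(number: float, maxlength: int) -> str:
    """Format with at most `maxlength` decimals and no trailing zeros."""
    text = f"{number:.{maxlength}f}"  # note: this rounds, it does not truncate
    if "." in text:
        # Strip zeros only after a decimal point, so "10" is left intact.
        text = text.rstrip("0").rstrip(".")
    return text
```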