Pipeliner Jobs

class pipeliner.pipeliner_job.ExternalProgram(command: str, name: str | None = None, vers_com: List[str] | None = None, vers_lines: List[int] | None = None)

Bases: object

Class to store info about external programs called by the pipeliner

command

The command that will be used to run the program

Type:

str

name

The name for the program, command will be used unless this is specified

Type:

str

exe_path

The path to the executable for the program

Type:

str

vers_com

The command that needs to be run to get the version

Type:

List[str]

vers_lines

The lines from the output of the version command that contain the version info

Type:

List[int]

get_version() str | None
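
As a rough illustration of how vers_com and vers_lines fit together, the sketch below runs a version command and keeps only the requested output lines. read_version is a hypothetical helper written for this example, not the pipeliner's implementation of get_version, and the Python interpreter is used as the "external program" only so that the example can be run anywhere.

```python
import subprocess
import sys
from typing import List, Optional


def read_version(vers_com: List[str], vers_lines: List[int]) -> Optional[str]:
    """Hypothetical helper: run vers_com and return the selected output lines."""
    try:
        proc = subprocess.run(vers_com, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        # The program is missing or the command could not be run
        return None
    # Some programs print their version to stderr rather than stdout
    out_lines = (proc.stdout or proc.stderr).splitlines()
    picked = [out_lines[i].strip() for i in vers_lines if 0 <= i < len(out_lines)]
    return "\n".join(picked) if picked else None


# The running Python interpreter stands in for an external program here
python_version = read_version([sys.executable, "--version"], [0])
missing_version = read_version(["no-such-program-for-this-example"], [0])
```

In this sketch a missing program simply yields None, which matches the str | None return type of get_version but is otherwise an assumption.
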
class pipeliner.pipeliner_job.JobInfo(display_name: str = 'Pipeliner job', version: str = '0.0', job_author: str | None = None, short_desc: str = 'No short description for this job', long_desc: str = 'No long description for this job', documentation: str = 'No online documentation available', external_programs: List[ExternalProgram] | None = None, references: List[Ref] | None = None)

Bases: object

Class for storing info about jobs.

This is used to generate documentation for the job within the pipeliner

display_name

A user-friendly name to describe the job in a GUI, this should not include the software used, because that info is pulled from the job type

Type:

str

version

The version number of the pipeliner job

Type:

str

job_author

Who wrote the pipeliner job

Type:

str

short_desc

A one line “title” for the job

Type:

str

long_desc

A detailed description of what the job does

Type:

str

documentation

A URL for online documentation

Type:

str

programs

A list of third-party software used by the job. The pipeliner uses these to determine whether the job can be run, so they need to include all executables the job might call. If any program on this list cannot be found with which, the job will be marked as unable to run.

Type:

List[~pipeliner.pipeliner_job.ExternalProgram]

references

A list of Ref objects used

Type:

list

force_unavailable

This can be set to True if other checks for the job to be available (besides programs missing from the $PATH) have failed, e.g. a necessary library is missing

Type:

bool

property is_available

Is the job available to run?

True if executables were found for all the job’s programs and force_unavailable has not been set; False otherwise.
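
The availability check can be sketched with the standard library. This is a minimal illustration only: job_is_available is a hypothetical function, and it assumes that force_unavailable always makes the job unavailable, as the description of that attribute suggests.

```python
import shutil
from typing import List


def job_is_available(commands: List[str], force_unavailable: bool = False) -> bool:
    """Hypothetical check mirroring the description of is_available."""
    if force_unavailable:
        # Some other check (e.g. a missing library) has already failed
        return False
    # shutil.which returns None when a command is not found on the $PATH
    return all(shutil.which(command) is not None for command in commands)


no_programs = job_is_available([])
missing = job_is_available(["no-such-program-for-this-example"])
forced = job_is_available([], force_unavailable=True)
```

A job with no external programs is treated as available here because all() of an empty sequence is True.
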

class pipeliner.pipeliner_job.PipelinerCommand(args: Sequence[str | float | int], relion_control: bool = False)

Bases: object

Holds a command that will be run by the pipeliner

com

The command that will be run; each list item is one argument

Type:

List[str]

relion_control

Does the command need the relion ‘--pipeline_control’ argument appended before being run?

Type:

bool

add_pipeline_control(outputdir: str) None
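
The sketch below shows one plausible way a command's arguments end up as a list of strings with the pipeline control argument appended. build_command is a hypothetical function, and the assumption that the flag is followed by the job's output directory is mine, not something stated above.

```python
from typing import List, Sequence, Union


def build_command(
    args: Sequence[Union[str, float, int]],
    relion_control: bool = False,
    outputdir: str = "",
) -> List[str]:
    """Hypothetical stand-in for the way a PipelinerCommand is finalised."""
    # Every argument becomes a string, one list item per argument
    com = [str(arg) for arg in args]
    if relion_control:
        # Assumption: the flag is followed by the job's output directory
        com += ["--pipeline_control", outputdir]
    return com


plain = build_command(["echo", "threshold", 0.5, 3])
relion = build_command(["relion_refine", "--j", 4], True, "Refine3D/job010/")
```
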
class pipeliner.pipeliner_job.PipelinerJob

Bases: object

Super-class for job objects.

Each job type has its own sub-class.

WARNING: do not instantiate this class directly, use the factory functions in this module.

jobinfo

Contains information about the job such as references

Type:

JobInfo

output_dir

The path of the output directory created by this job

Type:

str

alias

The alias for the job, if one has been assigned

Type:

str

is_continue

Whether this job is a continuation of an older job (True) or a new job (False)

Type:

bool

input_nodes

A list of Node objects for each file used as an input

Type:

list

output_nodes

A list of Node objects for files produced by the job

Type:

list

joboptions

A dict of JobOption objects specifying the parameters for the job

Type:

dict

is_tomo

Is the job a tomography job?

Type:

bool

working_dir

The working directory to be used when running the job. This should normally be left as None, meaning the job will be run in the project directory. Jobs that write files in their working directory should instead work somewhere within the job’s output directory, and take care to adjust the paths of input and output files accordingly.

Type:

str

raw_options

A dict of all raw joboptions as they were read in

Type:

dict

OUT_DIR = ''
PROCESS_NAME = ''
add_compatibility_joboptions() None

Write additional joboptions for backward compatibility

Some JobOptions are needed by the original program (e.g. Relion 4) but not by the pipeliner. They are added here so that the files the pipeliner writes remain backward compatible with the original program.

add_output_node(file_name: str, node_type: str, keywords: List[str] | None = None) None

Helper function to add a new Node for a file in the job’s output directory.

This is a wrapper around node_factory.create_node which simply adds self.output_dir to the start of the file name before creating the node and adding it to self.output_nodes.

Parameters:
  • file_name – The name of the file that the new node will refer to. It is assumed that the file will be written to the job’s output directory. Note that the existence of the file is not checked, because this method will usually be called before the job has run.

  • node_type – The top-level type for the new node. This should almost always be one of the constants defined in pipeliner.nodes.

  • keywords – A list of keywords to append to the node type.
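
A minimal sketch of the idea behind this helper: the job's output directory is prepended to the file name before the node is created and stored. SketchNode and make_output_node are hypothetical stand-ins for the real Node class and node_factory.create_node, and the directory, file name, node type and keyword are illustrative values only.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for a pipeliner Node."""

    name: str
    node_type: str
    keywords: List[str] = field(default_factory=list)


def make_output_node(
    output_dir: str,
    file_name: str,
    node_type: str,
    keywords: Optional[List[str]] = None,
) -> SketchNode:
    # The file is assumed to live in the job's output directory; it does not
    # need to exist yet because the job has usually not run at this point
    return SketchNode(os.path.join(output_dir, file_name), node_type, list(keywords or []))


output_nodes: List[SketchNode] = []
output_nodes.append(
    make_output_node("PostProcess/job012/", "postprocess.mrc", "DensityMap", ["postprocessed"])
)
```
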

additional_joboption_validation() List[JobOptionValidationResult]

Advanced validation of job parameters

This is a placeholder function for additional validation to be done by individual job subtypes, such as comparing JobOption values, e.g. JobOption A must be > JobOption B.

Avoid using self.get_string or self.get_number in this function as they may raise an error if the JobOption is required and has no value. Use self.joboptions["jobopname"].value instead.

Returns:

A list of JobOptionValidationResult objects

Return type:

list
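
The kind of cross-option check described here might look like the sketch below. SketchOption and SketchProblem are hypothetical stand-ins for JobOption and JobOptionValidationResult (whose real constructors are not shown in this document), and the option names are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchOption:
    """Hypothetical stand-in for a JobOption; only .value is used here."""

    value: str


@dataclass
class SketchProblem:
    """Hypothetical stand-in for a JobOptionValidationResult."""

    message: str
    options: List[str]


def check_a_greater_than_b(joboptions: Dict[str, SketchOption]) -> List[SketchProblem]:
    # Read .value directly rather than using a getter that may raise an error
    raw_a = joboptions["option_a"].value
    raw_b = joboptions["option_b"].value
    if raw_a == "" or raw_b == "":
        # Missing values are left for the standard validation to report
        return []
    if float(raw_a) <= float(raw_b):
        return [SketchProblem("option_a must be greater than option_b", ["option_a", "option_b"])]
    return []


ok = check_a_greater_than_b({"option_a": SketchOption("10"), "option_b": SketchOption("2")})
bad = check_a_greater_than_b({"option_a": SketchOption("1"), "option_b": SketchOption("2")})
```
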

check_joboption_is_now_deactivated(jo: str) bool

Check if a joboption has become deactivated in relation to others

For example, if job option A is False, job option B is now deactivated

Parameters:

jo (str) – The name of the JobOption to test

Returns:

Has the JobOption been deactivated

Return type:

bool

check_joboption_is_now_required(jo: str) list

Check if a joboption has become required in relation to others

For example if job option A is True, job option B is now required

Parameters:

jo (str) – The name of the joboption to test

Returns:

A list of pipeliner.job_options.JobOptionValidationResult objects, one for each error found

Return type:

list

create_input_nodes() None

Automatically add the job’s input nodes to its input node list.

Input nodes are created from each of the job’s job options.

create_output_nodes() None

Make the job’s output nodes.

This method should be overridden by PipelinerJob subclasses.

The output nodes should be added to the list in the output_nodes attribute. The add_output_node function is helpful to create and add a new node in a single call.

If your job doesn’t make any output nodes, or doesn’t know what their names will be until the job has been run, you still need to override this method but your implementation can simply pass and do nothing. If you need to add output nodes at the end of the job, create them in create_post_run_output_nodes.

Note that this method is called by the job manager (via PipelinerJob.prepare_to_run) before the job is added to the pipeline. The job’s output directory does exist when this method is called, but that could change in future versions of the pipeliner and jobs should avoid making any file system changes in this method.

create_post_run_output_nodes()

Placeholder function for post run node creation

Some jobs have output nodes that can only be created after the job has run because their names are not known until after they have been created. They can be added here. This function should ONLY add output nodes; any other work should be done in commands run by the job.

create_results_display() Sequence[ResultsDisplayObject]

Create results display objects to be displayed by the GUI

This default implementation simply creates the default results display object for each of the job’s output nodes. Subclasses that want customised results should override this method.

Returns:

A list of ResultsDisplayObject

gather_metadata() Dict[str, Any]

Placeholder function for metadata gathering

Each job class should define this individually

Returns:

A placeholder “No metadata available” and the reason why

Return type:

dict

get_additional_reference_info() List[Ref]

A placeholder function for jobs that need to return additional references

This is for references that are not included in self.jobinfo, such as ones pulled from the EMDB/PDB in fetch jobs

get_commands() List[PipelinerCommand]

Get the commands to be run for a specific job.

This method should be overridden by PipelinerJob subclasses.

Jobs are normally run with the project directory as the working directory. If your job needs to run in a different working directory (for example if it calls a program which always writes files into the current directory), set the self.working_dir attribute in this method.

Note that this method should run quickly! Any long-running actions should be done in one of the job’s commands instead. (If necessary, put Python code that needs to be run into a separate script in pipeliner.scripts.job_scripts and then call it as a command.)

Returns:

The commands as a list of PipelinerCommand objects

get_current_output_nodes() List[Node]

Get the current output nodes if the job was stopped prematurely

For most jobs there will not be any, but for jobs with many iterations the most recent iteration can be used if the job is aborted or failed and then later marked as successful

Returns:

A list of Node objects

Return type:

list

get_default_params_dict() Dict[str, str]

Get a dict with the job’s parameters and default values

Returns:

All the job’s parameters {parameter: default value}

Return type:

dict

get_extra_options() None

Get user specified extra queue submission options

get_final_commands() List[List[str]]

Assemble the commands to be run for a job.

This function is intended to be called by the job runner just before the commands are run. Any setup required before the job starts should be done in prepare_to_run().

Returns:

The commands, in a list of lists format. Each item in the main list is a single command composed of a list of strings (as used by subprocess.run, i.e. [com, arg1, arg2, …])
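
To make the list of lists format concrete, the sketch below builds two commands in that shape and runs each with subprocess.run. The commands themselves are placeholders that call the Python interpreter; they are not commands the pipeliner would generate.

```python
import subprocess
import sys
from typing import List

# Each inner list is one command: [com, arg1, arg2, ...]
final_commands: List[List[str]] = [
    [sys.executable, "-c", "print('first command')"],
    [sys.executable, "-c", "print('second command')"],
]

outputs = []
for command in final_commands:
    # check=True raises CalledProcessError if the command exits with an error
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    outputs.append(result.stdout.strip())
```
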

get_joboption_groups() Dict[str, List[str]]

Put the joboptions in groups according to their jobop_group attribute

Assumes that the joboptions have already been put in order of priority by self.set_joboption_order() or were in order to begin with.

Groups are ordered based on the highest priority joboption in that group from the order of the joboptions, except that “Main” is always the first group. Joboptions within the groups are ordered by priority.

Returns:

The joboptions groupings {group: [jobop, … jobop]}

Return type:

Dict[str, List[str]]
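
The ordering rules above can be sketched as follows. group_joboptions is a hypothetical function that takes (joboption name, group name) pairs already sorted by priority; the option and group names are invented for the example.

```python
from typing import Dict, List, Tuple


def group_joboptions(ordered_options: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (name, group) pairs that are already sorted by priority."""
    groups: Dict[str, List[str]] = {}
    for name, group in ordered_options:
        # dicts keep insertion order, so a group is placed where its
        # highest-priority joboption first appears
        groups.setdefault(group, []).append(name)
    if "Main" in groups:
        # "Main" always comes first, whatever the joboption order
        others = {g: names for g, names in groups.items() if g != "Main"}
        groups = {"Main": groups["Main"], **others}
    return groups


grouped = group_joboptions(
    [
        ("angpix", "Optics"),
        ("input_file", "Main"),
        ("voltage", "Optics"),
        ("do_mask", "Main"),
    ]
)
```

Here "Optics" would otherwise come first because its highest-priority joboption precedes any in "Main", but the special case moves "Main" to the front.
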

get_mpi_command() List[int | float | str]
get_nr_mpi() int
get_nr_threads() int
get_runtab_options(mpi: bool = False, threads: bool = False, addtl_args: bool = False, mpi_default_min: int = 1, mpi_must_be_odd: bool = False) None

Get the options found in the Run tab of the GUI, which are common to all jobtypes

Adds entries to the joboptions dict for queueing, MPI, threading, and additional arguments. This method should be used when initialising a PipelinerJob subclass

Parameters:
  • mpi (bool) – Should MPI options be included?

  • threads (bool) – Should multi-threading options be included?

  • addtl_args (bool) – Should an ‘additional arguments’ option be added?

  • mpi_default_min (int) – The minimum for the default number of MPIs; used if mpi_default_min is greater than the user-defined minimum number of MPIs

  • mpi_must_be_odd (bool) – Does the number of MPIs have to be odd, as for Relion refine jobs?

handle_doppio_uploads(dry_run=False) None

Tasks that have to be performed to deal with Doppio file uploads.

  • Move files from DoppioUploads to the job dir:

    DoppioUploads/tmpdir/file -> JobType/jobNNN/InputFiles/file

  • Update the job option values to point to the new file locations, so when the job input nodes are created they refer to the moved files

Parameters:

dry_run – If True, do not actually try to move any files, just update the job option values. This option is only intended for use in testing.
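
The path rewriting described above can be sketched without touching the file system, which is roughly what a dry run amounts to. moved_upload_path is a hypothetical helper, and the job directory, temporary directory and job option names are made-up values.

```python
import posixpath


def moved_upload_path(upload_path: str, job_dir: str) -> str:
    """Hypothetical helper: where an uploaded file would end up in the job dir."""
    file_name = posixpath.basename(upload_path)
    return posixpath.join(job_dir, "InputFiles", file_name)


# Dry run: only the job option values change, no files are moved
job_option_values = {"input_map": "DoppioUploads/tmp1234/map.mrc", "resolution": "3.2"}
updated = {
    key: moved_upload_path(value, "Import/job001")
    if value.startswith("DoppioUploads/")
    else value
    for key, value in job_option_values.items()
}
```
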

is_submit_to_queue() bool
load_results_display_files() Sequence[ResultsDisplayObject]

Load the job’s results display objects from files on disk.

This method must be fast because it is used by the GUI to load job results. Therefore, if a display object fails to load properly, no attempt is made to recalculate it and a ResultsDisplayPending object is returned instead.

If there are no results display files yet, an empty list is returned.

Returns:

A list of ResultsDisplayObject

make_additional_args() None

Get the additional arguments job option

make_queue_options() None

Get options related to queueing and queue submission, which are common to all jobtypes

parse_additional_args() List[str]

Parse the additional arguments job option and return a list

Returns:

A list ready to append to the command. Quoted strings are preserved as quoted strings; all others are split into individual items

Return type:

list
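
A similar kind of splitting can be demonstrated with the standard library's shlex module. This is only an approximation of the behaviour described: whether the pipeliner uses shlex is not stated here, and shlex.split removes the quote characters while keeping the quoted text together as one item. The program name and arguments are placeholders.

```python
import shlex

additional_args = '--flag --label "two words" --count 3'

# shlex.split keeps the quoted text together as one item (the quote
# characters themselves are removed) and splits everything else on whitespace
parsed = shlex.split(additional_args)

# The parsed items can then be appended to an existing command list
command = ["my_program", "--input", "file.star"] + parsed
```
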

prepare_clean_up_lists(do_harsh: bool = False) Tuple[List[str], List[str]]

Placeholder function for preparation of list of files to clean up

Each job class should define this individually

Parameters:

do_harsh (bool) – Should a harsh cleanup be performed

Returns:

Two empty lists ([files, to, delete], [dirs, to, delete])

Return type:

tuple

prepare_deposition_data(depo_type: str) Sequence[EmpiarRefinedParticles | EmpiarParticles | EmpiarCorrectedMics | EmpiarMovieSet | OneDepData]

Placeholder for function to return deposition data objects

The specific list returned should be defined by each jobtype

Parameters:

depo_type (str) – EMPIAR or ONEDEP

Returns:

The deposition object(s) returned by the specific job. These need to be of the types defined in pipeliner.deposition_tools.onedep_deposition and pipeliner.deposition_tools.empiar_deposition

Return type:

list

prepare_to_run(ignore_invalid_joboptions=False) None

Prepare the job to run.

This function is intended to be called by the pipeliner before the job file is saved to disk. It does several things, including:

  • Validate the job options

  • Make the job directory

  • Move uploaded Doppio user files into the job directory

Parameters:

ignore_invalid_joboptions (bool) – Prepare the job to run anyway even if the job options appear to be invalid

Raises:
  • ValueError – If the job options appear to be invalid and ignore_invalid_joboptions is not set

  • RuntimeError – If the job does not already have an output directory assigned

read(filename: str) None

Reads parameters from a run.job or job.star file

Parameters:

filename (str) – The file to read. Can be a run.job or job.star file

Raises:

ValueError – If the file is a job.star file and a job option from the PipelinerJob is missing from the input file

save_job_submission_script(commands: list) str

Writes a submission script for jobs submitted to a queue

Parameters:

commands (list) – The job’s commands. In a list of lists format

Returns:

The name of the submission script that was written

Return type:

str

Raises:
  • ValueError – If no submission script template was specified in the job’s joboptions

  • ValueError – If the submission script template is not found

  • RuntimeError – If the output script could not be written

save_results_display_files() Sequence[ResultsDisplayObject]

Create new results display objects and save them to disk.

This method removes any existing results display files first, and returns the new display objects after they have been created and saved.

Returns:

The newly-created results display objects.

set_joboption_order(new_order=typing.List[str]) None

Replace the joboptions dict with an ordered dict

Use this to set the order the joboptions will appear in the GUI. If a joboption is not specified in the list it will be appended to the end of the list.

Parameters:

new_order (list[str]) – A list of joboption keys, in the order they should appear

Raises:

ValueError – If a nonexistent joboption is specified
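
The reordering behaviour can be sketched with a plain dict, which preserves insertion order. reorder_options is a hypothetical function and the joboption keys are illustrative; the sketch assumes unlisted joboptions keep their original relative order when they are appended.

```python
from typing import Any, Dict, List


def reorder_options(joboptions: Dict[str, Any], new_order: List[str]) -> Dict[str, Any]:
    unknown = [key for key in new_order if key not in joboptions]
    if unknown:
        raise ValueError(f"Nonexistent joboption(s): {unknown}")
    # Listed keys come first, in the requested order
    ordered = {key: joboptions[key] for key in new_order}
    # Anything not listed is appended in its original order
    for key, value in joboptions.items():
        ordered.setdefault(key, value)
    return ordered


options = {"nr_mpi": 1, "input_file": "in.star", "do_queue": False}
reordered = reorder_options(options, ["input_file", "nr_mpi"])
```
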

set_option(line: str) None

Sets a value in the joboptions dict from a run.job file

Parameters:

line (str) – A line from a run.job file

Raises:
  • RuntimeError – If the line does not contain ‘==’

  • RuntimeError – If the value of the line does not match any of the joboptions keys
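
Parsing a single line of this kind can be sketched as below. set_option_from_line is a hypothetical function; the sketch assumes the text before ‘==’ identifies the joboption and the text after it is the new value, and the label used is only an example.

```python
from typing import Dict


def set_option_from_line(joboptions: Dict[str, str], line: str) -> None:
    if "==" not in line:
        raise RuntimeError(f"No '==' found in line: {line!r}")
    # Split on the first '==' only, in case the value itself contains one
    key, value = (part.strip() for part in line.split("==", 1))
    if key not in joboptions:
        raise RuntimeError(f"{key!r} does not match any joboption")
    joboptions[key] = value


options = {"Number of MPI procs:": "1"}
set_option_from_line(options, "Number of MPI procs: == 4")
```
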

validate_dynamically_required_joboptions() List[JobOptionValidationResult]

Check whether any joboptions have become required because of if_required

For example if job option A is True, job option B is now required

Returns:

A list of pipeliner.job_options.JobOptionValidationResult objects, one for each error found

Return type:

list

validate_input_files() List[JobOptionValidationResult]

Check that files specified as inputs actually exist

Returns:

A list of pipeliner.job_options.JobOptionValidationResult objects

Return type:

list

validate_joboptions() List[JobOptionValidationResult]

Make sure all the joboptions meet their validation criteria

Returns:

A list of JobOptionValidationResult objects

Return type:

list

write_jobstar(output_dir: str, output_fn: str = 'job.star', is_continue: bool = False)

Write a job.star file.

Parameters:
  • output_dir (str) – The output directory.

  • output_fn (str) – The name of the file to write. Defaults to job.star

  • is_continue (bool) – Is the file for a continuation of a previously run job? If so only the parameters that can be changed on continuation are written. Overrules is_continue attribute of the job

write_runjob(fn: str | None = None) None

Writes a run.job file

Parameters:

fn (str) – The name of the file to write. Defaults to the file the pipeliner uses for storing GUI parameters. A directory can also be entered and it will add on the file name ‘run.job’

class pipeliner.pipeliner_job.Ref(authors: str | List[str] | None = None, title: str = '', journal: str = '', year: str = '', volume: str = '', issue: str = '', pages: str = '', doi: str = '', **kwargs)

Bases: object

Class to hold metadata about a citation or reference, typically a journal article.

authors

The authors of the reference.

Type:

list

title

The reference’s title.

Type:

str

journal

The journal.

Type:

str

year

The year of publication.

Type:

str

volume

The volume number.

Type:

str

issue

The issue number.

Type:

str

pages

The page numbers.

Type:

str

doi

The reference’s Digital Object Identifier.

Type:

str

other_metadata

Other metadata as needed. Gathered from kwargs

Type:

dict
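
The way extra keyword arguments end up in other_metadata can be illustrated with a cut-down stand-in. SketchRef is hypothetical and shows only a few of the fields; wrapping a single author string in a list is an assumption based on the constructor accepting either form while the attribute is documented as a list.

```python
from typing import Any, Dict, List, Optional, Union


class SketchRef:
    """Hypothetical, cut-down stand-in for Ref (only some fields shown)."""

    def __init__(
        self,
        authors: Optional[Union[str, List[str]]] = None,
        title: str = "",
        doi: str = "",
        **kwargs: Any,
    ) -> None:
        # Assumption: a single author string is wrapped in a list
        if authors is None:
            self.authors: List[str] = []
        elif isinstance(authors, str):
            self.authors = [authors]
        else:
            self.authors = list(authors)
        self.title = title
        self.doi = doi
        # Any keyword argument that is not a named field is kept as metadata
        self.other_metadata: Dict[str, Any] = dict(kwargs)


ref = SketchRef(authors="A. Author", title="An example title", editor="E. Editor")
```
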

Display tools

Use these functions to create the ResultsDisplayObjects that Doppio, the pipeliner GUI, uses to create graphical outputs for each job.

pipeliner.display_tools.create_results_display_object(dobj_type: str, **kwargs) ResultsDisplayObject

Safely create a results display object

Returns a ResultsDisplayPending if there are any problems. Give it the type of display object as the first argument followed by the kwargs for that specific type of ResultsDisplayObject

Parameters:

dobj_type (str) – The type of DisplayObject to create
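
The "safe creation" pattern described here can be sketched as a factory that catches any error and returns a placeholder instead. All names in the sketch (SketchPending, SketchText, DISPLAY_TYPES, create_safely and the "text" type) are hypothetical stand-ins, not the pipeliner's own classes or type names.

```python
from typing import Any, Callable, Dict


class SketchPending:
    """Hypothetical stand-in for ResultsDisplayPending."""

    def __init__(self, reason: str) -> None:
        self.reason = reason


class SketchText:
    """Hypothetical display type that requires a non-empty title."""

    def __init__(self, *, title: str) -> None:
        if not title:
            raise ValueError("A title is required")
        self.title = title


DISPLAY_TYPES: Dict[str, Callable[..., Any]] = {"text": SketchText}


def create_safely(dobj_type: str, **kwargs: Any) -> Any:
    try:
        return DISPLAY_TYPES[dobj_type](**kwargs)
    except Exception as err:
        # Any problem yields a pending object instead of an exception
        return SketchPending(f"{type(err).__name__}: {err}")


good = create_safely("text", title="Summary")
bad_type = create_safely("no_such_type", title="Summary")
bad_args = create_safely("text", title="")
```

The point of the pattern is that a failure to build one display object does not stop the rest of a job's results from being shown.
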

pipeliner.display_tools.get_ordered_classes_arrays(model_file: str, ncols: int, boxsize: int, output_dir: str, output_filename: str, parts_file: str | None = None, title: str = '2D class averages', start_collapsed: bool = False, flag: str = '') ResultsDisplayMontage | ResultsDisplayPending

Return a 3D array of class averages from a Relion Class2D model file

Parameters:
  • model_file (str) – Name of the model file

  • ncols (int) – number of columns desired in the file montage

  • boxsize (int) – Size of the class averages in the final montage

  • output_dir (str) – The output dir of the pipeliner job creating this object

  • output_filename (str) – The name for the output montage file

  • parts_file (str) – Path of the file containing the particles, for counting

  • title (str) – A title for the DisplayObject

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

Returns:

An object for the GUI to use to render the montage

Return type:

ResultsDisplayMontage

pipeliner.display_tools.graph_from_starfile_cols(title: str, starfile: str, block: str, ycols: list, xcols: list | None = None, xrange: list | None = None, yrange: list | None = None, data_series_labels: List[str] | None = None, xlabel: str = '', ylabel: str = '', assoc_data: List[str] | None = None, modes: List[str] | None = None, start_collapsed: bool = False, flag: str = '') ResultsDisplayGraph | ResultsDisplayPending

Automatically generate a ResultsDisplayGraph object from a starfile

Can use one or two columns and third column for labels if desired

Parameters:
  • title (str) – The title of the final graph

  • starfile (str) – Path to the star file to use

  • block (str) – The block to use in the starfile, use None for a starfile with only a single block

  • ycols (list) – Column label(s) from the star file to use for the y data series

  • xcols (list) – Column label(s) from the star file to use for the x data series; if None, a simple count from 1 will be used

  • xlabel (str) – Label for the x axis; if no x data are specified the label will be ‘Count’, if x data are specified and the xlabel is None the x axis label will be the name of the starfile column used

  • xrange (list) – Range for x values to be displayed, full range if None

  • yrange (list) – Range for y values to be displayed, full range if None

  • data_series_labels (list) – Names for the data series

  • ylabel (str) – Label for the y axis, if None the y axis label will be the name of the starfile column used

  • assoc_data (list) – List of data file(s) associated with this graph

  • modes (list) – Controls the appearance of each data series, choose from ‘lines’, ‘markers’ or ‘lines+markers’

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

Returns:

A ResultsDisplayGraph object for the created graph

Return type:

ResultsDisplayGraph

pipeliner.display_tools.histogram_from_starfile_col(title: str, starfile: str, block: str, data_col: str, xlabel: str = '', ylabel: str = 'Count', assoc_data: List[str] | None = None, start_collapsed: bool = False, flag: str = '') ResultsDisplayHistogram | ResultsDisplayPending

Automatically generate a ResultsDisplayHistogram object from a starfile

Parameters:
  • title (str) – The title of the final graph

  • starfile (str) – Path to the star file to use

  • block (str) – The block to use in the starfile, use None for a starfile with only a single block

  • data_col (str) – Column label from the star file to use for the data series

  • xlabel (str) – Label for the x axis; if no x data are specified the label will be ‘Count’, if x data are specified and the xlabel is None the x axis label will be the name of the starfile column used

  • ylabel (str) – Label for the y axis, if None the y axis label will be the name of the starfile column used

  • assoc_data (list) – List of data file(s) associated with this graph

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

pipeliner.display_tools.make_map_model_thumb_and_display(outputdir: str, maps: List[str] | None = None, maps_opacity: List[float] | None = None, maps_colours: List[str] | None = None, models: List[str] | None = None, models_colours: List[str] | None = None, title: str | None = None, maps_data: str = '', models_data: str = '', assoc_data: List | None = None, start_collapsed: bool = True, flag: str = '') ResultsDisplayMapModel | ResultsDisplayPending

Make a display object for an atomic model overlaid over a map

Makes a binned map and a ResultsDisplayMapModel display object

Parameters:
  • outputdir (str) – Name of the job’s output directory

  • maps (list) – List of map files to use

  • models (list) – List of model files to use

  • maps_opacity (list) – List of opacities for the maps, from 0 to 1; if None, 0.5 is used for all

  • maps_colours (list) – Colors for the maps if specific ones are desired, otherwise mol* will assign them

  • title (str) – The title for the ResultsDisplayMapModel object, if None the name of the map and model will be used

  • maps_data (str) – Any additional data to be included about the map

  • models_data (str) – Any additional data to be included about the model

  • models_colours (list) – Colors for the models if specific ones are desired, otherwise mol* will assign them

  • assoc_data (list) – List of associated data, if left as None then just uses the file itself

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If the results are considered scientifically dubious explain in this string

Returns:

The DisplayObject for the map and model

Return type:

ResultsDisplayMapModel

pipeliner.display_tools.make_particle_coords_thumb(in_mrc, in_coords, out_dir, thumb_size=640, pad=5, start_collapsed=False, title: str = 'Example picked particles', flag: str = '', markers: bool = False) ResultsDisplayImage | ResultsDisplayPending

Create a thumbnail of picked particle coords on their micrograph

Because the extraction box size is not known, boxes will be drawn as a percentage of the total image size.

Parameters:
  • in_mrc (str) – Path to the merged micrograph mrc file

  • in_coords (str) – Path to the .star coordinates file

  • out_dir (str) – Name of the output directory

  • thumb_size (int) – Size of the x dimension of the final thumbnail image

  • pad (int) – Thickness of the particle box borders before binning in px

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • title (str) – What title to use for the displayobj created

  • flag (str) – If this display object contains scientifically dubious results, display this message

  • markers (bool) – Instead of making boxes make markers

pipeliner.display_tools.mini_montage_from_many_files(filelist: List[str], outputdir: str, nimg: int = 5, montagesize: int = 640, title: str = '', ncols: int = 5, associated_data: List[str] | None = None, labels: List[str] | None = None, cmap: str = '', start_collapsed: bool = False, flag: str = '') ResultsDisplayMontage | ResultsDisplayPending

Make a mini montage from a list of images

Merge and flatten image stacks

Parameters:
  • filelist (list) – A list of the files to use

  • outputdir (str) – The output dir of the pipeliner job

  • nimg (int) – Number of images to use in the montage

  • montagesize (int) – Desired size of the final montage image

  • title (str) – Title for the ResultsDisplay object that will be output

  • ncols (int) – Number of columns to make in the montage

  • associated_data (list) – Data files associated with these images; if None, all of the selected images are used

  • labels (list) – The labels for the items in the montage

  • cmap (str) – colormap to apply, if any

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

Returns:

The DisplayObject for the montage

Return type:

ResultsDisplayMontage

Raises:

ValueError – If a non mrc or tiff image is used

pipeliner.display_tools.mini_montage_from_stack(stack_file: str, outputdir: str, nimg: int = 40, ncols: int = 10, montagesize: int = 640, title: str = '', labels: List[int | str] | None = None, cmap: str = '', start_collapsed: bool = False, flag: str = '') ResultsDisplayMontage | ResultsDisplayPending

Make a montage from an mrcs or tiff file

Parameters:
  • stack_file (str) – The path to the stack_file

  • outputdir (str) – The output dir of the pipeliner job

  • nimg (int) – Number of images to use in the montage, if < 1 uses all of them

  • ncols (int) – Number of columns to use

  • montagesize (int) – Desired size of the final montage image

  • title (str) – Title for the ResultsDisplay object that will be output

  • labels (list) – Labels for the images

  • cmap (str) – colormap to apply, if any

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

Returns:

The DisplayObject for the montage

Return type:

ResultsDisplayMontage

Raises:

ValueError – If a non mrc or tiff image is used
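
How nimg and ncols might translate into a montage layout can be sketched with a little arithmetic. montage_grid is a hypothetical function; the row-filling layout and the rule that there are never more columns than images are assumptions, not a description of the pipeliner's montage code.

```python
import math
from typing import Tuple


def montage_grid(n_in_stack: int, nimg: int = 40, ncols: int = 10) -> Tuple[int, int, int]:
    """Return (images used, columns, rows) for a montage."""
    # A value below 1 means "use every image in the stack"
    n_used = n_in_stack if nimg < 1 else min(nimg, n_in_stack)
    # Assumption: never use more columns than there are images
    cols = max(1, min(ncols, n_used))
    rows = math.ceil(n_used / cols)
    return n_used, cols, rows


default_layout = montage_grid(100)
all_images = montage_grid(25, nimg=0)
small_stack = montage_grid(3)
```
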

pipeliner.display_tools.mini_montage_from_starfile(starfile: str, block: str, column: str, outputdir: str, title: str = '', nimg: int = 20, montagesize: int = 640, ncols: int = 10, labels: List[str] | None = None, cmap: str = '', start_collapsed: bool = False, flag: str = '') ResultsDisplayMontage | ResultsDisplayPending

Make a montage from a list of images in a starfile column

Merge and flatten image stacks if they are encountered.

Parameters:
  • starfile (str) – The starfile to use

  • block (str) – The name of the block with the images

  • column (str) – The name of the column that has the images

  • outputdir (str) – The output dir of the pipeliner job

  • title (str) – The title for the object, automatically generated if “”

  • nimg (int) – Number of images to use in the montage, uses all if < 1

  • montagesize (int) – Desired size of the final montage image

  • ncols (int) – number of columns to use

  • labels (list) – Labels for the images in the montage, in order

  • cmap (str) – colormap to apply, if any

  • start_collapsed (bool) – Should the display start out collapsed when displayed in the GUI

  • flag (str) – If this display object contains scientifically dubious results, display this message

Returns:

The DisplayObject for the montage

Return type:

ResultsDisplayMontage

Raises:

ValueError – If a non mrc or tiff image is encountered

ResultsDisplay Objects

These objects generally should not be instantiated directly; they should instead be created using the functions above.

class pipeliner.results_display_objects.ResultsDisplayGraph(*, xvalues: list, yvalues: list, title: str, associated_data: list, data_series_labels: list, xaxis_label: str = '', xrange: list | None = None, yaxis_label: str = '', yrange: list | None = None, modes: list | None = None, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

A simple graph for the GUI to display

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title of the object/graph

Type:

str

xvalues

List of x coordinate data series; can have multiple data series

Type:

list

xaxis_label

Label for the x axis if a graph

Type:

str

xrange

Range of x to be displayed, displays the full range if None. If the x axis needs to be reversed then enter the values backwards [max, min]

Type:

list

yvalues

List of y coordinate data series; can have multiple data series

Type:

list

yaxis_label

Label for the y axis if a graph

Type:

str

yrange

Range of y to be displayed, displays the full range if None. If the y axis needs to be reversed then enter the values backwards [max, min]

Type:

list

data_series_labels

List of names of the different data series

Type:

list

associated_data

A list of files that contributed the data used in the image/graph

Type:

list

modes

Controls the appearance of each data series, choose from ‘lines’, ‘markers’ or ‘lines+markers’

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool
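
As a rough illustration of how the attributes above fit together, the sketch below assembles keyword arguments for a two-series graph with a reversed x axis. It uses plain Python only and does not import pipeliner. The names graph_kwargs and check_series_lengths are hypothetical, the data are made up, and the length checks are an assumption about what a careful caller might verify, not documented pipeliner behaviour.

```python
from typing import Any, Dict, List


def check_series_lengths(kwargs: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the graph arguments (hypothetical check)."""
    problems = []
    n_series = len(kwargs["xvalues"])
    # Assumption: these lists carry one entry per data series
    for key in ("yvalues", "data_series_labels", "modes"):
        if len(kwargs[key]) != n_series:
            problems.append(f"{key} has {len(kwargs[key])} entries, expected {n_series}")
    # Each x series should pair up point-for-point with its y series
    for i, (xs, ys) in enumerate(zip(kwargs["xvalues"], kwargs["yvalues"])):
        if len(xs) != len(ys):
            problems.append(f"series {i}: {len(xs)} x values but {len(ys)} y values")
    return problems


# Made-up data: two series sharing the same x coordinates
graph_kwargs = {
    "title": "Example curves",
    "xvalues": [[10.0, 5.0, 3.3], [10.0, 5.0, 3.3]],
    "yvalues": [[1.0, 0.8, 0.2], [1.0, 0.7, 0.1]],
    "data_series_labels": ["series A", "series B"],
    "modes": ["lines", "lines+markers"],
    "xaxis_label": "x",
    "yaxis_label": "y",
    "xrange": [10.0, 3.3],  # [max, min] asks for a reversed x axis
    "associated_data": ["example_job/data.star"],
}

problems = check_series_lengths(graph_kwargs)
```

A dictionary like this could then be passed on as keyword arguments to whichever creation function is in use, once the lists are known to be consistent.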

class pipeliner.results_display_objects.ResultsDisplayHistogram(*, title: str, associated_data: list, data_to_bin: List[float] | None = None, xlabel: str = '', ylabel: str = '', bins: List[int] | None = None, bin_edges: List[float] | None = None, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

A class for the GUI to display a histogram

It is best to not instantiate this class directly. Instead, create it using create_results_display_object

Parameters:
  • title (str) – The title of the histogram

  • data_to_bin (list) – The data to bin

  • xlabel (str) – Label for the x axis

  • ylabel (str) – Label for the y axis

  • associated_data (list) – List of data files associated with the histogram

  • bins (list) – A list of bin counts, if they are known

  • bin_edges (list) – A list of the bin edges, if they are already known

  • start_collapsed (bool) – Should the object start out collapsed when displayed in the GUI

Raises:
  • ValueError – If no data or bins are specified

  • ValueError – If an attempt is made to specify bins or bin edges when data to bin are being provided

  • ValueError – If the associated data is not a list, or not provided
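
The three error conditions above amount to a small set of rules about which arguments may be combined. The sketch below restates them in plain Python. validate_histogram_args and histogram_error are hypothetical stand-ins written for illustration; the real class may order or word its checks differently.

```python
from typing import List, Optional


def validate_histogram_args(
    associated_data: object,
    data_to_bin: Optional[List[float]] = None,
    bins: Optional[List[int]] = None,
    bin_edges: Optional[List[float]] = None,
) -> None:
    """Hypothetical restatement of the checks listed under Raises."""
    if not isinstance(associated_data, list):
        raise ValueError("associated_data must be provided as a list")
    if data_to_bin is None and bins is None:
        raise ValueError("No data or bins were specified")
    if data_to_bin is not None and (bins is not None or bin_edges is not None):
        raise ValueError("Cannot specify bins or bin edges together with data to bin")


def histogram_error(**kwargs) -> Optional[str]:
    """Return the error message for a set of arguments, or None if they pass."""
    try:
        validate_histogram_args(**kwargs)
    except ValueError as err:
        return str(err)
    return None


# Raw data only: accepted
raw_ok = histogram_error(associated_data=["job/data.star"], data_to_bin=[1.0, 2.5, 2.7])
# Pre-computed counts and edges only: accepted
binned_ok = histogram_error(
    associated_data=["job/data.star"], bins=[1, 2], bin_edges=[1.0, 2.0, 3.0]
)
# Mixing raw data with pre-computed bins: rejected
mixed = histogram_error(associated_data=["job/data.star"], data_to_bin=[1.0], bins=[1])
```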

class pipeliner.results_display_objects.ResultsDisplayHtml(*, title: str, associated_data: list, html_dir: str = '', html_file: str = '', html_str: str = '', start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display html

It is best to not instantiate this class directly. Instead create it using create_results_display_object

This can be used for general HTML display in Doppio. Provide a directory containing index.html, specify an HTML file, or provide an HTML string as input.

html_dir

Path to the html directory (optional)

Type:

str

html_file

Path to a standalone html file, or to a file in the given html_dir (optional)

Type:

str

html_str

Input html as string (optional)

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayImage(*, title: str, image_path: str, image_desc: str, associated_data: list, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

A class for the GUI to display a single image

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title for the image

Type:

str

image_path

The path to the image

Type:

str

image_desc

A description of the image

Type:

str

associated_data

Data files associated with the image

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayJson(*, file_path: str, title: str = '', start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display JSON files

It is best to not instantiate this class directly. Instead create it using create_results_display_object

file_path

Path to the file

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayMapModel(title: str, associated_data: list, maps: list | None = None, models: list | None = None, maps_data: str = '', models_data: str = '', maps_opacity: list | None = None, maps_colours: list | None = None, models_colours: list | None = None, start_collapsed: bool = True, flag: str = '')

Bases: ResultsDisplayObject

An object for overlaying multiple maps and/or models

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title that appears at the top of the accordion in the GUI

Type:

str

associated_data

A list of associated data files

Type:

list

maps

List of map paths, mrc format

Type:

list

models

List of model paths, pdb or mmcif format

Type:

list

maps_opacity

Opacity for each map, from 0-1. If not specified, it is set to 0.5 for all maps

Type:

list

models_data

Any extra info about the models

Type:

str

maps_data

Any extra info about the maps

Type:

str

maps_colours

Hex values for colouring the maps specific colours, in the form “#XXXXXX” where X is a hex digit (0-9 or a-f). If None, the standard colours will be used

Type:

list

models_colours

Hex values for colouring the models specific colours, in the form “#XXXXXX” where X is a hex digit (0-9 or a-f). If None, the standard colours will be used

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

Raises:
  • ValueError – If no maps or models were specified

  • ValueError – If the map is not .mrc format

  • ValueError – If models are not in pdb or mmcif format

  • ValueError – If the number of maps and map opacities don’t match
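
The constraints above (file formats, matching opacity counts, the 0.5 default, and the “#XXXXXX” colour form) can be checked before any object is created. The following sketch is a hypothetical pre-flight check, not the class's own validation: map_model_problems is an invented name, the set of accepted model suffixes is an assumption, accepting upper-case hex digits is an assumption, and the colour check is an extra that the Raises list does not mention.

```python
import re
from pathlib import Path
from typing import List, Optional

HEX_COLOUR = re.compile(r"#[0-9a-fA-F]{6}")
MODEL_SUFFIXES = {".pdb", ".cif", ".mmcif"}  # assumption: accepted model suffixes


def map_model_problems(
    maps: Optional[List[str]] = None,
    models: Optional[List[str]] = None,
    maps_opacity: Optional[List[float]] = None,
    maps_colours: Optional[List[str]] = None,
) -> List[str]:
    """Collect human-readable problems instead of raising (hypothetical helper)."""
    maps = maps or []
    models = models or []
    problems = []
    if not maps and not models:
        problems.append("no maps or models were specified")
    problems += [f"{m} is not .mrc format" for m in maps if Path(m).suffix != ".mrc"]
    problems += [
        f"{m} is not pdb or mmcif format"
        for m in models
        if Path(m).suffix not in MODEL_SUFFIXES
    ]
    # Documented default: 0.5 for every map when no opacities are given
    opacity = maps_opacity if maps_opacity is not None else [0.5] * len(maps)
    if len(opacity) != len(maps):
        problems.append("number of maps and map opacities don't match")
    # Extra check (assumption): colours follow the "#XXXXXX" form
    for colour in maps_colours or []:
        if not HEX_COLOUR.fullmatch(colour):
            problems.append(f"{colour} is not of the form #XXXXXX")
    return problems


ok = map_model_problems(maps=["run_class001.mrc"], models=["fitted.pdb"], maps_colours=["#00ff7f"])
```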

class pipeliner.results_display_objects.ResultsDisplayMontage(*, xvalues: list, yvalues: list, img: str, title: str, associated_data: list, labels: list | None = None, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object to send to the GUI to make an image montage

This one is an image montage with info about the specific images

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title of the object/graph

Type:

str

xvalues

The x coordinates by image

Type:

list

yvalues

The y coordinates by image

Type:

list

labels

Data labels for the images

Type:

list

associated_data

A list of files that contributed the data used in the image/graph

Type:

list

img

Path to an image to display

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayObject(title: str, start_collapsed: bool = False, flag='')

Bases: object

Abstract super-class for results display objects

title

The title

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

dobj_type

Used to identify what kind of ResultsDisplayObject it is

Type:

str

flag

A message that is displayed if the results display object is showing something scientifically dubious.

Type:

str

write_displayobj_file(outdir) None

Write a json file from a ResultsDisplayObject object

Parameters:

outdir (str) – The directory to write the output in

Raises:

NotImplementedError – If a write attempt is made from the superclass
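
The pattern described here, a super-class whose write method refuses to run and subclasses that supply the real behaviour, can be sketched with stand-in classes. DisplayBase, TextDisplay and the file name text_display.json are all hypothetical; the JSON layout shown is simply the instance attributes and is not claimed to match what pipeliner writes.

```python
import json
import os
import tempfile


class DisplayBase:
    """Hypothetical stand-in for an abstract display super-class."""

    def __init__(self, title: str, start_collapsed: bool = False, flag: str = ""):
        self.title = title
        self.start_collapsed = start_collapsed
        self.flag = flag
        self.dobj_type = "base"

    def write_displayobj_file(self, outdir: str) -> None:
        # The super-class has nothing meaningful to write
        raise NotImplementedError("Write from a subclass, not the super-class")


class TextDisplay(DisplayBase):
    """Hypothetical subclass that knows how to serialise itself."""

    def __init__(self, title: str, display_data: str, **kwargs):
        super().__init__(title, **kwargs)
        self.display_data = display_data
        self.dobj_type = "text"

    def write_displayobj_file(self, outdir: str) -> None:
        # Dump the instance attributes as JSON (assumed layout)
        with open(os.path.join(outdir, "text_display.json"), "w") as fh:
            json.dump(vars(self), fh, indent=2)


with tempfile.TemporaryDirectory() as tmp:
    TextDisplay("Summary", "42 particles selected").write_displayobj_file(tmp)
    with open(os.path.join(tmp, "text_display.json")) as fh:
        written = json.load(fh)

try:
    DisplayBase("Abstract").write_displayobj_file(".")
    base_raises = False
except NotImplementedError:
    base_raises = True
```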

class pipeliner.results_display_objects.ResultsDisplayPdfFile(*, file_path: str, title: str = '', start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display pdf files

It is best to not instantiate this class directly. Instead create it using create_results_display_object

file_path

Path to the file

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayPending(*, title: str = 'Results pending...', message: str = 'The result not available yet', reason: str = 'unknown', start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

A placeholder class for when a job is not able to produce results yet

class pipeliner.results_display_objects.ResultsDisplayPlotlyHistogram(*, data: List[float] | DataFrame | ndarray | dict | None = None, title: str, x: str | list | None = None, y: str | list | None = None, color: str | int | list | None = None, nbins: int | None = None, range_x: list | None = None, range_y: list | None = None, category_orders: dict | None = None, labels: dict | None = None, bin_counts: List[float] | None = None, bin_centres: List[float] | None = None, associated_data: list, start_collapsed: bool = False, flag: str = '', **kwargs)

Bases: ResultsDisplayObject

A class that generates a plotly.graph_objects.Figure object to display a histogram

Uses plotly express histogram: https://plotly.com/python-api-reference/generated/plotly.express.histogram.html

Examples here: https://plotly.com/python/histograms/

data

The data to bin. The following types are allowed:

  • list – list of values to be binned

  • array and dict – converted to a pandas dataframe internally

  • pandas dataframe – ensure column names are added if ‘x’ indicates a column name

More details: https://plotly.com/python/px-arguments/

title

The title of the plot

Type:

str

plotlyfig

plotly.graph_objects.Figure object generated from input data

Type:

plotly.graph_objects.Figure

associated_data

A list of the associated data files

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayPlotlyObj(*, data: list | DataFrame | ndarray | dict, plot_type: list | str, title: str, associated_data: list, multi_series: bool = False, subplot: bool = False, make_subplot_args: dict | None = None, subplot_order: str | List[tuple] | None = None, subplot_size: Sequence[int] | None = None, subplot_args: List[dict] | None = None, series_args: List[dict] | None = None, layout_args: dict | None = None, trace_args: dict | None = None, xaxes_args: List[dict] | dict | None = None, yaxes_args: List[dict] | dict | None = None, start_collapsed: bool = False, flag: str = '', **kwargs)

Bases: ResultsDisplayObject

This uses the plotly express class to create a plotly.graph_objects.Figure object: https://plotly.com/python/plotly-express/

Use this class to generate plotly Figure objects for custom plots, including:

  • facet-plots: https://plotly.com/python/facet-plots/

  • subplots: https://plotly.com/python/subplots/

  • multi_series: e.g. https://plotly.com/python/creating-and-updating-figures/#adding-traces

data

The data to plot. For a single plot, the following types are allowed:

  • list – list of values to be binned

  • array and dict – converted to a pandas dataframe internally

  • pandas dataframe – ensure column names are added if ‘x’ indicates a column name

More details: https://plotly.com/python/px-arguments/

For subplots and/or multi_series:

  • list – list with a dictionary of arguments for each plot/series

plot_type

Required, type of plot. For a single plot, it is the plotly express function to call: https://plotly.com/python-api-reference/plotly.express.html

For subplots and/or multi_series, it is the plotly.graph_objects function to call: https://plotly.com/python-api-reference/plotly.graph_objects.html

Type:

str

title

The title of the plot

Type:

str

plotlyfig

plotly.graph_objects.Figure object generated from input data

Type:

plotly.graph_objects.Figure

associated_data

A list of the associated data files

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

check_multiseries_arguments(data, plot_type, series_args) None
check_plottype_list(plot_type, data) None
check_singleplot_arguments(plot_type) None
check_subplot_arguments(data, subplot_size, subplot_order, plot_type, subplot_args, xaxes_args, yaxes_args) None
generate_multiseries_plots(plot_type, plot_args) Figure
generate_subplots(subplot_size, plot_type, subplot_order, plot_args, make_subplot_args) Figure
set_multiplot_data(data) None
set_singleplot_data(data) None
class pipeliner.results_display_objects.ResultsDisplayPlotlyScatter(*, data: List[List[float]] | DataFrame | ndarray | dict | None = None, title: str, x: str | List[float] | None = None, y: str | List[float] | None = None, color: str | int | Sequence[str] | None = None, size: str | int | Sequence[str] | None = None, symbol: str | int | Sequence[str] | None = None, hover_name: str | int | Sequence[str] | None = None, range_color: list | None = None, range_x: list | None = None, range_y: list | None = None, category_orders: dict | None = None, labels: dict | None = None, associated_data: list, start_collapsed: bool = False, flag: str = '', **kwargs)

Bases: ResultsDisplayObject

A class that generates a plotly.graph_objects.Figure object to display a scatter plot

Uses plotly express scatter: https://plotly.com/python-api-reference/generated/plotly.express.scatter.html

Examples here: https://plotly.com/python/line-and-scatter/

data

The data to bin. The following types are allowed:

  • list – list of values to be binned

  • array and dict – converted to a pandas dataframe internally

  • pandas dataframe – ensure column names are added if ‘x’ indicates a column name

More details: https://plotly.com/python/px-arguments/

title

The title of the plot

Type:

str

plotlyfig

plotly.graph_objects.Figure object generated from input data

Type:

plotly.graph_objects.Figure

associated_data

A list of the associated data files

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayRvapi(*, title: str, rvapi_dir: str, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display rvapi objects

It is best to not instantiate this class directly. Instead create it using create_results_display_object

This can be used for general HTML display in Doppio. Create a directory with index.html and it will be shown in the results display tab

rvapi_dir

Path to the rvapi directory

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayTable(*, title: str, headers: list, table_data: list, associated_data: list, header_tooltips: list | None = None, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display a table

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title of the table

Type:

str

headers

The column headers for the table

Type:

list

table_data

A list of lists, one per row

Type:

list

associated_data

A list of the associated data files

Type:

list

header_tooltips

Tooltips for each column. Column header by default

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool
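
The relationship between headers, table_data and header_tooltips can be illustrated with a small stand-alone helper. prepare_table is a hypothetical name and the row-length check is an assumption about what makes a table well-formed; only the fallback of tooltips to the column headers comes from the description above.

```python
from typing import List, Optional


def prepare_table(
    headers: List[str],
    table_data: List[list],
    header_tooltips: Optional[List[str]] = None,
) -> List[str]:
    """Check the table shape and return the tooltips that would be used."""
    # Assumption: every row needs exactly one cell per column header
    for i, row in enumerate(table_data):
        if len(row) != len(headers):
            raise ValueError(f"row {i} has {len(row)} cells, expected {len(headers)}")
    # Documented default: the column header doubles as its tooltip
    tooltips = list(headers) if header_tooltips is None else header_tooltips
    if len(tooltips) != len(headers):
        raise ValueError("need one tooltip per column")
    return tooltips


# Made-up table: one inner list per row
headers = ["Class", "Particles"]
table_data = [[1, 5120], [2, 3877]]
tooltips = prepare_table(headers, table_data)
```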

class pipeliner.results_display_objects.ResultsDisplayText(*, title: str, display_data: str, associated_data: list, start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

A class to display general text in the GUI results tab

It is best to not instantiate this class directly. Instead create it using create_results_display_object

title

The title of the section

Type:

str

display_data

The text to display

Type:

str

associated_data

Data files associated with this result

Type:

list

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

class pipeliner.results_display_objects.ResultsDisplayTextFile(*, file_path: str, title: str = '', start_collapsed: bool = False, flag: str = '')

Bases: ResultsDisplayObject

An object for the GUI to display ASCII text files

It is best to not instantiate this class directly. Instead create it using create_results_display_object

This can be used for the default display of files that contain ASCII-encoded text but whose formats are too variable to justify a more complex ResultsDisplayFile

file_path

Path to the file

Type:

str

start_collapsed

Should the object start out collapsed when displayed in the GUI

Type:

bool

pipeliner.results_display_objects.get_next_resultsfile_name(dir: str, search_str: str) str

Get the name of the next results file

Takes into account existing files of this type in the output dir to prevent overwriting existing ones

Parameters:
  • dir (str) – The output directory

  • search_str (str) – The full name for the file with * in place of the number

Returns:

The name of the file

Return type:

str
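
One way to picture the behaviour described above is a helper that looks at the files already matching the pattern and fills the * with the next unused number. The sketch below is a hypothetical re-implementation for illustration only: next_results_name is an invented name, the three-digit zero padding and the file names are assumptions, and returning just the file name (without the directory) is also an assumption.

```python
import glob
import os
import re
import tempfile


def next_results_name(outdir: str, search_str: str) -> str:
    """Fill the '*' in search_str with the next unused number (hypothetical)."""
    prefix, suffix = search_str.split("*", 1)
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix))
    used = []
    # Collect the numbers of files that already exist in the output dir
    for path in glob.glob(os.path.join(glob.escape(outdir), search_str)):
        match = pattern.fullmatch(os.path.basename(path))
        if match:
            used.append(int(match.group(1)))
    next_number = max(used, default=0) + 1
    return f"{prefix}{next_number:03d}{suffix}"  # assumed zero padding


with tempfile.TemporaryDirectory() as outdir:
    # Nothing exists yet, so numbering starts at 1
    first_name = next_results_name(outdir, "display_*.json")
    for taken in ("display_001.json", "display_002.json"):
        open(os.path.join(outdir, taken), "w").close()
    # Two files exist, so the next free number is 3
    third_name = next_results_name(outdir, "display_*.json")
```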

Deposition Objects

DepositionObjects are returned by a PipelinerJob’s prepare_deposition_data function and are used to prepare automated depositions to the PDB, EMDB, and EMPIAR.

EMPIAR DepositionObjects

class pipeliner.deposition_tools.empiar_deposition_objects.EmpiarCorrectedMics(name: str = 'Corrected micrographs', directory: str | NoneType = None, category: str | NoneType = None, header_format: str | NoneType = None, data_format: str | NoneType = None, num_images_or_tilt_series: int | NoneType = None, frames_per_image: int | NoneType = None, voxel_type: str | NoneType = None, pixel_width: float | NoneType = None, pixel_height: float | NoneType = None, details: str | NoneType = None, image_width: int | NoneType = None, image_height: int | NoneType = None, micrographs_file_pattern: str | NoneType = None)

Bases: object

category: str | None = None
data_format: str | None = None
details: str | None = None
directory: str | None = None
frames_per_image: int | None = None
header_format: str | None = None
image_height: int | None = None
image_width: int | None = None
micrographs_file_pattern: str | None = None
name: str = 'Corrected micrographs'
num_images_or_tilt_series: int | None = None
pixel_height: float | None = None
pixel_width: float | None = None
voxel_type: str | None = None
class pipeliner.deposition_tools.empiar_deposition_objects.EmpiarData(name: str = '', directory: str | NoneType = None, category: str | NoneType = None, header_format: str | NoneType = None, data_format: str | NoneType = None, num_images_or_tilt_series: int | NoneType = None, frames_per_image: int | NoneType = None, voxel_type: str | NoneType = None, pixel_width: float | NoneType = None, pixel_height: float | NoneType = None, details: str | NoneType = None, image_width: int | NoneType = None, image_height: int | NoneType = None, micrographs_file_pattern: str | NoneType = None, picked_particles_file_pattern: str | NoneType = None)

Bases: object

category: str | None = None
data_format: str | None = None
details: str | None = None
directory: str | None = None
frames_per_image: int | None = None
header_format: str | None = None
image_height: int | None = None
image_width: int | None = None
micrographs_file_pattern: str | None = None
name: str = ''
num_images_or_tilt_series: int | None = None
picked_particles_file_pattern: str | None = None
pixel_height: float | None = None
pixel_width: float | None = None
voxel_type: str | None = None
class pipeliner.deposition_tools.empiar_deposition_objects.EmpiarMovieSet(name: str = 'Multiframe micrograph movies', directory: str | NoneType = None, category: str | NoneType = None, header_format: str | NoneType = None, data_format: str | NoneType = None, num_images_or_tilt_series: int | NoneType = None, frames_per_image: int | NoneType = None, voxel_type: str | NoneType = None, pixel_width: float | NoneType = None, pixel_height: float | NoneType = None, details: str | NoneType = None, image_width: int | NoneType = None, image_height: int | NoneType = None, micrographs_file_pattern: str | NoneType = None)

Bases: object

category: str | None = None
data_format: str | None = None
details: str | None = None
directory: str | None = None
frames_per_image: int | None = None
header_format: str | None = None
image_height: int | None = None
image_width: int | None = None
micrographs_file_pattern: str | None = None
name: str = 'Multiframe micrograph movies'
num_images_or_tilt_series: int | None = None
pixel_height: float | None = None
pixel_width: float | None = None
voxel_type: str | None = None
class pipeliner.deposition_tools.empiar_deposition_objects.EmpiarParticles(name: str = 'Particle images', directory: str | NoneType = None, category: str | NoneType = None, header_format: str | NoneType = None, data_format: str | NoneType = None, num_images_or_tilt_series: int | NoneType = None, frames_per_image: int | NoneType = None, voxel_type: str | NoneType = None, pixel_width: float | NoneType = None, pixel_height: float | NoneType = None, details: str | NoneType = None, image_width: int | NoneType = None, image_height: int | NoneType = None, micrographs_file_pattern: str | NoneType = None, picked_particles_file_pattern: str | NoneType = None)

Bases: object

category: str | None = None
data_format: str | None = None
details: str | None = None
directory: str | None = None
frames_per_image: int | None = None
header_format: str | None = None
image_height: int | None = None
image_width: int | None = None
micrographs_file_pattern: str | None = None
name: str = 'Particle images'
num_images_or_tilt_series: int | None = None
picked_particles_file_pattern: str | None = None
pixel_height: float | None = None
pixel_width: float | None = None
voxel_type: str | None = None
class pipeliner.deposition_tools.empiar_deposition_objects.EmpiarRefinedParticles(name: str = 'Per-particle motion corrected particle images', directory: str | NoneType = None, category: str | NoneType = None, header_format: str | NoneType = None, data_format: str | NoneType = None, num_images_or_tilt_series: int | NoneType = None, frames_per_image: int | NoneType = None, voxel_type: str | NoneType = None, pixel_width: float | NoneType = None, pixel_height: float | NoneType = None, details: str | NoneType = None, image_width: int | NoneType = None, image_height: int | NoneType = None, micrographs_file_pattern: str | NoneType = None, picked_particles_file_pattern: str | NoneType = None)

Bases: object

category: str | None = None
data_format: str | None = None
details: str | None = None
directory: str | None = None
frames_per_image: int | None = None
header_format: str | None = None
image_height: int | None = None
image_width: int | None = None
micrographs_file_pattern: str | None = None
name: str = 'Per-particle motion corrected particle images'
num_images_or_tilt_series: int | None = None
picked_particles_file_pattern: str | None = None
pixel_height: float | None = None
pixel_width: float | None = None
voxel_type: str | None = None
class pipeliner.deposition_tools.empiar_deposition_objects.Micrograph(file: str, ext: str, n_frames: int, dimx: int, dimy: int, dtype: str, headtype: str, apix: float, voltage: float, spherical_abberation: float)

Bases: object

apix: float
dimx: int
dimy: int
dtype: str
ext: str
file: str
headtype: str
n_frames: int
spherical_abberation: float
voltage: float
pipeliner.deposition_tools.empiar_deposition_objects.empiar_check(emp_dep_obj: Any, attribute: str, number: int)

Check that an attribute in empiar format is valid

Checks values in the form (“T<n>”, “”) for things like EMPIAR’s header_format value

Parameters:
  • emp_dep_obj (object) – The EMPIAR DepositionObject

  • attribute (str) – Name of the attribute to check

  • number (int) – Max value for the ‘T’ value; OT (OtherType) will always be added as well

Returns:

The error or None if there is no error

Return type:

str
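
The kind of check described here can be sketched as follows, under several assumptions: that the value is a two-item tuple of a type code and a details string, that codes run from T1 up to T<number>, and that OT is always permitted. check_t_code is a hypothetical helper and does not reproduce the signature or internals of empiar_check.

```python
from typing import Optional, Tuple


def check_t_code(value: Tuple[str, str], number: int) -> Optional[str]:
    """Return an error message, or None if the code is acceptable (hypothetical)."""
    # Assumption: codes are T1..T<number>, with OT (OtherType) always allowed
    allowed = [f"T{i}" for i in range(1, number + 1)] + ["OT"]
    code = value[0]
    if code not in allowed:
        return f"{code!r} is not one of {allowed}"
    return None


in_range = check_t_code(("T2", ""), number=3)
other_type = check_t_code(("OT", "custom format"), number=3)
too_high = check_t_code(("T4", ""), number=3)
```

Returning the message rather than raising mirrors the documented return value, which is the error or None.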

pipeliner.deposition_tools.empiar_deposition_objects.empiar_type_is_valid(empobj: Any)
pipeliner.deposition_tools.empiar_deposition_objects.get_imgfile_info(imgfile: str, blockname: str, img_block_col: str) Tuple[Dict[str, Tuple[float, float, float]], List[str]]

Get information from the starfile containing image info

Parameters:
  • imgfile (str) – The node object for the file

  • blockname (str) – Name of the images block in the starfile

  • img_block_col (str) – The name of the column for the images in the image data block of the starfile

Returns:

(Dict with info about the optics groups {og_number: (apix, voltage, spherical aberration)}, list of full paths (relative to the working dir) for all the images in the file, except in the case of movies, where the path is relative to the import dir)

Return type:

tuple

pipeliner.deposition_tools.empiar_deposition_objects.prepare_empiar_mics(in_file: str) List[EmpiarCorrectedMics]
pipeliner.deposition_tools.empiar_deposition_objects.prepare_empiar_mics_parts_data(mpfile: str, is_parts: bool, is_cor_parts: bool) List[EmpiarData]

Prepare the micrographs or particles portion of an EMPIAR deposition

Parameters:
  • mpfile (str) – The name of the file containing the micrographs or particles

  • is_parts (bool) – Is the image set particles? This will affect the info in the details

  • is_cor_parts (bool) – Is the image set corrected (polished particles)?

Returns:

A list of deposition objects

Return type:

list

pipeliner.deposition_tools.empiar_deposition_objects.prepare_empiar_parts(in_file: str, is_polished: bool = False) Sequence[EmpiarParticles | EmpiarRefinedParticles]
pipeliner.deposition_tools.empiar_deposition_objects.prepare_empiar_raw_mics(movfile: str) List[EmpiarMovieSet]

Prepare the raw micrographs portion of an EMPIAR deposition

Parameters:

movfile (str) – Movies star file to operate on

Returns:

A DepositionObject used to create a deposition

Return type:

List[EmpiarMovieSet]

PDB/EMDB DepositionObjects

PDB/EMDB DepositionObjects correspond to the schema here: http://ftp.ebi.ac.uk/pub/databases/emdb/doc/XML-schemas/emdb-schemas/v3/current_v3/doc/Untitled.html

class pipeliner.deposition_tools.onedep_deposition_objects.Em2dProjectionSelection(id: int | List[str | NoneType] | str | NoneType = None, entry_id: str = 'ENTRY_ID', details: str | NoneType = None, method: str | NoneType = None, num_particles: int | NoneType = None, software_name: str | NoneType = None, citation_id: int | List[str | NoneType] | str | NoneType = None)

Bases: OneDepData

citation_id: int | List[str | None] | str | None = None
details: str | None = None
entry_id: str = 'ENTRY_ID'
method: str | None = None
num_particles: int | None = None
software_name: str | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.Em3dReconstruction(id: int | List[str | NoneType] | str | NoneType = None, entry_id: str = 'ENTRY_ID', num_particles: int | NoneType = None, symmetry_type: str | NoneType = None, image_processing_id: int | List[str | NoneType] | str | NoneType = None, actual_pixel_size: float | NoneType = None, algorithm: str | NoneType = None, citation_id: int | List[str | NoneType] | str | NoneType = None, ctf_correction_method: str | NoneType = None, details: str | NoneType = None, euler_angles_details: str | NoneType = None, fsc_type: str | NoneType = None, magnification_calibration: str | NoneType = None, method: str | NoneType = None, nominal_pixel_size: float | NoneType = None, num_class_averages: int | NoneType = None, refinement_type: str | NoneType = None, resolution: float | NoneType = None, resolution_method: str | NoneType = None, software: str | NoneType = None)

Bases: OneDepData

actual_pixel_size: float | None = None
algorithm: str | None = None
citation_id: int | List[str | None] | str | None = None
ctf_correction_method: str | None = None
details: str | None = None
entry_id: str = 'ENTRY_ID'
euler_angles_details: str | None = None
fsc_type: str | None = None
image_processing_id: int | List[str | None] | str | None = None
magnification_calibration: str | None = None
method: str | None = None
nominal_pixel_size: float | None = None
num_class_averages: int | None = None
num_particles: int | None = None
refinement_type: str | None = None
resolution: float | None = None
resolution_method: str | None = None
software: str | None = None
symmetry_type: str | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmCtfCorrection(id: int | List[str | NoneType] | str | NoneType = None, image_processing_id: int | List[str | NoneType] | str | NoneType = None, type: Literal['PHASE FLIPPING AND AMPLITUDE CORRECTION'] = 'PHASE FLIPPING AND AMPLITUDE CORRECTION', details: str | NoneType = None)

Bases: OneDepData

details: str | None = None
image_processing_id: int | List[str | None] | str | None = None
type: Literal['PHASE FLIPPING AND AMPLITUDE CORRECTION'] = 'PHASE FLIPPING AND AMPLITUDE CORRECTION'
class pipeliner.deposition_tools.onedep_deposition_objects.EmImageProcessing(id: int | List[str | NoneType] | str | NoneType = None, image_recording_id: int | List[str | NoneType] | str | NoneType = None, details: str | NoneType = None)

Bases: OneDepData

details: str | None = None
image_recording_id: int | List[str | None] | str | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmImageRecording(id: int | List[str | NoneType] | str | NoneType = None, imaging_id: int | List[str | NoneType] | str | NoneType = None, avg_electron_dose_per_image: float | NoneType = None, average_exposure_time: float | NoneType = None, details: str | NoneType = None, detector_mode: str | NoneType = None, film_or_detector_model: str | NoneType = None, num_diffraction_images: int | NoneType = None, num_grids_imaged: int | NoneType = None, num_real_images: int | NoneType = None)

Bases: OneDepData

average_exposure_time: float | None = None
avg_electron_dose_per_image: float | None = None
details: str | None = None
detector_mode: str | None = None
film_or_detector_model: str | None = None
imaging_id: int | List[str | None] | str | None = None
num_diffraction_images: int | None = None
num_grids_imaged: int | None = None
num_real_images: int | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmImaging(id: int | List[str | NoneType] | str | NoneType = None, entry_id: str = 'ENTRY_ID', accelerating_voltage: int | NoneType = None, illumination_mode: str | NoneType = None, electron_source: str | NoneType = None, microscope_model: str | NoneType = None, imaging_mode: str | NoneType = None, sample_support_id: int | List[str | NoneType] | str | NoneType = None, specimen_holder_type: str | NoneType = None, specimen_holder_model: str | NoneType = None, details: str | NoneType = None, date: str | NoneType = None, mode: str | NoneType = None, nominal_cs: float | NoneType = None, nominal_defocus_min: float | NoneType = None, nominal_defocus_max: float | NoneType = None, tilt_angle_min: float | NoneType = None, tilt_angle_max: float | NoneType = None, nominal_magnification: float | NoneType = None, calibrated_magnification: float | NoneType = None, energy_filter: str | NoneType = None, energy_window: str | NoneType = None, temperature: float | NoneType = None, detector_distance: float | NoneType = None, recording_temperature_minimum: float | NoneType = None, recording_temperature_maximum: float | NoneType = None)

Bases: OneDepData

accelerating_voltage: int | None = None
calibrated_magnification: float | None = None
date: str | None = None
details: str | None = None
detector_distance: float | None = None
electron_source: str | None = None
energy_filter: str | None = None
energy_window: str | None = None
entry_id: str = 'ENTRY_ID'
illumination_mode: str | None = None
imaging_mode: str | None = None
microscope_model: str | None = None
mode: str | None = None
nominal_cs: float | None = None
nominal_defocus_max: float | None = None
nominal_defocus_min: float | None = None
nominal_magnification: float | None = None
recording_temperature_maximum: float | None = None
recording_temperature_minimum: float | None = None
sample_support_id: int | List[str | None] | str | None = None
specimen_holder_model: str | None = None
specimen_holder_type: str | None = None
temperature: float | None = None
tilt_angle_max: float | None = None
tilt_angle_min: float | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmSampleSupport(id: int | List[str | NoneType] | str | NoneType = None, entry_id: str = 'ENTRY_ID', film_material: str | NoneType = None, grid_type: str | NoneType = None, grid_material: str | NoneType = None, grid_mesh_size: str | NoneType = None, details: str | NoneType = None)

Bases: OneDepData

details: str | None = None
entry_id: str = 'ENTRY_ID'
film_material: str | None = None
grid_material: str | None = None
grid_mesh_size: str | None = None
grid_type: str | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmSoftware(id: int | List[str | NoneType] | str | NoneType = None, category: Literal['CLASSIFICATION', 'CTF CORRECTION', 'DIFFRACTION INDEXING', 'FINAL EULER ASSIGNMENT', 'IMAGE ACQUISITION', 'INITIAL EULER ASSIGNMENT', 'LAYERLINE INDEXING', 'MASKING', 'MODEL FITTING', 'MODEL REFINEMENT', 'OTHER', 'PARTICLE SELECTION', 'RECONSTRUCTION'] | NoneType = None, details: str | NoneType = None, name: str | NoneType = None, version: str | NoneType = None, image_processing_id: int | List[str | NoneType] | str | NoneType = None, fitting_id: int | List[str | NoneType] | str | NoneType = None, imaging_id: int | List[str | NoneType] | str | NoneType = None)

Bases: OneDepData

category: Literal['CLASSIFICATION', 'CTF CORRECTION', 'DIFFRACTION INDEXING', 'FINAL EULER ASSIGNMENT', 'IMAGE ACQUISITION', 'INITIAL EULER ASSIGNMENT', 'LAYERLINE INDEXING', 'MASKING', 'MODEL FITTING', 'MODEL REFINEMENT', 'OTHER', 'PARTICLE SELECTION', 'RECONSTRUCTION'] | None = None
details: str | None = None
fitting_id: int | List[str | None] | str | None = None
image_processing_id: int | List[str | None] | str | None = None
imaging_id: int | List[str | None] | str | None = None
name: str | None = None
version: str | None = None
class pipeliner.deposition_tools.onedep_deposition_objects.EmSpecimen(id: int | List[str | NoneType] | str | NoneType = None, experiment_id: int = 1, concentration: str | NoneType = None, details: str | NoneType = None, embedding_applied: bool = False, shadowing_applied: bool = False, staining_applied: bool = False, vitrification_applied: bool = True)

Bases: OneDepData

concentration: str | None = None
details: str | None = None
embedding_applied: bool = False
experiment_id: int = 1
shadowing_applied: bool = False
staining_applied: bool = False
vitrification_applied: bool = True
class pipeliner.deposition_tools.onedep_deposition_objects.OneDepData(id: int | List[str | NoneType] | str | NoneType = None)

Bases: object

id: int | List[str | None] | str | None = None
pipeliner.deposition_tools.onedep_deposition_objects.ddc

alias of Em3dReconstruction

Deposition tools

These functions combine the information from the DepositionObjects returned by a PipelinerJob into a format for automated deposition.

class pipeliner.deposition_tools.onedep_deposition.DepositionData(field_name: str, data_items: object, merge_strat: str, reverse: bool = False)

Bases: object

An object that holds data to be deposited via the onedep system

field_name

The name of the field in the PDB/EMDB. These should be drawn from https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Groups/index.html

Type:

str

data_items

A dataclass from pipeliner.deposition_tools.onedep_deposition_objects containing the data

Type:

object

dc_type

The subclass of the dataclass used for data_items

Type:

DepositionData

cif_type

(“loop” or “pairs”) How the data will be written into the final cif file.

Type:

str

merge_strategy

(“multiple”, “combine”, or “overwrite”) How multiple DepositionData objects will be combined. Multiple: each DepositionData object is given its own entry in a loop in the cif file. Overwrite: only the first or last object (depending on self.reverse) is used. Combine: the objects are combined, with each field overwritten by subsequent data for that field if it is not None.

Type:

str

reverse

The order in which the objects are considered when they are combined. False: the latest items take precedence. True: the earliest items take precedence.

Type:

bool
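The interaction between merge_strategy and reverse can be illustrated with a minimal sketch. It is not the pipeliner implementation: SampleRecord and merge_records are hypothetical names, and the sketch assumes the records arrive ordered from earliest to latest.

```python
from dataclasses import dataclass, fields, replace
from typing import List, Optional


@dataclass
class SampleRecord:
    """Hypothetical stand-in for a deposition dataclass."""

    name: Optional[str] = None
    version: Optional[str] = None


def merge_records(
    records: List[SampleRecord], strategy: str, reverse: bool = False
) -> List[SampleRecord]:
    """Merge records (assumed ordered earliest to latest) by strategy."""
    if not records:
        return []
    if strategy == "multiple":
        # Every record keeps its own entry
        return list(records)
    # Order the records so the one that takes precedence comes last:
    # reverse=False favours the latest, reverse=True favours the earliest
    ordered = list(reversed(records)) if reverse else list(records)
    if strategy == "overwrite":
        return [ordered[-1]]
    if strategy == "combine":
        merged = replace(ordered[0])  # shallow copy of the first record
        for rec in ordered[1:]:
            for f in fields(rec):
                value = getattr(rec, f.name)
                if value is not None:
                    # Later non-None values overwrite earlier ones
                    setattr(merged, f.name, value)
        return [merged]
    raise ValueError(f"Unknown merge strategy: {strategy}")


records = [
    SampleRecord(name="prog_a"),
    SampleRecord(version="4.0"),
    SampleRecord(name="prog_b"),
]
combined_latest = merge_records(records, "combine")
combined_earliest = merge_records(records, "combine", reverse=True)
```

With reverse left as False the combined record takes its name from the last record, while reverse=True lets the first record's name win. In both cases the version is filled in from the only record that sets it.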

add_to_cif(block: Block, did: str)

Add a DepositionData object to a deposition cif file

Parameters:
  • block (gemmi.cif.Block) – The block of the cif.Document that is being written

  • did (str) – The id of the deposition that is being written

Raises:

ValueError – If the cif type is not in (“loop”, “pairs”)

static cifformat(val: int | str | float | bool | None, depoid: str)

Format data for writing to cif files in gemmi

All data must be strings. Anything containing spaces is single quoted. None is replaced with “?”.

Parameters:
  • val (Any) – The data value

  • depoid (str) – The id of the deposition, which is substituted for val if val is the placeholder “ENTRY_ID”

Returns:

The updated value

Return type:

str
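The formatting rules described for cifformat can be restated as a short sketch. The function name cif_format_sketch is hypothetical and this is not the pipeliner code. In particular, passing every other value through str() (including bools) and leaving embedded quote characters unescaped are assumptions made for illustration.

```python
from typing import Union

CifValue = Union[int, str, float, bool, None]


def cif_format_sketch(val: CifValue, depoid: str) -> str:
    """Hypothetical restatement of the documented formatting rules."""
    if val is None:
        # Missing values are written as a question mark
        return "?"
    if val == "ENTRY_ID":
        # The placeholder is swapped for the actual deposition id
        return depoid
    text = str(val)  # assumption: all other values are simply stringified
    if " " in text:
        # Values containing spaces are wrapped in single quotes
        return f"'{text}'"
    return text


missing = cif_format_sketch(None, "D_0000")
entry = cif_format_sketch("ENTRY_ID", "D_0000")
quoted = cif_format_sketch("holey carbon", "D_0000")
number = cif_format_sketch(300.0, "D_0000")
```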

update_job_references(jobs_dict: Dict[str, list])

Update the job references in a DepositionData object

If an object needs to refer to a DepositionData object from another job, it uses the placeholder “JOBREF:<jobname>”. This method replaces the placeholder with the uuid of the appropriate DepositionData object.
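The placeholder substitution can be sketched as follows. This is an illustration only: resolve_job_ref and the uid_by_job mapping are hypothetical, and the real jobs_dict maps job names to lists whose contents are not described here. The job name and UID in the example are made up.

```python
from typing import Dict, Optional

JOBREF_PREFIX = "JOBREF:"


def resolve_job_ref(value: Optional[str], uid_by_job: Dict[str, str]) -> Optional[str]:
    """Replace a 'JOBREF:<jobname>' placeholder with that job's UID.

    Values that are not placeholders are returned unchanged. An unknown
    job name raises KeyError in this sketch.
    """
    if not isinstance(value, str) or not value.startswith(JOBREF_PREFIX):
        return value
    # Tolerate whitespace after the prefix, e.g. 'JOBREF: jobname'
    jobname = value[len(JOBREF_PREFIX):].strip()
    return uid_by_job[jobname]


# Hypothetical job name and UID, for illustration only
uid_by_job = {"Refine3D/job010/": "uid-0001"}
resolved = resolve_job_ref("JOBREF:Refine3D/job010/", uid_by_job)
untouched = resolve_job_ref("plain value", uid_by_job)
```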

write_json(output_dir: str)

Write a json file from DepositionData object

Writes a file called deposition_<field_name>.json

Parameters:

output_dir (str) – The dir to write the file in.

class pipeliner.deposition_tools.onedep_deposition.OneDepDeposition(terminal_job)

Bases: object

Object for making a onedep deposition

Broken down in submethods for testing

raw_depobjs

A list of lists, each containing deposition dataclasses of the same type

Type:

list

merged_depobjs

The raw_depobjs list, with each subgroup merged according to the merge strategy of that type of deposition object

Type:

list

final_depobjs

The merged depobjs with their UIDs and job references updated to their actual values

Type:

list

int_ids

The integer ids that correspond to the UIDs assigned to each of the raw deposition objects

Type:

dict

make_int_ids()

Make a dictionary {UID: integer ID}

merge_depobjs()

Merge deposition objects according to their merge strategy

prepare_deposition()

Do all the steps to get ready for a deposition

update_jobrefs()

Update job references from ‘JOBREF: jobname’ to the UID for that job

update_uids_to_int_ids()

Update all UIDs in deposition objects to the integer ids
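The {UID: integer ID} mapping and its use can be sketched roughly as below. The function names are hypothetical, and numbering from 1 in order of first appearance is an assumption rather than documented pipeliner behaviour.

```python
from typing import Dict, List


def make_int_ids_sketch(uids: List[str], start: int = 1) -> Dict[str, int]:
    """Assign consecutive integers to UIDs in order of first appearance."""
    int_ids: Dict[str, int] = {}
    for uid in uids:
        if uid not in int_ids:
            int_ids[uid] = start + len(int_ids)
    return int_ids


def swap_uid(value, int_ids: Dict[str, int]):
    """Replace a UID (or each UID in a list) with its integer id.

    Values that are not known UIDs are returned unchanged.
    """
    if isinstance(value, list):
        return [swap_uid(item, int_ids) for item in value]
    if isinstance(value, str):
        return int_ids.get(value, value)
    return value


# Made-up UIDs for illustration
int_ids = make_int_ids_sketch(["uid-a", "uid-b", "uid-a"])
swapped = swap_uid(["uid-b", None, "uid-a"], int_ids)
```

Handling lists separately reflects the id fields in the deposition dataclasses, which are typed to accept either a single value or a list of values.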

write_deposition_cif_file(depo_id: str | None)

Write a cif file for deposition through the onedep system

Parameters:

depo_id – The id of the deposition, assigned by EMDB/PDB

Raises:

ValueError – If the deposition has not been prepared to write yet

pipeliner.deposition_tools.onedep_deposition.gather_onedep_jobs_and_depobjects(terminal_job: str) Tuple[List[List[DepositionData]], Dict[str, List[DepositionData]]]

Prepare a onedep deposition

Parameters:

terminal_job (str) – The job to use

Returns:

The gathered DepositionData objects. Each sublist contains objects of one type.

Return type:

List[List[DepositionData]]

Raises:

ValueError – If the terminal job is not found.

pipeliner.deposition_tools.onedep_deposition.make_deposition_data_object(data_obj, uid: int | List[str | None] | str | None = '') DepositionData

Makes a new DepositionData object of the desired type

Parameters:
  • data_obj (object) – The data class for the object with the data in it

  • uid (str) – A uuid4 for the object. If not specified, one will be created; if None, no UID is needed

Returns:

The created DepositionData object

Return type:

DepositionData

Raises:

ValueError – If the data dict is not appropriate for the selected dataclass

pipeliner.deposition_tools.onedep_deposition.merge_depdata_items(new_di: DepositionData, old_di: List[DepositionData]) List[DepositionData]

Merge together DepositionData objects, using their merge strategy

Parameters:
  • new_di (DepositionData) – The object to be merged in

  • old_di (List[DepositionData]) – Object(s) to be merged into. For DepositionData objects with the ‘multiple’ merge strategy this will be multiple objects; for all others it will be a single object in the list

Returns:

The merged object(s): a single item for the “combine” and “overwrite” merge strategies and multiple items for “multiple”

Return type:

List[DepositionData]

Raises:

ValueError – If the objects have different merge strategies