Star File Utilities

class pipeliner.starfile_handler.BodyFile(fn_in: str)

Bases: StarFile

A STAR file that lists the bodies in a multibody refinement

bodycount

Number of bodies in the file

Type:

int

class pipeliner.starfile_handler.JobStar(fn_in: str)

Bases: StarFile

A class for STAR files that define the parameters of a pipeliner job

jobtype

The job type, converted from the RELION nomenclature if necessary

Type:

str

joboptions

The joboptions as {name: value}; all values are strings regardless of joboption type

Type:

dict

is_continue

Whether the job is a continuation

Type:

bool

is_tomo

Whether the job is a tomography job (might be removed when RELION compatibility is deprecated)

Type:

bool

all_options_as_dict() → Dict[str, str]

Returns a dict of all the parameters of a job.star file

The dict contains both the job options and the running options. All values in the dict are strings.

make_substitutions(conversions: Dict[str, str])

Substitute one job name or file name for another wherever it appears

Parameters:

conversions (Dict[str, str]) – {“old job name”: “new job name”}
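
The substitution behaviour described above can be pictured with a small standalone sketch. It is not the pipeliner implementation: `substitute_names` is a hypothetical helper that works on a plain dict of option values, and it assumes a simple substring replacement applied to every value.

```python
from typing import Dict


def substitute_names(options: Dict[str, str], conversions: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of options with every old name replaced by its new name.

    Hypothetical helper for illustration; assumes plain substring replacement.
    """
    updated = {}
    for key, value in options.items():
        for old_name, new_name in conversions.items():
            value = value.replace(old_name, new_name)
        updated[key] = value
    return updated


# Illustrative option names and paths only
options = {"fn_img": "Extract/job007/particles.star", "nr_iter": "25"}
result = substitute_names(options, {"Extract/job007/": "Extract/job012/"})
print(result["fn_img"])
```

Values that do not contain an old name are copied through unchanged, and the input dict is left untouched.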

modify(params_to_change: dict, allow_add=False)

Change multiple values in a job.star from a dict

Parameters:
  • params_to_change (dict) – The parameters to change in the template, in the format {param_name: new_value}

  • allow_add (bool) – Whether new joboptions can be added

Raises:

ValueError – If an attempt is made to add new fields with allow_add=False
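
The allow_add rule can be sketched on a plain dict. This is an assumption-laden illustration rather than the library code: `modify_options` is a hypothetical function, and checking for new keys before changing anything (so a rejected call leaves the options untouched) is a design choice of this sketch.

```python
from typing import Dict


def modify_options(options: Dict[str, str], params_to_change: dict, allow_add: bool = False) -> None:
    """Update options in place, refusing unknown keys unless allow_add is set.

    Hypothetical helper for illustration only.
    """
    new_keys = [name for name in params_to_change if name not in options]
    if new_keys and not allow_add:
        raise ValueError(f"Cannot add new fields with allow_add=False: {new_keys}")
    for name, value in params_to_change.items():
        # Values are stored as strings, matching the joboptions dict
        options[name] = str(value)


job_options = {"nr_iter": "25"}
modify_options(job_options, {"nr_iter": 30})
modify_options(job_options, {"do_extra": "Yes"}, allow_add=True)
print(job_options)
```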

write(outname: str | PathLike[str] = '')

Write a STAR file from the JobStar

Parameters:

outname (str or Path) – Name of the output file; the original file is overwritten if none is provided

class pipeliner.starfile_handler.StarFile(fn_in: str | PathLike[str])

Bases: object

A superclass for all types of STAR files used by the pipeliner

file_name

The STAR file

Type:

str or Path

column_as_list(blockname: str | None, colname: str) → List[str]

Return a single column from a block as a list

Parameters:
  • blockname (str) – The name of the block to use

  • colname (str) – The name of the column to get

Returns:

The values from that column

Return type:

list

column_headers(block: str = '') → List[str]

Get all the column headers for a specific block.

count_block(blockname: str | None = None) → int

Count the number of items in a block that only contains a single loop

This is the format of most RELION data STAR files

Parameters:

blockname (str) – The name of the block to count

Returns:

The count

Return type:

int

get_block(blockname: str | None = None) → Block

Get a block from the STAR file

Parameters:

blockname (str) – The name of the block to get. Use None if the file has a single unnamed block

Returns:

The desired block or None if not found

Return type:

gemmi.cif.Block or None

get_block_names() → List[str]

loop_as_list(block: str = '', columns: List[str] | None = None, headers: bool = False) → List[List[str]]

Returns a set of columns from a STAR file loop as a list

Parameters:
  • block (str) – The name of the block to get the data from

  • columns (list) – Names of the columns to get, if None then all columns are returned

  • headers (bool) – Whether the first row of the returned list should be the header names

Returns:

The column data, one row per sublist. If headers=True the first sublist contains the headers

Return type:

List[List[str]]

Raises:
  • ValueError – If the specified block is not found

  • ValueError – If the specified block does not contain a loop

  • ValueError – If any of the specified columns are not found
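
The shape of the returned data and the column check can be illustrated without a STAR parser. The sketch below is not how pipeliner reads files: `select_columns` is a hypothetical function operating on an in-memory table, and the column names are only example RELION-style labels.

```python
from typing import List, Optional


def select_columns(
    all_headers: List[str],
    rows: List[List[str]],
    columns: Optional[List[str]] = None,
    headers: bool = False,
) -> List[List[str]]:
    """Pick columns from a row-major table, optionally prepending a header row.

    Hypothetical helper for illustration only.
    """
    wanted = list(all_headers) if columns is None else list(columns)
    missing = [name for name in wanted if name not in all_headers]
    if missing:
        raise ValueError(f"Columns not found: {missing}")
    indices = [all_headers.index(name) for name in wanted]
    selected = [[row[i] for i in indices] for row in rows]
    return [wanted] + selected if headers else selected


table_headers = ["_rlnMicrographName", "_rlnDefocusU", "_rlnDefocusV"]
table_rows = [
    ["mic001.mrc", "10500.0", "10300.0"],
    ["mic002.mrc", "11200.0", "11050.0"],
]
subset = select_columns(
    table_headers, table_rows, ["_rlnMicrographName", "_rlnDefocusU"], headers=True
)
print(subset)
```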

pairs_as_dict(block: str = '') → Dict[str, str]

Returns paired values from a STAR file as a dict

Parameters:

block (str) – The name of the block to get the data from

Returns:

{parameter: value}

Return type:

dict

Raises:
  • ValueError – If the specified block is not found

  • ValueError – If the specified block is a loop rather than a set of pair-values

pipeliner.starfile_handler.convert_old_relion_pipeline(fn_in: str)

Convert a pipeline STAR file from RELION 3 or 4 to ccpem-pipeliner format.

The file is also checked and corrected if it contains any reserved words which would cause Gemmi’s CIF parser to fail when the file is read.

Note that this function reads the whole file several times, but pipeline STAR files are never expected to be so large that this would take a significant amount of time.

Returns:

True if the file was converted, or False otherwise.

Raises:

ValueError – if the pipeline file is invalid or its version cannot be determined

pipeliner.starfile_handler.fix_reserved_words(fn_in)

Make sure the STAR file doesn’t contain any illegally used reserved words

Overwrites the original file if it is corrected. The old file is saved as filename.orig.

Returns:

True if the file was corrected, or False if it contained no reserved words and did not need correcting.
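
The back-up-then-overwrite behaviour can be sketched as follows. Everything here is an assumption for illustration: `correct_file` and `quote_reserved_value` are hypothetical names, the per-line fixer is a toy that only handles simple pair lines, and `_rlnExampleLabel` is a made-up label. The fixer returns None when a line needs no change.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional

RESERVED = ("data_", "loop_", "save_", "stop_", "global_")


def quote_reserved_value(line: str) -> Optional[str]:
    """Toy fixer: quote the value of a pair line if it starts with a reserved word."""
    parts = line.split()
    if len(parts) == 2 and parts[0].startswith("_") and parts[1].lower().startswith(RESERVED):
        return f"{parts[0]} '{parts[1]}'\n"
    return None  # no change needed


def correct_file(fn_in: str, fix_line: Callable[[str], Optional[str]]) -> bool:
    """Rewrite fn_in only if a line changed, keeping the old file as fn_in + '.orig'."""
    path = Path(fn_in)
    changed = False
    corrected = []
    for line in path.read_text().splitlines(keepends=True):
        fixed = fix_line(line)
        if fixed is not None:
            changed = True
            line = fixed
        corrected.append(line)
    if not changed:
        return False
    shutil.copy2(path, str(path) + ".orig")
    path.write_text("".join(corrected))
    return True


with tempfile.TemporaryDirectory() as workdir:
    star = Path(workdir) / "example.star"
    star.write_text("data_example\n\n_rlnExampleLabel save_me\n")
    first_pass = correct_file(str(star), quote_reserved_value)
    second_pass = correct_file(str(star), quote_reserved_value)
    corrected_text = star.read_text()
    backup_text = Path(str(star) + ".orig").read_text()
```

A second pass over an already corrected file finds nothing to change and returns False, so the backup is not overwritten.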

pipeliner.starfile_handler.get_particle_box_size_from_starfile(starfile: str) → int

Get the image size in pixels from the optics group(s) of a STAR file

Assumes the images are square and all the same size

Parameters:

starfile (str) – Path to the STAR file

Returns:

Edge length in pixels

Return type:

int

pipeliner.starfile_handler.get_pixel_size_from_starfile(starfile: str) → float

Get the pixel size from the optics group(s) of a STAR file

Parameters:

starfile (str) – Path to the STAR file

Returns:

Pixel size

Return type:

float

pipeliner.starfile_handler.get_starfile_loop_headers(in_file: str) → Dict[str, List[str]]

Get the headers for each loop in each block in a STAR file

Parameters:

in_file (str) – The STAR file to operate on

Returns:

{block_name: [header1, header2, … headerN]}. Headers appear in the column order of the STAR file.

Return type:

Dict[str, List[str]]
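
To make the returned structure concrete, here is a minimal stdlib-only sketch of collecting loop headers per block. It is not the pipeliner implementation: `collect_loop_headers` is a hypothetical function that works on text rather than a file, assumes at most one loop per block, and keeps the leading underscore on header names (also an assumption).

```python
from typing import Dict, List


def collect_loop_headers(star_text: str) -> Dict[str, List[str]]:
    """Map each block name to the loop headers found in it, in file order.

    Hypothetical helper; assumes one loop per block.
    """
    found: Dict[str, List[str]] = {}
    block = ""  # name used for a loop that appears before any data_ line
    in_loop_header = False
    for raw_line in star_text.splitlines():
        line = raw_line.strip()
        if line.startswith("data_"):
            block = line[len("data_"):]
            in_loop_header = False
        elif line == "loop_":
            in_loop_header = True
            found.setdefault(block, [])
        elif in_loop_header and line.startswith("_"):
            # Drop the trailing "#N" column index that RELION writes
            found[block].append(line.split()[0])
        elif line and not line.startswith("#"):
            # First data row ends the header section of the loop
            in_loop_header = False
    return found


example_text = """
data_optics

loop_
_rlnOpticsGroup #1
_rlnImagePixelSize #2
1 1.05

data_particles

loop_
_rlnImageName #1
_rlnOpticsGroup #2
000001@Extract/job007/stack.mrcs 1
"""
loop_header_map = collect_loop_headers(example_text)
print(loop_header_map)
```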

pipeliner.starfile_handler.interconvert_box_star_coords(fn_in: str, out_name: str | None = None) → str

Interconvert .box and .star coordinate files

Parameters:
  • fn_in (str) – The file to convert

  • out_name (str) – Directory to write the output file into. If None, the output is written to the same place as the input, with the same name but the extension switched

Returns:

Name of the file written

Return type:

str

Raises:

ValueError – If the file is not .box or .star
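
The naming rule for the output file can be sketched with pathlib. This covers only how the output name might be derived, not the coordinate conversion itself; `converted_name` and its `out_dir` argument are hypothetical names for illustration.

```python
from pathlib import Path
from typing import Optional


def converted_name(fn_in: str, out_dir: Optional[str] = None) -> str:
    """Return the output path for a .box <-> .star conversion.

    Hypothetical helper; only the extension switch and directory choice are shown.
    """
    source = Path(fn_in)
    swapped = {".box": ".star", ".star": ".box"}
    if source.suffix not in swapped:
        raise ValueError(f"{fn_in} is not a .box or .star file")
    target_dir = Path(out_dir) if out_dir is not None else source.parent
    return str(target_dir / (source.stem + swapped[source.suffix]))


same_place = converted_name("Coords/mic001.box")
other_dir = converted_name("Coords/mic001.star", out_dir="Converted")
print(same_place, other_dir)
```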

pipeliner.starfile_handler.quotate_starfile_line(in_line: str) → str | None

Adds the necessary quotation to a line to make it compatible with gemmi

Parameters:

in_line (str) – The line to operate on

Returns:

The line, properly quoted, or None if no changes to the line are needed

Return type:

str or None

pipeliner.starfile_handler.read_relion_optimiser_starfile(fn_in) → Dict[str, str]

Special function for reading optimiser.star files written by RELION

These files can contain duplicate entries because of a RELION 5 bug. This precludes the use of gemmi, which raises an error if it encounters duplicate entries. If a duplicate entry exists, this function uses the value from the last instance.

Parameters:

fn_in (str) – Name of the optimiser file

Returns:

The param-value pairs from the file

Return type:

Dict[str, str]

Raises:

ValueError – If the file isn’t an optimiser file
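
The "last instance wins" rule can be shown with a small sketch that reads pair-value lines from text. It is not the pipeliner implementation: `read_pairs_last_wins` is a hypothetical function, it assumes an optimiser file can be recognised by a `data_optimiser_general` block, and it does not handle quoted values.

```python
from typing import Dict


def read_pairs_last_wins(star_text: str) -> Dict[str, str]:
    """Read "_label value" lines into a dict; later duplicates overwrite earlier ones.

    Hypothetical helper; the optimiser check is an assumption of this sketch.
    """
    if "data_optimiser_general" not in star_text:
        raise ValueError("Not an optimiser STAR file")
    pairs: Dict[str, str] = {}
    for raw_line in star_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("_"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Plain dict assignment means a duplicate label keeps its last value
            pairs[parts[0]] = parts[1].strip()
    return pairs


optimiser_text = """
data_optimiser_general

_rlnOutputRootName Refine3D/job010/run
_rlnCurrentIteration 3
_rlnCurrentIteration 4
"""
optimiser_pairs = read_pairs_last_wins(optimiser_text)
print(optimiser_pairs["_rlnCurrentIteration"])
```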

Starfile writing utilities

Writes STAR files in the same style as RELION

pipeliner.star_writer.star_quote(val: str) → str

Quote a string value as needed for writing to a STAR file.

If the value contains only characters that can be safely written without quoting, and does not begin with a reserved word of the STAR format, it is returned unchanged.

Otherwise, the value is passed to gemmi.cif.quote() for quoting, which handles all of the more complicated quoting logic that is sometimes required.
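
The two-step decision described above can be sketched as follows. This is a simplified illustration under stated assumptions: the set of "safe" characters is a guess, `quote_if_needed` is a hypothetical name, and `simple_quote` is a crude stand-in for gemmi.cif.quote() that does not cover multi-line values or strings containing both quote characters.

```python
import re

# STAR/CIF reserved words; a bare value must not start with one of these
RESERVED_PREFIXES = ("data_", "loop_", "save_", "stop_", "global_")
# Assumed safe character set for this sketch only
SAFE_VALUE = re.compile(r"[A-Za-z0-9.@/+\-][A-Za-z0-9_.@/:+\-]*")


def simple_quote(val: str) -> str:
    """Crude stand-in for gemmi.cif.quote(): wrap the value in quotes."""
    return f'"{val}"' if "'" in val else f"'{val}'"


def quote_if_needed(val: str) -> str:
    """Return val unchanged if it is safe to write bare, otherwise a quoted form."""
    is_safe = SAFE_VALUE.fullmatch(val) is not None
    is_reserved = val.lower().startswith(RESERVED_PREFIXES)
    if is_safe and not is_reserved:
        return val
    return simple_quote(val)


for value in ["Extract/job007/particles.star", "two words", "loop_test", ""]:
    print(repr(value), "->", repr(quote_if_needed(value)))
```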

pipeliner.star_writer.write(doc: Document, filename: str | PathLike[str])

Write a Gemmi CIF document to a RELION-style STAR file.

The file is written as an atomic operation. This ensures that any processes reading it will always see a valid version (old or new) and not a half-written new file.

The document is written to a temporary file first, then after the writing is complete and the file has been flushed to disk, the target file is replaced with the temporary one. The temporary file will always be removed even if the replacement is unsuccessful.

Note: it is still possible for data to be lost using this function, if two processes try to write to the file at the same time. In that case, one of the new versions of the file will be kept and the other will not.
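
The write-to-temporary-then-replace pattern described above can be sketched with the standard library. This is a generic illustration of the technique for plain text, not the pipeliner code: `atomic_write_text` is a hypothetical function, and it assumes the temporary file is created in the same directory as the target so that os.replace() stays on one filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(text: str, filename) -> None:
    """Write text to filename so that readers see either the old or the new file."""
    target = Path(filename)
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data has reached the disk
        os.replace(tmp_name, target)  # swap the finished file into place
    finally:
        # After a successful replace the temporary name no longer exists
        if os.path.exists(tmp_name):
            os.remove(tmp_name)


with tempfile.TemporaryDirectory() as workdir:
    out_file = Path(workdir) / "example.star"
    atomic_write_text("data_old\n", out_file)
    atomic_write_text("data_new\n", out_file)
    final_text = out_file.read_text()
    leftovers = [p.name for p in Path(workdir).iterdir() if p.name != "example.star"]
```

As the note above says, this protects readers from half-written files but does not arbitrate between two concurrent writers; whichever replace happens last is the version that remains.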

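Parameters:
  • doc (gemmi.cif.Document) – The document to write out

  • filename (str or Path) – The name of the file to write to
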
pipeliner.star_writer.write_jobstar(in_dict: dict, out_fn: str | PathLike[str])

Write a job.star file from a dictionary of options

Parameters:
  • in_dict (dict) – Dict of job option keys and values

  • out_fn (str or Path) – Name of the file to write to

pipeliner.star_writer.write_to_stream(doc: Document, out_stream: IO[str])

Write a Gemmi CIF document to an output stream using RELION’s output style.

Parameters:
  • doc (gemmi.cif.Document) – The data to write out

  • out_stream (IO[str]) – The text stream to write the data to