sm.engine package

Submodules

sm.engine.dataset module

class sm.engine.dataset.Dataset(sc, name, owner_email, ds_config, wd_manager, db)[source]

Bases: object

A class representing a mass spectrometry dataset. Backed by a couple of plain text files containing coordinates and spectra.

sc : pyspark.SparkContext
Spark context object
name : String
Dataset name
ds_config : dict
Dataset configuration

wd_manager : engine.local_dir.WorkDir
db : engine.db.DB

get_dims()[source]
: tuple
A pair of int values. Number of rows and columns
get_norm_img_pixel_inds()[source]
: ndarray
One-dimensional array of indexes for dataset pixels, taken in a row-wise manner
get_spectra()[source]
: pyspark.rdd.RDD
Spark RDD with spectra. One spectrum per RDD entry.
save_ds_meta()[source]

Save dataset metadata (name, path, image bounds, coordinates) to the database

static txt_to_spectrum_non_cum(s)[source]
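
The row-wise pixel indexing described for get_norm_img_pixel_inds() and the (rows, columns) pair described for get_dims() can be illustrated with a small sketch. This is not the engine's implementation: the function name norm_img_pixel_inds, the (x, y) coordinate layout, and the sample coordinates are all assumptions made for illustration.

```python
import numpy as np


def norm_img_pixel_inds(coords):
    """Map (x, y) pixel coordinates to flat row-wise indices.

    `coords` is a hypothetical sequence of (x, y) integer pairs; the real
    dataset reads its coordinates from a text file in a format not shown here.
    """
    coords = np.asarray(coords)
    # Shift coordinates so that the smallest x and y become 0.
    xs = coords[:, 0] - coords[:, 0].min()
    ys = coords[:, 1] - coords[:, 1].min()
    ncols = int(xs.max()) + 1
    nrows = int(ys.max()) + 1
    # Row-wise (row-major) flattening: index = row * ncols + column.
    inds = ys * ncols + xs
    return inds, (nrows, ncols)


# Made-up coordinates: a 2 x 3 image with one missing pixel.
coords = [(10, 5), (11, 5), (12, 5), (10, 6), (12, 6)]
inds, dims = norm_img_pixel_inds(coords)
```

Pixels missing from the acquisition simply leave gaps in the index sequence, so the indices can address a dense nrows * ncols image buffer directly.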

sm.engine.db module

sm.engine.es_export module

sm.engine.fdr module

class sm.engine.fdr.FDR(job_id, db_id, decoy_sample_size, target_adducts, db)[source]

Bases: object

decoy_adduct_selection()[source]
estimate_fdr(msm_df)[source]

sm.engine.formulas module

class sm.engine.formulas.Formulas(job_id, db_id, ds_config, db)[source]

Bases: object

A class representing a molecule database to search through. Provides several data structures used in the engine to speed up computation

ds_config : dict
Dataset configuration

db : engine.db.DB

static check_formula_uniqueness(formula_ids, adducts)[source]
get_sf_adduct_peaksn()[source]
: list
An array of triples (formula id, adduct, number of theoretical peaks)
get_sf_peak_bounds()[source]
: tuple
A pair of ndarrays with bound mz values for each molecule from the molecule database
get_sf_peak_ints()[source]
: ndarray
An array of arrays of theoretical peak intensities for each item of the molecule database
get_sf_peak_map()[source]
: ndarray
An array of pairs (formula index, local peak index)
get_sf_peaks()[source]
: ndarray
An array of arrays of theoretical peak mzs for each item of the molecule database
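
As a rough illustration of how the structures above relate to each other, the sketch below derives an array of (formula index, local peak index) pairs from an array of arrays of peak m/z values. The helper build_peak_map and the m/z values are hypothetical; the engine's own construction may differ.

```python
import numpy as np


def build_peak_map(sf_peaks):
    """Return an array of (formula index, local peak index) pairs.

    `sf_peaks` is a sequence of per-formula peak m/z lists, in the shape
    described for get_sf_peaks().
    """
    pairs = [(sf_i, peak_i)
             for sf_i, peaks in enumerate(sf_peaks)
             for peak_i in range(len(peaks))]
    # reshape keeps the result two-dimensional even when there are no peaks.
    return np.array(pairs, dtype=int).reshape(-1, 2)


# Made-up theoretical peaks for two formulas (values are arbitrary).
sf_peaks = [[100.0, 101.0, 102.0], [250.5, 251.5]]
peak_map = build_peak_map(sf_peaks)
# Number of theoretical peaks per formula.
peaks_n = [len(peaks) for peaks in sf_peaks]
```

A flat map like this lets every theoretical peak be addressed by a single position, which is convenient when peaks from all formulas are processed together.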

sm.engine.formulas_segm module

class sm.engine.formulas_segm.FormulasSegm(job_id, db_id, ds_config, db)[source]

Bases: object

static check_formula_uniqueness(sf_df)[source]
get_sf_adduct_peaksn()[source]
: list
An array of triples (formula id, adduct, number of theoretical peaks)
get_sf_adduct_sorted_df()[source]
get_sf_peak_df()[source]
get_sf_peak_ints()[source]
static sf_peak_gen(sf_df)[source]

sm.engine.imzml_txt_converter module

sm.engine.isocalc_wrapper module

sm.engine.search_algorithm module

class sm.engine.search_algorithm.SearchAlgorithm(sc, ds, formulas, fdr, ds_config)[source]

Bases: object

calc_metrics(sf_images)[source]
estimate_fdr(all_sf_metrics_df)[source]
filter_sf_images(sf_images, sf_metrics_df)[source]
filter_sf_metrics(sf_metrics_df)[source]
search()[source]

sm.engine.search_job module

sm.engine.search_results module

sm.engine.theor_peaks_gen module

sm.engine.util module

class sm.engine.util.SMConfig[source]

Bases: object

Engine configuration manager

classmethod get_conf()[source]
: dict
SM engine configuration
classmethod set_path(path)[source]

Set the path to an SM configuration file

path : String
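
The set_path / get_conf pair follows a common pattern: a class-level configuration holder that is pointed at a file once and then queried from anywhere. The sketch below shows that pattern only. ConfigManager is a hypothetical stand-in, and the JSON file format, the caching behaviour, and the sample keys are assumptions, not documented properties of SMConfig.

```python
import json
import os
import tempfile


class ConfigManager:
    """Hypothetical stand-in for a class-level configuration manager."""

    _path = None
    _conf = None

    @classmethod
    def set_path(cls, path):
        # Remember where the configuration lives and drop any cached copy.
        cls._path = path
        cls._conf = None

    @classmethod
    def get_conf(cls):
        # Assumption: the file is JSON and is read lazily on first access.
        if cls._conf is None:
            with open(cls._path) as f:
                cls._conf = json.load(f)
        return cls._conf


# Usage: write a throwaway config file, register its path, read it back.
with tempfile.TemporaryDirectory() as tmp:
    conf_path = os.path.join(tmp, 'config.json')
    with open(conf_path, 'w') as f:
        json.dump({'db': {'host': 'localhost'}}, f)
    ConfigManager.set_path(conf_path)
    conf = ConfigManager.get_conf()
```

Because both methods are classmethods, callers never need to create or pass around an instance; the path is set once at start-up and the configuration dict is shared.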

sm.engine.util.cmd(template, *args)[source]
sm.engine.util.cmd_check(template, *args)[source]
sm.engine.util.local_path(path)[source]
sm.engine.util.proj_root()[source]
sm.engine.util.s3_path(path)[source]

sm.engine.work_dir module

Module contents