sm.engine package¶
Subpackages¶
Submodules¶
sm.engine.dataset module¶
-
class
sm.engine.dataset.Dataset(sc, id, name, drop, input_path, wd_manager, db, es)[source]¶ Bases: object
A class representing a mass spectrometry dataset, backed by a couple of plain text files containing coordinates and spectra.
-
get_norm_img_pixel_inds()[source]¶ Returns: ndarray
One-dimensional array of indices of the dataset pixels, taken in row-wise order
-
get_sample_area_mask()[source]¶ Returns: ndarray
One-dimensional boolean array marking the pixels where spectra were sampled
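As an illustration of what these two arrays might contain, the following minimal sketch derives row-wise pixel indices and a sample-area mask from a list of (x, y) coordinates. The helper names `norm_img_pixel_inds` and `sample_area_mask` are hypothetical, and plain Python lists stand in for the ndarrays; the actual implementation in `Dataset` may differ.

```python
def norm_img_pixel_inds(coords):
    """Return (indices, nrows, ncols) for a list of (x, y) pixel coordinates.

    Hypothetical helper: coordinates are shifted so that the smallest x and y
    map to column 0 and row 0, then flattened row by row.
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    min_x, min_y = min(xs), min(ys)
    ncols = max(xs) - min_x + 1
    nrows = max(ys) - min_y + 1
    inds = [(y - min_y) * ncols + (x - min_x) for x, y in coords]
    return inds, nrows, ncols


def sample_area_mask(inds, nrows, ncols):
    """Return a flat boolean list that is True where a spectrum was sampled."""
    mask = [False] * (nrows * ncols)
    for i in inds:
        mask[i] = True
    return mask


# Three spectra on a 2x2 grid; the pixel at (11, 6) was not sampled.
coords = [(10, 5), (11, 5), (10, 6)]
inds, nrows, ncols = norm_img_pixel_inds(coords)
mask = sample_area_mask(inds, nrows, ncols)
```

With these inputs the unsampled pixel is the only `False` entry in the mask, which is how a non-rectangular sample area can be told apart from the bounding image.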
-
sm.engine.db module¶
Synopsis: Database interface
sm.engine.es_export module¶
sm.engine.fdr module¶
sm.engine.formulas_segm module¶
-
class
sm.engine.formulas_segm.FormulasSegm(job_id, db_id, ds_config, db)[source]¶ Bases: object
A class representing a molecule database to search through. Provides several data structures used in the engine to speed up computation.
sm.engine.imzml_txt_converter module¶
Synopsis: Converter of imzML into a text format accessible from PySpark
-
class
sm.engine.imzml_txt_converter.ImzmlTxtConverter(imzml_path, txt_path, coord_path=None)[source]¶ Bases: object
Converts spectra from imzML/ibd to plain text files for later access from Spark
-
sm.engine.imzml_txt_converter.encode_coord_line(index, x, y)[source]¶ Encodes the given coordinate as a CSV line: “index,x,y”
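A minimal sketch of the documented line format is shown here. It mirrors only the “index,x,y” layout described above and is not taken from the module source; `decode_coord_line` is a hypothetical inverse added for illustration.

```python
def encode_coord_line(index, x, y):
    """Encode a spectrum index and its pixel coordinates as "index,x,y"."""
    return "{},{},{}".format(index, x, y)


def decode_coord_line(line):
    """Hypothetical inverse: parse "index,x,y" back into three integers."""
    index, x, y = (int(field) for field in line.strip().split(","))
    return index, x, y


line = encode_coord_line(0, 12, 7)
```

Keeping one coordinate per line means the coordinates file can be read line by line, in the same way as the spectra text file.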
sm.engine.isocalc_wrapper module¶
-
class
sm.engine.isocalc_wrapper.Centroids(isotope_pattern, resolving_power, pts_per_mz=None)[source]¶ Bases: object
-
empty¶
-
-
class
sm.engine.isocalc_wrapper.IsocalcWrapper(isocalc_config)[source]¶ Bases: object
Wrapper around pyMSpec.pyisocalc.pyisocalc used for getting the centroids and profiles of theoretical isotope peaks for a sum formula.
-
formatted_iso_peaks(sf, adduct)[source]¶ Returns: str
A one-line string of tab-separated lists. Each list is a comma-separated string.
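The layout described above (lists joined by tabs, values within a list joined by commas) can be sketched as follows. Which lists the engine writes, and in what order, is not stated here, so the two lists below and the helper names `format_lists_line` and `parse_lists_line` are assumptions made for illustration only.

```python
def format_lists_line(*lists):
    """Join each list with commas, then join the lists with tabs."""
    return "\t".join(",".join(str(v) for v in lst) for lst in lists)


def parse_lists_line(line):
    """Split a formatted line back into lists of floats."""
    fields = line.rstrip("\n").split("\t")
    return [[float(v) for v in field.split(",")] for field in fields]


# Hypothetical values: two lists that could hold peak m/z values and intensities.
mzs = [100.5, 101.5]
ints = [100.0, 12.5]
line = format_lists_line(mzs, ints)
```

A single-line encoding like this is convenient when every record has to fit into one row of a text file or one text column of a database table.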
-
sm.engine.metabolights module¶
sm.engine.queue module¶
sm.engine.search_algorithm module¶
sm.engine.search_job module¶
Synopsis: Molecular search job driver
-
class
sm.engine.search_job.SearchJob(ds_id, ds_name, drop, input_path, sm_config_path, no_clean=False)[source]¶ Bases: object
Main class responsible for molecule search. Uses other modules of the engine.
-
run(ds_config_path=None)[source]¶ Entry point of the engine. Molecule search is completed in several steps:
- Copying input data to the engine work directory
- Converting input data (imzML+ibd) to a plain text format, with one spectrum per line
- Generating theoretical peaks for all formulas from the molecule database and saving them to the database
- Searching for molecules. This is the most compute-intensive part; Spark is used to run it in a distributed manner.
- Saving results (isotope images and their quality metrics for each putative molecule) to the database
-
sm.engine.search_results module¶
sm.engine.theor_peaks_gen module¶
-
class
sm.engine.theor_peaks_gen.TheorPeaksGenerator(sc, sm_config, ds_config)[source]¶ Bases: object
Generator of theoretical isotope peaks for all molecules in a database.
-
find_sf_adduct_cand(sf_list, stored_sf_adduct)[source]¶ Returns: list
List of (formula id, formula, adduct) triples which don’t have theoretical patterns saved in the database
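The selection described above amounts to a set difference between all formula/adduct combinations and those already stored. The sketch below is one possible way to express it, not the method's actual code: it assumes `sf_list` holds (formula id, formula) pairs and `stored_sf_adduct` holds (formula, adduct) pairs, and it takes the adduct list as an explicit argument, whereas the real method presumably obtains adducts from the generator's configuration. The name `find_missing_patterns` is hypothetical.

```python
def find_missing_patterns(sf_list, adducts, stored_sf_adduct):
    """Return (formula id, formula, adduct) triples with no stored pattern."""
    stored = set(stored_sf_adduct)  # set lookup keeps the membership test cheap
    return [
        (sf_id, sf, adduct)
        for sf_id, sf in sf_list
        for adduct in adducts
        if (sf, adduct) not in stored
    ]


# Hypothetical inputs for illustration.
sf_list = [(1, "C6H12O6"), (2, "H2O")]
adducts = ["+H", "+Na"]
stored_sf_adduct = {("C6H12O6", "+H"), ("H2O", "+Na")}
candidates = find_missing_patterns(sf_list, adducts, stored_sf_adduct)
```

Only the combinations returned here would need new theoretical patterns, so peaks already saved in the database are not recomputed.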
-
sm.engine.util module¶
sm.engine.work_dir module¶
Synopsis: Access to datasets stored in a local directory or on S3
-
class
sm.engine.work_dir.LocalWorkDir(base_path, ds_name)[source]¶ Bases: object
-
coord_path¶
-
ds_config_path¶
-
ds_metadata_path¶
-
imzml_path¶
-
txt_path¶
-
-
class
sm.engine.work_dir.S3WorkDir(base_path, ds_name, s3, s3transfer)[source]¶ Bases: object
-
coord_path¶
-
ds_config_path¶
-
txt_path¶
-
-
class
sm.engine.work_dir.WorkDirManager(ds_id)[source]¶ Bases: object
Provides access to the work directory of the target dataset
-
coord_path¶
-
copy_input_data(input_data_path, ds_config_path)[source]¶ Copies imzML/ibd/config/meta files from the input path to the dataset work directory
-
ds_config_path¶
-
ds_metadata_path¶
-
txt_path¶
-