We are currently in beta testing! Please report any issues you encounter while using SCRuB on our Issues page, or email gia2105@columbia.edu.


Welcome to SCRuB! This is a software package for in silico removal of contamination from microbial datasets using process controls.

In addition to the statistical framework of SCRuB, we provide preSCRuB, a personalized interface which offers recommendations for placement of controls during experimental design processes.

Installation

SCRuB is currently available through our GitHub repository and can be installed using devtools:

devtools::install_github("korem-lab/SCRuB")
torch::install_torch()

Additionally, we provide SCRuB as a QIIME2 plugin, which can be installed from a QIIME2 environment using pip:

pip install git+https://github.com/korem-lab/q2-SCRuB.git

 

Getting started

To start using SCRuB, follow one of the tutorials below, according to your preference:
R tutorial
QIIME2 tutorial

These tutorials demonstrate the data format needed to set up SCRuB, how to run SCRuB’s core functions, and how to interpret its results.

 

FAQs

Why SCRuB?
Instead of trying to identify whether a taxon is categorically a contaminant, SCRuB models the composition of each contamination source (kit contamination, water contamination, etc.). We assume that taxa present together in a contamination source will be introduced together to other samples, and in proportions similar to those in the contamination source. Therefore, if a control sample contains multiple bacteria, and a sample of interest contains only one of them and at high counts, that one bacterium is likely not a contaminant. This allows for more accurate and specific decontamination.
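
The intuition above can be illustrated with a toy calculation. This is not SCRuB's statistical model. It is a hypothetical Python sketch in which the taxon names, the counts, and the helper `max_contamination` are all made up for illustration. It asks how many reads in a sample could be explained by a contamination source if contaminants always arrive in the proportions seen in the control.

```python
def max_contamination(sample, source_props):
    """Largest contamination consistent with a sample (toy assumption:
    contaminants arrive in exactly the proportions of the source).

    sample:       dict taxon -> read count in the sample of interest
    source_props: dict taxon -> proportion of that taxon in the source
    Returns a dict taxon -> reads that the source could account for.
    """
    # The total contaminating reads t must satisfy t * p <= observed count
    # for every taxon in the source, so the scarcest taxon sets the limit.
    total = min(sample.get(taxon, 0) / p
                for taxon, p in source_props.items() if p > 0)
    return {taxon: total * p for taxon, p in source_props.items()}


# Hypothetical control sample, converted to proportions.
control = {"Ralstonia": 50, "Pseudomonas": 30, "Bradyrhizobium": 20}
n_reads = sum(control.values())
source_props = {taxon: count / n_reads for taxon, count in control.items()}

# Sample 1 contains only one of the control's taxa, at high counts.
sample_1 = {"Pseudomonas": 900, "Lactobacillus": 100}
explained_1 = max_contamination(sample_1, source_props)

# Sample 2 contains all of the control's taxa, in similar proportions.
sample_2 = {"Ralstonia": 100, "Pseudomonas": 70,
            "Bradyrhizobium": 40, "Lactobacillus": 500}
explained_2 = max_contamination(sample_2, source_props)
```

In this toy setting none of the Pseudomonas reads in `sample_1` can be attributed to the source, because the taxa that should accompany it are absent. In `sample_2`, most of the reads from the three shared taxa are consistent with contamination.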

Should I use SCRuB only if I believe that there is substantial contamination?
Our results demonstrate that SCRuB will not erroneously remove taxa if provided with unrelated controls. We therefore support incorporating SCRuB into your day-to-day analysis pipeline.

What does SCRuB do with the well locations of the samples?
SCRuB uses these locations to handle the important and common phenomenon of well-to-well leakage, in which material from biological samples leaks into controls during experimental procedures. Using the locations of samples during processing allows us to detect these cases.
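
To make "location" concrete, here is a small Python sketch that turns plate well labels into coordinates and lists the samples sitting next to a control. It assumes wells are labeled with a row letter and a column number (for example `A1` to `H12` on a 96-well plate). The helper names and sample names are hypothetical, and this is not how SCRuB itself models leakage.

```python
def well_to_coords(well):
    """Convert a well label such as 'B7' to zero-based (row, column)."""
    row = ord(well[0].upper()) - ord("A")
    col = int(well[1:]) - 1
    return row, col


def neighbors(control_well, sample_wells, radius=1):
    """Names of samples within `radius` rows and columns of the control well."""
    r0, c0 = well_to_coords(control_well)
    near = []
    for name, well in sample_wells.items():
        r, c = well_to_coords(well)
        # Chebyshev distance: diagonal wells count as adjacent.
        if max(abs(r - r0), abs(c - c0)) <= radius:
            near.append(name)
    return sorted(near)


# Hypothetical plate layout with a control in well A2.
sample_wells = {"stool_1": "A1", "stool_2": "B2", "stool_3": "H12"}
adjacent = neighbors("A2", sample_wells)
```

Samples returned by `neighbors` are the ones most plausibly able to leak into the control in this simplified picture.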

What do I do if I have more than one type of control?
SCRuB supports multiple types of controls, and performs serial decontamination - each time performing decontamination using a different type of control. You can specify the order of decontamination yourself; we recommend performing decontamination in the order in which contaminants are introduced.
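
The serial structure can be sketched as a loop in which each pass works on the output of the previous one. The Python below is a hypothetical illustration only. `remove_proportional` is a deliberately simple stand-in for a decontamination step, not SCRuB's method, and the control type names and counts are invented.

```python
def remove_proportional(sample, control):
    """Toy step: subtract the largest contamination that keeps the
    control's taxon proportions (a stand-in, not SCRuB's model)."""
    total_control = sum(control.values())
    props = {t: c / total_control for t, c in control.items() if c > 0}
    scale = min(sample.get(t, 0) / p for t, p in props.items())
    return {t: n - scale * props.get(t, 0.0) for t, n in sample.items()}


def serial_decontaminate(samples, controls_by_type, order):
    """Run one decontamination pass per control type, in the given order."""
    current = samples
    for control_type in order:
        control = controls_by_type[control_type]
        # Each pass takes the previous pass's output as its input.
        current = {name: remove_proportional(sample, control)
                   for name, sample in current.items()}
    return current


# Hypothetical data: one sample and two types of control.
samples = {"S1": {"A": 120, "B": 30, "C": 50, "D": 10}}
controls_by_type = {
    "extraction blank": {"A": 50, "B": 50},
    "library prep blank": {"C": 80, "D": 20},
}
order = ["extraction blank", "library prep blank"]
cleaned = serial_decontaminate(samples, controls_by_type, order)
```

The `order` list plays the role of the user-specified order of decontamination described above.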

We sequenced samples from two different studies on the same batch / plate / sequencing run. Should I run SCRuB just on the data from my study?
One of the key advantages of SCRuB is that it uses the shared information across all samples affected by a certain contamination source (e.g., an extraction batch). We therefore recommend that you supply SCRuB with all the relevant samples, including ones that are not related to a particular study - SCRuB uses the information in those samples to perform better decontamination.

Is there any benefit in providing SCRuB with more than one control sample?
Yes! SCRuB uses each control sample as an independent realization of the contamination source it represents. More samples allow us to infer the latent composition of these sources more accurately. We recommend at least 2 controls per source, and you can use preSCRuB to get recommendations that are tailored to your experiment - often, even more than 2 controls would be appropriate.

 
