We are currently in beta testing! Please report any issues you encounter while using SCRuB on our Issues page, or email gia2105@columbia.edu.
Welcome to SCRuB! This is a software package for in silico removal of contamination from microbial datasets using process controls.
In addition to the statistical framework of SCRuB, we provide preSCRuB, a personalized interface which offers recommendations for placement of controls during experimental design processes.
SCRuB is currently available through our GitHub repository and can be installed using devtools:
devtools::install_github("korem-lab/SCRuB")
torch::install_torch()
Additionally, we provide SCRuB as a QIIME2 plugin, which can be installed from a QIIME2 environment using pip:
pip install git+https://github.com/korem-lab/q2-SCRuB.git
These tutorials demonstrate the data format needed to set up SCRuB, how to run SCRuB’s core functions, and how to interpret its results.
Why SCRuB?
Instead of trying to identify whether a taxon is categorically a contaminant, SCRuB
models the composition of each contamination source (kit contamination,
water contamination, etc.). We assume that taxa present together in a
contamination source will be introduced together to other samples, and
in similar proportions as in the contamination source. Therefore, if a
control sample contains multiple bacteria, and a sample of interest
contains only one of them, at high counts, that bacterium is
likely not a contaminant. This allows more accurate and specific
decontamination.
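The proportionality argument above can be illustrated with a small sketch. This is not SCRuB's estimator: it is a deliberately simplified bound, assuming a single contamination source whose composition is already known. The function name `max_contamination_fraction` and the taxon names are hypothetical and used for illustration only.

```python
def max_contamination_fraction(sample_counts, source_profile):
    """Upper bound on the share of a sample that one source could explain.

    Simplifying assumption (not SCRuB's actual model): contaminants enter the
    sample in exactly the proportions given by source_profile, so every taxon
    in the source limits how much contamination the sample can contain.
    """
    total = sum(sample_counts.values())
    if total == 0:
        return 0.0
    bound = 1.0
    for taxon, source_share in source_profile.items():
        if source_share > 0:
            observed_share = sample_counts.get(taxon, 0) / total
            bound = min(bound, observed_share / source_share)
    return bound


# Hypothetical contamination source containing three taxa.
source_profile = {"taxon_A": 0.4, "taxon_B": 0.3, "taxon_C": 0.3}

# Only taxon_A is present, at high counts: taxon_B and taxon_C are missing,
# so the source cannot account for any of the sample.
sample_one_taxon = {"taxon_A": 9000, "taxon_D": 1000}

# All three source taxa are present in source-like proportions.
sample_all_taxa = {"taxon_A": 400, "taxon_B": 300, "taxon_C": 300, "taxon_D": 9000}

print(max_contamination_fraction(sample_one_taxon, source_profile))
print(max_contamination_fraction(sample_all_taxa, source_profile))
```

In the first sample the bound is zero, so taxon_A is treated as genuine signal even though it also appears in the control. In the second sample the bound is about one tenth of the reads.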
Should I use SCRuB only if I believe that there is substantial contamination?
Our results
demonstrate that SCRuB will not erroneously remove taxa if provided with
unrelated controls. We therefore support incorporating SCRuB into your
day-to-day analysis pipeline.
What does SCRuB do with the well locations of the samples?
SCRuB uses these
locations to handle the important and common phenomenon of well-to-well
leakage, in which material from biological samples leaks into controls
during experimental procedures. Using the locations of samples during
processing allows us to detect these cases.
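As a rough illustration of why well locations are informative, the sketch below lists the samples that sit closest to a control on a plate. It assumes well IDs of the form `A1` to `H12` and assumes that leakage is more likely between nearby wells. The helper names, the distance cutoff, and the sample names are hypothetical; this is not how SCRuB models leakage internally.

```python
import math


def well_to_coords(well):
    """Convert a well ID such as 'B7' into zero-based (row, column) coordinates."""
    row = ord(well[0].upper()) - ord("A")
    col = int(well[1:]) - 1
    return row, col


def well_distance(well_a, well_b):
    """Euclidean distance between two wells, measured in well spacings."""
    (r1, c1), (r2, c2) = well_to_coords(well_a), well_to_coords(well_b)
    return math.hypot(r1 - r2, c1 - c2)


def nearest_samples(control_well, sample_wells, max_distance=1.5):
    """Return the names of samples within max_distance of the control, closest first."""
    nearby = [(well_distance(control_well, well), name)
              for name, well in sample_wells.items()]
    return [name for dist, name in sorted(nearby) if dist <= max_distance]


# Hypothetical plate layout: sample name -> well ID.
sample_wells = {"stool_1": "A1", "stool_2": "B2", "stool_3": "D6", "stool_4": "B3"}

# Samples adjacent (including diagonally) to a control placed in well A2.
print(nearest_samples("A2", sample_wells))
```

With this layout the control in A2 has three neighbours that could plausibly leak into it, while the sample in D6 is too far away to be listed.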
What do I do if I have more than one type of control?
SCRuB supports multiple types of controls
and performs serial decontamination, each time using
a different type of control. You can specify the
order of decontamination yourself; we recommend performing
decontamination in the order in which contaminants are introduced.
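The serial structure can be sketched as a simple loop over control types. The single-source step below (`remove_source`) is a naive placeholder that subtracts the largest proportional copy of a source profile that fits inside the sample; it stands in for SCRuB's statistical model and is not part of the package. The control type names, the profiles, and the counts are all hypothetical.

```python
def remove_source(sample, source_profile):
    """Placeholder single-source step (not SCRuB's model).

    Subtracts the largest multiple of source_profile that fits inside the
    sample, so taxa are removed together and in source-like proportions.
    """
    scale = min(
        (sample.get(taxon, 0.0) / share
         for taxon, share in source_profile.items() if share > 0),
        default=0.0,
    )
    return {taxon: max(count - scale * source_profile.get(taxon, 0.0), 0.0)
            for taxon, count in sample.items()}


def serial_decontaminate(sample, profiles_by_control_type, control_order):
    """Apply the single-source step once per control type, in the given order."""
    cleaned = dict(sample)
    for control_type in control_order:
        cleaned = remove_source(cleaned, profiles_by_control_type[control_type])
    return cleaned


# Hypothetical source profiles, one per control type.
profiles_by_control_type = {
    "extraction_blank": {"A": 0.5, "B": 0.5},
    "library_prep_blank": {"C": 0.25, "A": 0.75},
}

sample = {"A": 100.0, "B": 30.0, "C": 20.0}

# Order chosen by the user; here extraction comes before library preparation.
control_order = ["extraction_blank", "library_prep_blank"]

print(serial_decontaminate(sample, profiles_by_control_type, control_order))
```

The point of the sketch is the loop: the output of one decontamination pass becomes the input of the next, and the order is an explicit argument chosen by the user.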
We sequenced samples from two different studies on the same batch / plate / sequencing run. Should I run SCRuB just on the data from my study?
One of the key
advantages of SCRuB is that it uses the shared information across all
samples affected by a certain contamination source (e.g., an extraction
batch). We therefore recommend that you supply SCRuB with all
the relevant samples, including ones that are not related to a
particular study - SCRuB uses the information in those samples to
perform better decontamination.
Is there any benefit in providing SCRuB with more than one control sample?
Yes! SCRuB uses each
control sample as an independent realization of the contamination source
it represents. More samples allow us to infer the latent composition of
this source more accurately. We recommend at least 2 controls per
source, and you can use