NJU rloopbase
Q4. How many R-loop mapping datasets were processed in R-loopBase and how were they processed?

In the current release of R-loopBase, we integrated 358 datasets generated in human cells by 16 different technologies (Table 1), 71 datasets in mouse cells by 7 different technologies, 46 datasets in fruit fly cells by 4 different technologies, and 23 datasets in yeast cells by 4 different technologies (for meta information, please refer to Download). We then developed a comprehensive workflow for quality control and data analysis. Briefly, technical replicates, if any, were merged first, and raw sequencing data were then mapped to the genome (hg38/mm39/dm6/sacCer3) using Bowtie2 in local alignment mode. Uniquely mapped, non-redundant reads were kept as useful reads, and samples with >7M useful reads were considered to have sufficient read counts. To make maximal use of the sequencing data, biological replicates with <7M useful reads were merged to meet the minimal read count cutoff as long as they were highly correlated (Spearman correlation coefficient >0.5). For mouse, fruit fly, and yeast, samples with <7M useful reads were not filtered out because of the relatively small data volumes. Finally, peak calling was done with MACS2 on all useful reads (DRIP-seq, DRIVE-seq, MapR, R-loop CUT&Tag, R-ChIP-NS, enDR3-ChIP and S1-DRIP-seq) or on useful reads from the Watson and Crick strands separately (DRIPc-seq, RDIP-seq, ssDRIP-seq, qDRIP-seq, R-ChIP-SS, RR-ChIP, enDR3-DRIPc-seq and BisMapR), using a q-value cutoff of 0.01 for narrow peaks (R-ChIP-SS, R-ChIP-NS, R-loop CUT&Tag and RIAN-seq) and 0.05 for broad peaks (DRIP-seq, DRIPc-seq, RDIP-seq, ssDRIP-seq, qDRIP-seq, DRIVE-seq, MapR, RR-ChIP, enDR3-ChIP, enDR3-DRIPc-seq, BisMapR and S1-DRIP-seq). If multiple biological replicates existed, peaks with ≥50 bp overlap among ≥2 replicates were merged and taken as reproducible peaks. Human samples with <100 peaks, mouse samples with <50 peaks, and fruit fly and yeast samples with <10 peaks called were discarded.
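The replicate and peak-count rules described above can be illustrated with a small sketch. This is not the R-loopBase pipeline itself: the function names are hypothetical, the Spearman coefficient is assumed to be computed elsewhere, and the reading that pooled replicates must together exceed the 7M cutoff is an assumption. Only the numeric thresholds come from the text.

```python
from itertools import combinations

# Thresholds quoted from the text above.
MIN_USEFUL_READS = 7_000_000
MIN_SPEARMAN = 0.5
MIN_OVERLAP_BP = 50
MIN_PEAKS = {"human": 100, "mouse": 50, "fruit fly": 10, "yeast": 10}

Peak = tuple[str, int, int]  # (chromosome, start, end), half-open as in BED


def can_pool_replicates(useful_reads: list[int], spearman: float) -> bool:
    """Decide whether shallow biological replicates may be pooled.

    Assumption: pooling is accepted only if the replicates are highly
    correlated and the pooled depth exceeds the read-count cutoff.
    """
    return spearman > MIN_SPEARMAN and sum(useful_reads) > MIN_USEFUL_READS


def keep_sample(species: str, n_peaks: int) -> bool:
    """Samples with fewer peaks than the species minimum are discarded."""
    return n_peaks >= MIN_PEAKS[species]


def overlap_bp(a: Peak, b: Peak) -> int:
    """Number of bases shared by two peaks (0 on different chromosomes)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def reproducible_peaks(replicates: list[list[Peak]]) -> list[Peak]:
    """Merge peaks that overlap by >= 50 bp in at least two replicates."""
    supported = []
    for rep_a, rep_b in combinations(replicates, 2):
        for a in rep_a:
            for b in rep_b:
                if overlap_bp(a, b) >= MIN_OVERLAP_BP:
                    # Keep the union of the two supporting peaks.
                    supported.append((a[0], min(a[1], b[1]), max(a[2], b[2])))
    # Collapse unions that overlap each other into single intervals.
    merged: list[Peak] = []
    for chrom, start, end in sorted(supported):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


rep1 = [("chr1", 100, 300), ("chr1", 1000, 1200)]
rep2 = [("chr1", 200, 400), ("chr1", 1180, 1400)]
print(reproducible_peaks([rep1, rep2]))  # [('chr1', 100, 400)]
```

In the toy example, the first pair of peaks shares 100 bp and is kept as one merged interval, while the second pair shares only 20 bp and is dropped. The pairwise scan is quadratic and is meant only to make the rule explicit; real peak sets would normally be handled with an interval tool.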
Following ENCODE guidelines for ChIP-seq data analysis, we further calculated the signal portion of tags (SPOT) and reads in blacklisted regions (RiBL) as part of the quality control metrics for users' reference. Only peaks outside of ChIP-seq blacklisted regions were used for downstream analysis. As a special case, bisDRIP-seq and spKAS-seq data are not readily amenable to peak calling; we instead uploaded their processed signal tracks to our genome browser for visualization and comparison with other R-loop mapping data. In total, 358 datasets for 55 human cells generated by 16 different technologies (Table 1), 71 datasets for 17 mouse cells generated by 7 different technologies (Table 2), 46 datasets for 16 fruit fly cells generated by 4 different technologies (Table 3), and 23 datasets for 5 yeast cells generated by 4 different technologies (Table 4) have been included in the current release of R-loopBase.
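As a rough illustration of these two quality metrics and of the blacklist filter, the sketch below treats both SPOT and RiBL as simple fractions of reads overlapping a set of regions. The function names are hypothetical, RiBL is assumed to be reported as a fraction, and called peaks are used as a stand-in for the signal regions in SPOT; the exact definitions used by R-loopBase may differ.

```python
Interval = tuple[str, int, int]  # (chromosome, start, end), half-open as in BED


def overlaps_any(item: Interval, regions: list[Interval]) -> bool:
    """True if the interval shares at least one base with any region."""
    chrom, start, end = item
    return any(c == chrom and start < e and s < end for c, s, e in regions)


def fraction_in_regions(reads: list[Interval], regions: list[Interval]) -> float:
    """Fraction of reads that overlap the given regions."""
    if not reads:
        return 0.0
    hits = sum(overlaps_any(read, regions) for read in reads)
    return hits / len(reads)


def drop_blacklisted(peaks: list[Interval], blacklist: list[Interval]) -> list[Interval]:
    """Keep only peaks that do not touch a blacklisted region (assumed rule)."""
    return [p for p in peaks if not overlaps_any(p, blacklist)]


# Toy data, invented for illustration only.
reads = [("chr1", 100, 150), ("chr1", 500, 550), ("chr1", 900, 950), ("chr2", 100, 150)]
blacklist = [("chr1", 880, 1000)]
peaks = [("chr1", 80, 600), ("chr1", 890, 960)]

ribl = fraction_in_regions(reads, blacklist)    # 1 of 4 reads
clean_peaks = drop_blacklisted(peaks, blacklist)
spot = fraction_in_regions(reads, clean_peaks)  # 2 of 4 reads
print(ribl, clean_peaks, spot)
```

A low RiBL together with a high SPOT would suggest that most useful reads fall in signal regions rather than in artifact-prone regions.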

Table 1. Meta information for Human R-loop mapping data
Technology | Treatment | Biological Samples | Datasets | PMID
DRIP-seq | Control | B-cell (1/1*), CHLA10 (1/1), EWS502 (1/1), HeLa (10/9), HEK293 (2/2), SHSY5Y (2/2), TC32 (1/1), Stromal (4/4), Basal-epithelial (4/4), Luminal-progenitor (4/4), Mature-luminal-epithelial (4/4), MCF-7 (1/1), NT2 (6/6), K562 (3/1), Primary-fibroblast (2/2), U2OS (18/14), U87 (2/2), Jurkat (2/0), T-cells (2/0), IMR-90 (1/0), HEK293T (3/2), A375 (1/1), M231 (4/4), CD34+ (4/2), CD4+-HD (1/1), DMS114 (1/1), HCT116 (4/4), MCF10A (2/1), MDA-MB-436 (2/2), MOLM13 (3/2), Nalm-6 (2/2), Neuroblastoma (6/6), PC-9 (2/2), SNB-19 (3/2), Sperm (4/1), U251 (1/1), Blood (2/2), foreskin-fibroblast (1/0), KM27 (1/0), HFK (2/0), CIN612 (2/0), tIMEC (4/0) | 126/95 | 30108179, 32669707, 32769985, 28802045, 30060749, 28270613, 28649985, 27552054, 27373332, 23868195, 22387027, 26182405, 32747416, 30591567, 29416069, 29416038, 32439635, 32398827, 32686621, 28341774, 32615088, 38267456, 38200551, 38787953, 36453989, 38184854, 36999631, 35618715, 33661429, 38589367, 39178326, 34624217, 37831098, 35180428, 33986538, 37697435, 37139234, 37735199, 37463047, 37777505, 35929179, 37010886, 38761375, 40138394, 39828096, 40108134, 41129225, 40836041, 39779698, 40386679, 40382323
DRIP-seq | Knock down | U2OS (17/12), U87 (2/2), HeLa (9/8), HEK293 (2/2), SHSY5Y (2/2), M231 (6/0), MOLM13 (2/0), KM27 (1/0), tIMEC (4/0), MCF10A (1/1), U251 (1/1), HEK293T (2/2), DMS114 (1/1), MDA-MB-436 (4/4), A375 (1/1), Nalm-6 (2/2), PC-9 (2/2), CD34+ (2/1), Neuroblastoma (6/6) | 67/47 | 32747416, 32669707, 32686621, 32769985, 30060749, 28270613, 38200551, 38184854, 36999631, 35618715, 33661429, 38589367, 34624217, 37831098, 35180428, 37697435, 37735199, 37463047, 35929179, 38761375, 40138394, 39828096, 40108134, 41129225, 39779698, 40386679, 40382323
RDIP-seq | Control | HeLa (2/2), IMR-90 (1/1), HEK293T (1/0) | 4/3 | 30449723, 26579211
RDIP-seq | Knock down | HeLa (2/2) | 2/2 | 30449723
DRIPc-seq | Control | K562 (3/3), HEK293 (2/2), NT2 (2/2), CD4+-AD (3/3), MOLM13 (2/1) | 12/11 | 32439635, 30060749, 27373332, 40823807, 35180428, 33986538, 40836041
DRIPc-seq | Knock down | K562 (4/4), HEK293 (2/2), MOLM13 (2/0) | 8/6 | 32439635, 30060749, 35180428, 33986538
ssDRIP-seq | Control | HeLa (3/3), hVECs (2/2), hESCs (2/2), hiPSCs (2/2), hMSCs (2/2), hNSCs (2/2), hVSMCs (2/2) | 15/15 | 31606733, 32640435
ssDRIP-seq | Knock down | HeLa (3/3) | 3/3 | 31606733
qDRIP-seq | Control | HeLa (3/2), U2OS (2/2) | 5/4 | 32544226, 40447771
qDRIP-seq | Knock down | U2OS (1/1) | 1/1 | 40447771
R-ChIP-SS | Control | HEK293T (5/5), K562 (2/2), HeLa (1/0), HEK293 (2/2), HUH7 (1/0) | 11/9 | 29104020, 32966794, 39019869, 40790042
R-ChIP-SS | Knock down | HEK293 (2/2) | 2/2 | 39019869
MapR | Control | HEK293 (3/3), U87T (2/2), HeLa (2/2), SKM-1 (1/1), CUTLL1 (6/6), hiPSC (2/2), HCT116 (3/3), U2OS (3/3) | 22/22 | 31665646, 34916496, 35061527, 35013239, 40037355, 40613709
MapR | Knock down | HCT116 (3/3), U2OS (3/3), hiPSC (2/2), SKM-1 (1/1), HeLa (2/2), CUTLL1 (2/2) | 13/13 | 34916496, 35061527, 35013239, 40037355, 40613709
R-loop CUT&Tag | Control | HEK293T (7/6), HEK293 (1/1), OMM2.3 (1/1), DLD-1 (2/0), HeLa (4/0) | 15/8 | 33597247, 37557913, 34232287, 37270643, 38858601, 40579572
R-loop CUT&Tag | Knock down | DLD-1 (2/2), HEK293T (2/0), OMM2.3 (1/0) | 5/2 | 37557913, 34232287, 37270643
RIAN-seq | Control | HEK293T (4/4), HeLa (2/2) | 6/6 | 40112807
RIAN-seq | Knock down | HEK293T (2/2) | 2/2 | 40112807
R-ChIP-NS | Control | U2OS (2/2) | 2/2 | 38717338
R-ChIP-NS | Knock down | U2OS (2/0) | 2/0 | 38717338
bisDRIP-seq | Control | MCF-7 (13/13) | 13/13 | 29072160
DRIVE-seq | Control | NT2 (1/1) | 1/1 | 22387027
RR-ChIP | Control | HeLa (2/2) | 2/2 | 31679819
spKAS-seq | Control | HEK293 (6/6), HEK293T (2/2), HepG2 (2/2), K562 (2/2) | 12/12 | 39351875, 36449625
enDR3-ChIP | Control | HeLa (1/1) | 1/1 | 40823807
enDR3-DRIPc-seq | Control | HeLa (3/3) | 3/3 | 40823807
SUM | - | - | 358/285 | -
*number of datasets analyzed / high-quality datasets.
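To make the "analyzed / high-quality" notation concrete, the following sketch parses the Biological Samples cell of one row and adds up both counts. The helper names are hypothetical; the example string is the RDIP-seq control row of Table 1, whose Datasets column reads 4/3.

```python
import re

# Matches entries such as "HeLa (10/9)", "EWS502(1/1)" or "B-cell (1/1*)".
ENTRY = re.compile(r"([^,()]+?)\s*\((\d+)/(\d+)\*?\)")


def parse_samples(cell: str) -> list[tuple[str, int, int]]:
    """Return (sample, analyzed, high_quality) for each entry in a table cell."""
    return [
        (name.strip(), int(analyzed), int(high_quality))
        for name, analyzed, high_quality in ENTRY.findall(cell)
    ]


def row_totals(cell: str) -> tuple[int, int]:
    """Sum analyzed and high-quality dataset counts over one table cell."""
    samples = parse_samples(cell)
    return (sum(s[1] for s in samples), sum(s[2] for s in samples))


rdip_control = "HeLa (2/2), IMR-90 (1/1), HEK293T (1/0)"
print(parse_samples(rdip_control))
print(row_totals(rdip_control))  # (4, 3)
```

The totals agree with the 4/3 given in the Datasets column for that row: four datasets were analyzed and three of them passed quality control.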
Table 2. Meta information for Mouse R-loop mapping data
Technology | Treatment | Biological Samples | Datasets | PMID
DRIP-seq | Control | 3T3 (1/1), E14 (1/1), HEPA1-6 (2/2), Liver (2/2), N2a (3/3), MEF (1/1) | 10/10 | 38219817, 27373332, 33357438, 39369271
DRIP-seq | Knock down | MEF (1/1), N2a (6/6) | 7/7 | 38219817, 33357438
CUT&Tag | Control | CD4+-naive-T-cell (2/1), Leptotene-zygotene (3/3), mESC (4/4), Pachytene-diplotene (3/2), Spermatogonia (3/3), Spleen (2/2), Secondary spermatocytes (3/0) | 20/15 | 36396044, 40504899, 40911696, 41417894
CUT&Tag | Knock down | mESC (8/8), Spleen (2/2), CD4+-naive-T-cell (2/0) | 12/10 | 40504899, 40911696, 41417894
ssDRIP-seq | Control | Retina (4/2), Testis (2/2), iPS (2/2), MEF (2/2) | 10/8 | 32704541, 39824811, 40047526
ssDRIP-seq | Knock down | Testis (2/2) | 2/2 | 39824811
RIAN-seq | Control | mESC (2/2) | 2/2 | 40112807
DRIPc-seq | Control | 3T3 (1/1) | 1/1 | 27373332
MapR | Control | mESC (2/2), B-cells (3/3) | 5/5 | 34937926, 33620319
BisMapR | Control | mESC (2/2) | 2/2 | 33620319
SUM | - | - | 71/62 | -
Table 3. Meta information for Fruit Fly R-loop mapping data
Technology | Treatment | Biological Samples | Datasets | PMID
CUT&Tag | Control | Embryo-E10-14h (1/1), Embryo-E14-16h (4/4), Embryo-E20-22h (2/2), Embryo-E2-4h (2/2), Embryo-E8-10h (2/2), Larva-L3 (2/2) | 13/13 | 39470713
CUT&Tag | Knock down | Embryo-E14-16h (2/2) | 2/2 | 39470713
ssDRIP-seq | Control | Brain (2/2) | 2/2 | 40660165
ssDRIP-seq | Knock down | Brain (2/2) | 2/2 | 40660165
DRIP-seq | Control | 10-14H-Oregon-R-embryos (2/2), 2-6H-Oregon-R-embryos (2/2), DGRP-379-female (2/2), DGRP-379-male (2/2), DGRP-732-female (2/2), DGRP-732-male (2/2), S2 (4/4), w1118-ovaries (2/2) | 18/18 | 32286294, 28819201, 40036798
MapR | Control | Photoreceptor-neuron (9/9) | 9/9 | 35048512
SUM | - | - | 46/46 | -
Table 4. Meta information for Yeast R-loop mapping data
Technology | Treatment | Biological Samples | Datasets | PMID
DRIP-seq | Control | S288C (2/2), WRBb-9D (1/1), BY4741 (1/0) | 4/3 | 29954833, 25357144, 27638543, 40064914
DRIP-seq | Knock down | BY4741 (2/2) | 2/2 | 25357144
S1-DRIP-seq | Control | S288C (4/4) | 4/4 | 27298336
S1-DRIP-seq | Knock down | S288C (4/4) | 4/4 | 27298336
DRIPc-seq | Control | W303-G1 (1/1), W303-S (2/2) | 3/3 | 34294712
DRIPc-seq | Knock down | W303-S (4/4) | 4/4 | 34294712
RIAN-seq | Control | S288C (2/2) | 2/2 | 40112807
SUM | - | - | 23/22 | -