In the current release of R-loopBase, we integrated 358 datasets generated in human cells by 16 different technologies (Table 1), 71 datasets in mouse cells by 7 different technologies, 46 datasets in fruit fly cells by 4 different technologies, and 23 datasets in yeast cells by 4 different technologies (for meta information, please refer to Download). We then developed a comprehensive workflow for quality control and data analysis.

Briefly, technical replicates, where present, were merged first, and raw sequencing data were then mapped to the reference genome (hg38/mm39/dm6/sacCer3) using Bowtie2 in local alignment mode. Uniquely mapped, non-redundant reads were kept as useful reads, and samples with >7M useful reads were considered to have sufficient read counts. To make maximal use of the sequencing data, biological replicates with <7M useful reads were merged to reach the minimum read count cutoff, provided they were highly correlated (Spearman correlation coefficient >0.5). For mouse, fruit fly, and yeast, because of the relatively small data volumes, samples with <7M useful reads were not filtered out. Finally, peak calling was done with MACS2 on all useful reads (DRIP-seq, DRIVE-seq, MapR, R-loop CUT&Tag, R-ChIP-NS, enDR3-ChIP and S1-DRIP-seq) or on useful reads from the Watson and Crick strands separately (DRIPc-seq, RDIP-seq, ssDRIP-seq, qDRIP-seq, R-ChIP-SS, RR-ChIP, enDR3-DRIPc-seq and BisMapR), using a q-value cutoff of 0.01 for narrow peaks (R-ChIP-SS, R-ChIP-NS, R-loop CUT&Tag and RIAN-seq) and 0.05 for broad peaks (DRIP-seq, DRIPc-seq, RDIP-seq, ssDRIP-seq, qDRIP-seq, DRIVE-seq, MapR, RR-ChIP, enDR3-ChIP, enDR3-DRIPc-seq, BisMapR and S1-DRIP-seq). If multiple biological replicates existed, peaks with ≥50 bp overlap among ≥2 replicates were merged and taken as reproducible peaks. Human samples with <100 peaks, mouse samples with <50 peaks, and fruit fly and yeast samples with <10 peaks were discarded.
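The peak-calling rules above can be summarized in a short Python sketch. It is illustrative only and is not the pipeline code used for R-loopBase: the function names (`peak_settings`, `reproducible_peaks`, `keep_sample`) and the `(chrom, start, end)` tuple representation are assumptions made for this example, and the exact way overlapping replicate peaks are merged is not specified in the text, so the union-of-overlapping-pairs strategy shown here is one plausible reading.

```python
from itertools import combinations

# Thresholds quoted in the text above.
MIN_OVERLAP_BP = 50
MIN_PEAKS = {"human": 100, "mouse": 50, "fruit fly": 10, "yeast": 10}

# Technology groupings exactly as listed in the text.
NARROW = {"R-ChIP-SS", "R-ChIP-NS", "R-loop CUT&Tag", "RIAN-seq"}
BROAD = {"DRIP-seq", "DRIPc-seq", "RDIP-seq", "ssDRIP-seq", "qDRIP-seq",
         "DRIVE-seq", "MapR", "RR-ChIP", "enDR3-ChIP", "enDR3-DRIPc-seq",
         "BisMapR", "S1-DRIP-seq"}
ALL_READS = {"DRIP-seq", "DRIVE-seq", "MapR", "R-loop CUT&Tag",
             "R-ChIP-NS", "enDR3-ChIP", "S1-DRIP-seq"}
PER_STRAND = {"DRIPc-seq", "RDIP-seq", "ssDRIP-seq", "qDRIP-seq",
              "R-ChIP-SS", "RR-ChIP", "enDR3-DRIPc-seq", "BisMapR"}


def peak_settings(technology):
    """Look up peak type, q-value cutoff and strand handling for a technology."""
    if technology in NARROW:
        peak_type, q_cutoff = "narrow", 0.01
    elif technology in BROAD:
        peak_type, q_cutoff = "broad", 0.05
    else:
        # e.g. bisDRIP-seq and spKAS-seq, which are not peak-called.
        raise KeyError(technology)
    if technology in PER_STRAND:
        per_strand = True
    elif technology in ALL_READS:
        per_strand = False
    else:
        per_strand = None  # not stated in the text for this technology
    return {"peak_type": peak_type, "q_cutoff": q_cutoff, "per_strand": per_strand}


def overlap_bp(a, b):
    """Overlapping bases between two half-open (chrom, start, end) intervals."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def reproducible_peaks(replicates, min_overlap=MIN_OVERLAP_BP):
    """Merge peaks supported by at least two replicates (assumed strategy)."""
    supported = []
    for i, j in combinations(range(len(replicates)), 2):
        for a in replicates[i]:
            for b in replicates[j]:
                if overlap_bp(a, b) >= min_overlap:
                    # Keep the union of the two overlapping peaks.
                    supported.append((a[0], min(a[1], b[1]), max(a[2], b[2])))
    merged = []
    for chrom, start, end in sorted(supported):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def keep_sample(species, n_peaks):
    """Samples with fewer peaks than the species threshold are discarded."""
    return n_peaks >= MIN_PEAKS[species]
```

For example, two replicates with peaks `("chr1", 100, 300)` and `("chr1", 200, 400)` overlap by 100 bp and would yield one reproducible peak spanning 100 to 400, whereas peaks overlapping by only 20 bp would not be retained. The nested loops are quadratic and chosen for readability; a production pipeline would more likely use an interval tool such as bedtools.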
Following ENCODE guidelines for ChIP-seq data analysis, we further calculated the signal portion of tags (SPOT) and reads in blacklisted regions (RiBL) as part of the quality-control metrics for users' reference. Only peaks outside ChIP-seq blacklisted regions were used for downstream analysis. Notably, bisDRIP-seq and spKAS-seq data are not readily amenable to peak calling; we instead uploaded their processed signal tracks to our genome browser for visualization and comparison with other R-loop mapping data. In total, 358 datasets for 55 human cells generated by 16 different technologies (Table 1), 71 datasets for 17 mouse cells generated by 7 different technologies (Table 2), 46 datasets for 16 fruit fly cells generated by 4 different technologies (Table 3), and 23 datasets for 5 yeast cells generated by 4 different technologies (Table 4) have been included in the current release of R-loopBase.
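As a rough illustration of the two quality-control measures and the blacklist filter, the Python sketch below computes the fraction of reads falling in a set of regions and removes peaks that touch a blacklisted interval. It rests on several assumptions that the text does not confirm: that RiBL is reported as a fraction of reads, that SPOT can be approximated as the fraction of reads overlapping called peaks, and that a peak is excluded if it overlaps a blacklisted region by even one base. The function names and the `(chrom, start, end)` tuples are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_in_regions(reads, regions):
    """Fraction of reads overlapping any region (assumed RiBL/SPOT-style metric)."""
    if not reads:
        return 0.0
    hits = sum(any(overlaps(read, region) for region in regions) for read in reads)
    return hits / len(reads)


def drop_blacklisted(peaks, blacklist):
    """Keep only peaks that do not overlap any blacklisted interval (assumed rule)."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]
```

With four reads of which one overlaps the blacklist, `fraction_in_regions(reads, blacklist)` returns 0.25; passing the peak list instead of the blacklist gives the SPOT-like value under the stated approximation.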
Table 1. Meta information for human R-loop mapping data

| Technology | Treatment | Biological Samples | Datasets | PMID |
|---|---|---|---|---|
| DRIP-seq | Control | B-cell (1/1*), CHLA10 (1/1), EWS502 (1/1), HeLa (10/9), HEK293 (2/2), SHSY5Y (2/2), TC32 (1/1), Stromal (4/4), Basal-epithelial (4/4), Luminal-progenitor (4/4), Mature-luminal-epithelial (4/4), MCF-7 (1/1), NT2 (6/6), K562 (3/1), Primary-fibroblast (2/2), U2OS (18/14), U87 (2/2), Jurkat (2/0), T-cells (2/0), IMR-90 (1/0), HEK293T (3/2), A375 (1/1), M231 (4/4), CD34+ (4/2), CD4+-HD (1/1), DMS114 (1/1), HCT116 (4/4), MCF10A (2/1), MDA-MB-436 (2/2), MOLM13 (3/2), Nalm-6 (2/2), Neuroblastoma (6/6), PC-9 (2/2), SNB-19 (3/2), Sperm (4/1), U251 (1/1), Blood (2/2), foreskin-fibroblast (1/0), KM27 (1/0), HFK (2/0), CIN612 (2/0), tIMEC (4/0) | 126/95 | 30108179, 32669707, 32769985, 28802045, 30060749, 28270613, 28649985, 27552054, 27373332, 23868195, 22387027, 26182405, 32747416, 30591567, 29416069, 29416038, 32439635, 32398827, 32686621, 28341774, 32615088, 38267456, 38200551, 38787953, 36453989, 38184854, 36999631, 35618715, 33661429, 38589367, 39178326, 34624217, 37831098, 35180428, 33986538, 37697435, 37139234, 37735199, 37463047, 37777505, 35929179, 37010886, 38761375, 40138394, 39828096, 40108134, 41129225, 40836041, 39779698, 40386679, 40382323 |
| DRIP-seq | Knock down | U2OS (17/12), U87 (2/2), HeLa (9/8), HEK293 (2/2), SHSY5Y (2/2), M231 (6/0), MOLM13 (2/0), KM27 (1/0), tIMEC (4/0), MCF10A (1/1), U251 (1/1), HEK293T (2/2), DMS114 (1/1), MDA-MB-436 (4/4), A375 (1/1), Nalm-6 (2/2), PC-9 (2/2), CD34+ (2/1), Neuroblastoma (6/6) | 67/47 | 32747416, 32669707, 32686621, 32769985, 30060749, 28270613, 38200551, 38184854, 36999631, 35618715, 33661429, 38589367, 34624217, 37831098, 35180428, 37697435, 37735199, 37463047, 35929179, 38761375, 40138394, 39828096, 40108134, 41129225, 39779698, 40386679, 40382323 |
| RDIP-seq | Control | HeLa (2/2), IMR-90 (1/1), HEK293T (1/0) | 4/3 | 30449723, 26579211 |
| RDIP-seq | Knock down | HeLa (2/2) | 2/2 | 30449723 |
| DRIPc-seq | Control | K562 (3/3), HEK293 (2/2), NT2 (2/2), CD4+-AD (3/3), MOLM13 (2/1) | 12/11 | 32439635, 30060749, 27373332, 40823807, 35180428, 33986538, 40836041 |
| DRIPc-seq | Knock down | K562 (4/4), HEK293 (2/2), MOLM13 (2/0) | 8/6 | 32439635, 30060749, 35180428, 33986538 |
| ssDRIP-seq | Control | HeLa (3/3), hVECs (2/2), hESCs (2/2), hiPSCs (2/2), hMSCs (2/2), hNSCs (2/2), hVSMCs (2/2) | 15/15 | 31606733, 32640435 |
| ssDRIP-seq | Knock down | HeLa (3/3) | 3/3 | 31606733 |
| qDRIP-seq | Control | HeLa (3/2), U2OS (2/2) | 5/4 | 32544226, 40447771 |
| qDRIP-seq | Knock down | U2OS (1/1) | 1/1 | 40447771 |
| R-ChIP-SS | Control | HEK293T (5/5), K562 (2/2), HeLa (1/0), HEK293 (2/2), HUH7 (1/0) | 11/9 | 29104020, 32966794, 39019869, 40790042 |
| R-ChIP-SS | Knock down | HEK293 (2/2) | 2/2 | 39019869 |
| MapR | Control | HEK293 (3/3), U87T (2/2), HeLa (2/2), SKM-1 (1/1), CUTLL1 (6/6), hiPSC (2/2), HCT116 (3/3), U2OS (3/3) | 22/22 | 31665646, 34916496, 35061527, 35013239, 40037355, 40613709 |
| MapR | Knock down | HCT116 (3/3), U2OS (3/3), hiPSC (2/2), SKM-1 (1/1), HeLa (2/2), CUTLL1 (2/2) | 13/13 | 34916496, 35061527, 35013239, 40037355, 40613709 |
| R-loop CUT&Tag | Control | HEK293T (7/6), HEK293 (1/1), OMM2.3 (1/1), DLD-1 (2/0), HeLa (4/0) | 15/8 | 33597247, 37557913, 34232287, 37270643, 38858601, 40579572 |
| R-loop CUT&Tag | Knock down | DLD-1 (2/2), HEK293T (2/0), OMM2.3 (1/0) | 5/2 | 37557913, 34232287, 37270643 |
| RIAN-seq | Control | HEK293T (4/4), HeLa (2/2) | 6/6 | 40112807 |
| RIAN-seq | Knock down | HEK293T (2/2) | 2/2 | 40112807 |
| R-ChIP-NS | Control | U2OS (2/2) | 2/2 | 38717338 |
| R-ChIP-NS | Knock down | U2OS (2/0) | 2/0 | 38717338 |
| bisDRIP-seq | Control | MCF-7 (13/13) | 13/13 | 29072160 |
| DRIVE-seq | Control | NT2 (1/1) | 1/1 | 22387027 |
| RR-ChIP | Control | HeLa (2/2) | 2/2 | 31679819 |
| spKAS-seq | Control | HEK293 (6/6), HEK293T (2/2), HepG2 (2/2), K562 (2/2) | 12/12 | 39351875, 36449625 |
| enDR3-ChIP | Control | HeLa (1/1) | 1/1 | 40823807 |
| enDR3-DRIPc-seq | Control | HeLa (3/3) | 3/3 | 40823807 |
| SUM | - | - | 358/285 | - |

Table 2. Meta information for mouse R-loop mapping data

| Technology | Treatment | Biological Samples | Datasets | PMID |
|---|---|---|---|---|
| DRIP-seq | Control | 3T3 (1/1), E14 (1/1), HEPA1-6 (2/2), Liver (2/2), N2a (3/3), MEF (1/1) | 10/10 | 38219817, 27373332, 33357438, 39369271 |
| DRIP-seq | Knock down | MEF (1/1), N2a (6/6) | 7/7 | 38219817, 33357438 |
| CUT&Tag | Control | CD4+-naive-T-cell (2/1), Leptotene-zygotene (3/3), mESC (4/4), Pachytene-diplotene (3/2), Spermatogonia (3/3), Spleen (2/2), Secondary spermatocytes (3/0) | 20/15 | 36396044, 40504899, 40911696, 41417894 |
| CUT&Tag | Knock down | mESC (8/8), Spleen (2/2), CD4+-naive-T-cell (2/0) | 12/10 | 40504899, 40911696, 41417894 |
| ssDRIP-seq | Control | Retina (4/2), Testis (2/2), iPS (2/2), MEF (2/2) | 10/8 | 32704541, 39824811, 40047526 |
| ssDRIP-seq | Knock down | Testis (2/2) | 2/2 | 39824811 |
| RIAN-seq | Control | mESC (2/2) | 2/2 | 40112807 |
| DRIPc-seq | Control | 3T3 (1/1) | 1/1 | 27373332 |
| MapR | Control | mESC (2/2), B-cells (3/3) | 5/5 | 34937926, 33620319 |
| BisMapR | Control | mESC (2/2) | 2/2 | 33620319 |
| SUM | - | - | 71/62 | - |

Table 3. Meta information for fruit fly R-loop mapping data

| Technology | Treatment | Biological Samples | Datasets | PMID |
|---|---|---|---|---|
| CUT&Tag | Control | Embryo-E10-14h (1/1), Embryo-E14-16h (4/4), Embryo-E20-22h (2/2), Embryo-E2-4h (2/2), Embryo-E8-10h (2/2), Larva-L3 (2/2) | 13/13 | 39470713 |
| CUT&Tag | Knock down | Embryo-E14-16h (2/2) | 2/2 | 39470713 |
| ssDRIP-seq | Control | Brain (2/2) | 2/2 | 40660165 |
| ssDRIP-seq | Knock down | Brain (2/2) | 2/2 | 40660165 |
| DRIP-seq | Control | 10-14H-Oregon-R-embryos (2/2), 2-6H-Oregon-R-embryos (2/2), DGRP-379-female (2/2), DGRP-379-male (2/2), DGRP-732-female (2/2), DGRP-732-male (2/2), S2 (4/4), w1118-ovaries (2/2) | 18/18 | 32286294, 28819201, 40036798 |
| MapR | Control | Photoreceptor-neuron (9/9) | 9/9 | 35048512 |
| SUM | - | - | 46/46 | - |

Table 4. Meta information for yeast R-loop mapping data

| Technology | Treatment | Biological Samples | Datasets | PMID |
|---|---|---|---|---|
| DRIP-seq | Control | S288C (2/2), WRBb-9D (1/1), BY4741 (1/0) | 4/3 | 29954833, 25357144, 27638543, 40064914 |
| DRIP-seq | Knock down | BY4741 (2/2) | 2/2 | 25357144 |
| S1-DRIP-seq | Control | S288C (4/4) | 4/4 | 27298336 |
| S1-DRIP-seq | Knock down | S288C (4/4) | 4/4 | 27298336 |
| DRIPc-seq | Control | W303-G1 (1/1), W303-S (2/2) | 3/3 | 34294712 |
| DRIPc-seq | Knock down | W303-S (4/4) | 4/4 | 34294712 |
| RIAN-seq | Control | S288C (2/2) | 2/2 | 40112807 |
| SUM | - | - | 23/22 | - |