Q5. How is the quantitative R-loop score calculated?
The reported locations of R-loops in the human genome differ substantially depending on which
R-loop mapping technology is used. Because mis-assigned R-loops lead to misleading conclusions,
defining a reference human R-loop map is a high priority for the R-loop research community.
It is currently premature to conclude which technology maps R-loops most precisely. However,
it is plausible that R-loop peaks that have higher scores and are supported by
multiple technologies are more likely to be bona fide R-loops.
Data Preparation
First, to eliminate the influence of inter-sample score differences, we applied a modified Robust Z-score to bring the signal values of each sample to a comparable level. The Robust Z-score was calculated as (R-loop peak score − lower fence) / MAD, where MAD is the median absolute deviation of the peak scores in that sample. Values greater than 10 were capped at 10, which completed the per-sample normalization.
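The normalization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's actual code: the text does not define the lower fence, so the sketch assumes Tukey's lower fence (Q1 − 1.5 × IQR), and the function name `robust_z` and the quartile method are hypothetical choices.

```python
import statistics


def robust_z(scores, cap=10.0):
    """Modified Robust Z-score for the peak scores of one sample.

    ASSUMPTION: the lower fence is Tukey's fence, Q1 - 1.5 * IQR.
    The source text names a "lower fence" but does not define it.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4)
    lower_fence = q1 - 1.5 * (q3 - q1)

    # MAD: median of absolute deviations from the sample median.
    median = statistics.median(scores)
    mad = statistics.median(abs(s - median) for s in scores)
    if mad == 0:
        raise ValueError("MAD is zero; scores cannot be scaled")

    # (score - lower fence) / MAD, with values above `cap` set to `cap`.
    return [min((s - lower_fence) / mad, cap) for s in scores]


if __name__ == "__main__":
    peak_scores = [1, 2, 3, 4, 5, 6, 7, 100]
    print(robust_z(peak_scores))
```

Subtracting the lower fence rather than the median shifts most peaks to positive values, and the cap keeps a single extreme peak (100 in the example) from dominating later sums.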
Calculation of technology score
Second, to facilitate the summation of signals across samples, each R-loop peak signal was assigned to every 100-bp sliding window that it overlapped by ≥50 bp. The sliding window advanced in 10-bp steps, and the signals within each window were summed, yielding per-sample scores in contiguous 10-bp genomic windows. Within the same technology, each sample was given equal weight, and the window scores of all samples were summed directly to obtain the technology score. For stranded R-loop technologies, strand information was retained during computation to define the strandedness of the final results; in addition, Watson and Crick peaks were merged so that their scores could be summed with those of non-stranded R-loop technologies.
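A minimal Python sketch of the window assignment and the per-technology summation follows. It assumes half-open peak coordinates `(start, end, signal)` on a single chromosome and window starts at multiples of the step size; the names `window_scores` and `technology_score` are hypothetical and not taken from the actual pipeline.

```python
from collections import defaultdict


def window_scores(peaks, window=100, step=10, min_overlap=50):
    """Sum normalized peak signals into sliding windows for one sample.

    peaks: iterable of (start, end, signal), half-open coordinates.
    Returns {window_start: summed signal} for windows that received signal.
    """
    totals = defaultdict(float)
    for start, end, signal in peaks:
        # Leftmost window start (a multiple of `step`) that can touch the peak.
        first = max(0, (start - window) // step * step)
        for w_start in range(first, end, step):
            overlap = min(end, w_start + window) - max(start, w_start)
            if overlap >= min_overlap:
                totals[w_start] += signal
    return dict(totals)


def technology_score(samples, **kwargs):
    """Sum window scores over all samples of one technology (equal weight)."""
    total = defaultdict(float)
    for peaks in samples:
        for w_start, score in window_scores(peaks, **kwargs).items():
            total[w_start] += score
    return dict(total)


if __name__ == "__main__":
    sample_a = [(100, 200, 2.0)]
    sample_b = [(150, 260, 1.0)]
    print(technology_score([sample_a, sample_b]))
```

In this sketch, a 100-bp peak at 100 to 200 contributes to the eleven windows starting at 50 through 150, because those are the windows it overlaps by at least 50 bp.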
Calculation of R-loop score and R-loop zone
When computing the R-loop score across all technologies, each technology was assigned equal weight and the technology scores were summed. The summed window score was then multiplied by the number of technologies with a score > 0 in that window to yield the non-stranded R-loop score. Contiguous regions with a score > 0 were trimmed by removing the portion below 25% of the maximum score within the region, producing non-stranded R-loop regions. The same procedure was applied to stranded R-loop technologies to obtain stranded R-loop scores and stranded R-loop regions. Non-stranded R-loop scores that overlapped with Watson or Crick R-loop scores were proportionally assigned to the Watson or Crick strand. Non-stranded R-loop regions that overlapped with stranded R-loop regions were likewise assigned to Watson or Crick R-loop zones. The remaining non-stranded R-loop scores and regions were considered to have undetermined strandedness.
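The scoring and trimming steps can be illustrated with the short Python sketch below. It assumes that the technology scores are already aligned on the same windows (one list per technology) and that "trimming" removes low-scoring windows from the two ends of each region only; the text does not say how low-scoring windows inside a region are handled. The function names are hypothetical, and the strand assignment step is not covered.

```python
def rloop_score(tech_scores):
    """Combine window-aligned technology scores into R-loop scores.

    tech_scores: list of equal-length lists, one per technology.
    Each window's summed score is multiplied by the number of
    technologies with a score > 0 in that window.
    """
    combined = []
    for column in zip(*tech_scores):
        support = sum(1 for s in column if s > 0)
        combined.append(sum(column) * support)
    return combined


def trimmed_regions(scores, fraction=0.25):
    """Return (start, end) window-index ranges of trimmed regions.

    ASSUMPTION: only the flanks of each contiguous run with score > 0
    are trimmed, up to the first window reaching fraction * run maximum.
    """
    regions = []
    i, n = 0, len(scores)
    while i < n:
        if scores[i] <= 0:
            i += 1
            continue
        j = i
        while j < n and scores[j] > 0:
            j += 1  # [i, j) is one contiguous run with score > 0
        threshold = fraction * max(scores[i:j])
        lo, hi = i, j
        while scores[lo] < threshold:
            lo += 1
        while scores[hi - 1] < threshold:
            hi -= 1
        regions.append((lo, hi))
        i = j
    return regions


if __name__ == "__main__":
    tech_a = [0, 1, 4, 2, 0, 3]
    tech_b = [0, 0, 6, 1, 0, 0]
    scores = rloop_score([tech_a, tech_b])
    print(scores, trimmed_regions(scores))
```

Multiplying by the number of supporting technologies rewards windows detected by several methods: in the example, the window scored 4 and 6 by two technologies becomes (4 + 6) × 2 = 20, while a window scored 3 by a single technology stays at 3.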