
Log-Transformed Actual Rates vs. Integer Grid Indices #160

@ArtPoon


Right now, the authors use arbitrary integer grid indices (1 to 20) as coordinates for the Wasserstein distance calculation. A cleaner approach is to use the actual FUBAR rate coordinates (alpha and beta), which lie on a continuous scale, and apply a variance-stabilizing transformation such as log(rate + 0.05). I re-ran the entire analysis using log-rate coordinates, with the following results:

  • The new distance matrix correlates very strongly with the index-based one (r = 0.966), so the overall geometry is preserved.
  • In the sequential PERMANOVA, however, the log-rate coordinates reduce the variance explained by the technical confounder log(ncod) from 58.3% to 51.4% and increase the variance explained by the biological interaction term from 0.56% to 0.86%. The interaction term also becomes more significant (P = 0.0089 with log rates vs. P = 0.025 with indices).
  • In their 2D residualized space, the interaction term explains 5.5% of the variance (compared to 3.7% using indices), with P = 0.0002.

This suggests that the integer grid indices introduce discretization artifacts that compound the confounding and weaken the biological signal. The authors should consider switching to continuous log-transformed rate coordinates.
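
To illustrate why the choice of coordinates matters, here is a minimal sketch in Python. It is a one-dimensional simplification (the actual analysis uses the two-dimensional alpha/beta grid), and the rate values, the `wasserstein_1d` helper, and the point-mass distributions are all hypothetical, chosen only for illustration. They are not the FUBAR grid values or the code used for the re-analysis.

```python
import math

def wasserstein_1d(coords, p, q):
    """Wasserstein-1 distance between probability vectors p and q that
    share the same sorted 1D support points `coords`.

    In one dimension this equals the sum of |CDF_p - CDF_q| over each
    gap between adjacent support points, weighted by the gap width.
    """
    total = 0.0
    cdf_gap = 0.0
    for i in range(len(coords) - 1):
        cdf_gap += p[i] - q[i]
        total += abs(cdf_gap) * (coords[i + 1] - coords[i])
    return total

# Hypothetical, unevenly spaced rate grid (NOT the actual FUBAR grid).
rates = [0.0, 0.1, 0.5, 1.0, 5.0]

# Two coordinate systems for the same grid cells.
index_coords = list(range(1, len(rates) + 1))      # 1, 2, ..., 5
log_coords = [math.log(r + 0.05) for r in rates]   # log(rate + 0.05)

# Point masses on single grid cells, to make the geometry easy to read.
low_a = [1.0, 0.0, 0.0, 0.0, 0.0]   # all mass at rate 0.0
low_b = [0.0, 1.0, 0.0, 0.0, 0.0]   # all mass at rate 0.1
high_a = [0.0, 0.0, 0.0, 1.0, 0.0]  # all mass at rate 1.0
high_b = [0.0, 0.0, 0.0, 0.0, 1.0]  # all mass at rate 5.0

# With integer indices, every step between adjacent cells costs the same.
d_index_low = wasserstein_1d(index_coords, low_a, low_b)
d_index_high = wasserstein_1d(index_coords, high_a, high_b)

# With log-rate coordinates, the cost reflects the actual rate spacing.
d_log_low = wasserstein_1d(log_coords, low_a, low_b)
d_log_high = wasserstein_1d(log_coords, high_a, high_b)

print("index coords:", d_index_low, d_index_high)
print("log coords:  ", round(d_log_low, 4), round(d_log_high, 4))
```

With index coordinates both pairs are exactly one unit apart, whereas with log-rate coordinates the two distances differ because the underlying rates are not evenly spaced. Any grid whose rates are unevenly spaced on the log scale will show the same kind of discrepancy.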
