Overview & Recommendations

This document outlines the research plan to identify a robust and production-ready toolchain for wavelet analysis of biological signals. The goal is to find libraries for multi-signal analysis, identify suitable validation datasets, confirm methods for statistical significance, and select a framework for interactive visualization. The chart below summarizes the top candidates against key project requirements.

Top Candidate Feature Matrix

RQ1: Multi-Signal Wavelet Analysis Libraries

The primary goal is to identify Python libraries supporting wavelet coherence and cross-wavelet analysis, which are essential for studying interactions between biological time series. The focus is on production-ready libraries with strong support for the Morlet wavelet, good documentation, and active maintenance.

Library Feature Comparison

Comparison of the leading candidates for core wavelet analysis functionalities.

Feature                      pycwt          PyWavelets   ssqueezepy
Cross-Wavelet / Coherence    ✔ Yes          ✘ No         ✔ Yes
Morlet Wavelet Support       ✔ Yes          ✔ Yes        ✔ Yes
Statistical Significance     ✔ Built-in     ✘ Manual     Partial
Active Maintenance (2023+)   ✔ Yes          ✔ Yes        ✔ Yes
License                      BSD-3-Clause   MIT          MIT
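
To make the "Cross-Wavelet / Coherence" row concrete, here is a minimal NumPy-only sketch of a cross-wavelet transform built on a Morlet wavelet. It is an illustration under stated assumptions, not the API of any library in the table: the function names (`morlet_cwt`, `scale_to_period`), the centre frequency of 6, the 1-hour sampling, and the two synthetic 24-hour signals are all made up for the example. It computes cross-wavelet power and phase only; wavelet coherence additionally requires smoothing in time and scale, which is omitted here.

```python
import numpy as np

OMEGA0 = 6.0  # Morlet centre frequency (assumed; a common default)


def morlet_cwt(signal, dt, scales, omega0=OMEGA0):
    """Continuous wavelet transform with a Morlet wavelet, computed via FFT."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = x.size
    x_hat = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)  # angular frequencies
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Fourier transform of the Morlet wavelet (zero for omega <= 0)
        psi_hat = np.pi ** -0.25 * np.exp(-0.5 * (s * omega - omega0) ** 2)
        psi_hat = psi_hat * (omega > 0)
        psi_hat = psi_hat * np.sqrt(2.0 * np.pi * s / dt)  # energy normalisation
        out[i] = np.fft.ifft(x_hat * psi_hat)
    return out


def scale_to_period(scales, omega0=OMEGA0):
    """Equivalent Fourier period of each Morlet scale."""
    return 4.0 * np.pi * np.asarray(scales) / (omega0 + np.sqrt(2.0 + omega0 ** 2))


# Two hypothetical signals sharing a 24 h rhythm; y lags x by 3 h (pi/4 radians).
dt = 1.0  # hours per sample
t = np.arange(0.0, 240.0, dt)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * t / 24.0) + 0.3 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * t / 24.0 - np.pi / 4) + 0.3 * rng.standard_normal(t.size)

# Choose scales so that the analysed Fourier periods are 8, 9, ..., 48 hours.
periods = np.arange(8.0, 49.0, 1.0)
scales = periods * (OMEGA0 + np.sqrt(2.0 + OMEGA0 ** 2)) / (4.0 * np.pi)

wx = morlet_cwt(x, dt, scales)
wy = morlet_cwt(y, dt, scales)
wxy = wx * np.conj(wy)  # cross-wavelet transform

# Time-averaged cross-wavelet power picks out the shared period.
mean_power = np.abs(wxy).mean(axis=1)
peak_idx = int(np.argmax(mean_power))
peak_period = float(periods[peak_idx])

# The phase of the cross-wavelet transform gives the lead/lag at that period
# (positive means x leads y under this sign convention).
phase = float(np.angle(wxy[peak_idx].mean()))
lag_hours = phase / (2.0 * np.pi) * peak_period

print(f"shared period: {peak_period:.0f} h, x leads y by about {lag_hours:.1f} h")
```

A library marked "Yes" in the first row would be expected to provide this computation, plus the smoothing needed for coherence, without hand-written code.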

RQ2: Validation Datasets

To ensure the reliability of the chosen tools, we must validate them against datasets with known, ground-truth periodicities. This research question focuses on identifying publicly available biological time series, such as circadian rhythm or cell cycle data, that are standard benchmarks in the field.

Public Data Sources

  • CircaDB: A comprehensive database for circadian gene expression data.
  • BioClock Database: Curated datasets from biological clock research.
  • Published Papers: Supplementary data from key publications in the field.
  • Tool Repositories: Example and test datasets bundled with existing libraries.

Validation Criteria

  • Verified Ground Truth: Data must have confirmed periodicities (e.g., 24h for circadian).
  • Clear Protocols: Use validation methods from published studies.
  • Error Margins: Define acceptable error for period detection.
  • Synthetic Data: Use signal generators for controlled testing scenarios.
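
The criteria above can be sketched as a small synthetic-data check. This is an illustrative harness, not a prescribed protocol: the helper names, the noise level, and the 5% relative tolerance are assumptions chosen for the example, and the zero-padded periodogram is only a stand-in for whichever wavelet library is under evaluation.

```python
import numpy as np


def make_synthetic_rhythm(period_h=24.0, duration_h=240.0, dt_h=1.0,
                          noise_sd=0.5, seed=0):
    """Noisy sinusoid with a known period, used as ground truth."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, duration_h, dt_h)
    signal = np.sin(2.0 * np.pi * t / period_h)
    signal = signal + noise_sd * rng.standard_normal(t.size)
    return t, signal


def dominant_period(signal, dt_h):
    """Stand-in detector: period of the largest periodogram peak."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n_fft = 16 * x.size  # zero-padding refines the frequency grid
    power = np.abs(np.fft.rfft(x, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=dt_h)
    k = 1 + int(np.argmax(power[1:]))  # skip the zero-frequency bin
    return 1.0 / freqs[k]


def within_tolerance(detected, truth, rel_tol=0.05):
    """Pass/fail against an assumed acceptable relative error."""
    return abs(detected - truth) <= rel_tol * truth


true_period = 24.0
t, sig = make_synthetic_rhythm(period_h=true_period)
detected = dominant_period(sig, dt_h=1.0)
passed = within_tolerance(detected, true_period)
print(f"detected period: {detected:.2f} h, within tolerance: {passed}")
```

Swapping `dominant_period` for a call into a candidate library would turn this into a regression test with a fixed seed and an explicit error margin.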

RQ3: Statistical Significance Testing

A critical step in wavelet analysis is determining whether observed periodicities are statistically significant or likely due to random chance. The observed wavelet power is tested against a null hypothesis, most often a red-noise process modelled as a first-order autoregressive, AR(1), series. We need to identify libraries with built-in or easily adaptable methods for these tests.
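
As a rough sketch of what such a test involves, the snippet below estimates the lag-1 autocorrelation of a series, evaluates the theoretical AR(1) background spectrum at a set of Fourier periods, and scales it by the chi-square quantile for two degrees of freedom to obtain a 95% power threshold, following the widely used Torrence and Compo (1998) formulation. The function names and the simulated series are assumptions for illustration, and the threshold is only comparable to wavelet power computed with the matching normalisation (white noise of variance σ² giving expected power σ²).

```python
import numpy as np


def lag1_autocorrelation(x):
    """Estimate the AR(1) coefficient from the lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x * x))


def red_noise_spectrum(alpha, periods, dt):
    """Normalised theoretical AR(1) power at the given Fourier periods."""
    freq = dt / np.asarray(periods, dtype=float)  # cycles per sample
    return (1.0 - alpha ** 2) / (
        1.0 + alpha ** 2 - 2.0 * alpha * np.cos(2.0 * np.pi * freq)
    )


def significance_threshold(x, periods, dt, level=0.95):
    """Wavelet power above this value rejects the AR(1) null at `level`."""
    alpha = lag1_autocorrelation(x)
    # Exact chi-square quantile for 2 degrees of freedom (complex wavelet).
    chi2_2dof = -2.0 * np.log(1.0 - level)
    background = red_noise_spectrum(alpha, periods, dt)
    return np.var(x) * background * chi2_2dof / 2.0


# Hypothetical red-noise series with a known AR(1) coefficient of 0.7.
rng = np.random.default_rng(1)
n, true_alpha = 2000, 0.7
noise = np.empty(n)
noise[0] = rng.standard_normal()
for i in range(1, n):
    noise[i] = true_alpha * noise[i - 1] + rng.standard_normal()

periods = np.array([4.0, 12.0, 24.0, 48.0])
alpha_hat = lag1_autocorrelation(noise)
threshold = significance_threshold(noise, periods, dt=1.0)
print(f"estimated alpha: {alpha_hat:.2f}")
print("95% thresholds by period:", np.round(threshold, 2))
```

Because red noise concentrates power at long periods, the threshold rises with period: a long-period peak must be considerably stronger than a short-period one to count as significant.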

Significance Testing Methods in Libraries

Comparing built-in support for standard null hypothesis testing methods.

RQ4: Interactive Visualization Libraries

The output of a wavelet analysis is often a 2D scalogram (a heatmap of power across time and frequency). For effective exploration, these visualizations must be interactive. This research area focuses on identifying JavaScript or Python libraries capable of producing responsive scalograms with features like tooltips, overlays for significance, and the Cone of Influence (COI).

Example Interactive Scalogram (Plotly.js)

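The demonstration chart is an interactive heatmap in which hovering over a cell shows its values. The shaded area represents a simulated Cone of Influence (COI), which separates the reliable part of the scalogram from the region near the ends of the series where edge effects make results less reliable.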
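
A scalogram of this kind can be described as a plain Plotly figure specification. The sketch below builds one in Python as a dictionary and serialises it to JSON. It rests on several assumptions: the power surface is a placeholder ridge at 24 hours rather than the output of a real transform, the COI boundary uses the Morlet e-folding relation with a centre frequency of 6, and the variable names and styling are illustrative only.

```python
import json

import numpy as np

# Placeholder scalogram: a ridge of power centred on a 24 h period.
t = np.arange(0.0, 240.0, 1.0)       # time in hours
periods = np.arange(8.0, 49.0, 1.0)  # Fourier periods in hours
ridge = np.exp(-0.5 * ((periods - 24.0) / 3.0) ** 2)
power = ridge[:, None] * np.ones(t.size)[None, :]

# COI boundary for a Morlet wavelet: the longest period not yet
# affected by edge effects at each time point.
omega0 = 6.0
fourier_factor = 4.0 * np.pi / (omega0 + np.sqrt(2.0 + omega0 ** 2))
edge_distance = np.minimum(t - t[0], t[-1] - t)
coi = fourier_factor / np.sqrt(2.0) * edge_distance
coi = np.clip(coi, periods[0], periods[-1])  # keep the line inside the plot

figure = {
    "data": [
        {
            "type": "heatmap",
            "x": t.tolist(),
            "y": periods.tolist(),
            "z": power.tolist(),
            "colorscale": "Viridis",
            "hovertemplate": (
                "time %{x} h<br>period %{y} h<br>power %{z:.2f}<extra></extra>"
            ),
        },
        {
            "type": "scatter",
            "mode": "lines",
            "name": "COI",
            "x": t.tolist(),
            "y": coi.tolist(),
            "line": {"color": "white", "dash": "dash"},
            "hoverinfo": "skip",
        },
    ],
    "layout": {
        "title": {"text": "Wavelet power (placeholder data)"},
        "xaxis": {"title": {"text": "Time (h)"}},
        "yaxis": {"title": {"text": "Period (h)"}},
    },
}

figure_json = json.dumps(figure)
```

The resulting JSON carries the `data` and `layout` that Plotly.js expects, so the same specification can be rendered in the browser or loaded by the Plotly Python package. In this sketch, periods above the dashed line at a given time are the ones affected by edge effects.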