Longitudinal imaging and biostatistical methods
Organizer: Ivo Dinov, University of Michigan
Chair: Julia Fisher, University of Arizona, BIO5 Institute, Statistics Consulting Laboratory
Presenter: Sharmistha Guha, Texas A&M University
Title: Supervised modeling of heterogeneous networks: investigating functional connectivity across various cognitive control tasks
Abstract: We present a novel Bayesian approach to address limitations in current methods for studying the relationship between functional connectivity across cognitive control domains and cognitive phenotypes. Our integrated framework jointly learns heterogeneous networks with vector-valued predictors, overcoming the constraints of treating each network independently in regression analysis. By assuming shared nodes across networks with varying interconnections, our method captures complex relationships while offering uncertainty quantification. Theoretical analysis demonstrates convergence to the true data-generating density, supported by empirical studies showcasing superior performance over existing approaches.
Presenter: Hossein Moradi, South Dakota State University
Title: Tensor regression for brain imaging data
Abstract: Multidimensional array data, also called tensors, are used in neuroimaging and other big data applications. In this paper, we propose a parsimonious Bayesian Tensor linear model for neuroimaging study with brain image as a response and a vector of predictors. Our method provides estimates for the parameters of interest by using an Envelope method. The proposed method characterizes different sources of uncertainty and the inference is performed using Markov Chain Monte Carlo (MCMC). We demonstrate posterior consistency and develop a computationally efficient MCMC algorithm for posterior computation using Gibbs sampling. The effectiveness of our approach is illustrated through simulation studies and analysis of alcohol addiction's effect on brain connectivity.
Presenter: Ranjan Maitra, Iowa State University
Title: Reduced-Rank Tensor-on-Tensor Regression and Tensor-Variate Analysis of Variance
Abstract: Fitting regression models with many multivariate responses and covariates can be challenging, but such responses and covariates sometimes have tensor-variate structure. We extend the classical multivariate regression model to exploit such structure in two ways. First, we impose four types of low-rank tensor formats on the regression coefficients. Second, we model the errors using the tensor-variate normal distribution, which imposes a Kronecker separable format on the covariance matrix. We obtain maximum likelihood estimators via block-relaxation algorithms and derive their computational complexity and asymptotic distributions. Our regression framework enables us to formulate tensor-variate analysis of variance (TANOVA) methodology. This methodology, when applied in a one-way TANOVA layout, enables us to identify cerebral regions significantly associated with the interaction of suicide attempters or non-attempter ideators and positive-, negative- or death-connoting words in a functional Magnetic Resonance Imaging study. Another application uses three-way TANOVA on the Labeled Faces in the Wild image dataset to distinguish facial characteristics related to ethnic origin, age group and gender. An R package, totr, implements the methodology.
Presenter: Daniel Rowe, Marquette University
Title: Bayesian k-space estimation for fMRI
Abstract: In fMRI, as voxel sizes decrease, there is less tissue to produce a signal, resulting in a decrease in the signal-to-noise ratio and contrast-to-noise ratio. There have been many attempts to decrease the noise in an image in order to increase activation, but most lead to blurrier images. An alternative is to develop methods in spatial frequency space, which have unique benefits. This work proposes a Bayesian approach that quantifies available a priori information about measured complex-valued frequency coefficients. This prior information is combined with the observed spatial frequency coefficients, and the spatial frequency coefficients are estimated a posteriori. The posterior estimates of the spatial frequency coefficients are then reconstructed by inverse Fourier transform into images with reduced noise and increased detection power.
Expanding neuroimaging research: integrating insights from biomedical sciences
Organizer: Jun Young Park, University of Toronto
Chair: Haochang Shou, University of Pennsylvania
Presenter: Sarah M. Weinstein, Temple University
Title: Testing network specificity of brain-phenotype associations
Abstract: Evaluating topological similarities between canonical functional networks and maps of brain-phenotype associations can add to our understanding of mechanisms underlying psychopathology. However, methods for integrating information about functional network topology with spatial maps of brain-phenotype associations have varied in terms of scientific rigor and underlying assumptions. While some approaches have relied on subjective interpretations, others have made unrealistic assumptions about spatial properties of imaging data, leading to inflated false positive rates. We seek to address this gap in existing methodology by borrowing insight from a method widely used in genomics research. We propose Network Enrichment Significance Testing (NEST), a flexible framework for testing the specificity of brain-phenotype associations to functional networks (or other subregions) of interest. We apply NEST to study associations with structural and functional brain imaging data from a large-scale neurodevelopmental cohort study.
Presenter: Andrew An Chen, Medical University of South Carolina
Title: Batch adjustments in location, scale, and shape for complex multi-site neuroimaging studies
Abstract: Neuroimaging studies increasingly collect complex measurements across multiple study sites to diagnose and assess neurological disorders. These multisite studies can acquire a larger and more generalizable sample; however, they are also well known to be biased by differences across scanners. Previous approaches, including the widely used ComBat method, address batch effects in the location and scale of measurements while assuming normality. While effective for certain neuroimaging measures, these methods are unable to handle zero-inflation, skewness, and non-negativity, which are observed in neurological studies. Here, we introduce Batch adjustments in Location, Scale, and Shape (BatLSS), which removes batch effects from any parameters in a large class of distributions while flexibly modeling covariates. We first show that BatLSS adjusts for batch in data simulated from distributions relevant to neuroimaging, including beta, generalized gamma, and several skewed distributions. We then demonstrate that BatLSS effectively harmonizes zero-inflated and right-skewed white matter lesion volumes in a large multi-site, multi-study dataset from the imaging-based SysTem for AGing and NeurodeGenerative diseases (iSTAGING) consortium.
Presenter: Bingxin Zhao, University of Pennsylvania
Title: Multi-organ imaging-derived polygenic indexes for brain and body health
Abstract: The UK Biobank (UKB) imaging project is a crucial resource for biomedical research, but is limited to 100,000 participants due to cost and accessibility barriers. One solution is to use genetic data to predict heritable imaging-derived phenotypes (IDPs) for a larger cohort. Here we developed and evaluated 4,375 IDP genetic scores (IGS) derived from UKB brain and body images. When applied to non-imaging UKB participants, IGS revealed links to numerous phenotypes and stratified subjects at increased risk for both brain and body diseases. For example, IGS burden scores identified individuals at higher risk for Alzheimer's disease (AD) and neuropsychiatric disorders (e.g., bipolar disorder and schizophrenia), offering additional insights beyond traditional polygenic risk scores of these diseases. When applied to non-UKB subjects in the Alzheimer's Disease Neuroimaging Initiative study, IGS also stratified those at high risk for dementia. Our results demonstrate that the UKB imaging study, with its largely healthy participant base, holds immense potential for stratifying the risk of various brain and body diseases in broader external genetic cohorts.
Presenter: Jun Young Park, University of Toronto
Title: Integrating multimodal neuroimaging with GWAS for identifying modality-level causal pathways to Alzheimer's disease
Abstract: The UK Biobank has produced thousands of (brain) imaging-driven phenotypes (IDPs) collected from more than 40,000 genotyped individuals, which facilitated the investigation of genetic and imaging biomarkers for brain disorders. Motivated by the efforts in genetics to integrate gene expression levels with genome-wide association studies (GWASs), recent methods in imaging genetics adopted an instrumental variable approach to identify causal IDPs for brain disorders. In this talk, we first discuss several methodological challenges of existing methods in achieving causality in imaging genetics, including horizontal pleiotropy and high dimensionality of candidate instrumental variables. We then propose testing the causality of each brain modality (structural, functional, and diffusion MRI) for each gene as a useful alternative, which offers flexibility in interpretation while maintaining reasonable statistical power and controlling for the pleiotropic effects of IDPs from other imaging modalities. We demonstrate the utility of the proposed method by using summary statistics data from the UK Biobank and the International Genomics of Alzheimer's Project (IGAP) study.
Statistical learning methods for neuroscience
Organizer: Shuheng Zhou, University of California, Riverside
Chair: Yize Zhao, Yale University
Presenter: Jian Kang, University of Michigan, Ann Arbor
Title: Deep kernel learning based Gaussian processes for Bayesian image regression analysis
Abstract: Regression models are widely used in neuroimaging studies to learn complex associations between clinical variables and image data. Gaussian processes (GPs) are among the most popular Bayesian nonparametric methods and have been widely used as prior models for the unknown functions in those models. However, many existing GP methods need to pre-specify the functional form of the kernels, which often limits flexibility in model fitting and creates computational bottlenecks on large-scale datasets. To address these challenges, we develop a scalable Bayesian kernel learning framework for GP priors in various image regression models. Our approach leverages deep neural networks (DNNs) to perform low-rank approximations of GP kernel functions via spectral decomposition. With Bayesian kernel learning techniques, we achieve improved accuracy in parameter estimation and variable selection in image regression models. We establish large prior support and posterior consistency of the kernel estimations. Through extensive simulations, we demonstrate that our model outperforms other competitive methods. We illustrate the proposed method by analyzing multiple neuroimaging datasets in different medical studies.
Presenter: Chunming Zhang, University of Wisconsin-Madison
Title: Learning network-structured dependence from non-stationary multivariate point process data
Abstract: Understanding sparse network dependencies among nodes from multivariate point process data has broad applications in information transmission, social science, and computational neuroscience. This paper introduces new continuous-time stochastic models for conditional intensity processes, revealing network structures within non-stationary multivariate counting processes. Our model's stochastic mechanism is crucial for inferring graph parameters relevant to structure recovery, and is distinct from commonly used processes such as the Poisson, Hawkes, queuing, and piecewise deterministic Markov processes. This leads us to propose a novel marked point process for intensity discontinuities. We derive concise representations of their conditional distributions and demonstrate cyclicity of the counting processes driven by recurrence time points. These theoretical properties enable us to establish statistical consistency and convergence properties for the proposed penalized M-estimators of graph parameters under mild regularity conditions. Simulation evaluations showcase the method's computational simplicity and improved estimation accuracy compared to existing approaches. Real neuron spike train recordings are analyzed to infer interconnectivity in neuronal networks.
Presenter: Shuheng Zhou, University of California, Riverside
Title: Concentration of measure bounds for matrix-variate data with missing values
Abstract: We consider the following data perturbation model, where the covariates incur multiplicative errors. For two random matrices $U$, $X$, we denote by $U \circ X$ the Hadamard or Schur product, which is defined as $(U \circ X)_{ij} = U_{ij} X_{ij}$. In this paper, we study the subgaussian matrix variate model, where we observe the matrix variate data through a random mask $U$: $\mathcal{X} = U \circ X$, where $X = B^{1/2} Z A^{1/2}$, $Z$ is a random matrix with independent subgaussian entries, and $U$ is a mask matrix with either zero or positive entries, where $\mathbb{E}[U_{ij}] \in [0, 1]$ and all entries are mutually independent. Under the assumption of independence between $X$ and $U$, we introduce componentwise unbiased estimators for estimating the covariances $A$ and $B$, and prove concentration of measure bounds in the sense of guaranteeing that the restricted eigenvalue (RE) conditions hold on the unbiased estimator for $B$ when columns of the data matrix are sampled with different rates. We further develop multiple regression methods for estimating the inverse of $B$ and show the statistical rate of convergence. Our results provide insight for sparse recovery of relationships among entities (samples, locations, items) when features (variables, time points, user ratings) are present in the observed data matrix $\mathcal{X}$ with heterogeneous rates. Our proof techniques can certainly be extended to other scenarios. We provide simulation evidence illuminating the theoretical predictions.
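Illustrative sketch (not part of the abstract): the masked observation model described above, an observed matrix formed as the Hadamard product of a mask U with X = B^{1/2} Z A^{1/2}, can be simulated in a few lines of NumPy. Several choices here are assumptions made only for illustration: Z is taken to be Gaussian (one particular subgaussian case), the mask entries are Bernoulli with one sampling rate per column, the covariance factors are AR(1)-style, and the helper names `sym_sqrt` and `simulate_masked` are hypothetical. The estimators studied in the talk are not implemented.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def simulate_masked(A, B, rates, rng):
    """Draw X = B^{1/2} Z A^{1/2} and observe it through a 0/1 mask U."""
    n, m = B.shape[0], A.shape[0]
    Z = rng.standard_normal((n, m))            # assumption: Gaussian entries
    X = sym_sqrt(B) @ Z @ sym_sqrt(A)          # row covariance B, column covariance A
    # assumption: Bernoulli mask, column j observed with probability rates[j]
    U = (rng.random((n, m)) < rates).astype(float)
    return U * X, U, X                         # U * X is the Hadamard product

rng = np.random.default_rng(0)
n, m = 400, 3
# AR(1)-style covariance factors, chosen only for illustration
A = 0.5 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
B = 0.3 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
rates = np.array([0.2, 0.5, 0.9])              # heterogeneous column sampling rates
X_obs, U, X = simulate_masked(A, B, rates, rng)
print("empirical column sampling rates:", U.mean(axis=0))
```

The printed column means of U should sit near the chosen rates, which is the "columns sampled with different rates" setting the abstract refers to.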
Recent developments in statistical methodology for neuroimaging data analysis
Organizer: Dayu Sun, Indiana University School of Medicine
Chair: Xinyi Li, Clemson University
Presenter: Xin Ma, Columbia University Irving Medical Center
Title: High-dimensional measurement error models with application to brain functional connectivity
Abstract: Recently emerging large-scale biomedical data pose exciting opportunities for scientific discoveries. However, the ultrahigh dimensionality and nonnegligible measurement errors in the data create difficulties in estimation. There are limited methods for high-dimensional covariates with measurement errors, and these usually require moments of the noise distribution to fit the working model and are restricted to generalized linear models (GLMs). In this work, we develop measurement error models involving high-dimensional covariates with correlated sub-Gaussian measurement errors for a class of Lipschitz loss functions that go beyond the GLM family and encompass logistic regression, hinge loss and quantile regression. Our estimator is designed to minimize the L1 norm among all estimators in suitable feasible sets, without requiring any knowledge of the noise distribution. Subsequently, we generalize these estimators to a lasso analog version that is computationally scalable to higher dimensions. We derive theoretical guarantees of finite sample statistical error bounds and sign consistency, even when the dimensionality increases exponentially with the sample size. Extensive simulation studies demonstrate superior performance compared to existing methods in classification and quantile regression problems. We apply the approach to a gender classification task based on functional connectivity and identify significant network edges that reveal gender differences.
Presenter: Shuo Chen, University of Maryland School of Medicine
Title: "Machine learning" to the mean and its correction: an application to imaging-based brain age prediction
Abstract: Machine learning models for continuous outcomes are more likely to yield biased predictions for outcomes with very large and very small values. The prediction biases for large-valued outcomes are negative, while for small-valued outcomes they are positive. We refer to this phenomenon as "machine learning to the mean." We first demonstrate this scenario across multiple applications and then attempt to explain the phenomenon from a theoretical perspective. We propose a general constrained optimization strategy to correct the bias and develop a computationally efficient algorithm for implementing the proposed method. The simulation results show that the predicted outcomes are unbiased after applying our correction method. We apply this new approach to predicting brain age using neuroimaging data, specifically addressing the issue of predicted age being highly correlated with chronological age, which is the "machine learning to the mean" phenomenon in brain age prediction.
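Illustrative sketch (not part of the abstract): the sign pattern of the bias described above can be reproduced with nothing more than ordinary least squares on simulated data. Everything here is an assumption for illustration: the simulated "age" range, the single noisy feature, and the use of plain OLS in place of a machine learning model. The constrained-optimization correction proposed in the talk is not implemented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
age = rng.uniform(20, 80, size=n)               # outcome, e.g. chronological age
feature = age + rng.normal(0, 15, size=n)       # assumption: one noisy predictor

# Ordinary least squares with an intercept
design = np.column_stack([np.ones(n), feature])
coef, *_ = np.linalg.lstsq(design, age, rcond=None)
pred = design @ coef
bias = pred - age                               # prediction minus truth

young, old = age < 35, age > 65
print("mean bias, small outcomes:", bias[young].mean())   # expected to be positive
print("mean bias, large outcomes:", bias[old].mean())     # expected to be negative
print("corr(bias, outcome):", np.corrcoef(bias, age)[0, 1])
```

For least squares with an intercept, the covariance between the prediction error and the outcome equals minus the residual variance, so the negative correlation printed last is a structural property of the fit rather than an accident of this simulation.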
Presenter: Joshua Lukemire, Emory University
Title: Bayesian non-parametric factor models for estimating covariances across multiple subjects with repeated imaging runs
Abstract: Many fMRI studies require estimation of brain functional networks across multiple subjects with repeated measures of either the same task condition or multiple different task conditions. However, most approaches to this problem either estimate the functional networks for each subject/session individually, or perform some form of group-level estimation. In this work we propose a Bayesian latent factor model that pools information across subjects and sessions to estimate subject/session-specific connectivity matrices. The approach is based on a product of Dirichlet process mixtures (PDPM) prior that clusters latent factor loadings separately for each node in the brain, but that restricts sessions within a subject to share the same cluster. Through simulations, we show that this approach is highly effective for both clustering subjects with similar connectivity patterns and estimating the overall brain network. An application is provided to the Human Connectome Project fMRI data.
Statistical inference in neuroimaging
Organizer: Eardi Lila, University of Washington
Chair: Shuo Chen, University of Maryland School of Medicine
Presenter: Benjamin Risk, Emory University
Title: Nonparametric motion adjustment in functional connectivity studies in children with autism spectrum disorder
Abstract: Autism Spectrum Disorder (ASD) is a neurodevelopmental condition associated with difficulties with social interactions, communication, and restricted or repetitive behaviors. To characterize ASD, investigators often use functional connectivity derived from resting-state functional magnetic resonance imaging of the brain. However, participants' head motion during the scanning session can induce motion artifacts. Many studies remove scans with excessive motion, which can lead to drastic reductions in sample size and introduce selection bias. To avoid such exclusions, we propose an estimand using causal inference methods that quantifies the difference in average functional connectivity between autistic and non-ASD children while standardizing motion relative to the low motion distribution in scans that pass motion quality control. We introduce a nonparametric estimator for motion control, called MoCo, that uses all participants and flexibly models the impacts of motion and other relevant features using an ensemble of machine learning methods. We establish large-sample efficiency and multiple robustness of our proposed estimator. The framework is applied to estimate the difference in functional connectivity between 132 autistic and 245 non-ASD children, of whom 34 and 126, respectively, pass motion quality control. MoCo appears to dramatically reduce motion artifacts relative to no participant removal, while more efficiently utilizing participant data and accounting for possible selection biases relative to the naive approach with participant removal.
Presenter: Raphiel Murden, Emory University
Title: Probabilistic JIVE for brain morphometry and cognition
Abstract: Collecting multiple types of data on the same set of subjects is common in modern scientific applications including genomics, metabolomics, and neuroimaging. Joint and Individual Variation Explained (JIVE) seeks a low-rank approximation of the joint variation between two or more sets of features captured on common subjects and isolates this variation from that unique to each set of features. We propose a probabilistic model for the JIVE framework with subject random effects and develop an expectation-maximization (EM) algorithm to estimate the parameters of interest. Our model extends probabilistic PCA to the setting of multiple data sets, simultaneously estimating joint and individual components, which can lead to greater accuracy compared to other methods. We apply ProJIVE to measures of brain morphometry and cognition from the Alzheimer's Disease Neuroimaging Initiative. ProJIVE learns biologically meaningful sources of variation in brain morphometry and cognition. The joint morphometry and cognition subject scores are strongly related to expensive existing biomarkers.
Presenter: Daniel Kessler, University of Washington
Title: Computational Inference for Directions in Canonical Correlation Analysis
Abstract: Canonical Correlation Analysis (CCA) is a method for analyzing pairs of random vectors; it learns a sequence of paired linear transformations such that the resultant canonical variates are maximally correlated within pairs while uncorrelated across pairs. CCA outputs both the canonical correlations and the canonical directions that define the transformations. While inference for canonical correlations is well developed, conducting inference for canonical directions is more challenging and not well studied, but is key to interpretability. We propose a computational bootstrap method (combootcca) for inference on CCA directions. We conduct thorough simulation studies that range from simple and well-controlled to complex but realistic, and validate the statistical properties of combootcca while comparing it to several competitors. We also apply the combootcca method to a brain imaging dataset and discover linked patterns in brain connectivity and behavioral scores.
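Illustrative sketch (not part of the abstract): the two outputs of CCA mentioned above, canonical correlations and canonical directions, can be computed by whitening each block and taking a singular value decomposition of the whitened cross-covariance. This is the textbook estimator only; it is not combootcca, and no bootstrap inference is attempted. The function names `inv_sqrt` and `cca` and the simulated data are assumptions for illustration, and the sample covariance matrices are assumed to be nonsingular.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def cca(X, Y):
    """Return canonical correlations and direction matrices (one column per pair)."""
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations
    Uw, rho, Vwt = np.linalg.svd(Wx @ Sxy @ Wy, full_matrices=False)
    return rho, Wx @ Uw, Wy @ Vwt.T

# Simulated example: both blocks load on one shared latent variable
rng = np.random.default_rng(2)
n = 500
z = rng.standard_normal(n)
X = np.outer(z, [1.0, 0.5, 0.0, 0.0]) + rng.standard_normal((n, 4))
Y = np.outer(z, [0.8, 0.0, -0.6]) + rng.standard_normal((n, 3))
rho, A_dir, B_dir = cca(X, Y)
print("canonical correlations:", rho)
```

The columns of `A_dir` and `B_dir` are determined only up to sign, which is one reason inference on directions is harder than inference on correlations.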
Presenter: Simon Vandekar, Vanderbilt University
Title: Scalable FDR controlled functional confidence sets for arbitrary effect size images
Abstract: The field of neuroimaging research has acknowledged the limitations of hypothesis testing-based inference. As a solution, colleagues in biostatistics have developed procedures to construct spatial confidence sets for images that can be used to identify regions with target effect sizes above a given threshold with a specified probability. These confidence sets represent a paradigm shift in group-level inference for neuroimaging data; however, there is no generalized approach to estimate and construct confidence regions on a unitless scale. We derive the asymptotic distribution of the robust effect size index and use recently developed approaches to construct confidence sets from simultaneous confidence intervals to establish a confidence set procedure for effect sizes of arbitrary model parameters. Commonly used reliable inference procedures rely on bootstrapping or permutations, and so can be slow in large samples. In contrast, our approach uses closed-form procedures and so is scalable to large datasets. We evaluate their finite sample performance and use the methods to identify regions associated with diagnostic differences in the ABIDE dataset.
New developments for harmonization, processing and modeling for imaging data
Organizer: Yize Zhao, Yale University
Chair: Jun Young Park, University of Toronto
Presenter: Dana Tudorascu, University of Pittsburgh
Title: Data harmonization methods and analysis for Positron Emission Tomography (PET) imaging studies of Alzheimer's disease
Abstract: Multisite imaging studies increase statistical power and enable the generalization of research outcomes; however, the variety of imaging acquisition protocols, different PET tracer properties, and inter-scanner variability hinder the direct comparability of multi-scanner PET data. The PET imaging field is lagging behind in terms of harmonization methods due to the complexity associated with the combination of different tracers and different scanners. In this study we investigate samples of cognitively normal participants, participants with mild cognitive impairment, and participants with Alzheimer's disease in two major multisite studies of Alzheimer's disease. We present challenges and solutions associated with the analysis of different PET tracers and with harmonization techniques including simple imaging standardization, ComBat and deep learning methods. We show region-of-interest differences in PET outcome measures before and after harmonization in multisite studies of Alzheimer's disease.
Presenter: Selena Wang, Yale University
Title: Sex-specific topological structure associated with dementia and MCI via latent space estimation
Abstract: Statistical network analysis has transformed neuroimaging research in recent years by enabling flexible and intuitive integration of multiple data types and preserving the topological brain connectivity structure while uncovering mechanisms of degenerative aging. In this study, we apply a novel latent space joint network model to perform a case-control comparison using the functional connectivity data together with region-specific cortical volume, cortical thickness, surface area and PET information from the third release of the ADNI study. By preserving complex network structures during imaging biomarker detection, we find sex-specific topological structures associated with dementia. For female subjects, areas of connectivity edges that are impacted by dementia and MCI tend to follow the organizational topological structure of the brain. In contrast, areas of connectivity edges that are impacted by dementia and MCI for the male subjects do not follow such structures. For female subjects, the core brain regions with connectivity across the whole brain are most impacted by the development of dementia, which is not true for male subjects.
Presenter: Zhengwu Zhang, University of North Carolina at Chapel Hill
Title: CoCoNest: a continuous structural connectivity-based nested parcellation of the human cerebral cortex
Abstract: Despite the widespread exploration and availability of parcellations for the functional connectome, parcellations designed for the structural connectome are comparatively limited. Current research suggests that there may be no single 'correct' parcellation and that the human brain is intrinsically a multi-resolution entity. In this work, we propose the CoCoNest family of parcellations, a fully data-driven, multi-resolution family of parcellations constructed from structural connectome data. The CoCoNest family is constructed using agglomerative (bottom-up) clustering and error-complexity pruning, which strikes a balance between the complexity of the parcellation and how well it preserves patterns in vertex-level high-resolution connectivity data. We draw on an intensive battery of internal and external evaluation metrics to show that the CoCoNest family is competitive with or outperforms widely used parcellations in the literature. Additionally, we show how the CoCoNest family can serve as an exploratory framework for researchers to investigate the organization of the structural connectome across various resolutions.
Presenter: Tsung-Hung Yao, MD Anderson
Title: Bayesian nonparametric product mixtures for multi-resolution clustering of functions
Abstract: There is a rich literature on clustering functional data with applications to time-series modeling, trajectory data, and even spatio-temporal applications. However, existing methods assume replicated clusters that enforce identical atom values for all members allocated to the same cluster. While such an assumption may be acceptable for clustering scalar or low-dimensional vectors, it may not be meaningful when clustering high-dimensional functions observed at thousands of instances for each sample. A prominent example of this type of problem pertains to the clustering of high-dimensional images derived from neuroimaging applications or even spatial transcriptomics problems involving a large number of spots in the tissue. For such problems, units are expected to cluster based on a subset of informative regions in the image only, with the remaining imaging regions not being instrumental in the clustering process. In order to tackle such problems, we propose a non-parametric Bayesian approach for multi-resolution clustering of high-dimensional functions. In particular, we express the random functions in terms of a wavelet basis expansion coupled with an additive noise term and impose independent Dirichlet process priors on coefficients corresponding to varying wavelet resolutions. The proposed model results in a product of DPM priors imposed on the wavelet coefficients and is shown to result in posterior consistency in recovering the true density of the random functions, as the number of samples grows to infinity while keeping the number of observed instances for each function fixed. We apply the proposed approach to clustering high-dimensional images in neuroimaging applications in order to infer heterogeneous subsets of subjects, as well as spatial transcriptomics applications where the goal is to infer clusters of genes with distinct transcriptomics mechanisms. The operating characteristics of the model are also evaluated via extensive simulations that reveal the considerable advantages in performance under the proposed methods over classical clustering methods.
Invariance and distribution/density objects in neuroimaging studies
Organizer: Yi Zhao, Indiana University School of Medicine
Chair: Eardi Lila, University of Washington
Presenter: Bonnie Smith, Johns Hopkins Bloomberg School of Public Health
Title: Regression models for partially localized fMRI connectivity analyses
Abstract: We propose the use of subject-level regression models for brain functional connectivity. Covariates can include characteristics such as geographic distance between two brain regions, symmetry between the regions, and functional networks to which the two regions belong. Connectivity regression models can be used either with data that have been normalized to a common template, or in settings where each subject's data is left in its own geometry. This style of analysis allows us to characterize the relative importance of each type of predictor, and also provides a parsimonious way of summarizing each subject's connectivity that can be used in group-level comparisons. We apply our approach to Human Connectome Project data, and we investigate data repeatability using our model versus using two alternative approaches.
Presenter: Changbo Zhu, University of Notre Dame
Title: Geodesic optimal transport regression
Abstract: Classical regression models do not cover
non-Euclidean data that reside in a general metric
space, while the current literature on non-Euclidean regression
by and large has focused on scenarios where
either predictors or responses are random objects, i.e.,
non-Euclidean, but not both. In this paper we propose
geodesic optimal transport regression models for
the case where both predictors and responses lie in a
common geodesic metric space and predictors may include
not only one but also several random objects.
This provides an extension of classical multiple regression
to the case where both predictors and responses
reside in non-Euclidean metric spaces, a scenario that
has not been considered before. It is based on the
concept of optimal geodesic transports, which we define
as an extension of the notion of optimal transports
in distribution spaces to more general geodesic metric
spaces, where we characterize optimal transports
as transports along geodesics. The proposed regression
models cover the relation between non-Euclidean
responses and vectors of non-Euclidean predictors in
many spaces of practical statistical interest. These include
one-dimensional distributions viewed as elements
of the 2-Wasserstein space and multidimensional distributions
with the Fisher-Rao metric that are represented
as data on the Hilbert sphere. Also included are data
on finite-dimensional Riemannian manifolds, with an
emphasis on spheres, covering directional and compositional
data, as well as data that consist of symmetric
positive definite matrices. We illustrate the utility
of geodesic optimal transport regression with data on
summer temperature distributions and human mortality.
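For readers unfamiliar with the 2-Wasserstein space mentioned in this abstract, the following is a minimal sketch of the distance between two one-dimensional samples, using the standard quantile-function identity. It is not the presenter's regression method; the function name `wasserstein2_1d` and the grid size `n_grid` are illustrative assumptions.

```python
import numpy as np

def wasserstein2_1d(x, y, n_grid=1000):
    """Approximate 2-Wasserstein distance between two 1D samples.

    Uses the identity W_2^2 = int_0^1 (F^{-1}(u) - G^{-1}(u))^2 du,
    evaluated on a midpoint grid of probability levels u.
    The name and the n_grid default are illustrative, not from the abstract.
    """
    u = (np.arange(n_grid) + 0.5) / n_grid            # midpoints in (0, 1)
    qx = np.quantile(np.asarray(x, dtype=float), u)   # empirical quantiles of x
    qy = np.quantile(np.asarray(y, dtype=float), u)   # empirical quantiles of y
    return float(np.sqrt(np.mean((qx - qy) ** 2)))
```

Shifting a sample by a constant shifts every quantile by that constant, so the distance between a sample and its shifted copy equals the size of the shift.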
Presenter: Yi Zhao, Indiana University School of
Medicine
Title: Density-on-density regression
Abstract: In this study, a density-on-density regression
model is introduced, where the association between
densities is elucidated via a warping function. The proposed
model has the advantage of being a straightforward
demonstration of how one density transforms into
another. Using the Riemannian representation of density
functions, which is the square-root function (or half
density), the model is defined in the correspondingly
constructed Riemannian manifold. To estimate the
warping function, it is proposed to minimize the average
Hellinger distance, which is equivalent to minimizing
the average Fisher-Rao distance between densities. An
optimization algorithm is introduced by estimating the
smooth monotone transformation of the warping function.
Asymptotic properties of the proposed estimator
are discussed. Simulation studies demonstrate the superior
performance of the proposed approach over competing
approaches in predicting outcome density functions.
Applied to a proteomic-imaging study from the
Alzheimer's Disease Neuroimaging Initiative, the proposed
approach illustrates the connection between the
distribution of protein abundance in the cerebrospinal
fluid and the distribution of brain regional volume. Discrepancies
among cognitively normal subjects, patients
with mild cognitive impairment, and Alzheimer's disease
(AD) are identified and the findings are in line
with existing knowledge about AD.
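As background to the distances named in this abstract, the pointwise relation between the Hellinger and Fisher-Rao distances can be written as follows. This is a sketch under one common convention (Hellinger distance with the factor 1/2, Fisher-Rao distance as arc length on the unit sphere of square-root densities); the abstract does not state which normalization the presenter uses.

```latex
% Assumed conventions: f and g are densities on a common domain.
\begin{align*}
H^2(f, g) &= \frac{1}{2} \int \bigl(\sqrt{f(x)} - \sqrt{g(x)}\bigr)^2 \, dx
           = 1 - \int \sqrt{f(x)\, g(x)} \, dx, \\
d_{FR}(f, g) &= \arccos\!\left( \int \sqrt{f(x)\, g(x)} \, dx \right)
             = \arccos\bigl( 1 - H^2(f, g) \bigr).
\end{align*}
```

Under these conventions each distance is a monotone function of the other for any fixed pair of densities.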
Advances in statistical methods for neuroimaging data
Organizer: Selena Wang, Yale University
Chair: Xin Ma, Columbia University Irving Medical
Center
Presenter: Dayu Sun, Indiana University School of
Medicine
Title: Sparse partial generalized tensor regression
Abstract: Tensor data, often characterized as multidimensional
arrays, have become increasingly prevalent
in biomedical studies, particularly in neuroimaging applications.
Analyzing these complex datasets can be
challenging due to the high-dimensionality and inherent
structures within tensors. In this work, we propose the
Sparse Partial Generalized Tensor Regression (SPGTR)
method for modeling general types of outcomes involving
both tensor and vector/scalar predictors. Our novel
mode-wise penalized manifold optimization techniques
enable us to achieve dimension reduction and sparsity
in tensor coefficient estimation, improving the overall
prediction performance. We establish the asymptotic
behavior of the proposed estimation. We demonstrate
the effectiveness of the SPGTR through extensive simulation
studies and showcase its application in investigating
the association between posttraumatic stress disorder
(PTSD) and brain connectivity matrices derived
from functional magnetic resonance imaging (fMRI)
data.
Presenter: Yaotian Wang, Emory University
Title: An empirical-topology-informed Bayesian blind
source separation for investigating whole-brain functional
connectivity
Abstract: Blind source separation (BSS) is one of
the major methods for functional magnetic resonance
imaging (fMRI) analysis. From voxel-level fMRI data
to region-of-interest (ROI)-level functional connectivity
(FC) matrices (e.g., Pearson correlations of fMRI
data), various BSS methods have been developed to decompose
these data into scientifically meaningful and
insightful latent sources. These methods generally do
not utilize any empirical topology information, such as
the spatial information between ROIs. However, existing
studies and theories suggest that spatial distance is
an important factor that influences the property of FC.
For example, the compensatory theory suggests that
aging has different effects on long- and short-distance
connections. Results from an aging brain study that
neglects spatial information may fail to capture scientifically important nuances in the brain. Furthermore,
without taking into account the brain's empirical topology,
ROIs are typically treated as exchangeable, leading
to less reliable findings. Therefore, to produce scientifically meaningful and reliable blind source separation,
an empirical-topology-informed method is called
for. In this talk, I will present a novel BSS method that
integrates empirical topology information in a unified
Bayesian framework and the identified latent sources
underlying the functional connectome in fMRI data.
Presenter: Zhiling Gu, Iowa State University
Title: Statistical learning and inference of surface-based
functional data with applications in neuroimaging
analysis
Abstract: Surface-based neuroimaging analysis has
gained significant attention in recent years due to
its ability to capture fine-grained spatial information
and provide insights into brain structure and function.
In this paper, we present an advanced nonparametric
method for learning and inference with surface-based
functional data, facilitating accurate estimation of underlying
signals and efficient detection and localization
of significant effects. We propose a framework that
leverages advanced statistical modeling approaches, including
spherical splines on triangulations and next-generation
functional data analysis, to handle the challenges
associated with surface-based data, such as irregular
sampling and spatial dependencies. Furthermore,
we propose a novel approach for constructing simultaneous
confidence corridors (SCCs), which effectively
quantify estimation uncertainty. These SCCs provide
a comprehensive representation of the uncertainty in
the estimated functional patterns and facilitate reliable
inference. Furthermore, the procedure is extended to
accommodate comparisons between groups of samples,
enabling the analysis of group differences or treatment
effects. We establish the asymptotic properties of the
proposed estimators and SCCs, and provide a computationally
efficient procedure for constructing the SCCs.
To evaluate the finite-sample performance, we conduct
numerical experiments and apply the methods to real-data
analysis using the cs-fMRI data provided by the
Human Connectome Project Consortium (HCP).
Graph-based network connectomes analysis
Organizer: Simon Vandekar, Vanderbilt University
Chair: Zhengwu Zhang, University of North Carolina
Chapel Hill
Presenter: Eardi Lila, University of Washington
Title: Integrative analysis of dynamic functional connectomes
and high-dimensional data
Abstract: We introduce a novel statistical method
for the integrative analysis of neuroimaging and high-dimensional
data. The motivating application is the
exploration of the dependence structure between each
subject's dynamic functional connectivity — represented
by a temporally indexed collection of positive definite
covariance matrices — and high-dimensional data representing
lifestyle, demographic, and psychometric measures.
To this purpose, we employ a regression-based
reformulation of canonical correlation analysis that allows
us to control the complexity of the connectivity
canonical directions within a Riemannian framework,
using tangent space principal components analysis,
and that of the high-dimensional canonical directions
via a sparsity-promoting penalty. The proposed
method shows improved empirical performance over alternative
approaches. Its application to data from the
Human Connectome Project reveals a dominant mode
of covariation between dynamic functional connectivity
and lifestyle, demographic, and psychometric measures.
This mode aligns with results from static connectivity
studies but reveals a unique temporal non-stationary
pattern that such studies fail to capture.
Presenter: Sean L. Simpson, Wake Forest University
Title: Regression Frameworks for Brain Network Distance
Metrics
Abstract: Brain network analyses have exploded in
recent years, and hold great potential in helping us understand
normal and abnormal brain function. Network
science approaches have facilitated these analyses
and our understanding of how the brain is structurally
and functionally organized. However, the development
of statistical methods that allow relating this organization
to health outcomes has lagged behind. We have
attempted to address this need by developing regression
frameworks for brain network distance metrics that allow
relating system-level properties of brain networks to
outcomes of interest. These frameworks serve as synergistic
fusions of statistical approaches with network
science methods, providing needed analytic foundations
for whole-brain network data. Here we delineate these
approaches that have been developed for single-task,
multi-task/multi-session, and multilevel brain network
data. These tools help expand the suite of analytical
tools for whole-brain networks and aid in providing
complementary insight into brain function.
Presenter: Panpan Zhang, Vanderbilt University
Medical Center
Title: Graph-based methods for functional brain network
analysis
Abstract: Functional magnetic resonance imaging
(fMRI) has been widely used to discover the neural underpinnings
of cognitive decline caused by neurological
disorders. Graph-based methods are prevalent for the
analysis of brain networks constructed from fMRI data.
The precise construction of functional brain networks
is critical when using network-based measures as predictors
in downstream analyses. This talk will discuss
popular approaches to functional brain network construction.
The assessment is done through both simulations
and an application to a longitudinal Alzheimer's
Disease study.
Presenter: Tingting Zhang, University of Pittsburgh
Title: Analysis of functional brain network changes
from childhood to old age: a study using HCP-D, HCP-YA,
and HCP-A datasets
Abstract: We present a new clustering-enabled regression
approach designed to investigate how whole-brain
functional connectivity (FC) in healthy subjects
changes from childhood to old age. By applying this
method to aggregated fMRI data from three Human
Connectome Projects, we identify clusters of brain regions
that share similar trajectories of FC changes with
age. Our findings reveal that age affects FC in a varied
manner across different brain regions. Most brain connections
experience minimal yet statistically significant
FC changes with age. Only a tiny proportion of connections
exhibit substantial age-related changes in FC.
Among these connections, FC between brain regions
in the same functional network tends to decrease over
time, while FC between regions in different networks
demonstrates diverse patterns of age-related changes,
underscoring the intricate nature of brain aging processes.
Moreover, our research uncovers sex-specific
trends in FC changes; while average FC is comparable
in childhood for both sexes, it becomes increasingly different
with aging. Elderly females show much higher
FC within the default mode network and in certain
between-network connections of the somatomotor network,
whereas elderly males display higher FC across
multiple brain networks. Furthermore, our study suggests
that the relationship between cognitive behavior
and FC is nuanced, being most influenced by age and
sex during childhood, less influenced in older adults,
and least influenced in young adults.
Collaborative case study: Statistical methods for dissecting tumor microenvironment
based on spatial proteomics datasets
Organizer: Souvik Seal, Medical University of South
Carolina
Chair: Selena Wang, Yale University
Presenter: Thao Vu, University of Colorado Anschutz
Medical Campus
Title: FunSpace: A functional and spatial analytic approach
to cell imaging data using entropy measures
Abstract: Spatial heterogeneity in the tumor microenvironment
(TME) plays a critical role in gaining insights
into tumor development and progression. Conventional
metrics typically capture the spatial differential
between TME cellular patterns by either exploring
the cell distributions in a pairwise fashion or aggregating
the heterogeneity across multiple cell distributions
without considering the spatial contribution. As such,
none of the existing approaches has fully accounted for
the simultaneous heterogeneity caused by both cellular
diversity and spatial configurations of multiple cell categories.
In this article, we propose an approach to leverage
spatial entropy measures at multiple distance ranges
to account for the spatial heterogeneity across different
cellular organizations. Functional principal component
analysis (FPCA) is applied to estimate FPC scores,
which then serve as predictors in a Cox regression
model to investigate the impact of spatial heterogeneity
in the TME on survival outcome, potentially adjusting
for other confounders. Using a non-small cell lung cancer
dataset (n = 153) as a case study, we found that the
spatial heterogeneity in the TME cellular composition
of CD14+ cells, CD19+ B cells, CD4+ and CD8+ T
cells, and CK+ tumor cells, had a significant non-zero
effect on the overall survival (p = 0.027). Furthermore,
using a publicly available multiplexed ion beam imaging
(MIBI) triple-negative breast cancer dataset (n = 33),
our proposed method identified a significant impact of
cellular interactions between tumor and immune cells
on the overall survival (p = 0.046). In simulation studies
under different spatial configurations, the proposed
method demonstrated a high predictive power by accounting
for both clinical effect and the impact of spatial
heterogeneity.
Presenter: Jiangmei Xiong, Vanderbilt University
Title: GammaGateR: semi-automated marker gating
for single-cell multiplexed imaging
Abstract: Multiplexed immunofluorescence (mIF) is
an emerging assay for multichannel protein imaging
that can decipher cell-level spatial features in tissues.
However, existing automated cell phenotyping methods,
such as clustering, face challenges in achieving
consistency across experiments and often require subjective
evaluation. As a result, mIF analyses often revert
to marker gating based on manual thresholding of
raw imaging data. To address the need for an evaluable
semi-automated algorithm, we developed GammaGateR,
an R package for interactive marker gating designed
specifically for segmented cell-level data from
mIF images. Based on a novel closed-form gamma
mixture model, GammaGateR provides estimates of
marker-positive cell proportions and soft clustering of
marker-positive cells. The model incorporates user-specified
constraints that provide a consistent but slide-specific
model fit. We compared GammaGateR against
the newest unsupervised approach for annotating mIF
data, employing two colon datasets and one ovarian cancer
dataset for the evaluation. We showed that GammaGateR produces highly similar results to a silver standard
established through manual annotation. Furthermore,
we demonstrated its effectiveness in identifying
biological signals, achieved by mapping known spatial
interactions between CD68 and MUC5AC cells in the
colon and by accurately predicting survival in ovarian
cancer patients using the phenotype probabilities as input
for machine learning methods. GammaGateR is
a highly efficient tool that can improve the replicability
of marker gating results, while reducing the time of
manual segmentation.
Presenter: Julia Wrobel, Emory University
Title: A scalable robust K-statistic for quantifying
immune-cell clustering in multiplex imaging data
Abstract: The tumor microenvironment (TME),
which characterizes the tumor and its surroundings,
plays a critical role in understanding cancer development
and progression. Recent advances in imaging techniques,
including multiplex immunofluorescence (mIF)
imaging, enable researchers to study spatial structure
of the TME at a single-cell level. The most relevant
approaches for analyzing spatial relationships between
cell types in mIF data are based on point process theory,
and among these Ripley's K statistic and its derivatives
are both extremely popular and highly effective. In this
framework, the locations of cells in mIF data are treated
as following a point process, realizations of a point process
are called "point patterns", and these models seek
to understand correlations in the spatial distributions
of cells. Under the assumption that the rate of a cell is
constant over an entire region of interest, a point pattern
will exhibit complete spatial randomness (CSR),
and it is often of interest to model whether cells deviate
from CSR either through clustering or repulsion. In
mIF data, estimation issues can arise when the sample
has holes due to the shape of the tissue, folds, or tears,
resulting in patches of areas on the slide where no cells
are present. This can bias the estimation of Ripley's K
due to violation of the CSR assumption of spatial homogeneity.
One correction of this violation accounts for
regions where no cells were present by permuting an empirical
value of complete spatial randomness, and then
comparing observed spatial summary statistic values to
those obtained from this empirical null distribution. This
fix works well in small samples, but is computationally
infeasible as the number of cells per image increases.
To improve on this, we derived a closed form representation
of the permuted null distribution for Ripley's K
which is fast and easy to implement using existing software.
We examine the performance of this statistic in
simulations and open-source mIF data.
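To make the quantity discussed in this abstract concrete, here is a minimal sketch of the naive Ripley's K estimator for a planar point pattern. It applies no edge correction and no correction for tissue holes, and it is not the closed-form permuted null described by the presenter; the function name `ripley_k` and its arguments are illustrative assumptions.

```python
import numpy as np

def ripley_k(points, radii, area):
    """Naive Ripley's K estimate for a 2D point pattern.

    K(r) = area / (n * (n - 1)) * #{ordered pairs i != j with d_ij <= r}.
    No edge or hole correction is applied (a simplifying assumption).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all cells.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-pairs
    radii = np.asarray(radii, dtype=float)
    # Count ordered pairs within each radius.
    counts = (dist[None, :, :] <= radii[:, None, None]).sum(axis=(1, 2))
    return area * counts / (n * (n - 1))
```

Under complete spatial randomness the theoretical value is K(r) = pi * r^2, so estimates above that curve suggest clustering and estimates below it suggest repulsion.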
Presenter: Junsouk Choi, University of Michigan
Title: Gaussian process spatial topic modeling for unsupervised
discovery of spatial tissue architecture in
multiplexed imaging
Abstract: Recent development of technologies such as
multiplexed imaging and spatial transcriptomics allows
for direct observation of cellular phenotypes and cellular
interactions in intact tissues, enabling highly resolved
spatial characterization of cellular phenotypes. A common
research question in analyzing such data is identifying
higher-order patterns of tissue organization, which
holds systematic implications for disease pathology and
clinical outcomes. To address this, we propose a novel
topic modeling approach to identify the higher-order
architecture of tissues and recover signatures of characteristic
cellular microenvironments that are potential
determinants of patient outcomes. Our method infers
the local distribution of cell types as a representation of
cellular microenvironment and incorporates spatial information
through Gaussian processes to ensure spatial
coherence among neighboring microenvironments. By
applying the proposed topic model to publicly available
multiplexed imaging data, we uncover higher-order architectures
within lung cancer tissues and identify tertiary
lymphoid structures, which are closely linked to
patient survival.
When machine learning and generative models meet imaging, network and point cloud
data
Organizer: Zhengwu Zhang, University of North Carolina
Chapel Hill
Chair: Benjamin Risk, Emory University
Presenter: Mingxia Liu, University of North Carolina
Chapel Hill
Title: Enhancing multi-site multi-modal neuroimage
analysis through advanced AI techniques
Abstract: Multi-site multi-modal neuroimaging data,
such as magnetic resonance imaging (MRI) and positron
emission tomography (PET), are critical to expanding
the diversity of subject populations and enhancing
the statistical robustness of predictive models in neuroscience
research. Despite their potential, the field
faces substantial challenges, notably the heterogeneity
of data across imaging sites and modalities. Addressing
these complexities, my research focuses on creating
machine learning and deep learning methodologies to
analyze multi-modal imaging data from multiple sites,
with the goal of uncovering imaging biomarkers associated
with neurodegenerative disorders. This talk will
delineate our progress in addressing three long-standing
challenges: neuroimage representation learning, multi-modality
neuroimage fusion, and multi-site data adaptation.
Key highlights will include our latest advances
in the representation learning of MRI, capturing both
structural and functional dimensions. Subsequently, I
will elucidate our strategies for the effective integration
of multi-modal neuroimaging data, which promises
the accurate synthesis of MRI and PET scans, particularly
beneficial in cases plagued by missing or incomplete
data modalities. Concluding the talk, I will introduce
our comprehensive suite of multi-site neuroimage
harmonization techniques and unveil DomainATM,
our open-source toolbox specifically designed for medical
data adaptation.
Presenter: Maoran Xu, Duke University
Title: Identifiable and interpretable nonparametric
factor analysis
Abstract: Factor models have been widely used to
summarize the variability of high-dimensional data
through a set of factors with much lower dimensionality.
Gaussian linear factor models have been particularly
popular due to their interpretability and ease
of computation. However, in practice, data often violate
the multivariate Gaussian assumption. To characterize
higher-order dependence and nonlinearity, models
that include factors as predictors in flexible multivariate
regression are popular, with GP-LVMs using
Gaussian process (GP) priors for the regression function
and VAEs using deep neural networks. Unfortunately,
such approaches lack identifiability and interpretability
and tend to produce brittle and nonreproducible results.
To address these problems by simplifying the
nonparametric factor model while maintaining flexibility,
we propose the NIFTY framework, which parsimoniously
transforms uniform latent variables using one-dimensional
nonlinear mappings and then applies a linear
generative model. The induced multivariate distribution
falls into a flexible class while maintaining simple
computation and interpretation. We prove that this
model is identifiable and empirically study NIFTY using
simulated data, observing good performance in density
estimation and data visualization. We then apply
NIFTY to bird song data in an environmental monitoring
application.
Presenter: Yuexuan Wu, University of Washington
Title: Topological network analysis of protein aggregates
in Alzheimer's disease using PET imaging data
Abstract: Alzheimer's disease (AD) is characterized
by the accumulation of beta-amyloid (Aβ) and tau proteins
in the brain. Understanding the interplay between
these proteins and their spatial distribution could provide
insights into disease progression. In this study,
we introduce a novel approach to investigate the topological
features of Aβ and tau networks across different
cognitive groups using positron emission tomography
(PET) images. We construct networks via partial
correlation matrices between the standardized uptake
value ratio in specific regions of interest (ROIs). We
employ the bi-filtered persistent homology to explore
these networks' topological characteristics for Aβ and
tau modalities. We further examine networks' hierarchical
tree structures, focusing on comparing consistent
pairs of regions positive for Aβ and tau presence across
different cognitive groups. The results unveil complex
structures in PET images by pinpointing consistent patterns
in ROIs associated with Aβ and tau localization,
which serve as potential biomarkers for AD progression.
The study also highlights tau's more complex aggregate
behavior and its stronger association with AD.
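The network construction step in this abstract relies on partial correlations between regional values. A minimal sketch of that step, computed from the inverse covariance matrix, is shown below. It assumes a hypothetical subjects-by-regions array with more subjects than regions, and it does not cover the persistent homology analysis; the function name `partial_correlation` is illustrative.

```python
import numpy as np

def partial_correlation(data):
    """Partial correlation matrix from a (n_subjects, n_regions) array.

    Uses the precision matrix P = cov^{-1}:
    pcor_ij = -P_ij / sqrt(P_ii * P_jj) for i != j, with ones on the diagonal.
    Assumes the sample covariance is invertible (more subjects than regions).
    """
    cov = np.cov(np.asarray(data, dtype=float), rowvar=False)
    prec = np.linalg.inv(cov)
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

Each off-diagonal entry measures the linear association between two regions after adjusting for all other regions in the array.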
Presenter: Xinyi Li, Clemson University
Title: Nonparametric Learning from 3D Point Clouds
Abstract: In recent years, the number of point clouds
collected with irregular shapes in various areas has grown
exponentially. Motivated by the importance
of solid modeling for point clouds, we develop
a novel and efficient smoothing tool based on multivariate
splines over the triangulation to extract the underlying
signal and build up a 3D solid model from the point
cloud. The proposed method can denoise or deblur the
point cloud effectively, provide a multi-resolution reconstruction
of the actual signal, and handle sparse and
irregularly distributed point clouds to recover the underlying
trajectory. In addition, our method provides a
natural way of numerosity data reduction. We establish
the theoretical guarantees of the proposed method, including
the convergence rate and asymptotic normality
of the estimator, and show that the convergence rate
achieves optimal nonparametric convergence. We also
introduce a bootstrap method to quantify the uncertainty
of the estimators. Through extensive simulation
studies and a real data example, we demonstrate
the superiority of the proposed method over traditional
smoothing methods in terms of estimation accuracy and
efficiency of data reduction.
Novel statistical inference methods with applications
Organizer: Julia Fisher, University of Arizona, BIO5
Institute, Statistics Consulting Laboratory
Chair: Simon Vandekar, Vanderbilt University
Presenter: Fatma Parlak, Indiana University Bloomington
Title: A robust multivariate, non-parametric outlier
identification method for scrubbing in fMRI
Abstract: FMRI data are prone to noise and artifacts,
requiring their removal for reliable analysis. Traditional
scrubbing methods rely on head motion or ad hoc signal
changes, but these may be insufficient. Our innovative
approach treats scrubbing as outlier detection,
viewing volumes with artifacts as multidimensional outliers.
Existing methods assume Gaussianity, but fMRI
data violate these assumptions. We present a robust
outlier detection method applicable to non-Gaussian
data, aiming to establish thresholds based on robust
distances. Two threshold options cater to researchers'
preferences for data retention or sensitivity. Our procedure
involves dimension reduction, robust univariate
outlier imputation, and threshold estimation based on
upper quantiles. Threshold choices include the empirical
distribution of robust distances and nonparametric
bootstrap estimate. Comparative analysis with existing
scrubbing methods highlights the efficacy and versatility
of our approach in addressing non-Gaussian data
and improving outlier detection in fMRI studies.
Presenter: Daniel Adrian, Grand Valley State University
Title: Improved activation detection from magnitude
and phase functional MRI data
Abstract: Functional MRI is a popular noninvasive
technique for mapping brain regions activated by specific brain functions. FMRI data consist of both magnitude
and phase components (i.e., it is complex-valued),
but in the vast majority of statistical analyses, only
the magnitude data is utilized and modeled based on
a Gaussian approximation. We show that using the
correct Ricean distribution for the magnitudes, as well
as the entire complex-valued data, results in improved
activation detection — for activation in the magnitude
component. Further, as fMRI measures brain activity
indirectly through blood flow, the so-called "brain
or vein" problem refers to the difficulty in determining
whether measured activation corresponds to (desired)
brain tissue or (undesired) large veins, which may be
draining blood from neighboring regions. Previous work
has demonstrated that activation in the phase component "discriminates" between the two: phase activation
occurs in voxels with large, oriented vessels but not in
voxels with small, randomly oriented vessels immediately
adjacent to brain tissue. Following this motivation,
we have developed a model that allows for activation
in the phase and magnitude components.
Presenter: Yueyang Shen, University of Michigan
Title: Imaging statistics, invariance and spacekime analytics
Abstract: This talk will present the theoretical foundations
of symmetries with an emphasis on neural
network modeling and statistical inference in imaging.
We will demonstrate roto-translational, scaling,
and reparametrization symmetries, along with invariant
and equivariant computational statistical metrics using
imaging data. Neural networks realize such symmetries
through weight-sharing or emergent data augmentation
invariance. Information compression and (minimal) sufficient statistics are dual to identifying symmetries and
quotienting out irrelevances.
We plan to show empirical results from two fMRI
datasets: finger tapping and the music stimuli. The
former neuroimaging data is collected from patients
switching between resting and finger tapping tasks and
the latter examines the emerging neural network activation
responding to different music genres. Complex-time
(kime) representation of longitudinal data leads
to novel spacekime analytics, which enables peering
into repeated-measurement information contained in
low signal-to-noise ratio fMRI data. Specifically, the
kime phases are coupled to the random variability in
the repeated sampling. The observed time-courses are
transformed as kimesurfaces encoding the distribution
of the temporal information into richer computable data
objects (manifolds). This allows us to characterize and
analyze fMRI data using voxel-based or ROI-based approaches,
as well as to synthetically generate realistic
neuroimaging data.
Presenter: Jose Rodriguez-Acosta, Texas A&M University
Title: A novel classification framework using a multilayer
network predictor
Abstract: We introduce a novel statistical framework
for exploring the correlation between brain stimulation
and regional brain activation. Using a generalized
linear modeling framework, we predict binary outcomes,
such as regional brain activation during external
stimuli. Our predictive model utilizes multilayer
networks to capture interactions among brain network
nodes. Traditional regression methods with multilayer
network predictors often struggle to effectively utilize
information across graph layers, leading to less accurate
inference, especially with smaller sample sizes. To
address this, our method models edge coefficients at
each network layer using bilinear interactions between
latent effects associated with connected nodes. We also
employ a variable selection framework to identify influential
nodes linked to observed outcomes. Importantly,
our framework is computationally efficient and provides
uncertainty quantification in node identification, coefficient estimation, and binary outcome prediction. Simulation
studies demonstrate the superior performance of
our approach in inference and prediction.
Frontiers in medical imaging: harnessing artificial intelligence and statistical analysis for
breakthrough insights
Organizer: Lei Liu, Washington University in St.
Louis
Chair: Dayu Sun, Indiana University School of
Medicine
Presenter: Yize Zhao, Yale University
Title: Bayesian mixed model inference for genetic association
under related samples with brain network phenotype
Abstract: Genetic association studies for brain connectivity
phenotypes have gained prominence due to advances
in non-invasive imaging techniques and quantitative
genetics. Brain connectivity traits, characterized
by network configurations and unique biological structures,
present distinct challenges compared to other
quantitative phenotypes. Furthermore, the presence of
sample relatedness in most imaging genetics studies limits
the feasibility of adopting existing network-response
modeling. Here, we fill this gap by proposing Bayesian
network-response mixed-effect models that consider a
network-variate phenotype and incorporate either population
structures or sample relatedness. To accommodate
the inherent topological architecture associated
with the genetic contributions to the phenotype, we
model the effect components via a set of effect network
configurations and impose inter-network sparsity
and intra-network shrinkage to dissect the phenotypic
network configurations affected by the risk genetic
variant. We evaluate the performance of our model
through extensive simulations. We also study the genetic
bases for brain structural connectivity using data
from the Human Connectome Project and the Adolescent Brain
Cognitive Development studies, and obtain plausible
and interpretable results.
Presenter: Yifan Peng, Cornell University
Title: Image-based primary open-angle glaucoma diagnosis
and prognosis
Abstract: Primary open-angle glaucoma (POAG) is
one of the leading causes of blindness globally and in
the US, potentially affecting an estimated 111.8 million
people by 2040. Among these patients, 5.3 million may
be bilaterally blind. POAG remains asymptomatic until
it reaches an advanced stage, leading to visual field
loss. However, early diagnosis and treatment can avoid
most blindness caused by POAG. Therefore, accurately
identifying individuals with glaucoma is critical to clinical
decision-making. In recent years, developments in
artificial intelligence have offered the potential for automatic
POAG diagnosis and prognosis using fundus
photographs. In this talk, I will review our research on
image-based POAG diagnosis and prognosis. I will also
discuss how we are working to ensure model fairness
across protected groups in deep learning models. Our
proposed approach aims to alleviate concerns about the
fairness and reliability of image-based computer-aided
diagnosis.
Presenter: Lei Liu, Washington University in St.
Louis
Title: Deep learning models to predict primary open-angle
glaucoma
Abstract: Glaucoma is a major cause of blindness
and vision impairment worldwide, and visual field (VF)
tests are essential for monitoring the conversion to glaucoma.
While previous studies have primarily focused
on using VF data at a single time point for glaucoma
prediction, there has been limited exploration of longitudinal
trajectories. Additionally, many deep learning
techniques treat the time-to-glaucoma prediction as a
binary classification problem (glaucoma Yes/No), resulting
in the misclassification of some censored subjects
into the nonglaucoma category and decreased
power. To tackle these challenges, we propose and
implement several deep-learning approaches that naturally
incorporate temporal and spatial information
from longitudinal VF data to predict time-to-glaucoma.
When evaluated on the Ocular Hypertension
Treatment Study (OHTS) dataset, our proposed convolutional
neural network (CNN)-long short-term memory
(LSTM) emerged as the top-performing model among
all those examined. The implementation code can
be found on GitHub.
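The point about censoring can be made concrete with a time-to-event loss. The abstract does not say which loss the authors use; the sketch below uses the Cox negative log partial likelihood as one common choice, with hypothetical function and argument names. A censored subject is not labeled "no glaucoma": it adds no event term of its own but remains in the risk set of every event that occurs up to its censoring time.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Cox negative log partial likelihood, with no correction for ties.

    risk_scores : model outputs (log relative hazards), one per subject
    times       : observed follow-up times (event or censoring)
    events      : 1 if the event was observed, 0 if censored
    """
    loss = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            # Censored subjects contribute only through risk sets below.
            continue
        # Risk set: everyone still under observation at time t_i.
        risk_set_sum = sum(
            math.exp(score)
            for score, t in zip(risk_scores, times)
            if t >= t_i
        )
        loss -= risk_scores[i] - math.log(risk_set_sum)
    return loss
```

In a deep-learning setting the risk scores would come from the network's output layer, and this quantity would be minimized in place of a binary cross-entropy loss.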
Presenter: Haoda Fu, Eli Lilly
Title: LLM is not all you need. Generative AI on
smooth manifolds
Abstract: Generative AI is a rapidly evolving technology
that has garnered significant interest lately. In this
presentation, we'll discuss the latest approaches, organizing
them within a cohesive framework using stochastic
differential equations to understand complex, high-dimensional
data distributions. We'll highlight the necessity
of studying generative models beyond Euclidean
spaces, considering smooth manifolds essential in areas
like robotics and medical imagery, and for leveraging
symmetries in the de novo design of molecular structures.
Our team's recent advancements in this blossoming field, ripe with opportunities for academic and
industrial collaborations, will also be showcased.
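As a toy illustration of a noising process that stays on a smooth manifold, the sketch below simulates a random walk on the unit sphere. It is not taken from the talk: the choice of manifold, the function names, and the project-then-renormalize scheme are assumptions. Each step draws a Gaussian increment in the ambient space, keeps only its component tangent to the sphere, and maps the result back onto the sphere.

```python
import math
import random


def sphere_noise_step(x, dt, rng):
    """One Euler-Maruyama-style noise step constrained to the unit sphere."""
    # Gaussian increment in the ambient space, scaled by sqrt(dt).
    z = [rng.gauss(0.0, 1.0) * math.sqrt(dt) for _ in x]
    # Remove the component along x, leaving a tangent-space increment.
    radial = sum(zi * xi for zi, xi in zip(z, x))
    tangent = [zi - radial * xi for zi, xi in zip(z, x)]
    # Move along the tangent, then renormalize back onto the sphere.
    y = [xi + ti for xi, ti in zip(x, tangent)]
    norm = math.sqrt(sum(yi * yi for yi in y))
    return [yi / norm for yi in y]


def noise_trajectory(x0, n_steps, dt, seed=0):
    """Simulate n_steps noise steps starting from the unit vector x0."""
    rng = random.Random(seed)
    x = list(x0)
    path = [x]
    for _ in range(n_steps):
        x = sphere_noise_step(x, dt, rng)
        path.append(x)
    return path
```

A plain Euclidean noise step would push points off the sphere, which is the basic reason manifold-valued data call for a process defined on the manifold itself.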