What should we do with BIG DATA?

Ariel Rokem

University of Washington

2024-10-06

https://arokem.github.io/slides/slides/what-should-we-do.html

Neuroinformatics R&D Group

Kelly Chang

McKenzie Hagen

John Kruper

Asa Gilmore

Teresa Gomez

flowchart LR
  HN((Human Neuroscience))

flowchart LR
  ML((Machine learning and \n statistics))

flowchart LR
  ML((Open data and \n open-source software))

BIG DATA in neuroscience

Brain observatories

N=1,200

N=800

N=700

N=5,000

N=10,000

N=500,000

Data-driven discovery … or “data science”

Jim Gray unveiled a vision of the fourth-paradigm… scientific methodology has evolved into three distinct archetypes: empiricism, theory, and computation. The Fourth Paradigm is an entirely new phase that involves data-intensive practices. Termed “eScience,” this fourth paradigm unites theory, experimentation, and computation

January 11, 2007, National Research Council

Jim Gray

What can we ask with data science?

The white matter matters

From Catani and ffytche, 2015

de Faria et al., 2021

https://en.wikipedia.org/wiki/Axon

Scholz et al., 2009; reviewed in Sampaio-Baptista and Johansen-Berg, 2017

Magnetic Resonance Imaging (MRI)

Diffusion-weighted MRI

Basser (1994), Basser and Pierpaoli (1996)
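The diffusion tensor model associated with these citations can be illustrated with a minimal sketch. It assumes the standard single-tensor signal equation, S = S0 · exp(−b · gᵀDg), and the usual definition of fractional anisotropy (FA) from the tensor eigenvalues. The function names, the example tensor, and the b-value are illustrative assumptions, not taken from the slides.

```python
import math


def dti_signal(s0, bval, bvec, tensor):
    """Predicted diffusion-weighted signal under a single-tensor model.

    s0: signal without diffusion weighting
    bval: b-value (s/mm^2)
    bvec: unit gradient direction, length 3
    tensor: symmetric 3x3 diffusion tensor (mm^2/s), as nested lists
    """
    # Apparent diffusivity along the gradient: the quadratic form g^T D g
    adc = sum(bvec[i] * tensor[i][j] * bvec[j]
              for i in range(3) for j in range(3))
    return s0 * math.exp(-bval * adc)


def fractional_anisotropy(l1, l2, l3):
    """FA from the three tensor eigenvalues (0 = isotropic, 1 = a line)."""
    md = (l1 + l2 + l3) / 3.0  # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0:
        return 0.0
    return math.sqrt(1.5 * num / den)


# Hypothetical tensor with its main axis along x, loosely resembling
# a coherent white matter bundle (values are assumptions).
example_tensor = [[1.7e-3, 0.0, 0.0],
                  [0.0, 0.3e-3, 0.0],
                  [0.0, 0.0, 0.3e-3]]

signal_along = dti_signal(1.0, 1000, [1, 0, 0], example_tensor)
signal_across = dti_signal(1.0, 1000, [0, 1, 0], example_tensor)
fa = fractional_anisotropy(1.7e-3, 0.3e-3, 0.3e-3)
```

In this sketch the signal is attenuated more along the main axis of the tensor than across it, which is the directional contrast that tract tracing relies on.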

Computational tract tracing

Computational dissection

White matter tractometry

Kruper et al., 2021

White matter tractometry

Kruper et al., 2024

Opportunities and challenges in BIG DATA neuroscience

  • Data-driven research
    • Machine learning methods
  • Procedures that are routine with small datasets become nearly impossible with large datasets
  • Conventional statistical significance
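
One possible reading of the last point is that, at very large N, conventional significance thresholds flag even negligible effects. The sketch below assumes a two-sided two-sample z-test with equal group sizes and unit variance. The effect size (Cohen's d = 0.02) and the function name are hypothetical, and the group sizes only echo the sample sizes listed earlier.

```python
import math


def two_sample_p_value(effect_size_d, n_per_group):
    """Two-sided p-value of a z-test for a standardized mean difference.

    Assumes two groups of equal size with unit variance, so that
    z = d * sqrt(n / 2).
    """
    z = effect_size_d * math.sqrt(n_per_group / 2.0)
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2.0))


tiny_effect = 0.02  # hypothetical, practically negligible difference

p_small = two_sample_p_value(tiny_effect, 700)      # modest sample
p_large = two_sample_p_value(tiny_effect, 500_000)  # biobank-scale sample
```

Under these assumptions the same tiny effect is far from significant in the smaller sample and overwhelmingly significant in the larger one, so the p-value alone says little about whether the effect matters.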

Challenges of BIG DATA