MAGIC: Multi-scAle heteroGeneity analysIs and Clustering
MAGIC, Multi-scAle heteroGeneity analysIs and Clustering, is a multi-scale semi-supervised clustering method that aims to derive robust clustering solutions across different scales for brain diseases. Compared to original HYDRA method, MAGIC has the following advantages:
In order to run MAGIC, one must have already installed and ran [SOPNMF (https://github.com/anbai106/SOPNMF) with the voxel-wise image data. After this, please follow the following steps for installation.
There are three choices to install MAGIC.
We recommend the users to use Conda virtual environment:
1) conda create --name MAGIC python=3.6
Activate the virtual environment:
2) source activate MAGIC
Install other python package dependencies (go to the root folder of MAGIC):
3) ./install_requirements.sh
Finally, we need install MAGIC from PyPi:
3) pip install magiccluster==0.0.3
After installing all dependencies in the requirements.txt file, go to the root folder of MAGIC where the setup.py locates:
pip install -e .
python -m pip install git+https://github.com/anbai106/MAGIC.git
MAGIC requires a specific input structure inspired by BIDS. Some conventions for the group label/diagnosis: -1 represents healthy control (CN) and 1 represents patient (PT); categorical variables, such as sex, should be encoded to numbers: Female for 0 and Male for 1, for instance.
The first 3 columns are participant_id, session_id and diagnosis.
Example for feature tsv:
participant_id session_id diagnosis
sub-CLNC0001 ses-M00 -1 432.1
sub-CLNC0002 ses-M00 1 398.2
sub-CLNC0003 ses-M00 -1 412.0
sub-CLNC0004 ses-M00 -1 487.4
sub-CLNC0005 ses-M00 1 346.5
sub-CLNC0006 ses-M00 1 443.2
sub-CLNC0007 ses-M00 -1 450.2
sub-CLNC0008 ses-M00 1 443.2
Example for covariate tsv:
participant_id session_id diagnosis age sex ...
sub-CLNC0001 ses-M00 -1 56.1 0
sub-CLNC0002 ses-M00 1 57.2 0
sub-CLNC0003 ses-M00 -1 43.0 1
sub-CLNC0004 ses-M00 -1 25.4 1
sub-CLNC0005 ses-M00 1 74.5 1
sub-CLNC0006 ses-M00 1 44.2 0
sub-CLNC0007 ses-M00 -1 40.2 0
sub-CLNC0008 ses-M00 1 43.2 1
We offer a fake dataset in the folder of MAGIC/data. Users should follow the same data structure.
from from magic.magic_clustering import clustering
participant_tsv="MAGIC/data/participant.tsv"
opnmf_dir = "PATH_OPNMF_DIR"
output_dir = "PATH_OUTPUT_DIR"
k_min=2
k_max=8
cv_repetition=100
clustering(participant_tsv, opnmf_dir, output_dir, k_min, k_max, 25, 60, 5, cv_repetition)
:warning: Please let me know if you use this package for your publication; I will update your papers in the section of Publication using MAGIC…
:warning: Please cite the software using the Cite this repository button on the right sidebar menu, as well as the original papers below …
Wen J., Varol E., Chand G., Sotiras A., Davatzikos C. (2020) MAGIC: Multi-scale Heterogeneity Analysis and Clustering for Brain Diseases. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science, vol 12267. Springer, Cham. https://doi.org/10.1007/978-3-030-59728-3_66
Wen J., Varol E., Chand G., Sotiras A., Davatzikos C. (2022) Multi-scale semi-supervised clustering of brain images: Deriving disease subtypes. Medical Image Analysis, 2022. https://doi.org/10.1016/j.media.2021.102304 - Link