This tutorial demonstrates how to compute and visualize the average
EEG signal across various disorders using mne, pandas, and
seaborn in conjunction with almirah.
First, we'll import the necessary libraries and set the log level for MNE.
import mne
import pandas as pd
import seaborn as sns
from almirah import Dataset
mne.set_log_level(False)
Next, we'll load the dataset and query the EEG files.
ds = Dataset(name="calm-brain")
eeg_header_files = ds.query(datatype="eeg", task="rest", extension=".vhdr")
eeg_data_files = ds.query(datatype="eeg", task="rest", extension=".eeg")
len(eeg_data_files)
This should give the total number of EEG files:
1120
We then download the EEG data files.
for file in eeg_data_files:
    file.download()
We connect to the database and query the presenting disorders table.
db = ds.components[2]
db.connect("username", "password")
df = ds.query(table="presenting_disorders")
df[["subject", "session", "addiction"]].head()
This displays the first few rows of the queried table in a DataFrame format.
| | subject | session | addiction |
|---|---|---|---|
| 0 | D0019 | 101 | 0 |
| 1 | D0019 | 111 | 0 |
| 2 | D0020 | 101 | 0 |
| 3 | D0020 | 111 | <NA> |
| 4 | D0021 | 101 | 0 |
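The `<NA>` entry in the fourth row marks a missing value rather than a 0. The sketch below uses a made-up table to show one way to turn such a row into labels while treating missing values as "not present". The subject IDs, the two-column layout, and the `label_disorders` helper are illustrative assumptions, not part of almirah or the dataset. Checking `pd.isna` first matters because evaluating `pd.NA` in a boolean context raises a `TypeError`.

```python
import pandas as pd

# Hypothetical stand-in for the presenting_disorders table; all values are made up.
toy_df = pd.DataFrame(
    {
        "subject": ["S01", "S01", "S02"],
        "session": ["101", "111", "101"],
        "addiction": pd.array([1, 0, pd.NA], dtype="Int64"),
        "ocd": pd.array([1, 0, 0], dtype="Int64"),
    }
)

DISORDER_COLUMNS = ["addiction", "ocd"]


def label_disorders(table, subject, session):
    """Return disorder labels for one subject/session, or None if no row matches."""
    rows = table[(table["subject"] == subject) & (table["session"] == session)]
    if rows.empty:
        return None
    row = rows.iloc[0]
    # Test for a missing value before truth-testing: bool(pd.NA) raises TypeError.
    found = [c for c in DISORDER_COLUMNS if not pd.isna(row[c]) and row[c]]
    return found if found else ["healthy"]


print(label_disorders(toy_df, "S01", "101"))
print(label_disorders(toy_df, "S02", "101"))
print(label_disorders(toy_df, "S03", "101"))
```

Here the `S02` row has a missing `addiction` value and a 0 for `ocd`, so it falls through to the "healthy" label. Whether a missing entry should count as healthy is a design choice and depends on how the table was collected.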
We define functions to compute the mean EEG signal and retrieve the disorders.
def get_eeg_mean(file):
    # Average over all channels and time points of the recording.
    raw = mne.io.read_raw_brainvision(file.path)
    return raw.get_data().mean()


def get_disorders(file):
    disorders = []
    subject, session = file.tags["subject"], file.tags["session"]
    filtered_df = df[(df["subject"] == subject) & (df["session"] == session)]
    if filtered_df.empty:
        # No clinical record for this recording: report it and skip.
        print(subject, session)
        return None
    for column in ["addiction", "bipolar", "dementia", "ocd", "schizophrenia"]:
        presence = filtered_df.iloc[0][column]
        # Missing values (<NA>) count as "not present".
        if not pd.isna(presence) and presence:
            disorders.append(column)
    return disorders if disorders else ["healthy"]


def file_func(file):
    mean_eeg, disorders = get_eeg_mean(file), get_disorders(file)
    if not disorders:
        return pd.DataFrame()
    # One row per disorder, each carrying the same recording-level mean.
    mean_df = pd.DataFrame({"mean": [mean_eeg] * len(disorders), "disorder": disorders})
    return mean_df.dropna()
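To make explicit what the single number returned by `get_eeg_mean` represents: `raw.get_data()` returns an array of shape `(n_channels, n_times)`, and calling `.mean()` with no axis collapses both dimensions. The sketch below uses a synthetic NumPy array in place of a real recording (the shape, noise level, and offset are arbitrary assumptions) to contrast that grand mean with per-channel means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for raw.get_data(): 4 channels x 1000 samples (made-up values).
data = rng.normal(loc=0.0, scale=1e-5, size=(4, 1000))
data[2] += 5e-5  # give one channel a constant offset

grand_mean = data.mean()           # one scalar over all channels and samples
channel_means = data.mean(axis=1)  # one value per channel

print(grand_mean)
print(channel_means)
```

Because every channel has the same number of samples, the grand mean equals the average of the per-channel means, so an offset on a single channel shifts the summary for the whole recording.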
We process the EEG header files to compute the mean EEG signal and retrieve the disorders.
mean_dfs = list(map(file_func, eeg_header_files))
mean_dfs = [frame for frame in mean_dfs if not frame.empty]
mean_df = pd.concat(mean_dfs, ignore_index=True, sort=False)
mean_df.head()
This displays the first few rows of the combined DataFrame.
| | mean | disorder |
|---|---|---|
| 0 | -0.008766 | healthy |
| 1 | 0.000457 | addiction |
| 2 | -0.006335 | healthy |
| 3 | -0.002764 | healthy |
| 4 | -0.008269 | ocd |
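One detail about the combined frame is that a recording from a participant with several disorders contributes one row per disorder, each carrying the same mean. The sketch below uses two made-up per-file frames (the values are invented for illustration) to show this, and how `ignore_index` in `pd.concat` affects the row labels of the result.

```python
import pandas as pd

# Made-up per-file frames shaped like the output of file_func.
per_file = [
    pd.DataFrame({"mean": [0.002] * 2, "disorder": ["addiction", "ocd"]}),
    pd.DataFrame({"mean": [-0.001], "disorder": ["healthy"]}),
]

stacked = pd.concat(per_file)                        # keeps each frame's own index
renumbered = pd.concat(per_file, ignore_index=True)  # assigns a fresh 0..n-1 index

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

A continuous index makes `head()` output easier to read and avoids duplicate row labels in later steps.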
We compute the mean EEG signal for each disorder.
mean_df.groupby("disorder").mean()
This displays the mean EEG signal for each disorder.
| disorder | mean |
|---|---|
| addiction | 0.003414 |
| bipolar | 0.001613 |
| dementia | 0.010485 |
| healthy | 0.002449 |
| ocd | -0.000875 |
| schizophrenia | 0.005444 |
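Group means alone say nothing about how many recordings back each value or how spread out they are. As a sketch on made-up numbers (the `toy_means` frame is illustrative, not taken from the dataset), `agg` can report the mean, standard deviation, and count for each group in one call.

```python
import pandas as pd

# Made-up per-recording means; not values from the dataset.
toy_means = pd.DataFrame(
    {
        "mean": [0.002, 0.004, -0.001, 0.001, 0.003],
        "disorder": ["addiction", "addiction", "healthy", "healthy", "healthy"],
    }
)

# One row per disorder, with the group mean, standard deviation, and size.
summary = toy_means.groupby("disorder")["mean"].agg(["mean", "std", "count"])
print(summary)
```

The same call applied to `mean_df` would show whether differences between groups are large relative to the within-group spread.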
We visualize the distribution of the mean EEG signal for each disorder using a violin plot.
ax = sns.violinplot(data=mean_df, x="mean", hue="disorder")
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))
This generates a violin plot of the per-recording mean EEG signal, with one violin per disorder.
This concludes the tutorial. You've learned how to combine different modalities from the same dataset, here EEG recordings and clinical records, in a single analysis.