Box and Violin Plots#
VueCore is a Python package for creating interactive and static visualizations of multi-omics data. It is part of a broader ecosystem of tools—including ACore for data processing and VueGen for automated reporting—that together enable end-to-end workflows for omics analysis.
This notebook demonstrates how to generate box and violin plots using plotting functions from VueCore. We showcase basic and advanced plot configurations, highlighting key customization options such as grouping, color mapping, text annotations, and export to multiple file formats.
Notebook structure#
First, we will set up the work environment by installing the necessary packages, importing the required libraries, and creating sample data. Next, we will create basic and advanced box and violin plots.
Credits and Contributors#
This notebook was created by Sebastián Ayala-Ruano under the supervision of Henry Webel and Alberto Santos, head of the Multiomics Network Analytics Group (MoNA) at the Novo Nordisk Foundation Center for Biosustainability (DTU Biosustain).
You can find more details about the project in this GitHub repository.
0. Work environment setup#
0.1. Installing libraries and creating global variables for platform and working directory#
To run this notebook locally, you should create a virtual environment with the required libraries. If you are running this notebook on Google Colab, everything should be set.
# VueCore library
%pip install vuecore
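The plotting cells later in this notebook reference an `output_dir` variable that is not defined in the code shown here. The following is a minimal sketch of how the global variables named in this subsection's heading could be set up. The `IS_COLAB` name, the Colab check, and the `./outputs` location are assumptions; the location is chosen only to match the paths printed in the save messages.

```python
import sys
from pathlib import Path

# Assumption: Google Colab is detected by checking whether its module is loaded
IS_COLAB = "google.colab" in sys.modules

# Assumption: plots are written to ./outputs, matching the logged save paths
output_dir = Path("./outputs")
output_dir.mkdir(parents=True, exist_ok=True)
```

Creating the directory up front avoids a failed export if the plotting function does not create missing folders itself.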
0.2. Importing libraries#
from pathlib import Path
import numpy as np
import pandas as pd
from vuecore.plots.basic.box import create_box_plot
from vuecore.plots.basic.violin import create_violin_plot
0.3. Create sample data#
We create a synthetic dataset simulating gene expression levels across different patient samples and treatment conditions, with each data point representing a unique gene’s expression level under a specific treatment for a particular patient.
| | Sample_ID | Treatment | Gene_ID | Expression |
|---|---|---|---|---|
| 0 | Patient C | Control | Gene_628 | 119.462927 |
| 1 | Patient D | Treated | Gene_587 | 179.718304 |
| 2 | Patient A | Control | Gene_1444 | 57.751992 |
| 3 | Patient C | Control | Gene_1446 | 42.546490 |
| 4 | Patient C | Treated | Gene_104 | 152.350698 |
| ... | ... | ... | ... | ... |
| 195 | Patient B | Treated | Gene_697 | 171.537545 |
| 196 | Patient B | Treated | Gene_1309 | 178.460515 |
| 197 | Patient D | Treated | Gene_1244 | 175.596790 |
| 198 | Patient A | Treated | Gene_1093 | 187.279028 |
| 199 | Patient C | Control | Gene_1071 | 60.882805 |
200 rows × 4 columns
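The code that generates this dataset is not shown above. One way to build a DataFrame with the same columns and shape is sketched below. The seed, the gene ID range, and the distribution parameters are assumptions made for illustration, so the values will not match the preview table.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed for reproducibility; not the notebook's original seed
rng = np.random.default_rng(42)
n_rows = 200

sample_ids = rng.choice(
    ["Patient A", "Patient B", "Patient C", "Patient D"], size=n_rows
)
treatments = rng.choice(["Control", "Treated"], size=n_rows)

# Draw gene numbers without replacement so that every row is a unique gene
gene_numbers = rng.choice(np.arange(1, 1501), size=n_rows, replace=False)
gene_ids = [f"Gene_{number}" for number in gene_numbers]

# Assumption: treated samples are centred on a higher mean than controls
means = np.where(treatments == "Treated", 175.0, 100.0)
sds = np.where(treatments == "Treated", 15.0, 30.0)
expression = rng.normal(loc=means, scale=sds).clip(min=0)

gene_exp_df = pd.DataFrame(
    {
        "Sample_ID": sample_ids,
        "Treatment": treatments,
        "Gene_ID": gene_ids,
        "Expression": expression,
    }
)
```

Calling `gene_exp_df.head()` afterwards gives a preview in the same layout as the table above.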
1. Basic Box Plot#
A basic box plot can be created with create_box_plot by providing the x and y columns from the DataFrame, along with style options such as title.
# Define output file path for the PNG basic box plot
file_path_basic_box_png = Path(output_dir) / "box_plot_basic.png"
# Generate the basic box plot
box_plot_basic = create_box_plot(
    data=gene_exp_df,
    x="Treatment",
    y="Expression",
    title="Gene Expression Levels by Treatment",
    file_path=file_path_basic_box_png,
)
box_plot_basic.show()
[VueCore] Plot saved to outputs/box_plot_basic.png
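To make explicit what each box encodes, the sketch below computes the usual box plot statistics per treatment with pandas: quartiles, median, and whiskers that extend to the most extreme points within 1.5 × IQR of the box. The small dataset and the `box_summary` helper are hypothetical. Quartile conventions differ between libraries, so the numbers need not match the rendered plot exactly.

```python
import pandas as pd

# Hypothetical toy data, not the notebook's dataset
df = pd.DataFrame(
    {
        "Treatment": ["Control"] * 5 + ["Treated"] * 5,
        "Expression": [40.0, 60.0, 100.0, 120.0, 140.0,
                       150.0, 170.0, 175.0, 180.0, 190.0],
    }
)


def box_summary(values: pd.Series) -> pd.Series:
    """Return quartiles and whisker ends using the 1.5 * IQR rule."""
    q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences
    inside = values[(values >= lower_fence) & (values <= upper_fence)]
    return pd.Series(
        {
            "q1": q1,
            "median": median,
            "q3": q3,
            "whisker_low": inside.min(),
            "whisker_high": inside.max(),
        }
    )


# One row of statistics per treatment group
summary = pd.DataFrame(
    {name: box_summary(group) for name, group in df.groupby("Treatment")["Expression"]}
).T
print(summary)
```

In this toy data the lowest treated value falls below the lower fence, so it would be drawn as an individual outlier point rather than as part of the whisker.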
2. Basic Violin Plot#
A basic violin plot can be created with create_violin_plot by providing the x and y columns from the DataFrame, along with style options such as title.
# Define output file path for the PNG basic violin plot
file_path_basic_violin_png = Path(output_dir) / "violin_plot_basic.png"
# Generate the basic violin plot
violin_plot_basic = create_violin_plot(
    data=gene_exp_df,
    x="Treatment",
    y="Expression",
    title="Gene Expression Levels by Treatment",
    file_path=file_path_basic_violin_png,
)
violin_plot_basic.show()
[VueCore] Plot saved to outputs/violin_plot_basic.png
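The outline of a violin is a kernel density estimate of the values in each group. As a rough illustration of that idea, the sketch below computes a Gaussian kernel density estimate with NumPy. The `gaussian_kde_1d` helper and the simulated sample are hypothetical, and the bandwidth uses Silverman's rule of thumb, which is one common default; whether VueCore uses the same rule is an assumption.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on the given grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # z[i, j]: standardized distance between grid point i and sample j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernel.sum(axis=1) / (samples.size * bandwidth)


# Hypothetical sample standing in for one treatment group
rng = np.random.default_rng(0)
samples = rng.normal(loc=175.0, scale=15.0, size=100)

# Silverman's rule of thumb for the bandwidth
sd = samples.std(ddof=1)
q75, q25 = np.percentile(samples, [75, 25])
bandwidth = 0.9 * min(sd, (q75 - q25) / 1.34) * samples.size ** (-1 / 5)

# Evaluate the density slightly beyond the data range to capture the tails
grid = np.linspace(samples.min() - 4 * bandwidth, samples.max() + 4 * bandwidth, 400)
density = gaussian_kde_1d(samples, grid, bandwidth)

# Trapezoidal integration; a valid density integrates to roughly 1
area = float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid)))
```

Mirroring `density` around the category axis gives the familiar violin shape; a smaller bandwidth produces a bumpier outline, a larger one a smoother outline.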
3. Advanced Box Plot#
Here is an example of an advanced box plot that uses additional parameters, including color and box grouping, notches, text annotations, hover tooltips, and export to HTML.
# Define the output file path for the advanced HTML box plot
file_path_adv_box_html = Path(output_dir) / "box_plot_advanced.html"
# Generate the advanced box plot
box_plot_adv = create_box_plot(
    data=gene_exp_df,
    x="Treatment",
    y="Expression",
    color="Sample_ID",
    boxmode="group",
    notched=True,
    title="Gene Expression Levels with Control and Treatment Condition",
    subtitle="Distribution of gene expression across different treatments and patient samples",
    labels={
        "Treatment": "Treatment",
        "Expression": "Gene Expression",
        "Sample_ID": "Patient Sample ID",
    },
    color_discrete_map={
        "Patient A": "#508AA8",
        "Patient B": "#A8505E",
        "Patient C": "#86BF84",
        "Patient D": "#A776AF",
    },
    category_orders={"Sample_ID": ["Patient A", "Patient B", "Patient C", "Patient D"]},
    hover_data=["Gene_ID"],
    file_path=file_path_adv_box_html,
)
box_plot_adv.show()
[VueCore] Plot saved to outputs/box_plot_advanced.html
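The `notched=True` option draws a notch around each median as a visual confidence interval. A widely used convention is median ± 1.57 × IQR / √n; the sketch below computes that interval with NumPy. The `notch_interval` helper and the example values are hypothetical, and the assumption that VueCore's notches follow this exact formula is not confirmed by this notebook.

```python
import numpy as np


def notch_interval(values):
    """Return (low, high) for the notch: median +/- 1.57 * IQR / sqrt(n)."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    half_width = 1.57 * (q3 - q1) / np.sqrt(values.size)
    return median - half_width, median + half_width


# Hypothetical expression values for a single box
low, high = notch_interval([40, 60, 100, 120, 140, 160, 180, 200, 220])
print(f"Notch spans {low:.2f} to {high:.2f}")
```

When the notches of two boxes do not overlap, that is commonly read as informal evidence that their medians differ. With few observations per group the notch can extend past the quartiles, which makes the box look folded.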
4. Advanced Violin Plot#
Here is an example of an advanced violin plot that uses additional parameters, including color and violin grouping, outlier points, text annotations, hover tooltips, and export to HTML.
# Define the output file path for the advanced HTML violin plot
file_path_adv_violin_html = Path(output_dir) / "violin_plot_advanced.html"
# Generate the advanced violin plot
violin_plot_adv = create_violin_plot(
    data=gene_exp_df,
    x="Treatment",
    y="Expression",
    color="Sample_ID",
    violinmode="group",
    points="outliers",
    title="Gene Expression Levels with Control and Treatment Condition",
    subtitle="Distribution of gene expression across different treatments and patient samples",
    labels={
        "Treatment": "Treatment",
        "Expression": "Gene Expression",
        "Sample_ID": "Patient Sample ID",
    },
    color_discrete_map={
        "Patient A": "#508AA8",
        "Patient B": "#A8505E",
        "Patient C": "#86BF84",
        "Patient D": "#A776AF",
    },
    category_orders={"Sample_ID": ["Patient A", "Patient B", "Patient C", "Patient D"]},
    hover_data=["Gene_ID"],
    file_path=file_path_adv_violin_html,
)
violin_plot_adv.show()
[VueCore] Plot saved to outputs/violin_plot_advanced.html