Welcome to Metabolomic Selection for Enhanced Fruit Flavor

This repository contains the data and scripts used to reproduce the analyses in the manuscript "Metabolomic Selection for Enhanced Fruit Flavor", available on bioRxiv.

Abstract
Figures
1. Figure 1
2. Figure 2
3. Figure 3
4. Figure 4
5. Figure 5
6. Figure 6

Abstract

"Consumers often regard heirloom fruit varieties grown in the garden as more flavorful than commercial varieties purchased at the grocery store. While plant breeders have historically focused on improving producer-orientated traits such as yield, consumer-oriented traits such as flavor have regularly been neglected. This is, in part, due to the difficulty associated with measuring the sensory perceptions of flavor. Here, we combine fruit chemical and consumer sensory panel information to train machine learning models that can predict how flavorful a fruit will be from its chemistry. By increasing the throughput of flavor evaluations, these models will help plant breeders to integrate flavor earlier in the breeding pipeline and aid in the design of varieties with exceptional flavor profiles."

Figures

This section walks through each figure and lists the scripts used to generate the underlying analysis. We often run the analysis in one script and build the figure component in another, then combine the components in Inkscape.

Figure 1

To generate this figure, we start by preprocessing the data from the supplemental files with default choices for imputation and scaling:

[0.preprocessing.R]
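
The README does not say which imputation and scaling defaults 0.preprocessing.R applies. As a rough illustration only, the Python sketch below assumes column-mean imputation followed by z-score scaling (sample standard deviation, the same convention as R's `scale()`). The function name and the toy data are hypothetical and are not taken from the repository.

```python
import math

def impute_and_scale(rows):
    """Column-wise mean imputation followed by z-score scaling.

    `rows` is a list of samples; each sample is a list of metabolite
    concentrations, with None marking a missing value (an assumption
    about how missingness is encoded).
    """
    n_cols = len(rows[0])
    out = [list(r) for r in rows]
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = sum(observed) / len(observed)
        # Replace missing entries with the mean of the observed values.
        col = [mean if r[j] is None else r[j] for r in rows]
        # Sample standard deviation (n - 1 denominator).
        var = sum((x - mean) ** 2 for x in col) / (len(col) - 1)
        sd = math.sqrt(var)
        for i, x in enumerate(col):
            # Constant columns carry no information; map them to zero.
            out[i][j] = (x - mean) / sd if sd > 0 else 0.0
    return out

# Toy data: 3 samples x 2 metabolites, one missing value.
data = [[1.0, 10.0], [None, 14.0], [3.0, 12.0]]
scaled = impute_and_scale(data)
```

In this toy example the missing value is replaced by the column mean, so it scales to zero.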

Next we create the metabolite network using the WGCNA package:

[1.a.wgcna_tomato.R]
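
For readers unfamiliar with WGCNA, the core idea of an unsigned weighted co-expression network can be sketched as raising absolute pairwise correlations to a soft-thresholding power. The Python sketch below is illustrative only: the power `beta=6` and the toy metabolite profiles are assumptions, and the settings actually used in 1.a.wgcna_tomato.R may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(columns, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta.

    `columns` holds one list of concentrations per metabolite.
    `beta` is a hypothetical soft-thresholding power.
    """
    p = len(columns)
    adj = [[1.0] * p for _ in range(p)]  # a metabolite is fully similar to itself
    for i in range(p):
        for j in range(i + 1, p):
            a = abs(pearson(columns[i], columns[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj

# Toy profiles for four hypothetical metabolites across four samples.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the first
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with the first
    [1.0, 3.0, 2.0, 4.0],  # moderately correlated with the first
]
adj = soft_threshold_adjacency(columns)
```

Because the network is unsigned, the anti-correlated pair receives the same edge weight as the perfectly correlated pair, while the moderate correlation is shrunk toward zero by the power.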

Then we plot the tomato volatile concentration violin plots in panel b:

[1.b.metabolite_histograms.R]

The Cytoscape session file below visualizes the results from 1.a.wgcna_tomato.R and 2.a.wgcna_blueberry.R. It was also used to compute betweenness centrality statistics:

[./results/fig1/asPublished_metabolite_networks.cys]
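
Betweenness centrality counts how often a node lies on shortest paths between other nodes. The Python sketch below uses Brandes' algorithm on an unweighted, undirected graph to show what the statistic measures. It is not the Cytoscape computation itself, the node labels are hypothetical, and the values are unnormalized, so they may be scaled differently from what Cytoscape reports.

```python
from collections import deque

def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm.

    `graph` maps each node to a list of its neighbours and must be
    symmetric (undirected, unweighted).
    """
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

# Hypothetical path network: m1 - m2 - m3 - m4
graph = {"m1": ["m2"], "m2": ["m1", "m3"], "m3": ["m2", "m4"], "m4": ["m3"]}
scores = betweenness_centrality(graph)
```

In the path example, the two interior nodes each sit on two shortest paths between other nodes, and the end nodes sit on none.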

Figure 2

Creating the blueberry metabolite network:

[2.a.wgcna_blueberry.R]

Plotting the blueberry volatile concentration violin plots in panel b:

[2.b.metabolite_histograms.R]

The blueberry Cytoscape visualizations are included in the Cytoscape network file listed under Figure 1.

Figure 3

Calculating contributions of volatile classes to variance in flavor ratings using linear mixed modeling:

[3.a.variance_decomposition.R]

Figure 4

A) Training metabolomic selection models

[4.a.1.metabolomic_selection_tomato.R]

B) Comparing genomic selection and metabolomic selection models

These models were run on our HiPerGator cluster. Each analysis uses three scripts: the first bash script launches the jobs for cross-validation and replication, the second bash script sets up the environment in which the R jobs run, and the R script performs the computation.

[4.b.1.genomic_selection_tomato.sh]
[4.b.2.genomic_selection_tomato.sh]
[4.b.3.genomic_selection_tomato.R]
[4.b.4.metabolomic_selection_tomato.sh]
[4.b.5.metabolomic_selection_tomato.sh]
[4.b.6.metabolomic_selection_tomato.R]
[4.b.7.gblup_plots.R]
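
The launcher scripts above fan out one job per cross-validation fold and replicate. The Python sketch below illustrates that job grid under stated assumptions: the fold count, replicate count, seed handling, and variety labels are all hypothetical, and the repository's bash and R scripts may partition the data differently.

```python
import random

def cv_jobs(sample_ids, n_folds=5, n_reps=3, seed=0):
    """Enumerate (replicate, fold) jobs for repeated k-fold cross-validation.

    Each replicate reshuffles the samples with its own seed, so the fold
    assignment differs between replicates but stays reproducible.
    """
    jobs = []
    for rep in range(n_reps):
        rng = random.Random(seed + rep)
        ids = list(sample_ids)
        rng.shuffle(ids)
        # Deal samples into folds round-robin so fold sizes differ by at most one.
        folds = [ids[i::n_folds] for i in range(n_folds)]
        for k, test in enumerate(folds):
            train = [s for f in folds if f is not test for s in f]
            jobs.append({"rep": rep, "fold": k, "train": train, "test": test})
    return jobs

# Twenty hypothetical variety labels.
varieties = [f"variety_{i:02d}" for i in range(20)]
jobs = cv_jobs(varieties)
```

Each dictionary in `jobs` corresponds to one unit of work that a cluster scheduler could run independently, with the model fit on `train` and evaluated on `test`.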

C) Evaluating how many fruit varieties are needed to train accurate metabolomic selection models

Figure 5

Using models trained on all varieties to calculate weights for inference

Figure 6
