This dataset contains all data and code necessary to reproduce the analysis described under the heading "Experiment 3" in the manuscript:
Taliercio, E., Eickholt, D., Read, Q. D., Carter, T., Waldeck, N., & Fallen, B. (2023). Parental choice and seed size impact the uprightness of progeny from interspecific Glycine hybridizations. Crop Science. https://doi.org/10.1002/csc2.21015
The attached files are:
G_max_G_soja_seedweight_seedcolor_analysis.Rmd: RMarkdown notebook containing all analysis code. The CSV data files should be placed in a subdirectory named "data" within the working directory from which the notebook is rendered.
G_max_G_soja_seedweight_seedcolor_analysis.html: Rendered HTML output from RMarkdown notebook, including figures, tables, and explanatory text.
counts_seedwt.csv: CSV file containing the number of progeny selected and average 100-seed weight data for each combination of cross, size class, and replicate. Columns are:
F3_location: text identifier of F3 nursery location, either "CLA" or "FF"
plot: numeric ID of plot
pop: numeric ID of population
max: name of G. max parent
soja: name of G. soja parent
F2_location: text identifier of F2 nursery location, either "Caswell" or "Hugo"
n_planted: number of seeds planted (raw)
n_selected: number of progeny selected
size_ordered: seed size class, to be converted to an ordered factor
size_combined: seed size class aggregated to fewer unique levels
ave_100sw: average 100-seed weight for the given size class
n_planted_trials: number of seeds planted, rounded to the nearest integer
seedcolor.csv: CSV file with additional data on number of seeds of each color by population. Columns are:
cross: text identifier of cross
line: text identifier of line
light: number of light seeds
mid: number of mid-green seeds
brown: number of brown seeds
dark: number of dark or black seeds
population: identifier of population type (F2 derived or selected)
max: name of G. max parent
n_total: sum of the light, mid, brown, and dark columns
soja: name of G. soja parent
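The column descriptions above imply a simple consistency check for seedcolor.csv: n_total should equal the sum of the light, mid, brown, and dark counts in every row. The following is a minimal Python sketch of that check, written independently of the R notebook. The helper names read_rows and color_total_mismatches are hypothetical, the example rows are made up for illustration, and the sketch assumes every count is a plain number with no missing values.

```python
import csv
from pathlib import Path

# Color count columns as named in the seedcolor.csv column list.
COLOR_COLUMNS = ("light", "mid", "brown", "dark")


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by column name."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


def color_total_mismatches(rows):
    """Return the rows whose n_total differs from the sum of the four color counts."""
    bad = []
    for row in rows:
        color_sum = sum(float(row[name]) for name in COLOR_COLUMNS)
        if color_sum != float(row["n_total"]):
            bad.append(row)
    return bad


# Made-up rows for illustration only; these are NOT values from seedcolor.csv.
example_rows = [
    {"light": "3", "mid": "2", "brown": "1", "dark": "4", "n_total": "10"},
    {"light": "5", "mid": "0", "brown": "0", "dark": "1", "n_total": "7"},
]
mismatches = color_total_mismatches(example_rows)
print(len(mismatches))  # the second made-up row sums to 6, not 7
```

With the real file placed as described for the notebook, the same check would be color_total_mismatches(read_rows("data/seedcolor.csv")); an empty list means every row is internally consistent.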
The data processing and analysis pipeline in the RMarkdown notebook includes:
Importing the data (slightly cleaned version is provided)
Creating boxplots of proportion selected by cross, nursery location, and size class
Fitting a logistic GLMM to estimate the probability of selection as a function of parent, 100-seed weight, and their interactions
Extracting and plotting random effect estimates from the model
Calculating and plotting estimated marginal means from the model
Taking contrasts between pairs of estimated marginal means and trends
Calculating Bayes Factors associated with the contrasts
Generating figures and tables for all above results
Additional seed color analysis: importing data (slightly cleaned version is provided)
Additional seed color analysis: drawing exploratory bar plot
Additional seed color analysis: fitting a multinomial GLM modeling the proportion of seeds with each color as a function of population
Additional seed color analysis: generating expected value predictions from the GLM and taking contrasts
Additional seed color analysis: creating figures and tables for model results
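Two quantities recur in the first part of the pipeline above: the observed proportion selected, and the probability of selection estimated by the logistic model. The Python sketch below illustrates both. It is not the notebook's R code. Using n_planted_trials as the denominator is an assumption based on the column descriptions, the function names are hypothetical, and the numbers are made up. The inverse logit is the standard transformation from the log-odds scale, on which a logistic model is fitted, back to a probability.

```python
import math


def proportion_selected(n_selected, n_planted_trials):
    """Observed fraction of planted seeds whose progeny were selected."""
    if n_planted_trials <= 0:
        raise ValueError("n_planted_trials must be positive")
    return n_selected / n_planted_trials


def inverse_logit(log_odds):
    """Convert a value on the log-odds (logit) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up numbers for illustration only; NOT taken from counts_seedwt.csv.
example_proportion = proportion_selected(n_selected=5, n_planted_trials=20)
example_probability = inverse_logit(0.0)
print(example_proportion, example_probability)  # 0.25 0.5
```

A log-odds value of 0 corresponds to a selection probability of 0.5; positive values map above 0.5 and negative values below it.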
This research was funded by CRIS 6070-21220-069-00D, United Soybean Board Project # 2333-203-0101, and falls under National Program NP301.
Resources in this dataset:
Resource Title: RMarkdown document with all analysis code. File Name: G_max_G_soja_seedweight_seedcolor_analysis.Rmd
Resource Title: Rendered HTML version of notebook. File Name: G_max_G_soja_seedweight_seedcolor_analysis.html
Resource Title: Progeny counts and seed weight data. File Name: counts_seedwt.csv
Resource Title: Seed color counts data. File Name: seedcolor.csv