Author: Jose Crossa

Additive genetic variance and covariance between relatives in wheat crosses with variable parental ploidy levels

Jose Crossa (2020)

Synthetic hexaploid wheat was developed and used in breeding to introduce new genetic diversity into bread wheat through interspecific hybridization of T. tauschii (diploid) and durum wheat T. turgidum (tetraploid), which produces synthetic derivatives. One may therefore infer that the genetic variances of native wild populations and improved wheat may differ because of their different origins and evolutionary histories. We investigate this idea by partitioning the additive variance of grain yield with respect to breed origin, using data from a synthetic derivative. Such information is needed to predict breeding values of synthetic derivatives and their parental populations. A mixed model with a heterogeneous covariance structure for breeding values was used to estimate variance components with a program written by us. The data came from a multi-year, multi-location field trial of synthetic derivatives from the International Maize and Wheat Improvement Center (CIMMYT). Bayesian estimates of the additive variance of grain yield were similar for T. turgidum (0.0225) and T. tauschii (0.0208), but both differed strikingly from that of T. aestivum (0.0131). Segregation variances were greater than zero, indicating differences in gene frequencies between pure breeds. Broad-sense heritability of the 25% synthetic derivative breed group was estimated at 0.66. Overall, our results support the suitability of models with heterogeneous additive genetic variances for predicting breeding values in wheat crosses with variable ploidy levels.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Supplemental Materials for The Relative Efficiency of Three Constrained Multistage Linear Phenotypic Selection Indices

Jose Crossa (2018)

This dataset provides supplemental information related to an investigation of constrained multistage linear phenotypic selection indices.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Mean Phenotypic and Genotypic data for: Genome-wide association mapping and genomic prediction of anther extrusion in CIMMYT hybrid wheat

BHOJA BASNET, Jose Crossa, Susanne Dreisigacker (2020)

A genome-wide association scan (GWAS) was conducted, and the possibility of applying genomic prediction (GP) to anther extrusion (AE) in the CIMMYT hybrid wheat breeding program was explored.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Sparse designs for genomic selection using multi-environment data

Yoseph Beyene, Juan Burgueño, Jose Crossa (2020)

This research studies the genomic-enabled prediction accuracy of the following sparse testing allocation designs: (1) all lines non-overlapping (0 overlapping), so that each line is tested in only some environments; (2) all lines overlapping (0 non-overlapping), so that every line is tested in all the environments; and (3) combinations of the two previous cases, in which certain numbers of non-overlapping (NO) and overlapping (O) lines were distributed across the environments. We also studied cases where the size of the testing population was decreased. The study used two large maize data sets (T1 and T2). Four genomic-enabled prediction models were studied: two models (M1 and M2) that do not include the genomic × environment interaction (GE), and two models (M3 and M4) that incorporate two forms of modeling GE. The results show that, for both data sets, the genome-based models including GE (M3 and M4) captured more genetic variability through the GE component than the other models. Models M3 and M4 also provided higher prediction accuracy than models M1 and M2 across the allocation designs comprising different combinations of NO/O lines in environments. The results indicate that substantial savings of testing resources can be achieved by optimizing the allocation design using genome-based models that include the genomic × environment interaction.
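The three allocation cases can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `allocate_lines`, the random choice of overlapping lines, and the round-robin assignment of non-overlapping lines are all assumptions made for illustration. Setting `n_overlap` to 0 gives case (1), setting it to the number of lines gives case (2), and any value in between gives case (3).

```python
import random


def allocate_lines(lines, n_envs, n_overlap, seed=0):
    """Assign lines to environments for a sparse testing design.

    Hypothetical helper: n_overlap lines are tested in every
    environment; each remaining line is tested in exactly one.
    """
    if not 0 <= n_overlap <= len(lines):
        raise ValueError("n_overlap must be between 0 and len(lines)")
    if n_envs < 1:
        raise ValueError("n_envs must be at least 1")

    shuffled = list(lines)
    # Assumption: overlapping lines are chosen at random.
    random.Random(seed).shuffle(shuffled)

    overlapping = shuffled[:n_overlap]
    non_overlapping = shuffled[n_overlap:]

    # Every environment starts with the full set of overlapping lines.
    design = {env: list(overlapping) for env in range(n_envs)}
    # Deal the non-overlapping lines out round-robin, one environment each.
    for i, line in enumerate(non_overlapping):
        design[i % n_envs].append(line)
    return design


lines = [f"L{i}" for i in range(12)]
case1 = allocate_lines(lines, n_envs=3, n_overlap=0)   # all non-overlapping
case2 = allocate_lines(lines, n_envs=3, n_overlap=12)  # all overlapping
case3 = allocate_lines(lines, n_envs=3, n_overlap=6)   # mixed NO/O design

for name, design in [("case 1", case1), ("case 2", case2), ("case 3", case3)]:
    # Number of plots (lines tested) per environment under each design.
    print(name, {env: len(tested) for env, tested in design.items()})
```

In this toy setting, the number of lines tested per environment shrinks as the overlap decreases, which is the source of the resource savings that sparse testing aims for.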

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY

Deep learning genomic-enabled prediction of plant traits

Osval Antonio Montesinos-Lopez, Jose Crossa (2018)

Machine learning (ML) is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" (i.e., progressively improve performance on a specific task) from data, without being explicitly programmed to do so. ML is closely related to (and often overlaps with) computational statistics, which also focuses on making predictions through the use of computers. In general, ML explores algorithms that can learn from current data and make predictions on new data by building a model from sample inputs. The fields of statistics and ML share a common root and will continue to come closer together in the future. In this paper we explore the novel deep learning (DL) methodology in the context of genomic selection. DL models with a densely connected network architecture were compared with one of the most often used genome-enabled prediction models, genomic best linear unbiased prediction (GBLUP). We used nine published real genomic data sets to compare the models and obtain a “meta picture” of the performance of DL models with a densely connected network architecture.

Dataset

AGRICULTURAL SCIENCES AND BIOTECHNOLOGY