TraitCapture: Genomic modelling for plant phenomics under environmental stress
- TraitCapture publication on GWAS in Current Opinion in Plant Biology (2014, 18:73–79).
- Supercomputer resources for TraitCapture pipeline proposal (2014)
Our work integrates quantitative genetic analysis and functional structural plant models (FSPMs) with phenomics, permitting:
1. the identification of composite phenotypes that constitute heritable genetic traits;
2. the determination of the genetic architecture allowing phenotypic prediction; and,
3. the modelling of the interaction with the environment to select optimal genotypes.
The proposal will address current limitations in phenomics by providing visualization tools; image processing into quantitative values; multi-trait association with genomic data to provide mechanistic insight; and incorporation of environmental variation into growth models that integrate genotype and environment to predict phenotype.
The field of plant phenomics has emerged from traditional plant physiology with high resolution, non-invasive imaging technologies. Phenomics experiments record time-series data on plant functional traits as well as top-down and 3D models of plant growth and development. This comprehensive, multi-dimensional phenotyping allows specific hypotheses about genotypic and/or environmental effects to be tested. It allows testing of multiple variables for differential effects on complex traits such as photosynthetic rates and water use efficiency. High throughput phenotyping (HTP) applies the phenomics approach across hundreds or thousands of plants, enabling screening for large effect qualitative genetic mutations in many traits.
New plant phenomics facilities are opening worldwide following the lead of the Australian Plant Phenomics Facility including our partner on this project, the High Resolution Plant Phenomics Centre (HRPPC) at CSIRO in Canberra. Likewise, many smaller labs are developing their own high throughput systems for phenotyping plants (Zhang et al, 2012). The use of HTP for genetic analysis of complex quantitative traits has been proposed (Furbank and Tester, 2011), but this approach has not been widely applied due to the complex analysis requirements of quantifying individual phenotypes from images of hundreds of plants and associating them with whole genome sequence variation (Topp et al, 2013). Consequently, both larger facilities and smaller labs lack the ability to identify genetically heritable traits from high throughput phenotypic data. Genome Wide Association Studies (GWAS) are a powerful new method for trait dissection that we have recently demonstrated for Arabidopsis (Li et al, 2010). Integrating HTP and GWAS approaches into a single “seeds to traits” phenomics pipeline has the potential to revolutionize the rate of trait discovery. Understanding the genetic basis of complex plant traits such as growth and yield allows elite varieties to be genomically selected based on their predicted phenotypes.
Functional Structural Plant Models (FSPMs) have traditionally been used in an agricultural context to simulate aspects of plant response and growth as governed by physiological processes, which are in turn driven by local environmental conditions at the plant organ level (Hanan, 1997; Godin and Sinoquet, 2005). These models incorporate 3D developmental modelling and mechanistic physiological models. Including architectural information using a 3D mesh is particularly relevant for photosynthetic growth responses because light quantity and quality inputs vary with the spatial structure of a plant (Vos et al., 2010). Although the FSPM approach has given us very sophisticated tools for predicting crop yield, the strong impact that plant genetics can have on yield outcomes has yet to be fully integrated into these models. Typically, plant genetic variation is treated as “noise” in FSPMs, where the intent is to understand how a generalized plant of a given species will respond under particular climatic conditions. In contrast, in lab phenotyping experiments, environmental variation is considered noise that limits the discovery of the genetic basis underlying plant traits. In the real world, environment and genetics interact to determine a plant’s actual phenotypic characteristics in the field (and hence resilience, yield, etc.). It is thus necessary to integrate both the FSPM and the lab genetic approach to better predict the yield of particular genotypes across typical growing regions.
Although new work has shown that FSPMs can integrate genetic information from QTL studies (Xu et al., 2011), this approach has not been widely applied. In the proposed work, we will use our world class HTP phenomics facilities to set up sensitive FSPMs across many plants. Environmental variation will then be introduced using retrofitted growth chambers we have developed that allow us to simulate regional seasonal climates (Li et al, 2010). This will allow initial FSPMs that include major genetic effects to be re-parameterized to incorporate dynamic environmental conditions making them flexible to handle field-like conditions. In addition, our GWAS analysis tools allow us to include genetic effects and their interaction with the environment. Developing FSPMs that incorporate the effects of genetics and environment will allow us to predict phenotypes for sets of genetic variants across many environments using in silico approaches. This approach enables ‘virtual plant breeding’ where both potential germplasm and field site/environmental combinations can be evaluated computationally prior to planting. Virtual plant breeding allows one to determine optimal genotypes for the environment at existing locations and to predict high yielding genotypes under future climate change scenarios.
In this proposal, we have assembled a unique team to integrate advanced imaging equipment, feature detection from image data, genomic analysis of complex traits (GWAS) and FSPMs into an open pipeline called TraitCapture (Figure 1). The data generated by this work will address the gap between controlled conditions and the field by incorporating genetic and environmental effects into functional structural plant models (Xu et al, 2010). The biological innovation comes from associating multiple phenotypes as a single trait, or trait locus, which gives mechanistic insight into the biological function of the underlying gene and regulatory pathway. Furthermore, the TraitCapture system will be largely open source and incorporate a modular, scalable design to make it usable by both large scale industrial users such as the HRPPC and smaller labs with just a few cameras in a growth chamber.
Figure 1. TraitCapture links High Throughput Phenotyping, GWAS, and Functional Structural Plant modelling.
(A) Multi waveband LED lights provide variation in light quality and quantity as input to FSPMs.
(B) Plants are imaged to quantify fluorescence and green pixel phenotypes, through time (C) for GWAS analysis.
(D) TrayScan high throughput phenotyping provided by our partner PSI.
(E) Functional plant models incorporate environmental conditions as input along with
(F) Structural models that include plant architecture to predict phenotype from environment and genetic variation.
SIGNIFICANCE AND INNOVATION
Agricultural production in Australia is greater than $45B annually (ABS, 2012). Global agricultural demand is increasing rapidly due to increased consumption of food, feed, and fuel by a larger, more affluent population. To meet projected global food demands in coming decades, global cereal production must increase by 70% by 2050, a net annual increase in productivity of nearly 40% over historic levels, every year for the next 38 years (Tester & Langridge, 2010). At the same time, climates are changing globally, shifting growing regions and reducing climate predictability. Models predict even larger changes in critical growing regions in the coming decades (Cline, 2012). These pressures are leading to increased plantings on marginal lands, displacement of natural ecosystems and intensification of existing agricultural practices and their environmental impacts. Consequently, a better understanding of how to breed for increased yield and yield stability among growing regions in the face of shifting climates is of utmost economic and social importance. We must better understand how environment and genetics interact so that we can optimize our agricultural systems to improve long term yield in variable environments.
Plant physiology research is progressing from detailed studies of a few different genotypes at a time to high throughput, quantitative, phenomic studies on populations with fully sequenced genomes. When this is coupled with modern molecular breeding, it allows for genomic selection on “yield indicator” traits to increase regional yield and yield stability (Furbank and Tester, 2011; Munns et al, 2010). These modern techniques provide the potential for plant scientists to identify heritable traits and the complex regulatory networks underlying adaptive phenotypic variation (Yu et al, 2012). The current challenge is to weave these new techniques into a package that can be implemented across phenomics platforms on different plant species, and that is applicable to field data. Quantification of phenotypes extracted from image data, when combined with genetic analysis, allows the identification and prediction of heritable traits. By incorporating growth models (FSPMs) that include genetic and environmental variation, phenotypic predictions can be made for different growing regions to pre-select specific genotypes for local field trials. Australia is currently a global leader in high throughput phenotyping (HTP) for plant and crop sciences. The proposed TraitCapture software pipeline will help Australia maintain its leading position by facilitating the integration of genomics and plant phenomics technologies.
Detail what new methodologies or technologies will be developed in the course of the project.
The TrayScan phenomics platform built by our partner (PSI) provides image data from RGB, infrared, and fluorescence cameras for up to 300 plants in a run (Figure 1D). However, automated analysis of the RGB images and integration with other data sources, such as genomic data, is neither currently available nor planned under extended NCRIS funding. This proposal will develop software tools to integrate new cameras with advanced image analysis, genetic dissection, and plant models. Importantly, these pieces interact to enhance each other. For example, image analysis settings can be optimized to improve the genetic association. Furthermore, additional image acquisition can be performed at critical time points to explore novel genetic associations, while redundant observations can be eliminated. Finally, open-source and web-based software integration will allow phenomic data to be remotely processed and easily shared with collaborators or publicly. The web-based design will also allow users to better collaborate both locally and globally.
There are currently two dedicated phenomics centres in Australia (including our partner HRPPC at CSIRO). This proposal will enable new service offerings that integrate genomic variation and multi-trait analysis using GWAS and FSPMs to capture heritable traits and trait loci. In addition to expanding the HRPPC offerings, TraitCapture software will provide the emerging distributed phenomic community with these high level analysis tools. Thus, new and previously cryptic traits can be identified when plants are grown under different conditions with phenotypes sampled using custom protocols established in their labs.
A brief list of novel experiments enabled by this project includes:
● Light response regulated development
● Light use efficiency of photosynthesis under diurnal and seasonal cycles
● Light and temperature interactions on transpiration using Infrared (IR) cameras
● Spatial and temporal distribution of fluorescent pigments under environmental stress
● Integration of 2.5D and 3D quantification of plant growth with stereo imaging (Paproki et al, 2012)
● Examination of variation in photosynthetic activity and efficiency in model plants
● Spectral indices of heritable traits
To facilitate widespread use and collaboration between researchers, online data visualization and analysis tools are core components in our approach. Web-based visualization tools will allow real-time graphing of environment data with associated time-lapse movies. GWAS will be performed and results displayed during the experiment, allowing users to modify phenotyping protocols to improve QTL detection in real time. The online tools will enable researchers to explore experimental data and share results. For example, a user might identify plants with alternative genotypes at a locus controlling growth rate. She could then co-visualize chlorophyll fluorescence and thermographic information in time-lapse with a graph showing the emergence of the quantitative trait loci that control these traits. Finally, she could formally test for pleiotropic effects of the QTL and email the link to collaborators to allow them to load the same dataset to discuss and refine the analysis. Published results could also include links to the datasets and visualization tools, expanding upon what is currently implemented in the Phenomics Ontology Driven Datarepository (http://www.plantphenomics.org/PODDProject).
Strategic Research Priorities
Managing our Food and water assets — Optimize food and fiber production using our land resources: By improving methods that use genotype and environment to predict crop yield in specific growing regions, our proposal will help optimize agricultural production and efficient use of current land resources.
Describe how the Proposal might benefit Partner Organization(s) and other relevant end-users.
PSI is a major commercial player building high throughput phenotyping systems. This linkage proposal will allow PSI to provide validated GWAS/QTL and FSPM software to expand the usability of their existing innovative hardware offerings. ANU and CSIRO will benefit from direct access to the phenomics developers who are adding new equipment, giving us direct input into the next generation of TrayScan.
CSIRO has a suite of existing tools for image analysis and data management that will be extended to GWAS and FSPMs in this proposal. The current software tools developed for the HRPPC are highly specialized and complex. This project will provide the needed personnel to integrate current and new tools into a user friendly package for a wider audience.
TrayScan, developed by PSI/CSIRO/ANU, was originally designed to perform mutant screens to detect rare, qualitative, large effect lesions in photosynthetic processes. A primary goal of the current project is to create new software that allows postdocs and graduate students to perform quantitative genetic (GWAS/QTL) and modelling (FSPM) studies without advanced technical training. The TrayScan machine will be in active use for research by ANU and CSIRO throughout the length of the project. Keeping an active user-base involved will help us better design the system to meet the needs of new users while bringing the tools we are developing to a wider community. The software pipeline and demonstration projects of this proposal will include student training and support roles to overcome the computational hurdles that currently limit use.
APPROACH AND TRAINING
The three partners in this proposal bring unique resources to this project. At ANU, the participating labs (Borevitz, Badger, and Pogson) are world leaders in applying advanced bioinformatics and GWAS to plant phenomics, photosynthesis analysis, and drought tolerance. CSIRO is currently a research leader in developing next generation tools for enabling plant phenomics. PSI is an international leader in developing plant fluorescence imaging hardware and high throughput phenotyping systems. Together we form a team that is uniquely suited to develop the TraitCapture software pipeline.
PSI will develop additional hardware and operational protocols to generate population level phenomic data. CSIRO will develop image analysis methods to extract plant level composite phenotypes from multilayered developmental time series data. ANU will identify heritable genetic traits and deconstruct the deeply correlated phenomic data into finely resolved quantitative trait loci (Borevitz). ANU will also develop phenotyping protocols and experimental assays in the key areas of photosynthesis and drought resistance (Badger, Pogson). To do this, we will implement software for genome wide association studies on sequenced mapping populations in Arabidopsis and Brachypodium. These lines will be subjected to high resolution phenomic analysis in three sets of experiments outlined below.
To extend the capacity of TrayScan, new hardware (stereo RGB, hyperspectral camera, and LED lights) and improved control software will be added. TrayScan software will include new scanning protocols, image processing and user inputs for experimental design. Outputs will have standard formats to match the data management specifications. This allows integration with the remainder of the data pipeline and existing CSIRO data processing and modelling tools. Open standards allow users to customize phenomic capture protocols if they wish to use the raw data outputs of the TrayScan machine.
Operation of the TraitCapture system will be performed by a PhD student in consultation with postdocs and CIs. This will ensure that the pipeline is optimized for genetic and environmental stress experimental applications and operable by entry-level scientists. The three pilot experiments proposed here will be used to develop the pipeline. The pipeline can then be applied to a much broader range of biological questions and species. The following experiments will be undertaken in two model plant species, Arabidopsis thaliana and Brachypodium distachyon. These experiments will be replicated at both ANU and CSIRO to provide cross-validation of data, processes and techniques. See Table 1 below for the proposed project timeline.
Experiment 1: Capture heritable traits
Plant traits are heritable phenotypes detected at unique developmental time points and under specific environments. Thus, several correlated phenotypes may better measure and describe the same pleiotropic plant trait. To optimize detection and characterization of these plant traits, we must quantify and separate the genetic signal from the biological noise that exists among inbred lines. This experimental design provides a novel solution to computer vision because signal/noise thresholds can be optimized (Zhang et al, 2012). We will use 15 biological replicates of 20 diverse genotypes for each of Arabidopsis and Brachypodium in Experiment 1. Real time phenotyping in the standard growth chambers and thermal/3D imaging (CabScan) will be performed, and plants will be analyzed in TrayScan three times per week. Growth and development will be calculated at each time point by measuring leaf number, size, growth rate and spectral properties. Whole plant 3D architecture will be interpolated with stereoscopy. TrayScan will be used to measure photosynthetic activity (via chlorophyll fluorescence dynamics) and thermal cooling via leaf transpiration. A terminal drought stress will be applied after four weeks. These measurements will also provide information about the variation between accessions in the timing and nature of their responses to abiotic stress, including the accumulation of photoprotective pigments, the ability to maintain leaf water potential, and the ability to alter life strategy to avoid stress. For each of the hundreds of specific phenotype measures at thousands of time points, the heritability will be calculated. Clustering time points will identify developmental stages. A genetic correlation matrix among phenotypes will be generated and then hierarchically clustered into composite traits (Cheng et al, 2013). These measures will guide optimization in phenomic capture protocols and image analysis, to iteratively improve trait identification and characterization (e.g. pleiotropy) for use in the subsequent mapping experiment.
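As an illustration of the per-phenotype heritability calculation described above, the following is a minimal sketch that estimates broad-sense heritability from replicated inbred lines with a balanced one-way ANOVA. It is a textbook estimator chosen for illustration, not necessarily the method the project will use; the function name `broad_sense_heritability` and the input layout (a dict of genotype name to replicate values for one phenotype at one time point) are assumptions.

```python
from statistics import mean


def broad_sense_heritability(values_by_genotype):
    """Estimate H^2 from a balanced one-way ANOVA on inbred-line replicates.

    values_by_genotype: dict mapping genotype name -> list of replicate
    measurements (hypothetical layout; every list must have the same
    length r >= 2).
    """
    groups = list(values_by_genotype.values())
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(v) != r for v in groups):
        raise ValueError("need >= 2 genotypes with equal replicate counts >= 2")

    grand_mean = mean(x for v in groups for x in v)
    # Mean square between genotypes and mean square within genotypes.
    ms_between = r * sum((mean(v) - grand_mean) ** 2 for v in groups) / (g - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups for x in v) / (g * (r - 1))

    # Genetic variance component; negative estimates are truncated at zero.
    var_g = max((ms_between - ms_within) / r, 0.0)
    total = var_g + ms_within
    return var_g / total if total > 0 else 0.0


if __name__ == "__main__":
    leaf_area = {"A": [1.0, 1.2], "B": [3.0, 3.2], "C": [5.0, 5.2]}
    print(round(broad_sense_heritability(leaf_area), 3))
```

Running this function over every phenotype and time point would give the heritability values that feed the clustering step; phenotypes with estimates near zero carry little genetic signal at that time point.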
Experiment 2: Genetic dissection and prediction using GWAS
To identify the underlying causal genetic basis of these complex traits, large genotyped mapping populations will be screened. This set of experiments will again use standard growth conditions (12 h days, 22/18 °C) with terminal drought treatment after four weeks. Both Recombinant Inbred Lines (RILs) and wild inbred haplotype mapping sets (HapMap) will be used for both Arabidopsis and Brachypodium. Arabidopsis Cvi X Ler RILs and Brachypodium Bd21xBd3-1 will be the target sets of RILs with 150 lines that will be run in duplicate. The HapMap sets are 300 lines selected from the global collection (1001genomes.org) of natural accessions to increase genetic variation and mapping resolution and to balance population structure (Li et al, 2010; Brachi et al, 2011). Full sequence data is currently available for Arabidopsis lines. The same resource is under development for Brachypodium to enable GWAS (1002genomes.org). TraitCapture software will be preloaded with Arabidopsis and Brachypodium RIL and HapMap genotype data and will also allow users to upload and share new genotype information, including other species. GWAS will associate multiple phenotypes (Cheng et al, 2013) with each SNP while controlling for population structure using mixed effect linear models as implemented in our package QTLRel (Cheng et al, 2011). Once major QTL are identified, they will be jointly fit with a full model to estimate major QTL and background effects as best linear unbiased predictors (BLUPs). The BLUPs allow phenotypes to be predicted from genotypes, which is important for fitting functional structural plant models.
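To make the SNP-by-SNP association step concrete, here is a deliberately simplified sketch of a single-marker scan using ordinary least-squares regression of a phenotype on each SNP. It omits the mixed-model correction for population structure described above, and it does not use or imitate the QTLRel interface. The function name `marker_scan` and the 0/2 allele coding for inbred lines are assumptions made for the example.

```python
from math import sqrt


def marker_scan(genotypes, phenotype):
    """Return one (slope, t_statistic) pair per SNP from simple regression.

    genotypes: list of SNP columns, each a list of allele codes per line
    (assumed coding: 0 or 2 for homozygous inbred lines).
    phenotype: list of trait values, one per line, in the same order.
    """
    n = len(phenotype)
    if n < 3:
        raise ValueError("need at least 3 lines to estimate a standard error")
    y_bar = sum(phenotype) / n
    results = []
    for snp in genotypes:
        x_bar = sum(snp) / n
        sxx = sum((x - x_bar) ** 2 for x in snp)
        if sxx == 0:
            # Monomorphic SNP: no allelic contrast, so no test is possible.
            results.append((0.0, 0.0))
            continue
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(snp, phenotype))
        slope = sxy / sxx
        rss = sum(
            (y - y_bar - slope * (x - x_bar)) ** 2 for x, y in zip(snp, phenotype)
        )
        se = sqrt(rss / (n - 2) / sxx)
        t_stat = slope / se if se > 0 else float("inf")
        results.append((slope, t_stat))
    return results


if __name__ == "__main__":
    snps = [[0, 0, 2, 2, 0, 2], [0, 2, 0, 2, 0, 2]]
    growth_rate = [1.0, 1.2, 3.1, 2.9, 0.9, 3.0]
    for index, (slope, t_stat) in enumerate(marker_scan(snps, growth_rate)):
        print(index, round(slope, 3), round(t_stat, 2))
```

In a real analysis the per-SNP model would also include a random polygenic term with a kinship-based covariance, which is what guards against false positives caused by population structure.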
Experiment 3: Environmental effects and phenotypic prediction using Functional Structural Plant Models
Controlled conditions that minimize environmental variation are ideal for identification and mapping of complex traits. However, they are limited in their ability to translate genetic effects to the field. To overcome this gap, dynamic growth chamber conditions will be used to parameterize plant models with both environment and genetic effects (Xu et al, 2011 and Zhang et al, 2012). The Borevitz lab growth chambers (SpectralPhenoClimatron) have been retrofitted to simulate regional seasonal climates (Li et al, 2010) with a recently funded grant (LE130100081). This capability will be leveraged for the phenomic analysis proposed here. Success in this step will validate the concept and software tools, which can be translated to other species and environments. Temperate and sub-tropical conditions will be simulated to reflect dominant agricultural growing regions (e.g. south Queensland and Victoria) with cyclic drought stress applied throughout the season (Chenu et al, 2013). TrayScan records the pot weight and adds water to a specified amount, controlling moisture for each plant. Light, temperature, and humidity regimes will vary every 15 min in the SpectralPhenoClimatron according to the specified regional climate. These same values will be input parameters to Functional Structural Plant Models. By measuring growth on each plant during the simulated growing season, the models are parameterized to allow prediction of phenotypic outcomes for other sets of climate values (Wilczek et al, 2009). Joint modelling across both regional conditions on half of the plants will determine environmental parameter estimates. Validation of the prediction will be performed on the remaining half of the plants not used in training the model, i.e. cross-validation. Finally, values from the temperate growth conditions will be used to predict the sub-tropical results, and vice versa, to test model robustness.
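The train-on-half, validate-on-half scheme described above can be sketched as follows. A straight-line model of growth against a single environmental value stands in for a full FSPM, purely for illustration; the names `fit_line` and `holdout_rmse`, the record layout, and the fixed random seed are all assumptions.

```python
import random
from math import sqrt


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("training plants must differ in the environmental value")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def holdout_rmse(records, seed=0):
    """Train on a random half of the plants, report RMSE on the other half.

    records: list of (environment_value, observed_growth) pairs, one per
    plant (hypothetical layout).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible split
    half = len(shuffled) // 2
    train, held_out = shuffled[:half], shuffled[half:]
    intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
    errors = [y - (intercept + slope * x) for x, y in held_out]
    return sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    # Toy data: growth rises with accumulated temperature, with small deviations.
    plants = [(t, 0.5 * t + 2.0 + 0.1 * ((t % 3) - 1)) for t in range(20)]
    print(round(holdout_rmse(plants), 3))
```

The same pattern extends to the cross-climate test: fit on all plants from the temperate regime and compute the error on the sub-tropical plants, then swap the roles.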
Incorporating genetic variation in FSPMs (Xu et al, 2011) will allow prediction of phenotypes from genotypes in a range of environments. To improve predictions, deviations between model results and observations can be used as a residual phenotype in GWAS to identify new components and further improve the model. This iterative approach allows new gene identification and model optimization (Cooper et al, 2009).
Plants will be monitored continuously ‘in-chamber’ by RGB cameras and the images analyzed for variation in growth rate, morphology, life stage, plant architecture, height, and pigment accumulation. Three times per week, plants from experimental blocks will be moved into TrayScan to measure photosynthetic activity, hyperspectral profile and water use, via pot weighing and thermal transpiration. These phenotypes will be measured before and in response to terminal drought stress and during simulated growing seasons in climate chambers.
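As a rough illustration of how an in-chamber RGB image can be reduced to a quantitative growth phenotype, the sketch below counts "green" pixels using the excess-green index (2G - R - B). This is a common, simple segmentation heuristic and is offered only as an example; the source does not state which image analysis method the pipeline will use. The function name, the image layout (rows of `(r, g, b)` tuples) and the threshold of 20 are assumptions.

```python
def green_pixel_fraction(image, threshold=20):
    """Fraction of pixels whose excess-green index exceeds a threshold.

    image: list of rows, each a list of (r, g, b) tuples with 0-255
    channels (hypothetical layout). threshold: arbitrary illustrative
    cut-off that would need tuning per camera and lighting setup.
    """
    total = 0
    green = 0
    for row in image:
        for r, g, b in row:
            total += 1
            # Excess-green index: high for vegetation, near zero for grey soil.
            if 2 * g - r - b > threshold:
                green += 1
    return green / total if total else 0.0


if __name__ == "__main__":
    tiny_image = [
        [(30, 200, 40), (120, 110, 100)],
        [(20, 180, 30), (90, 90, 90)],
    ]
    print(green_pixel_fraction(tiny_image))
```

Tracking this fraction for each plant across successive images gives a simple growth curve; the segmentation threshold is exactly the kind of setting that could later be tuned to maximize heritability.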
The postdoctoral scientists (statistical geneticist and computer vision modeller) and PhD student will be working with new technologies and computational methods, across institutions and with international partners. This project will provide a unique cross-training environment implementing cutting edge skills in genomics, statistical genetics, phenomics techniques, computer vision, and modelling. Supervision will be provided by a team of established senior scientists with a long and successful track record in training. Finally, the project will host visiting scientists from PSI and other partner institutions to provide knowledge transfer in the use of the new methods and to distribute new software to a global community of phenomics users.
Experiment 1 Arabidopsis
Experiment 1 Brachy
Experiment 1 Brachy
Experiment 1 Arabidopsis
Image data layer system
Integration of hyperspec LEDs
Beta versions of software
Experiment 2 Arabidopsis
Experiment 2 Brachy RILs
Experiment 2 Brachy HapMap
Experiment 2 Arabidopsis RILs
3D plant mesh reconstruction
Genotype -> Phenotype database
Light, temp, moisture control
Experiment 3 Arabidopsis
Experiment 3 Brachy
Publications & documentation
FSPM with environment and genetics to predict phenotype
Publications & documentation
TrayScan software running
GWAS and FSPM
Table 1. Project timeline