short title: fpca-based gwas for time-series data in sorghummay 27, 2020  · 1 short title:...

41
Short title: FPCA-based GWAS for time-series data in sorghum 1 2 Increased power and accuracy of causal locus identification in time-series 3 genome-wide association in sorghum 4 5 Chenyong Miao 1 , Yuhang Xu 2 , Sanzhen Liu 3 , Patrick S. Schnable 4 , and James 6 C. Schnable 1* 7 1 Quantitative Life Science Initiative, Center for Plant Science Innovation, 8 Department of Agronomy and Horticulture, University of Nebraska-Lincoln, 9 Lincoln, NE USA 10 2 Department of Applied Statistics and Operations Research, Bowling Green State 11 University, Bowling Green, OH USA 12 3 Department of Plant Pathology, Kansas State University, Manhattan, KS USA 13 4 Department of Agronomy, Iowa State University, Ames, IA USA 14 * Corresponding author: James C. Schnable ([email protected]) 15 Author contributions: 16 CM and JCS designed the study. CM, SL, PSS, and JCS contributed methodology 17 and provided datasets. CM and YX analyzed the data. CM and JCS wrote the 18 paper. All authors reviewed and approved the manuscript. 19 One-sentence summary: Applying functional principal component analyses to 20 time-series genome-wide association studies increases the power and accuracy of 21 gene identification in a sorghum association panel. 22 23 Abstract 24 The phenotypes of plants develop over time and change in response to the 25 environment. New engineering and computer vision technologies track these 26 phenotypic changes. Identifying the genetic loci regulating differences in the 27 pattern of phenotypic change remains challenging. This study used functional 28 principal component analysis (FPCA) to achieve this aim. Time-series phenotype 29 data was collected from a sorghum (Sorghum bicolor) diversity panel using a 30 Plant Physiology Preview. Published on May 27, 2020, as DOI:10.1104/pp.20.00277 Copyright 2020 by the American Society of Plant Biologists https://plantphysiol.org Downloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Upload: others

Post on 16-Oct-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

Short title: FPCA-based GWAS for time-series data in sorghum 1

2

Increased power and accuracy of causal locus identification in time-series 3

genome-wide association in sorghum 4

5

Chenyong Miao1, Yuhang Xu

2, Sanzhen Liu

3, Patrick S. Schnable

4, and James 6

C. Schnable1*

7

1Quantitative Life Science Initiative, Center for Plant Science Innovation, 8

Department of Agronomy and Horticulture, University of Nebraska-Lincoln, 9

Lincoln, NE USA 10 2Department of Applied Statistics and Operations Research, Bowling Green State 11

University, Bowling Green, OH USA 12 3Department of Plant Pathology, Kansas State University, Manhattan, KS USA 13

4Department of Agronomy, Iowa State University, Ames, IA USA 14

*Corresponding author: James C. Schnable ([email protected]) 15

Author contributions: 16

CM and JCS designed the study. CM, SL, PSS, and JCS contributed methodology 17

and provided datasets. CM and YX analyzed the data. CM and JCS wrote the 18

paper. All authors reviewed and approved the manuscript. 19

One-sentence summary: Applying functional principal component analyses to 20

time-series genome-wide association studies increases the power and accuracy of 21

gene identification in a sorghum association panel. 22

23

Abstract 24

The phenotypes of plants develop over time and change in response to the 25

environment. New engineering and computer vision technologies track these 26

phenotypic changes. Identifying the genetic loci regulating differences in the 27

pattern of phenotypic change remains challenging. This study used functional 28

principal component analysis (FPCA) to achieve this aim. Time-series phenotype 29

data was collected from a sorghum (Sorghum bicolor) diversity panel using a 30

Plant Physiology Preview. Published on May 27, 2020, as DOI:10.1104/pp.20.00277

Copyright 2020 by the American Society of Plant Biologists

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 2: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

2/30

number of technologies including RGB and hyperspectral imaging. This imaging 31

lasted for thirty-seven days and centered on reproductive transition. A new higher 32

density marker set was generated for the same population. Several genes known 33

to control trait variation in sorghum have been previously cloned and 34

characterized. These genes were not confidently identified in genome-wide 35

association analyses at single time points. However, FPCA successfully identified 36

the same known and characterized genes. FPCA analyses partitioned the role 37

these genes play in controlling phenotypes. Partitioning was consistent with the 38

known molecular function of the individual cloned genes. These data demonstrate 39

that FPCA-based genome-wide association studies can enable robust time-series 40

mapping analyses in a wide range of contexts. Moreover, time-series analysis can 41

increase the accuracy and power of quantitative genetic analyses. 42

Introduction 43

Quantitative genetic approaches are widely used across all domains of biology to 44

identify genetic loci controlling variation in target traits. Many genes where 45

naturally occurring functionally variable alleles control variation in agronomically 46

relevant traits have been identified. Both quantitative trait locus (QTL) mapping 47

(structured populations) or genome-wide association studies (association panels) 48

have been used to identify these loci (Chen et al., 2016; Navarro et al., 2017; 49

Huang et al., 2010). Collecting phenotypic data from the hundreds to thousands of 50

accessions required for QTL mapping or genome-wide association study (GWAS) 51

has historically been a time- and resource-intensive undertaking. Hence, most 52

attempts to identify genes controlling variation in a target trait employ data from a 53

single time point, usually either at maturity or after a fixed number of days from 54

planting. However, recent engineering advances, such as wearable devices, 55

automated phenotyping greenhouses, field phenotyping robots, unmanned aerial 56

vehicles, and computer vision advances, are lowering the barriers and activation 57

energy required to score traits from multiple points throughout development 58

(Moore et al., 2013; Oren et al., 2017; Furbank and Tester, 2011; Honsdorf et al., 59

2014; Fernandez et al., 2017; Holman et al., 2016; Feldman et al., 2017, 2018). 60

Plant growth and development is a dynamic process, responding to environmental 61

perturbations. It is regulated by different suites of genes at different times and in 62

different environments (Yan et al., 1998; Wu and Lin, 2006; Moore et al., 2013; 63

Würschum et al., 2014; Muraya et al., 2017; Feldman et al., 2017, 2018). The 64

availability and use of time-series trait data from mapping and association 65

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 3: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

3/30

populations has the potential to increase the accuracy and power of gene mapping 66

studies (Moore et al., 2013; Sang et al., 2019). It may also provide greater insight 67

into the biologically distinct roles different loci play in determining final 68

phenotypes. However, integrating time-series data into statistical frameworks 69

originally envisioned for single-trait measurements across large populations is not 70

straightforward. A range of approaches are currently being explored by the 71

research community. 72

There are several approaches to the use of time-series phenotypic data in 73

mapping studies. The most straightforward is to conduct QTL mapping or GWAS 74

separately at each time point and then summarize the mapping results (Yan et al., 75

1998; Wu et al., 2010; Osman et al., 2013). However, this approach is not robust 76

when faced with missing data, or when subsets of the population are scored on 77

alternating time points. This procedure also requires complex approaches to 78

multiple testing correction given the partially correlated nature of both linked 79

genetic markers and measurements of the same trait in the same individuals at 80

multiple time points. A second widely employed method is to summarize patterns 81

of change over time using predefined functions with discrete numbers of variables 82

(Ma et al., 2002; Wu et al., 2004; Paine et al., 2012). QTL mapping or GWAS can 83

then be conducted for the values of the different variables within the equation as 84

that function is fit to data from different individuals. This approach can be 85

powerful and is able to impute missing data points in individuals. However, this 86

approach can fail when the pattern of phenotypic change over time follows an 87

unknown function, is too complex to fit, or does not conform to the expected 88

function (Feldman et al., 2018). Functional principal component analysis (FPCA), 89

as a more general method, provides some of the strengths of fitting parametric 90

functions sharing data across time points. This includes the ability to impute 91

missing values without requiring that patterns of phenotypic change over time fit 92

any particular function (Kwak et al., 2016; Xu et al., 2018a,b). A variation of 93

nonparametric functional principal component-based mapping has been 94

successfully employed to identify loci controlling the gravitropism response in A. 95

thaliana seedlings (Moore et al., 2013; Kwak et al., 2016). Kwak et al. (2016) 96

concluded that dimensional reduction via FPCA may increase the power to detect 97

QTL in recombinant inbred populations relative to prior approaches that include 98

trait data from each time point. Muraya et al. (2017) employed an FPCA method 99

adapted from Kwak et al. (2016) to identify several trait-associated SNPs for maize 100

(Zea mays) biomass accumulation in vegetative development. This represented an 101

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 4: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

4/30

advancement over parametric regression, which had not been able to identify any 102

trait-associated SNPs using the same trait and marker dataset. However, none of 103

the identified loci coincided either with markers identified in single time point 104

analyses, or with known loci controlling the trait of interest. 105

Three cloned sorghum (Sorghum bicolor) genes controlling height–dwarf1, 106

dwarf2, and dwarf3–are segregating in the Sorghum Association Panel (SAP). 107

Dwarf1 (Dw1, Sobic.009G229800) was cloned by two independent fine mapping 108

studies (Hilley et al., 2016; Yamaguchi et al., 2016). The function of the Dw1 gene 109

appears to act in the brassinosteroid signaling pathway to influence sorghum plant 110

height by reducing cell proliferation activity in the internodes (Hirano et al., 2017). 111

Dwarf2 (Dw2, Sobic.006G067700) on chromosome 6 is in tight linkage with 112

maturity1 (Ma1, Sobic.006G057866), a gene with a large effect on sorghum 113

flowering time. In many cases, both loci are introgressed together during the 114

backcrossing and conversion process of adapting tropical sorghum germplasm to 115

grow in temperate latitudes (Quinby, 1974; Higgins et al., 2014). The Dw2 gene 116

encodes a protein kinase which is homologous to KIPK in Arabidopsis and 117

influences sorghum plant height by reducing the length of internodes (Hilley et al., 118

2017). Maturity1 gene has the largest impact on flowering time through encoding 119

the pseudoresponse regulator protein 37 (PRR37) to modulate flowering time in 120

sorghum (Murphy et al., 2011). Dwarf3 (Dw3, Sobic.007G163800) encodes a 121

MDR transporter orthologous to brachytic2 in maize, and appears to influence cell 122

elongation through a role in polar auxin transport (Multani et al., 2003). Previous 123

sorghum height mapping studies identified epistatic interactions between Dw1 and 124

Dw3, but no significant epistatic interactions have been reported between Dw2 and 125

either Dw1 or Dw3 (Brown et al., 2008; Hilley et al., 2016; Yamaguchi et al., 126

2016). 127

In this study, we employed a new approach to functional principal 128

component analysis of non-parametric regression data (Xu et al., 2018a). The 129

procedure’s effectiveness at both controlling false positives and identifying known 130

true positives in sorghum was evaluated. A phenotypically constrained subset of 131

the SAP was utilized (Casa et al., 2008). This population was grown and imaged 132

through vegetative and reproductive development in a high-throughput 133

phenotyping facility. Organ-level semantic segmentation from hyperspectral 134

images was employed to extract phenotypic values (Miao et al., 2020). Genome-135

wide association using the functional principal component scores derived from the 136

FPCA approach described by Xu et al. (2018a) could be used to successfully 137

identify all three known dwarf genes. Three novel signals were also detected. The 138

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 5: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

5/30

signal on chromosome 3 was independently validated in data from a previous 139

study of a different sorghum population. These results favorably contrasted with 140

terminal phenotyping or time point by time point analyses in the same population, 141

and provide additional insight into the distinct biological roles of three known 142

dwarf genes in determining sorghum height. 143

144

Results 145

Loss of association power in phenotypically constrained populations 146

The classical sorghum dwarfing genes have large effects on plant height and are 147

segregating within the SAP (Morris et al., 2013; Li et al., 2015). Field-collected 148

plant height data for 357 lines from the SAP correctly identified both Dw1 and 149

Dw2 [Figure 1A] as well as a third previously reported locus known to influence 150

variation in height in this population (Li et al., 2015). Thirty-eight lines from the 151

field study exceeded the physical limit on maximum height for the imaging facility 152

employed in this study [Figure 1B]. After the exclusion of these 38 lines as well as 153

an additional 27 lines which failed to germinate or thrive in the greenhouse, field-154

measured height data for the remaining 292 lines was still sufficient to identify 155

Dw1, albeit with substantially reduced statistical significance. However, after the 156

removal of these phenotypically extreme lines, signals from Dw2 and the other 157

previously reported height-controlling loci were no longer statistically significant 158

[Figure 1C]. Dw3, located on the long arm of chromosome 7, was not identified in 159

association analyses using either field data on terminal plant heights from the 160

complete set of 357 SAP lines, or in analyses of data only from the phenotypically 161

constrained subset of SAP lines employed for greenhouse experiments. Excluding 162

lines in the mapping population not only decreased the total population size, but 163

also reduced the minor allele frequency of SNPs near genes known to control 164

variation in sorghum height [Figure S1]. 165

Genetic associations with sorghum height at different time points 166

The phenotypically constrained subset of 292 SAP lines was imaged at the 167

University of Nebraska’s Greenhouse Innovation Center (UNGIC). Imaging lasted 168

approximately one month, and centered on reproductive development. Each plant 169

was imaged using a set of five cameras, including top views and side views from 170

multiple angles, and hyperspectral imaging from a single side view perspective 171

(Ge et al., 2016; Liang et al., 2017; Miao et al., 2020). Two methods were 172

employed to extract plant height at individual time points. The first was whole-173

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 6: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

6/30

plant segmentation, an approach widely used in plant image analysis (Gehan et al., 174

2017): segmentation of an image into plant pixels and not plant pixels, and 175

measuring height by the difference between the minimum and maximum y-axis 176

values for plant pixels [Figure 2A]. The second approach was a recently described 177

method (Miao et al., 2020) to semantically segment "plant pixels" of sorghum 178

plants into separate stalk, leaf, and panicle classes [Figure 2B, S2]. There are many 179

different ways to measure the plant height (Brown et al., 2008; Miao et al., 2020). 180

Generally, researchers in the field have preferred to benchmark on the height of 181

stalk, height to the uppermost leaf collar, or the height to the tip of the 182

inflorescence rather than the height to the highest leaf tip. An advantage of the 183

latter method, semantic segmentation, is that it can be used to measure a version of 184

plant height which more closely approximates how height is measured in the field 185

(Miao et al., 2020). Sorghum plant height measured by whole-plant segmentation 186

tended to oscillate over time as new leaves emerged, whereas sorghum plant height 187

measured via the semantic segmentation method tended to be monotonically 188

increasing [Figure 2C]. Generally, researchers in the field have preferred to 189

measure the height of stalk or the height of inflorescence rather than the height to 190

the highest point on the plant, which is often a leaf tip. The semantic segmentation 191

approach makes it possible to measure height using a definition for plant height 192

that corresponds to the definition of plant height used by sorghum geneticists in the 193

field (Miao et al., 2020). 194

A second challenge faced by many time-series imaging studies is the maximum 195

daily throughput provided by fixed imaging infrastructure. In this particular study, 196

individual plants were divided into two groups imaged on alternating days. 197

Therefore, there was no single time point at which image or height data was 198

available for all individual plants [Figure S3A]. In order to estimate the height of 199

each individual plant at each individual time point, nonparametric regression was 200

employed to impute all missing data points [Figure S3B]. To assess the accuracy of 201

this imputation, the percent of variance in height explained by genetic factors was 202

assessed at individual time points for subsets of the population using either height 203

measured directly from images or height imputed from nonparametric curves. In 204

most cases, the proportion of variance in imputed height which could be explained 205

by genetic factors matched or exceeded the equivalent value for directly measured 206

height [Figure S4]. This is consistent with previous work which found that missing 207

values in time-series plant phenomics data can be imputed accurately from 208

nonparametric curves (Xu et al., 2018b). 209

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 7: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

7/30

Independent genome-wide association analyses were conducted for imputed 210

sorghum height as measured via semantic segmentation on each day from 41 to 76 211

days after planting (DAP). Consistent with observations from field-collected plant 212

height data, almost no significant trait/marker associations were observed in the 213

analyses for DAP 65–76 [Figure S5; Supplementary file S1]. In the early days of 214

the experiment, statistically significant signals were identified all over the genome, 215

likely as a result of extreme height values for a small number of lines. Those lines 216

experienced reproductive transition and panicle emergence prior to the start of 217

imaging [Figure S6]. Excluding extreme lines led to the identification of signals 218

near Dw1, Dw3, and Ma3 in data from some early time points mixed in among 219

approximately a dozen other repeatedly identified loci across the genome. A loss 220

of detectable genetic associations between 54 and 66 DAP corresponded roughly 221

to booting and panicle exertion in many sorghum accessions. The peduncle length 222

is under the control of a distinct set of genetic factors from plant height below the 223

flag leaf (Li et al., 2015), and the timing of panicle exertion is under the control of 224

a third distinct set of factors, namely maturity genes (Quinby and Karper, 1945; 225

Quinby, 1967). We speculate that, once a large proportion of lines advanced to the 226

booting stage or beyond, the role these three sets of genetic factors played in 227

determining plant height diluted the total variance attributable to any one single 228

genetic locus, reducing power to identify statistically significant associations. 229

A second set of analyses were conducted where nonparametric curves calculated 230

for individual plants were aligned based on time relative to panicle emergence (e.g. 231

Days After Panicle Emergence or DAPE) rather than time relative to planting 232

(DAP) [Figure 3; Supplementary file S1]. Given variation in the date of panicle 233

emergence, and the dangers of extrapolating nonparametric curves beyond the 234

range of observed data points, it must be noted that this approach meant that data 235

was available only for distinct subsets of the original 292 phenotyped sorghum 236

lines at any given time point. Day-by-day genome-wide association studies were 237

conducted from 14 days prior to panicle emergence to 10 days after panicle 238

emergence, as described above [Figure 4]. No known sorghum flowering-time loci 239

were identified using this approach, with the exception of Ma1. However, due to 240

the strong linkage between Ma1 and Dw2 genes based on previous studies it is not 241

possible to exclude the possibility that the significantly associated markers near 242

Ma1 reflect the effect of Dw2 on plant height (Quinby, 1974; Higgins et al., 2014). 243

The signal corresponding to Dw2 was identified in the DAPE GWAS analysis for 244

six days between -14 to -5 DAPE with the signal 323 kilobases from the known 245

locus. The signal corresponding to Dw1 was identified from -14 to +1 DAPE and 246

was located only 15 kilobases away from the known location of Dw1 on sorghum 247

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 8: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

8/30

chromosome 9. This was closer than the 114 kilobase distance between the gene 248

and the closest significant SNP identified in DAP GWAS results [Figure S6]. Dw3 249

did not show any statistically significant signal in the DAPE based analysis but 250

was identified in some time points when plant data was compared using DAP. 251

Hence, whereas it was possible to identify all three known true-positive genes 252

controlling sorghum plant height through genome-wide association analyses at 253

individual time points, in no case were all three identified in a single analysis 254

across the approximately 60 total genome-wide association studies either 255

conducted at different time points or using different developmental landmarks. 256

Many other confounding associations were also identified. In many cases, both the 257

single most significantly associated SNP, and the "hot zone" of SNPs all showing 258

strong significant association with trait variation, were quite distant from the 259

known causal locus. However, they were still within the range expected given 260

observed LD decay rates in sorghum (Morris et al., 2013; Miao et al., 2019). 261

Mapping genes controlling variation in growth curves 262

All three known true positive height genes were identified by sequential time 263

point–based GWAS. However, there was no single time point, nor any single 264

treatment of the data (DAP or DAPE), where all three of said genes were 265

identified. Many other loci not previously reported to be linked to height were also 266

identified with equal or greater statistical support to known true positive genes 267

across many time points. In order to conduct the individual time point analyses, it 268

was necessary to fit nonparametric curves to each individual plant to impute 269

unobserved values. Functional principal component analyses [Figure 5A] was 270

employed to decompose variation among the curves into four functional principal 271

components that combined to explain >99% of the total variance in curve shape 272

among the plants in the population. The first two functional principal components 273

were able to explain >97% of total variation [Figure 5B & C]. 274

As the first two functional principal components described the vast majority of 275

the variation in patterns of change in plant growth over time, genome-wide 276

association analyses were conducted to identify genes controlling variation in these 277

two phenotypic descriptors [Figure 6; Supplementary file S1]. Sorghum lines with 278

negative scores of the first functional principal component tended to be taller 279

overall and exhibited greater increases in height during panicle emergence than 280

lines with positive scores of the second functional principal component [Figure 6A 281

& B]. Both Dw1 and Dw2 showed statistically significant associations with 282

variation in the first functional principal component, as did a single locus on the 283

short arm of chromosome 3 [Figure 6E]. Sorghum lines with negative functional 284

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 9: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

9/30

principal component two scores tended to be taller prior to panicle emergence, but 285

exhibited limited additional increases in height during panicle emergence. 286

Moreover, lines with positive functional principal component two scores started 287

out shorter, but exhibited big increases in height during panicle emergence. This 288

second set of lines was taller at the end of the experiment [Figure 6C & D]. Dw3 289

showed a statistically significant association with variation in the second 290

functional principal component, as did two other regions of the genome close to 291

the centromeres of sorghum chromosomes 5 and 9 [Figure 6F]. As was seen in 292

Figure 3, none of the hits identified here overlapped with known sorghum maturity 293

genes with the exception of Ma1. However, another potential explanation was that 294

previously uncharacterized loci were controlling flowering time in the 295

environmental conditions used to conduct this study. A genome-wide association 296

analysis for days to panicle emergence identified three well-supported loci: Ma1 297

on chromosome 6 and two loci on sorghum chromosomes 1 and 10. These loci do 298

not correspond to known sorghum maturity genes [Figure S7]. However, with the 299

exception of Ma1, these loci did not overlap with regions of the genome associated 300

with height or patterns of change in height when using panicle emergence as a 301

developmental landmark [Figure 3; Figure 6]. 302

The distance between the nearest SNP significantly associated with functional 303

principal component one and Dw1 (Sobic.009G229800) is approximately 35 304

kilobases. The hot region of SNPs on chromosome 6 which are significantly 305

associated with functional principal component one spans both Dw2 306

(Sobic.006G067700) and Ma1 (Sobic.006G057866). The novel hit for functional 307

principal component one on chromosome 3 was also identified in sequential DAPE 308

GWAS analysis [Figure 4]. The same small interval on chromosome 3 where this 309

novel hit was identified was previously linked to variation in sorghum height in an 310

independent QTL mapping population (Brown et al., 2006). This hit is associated 311

with a large–17 complete genes or gene fragments–tandem array of wall-312

associated kinase (WAK) genes. WAKs play a role in cell elongation and 313

expansion and previous antisense experiments targeting genes in this family in 314

Arabidopsis produced dwarf phenotypes (Wagner and Kohorn, 2001; Lally et al., 315

2001). The distance between the nearest SNP significantly associated with 316

functional principal component two and Dw3 (Sobic.007G163800) is 317

approximately 39 kilobases. The other two significant signals for functional 318

principal component two on chromosome 5 and chromosome 9 were not associated 319

with any immediately obvious candidate genes. This may reflect the location of 320

these hits in low-recombination/high linkage-disequilibrium regions of the 321

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 10: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

10/30

genome, expanding the potential distance between a trait-associated genetic 322

marker and the causal locus. 323

Discussion 324

Engineering and computational advances are making it increasingly practical to 325

dynamically monitor the trait values for organisms across time and development. 326

Time-series data provides opportunities to understand the role individual genetic 327

loci play in shaping phenotype. At the same time, extracting the greatest possible 328

insight from time-series data requires modifications to quantitative genetic 329

approaches traditionally employed for linking genotypic variation to phenotypic 330

variation across an entire population at a single point in time. Time-series data 331

from a phenotypically constrained diversity population in sorghum was used to 332

evaluate a number of approaches for leveraging time-series data to identify and 333

characterize how different loci play different roles in determining phenotype at 334

different stages of development. A key feature which enabled the analyses was that 335

three known large-effect loci controlling height were segregating in the population 336

of plants analyzed. However, these loci were not consistently identified in 337

conventional single point GWAS given the phenotypically constrained set of the 338

plants employed in the study [Figure 1]. In field studies, it is often necessary to 339

restrict the range of phenotypic diversity exhibited by an association population 340

key trait in order to obtain meaningful and comparable trait data in a single 341

environment (Hansey et al., 2011). Artificial selection also tends to reduce the 342

range of phenotypic variation present within elite populations used for crop 343

improvement (Yamasaki et al., 2007). 344

Different plants proceed through different stages of their life cycle at different 345

rates. Comparisons across varieties must, explicitly or implicitly, employ life-cycle 346

landmarks to enable comparisons across individuals. In many field studies, traits 347

are collected from terminal-stage plants, using physiological maturity as the life-348

cycle landmark for comparisons across individuals. In several other studies, 349

including many greenhouse- or growth chamber–based ones, comparisons are 350

performed at a fixed number of days after planting or days after germination, i.e., 351

using planting or germination as the life-cycle landmark. The SAP employed in 352

this study is sampled primarily from sorghum conversion lines collected from 353

around the world (Casa et al., 2008). Despite the introgression of photoperiod 354

insensitivity loci as part of the conversion process, lines vary significantly in total 355

days spent in vegetative development before transitioning to reproductive 356

development. We experimented with using either time of planting or time of 357

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 11: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

11/30

panicle emergence as the life-cycle landmark for phenotypic comparisons. For 358

these particular analyses, using panicle emergence as a landmark provided 359

significantly cleaner results with identification of known height-related loci [Figure 360

4]. In general, time-series trait data enable researchers to experiment with using 361

different landmarks for comparisons across individuals in a population. The correct 362

choice in any given case will depend on the specific goal of the analyses. 363

Plant heights at different time points were extracted from hyperspectral images 364

using a semantic segmentation approach (Miao et al., 2020). Compared to widely 365

used thresholding-based whole plant segmentation approaches in RGB images, 366

semantic segmentation is able to classify each plant pixel to the corresponding 367

organ, which provides more plant height estimations at different stages of 368

development. These estimations are more comparable to measurements made by 369

hand by biologists. The approach shows generalizability within sorghum between 370

mature plants and plants at the late vegetative stage, prior to visible reproductive 371

development but after clear leaf and stalk separations (Miao et al., 2020). 372

However, one source of error with the current classification approach is a tendency 373

to misclassify leaf midribs as stem pixels, particularly early in development. In 374

early vegetative development of sorghum, the shoot apical meristem remains 375

below the soil surface, and the apparent stem is simply a whorl of multiple leaf 376

sheaths. As the dataset evaluated here consists of plants from 40 DAP onwards, it 377

is not possible to draw strong conclusions about how this approach would perform 378

for sorghum seedlings. 379

Simulation studies based on fitting parametric models have demonstrated that 380

integrating longitudinal measures of phenotypes in a single population can provide 381

increased resolution and power to identify QTLs (Sang et al., 2019). Here we 382

employed nonparametric models which required fewer initial assumptions about 383

the pattern of how phenotypes change over time (Yang et al., 2009). Like 384

parametric models, nonparametric models can be used to accurately impute trait 385

values at unobserved time points (Xu et al., 2018b). This enables the combined 386

analyses of time data from larger populations given a fixed capacity in terms of 387

number of individuals phenotyped per day. Relative to previously published 388

functional principal component analyses, the statistical approach employed here is 389

robust to small with uneven numbers of time point observations per sample. 390

Genome-wide association studies based on functional principal component 391

weightings assigned to the time-series data from individual plants, collected on 392

separate days, successfully identified all three known true positive genes 393

controlling height in sorghum. Both Dw1 and Dw2 were associated with variation 394

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 12: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

12/30

in the first functional principal component of sorghum height, which exhibited a 395

consistent effect on height both before and after panicle emergence and exertion 396

[Figure 6A, B & E]. Dw3 was instead associated with the second functional 397

principal component of sorghum height, which exhibited opposite directions of 398

effect on height before and after panicle emergence and exertion [Figure 6C, D & 399

F]. Previous studies suggest that epistasis exists between Dw1 and Dw3, but they 400

are thought to act through different mechanisms or pathways to influence sorghum 401

plant height (Yamaguchi et al., 2016; Hirano et al., 2017). Dw1 influences plant 402

height through the brassinosteroid signaling pathway (Hirano et al., 2017). It 403

reduces the length of all internodes by reducing the number of cells in each 404

internode (Yamaguchi et al., 2016; Hilley et al., 2016). However, Dw3 and its 405

maize ortholog brachytic2 appear to influence plant height through auxin transport. 406

Dw3 primarily reduces the lengths of lower internodes rather than all the 407

internodes, and acts by decreasing the length of cells rather than decreasing the 408

total cell number (Multani et al., 2003; Yamaguchi et al., 2016). The molecular 409

function of Dw2 (Sobic.006G067700) is still not clear but the Sobic.006G067700 410

loss of function phenotype includes reductions in the length of all internodes, 411

which is similar to the loss of function phenotype of Dw1(Sobic.009G229800) and 412

dissimilar to the loss of function phenotype of Dw3 (Sobic.007G163800) (Hilley et 413

al., 2017). These known phenotypes and molecular functions of the three sorghum 414

dwarfing genes are consistent with the pattern observed in the FPCA-based GWAS 415

analyses where Dw1 and Dw2 are associated with a consistent effect before and 416

after panicle emergence. At the same time, Dw3 is associated with a functional 417

principle component which switches the direction of effect before and after panicle 418

emergence. 419

In addition to successfully identifying all three known true positive height genes 420

in sorghum, functional principal component–based mapping for plant height 421

exhibited much greater enrichment for these genes. Three statistically significant 422

loci were identified outside of the predefined known true positive set: the signal on 423

the short arm of chromosome three for the first functional principal component, 424

and two signals in the pericentromeric regions of chromosomes five and nine for 425

the second functional principal component. Given the broader mapping intervals 426

and slower LD decay in pericentromeric regions, it was not possible to confidently 427

conclude whether the signals on chromosomes five and nine correspond to specific 428

previous reports from QTL mapping or genome-wide association in sorghum. 429

However, the chromosome three signal associated with WAK genes was also 430

identified in the DAPE-based sequential GWAS results around panicle emergence 431

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 13: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

13/30

[Figure 4]. This indicates it may play an important role in a shift in the speed that 432

booting stage occurs. This signal was also validated as corresponding to a tightly 433

mapped sorghum plant height QTL identified in a separate recombinant inbred line 434

population (Brown et al., 2006). 435

Compared to the GWAS results on terminal height collected from the field in 436

the phenotypically constrained population [Figure1C], FPCA-based GWAS 437

analyses recovered Dw1 and Dw2 identified using the whole population and also 438

identified Dw3 as well as several additional loci [Figure 6E,F; Figure1A]. 439

However, the known height locus on chromosome 6, which was identified in the 440

field-based data presented in this paper as well as in previous GWAS for sorghum 441

height, was not recovered (Li et al., 2015). Mapping genes controlling height, or 442

most other highly polygenic traits in different field seasons or locations, will tend 443

to identify only partially overlapping sets of loci. This can be a result of genotype 444

by environment interactions, where certain loci influence variation in a given trait 445

in some environments but not others. It can also be a result of stochastic noise in 446

phenotypic measurements. The statistical approaches used for GWAS are designed 447

to minimize false positives and accept a very high false negative rate. As a result, 448

modest-effect loci will rise above the noise in some experimental replications but 449

not others. Consistent with either of these explanations, the known height locus on 450

chromosome 6 was not statistically significant in a GWAS analysis for genes 451

controlling sorghum height conducted by Morris et al. (2013), but was statistically 452

significant in a separate analysis conducted for the same trait by Li et al. (2015). 453

Time-series trait data is rapidly becoming more widely available and collected 454

in a broad range of contexts: model organisms, crops, livestock, humans, etc. 455

Integrating functional principal component analyses into genome-wide association 456

studies allows the identification of known true positive genes controlling 457

phenotypic variation in populations where these genes cannot be confidently 458

identified from data at any single time point. In addition, this study demonstrates 459

that the associations of different known true positive genes with different 460

functional principal components describing variation in the target trait are 461

consistent with the known biological roles and previous quantitative genetic 462

associations of those genes. This in turn suggests that functional principal 463

component-based genome-wide association studies of time-series data can provide 464

greater insight into the distinct roles different trait-associated loci play in 465

determining variation in a single phenotype. A statistical approach to decomposing 466

patterns of phenotypic variation over time into functional principal components 467

was employed that is robust to incomplete and used uneven observations of 468

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 14: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

14/30

different subsets of the population at different time points. This robust approach 469

should aid the broader adoption of functional principal component-based genome-470

wide association in a wider range of quantitative genetics contexts. 471

Materials and Methods 472

Plant materials and growth conditions 473

357 sorghum (Sorghum bicolor) lines from the SAP were planted, grown, and 474

phenotyped at (Latitude: 41.162, Longitude: -96.407) part of the University of 475

Nebraska-Lincoln’s Eastern Nebraska Research and Extension Center in 2016. 476

Plant height, defined as the distance from emergence from the soil to the top of the 477

uppermost panicle, was manually scored at maturity. Due to the height limitation 478

of the imaging chamber equipped in the high-throughput phenotyping facility in 479

the University of Nebraska-Lincoln’s Greenhouse Innovation Center (UNL-GIC) 480

(Latitude: 40.83, Longitude: -96.69), 38 lines with heights in the field >2 meters 481

were excluded from subsequent greenhouse experiments. Seeds from the 482

remaining 319 sorghum SAP lines were sown in the greenhouse of UNL-GIC on 483

June 15, 2016. Sorghum seeds were sown in 9.46-liter pots with Fafard 484

germination mix supplemented with 8.8 kg 3–4 month Osmocote and 8.8 kg 5–6 485

month Osmocote, 1 tablespoon (15 mL) of Micromax Micronutrients, and 1,800 g 486

lime per 764.5 liter (1 cubic yard) of soil. Twenty-seven sorghum lines failed to 487

germinate or failed to grow healthily under greenhouse conditions. These sorghum 488

lines were omitted from downstream analyses, leaving a total of 292 lines for 489

phenotyping. Phenotyped plants were grown under a target photoperiod of 14:10 490

day:night with supplementary light provided by light-emitting diode (LED) growth 491

lamps from 07:00 to 21:00 each day. The target temperature of the growth facility 492

was between 20–28.3°C. After growing in the greenhouse for 40 days, all the 493

plants were moved on to the conveyor belt, which transferred each pot to the 494

imaging chamber every two days and to the watering station each day to keep all 495

the plants growing under a good condition. At the watering station, plants were 496

weighed once per day and watered back to a target weight, including pot, soil, 497

carrier, and plant of 6,300 g from July 25th to August 9th, 7,000 g from August 9th 498

until the termination of the experiment August 31st–76 days after planting (DAP). 499

500

Image data acquisition 501

The imaging of all phenotyped sorghum lines commenced on July 26th and 502

continued until August 31st, 41–76 DAP. Image data was collected using the high-503

throughput phenotyping facility in UNL-GIC previously described (Ge et al., 504

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 15: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

15/30

2016). Each plant was imaged every other day by a visible camera (piA2400-505

17gm, BASLER, Germany with PENTAX lens) and a hyperspectral camera 506

(Headwall Photonics, Fitchburg, MA, USA). The visible camera was used to 507

capture RGB images with the resolution 2,354 × 2,056 including two different 508

zoom levels. The first zoom level was applied at 41–53 DAP. Each pixel 509

represented an area of approximately .45mm2 for objects in the range between the 510

camera and the pot containing the plant. The second zoom level with each pixel 511

representing an area of approximately 1.8mm2 was applied after 54 DAP until the 512

termination of the experiment (76 DAP). The hyperspectral camera has a spectral 513

range of 546–1,700 nm and 243 image bands. Plants were arranged so that most 514

leaves perpendicularly face to the hyperspectral camera. A hyperspectral cube for 515

each plant were captured at a resolution of 320 × 560 pixels. For each 516

hyperspectral cube, a total of 243 separate intensity values were captured for each 517

pixel spanning the range of light wavelengths between 546–1,700 nm. A constant 518

zoom level was applied for all the hyperspectral images throughout the whole 519

experiment with each pixel representing an area of approximately 8.8 mm2. 520

Height extraction from RGB and hyperspectral images 521

The estimate of plant height in RGB images uses the green index threshold method 522

based on the equation 2×Green/(Red + Blue) (Ge et al., 2016). In this study, the 523

green index threshold 1.12 was applied to separate plant pixels from background 524

pixels. After the whole-plant segmentation was done, the number of pixels from 525

the lowermost to the uppermost of the plant in the vertical direction were counted 526

to represent the plant height in the pixel unit. The stalk plus panicle height from 527

hyperspectral images were estimated using a semantic segmentation method based 528

on the hyperspectral signatures of different sorghum organs (Miao et al., 2020). A 529

Linear Discriminant Analysis (LDA) model was adopted from the previous 530

sorghum semantic segmentation project to classify each pixel to background, leaf, 531

stalk, or panicle (Miao et al., 2020). After the semantic segmentation, the stalk plus 532

panicle plant height in the pixel unit was estimated by counting the number of stalk 533

and panicle pixels on the vertical direction. Manual proofing of growth curves was 534

used to identify and correct a modest number of cases in early time points where 535

leaf midrib pixels were misclassified as stalk pixels, resulting in incorrect 536

estimation of plant heights. These cases could also be identified as poor fits of 537

nonparametric regression curves to the problematic data point relative to its 538

neighbors. The real plant heights from RGB and hyperspectral images were 539

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 16: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

16/30

obtained by using the pixel height multiplied by the ratio of the real size and pixel 540

size based on the corresponding zoom level. 541

Nonparametric fitting of growth curves and missing heights imputation 542

The growth curve of each sorghum line was obtained by fitting the heights at 543

different time points using the nonparametric regression. The missing heights were 544

also imputed. Let Yij be the jth observed phenotype of the ith plant, made at day tij, i 545

= 1, 2, ..., n, j = 1, 2, ..., mi, where mi is the total number of days observed for the 546

ith plant. To model the plant growth, the following non-parametric model was 547

employed: 548

Yij =µ(tij)+e(tij) (1) 549

where µ(·) is a mean function of the phenotype development and e(tij) is a zero-550

mean process associated with ith plant observed at tij. Let B(t) = (B1, ..., BK)T

(t) be 551

a vector of B-spline basis functions, where K is the number of basis functions. The 552

estimated mean function can be expressed as µ^ 𝑡 = 𝐵𝑇 𝑡 𝑐 = 𝐵𝑘

𝐾𝑘=1 𝑡 𝑐𝑘 , 553

where c is a vector of coefficients of length K obtained using penalized least 554

squares approach (Ramsay and Silverman, 2005; Xu et al., 2018b). 555

Functional principal component analysis (FPCA) 556

The plant growth curves over time were summarized using functional principle 557

component scores. In FPCA, the process e(tij) in (1) is decomposed into two parts: 558

𝑒 𝑡𝑖𝑗 = 𝜉𝑖,𝑙

𝐿

𝑙=1

∅𝑙 𝑡𝑖𝑗 + 𝜀𝑖𝑗

559

where ξi,l are zero-mean principal components scores with variance λl, ∅𝑙 𝑡𝑖𝑗 are 560

eigenfunctions corresponding to principal components scores, and εij are zero-mean 561

measurement errors with constant variance. In FPCA, eigenfunctions are 562

orthonormal, namely ∫∅𝑙1(t)∅𝑙2(t)dt = 0, for all 𝑙1≠ 𝑙2 and ∫ ∅𝑙2 𝑡 𝑑𝑡 = 1, so the 563

characteristics of phenotype development for the ith genotype can be represented 564

by its principal components scores ξi,l, 𝑙 = 1, 2, ..., L. The variance of the principal 565

components scores, λl,𝑙 = 1, 2, ..., L, are sorted in decreasing order, so the first few 566

principal components scores usually capture the majority of variation in the 567

phenotype data. B-spline bases were employed for the approximation of 568

eigenfunctions. The variance λl and eigenfunctions ∅𝑙 (·) are estimated by 569

eigenvalue decomposition and the principal components scores ξi,l, 𝑙 = 1, 2, ..., L 570

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 17: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

17/30

are estimated using best linear unbiased prediction (Ramsay and Silverman, 2005; 571

Xu et al., 2018a). 572

Genotyping the SAP 573

DNA was extracted from seedling stage plants grown in the Beadle Center 574

Greenhouse complex at University of Nebraska-Lincoln using the same seed 575

employed for phenotyping. Sequencing libraries generated using a modified tGBS 576

protocol (Ott et al., 2017) were constructed and paired-end (151-bp read length) 577

sequenced in 8 lanes of an Illumina HiSeqX instrument. This produced an average 578

of 4 million paired-end reads per sample. The quality of raw reads was controlled 579

using the Trimmomatic/0.33 software with the parameters ’TRAILING:20’, 580

’SLIDINGWINDOW:4:20’, and ’MINLEN:40’ (Bolger et al., 2014). High-quality 581

reads were mapped to the sorghum reference genome (V4) (McCormick et al., 582

2017) using the BWA/0.7.17 MEM algorithm with default parameters (Li and 583

Durbin, 2009). After the alignment, the GATK/4.1 HaplotypeCaller function was 584

used to perform SNP calling with default parameters (McKenna et al., 2010) and 585

the raw VCF file including 358 samples was generated. Quality controls of the raw 586

VCF including missing data rate (<0.7), minor allele frequency (>0.01), and 587

heterozygous rate (<0.05) were applied using customized python scripts 588

(https://github.com/freemao/schnablelab). Missing data was then imputed using 589

Beagle/4.1 with parameters ’window=6600, overlap=1320’ (Browning and 590

Browning, 2016). The sizes of sliding windows and the overlap windows were set 591

to capture 10% of all SNPs on a given chromosome and 2% of all SNPs on a given 592

chromosome, respectively. A final set of 569,305 high-confidence SNPs was 593

generated and employed in all downstream analyses. 594

Genome-wide association study analyses 595

Three kinds of GWAS were conducted in this study: 1. Sequential GWAS of plant 596

height based on days after planting (DAP sequential GWAS); 2. Sequential GWAS 597

of plant height based on days after panicle emergence (DAPE sequential GWAS); 598

3. GWAS of the dynamic trait of the plant growth curve (FPCA GWAS). 599

For DAP sequential GWAS, the plant heights of 292 lines at each time point 600

were used as the phenotypes. As different subsets of the population were 601

phenotyped on alternating dates, height values for all plants on individual data 602

were drawn from the values for the nonparametrically fit curves for that genotype, 603

as described above. Genotype data of the 292 lines was drawn the original 604

genotype data including 358 lines from SAP. After subsetting to the data from the 605

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 18: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

18/30

292 phenotyped SAP lines, any SNPs which now fell below the previous criteria 606

used to filter SNPs for the full set of 358 lines–minor allele frequency (>0.01) and 607

the heterozygous rate (<0.05)–were excluded. A total of 36 GWAS were 608

conducted from 41 DAP to 75 DAP using the same genotype data. For the DAPE 609

sequential GWAS, X-axis (time) values for individual plant growth curves were 610

recentered based on the date of panicle emergence. The panicle emergence was 611

defined when the tip of the panicle is visible from the sorghum flag leaf sheath in 612

the image. In all but thirteen cases, panicle emergence occurred within the window 613

in which image data was collected. In one case, panicle emergence occurred prior 614

to the first day of imaging; and in twelve cases, panicle emergence occurred 615

subsequent to the last day of imaging. As a result, the number of lines with 616

observed data was inconsistent across individual time points. Analyses were 617

conducted between -14 and 10 DAPE. Within this interval, observed height values 618

were present for at least 235 lines at each time point. For each time point, genotype 619

information was subset and refiltered based on the set of lines with available trait 620

data, as described above. A total of 25 GWAS were conducted from -14 to 10 621

DAPE using the corresponding customized marker datasets generated for 622

phenotyped lines at each individual time point. For the FPCA GWAS, the first and 623

second principal component (PC) scores of each sorghum line were used as trait 624

values. The genotype data used was identical to that employed for the DAP 625

sequential GWAS analyses. 626

All the GWAS were conducted using mixed linear models (MLM) implemented 627

by GEMMA/0.95 (Zhou and Stephens, 2012). The first five principal components 628

derived from the genotype data using Tassel/5.0 (Bradbury et al., 2007) were fitted 629

to MLM as the fixed effect. Meanwhile, the kinship matrix calculated using 630

“gemma -gk 1” command was fitted to MLM as the random effect. All 63 GWAS 631

jobs were run on the Holland Computing Center’s Crane cluster at the University 632

of Nebraska-Lincoln. The number of independent SNPs in each genotype data was 633

determined using the GEC/0.2 software (Li et al., 2012). A Bonferroni corrected p-634

value of 0.05 calculated based on the number of independent SNPs in each specific 635

analysis was applied as the cutoff for statistical significance in each separate 636

GWAS analysis (Yang et al., 2014). 637

638

Data and code availability 639

Both imputed and unimputed genotype datasets used in this study have been 640

deposited on Figshare (Miao and Schnable, 2019). Raw image data, including 641

RGB and hyperspectral collected from each plant at each time point are being 642

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 19: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

19/30

deposited and disseminated through CyVerse (https://doi.org/10.25739/p39b-643

dz61). The R code implementing the nonparametric regression and FPCA used in 644

this study has been deposited on GitHub 645

(https://github.com/freemao/FPCA_GWAS). 646

Accession Numbers 647

Sequences for the four major genes/proteins mentioned in this paper Dwarf1 (Dw1, 648

Sobic.009G229800), Dwarf2 (Dw2, Sobic.006G067700), maturity1 (Ma1, 649

Sobic.006G057866), Dwarf3 (Dw3, Sobic.007G163800) were retrieved from v3.1.1 650

of the sorghum genome as provided in Phytozome v12.1. 651

Sequence data from this article can be found in the GenBank/EMBL data libraries 652

under accession numbers_. 653

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 20: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

20/30

654

Supplemental Data 655

656

Supplemental Figure S1. The distributions of minor allele frequency (MAF) of 657

SNPs around dwarf genes. 658

in unconstrained (blue) and constrained (yellow) populations. 659

Supplemental Figure S2. Semantic segmentation of an example sorghum plant 660

using a hyperspectral image. 661

Supplemental Figure S3. Imputing unobserved height values for sorghum 662

genotypes through nonparametric regression. 663

Supplemental Figure S4. Proportion of variance explained for observed and 664

imputed plant heights. 665

Supplemental Figure S5. Summary of significant SNPs identified in independent 666

GWAS for each time point anchoring on days after planting rather than days after 667

panicle emergence. 668

Supplemental Figure S6. Manhattan plots for GWAS on height. 669

measured at 41 days after planting (Panel A) and 72 days after planting (Panel B). 670

Supplemental Figure S7. Manhattan plot for GWAS on the date of panicle. 671

emergence. 672

Supplemental Table S1. Significant SNPs identified by DAP Sequential GWAS. 673

674

675

Acknowledgements 676

This work was supported by a University of Nebraska Agricultural Research 677

Division seed grant to JCS and a National Science Foundation Award (OIA-678

1557417) to JCS. This project was completed utilizing the Holland Computing 679

Center of the University of Nebraska, which receives support from the Nebraska 680

Research Initiative. 681

Conflicts of Interest 682

JCS, PSS, and SL have equity interests in Data2Bio, a company which provides 683

genotyping services using the same protocol employed for genotyping in this 684

study. The authors affirm that they have no other conflicts of interest. 685

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 21: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

21/30

Figure Legends 686

Figure 1. Reduced power to identify causal loci in phenotypically constrained 687

populations. (A) A genome wide association analyses for plant height, defined as 688

the distance between the soil surface and the top of the panicle at maturity, 689

using field collected data for 357 lines from sorghum association panel and the 690

set of genotype call data used in this study. The location of Dw1, Dw2 and Ma1 691

are indicated with dashed lines, as is an additional known height locus (KHL) 692

identified in multiple prior GWAS conducted on height in this population using 693

different genetic marker data. (B) Distribution of observed heights for the 357 694

lines employed for association analysis in panel A. The set of thirty eight lines 695

above two meters in height are marked in red. (C) An genome wide association 696

analysis identical to that shown in panel A but with the exclusion of lines with the 697

field heights >2 meters (38 lines) and those which we were not able to 698

successfully germinate and phenotype in this study (27 lines). 699

700

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 22: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

22/30

Figure 2. Different methods to define and measure plant height produce 701

different outcomes. (A) Conventional RGB images of a single sorghum plant (PI 702

576401) taken on eight different days spanning the transition from vegetative to 703

reproductive development. Pixels identified as "plant" through whole plant 704

segmentation are outlined in red. Measured plant height, defined as the distance 705

between the plant pixels with the smallest and greatest y-axis value is indicated 706

by the horizontal blue bar in each image. (B) Semantically segmented images of 707

same plant taken on the same day from a moderately different viewing angle 708

using a hyperspectral camera. Pixels classified as "leaf" are indicated in green, 709

pixels classified as "stem" are indicated in orange, and pixels classified as 710

"panicle" are indicated in purple. Measured plant height, defined as the distance 711

between stem or panicle pixels with the smallest and greatest y-axis value is 712

indicated by the horizontal red bar in each image. (C) Observed and imputed 713

plant heights for the same sorghum plant on each day within the range of 714

phenotypic data collection. Blue and red circles indicate measured height values 715

from whole plant segmentation of RGB images and semantic segmentation of 716

hyperspectral images, respectively. Solid blue and solid red lines indicate height 717

values imputed for unobserved time points using nonparametric regression for 718

whole plant and semantic height datasets, respectively. 719

720

Figure 3. Comparison of change in plant height over time for members of the 721

SAP population when anchoring either on planting date (DAP) or panicle 722

emergence date (DAPE). (A) Growth curves imputed using nonparametric 723

regression for 20 representative sorghum genotypes, anchored for comparison 724

based on sharing the same date of planting. (B) Growth curves imputed using 725

nonparametric regression for the same 20 sorghum genotypes shown in panel A, 726

anchored for comparison based on sharing the same date of panicle emergence. 727

Lines with identical colors in panel A and panel B indicate data taken from the 728

same plants. Regression lines are not extended beyond the range of observed 729

datapoints. As panicle emergence occurred less than 18 days after the start of 730

imaging for some lines and less than 17 days before the last day of imaging for 731

others, curves in panel B are incomplete. 732

733

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 23: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

23/30

Figure 4. Several known causal loci show statistically significant associations with 734

plant height when sequential genome wide association studies are conducted 735

using data anchored to the date of panicle emergence. A summary of where 736

statistically significant trait associated SNPs were identified in separate genome 737

wide association studies conducted using height data for each day between -14 738

to +10 days after panicle emergence (DAPE). Each vertical column summarizes 739

the results from one of the twenty-five independently conducted genome wide 740

association studies. Each sorghum chromosome is divided into sixteen bins 741

containing equal numbers of SNP markers. Each cell in each vertical column is 742

color coded based on the single most significant p-value observed for any marker 743

within that bin on that day. Light pink cells indicate bins which contain no 744

markers which exceed the multiple testing corrected threshold for statistical 745

significance. The locations of the two cloned dwarf genes and the one cloned 746

maturity gene which were successfully identified in analysis of data from at least 747

one time point are indicated with horizontal dashed lines. 748

749

750

Figure 5 Two functional principal components explain >97% of variance in the 751

sorghum growth curves observed in this study. Functional principal component 752

analysis seeks to describe the pattern of change in height over time of each 753

observed plant using a mean function combined with variable weightings of a set 754

of eigenfunctions. (A) Comparison of the empirical mean function (red) for all 755

growth curves observed in this study (gray) and the mean function estimated 756

using functional principal component analysis (blue). (B) Illustration of how 757

changing the score for functional principal component one alters the resulting 758

growth curve. (SD = Standard Deviation) (C) Illustration of how changing the score 759

for functional principal component two alters the resulting growth curve. 760

761

762

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 24: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

24/30

Figure 6. Mapping genes associated with variation in functional principal 763

component scores among sorghum genotypes. (A) Distribution of functional 764

principal component one scores among the 292 genotypes phenotyped as part of 765

this study. Genotypes with the most negative values for functional principal 766

component one are indicated in red, and genotypes with the most positive values 767

for functional principal component one are indicated in blue. (B) Growth curves 768

for a subset of genotypes with the most negative values for functional principal 769

component one are indicated in red. Growth curves for a subset of genotypes 770

with the most positive values for functional principal component one are 771

indicated in blue. (C) Distribution of functional principal component two scores 772

among the 292 genotypes phenotyped as part of this study. Genotypes with the 773

most negative values for functional principal component two are indicated in red, 774

and genotypes with the most positive values for functional principal component 775

two are indicated in blue. (D) Growth curves for a subset of genotypes with the 776

most negative values for functional principal component two are indicated in red. 777

Growth curves for a subset of genotypes with the most positive values for 778

functional principal component two are indicated in blue. (E) Results of 779

conducting a genome wide association analysis for functional principal 780

component one scores. (F) Results of conducting a genome wide association 781

analysis for functional principal component two scores. In panels E and the 782

positions of three cloned dwarf genes Dw1, Dw2, and Dw3 as well as the cloned 783

maturity gene Ma1 are indicated using vertical dash lines. Horizontal dash lines 784

indicate multiple testing corrected cutoff of a statistically significant association. 785

786

787

References 788

Bolger, Anthony M, Marc Lohse, and Bjoern Usadel (2014), “Trimmomatic: a 789

flexible trimmer for illumina sequence data.” Bioinformatics, 30, 2114–2120. 790

Bradbury, Peter J, Zhiwu Zhang, Dallas E Kroon, Terry M Casstevens, Yogesh 791

Ramdoss, and Edward S Buckler (2007), “Tassel: software for association 792

mapping of complex traits in diverse samples.” Bioinformatics, 23, 2633–2635. 793

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 25: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

25/30

Brown, Patrick J, PE Klein, E Bortiri, CB Acharya, WL Rooney, and S Kresovich 794

(2006), “Inheritance of inflorescence architecture in sorghum.” Theoretical and 795

Applied Genetics, 113, 931–942. 796

Brown, Patrick J, William L Rooney, Cleve Franks, and Stephen Kresovich 797

(2008), “Efficient mapping of plant height quantitative trait loci in a sorghum 798

association population with introgressed dwarfing genes.” Genetics, 180, 629–799

637. 800

Browning, Brian L and Sharon R Browning (2016), “Genotype imputation with 801

millions of reference samples.” The American Journal of Human Genetics, 98, 802

116–126. 803

Casa, Alexandra M, Gael Pressoir, Patrick J Brown, Sharon E Mitchell, William L 804

Rooney, Mitchell R Tuinstra, Cleve D Franks, and Stephen Kresovich (2008), 805

“Community resources and strategies for association mapping in sorghum.” Crop 806

Science, 48, 30. 807

Chen, Wei, Wensheng Wang, Meng Peng, Liang Gong, Yanqiang Gao, Jian Wan, 808

Shouchuang Wang, Lei Shi, Bin Zhou, Zongmei Li, et al. (2016), “Comparative 809

and parallel genome-wide association studies for metabolic and agronomic traits 810

in cereals.” Nature Communications, 7, 12767. 811

Feldman, Max J, Patrick Z Ellsworth, Noah Fahlgren, Malia A Gehan, Asaph B 812

Cousins, and Ivan Baxter (2018), “Components of water use efficiency have 813

unique genetic signatures in the model c4 grass setaria.” Plant Physiology, 178, 814

699–715. 815

Feldman, Max J, Rachel E Paul, Darshi Banan, Jennifer F Barrett, Jose Sebastian, 816

Muh-Ching Yee, Hui Jiang, Alexander E Lipka, Thomas P Brutnell, Jose R 817

Dinneny, et al. (2017), “Time dependent genetic analysis links field and 818

controlled environment phenotypes in the model c4 grass setaria.” PLoS 819

Genetics, 13, e1006841. 820

Fernandez, Maria G Salas, Yin Bao, Lie Tang, and Patrick S Schnable (2017), “A 821

high-throughput, field-based phenotyping technology for tall biomass crops.” 822

Plant Physiology, pp–00707. 823

Furbank, Robert T and Mark Tester (2011), “Phenomics–technologies to relieve 824

the phenotyping bottleneck.” Trends in Plant Science, 16, 635–644. 825

Ge, Yufeng, Geng Bai, Vincent Stoerger, and James C Schnable (2016), 826

“Temporal dynamics of maize plant growth, water use, and leaf water content 827

using automated high throughput rgb and hyperspectral imaging.” Computers 828

and Electronics in Agriculture, 127, 625–632. 829

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 26: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

26/30

Gehan, Malia A, Noah Fahlgren, Arash Abbasi, Jeffrey C Berry, Steven T Callen, 830

Leonardo Chavez, Andrew N Doust, Max J Feldman, Kerrigan B Gilbert, John G 831

Hodge, et al. (2017), “Plantcv v2: Image analysis software for high-throughput 832

plant phenotyping.” PeerJ, 5, e4088. 833

Hansey, Candice N, James M Johnson, Rajandeep S Sekhon, Shawn M Kaeppler, 834

and Natalia de Leon (2011), “Genetic diversity of a maize association population 835

with restricted phenology.” Crop Science, 51, 704–715. 836

Higgins, RH, CS Thurber, I Assaranurak, and Patrick J Brown (2014), 837

“Multiparental mapping of plant height and flowering time qtl in partially 838

isogenic sorghum families.” G3: Genes, Genomes, Genetics, 4, 1593–1602. 839

Hilley, Josie, Sandra Truong, Sara Olson, Daryl Morishige, and John Mullet 840

(2016), “Identification of dw1, a regulator of sorghum stem internode length.” 841

PLoS One, 11, e0151271. 842

Hilley, Josie L, Brock D Weers, Sandra K Truong, Ryan F McCormick, Ashley J 843

Mattison, Brian A McKinley, Daryl T Morishige, and John E Mullet (2017), 844

“Sorghum dw2 encodes a protein kinase regulator of stem internode length.” 845

Scientific Reports, 7, 4616. 846

Hirano, Ko, Mayuko Kawamura, Satoko Araki-Nakamura, Haruka Fujimoto, 847

Kozue Ohmae-Shinohara, Miki Yamaguchi, Akihiro Fujii, Hiroaki Sasaki, 848

Shigemitsu Kasuga, and Takashi Sazuka (2017), “Sorghum dw1 positively 849

regulates brassinosteroid signaling by inhibiting the nuclear localization of 850

brassinosteroid insensitive 2.” Scientific Reports, 7, 1–10. 851

Holman, Fenner, Andrew Riche, Adam Michalski, March Castle, Martin Wooster, 852

and Malcolm Hawkesford (2016), “High throughput field phenotyping of wheat 853

plant height and growth rate in field plot trials using uav based remote sensing.” 854

Remote Sensing, 8, 1031. 855

Honsdorf, Nora, Timothy John March, Bettina Berger, Mark Tester, and Klaus 856

Pillen (2014), “High-throughput phenotyping to detect drought tolerance qtl in 857

wild barley introgression lines.” PLoS One, 9, e97047. 858

Huang, Xuehui, Tao Sang, Qiang Zhao, Qi Feng, Yan Zhao, Canyang Li, 859

Chuanrang Zhu, Tingting Lu, Zhiwu Zhang, Meng Li, et al. (2010), “Genome-860

wide association studies of 14 agronomic traits in rice landraces.” Nature 861

Genetics, 42, 961. 862

Kwak, Il-Youp, Candace R Moore, Edgar P Spalding, and Karl W Broman (2016), 863

“Mapping quantitative trait loci underlying function-valued traits using 864

functional principal component analysis and multi-trait mapping.” G3: Genes, 865

Genomes, Genetics, 6, 79–86. 866

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 27: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

27/30

Lally, David, Peter Ingmire, Hong-Yun Tong, and Zheng-Hui He (2001), 867

“Antisense expression of a cell wall–associated protein kinase, wak4, inhibits cell 868

elongation and alters morphology.” The Plant Cell, 13, 1317–1332. 869

Li, Heng and Richard Durbin (2009), “Fast and accurate short read alignment with 870

burrows–wheeler transform.” Bioinformatics, 25, 1754–1760. 871

Li, Miao-Xin, Juilian MY Yeung, Stacey S Cherny, and Pak C Sham (2012), 872

“Evaluating the effective numbers of independent tests and significant p-value 873

thresholds in commercial genotyping arrays and public imputation reference 874

datasets.” Human Genetics, 131, 747–756. 875

Li, Xin, Xianran Li, Eyal Fridman, Tesfaye T Tesso, and Jianming Yu (2015), 876

“Dissecting repulsion linkage in the dwarfing gene dw3 region for sorghum plant 877

height provides insights into heterosis.” Proceedings of the National Academy of 878

Sciences, 112, 11823–11828. 879

Liang, Zhikai, Piyush Pandey, Vincent Stoerger, Yuhang Xu, Yumou Qiu, Yufeng 880

Ge, and James C Schnable (2017), “Conventional and hyperspectral time-series 881

imaging of maize lines widely used in field trials.” GigaScience, 7, gix117. 882

Ma, Chang-Xing, George Casella, and Rongling Wu (2002), “Functional mapping 883

of quantitative trait loci underlying the character process: a theoretical 884

framework.” Genetics, 161, 1751–1762. 885

McCormick, Ryan F, Sandra K Truong, Avinash Sreedasyam, Jerry Jenkins, 886

Shengqiang Shu, David Sims, Megan Kennedy, Mojgan Amirebrahimi, Brock D 887

Weers, Brian McKinley, et al. (2017), “The sorghum bicolor reference genome: 888

improved assembly, gene annotations, a transcriptome atlas, and signatures of 889

genome organization.” The Plant Journal. 890

McKenna, Aaron, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian 891

Cibulskis, Andrew Kernytsky, Kiran Garimella, David Altshuler, Stacey Gabriel, 892

Mark Daly, et al. (2010), “The genome analysis toolkit: a mapreduce framework 893

for analyzing next-generation dna sequencing data.” Genome Research, 20, 894

1297–1303. 895

Miao, Chenyong, Alejandro Pages, Zheng Xu, Eric Rodene, Jinliang Yang, and 896

James C. Schnable (2020), “Semantic segmentation of sorghum using 897

hyperspectral data identifies genetic associations.” Plant Phenomics. 898

Miao, Chenyong and James C Schnable (2019), “Genotype dataset for sorghum 899

association panel sap.” 900

Miao, Chenyong, Jinliang Yang, and James C Schnable (2019), “Optimising the 901

identification of causal variants across varying genetic architectures in crops.” 902

Plant Biotechnology Journal, 17, 893–905. 903

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 28: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

28/30

Moore, Candace R, Logan S Johnson, Il-Youp Kwak, Miron Livny, Karl W 904

Broman, and Edgar P Spalding (2013), “Highthroughput computer vision 905

introduces the time axis to a quantitative trait map of a plant growth response.” 906

Genetics, 195, 1077–1086. 907

Morris, Geoffrey P, Punna Ramu, Santosh P Deshpande, C Thomas Hash, Trushar 908

Shah, Hari D Upadhyaya, Oscar RieraLizarazu, Patrick J Brown, Charlotte B 909

Acharya, Sharon E Mitchell, et al. (2013), “Population genomic and genome-910

wide association studies of agroclimatic traits in sorghum.” Proceedings of the 911

National Academy of Sciences, 110, 453–458. 912

Multani, Dilbag S, Steven P Briggs, Mark A Chamberlin, Joshua J Blakeslee, 913

Angus S Murphy, and Gurmukh S Johal (2003), “Loss of an mdr transporter in 914

compact stalks of maize br2 and sorghum dw3 mutants.” Science, 302, 81–84. 915

Muraya, Moses M, Jianting Chu, Yusheng Zhao, Astrid Junker, Christian Klukas, 916

Jochen C Reif, and Thomas Altmann (2017), “Genetic variation of growth 917

dynamics in maize (zea mays l.) revealed through automated non-invasive 918

phenotyping.” The Plant Journal, 89, 366–380. 919

Murphy, Rebecca L, Robert R Klein, Daryl T Morishige, Jeff A Brady, William L 920

Rooney, Frederick R Miller, Diana V Dugas, 921

Patricia E Klein, and John E Mullet (2011), “Coincident light and clock 922

regulation of pseudoresponse regulator protein 37 (prr37) controls photoperiodic 923

flowering in sorghum.” Proceedings of the National Academy of Sciences, 108, 924

16469–16474. 925

Navarro, J Alberto Romero, Martha Willcox, Juan Burgueño, Cinta Romay, Kelly 926

Swarts, Samuel Trachsel, Ernesto Preciado, Arturo Terron, Humberto Vallejo 927

Delgado, Victor Vidal, et al. (2017), “A study of allelic diversity underlying 928

flowering-time adaptation in maize landraces.” Nature Genetics, 49, 476. 929

Oren, Seval, Halil Ceylan, Patrick S Schnable, and Liang Dong (2017), “High-930

resolution patterning and transferring of graphene-based nanomaterials onto tape 931

toward roll-to-roll production of tape-based wearable sensors.” Advanced 932

Materials Technologies, 2, 1700223. 933

Osman, Khalid A, Bin Tang, Yaping Wang, Juanhua Chen, Feng Yu, Liu Li, 934

Xuesong Han, Zuxin Zhang, Jianbin Yan, Yonglian Zheng, et al. (2013), 935

“Dynamic qtl analysis and candidate gene mapping for waterlogging tolerance at 936

maize seedling stage.” PloS One, 8, e79305. 937

Ott, Alina, Sanzhen Liu, James C Schnable, Cheng-Ting Yeh, Cassy Wang, and 938

Patrick S Schnable (2017), “Tunable genotyping-by-sequencing (tgbs®) enables 939

reliable genotyping of heterozygous loci. biorxiv.” 940

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 29: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

29/30

Paine, CE, Toby R Marthews, Deborah R Vogt, Drew Purves, Mark Rees, Andy 941

Hector, and Lindsay A Turnbull (2012), “How to fit nonlinear plant growth 942

models and calculate growth rates: an update for ecologists.” Methods in Ecology 943

and Evolution, 3, 245–256. 944

Quinby, John Roy (1974), Sorghum Improvement and the Genetics of Growth. 945

Texas agricultural experiment station. 946

Quinby, JR (1967), “The maturity genes of sorghum.” In Advances in Agronomy, 947

volume 19, 267–305, Elsevier. 948

Quinby, JR and RE Karper (1945), “Inheritance of three genes that influence time 949

of floral initiation and maturity date in milo.” Journal of the American Society of 950

Agronomy. 951

Ramsay, J. O. and B. W. Silverman (2005), Functional Data Analysis (2nd ed.). 952

Springer-Verlag, New York. 953

Sang, Mengmeng, Hexin Shi, Kun Wei, Meixia Ye, Libo Jiang, Lidan Sun, and 954

Rongling Wu (2019), “A dissection model for mapping complex traits.” The 955

Plant Journal, 97, 1168–1182. 956

Wagner, Tanya A and Bruce D Kohorn (2001), “Wall-associated kinases are 957

expressed throughout plant development and are required for cell expansion.” 958

The Plant Cell, 13, 303–318. 959

Wu, Rongling and Min Lin (2006), “Functional mapping—how to map and study 960

the genetic architecture of dynamic complex traits.” Nature Reviews Genetics, 7, 961

229. 962

Wu, Rongling, Chang-Xing Ma, Min Lin, Zuoheng Wang, and George Casella 963

(2004), “Functional mapping of quantitative trait loci underlying growth 964

trajectories using a transform-both-sides logistic model.” Biometrics, 60, 729–965

738. 966

Wu, Xianshan, Zhenghang Wang, Xiaoping Chang, and Ruilian Jing (2010), 967

“Genetic dissection of the developmental behaviours of plant height in wheat 968

under diverse water regimes.” Journal of Experimental Botany, 61, 2923–2937. 969

Würschum, Tobias, Wenxin Liu, Lucas Busemeyer, Matthew R Tucker, Jochen C 970

Reif, Elmar A Weissmann, Volker Hahn, Arno Ruckelshausen, and Hans Peter 971

Maurer (2014), “Mapping dynamic qtl for plant height in triticale.” BMC 972

Genetics, 15, 59. 973

Xu, Yuhang, Yehua Li, and Dan Nettleton (2018a), “Nested hierarchical functional 974

data modeling and inference for the analysis of functional plant phenotypes.” 975

Journal of the American Statistical Association, 113, 593–606. 976

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 30: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

30/30

Xu, Yuhang, Yumou Qiu, and James C Schnable (2018b), “Functional modeling of 977

plant growth dynamics.” The Plant Phenome Journal, 1, 170007. 978

Yamaguchi, Miki, Haruka Fujimoto, Ko Hirano, Satoko Araki-Nakamura, Kozue 979

Ohmae-Shinohara, Akihiro Fujii, Masako Tsunashima, Xian Jun Song, Yusuke 980

Ito, Rie Nagae, et al. (2016), “Sorghum dw1, an agronomically important gene 981

for lodging resistance, encodes a novel protein involved in cell proliferation.” 982

Scientific Reports, 6, 28366. 983

Yamasaki, Masanori, Stephen I Wright, and Michael D McMullen (2007), 984

“Genomic screening for artificial selection during domestication and 985

improvement in maize.” Annals of Botany, 100, 967–973. 986

Yan, Juqiang, Jun Zhu, Cixin He, Mebrouk Benmoussa, and Ping Wu (1998), 987

“Molecular dissection of developmental behavior of plant height in rice (oryza 988

sativa l.).” Genetics, 150, 1257–1265. 989

Yang, Jie, Rongling Wu, and George Casella (2009), “Nonparametric functional 990

mapping of quantitative trait loci.” Biometrics, 65, 30–39. 991

Yang, Wanneng, Zilong Guo, Chenglong Huang, Lingfeng Duan, Guoxing Chen, 992

Ni Jiang, Wei Fang, Hui Feng, Weibo Xie, Xingming Lian, et al. (2014), 993

“Combining high-throughput phenotyping and genome-wide association studies 994

to reveal natural genetic variation in rice.” Nature Communications, 5, 5087. 995

Zhou, Xiang and Matthew Stephens (2012), “Genome-wide efficient mixed-model 996

analysis for association studies.” Nature Genetics, 44, 821. 997

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 31: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 1. Reduced power to identify causal loci in phenotypically constrained populations. (A) A genome wide

association analyses for plant height, defined as the distance between the soil surface and the top of the panicle at

maturity, using field collected data for 357 lines from sorghum association panel and the set of genotype call data

used in this study. The location of Dw1, Dw2 and Ma1 are indicated with dashed lines, as is an additional known

height locus (KHL) identified in multiple prior GWAS conducted on height in this population using different genetic

marker data. (B) Distribution of observed heights for the 357 lines employed for association analysis in panel A.

The set of thirty eight lines above two meters in height are marked in red. (C) An genome wide association analysis

identical to that shown in panel A but with the exclusion of lines with the field heights >2 meters (38 lines) and

those which we were not able to successfully germinate and phenotype in this study (27 lines).

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 32: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 2. Different methods to define and measure plant height produce different outcomes. (A) Conventional

RGB images of a single sorghum plant (PI 576401) taken on eight different days spanning the transition from

vegetative to reproductive development. Pixels identified as "plant" through whole plant segmentation are

outlined in red. Measured plant height, defined as the distance between the plant pixels with the smallest and

greatest y-axis value is indicated by the horizontal blue bar in each image. (B) Semantically segmented images of

same plant taken on the same day from a moderately different viewing angle using a hyperspectral camera. Pixels

classified as "leaf" are indicated in green, pixels classified as "stem" are indicated in orange, and pixels classified as

"panicle" are indicated in purple. Measured plant height, defined as the distance between stem or panicle pixels

with the smallest and greatest y-axis value is indicated by the horizontal red bar in each image. (C) Observed and

imputed plant heights for the same sorghum plant on each day within the range of phenotypic data collection.

Blue and red circles indicate measured height values from whole plant segmentation of RGB images and semantic

segmentation of hyperspectral images, respectively. Solid blue and solid red lines indicate height values imputed

for unobserved time points using nonparametric regression for whole plant and semantic height datasets,

respectively.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 33: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 3. Comparison of change in plant height over time for members of the SAP population when anchoring

either on planting date (DAP) or panicle emergence date (DAPE). (A) Growth curves imputed using nonparametric

regression for 20 representative sorghum genotypes, anchored for comparison based on sharing the same date of

planting. (B) Growth curves imputed using nonparametric regression for the same 20 sorghum genotypes shown

in panel A, anchored for comparison based on sharing the same date of panicle emergence. Lines with identical

colors in panel A and panel B indicate data taken from the same plants. Regression lines are not extended beyond

the range of observed datapoints. As panicle emergence occurred less than 18 days after the start of imaging for

some lines and less than 17 days before the last day of imaging for others, curves in panel B are incomplete.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 34: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 4. Several known causal loci show statistically significant associations with plant height when sequential

genome wide association studies are conducted using data anchored to the date of panicle emergence. A

summary of where statistically significant trait associated SNPs were identified in separate genome wide

association studies conducted using height data for each day between -14 to +10 days after panicle emergence

(DAPE). Each vertical column summarizes the results from one of the twenty-five independently conducted

genome wide association studies. Each sorghum chromosome is divided into sixteen bins containing equal

numbers of SNP markers. Each cell in each vertical column is color coded based on the single most significant p-

value observed for any marker within that bin on that day. Light pink cells indicate bins which contain no markers

which exceed the multiple testing corrected threshold for statistical significance. The locations of the two cloned

dwarf genes and the one cloned maturity gene which were successfully identified in analysis of data from at least

one time point are indicated with horizontal dashed lines.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 35: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 5. Two functional principal components explain >97% of variance in the sorghum growth curves observed in

this study. Functional principal component analysis seeks to describe the pattern of change in height over time of

each observed plant using a mean function combined with variable weightings of a set of eigenfunctions. (A)

Comparison of the empirical mean function (red) for all growth curves observed in this study (gray) and the mean

function estimated using functional principal component analysis (blue). (B) Illustration of how changing the score

for functional principal component one alters the resulting growth curve. (SD = Standard Deviation) (C) Illustration

of how changing the score for functional principal component two alters the resulting growth curve.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 36: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

1

Figure 6. Mapping genes associated with variation in functional principal component scores among sorghum

genotypes (A) Distribution of functional principal component one scores among the 292 genotypes phenotyped as

part of this study. Genotypes with the most negative values for functional principal component one are indicated

in red, and genotypes with the most positive values for functional principal component one are indicated in blue.

(B) Growth curves for a subset of genotypes with the most negative values for functional principal component one

are indicated in red. Growth curves for a subset of genotypes with the most positive values for functional principal

component one are indicated in blue. (C) Distribution of functional principal component two scores among the 292

genotypes phenotyped as part of this study. Genotypes with the most negative values for functional principal

component two are indicated in red, and genotypes with the most positive values for functional principal

component two are indicated in blue. (D) Growth curves for a subset of genotypes with the most negative values

for functional principal component two are indicated in red. Growth curves for a subset of genotypes with the

most positive values for functional principal component two are indicated in blue. (E) Results of conducting a

genome wide association analysis for functional principal component one scores. (F) Results of conducting a

genome wide association analysis for functional principal component two scores. In panels E and the positions of

three cloned dwarf genes Dw1, Dw2, and Dw3 as well as the cloned maturity gene Ma1 are indicated using vertical

dash lines. Horizontal dash lines indicate multiple testing corrected cutoff of a statistically significant association.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 37: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

Parsed CitationsBolger, Anthony M, Marc Lohse, and Bjoern Usadel (2014), "Trimmomatic: a flexible trimmer for illumina sequence data."Bioinformatics, 30, 2114–2120.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Bradbury, Peter J, Zhiwu Zhang, Dallas E Kroon, Terry M Casstevens, Yogesh Ramdoss, and Edward S Buckler (2007), "Tassel:software for association mapping of complex traits in diverse samples." Bioinformatics, 23, 2633–2635.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Brown, Patrick J, PE Klein, E Bortiri, CB Acharya, WL Rooney, and S Kresovich (2006), "Inheritance of inflorescence architecture insorghum." Theoretical and Applied Genetics, 113, 931–942.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Brown, Patrick J, William L Rooney, Cleve Franks, and Stephen Kresovich (2008), "Efficient mapping of plant height quantitative traitloci in a sorghum association population with introgressed dwarfing genes." Genetics, 180, 629–637.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Browning, Brian L and Sharon R Browning (2016), "Genotype imputation with millions of reference samples." The American Journal ofHuman Genetics, 98, 116–126.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Casa, Alexandra M, Gael Pressoir, Patrick J Brown, Sharon E Mitchell, William L Rooney, Mitchell R Tuinstra, Cleve D Franks, andStephen Kresovich (2008), "Community resources and strategies for association mapping in sorghum." Crop Science, 48, 30.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Chen, Wei, Wensheng Wang, Meng Peng, Liang Gong, Yanqiang Gao, Jian Wan, Shouchuang Wang, Lei Shi, Bin Zhou, Zongmei Li, etal. (2016), "Comparative and parallel genome-wide association studies for metabolic and agronomic traits in cereals." NatureCommunications, 7, 12767.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Feldman, Max J, Patrick Z Ellsworth, Noah Fahlgren, Malia A Gehan, Asaph B Cousins, and Ivan Baxter (2018), "Components of wateruse efficiency have unique genetic signatures in the model c4 grass setaria." Plant Physiology, 178, 699–715.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Feldman, Max J, Rachel E Paul, Darshi Banan, Jennifer F Barrett, Jose Sebastian, Muh-Ching Yee, Hui Jiang, Alexander E Lipka,Thomas P Brutnell, Jose R Dinneny, et al. (2017), "Time dependent genetic analysis links field and controlled environment phenotypesin the model c4 grass setaria." PLoS Genetics, 13, e1006841.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Fernandez, Maria G Salas, Yin Bao, Lie Tang, and Patrick S Schnable (2017), "A high-throughput, field-based phenotyping technologyfor tall biomass crops." Plant Physiology, pp–00707.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Furbank, Robert T and Mark Tester (2011), "Phenomics–technologies to relieve the phenotyping bottleneck." Trends in Plant Science,16, 635–644.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Ge, Yufeng, Geng Bai, Vincent Stoerger, and James C Schnable (2016), "Temporal dynamics of maize plant growth, water use, and leafwater content using automated high throughput rgb and hyperspectral imaging." Computers and Electronics in Agriculture, 127, 625–632.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Gehan, Malia A, Noah Fahlgren, Arash Abbasi, Jeffrey C Berry, Steven T Callen, Leonardo Chavez, Andrew N Doust, Max J Feldman,Kerrigan B Gilbert, John G Hodge, et al. (2017), "Plantcv v2: Image analysis software for high-throughput plant phenotyping." PeerJ, 5,e4088.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Hansey, Candice N, James M Johnson, Rajandeep S Sekhon, Shawn M Kaeppler, and Natalia de Leon (2011), "Genetic diversity of amaize association population with restricted phenology." Crop Science, 51, 704–715.

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 38: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Higgins, RH, CS Thurber, I Assaranurak, and Patrick J Brown (2014), "Multiparental mapping of plant height and flowering time qtl inpartially isogenic sorghum families." G3: Genes, Genomes, Genetics, 4, 1593–1602.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Hilley, Josie, Sandra Truong, Sara Olson, Daryl Morishige, and John Mullet (2016), "Identification of dw1, a regulator of sorghum steminternode length." PLoS One, 11, e0151271.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Hilley, Josie L, Brock D Weers, Sandra K Truong, Ryan F McCormick, Ashley J Mattison, Brian A McKinley, Daryl T Morishige, and JohnE Mullet (2017), "Sorghum dw2 encodes a protein kinase regulator of stem internode length." Scientific Reports, 7, 4616.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Hirano, Ko, Mayuko Kawamura, Satoko Araki-Nakamura, Haruka Fujimoto, Kozue Ohmae-Shinohara, Miki Yamaguchi, Akihiro Fujii,Hiroaki Sasaki, Shigemitsu Kasuga, and Takashi Sazuka (2017), "Sorghum dw1 positively regulates brassinosteroid signaling byinhibiting the nuclear localization of brassinosteroid insensitive 2." Scientific Reports, 7, 1–10.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Holman, Fenner, Andrew Riche, Adam Michalski, March Castle, Martin Wooster, and Malcolm Hawkesford (2016), "High throughput fieldphenotyping of wheat plant height and growth rate in field plot trials using uav based remote sensing." Remote Sensing, 8, 1031.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Honsdorf, Nora, Timothy John March, Bettina Berger, Mark Tester, and Klaus Pillen (2014), "High-throughput phenotyping to detectdrought tolerance qtl in wild barley introgression lines." PLoS One, 9, e97047.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Huang, Xuehui, Tao Sang, Qiang Zhao, Qi Feng, Yan Zhao, Canyang Li, Chuanrang Zhu, Tingting Lu, Zhiwu Zhang, Meng Li, et al. (2010),"Genome-wide association studies of 14 agronomic traits in rice landraces." Nature Genetics, 42, 961.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Kwak, Il-Youp, Candace R Moore, Edgar P Spalding, and Karl W Broman (2016), "Mapping quantitative trait loci underlying function-valued traits using functional principal component analysis and multi-trait mapping." G3: Genes, Genomes, Genetics, 6, 79–86.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Lally, David, Peter Ingmire, Hong-Yun Tong, and Zheng-Hui He (2001), "Antisense expression of a cell wall–associated protein kinase,wak4, inhibits cell elongation and alters morphology." The Plant Cell, 13, 1317–1332.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Li, Heng and Richard Durbin (2009), "Fast and accurate short read alignment with burrows–wheeler transform." Bioinformatics, 25,1754–1760.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Li, Miao-Xin, Juilian MY Yeung, Stacey S Cherny, and Pak C Sham (2012), "Evaluating the effective numbers of independent tests andsignificant p-value thresholds in commercial genotyping arrays and public imputation reference datasets." Human Genetics, 131, 747–756.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Li, Xin, Xianran Li, Eyal Fridman, Tesfaye T Tesso, and Jianming Yu (2015), "Dissecting repulsion linkage in the dwarfing gene dw3region for sorghum plant height provides insights into heterosis." Proceedings of the National Academy of Sciences, 112, 11823–11828.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Liang, Zhikai, Piyush Pandey, Vincent Stoerger, Yuhang Xu, Yumou Qiu, Yufeng Ge, and James C Schnable (2017), "Conventional andhyperspectral time-series imaging of maize lines widely used in field trials." GigaScience, 7, gix117.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Ma, Chang-Xing, George Casella, and Rongling Wu (2002), "Functional mapping of quantitative trait loci underlying the characterprocess: a theoretical framework." Genetics, 161, 1751–1762.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title https://plantphysiol.orgDownloaded on March 19, 2021. - Published by

Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 39: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

McCormick, Ryan F, Sandra K Truong, Avinash Sreedasyam, Jerry Jenkins, Shengqiang Shu, David Sims, Megan Kennedy, MojganAmirebrahimi, Brock D Weers, Brian McKinley, et al. (2017), “The sorghum bicolor reference genome: improved assembly, geneannotations, a transcriptome atlas, and signatures of genome organization.” The Plant Journal.

McKenna, Aaron, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran Garimella, DavidAltshuler, Stacey Gabriel, Mark Daly, et al. (2010), "The genome analysis toolkit: a mapreduce framework for analyzing next-generationdna sequencing data." Genome Research, 20, 1297–1303.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Miao, Chenyong, Alejandro Pages, Zheng Xu, Eric Rodene, Jinliang Yang, and James C. Schnable (2020), “Semantic segmentation ofsorghum using hyperspectral data identifies genetic associations.” Plant Phenomics.

Miao, Chenyong and James C Schnable (2019), “Genotype dataset for sorghum association panel sap.”

Miao, Chenyong, Jinliang Yang, and James C Schnable (2019), "Optimising the identification of causal variants across varying geneticarchitectures in crops." Plant Biotechnology Journal, 17, 893–905.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Moore, Candace R, Logan S Johnson, Il-Youp Kwak, Miron Livny, Karl W Broman, and Edgar P Spalding (2013), "Highthroughputcomputer vision introduces the time axis to a quantitative trait map of a plant growth response." Genetics, 195, 1077–1086.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Morris, Geoffrey P, Punna Ramu, Santosh P Deshpande, C Thomas Hash, Trushar Shah, Hari D Upadhyaya, Oscar RieraLizarazu,Patrick J Brown, Charlotte B Acharya, Sharon E Mitchell, et al. (2013), "Population genomic and genome-wide association studies ofagroclimatic traits in sorghum." Proceedings of the National Academy of Sciences, 110, 453–458.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Multani, Dilbag S, Steven P Briggs, Mark A Chamberlin, Joshua J Blakeslee, Angus S Murphy, and Gurmukh S Johal (2003), "Loss of anmdr transporter in compact stalks of maize br2 and sorghum dw3 mutants." Science, 302, 81–84.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Muraya, Moses M, Jianting Chu, Yusheng Zhao, Astrid Junker, Christian Klukas, Jochen C Reif, and Thomas Altmann (2017), "Geneticvariation of growth dynamics in maize (zea mays l.) revealed through automated non-invasive phenotyping." The Plant Journal, 89,366–380.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Murphy, Rebecca L, Robert R Klein, Daryl T Morishige, Jeff A Brady, William L Rooney, Frederick R Miller, Diana V Dugas,

Patricia E Klein, and John E Mullet (2011), "Coincident light and clock regulation of pseudoresponse regulator protein 37 (prr37)controls photoperiodic flowering in sorghum." Proceedings of the National Academy of Sciences, 108, 16469–16474.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Navarro, J Alberto Romero, Martha Willcox, Juan Burgueño, Cinta Romay, Kelly Swarts, Samuel Trachsel, Ernesto Preciado, ArturoTerron, Humberto Vallejo Delgado, Victor Vidal, et al. (2017), "A study of allelic diversity underlying flowering-time adaptation in maizelandraces." Nature Genetics, 49, 476.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Oren, Seval, Halil Ceylan, Patrick S Schnable, and Liang Dong (2017), "High-resolution patterning and transferring of graphene-basednanomaterials onto tape toward roll-to-roll production of tape-based wearable sensors." Advanced Materials Technologies, 2, 1700223.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Osman, Khalid A, Bin Tang, Yaping Wang, Juanhua Chen, Feng Yu, Liu Li, Xuesong Han, Zuxin Zhang, Jianbin Yan, Yonglian Zheng, etal. (2013), "Dynamic qtl analysis and candidate gene mapping for waterlogging tolerance at maize seedling stage." PloS One, 8, e79305.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Ott, Alina, Sanzhen Liu, James C Schnable, Cheng-Ting Yeh, Cassy Wang, and Patrick S Schnable (2017), "Tunable genotyping-by-sequencing (tgbs®) enables reliable genotyping of heterozygous loci. biorxiv."

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Paine, CE, Toby R Marthews, Deborah R Vogt, Drew Purves, Mark Rees, Andy Hector, and Lindsay A Turnbull (2012), "How to fitnonlinear plant growth models and calculate growth rates: an update for ecologists." Methods in Ecology and Evolution, 3, 245–256.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title https://plantphysiol.orgDownloaded on March 19, 2021. - Published by

Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 40: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

Quinby, John Roy (1974), Sorghum Improvement and the Genetics of Growth. Texas agricultural experiment station.Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Quinby, JR (1967), “The maturity genes of sorghum.” In Advances in Agronomy, volume 19, 267–305, Elsevier.

Quinby, JR and RE Karper (1945), “Inheritance of three genes that influence time of floral initiation and maturity date in milo.” Journalof the American Society of Agronomy.

Ramsay, J. O. and B. W. Silverman (2005), Functional Data Analysis (2nd ed.). Springer-Verlag, New York.Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Sang, Mengmeng, Hexin Shi, Kun Wei, Meixia Ye, Libo Jiang, Lidan Sun, and Rongling Wu (2019), "A dissection model for mappingcomplex traits." The Plant Journal, 97, 1168–1182.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Wagner, Tanya A and Bruce D Kohorn (2001), "Wall-associated kinases are expressed throughout plant development and are requiredfor cell expansion." The Plant Cell, 13, 303–318.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Wu, Rongling and Min Lin (2006), "Functional mapping-how to map and study the genetic architecture of dynamic complex traits."Nature Reviews Genetics, 7, 229.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Wu, Rongling, Chang-Xing Ma, Min Lin, Zuoheng Wang, and George Casella (2004), "Functional mapping of quantitative trait lociunderlying growth trajectories using a transform-both-sides logistic model." Biometrics, 60, 729–738.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Wu, Xianshan, Zhenghang Wang, Xiaoping Chang, and Ruilian Jing (2010), "Genetic dissection of the developmental behaviours ofplant height in wheat under diverse water regimes." Journal of Experimental Botany, 61, 2923–2937.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Würschum, Tobias, Wenxin Liu, Lucas Busemeyer, Matthew R Tucker, Jochen C Reif, Elmar A Weissmann, Volker Hahn, ArnoRuckelshausen, and Hans Peter Maurer (2014), "Mapping dynamic qtl for plant height in triticale." BMC Genetics, 15, 59.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Xu, Yuhang, Yehua Li, and Dan Nettleton (2018a), "Nested hierarchical functional data modeling and inference for the analysis offunctional plant phenotypes." Journal of the American Statistical Association, 113, 593–606.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Xu, Yuhang, Yumou Qiu, and James C Schnable (2018b), "Functional modeling of plant growth dynamics." The Plant Phenome Journal,1, 170007.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Yamaguchi, Miki, Haruka Fujimoto, Ko Hirano, Satoko Araki-Nakamura, Kozue Ohmae-Shinohara, Akihiro Fujii, Masako Tsunashima,Xian Jun Song, Yusuke Ito, Rie Nagae, et al. (2016), "Sorghum dw1, an agronomically important gene for lodging resistance, encodes anovel protein involved in cell proliferation." Scientific Reports, 6, 28366.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Yamasaki, Masanori, Stephen I Wright, and Michael D McMullen (2007), "Genomic screening for artificial selection duringdomestication and improvement in maize." Annals of Botany, 100, 967–973.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Yan, Juqiang, Jun Zhu, Cixin He, Mebrouk Benmoussa, and Ping Wu (1998), "Molecular dissection of developmental behavior of plantheight in rice (oryza sativa l.)." Genetics, 150, 1257–1265.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Yang, Jie, Rongling Wu, and George Casella (2009), "Nonparametric functional mapping of quantitative trait loci." Biometrics, 65, 30–39.Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Yang, Wanneng, Zilong Guo, Chenglong Huang, Lingfeng Duan, Guoxing Chen, Ni Jiang, Wei Fang, Hui Feng, Weibo Xie, Xingminghttps://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.

Page 41: Short title: FPCA-based GWAS for time-series data in sorghumMay 27, 2020  · 1 Short title: FPCA-based GWAS for time-series data in sorghum 2 3 Increased power and accuracy of causal

Lian, et al. (2014), "Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variationin rice." Nature Communications, 5, 5087.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

Zhou, Xiang and Matthew Stephens (2012), "Genome-wide efficient mixed-model analysis for association studies." Nature Genetics, 44,821.

Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title

https://plantphysiol.orgDownloaded on March 19, 2021. - Published by Copyright (c) 2020 American Society of Plant Biologists. All rights reserved.