Diversity analysis and structural modeling of some traits in wheat genotypes


INTRODUCTION
Wheat is the most important grain crop in the world, providing people with almost 50% of their required calories [1]. Breeding programs aim to increase selection efficiency by assessing more of the genetic variation among wheat genotypes [2]. This variation can be studied through multivariate methods such as principal component analysis and cluster analysis. Principal component analysis reduces a large number of traits to a few components that represent most of the existing variation [3]. Al-Otayk [4] applied principal component analysis to study the variation in wheat germplasm and found remarkable variation among the accessions. Cluster analysis categorizes germplasm into groups according to their variation [5]. Devesh et al. [1] applied cluster analysis to wheat genotypes based on their agronomic traits. Poudel et al. [6] estimated the diversity among wheat genotypes and found that they were grouped into several main clusters and sub-clusters. Sahu et al. [7] stated that correlation only illustrates the relationship between traits and cannot be used to predict any trait, whereas path analysis is an efficient method for confirming a correlation, because it separates the effects and their causes and eliminates spurious effects. Abd El-Mohsen [8] noted that grain yield can be predicted from other traits by regression analysis. The objectives of this investigation were to: (i) evaluate the magnitude of potential diversity between exotic and local wheat genotypes using principal component analysis and cluster analysis, (ii) study the nature of the structural relationships between grain yield and other traits via regression and path analysis, and (iii) identify the genotypes that are superior in grain yield across locations for use in breeding programs.

Plant Materials
Seventeen genetic resources of primitive wheat originating from different countries (Italy, Ethiopia and Syria) were used in this study (Table 1). They were kindly provided by the International Centre for Agricultural Research in the Dry Areas (ICARDA) and by the genetic resources department of the General Commission for Scientific Agricultural Research (GCSAR), Syria.

Field Experiments
The investigation was carried out at three research centers (Homs, Al-Swaida and Tartous) of the General Commission for Scientific Agricultural Research (GCSAR), Syria, during the 2018/2019 season. The experiment was arranged in a randomized complete block design (RCBD) with three replicates. Each replicate consisted of four rows, each 1 m long, with 20 cm between rows and 5 cm between plants; planting depth was 3-5 cm. Recommended cultural practices such as irrigation and pest control were applied according to the Ministry of Agriculture guidelines. Days to flowering, days to heading, plant height (cm), awn length (cm), total tillers number per plant, fertile tillers number per plant, spike length (cm), spike peduncle length (cm), grains number per spike, grains weight per spike (g), 1000-grain weight (g) and grain yield per plant (g) were all measured on ten randomly selected plants (Table 2).

Statistical Analysis
Summary statistics were computed with SPSS 12. Principal component analysis, used to assess the level of variance in traits among genotypes, was conducted with Minitab 12. Cluster analysis was also performed in Minitab 12, using the single linkage method with the Euclidean distance measure. Correlation, simple linear regression (SLR) between grain yield per plant as the dependent variable and each of the studied traits as an independent variable, and the biplot were all analyzed with GenStat 12. Path coefficient analysis, used to identify the direct and indirect effects of the studied traits on grain yield, was done according to Singh and Chaudhary [9] using GenStat 12. The strength of the direct and indirect effects was scaled according to [10] as follows: more than 1, very strong; 0.3-0.9, strong; 0.2-0.29, medium; 0.1-0.19, weak; less than 0.1, negligible.
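The strength scale for path coefficients can be written as a small lookup. The following is only a minimal sketch: the function name `effect_strength` is hypothetical, the scale is applied to the absolute value of the coefficient, and values between 0.9 and 1.0 (which the stated bands do not cover) are assumed here to count as strong.

```python
def effect_strength(value):
    """Label a path coefficient by its absolute magnitude, following the scale in the text."""
    size = abs(value)
    if size > 1.0:
        return "very strong"
    if size >= 0.3:
        # Assumption: the stated band ends at 0.9; values up to 1.0 are
        # treated as strong so that no magnitude is left unclassified.
        return "strong"
    if size >= 0.2:
        return "medium"
    if size >= 0.1:
        return "weak"
    return "negligible"


# 0.53401 and -0.44066 are direct effects reported in this study;
# 0.25 and 0.05 are made-up values that exercise the other bands.
for coefficient in (0.53401, -0.44066, 0.25, 0.05):
    print(coefficient, effect_strength(coefficient))
```

The sign is ignored when assigning the label, so a negative direct effect such as that of days to heading is rated by its size and its direction is reported separately.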

RESULTS AND DISCUSSION
Summary statistics for the studied traits showed that the means of days to heading and days to flowering were 132.95 and 136.03 days, respectively, with ranges of 56.00 and 62.00 days. The means of plant height, awn length, spike length and spike peduncle length were 124.33, 10.25, 11.23 and 40.18 cm, respectively, with ranges of 156.20, 14.30, 16.70 and 44.63 cm. The ranges of total and fertile tillers number per plant were 52.60 and 34.30, respectively, with means of 16.73 and 11.86 tillers. The mean and range of grains number per spike were 11.23 and 84.60 grains, respectively.
Results also indicated that the means of grains weight per spike, 1000-grain weight and grain yield per plant were 1.50, 35.37 and 13.55 g, respectively, with ranges of 4.70, 73.70 and 50.71 g. Grain yield per plant was the main source of the observed variation between wheat genotypes, since its coefficient of variation reached 61.9%, followed by grains weight per spike at 46% (Table 3). Fertile tillers number per plant and grains number per spike were also sources of remarkable variation (coefficients of variation of 39.7% and 35.8%, respectively).
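As a reading aid, and assuming Table 3 uses the usual definition of the coefficient of variation (sample standard deviation $s$ divided by the mean $\bar{x}$), the reported values for grain yield per plant imply a standard deviation of roughly 8.4 g:

```latex
\mathrm{CV}(\%) = \frac{s}{\bar{x}} \times 100
\quad\Longrightarrow\quad
s \approx \frac{61.9}{100} \times 13.55\ \text{g} \approx 8.39\ \text{g}
```

The standard deviation is inferred here from the reported mean and coefficient of variation; it is not a value taken from the tables.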

Principal Component Analysis
Twelve principal components were obtained for the studied traits, but only the first four (PC1, PC2, PC3 and PC4) were retained to describe the variability between wheat genotypes, as their eigenvalues exceeded 1.0 (3.85, 2.49, 1.51 and 1.10, respectively) (Figure 1); the other components were neglected because their eigenvalues were less than one. Together, the first four principal components explained about 74% of the total variability (32.1, 20.7, 12.6 and 9.2% for PC1, PC2, PC3 and PC4, respectively) (Table 4). [...] were related to the fourth principal component PC4 (-0.750, 0.408), respectively (Table 5). The loading plot of the studied traits on the first two principal components indicated that all traits except days to heading were negatively connected to PC1, while five traits (grains weight per spike, awn length, total tillers per plant, fertile tillers per plant and grain yield per plant) were negatively connected to PC2 (Figure 2). The principal component biplot of genotypes and locations for final grain yield indicated that Tartous was the best location, followed by Homs, and that genotype SD09 was superior in grain yield per plant, followed by SH5 and IP39, compared with all other genotypes (Figure 3).
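The retention rule and the variance shares can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the analysis was run on the correlation matrix of the twelve traits, so that the eigenvalues sum to 12; only the four reported eigenvalues are used, and the function names are illustrative.

```python
# Eigenvalues of the four retained components, as reported in the text.
# The remaining eight eigenvalues (all below 1.0) are not reported.
reported_eigenvalues = {"PC1": 3.85, "PC2": 2.49, "PC3": 1.51, "PC4": 1.10}

# Assumption: PCA on the correlation matrix of 12 standardized traits,
# so the eigenvalues of all twelve components sum to 12.
N_TRAITS = 12


def kaiser_retained(eigenvalues, threshold=1.0):
    """Return the names of components whose eigenvalue exceeds the threshold."""
    return [name for name, value in eigenvalues.items() if value > threshold]


def percent_variance(eigenvalue, n_traits=N_TRAITS):
    """Share of total variance explained by one component, in percent."""
    return 100.0 * eigenvalue / n_traits


retained = kaiser_retained(reported_eigenvalues)
shares = {name: percent_variance(reported_eigenvalues[name]) for name in retained}
cumulative = sum(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"cumulative: {cumulative:.1f}%")
```

Under that assumption, each eigenvalue divided by 12 reproduces the reported shares to within rounding, and the four shares add up to roughly three quarters of the total variance.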

Correlation Analysis
Results showed a significant positive correlation between days to flowering and days to heading [...] 0.557**), respectively, and finally between 1000-grain weight and grain yield per plant (0.244**) (Table 6). All correlations between the studied traits are plotted in Figure 4.

Regression Analysis
Results of simple linear regression (SLR), with grain yield per plant as the dependent trait and each of the other studied traits as the independent trait, revealed that only six traits can be used as predictors of grain yield per plant (days to heading, awn length, total tillers per plant, fertile tillers per plant, spike peduncle length and grains weight per spike), as they had a significant regression with grain yield per plant (Table 7). The coefficient of determination (R²) between grain yield and days to heading was 0.208, which means that days to heading explained 20.8% of the variation in grain yield per plant; because the correlation between the two traits was negative, later heading was associated with lower grain yield. The coefficients of determination between grain yield and awn length, total tillers number per plant, fertile tillers number per plant, spike peduncle length and grains weight per spike were 0.096, 0.326, 0.421, 0.122 and 0.310, respectively, indicating that each of these traits separately explains 9.6, 32.6, 42.1, 12.2 and 31.0% of the variation in grain yield per plant (Figure 5). The regression equations are listed in Table 8.
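To make the meaning of the coefficient of determination concrete, the sketch below fits a simple linear regression by ordinary least squares and converts R² to a percentage of explained variation. The tiller and yield values in the fitted example are hypothetical and serve only as illustration; the R² values in `reported_r2` are the ones given in the text.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r, r_squared), where r is the Pearson
    correlation and r_squared the coefficient of determination.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r, r * r


# Hypothetical plants: fertile tillers per plant and grain yield per plant (g).
fertile_tillers = [6, 8, 10, 12, 14, 16, 18]
grain_yield = [7.1, 9.8, 10.2, 14.9, 13.4, 18.8, 19.6]
slope, intercept, r, r_squared = simple_linear_regression(fertile_tillers, grain_yield)
print(f"yield = {intercept:.2f} + {slope:.2f} * tillers, R2 = {r_squared:.3f}")

# R2 values reported in the text, expressed as percent of yield variation explained.
reported_r2 = {
    "days to heading": 0.208,
    "awn length": 0.096,
    "total tillers per plant": 0.326,
    "fertile tillers per plant": 0.421,
    "spike peduncle length": 0.122,
    "grains weight per spike": 0.310,
}
percent_explained = {trait: 100 * r2 for trait, r2 in reported_r2.items()}
for trait, pct in percent_explained.items():
    print(f"{trait}: {pct:.1f}% of the variation in grain yield")
```

In a simple linear regression, R² equals the squared correlation, so it describes how much of the yield variation a trait accounts for; the direction of the relationship comes from the sign of the slope or the correlation, not from R² itself.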

Path Analysis
Results of path analysis showed that the direct effects on grain yield per plant of fertile tillers number per plant and grains weight per spike were strong and positive (0.33807 and 0.53401, respectively), while the direct effect of days to heading was strong but negative (-0.44066); the direct effect was positive and medium for days to flowering [...] Table 9).
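The decomposition behind path analysis can be sketched as follows: the direct effects are the solution of the linear system formed by the trait-trait correlation matrix and the trait-yield correlations, and each indirect effect is a trait-trait correlation multiplied by the direct effect of the intermediate trait. The correlations below are hypothetical and are not the values of Table 6 or Table 9; the helper names are illustrative, and the code is not the GenStat procedure used in the study.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


def path_effects(trait_corr, yield_corr):
    """Return direct effects and the matrix of indirect effects.

    indirect[i][j] is the effect of trait i on yield through trait j.
    """
    direct = solve_linear_system(trait_corr, yield_corr)
    n = len(direct)
    indirect = [
        [trait_corr[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return direct, indirect


traits = ["days to heading", "fertile tillers", "grains weight per spike"]
# Hypothetical correlations among the three traits and with grain yield.
trait_corr = [
    [1.00, -0.30, -0.20],
    [-0.30, 1.00, 0.25],
    [-0.20, 0.25, 1.00],
]
yield_corr = [-0.45, 0.60, 0.55]

direct, indirect = path_effects(trait_corr, yield_corr)
# Residual effect: the part of yield variation not explained by the traits.
residual = (1 - sum(p * r for p, r in zip(direct, yield_corr))) ** 0.5

for name, p, row in zip(traits, direct, indirect):
    print(f"{name}: direct = {p:.3f}, total indirect = {sum(row):.3f}")
print(f"residual effect = {residual:.3f}")
```

For each trait, the direct effect plus the sum of its indirect effects reproduces its correlation with grain yield, which is what allows path analysis to show how much of a correlation is direct and how much passes through other traits.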

Cluster Analysis
Results of cluster analysis based on the studied field traits revealed that the similarity coefficient ranged from 43.50% to 100%, and that the genotypes were clustered into three main groups, each containing sub-clusters. The first main group included nine genotypes, of which seven originated from Italy (IC80, IC81, IC83, IC89, IP46, IP47, IP49), one from Ethiopia (ED04) and one from Syria (SD11). The second main group contained five genotypes, all originating from Italy (IP39, IP50, IP40, IP45, IP52). The third main group consisted of three genotypes of Syrian origin (SD09, SH3 and SH5) (Figure 6).
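Single-linkage clustering on Euclidean distances can be illustrated with a short pure-Python sketch. The five labels and their trait values are hypothetical, not the genotypes of Table 1, and the similarity level is assumed to follow the convention 100 × (1 − d / d_max), which is how Minitab is understood to report it; the study's own software output should be treated as authoritative.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points):
    """Agglomerative single-linkage clustering.

    points maps a label to its list of trait values. Returns the merges in
    order as (members_a, members_b, distance).
    """
    clusters = [frozenset([label]) for label in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Single linkage: distance between clusters is the smallest
            # distance between any pair of their members.
            d = min(euclidean(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b), d))
    return merges


# Hypothetical genotypes described by two (already scaled) traits.
points = {
    "G1": [0.0, 0.0],
    "G2": [0.0, 1.0],
    "G3": [4.0, 0.0],
    "G4": [4.0, 1.5],
    "G5": [10.0, 0.0],
}

merges = single_linkage(points)
d_max = max(euclidean(points[i], points[j]) for i, j in combinations(points, 2))
for members_a, members_b, d in merges:
    # Assumed similarity convention: 100 * (1 - d / d_max).
    similarity = 100 * (1 - d / d_max)
    print(members_a, "+", members_b, f"distance = {d:.2f}, similarity = {similarity:.1f}%")
```

Traits measured in different units (days, cm, g) are usually standardized before computing Euclidean distances, otherwise the trait with the largest numeric range dominates the clustering; whether this was done in the study is not stated.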

DISCUSSION
According to our results, the coefficient of variation for grain yield was the highest among the traits, so grain yield best represented the variation between wheat genotypes. This can be explained by the complex nature of yield, which is affected by most of the quantitative traits.
Our results agree with the findings of Grzesiak et al. [11], who found that the coefficient of variation for wheat grain yield was [...]. The correlation between days to heading and grain yield was negative, which means that late heading is accompanied by low yield and vice versa. Quantitative traits were mainly positively correlated with each other and with grain yield, since they affect the final grain yield. Similar results were reported by Marzougui et al. [12], who found that grain yield had a significant negative correlation with days to heading and a positive correlation with spike peduncle length, and that spike peduncle length was positively correlated with grains weight per spike; these findings were also confirmed by [13,14]. Regarding principal component analysis, the first four components represented about three quarters of the variation, which points to the importance of the studied traits in providing wide variability to be explored. This agrees with the finding of Sisodia and Rai [15], who stated that the first principal components explain the main variability in various wheat genotypes. In addition, our results [...] genotypes in the second cluster. Wide variation was evident between the different clusters, which makes our results valuable for breeders in choosing which genotypes to use as parents for wheat improvement. This agrees with Mutwali et al. [19], who stated that related wheat genotypes were clustered together, with [20], which reported that each cluster contains very close genotypes, and also with Jaiswal et al. [21]. According to the path analysis results, both fertile tillers number per plant and grains weight per spike had a strong positive direct effect on grain yield per plant, and their correlations with grain yield were also positive, which means that both traits could be used effectively as selection indices for grain yield; other scientists reported similar results [22,23,24,25].

CONCLUSIONS
Our results confirmed valuable variation between the studied genotypes, arising from grain yield and most of the studied traits, which indicates their importance. It was also concluded that total and fertile tillers number per plant were the best predictors of final grain yield per plant, and that fertile tillers per plant together with grains weight per spike could be used as selection indices for wheat grain yield. Our results are in agreement with those of Devesh et al. [1], who indicated that PC1 in wheat contained spike peduncle length and plant height, that PC3 contained tillers number per plant, and that in general most yield traits were located in the first three components. The Syrian genotype SD09 was superior in grain yield per plant, which may be explained by its adaptation to local environments compared with the exotic genotypes. The regression of grain yield on days to heading showed that days to heading explained 20.8% of the variation in grain yield, with later heading associated with lower yield, while awn length, total tillers number per plant, fertile tillers number per plant, spike peduncle length and grains weight per spike were positively related to grain yield and explained 9.6, 32.6, 42.1, 12.2 and 31.0% of its variation, respectively. These results are plausible, since these traits are well known to be related to yield, so they can be used as significant predictors of grain yield. Our findings are in agreement with the results of many researchers [16][17][18]. Our cluster analysis showed that most genotypes of the same origin gathered in the same cluster, such as the majority of Italian genotypes in the first cluster and the Syrian genotypes in the third cluster.
Our results also confirmed that genotypes from the same species were clustered together, like all Triticum polonicum [...] the necessity to evaluate different genotypes originating from many sources to benefit from their variability, and to emphasize the Syrian genotype SD09 in wheat breeding programs for yield improvement.