Date of Graduation


Document Type


Degree Name

Master of Science in Food Safety (MS)

Degree Level



Agricultural, Food and Life Sciences


Ehsan Shakiba

Committee Member

Leandro Mozzoni

Second Committee Member

Reuben M. Ceballos

Third Committee Member

Trenton L. Roberts

Fourth Committee Member

Bo Zhang

Fifth Committee Member

Ainong Shi


GWAS; LCh; Natto; Seed Hardness; Soybean


Natto is a specialty fermented soyfood made from small-seeded (seeds-1) soybean varieties. Seed hardness and seed coat color are important seed traits that determine the texture and appearance of natto and are thus valuable to breeders. Prior research has identified quantitative trait loci (QTL) for seed hardness, but because hardness is a quantitative trait heavily influenced by the environment, it is still poorly understood. Prior research has also identified the primary genetic components of seed coat color using simple visual inspection, but few studies have investigated the usefulness of more quantitative measurements, such as the color space coordinates developed by the Commission Internationale de l’Eclairage (CIE). The objectives of this research were 1) to assess seed hardness and the seed coat color components of lightness, chroma, and hue in a diverse set of genotypes to determine suitable parents for natto breeding, 2) to evaluate the environmental influence on these traits, 3) to analyze the genetic diversity of the genotypes used in this study, 4) to perform genome-wide association studies (GWAS) to identify single nucleotide polymorphisms (SNPs) that may be used for marker-assisted selection (MAS) or genomic selection (GS) of seed hardness and seed coat color, and 5) to compare the effectiveness of different GWAS models for identifying SNPs associated with the traits of interest. An association panel was assembled using 168 natto accessions from the USDA soybean germplasm collection, 51 natto breeding lines and 49 conventional breeding lines from the University of Arkansas, and 49 natto breeding lines from Virginia Tech. All genotypes were grown in 2021 in an augmented block design with four single-replication blocks, each grown in four different locations in Arkansas. DNA was isolated from young leaf tissue of 285 lines and genotyped at the Soybean Genomics and Improvement Laboratory using the SoySNP50k platform (Illumina, Inc., San Diego, CA).
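The lightness, chroma, and hue components named in the objectives can be illustrated with a minimal sketch. It assumes that seed coat color was recorded as CIELAB coordinates (L*, a*, b*), which the abstract does not state; the function name `lab_to_lch` is hypothetical and is not taken from the thesis. Under that assumption, chroma is the distance from the neutral axis in the a*-b* plane and hue is the angle in that plane.

```python
import math

def lab_to_lch(lightness: float, a_star: float, b_star: float) -> tuple[float, float, float]:
    """Convert CIELAB (L*, a*, b*) to CIE LCh (lightness, chroma, hue in degrees).

    Assumption: inputs are CIELAB readings; the measurement procedure used
    in the thesis is not described in the abstract.
    """
    # Chroma: Euclidean distance from the neutral (gray) axis.
    chroma = math.hypot(a_star, b_star)
    # Hue angle: measured counterclockwise from the +a* axis, wrapped to [0, 360).
    hue = math.degrees(math.atan2(b_star, a_star)) % 360.0
    return lightness, chroma, hue
```

Lightness passes through unchanged, so only chroma and hue are derived quantities in this representation.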
Phenotypic data were collected and analyzed in JMP Pro 16, genetic diversity and population structure were calculated using GAPIT, and GWAS was performed in TASSEL using 32,724 SNPs. Association analyses were conducted using the general linear, mixed linear, and single marker regression models, with a significance threshold of LOD > 3. ANOVA conducted on aggregated data showed a significant environmental effect in all locations. A Student's t-test indicated sufficient genetic diversity to perform GWAS on the individual location tests. Phenotypic analysis for identification of suitable natto breeding genotypes revealed six high-performing genotypes for seed hardness, nine for lightness and chroma, and 12 for hue. Two genotypes, PI 458281 B and PI 603713, had optimal phenotypes for multiple traits and were identified as potential natto breeding parents. Genetic diversity analysis identified three distinct sub-populations within the association panel. Seed source, region of origin, variety, and level of inbreeding had no significant effect on the traits of interest. GWAS identified 11 SNPs for hardness, 15 for lightness, six for chroma, and nine for hue. One of the seed hardness SNPs, ss715579472, was located on Chr 1 within 600 kbp of Ha2, a seed hardness QTL identified and confirmed by prior research. These results will be useful for developing new natto cultivars through marker-assisted selection.
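The LOD > 3 threshold can be related to p-values with a short sketch. It assumes that LOD here denotes -log10(p) for each marker-trait test, a common convention in GWAS reporting that the abstract does not confirm; the function names are hypothetical. Under that assumption, LOD > 3 corresponds to p < 0.001.

```python
import math

def neg_log10_p(p_value: float) -> float:
    """Return -log10(p) for a marker-trait association p-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -math.log10(p_value)

def is_significant(p_value: float, threshold: float = 3.0) -> bool:
    """Flag an association as significant when -log10(p) exceeds the threshold.

    Assumption: the thesis's LOD score is interpreted as -log10(p).
    """
    return neg_log10_p(p_value) > threshold
```

For example, a SNP with p = 0.0001 scores 4 and passes, while a SNP with p = 0.01 scores 2 and does not.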