Genome sequences and molecular resources for Macadamia tree breeding in South Africa

dc.contributor.advisor: Myburg, Alexander A.
dc.contributor.coadvisor: Fourie, Gerda
dc.contributor.coadvisor: Pierneef, Rian Ewald
dc.contributor.coadvisor: Hefer, Charles A.
dc.contributor.email: mranketse@gmail.com
dc.contributor.postgraduate: Ranketse, Mary
dc.date.accessioned: 2025-03-14T11:21:57Z
dc.date.available: 2025-03-14T11:21:57Z
dc.date.created: 2025-04
dc.date.issued: 2024-11
dc.description: Thesis (PhD (Genetics))--University of Pretoria, 2024.
dc.description.abstract: Breeding improved tree cultivars is a long and tedious process because trees have long generation times and typically have complex genomes and variable phenotypes. Breeding programs must consider many factors to first establish and then maintain the objective of breeding improved cultivars for various environments. DNA-based molecular markers have been shown to be very useful for reducing breeding cycle time and for the efficient management of breeding programs and genetic resources. Microsatellite markers, or simple sequence repeats, are a low-cost, rapid DNA marker system that is efficient for analyses such as cultivar identification, population diversity, and parentage. Whole genome sequencing and assembly is a powerful tool for understanding and characterising genomes and for discovering DNA-based molecular markers, and it is the first step towards unravelling complex plant populations and genomes. Genome annotation is the next step; it adds a layer of biological information that can be used to understand complex biological processes and varying phenotypes. Macadamia nuts are the most expensive nuts in the world, and the genus is the most economically important in the Proteaceae family. South Africa is the largest producer of macadamia nuts globally, which makes macadamia an important crop for the country, although Macadamia is native to Australia and was commercialized in Hawaii. Given this importance, this study aimed to understand the population dynamics of Macadamia in South Africa using microsatellite markers, and to conduct whole genome sequencing, assembly, and annotation of important cultivars in order to contribute molecular information that can be used in breeding programs. The key findings of this study are presented below. The South African macadamia industry mainly grows cultivars that are imported from various countries. 
Thirteen microsatellite markers were used to perform genetic fingerprinting and to determine the genetic diversity and population structure of 110 macadamia cultivars in South Africa, in the context of international genetic diversity. The present study compared 31 locally selected cultivars with 31 imported from Hawaii, 19 from Australia, two from California, one from Israel, and 26 from a local breeding population. The microsatellite markers differentiated the two commercial species, Macadamia integrifolia and Macadamia tetraphylla, into separate groups, and the two groups coincided with countries of origin. The South African local selections were mainly composed of M. tetraphylla-like cultivars. The Hawaiian imported selections were spread over the M. integrifolia-like group and a second group, intermediate between M. integrifolia and M. tetraphylla, consisting of hybrids of varying degrees between the two species. The Australian selections were mainly in the hybrid range, with a few accessions in the M. integrifolia-like group. The results showed that the local South African macadamia selections had a unique genetic structure compared to the Hawaiian and Australian selected cultivars. We sequenced, assembled, and annotated three cultivars using Illumina short-read sequencing and Oxford Nanopore long-read sequencing: Santa Anna, a representative of M. tetraphylla, and two hybrid cultivars of importance to South Africa, Beaumont/HAES 695 (M. integrifolia x M. tetraphylla) and HAES 791 (M. integrifolia x M. tetraphylla x M. ternifolia). The genome assembly sizes were 750.0 Mb for Santa Anna, 762.2 Mb for HAES 695, and 836.5 Mb for HAES 791. HAES 791 had the fewest contigs (579), and the HAES 695 and Santa Anna genomes had 705 and 965 contigs, respectively. Contig N50 was 2.1 Mb for HAES 695, 1.9 Mb for Santa Anna, and 3.6 Mb for HAES 791. 
The BUSCO completeness scores were 97.0% for HAES 695, 97.4% for Santa Anna, and 99.0% for HAES 791. Genome annotation identified 37,572, 36,328, and 30,600 genes in Santa Anna, HAES 695, and HAES 791, respectively. The genome assemblies were compared to the published M. integrifolia HAES 741, HAES 344 and GR1 genomes and an M. tetraphylla genome. Our assembly and annotation statistics are comparable to those of the published genomes, and the assemblies are contiguous and of high quality. Nut oil is an important trait in macadamia, as macadamia has the highest content of healthy fatty acids, specifically palmitoleic acid (an omega-7 fatty acid), which is not found in concentrations higher than 1% in other tree nuts. This study analysed the fatty acid biosynthesis-associated genes from the annotation data and compared our results to other tree nut and oil-producing crop species. Four protein families important to fatty acid biosynthesis were analysed: fatty acid desaturase (FAD), stearoyl-[acyl-carrier-protein] desaturase (SAD), 3-oxoacyl-[acyl-carrier-protein] synthase (KAS), and the acyl-acyl carrier protein thioesterases, namely oleoyl-acyl carrier protein thioesterase (FATA) and palmitoyl-acyl carrier protein thioesterase (FATB). The results revealed that, compared to the other species, Macadamia had the most stearoyl-[acyl-carrier-protein] 9-desaturase 6 (SAD6) encoding genes, followed by palmitoyl-acyl carrier protein thioesterase (FATB) encoding genes. Macadamia also had more KAS encoding genes than the other tree nut species. The study determined the unique genetic profile of South African locally selected cultivars and developed a technology pipeline for the local macadamia nut industry to perform routine genotyping analysis using microsatellite markers. Furthermore, the genome assemblies and annotations will add to the growing genomic resources for Macadamia. The fatty acid biosynthesis-associated gene analysis may explain the unique fatty acid content of macadamia nuts and deserves further investigation beyond this study. 
In conclusion, this study forms an important foundational analysis of Macadamia genomics and contributes to the development of molecular tools for advanced genomic breeding programs in South Africa and globally.
dc.description.availability: Unrestricted
dc.description.degree: PhD (Genetics)
dc.description.department: Biochemistry, Genetics and Microbiology (BGM)
dc.description.faculty: Faculty of Natural and Agricultural Sciences
dc.description.sdg: SDG-02: Zero Hunger
dc.description.sdg: SDG-09: Industry, innovation and infrastructure
dc.description.sdg: SDG-12: Responsible consumption and production
dc.description.sponsorship: Macadamias South Africa (SAMAC)
dc.description.sponsorship: National Research Foundation (NRF)
dc.description.sponsorship: Macadamia Protection Programme (MaPP), UP
dc.description.sponsorship: Forest Molecular Genetics (FMG) Programme, UP
dc.identifier.citation: *
dc.identifier.doi: https://doi.org/10.25403/UPresearchdata.28595471
dc.identifier.other: A2025
dc.identifier.uri: http://hdl.handle.net/2263/101511
dc.identifier.uri: DOI: https://doi.org/10.25403/UPresearchdata.28595471.v1
dc.language.iso: en
dc.publisher: University of Pretoria
dc.rights: © 2023 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject: UCTD
dc.subject: Sustainable Development Goals (SDGs)
dc.subject: Macadamia
dc.subject: Microsatellites
dc.subject: Genetic diversity
dc.subject: Population structure
dc.subject: Whole genome sequencing
dc.subject: Oxford Nanopore
dc.subject: Genome annotation
dc.subject: Comparative genomics
dc.subject: Fatty acid biosynthesis
dc.subject: Genes
dc.subject: Palmitoleic acid
dc.subject: Tree nuts
dc.subject: Genetic breeding
dc.title: Genome sequences and molecular resources for Macadamia tree breeding in South Africa
dc.type: Thesis

Files

Original bundle

Name: Ranketse_Genome_2024.pdf
Size: 17.79 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission