Compiled by Norris H. Williams, J. Richard Abbott, Kurt Neubig, W. Mark Whitten

Task assignment number SL849#148
5 June 2009 project # 00076531

Abstract

The central premise of DNA barcoding is that each species has a unique set of DNA sequences, of which a carefully chosen subset (a “DNA barcode”) can serve as a baseline reference for comparative identification. Traditionally, plant identification is based on morphology, with the limitation that many species must be collected at a very specific time of the year so that reproductive features can be used for accurate identification. With DNA data, once a reference library is built, barcode identification of plants is theoretically possible at all life stages, from seed to mature plant. Even a fragment of a leaf might be used to identify a species. DNA barcoding has practical utility for all research, wildlife and land management, and conservation efforts that rely on identifying plant species. It also has applications in forensics, biosecurity, trade in controlled species, foodstuffs, and herbal medicines, and in scientific questions involving evolution, biogeography, and population structure. DNA barcoding is not meant to replace detailed population genetic studies, but it will provide a baseline of genetic data for comparative analysis. With sampling of multiple populations, barcoding studies can also provide a first glimpse at genetic differences that could reflect underlying patterns of cryptic or incipient species differences. DNA barcoding will enhance our understanding of the Florida flora and should become a useful tool for research and wildlife management. As a prelude to a larger “Barcoding the Flora of Florida” project, we collected and sequenced matK, rbcL, and the trnH-psbA spacer for all 136 species on the Florida Exotic Pest Plant Council (FLEPPC) list of invasive species. With 100% success at generating DNA barcode data for all FLEPPC species and a total DNA sequencing success rate of about 99% (on an individual gene region basis), this study provides compelling evidence that barcoding is logistically feasible. Furthermore, these data generally show excellent differentiation among the invasive species of Florida. However, more data for all species in Florida are required for reliable plant identification by way of DNA barcoding, i.e., the DNA reference library needs to be built for the entire flora.

Introduction

DNA barcoding is currently a hot topic, drawing much excitement for its potential applications. The central premise of DNA barcoding is that each plant species has a unique set of DNA sequences, of which a carefully chosen subset (a “DNA barcode”) can serve as a unique baseline reference for identification.

For this project, we requested funds to barcode the complete list of Florida Exotic Pest Plant Council (FLEPPC) invasive species. The objectives of this project have been: 1) to produce well-vouchered, accurately identified specimens and field-preserved tissues (to ensure high quality DNA) of plant species on the FLEPPC list (a total of 67 spp in Category I and 69 spp in Category II); 2) to provide long-term curation of these voucher specimens; 3) to extract DNA from these specimens and store the DNA in archival conditions, creating a lasting, reliable databank of vouchered DNA samples ultimately to be expanded to include all vascular plant species in Florida; 4) to generate sequence data of three DNA regions (barcodes) for all FLEPPC species, providing insight into the utility of DNA barcoding in Florida; and 5) to analyze and publish these data, with a special emphasis on potential utility for conservation and land management issues, for use by all agencies involved in conservation and wildlife management in Florida.

DNA barcoding has numerous promising uses: identification of life stages that often cannot be identified with certainty using traditional morphology (e.g., seeds and seedlings); identification of fragments of plant material; forensics; verification of foodstuffs and herbal medicines; biosecurity and trade in controlled species; and inventories or ecological surveys. Apart from the obvious botanical and genetic research, which is important for our understanding and proper management of the flora, barcoding can directly benefit ecological plot work, which often requires identifying tiny sterile plantlets, and conservation or wildlife management work by zoologists or entomologists trying to identify plants (often via gut content or fecal analysis) being used by their study organisms. Thus, DNA barcoding has potential implications for all research, management, and conservation efforts that rely on identifying plant species.

Vouchers are plant reference specimens. They provide essential scientific proof of accuracy for the databank of tissue and DNA that will be used to generate DNA barcode reference data, and they have lasting research value. A DNA barcoding project must be anchored by voucher specimens and their long-term curation in an herbarium. The UF herbarium (FLAS), part of the Florida Museum of Natural History (FLMNH), is the largest in Florida and represents one of the most complete collections of the flora of Florida. FLAS has the facilities to identify, process, curate, and maintain the voucher specimens. In addition to providing expertise, facilities, and personnel to conduct this project, FLAS and FLMNH are uniquely qualified in that we can store the extracted DNA in our new, NSF-funded, liquid nitrogen facility for long-term storage of tissue and DNA (http://www.floridamuseum.ufl.edu/grr/).

Materials and Methods

Taxon sampling

Specimens were collected from both wild and cultivated plants (Table 1). Sampling included all of the species on the Florida Exotic Pest Plant Council (FLEPPC) list of invasive plant species. In addition, 27 species were sampled more than once (28 additional accessions) to independently corroborate the validity and accuracy of the DNA data.

Choice of DNA barcode regions

At present, botanists have not agreed upon a standardized set of DNA regions to use as plant barcodes (see http://www.barcoding.si.edu/plant_working_group.html). The choice of regions involves tradeoffs between universality and ease of amplification and sequencing on the one hand and DNA variation on the other. Based upon the comparisons of Fazekas et al. (2008), we selected three plastid regions: rbcL, matK, and the trnH-psbA intergenic spacer.

Extractions, Amplification and Sequencing

All freshly collected material was preserved in silica gel (Chase & Hills, 1991). Genomic DNA was extracted using a modified cetyl trimethylammonium bromide (CTAB) technique (Doyle & Doyle, 1987), scaled to a 1 mL reaction volume. Approximately 10 mg of dried tissue was ground in 1 mL of 2X CTAB buffer and 7 µL of proteinase K at the manufacturer’s recommended concentration. Some total DNAs were then cleaned with Qiagen QIAquick PCR purification columns to remove any inhibitory secondary compounds. Amplifications were performed using an Eppendorf Mastercycler EP Gradient S thermocycler and Sigma brand reagents in 25 µL volumes with the following reaction components: 0.5-1.0 µL template DNA (~10-100 ng), 16-17.5 µL water, 2.5 µL 10X buffer, 2-4 µL MgCl2, 0.5 µL of 10 µM dNTPs, 0.5 µL each of 10 µM primers, and 0.5 units Sigma Jumpstart Taq polymerase. All regions were amplified with the following parameters: 94 °C, 3 min; 33 cycles of (94 °C, 45 sec; 55 °C, 45 sec; 72 °C, 2 min); 72 °C, 3 min. For rbcL, primers rbcLa F (ATGTCACCACAAACAGAGACTAAAGC) and rbcLa R (GTAAAATCAAGTCCACCRCG) were used (Kress & Erickson, 2007). For the trnH-psbA spacer, primers trnH-psbA F (TGATCCACTTGGCTACATCCGCC) and trnH-psbA R (GCTAACCTTGGTATGGAAGT) from Xu et al. (2000) were used. For matK, primers 390 F (CGATCTATTCATTCAATATTTC) and 1326 R (TCTAGCACACGAAAGTCGAAGT) from Cuenoud et al. (2002) were used. For ferns, alternative matK primers were used: matK x F (ATACCCCATTTTATTCATCC) and Equisetum R (GTACTTTTATGTTTACGAGC) (http://www.kew.org/barcoding/update.html [site not active]).
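As an illustrative aside, the rbcLa R primer contains the degenerate IUPAC base R (A or G), so it represents a small pool of concrete sequences. The following minimal sketch lists the primers given above and expands any degenerate positions. The function and variable names are hypothetical and are not part of the protocol used in this study.

```python
from itertools import product

# Subset of IUPAC nucleotide codes sufficient for the primers listed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "rbcLa F": "ATGTCACCACAAACAGAGACTAAAGC",
    "rbcLa R": "GTAAAATCAAGTCCACCRCG",
    "trnH-psbA F": "TGATCCACTTGGCTACATCCGCC",
    "trnH-psbA R": "GCTAACCTTGGTATGGAAGT",
    "matK 390 F": "CGATCTATTCATTCAATATTTC",
    "matK 1326 R": "TCTAGCACACGAAAGTCGAAGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a (possibly degenerate) primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


for name, seq in PRIMERS.items():
    variants = expand_degenerate(seq)
    print(f"{name}: {len(seq)} nt, {len(variants)} variant(s)")
```

Only rbcLa R expands to more than one sequence; the other primers listed here are non-degenerate.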

Products were cleaned using ExoSAP™ (USB Corporation, OH, USA) following the manufacturer’s protocols and were directly sequenced using BigDye terminator reagents on an ABI 3130 automated sequencer according to the manufacturer’s protocols (Applied Biosystems, Foster City, California, USA). Electropherograms were edited and assembled using Sequencher 4.6™ (GeneCodes, Ann Arbor, MI, USA). All sequences are being deposited in GenBank and will be released on 1 September 2009 (Table 1).

Data analysis

Sequence data for matK and rbcL were manually aligned using Se-Al v2.0a11 (Rambaut, 1996). No sequence data were excluded from analyses. Indels (insertions/deletions) were not coded as characters. Analyses were performed using PAUP*4.0b10 (Swofford, 2003) with Fitch parsimony (unordered characters with equal weights; Fitch, 1971) using a heuristic search strategy consisting of branch swapping by tree bisection reconnection (TBR), stepwise addition with 5000 random-addition replicates holding 5 trees at each step, and saving multiple trees (MulTrees). Sequence data were verified, where applicable, by “blasting” against sequences deposited in GenBank (http://www.ncbi.nlm.nih.gov/).
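To make the optimality criterion concrete, the sketch below scores a single unordered, equally weighted character on a fixed rooted binary tree using the Fitch (1971) counting procedure. It is a minimal illustration on hypothetical toy data. It does not reproduce the PAUP* heuristic search (random addition, TBR swapping), which explores many trees and sums such scores over all alignment columns.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one unordered character.

    tree   -- nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D"))
    states -- dict mapping each leaf name to its observed character state
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):      # leaf: the observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        shared = left & right
        if shared:                     # children can agree: no change counted
            return shared
        changes += 1                   # children cannot agree: count one change
        return left | right

    state_set(tree)
    return changes


# Hypothetical four-taxon tree and one alignment column (toy data).
toy_tree = (("A", "B"), ("C", "D"))
toy_site = {"A": "G", "B": "G", "C": "T", "D": "G"}
print(fitch_score(toy_tree, toy_site))
```

The parsimony length of a tree is the sum of these per-character scores; the heuristic search then looks for trees that minimize that sum.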

Results

Across the 136 species on the FLEPPC list (164 accessions in total) and the three gene regions, amplification and sequencing succeeded for about 99% of the accession-by-region combinations (Table 1). Nexus files of the full data sets are freely available on our FTP site (ftp://flmnh.ufl.edu/public/FLEPPC/fleppcmatrices/). The matrices are also available as self-extracting files on the web site devoted to this project (https://www.floridamuseum.ufl.edu/herbarium/research/barcoding/). For the gene region rbcL (ffwcc-uf8162-rbcL.sea.bin), we were able to obtain complete data across all species sampled. For matK (ffwcc-uf8162-matK.sea.bin), data could not be obtained for three samples, all ferns. For the trnH-psbA spacer (ffwcc-uf8162-trnH-psbA.sea.bin), data could not be obtained for five samples: one accession of Colocasia esculenta (the sequence was obtained from a second accession of this species), Melia azedarach, Myriophyllum spicatum, Tribulus cistoides, and Xanthosoma sagittifolium. Because all of these species are resolved by the combined rbcL/matK sequences, the missing spacer data are not a concern.
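The per-region and overall success rates can be recomputed from the counts reported above (164 accessions, three regions, three missing matK sequences, and five missing trnH-psbA sequences). The short tally below is only an arithmetic check; the variable names are illustrative.

```python
# Counts taken from the Results text and Table 1.
accessions = 164
failures = {"rbcL": 0, "matK": 3, "trnH-psbA": 5}

attempts = accessions * len(failures)            # accession-by-region combinations
successes = attempts - sum(failures.values())
overall = successes / attempts

for region, failed in failures.items():
    ok = accessions - failed
    print(f"{region}: {ok}/{accessions} = {ok / accessions:.1%}")
print(f"overall: {successes}/{attempts} = {overall:.1%}")
```

The overall figure works out to 484 of 492 combinations (about 98.4%), close to the approximately 99% quoted in the text.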

The “problems”

In the collection phase of this project, some species were especially difficult to locate in the wild. For example, Ipomoea aquatica and I. fistulosa could only be procured from horticultural material; despite being tracked invasives, these species were not readily encountered in the landscape. During the lab process, several problems arose, especially during PCR of the matK region. This region proved very difficult to amplify using the barcoding primers recommended by Kew (http://www.kew.org/barcoding/update.html), with an amplification success of around 55%. However, after a literature search for more universal primers, the same portion of matK was amplified more reliably (~90%) using primers from Cuenoud et al. (2002). Part of the difficulty stems from the large phylogenetic breadth of the taxa sampled in this study: when one tries to amplify a region as variable as matK across all vascular plants, it is very challenging to get “universal” primers to work consistently. Ferns of the genus Nephrolepis proved especially difficult to sequence; more research will be needed to find matK primers that amplify across all vascular plants (or even all ferns). We therefore provisionally recommend using a combination of primers to increase amplification success for matK.

The three accessions of Alstonia macrophylla did not form a clade (with the result that Alstonia does not appear monophyletic). It seems either that one of the Alstonia macrophylla accessions (JRA25075) is an unknown entity (morphologically similar to, yet genetically distinct from, A. macrophylla) that is ‘hiding’ in the Miami landscape, or that lateral gene flow (e.g., hybridization) with another cultivated Alstonia species (or potentially even a taxon from another genus) has confused the issue. When one considers that over 130 species have been named in Alstonia, this seems a likely explanation.

Conclusions/Recommendations

The 100% success rate of obtaining data for all species in this study and the nearly 99% success at obtaining data on a per-region basis provide compelling evidence that barcoding is logistically feasible. Many other published studies on the feasibility of barcoding have focused on which DNA regions to use or on the taxonomic situations in which barcoding might fail or succeed. Because this data set represents a very broad phylogenetic sampling, we think that it is good evidence that such data can be obtained, analyzed, and compared for a larger sampling of species for the state.

Although not directly informative to the process of barcoding, these data were phylogenetically analyzed as a secondary step for DNA sequence verification. An encouraging result from this analysis is that nearly all of the species for which accessions were duplicated are monophyletic. The monophyly of these species further demonstrates the feasibility of barcoding (i.e., synapomorphic DNA characters connote the unique sequence variation critical to barcoding efforts). However, because these data will eventually be deposited in an online repository (Consortium for the Barcode of Life, CBOL), this phylogenetic analysis is not a fair test of how the data will be used and implemented in practice. CBOL uses a modified “BLAST” analysis to identify an unknown sequence.

If the FLEPPC species are compared to each other (Figure 1), all of these species can be easily distinguished. Thus, any worker with the proper equipment can now sequence a plant suspected of being a FLEPPC-listed exotic and compare it against these data, with the caveat that 100% identification is not guaranteed. Unfortunately, we do not yet know if these data alone are sufficient to differentiate the invasive plant species from the rest of the flora. If a person were to sequence the barcode DNA regions for an unknown sample and compare the sequences to our dataset of invasive species, the result could be a false positive. Without a complete sampling of all species that occur within Florida, the identification of an unknown sample would be open to interpretation. For example, the two species of Cyperus (C. involucratus and C. prolifer) have many closely related species native to Florida. Because closely related species inherently have similar DNA sequences, identification of an unknown sequence is very difficult without a complete reference library of DNA sequences, and FLEPPC taxa may be confused with closely related taxa. Because of this, we recommend that all plant species in Florida be sequenced for DNA barcoding. The standardized protocols for plant barcoding are still under active discussion and development, and research may eventually produce better barcode regions. With the DNAs of these vouchered specimens in UF archival cryogenic storage, it will be easy to expand upon these studies as plant barcode protocols become more refined.
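The false-positive risk described above can be illustrated with a toy nearest-match identification. In this minimal sketch, all names and 12-bp "barcodes" are hypothetical and invented for illustration, and a simple proportion of differing sites (p-distance) on pre-aligned sequences stands in for the BLAST-style search that a real barcode database would use.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query, library):
    """Return (name, distance) for the reference sequence closest to the query."""
    hits = ((name, p_distance(query, seq)) for name, seq in library.items())
    return min(hits, key=lambda hit: hit[1])


# Hypothetical reference libraries (invented sequences, not real data).
invasive_only = {
    "invasive species X": "ACGTACGTACGT",
    "invasive species Y": "TTGTACGAACGA",
}
full_flora = {**invasive_only, "native relative of X": "ACGTACGTACGA"}

query = "ACGTACGTACGA"  # in this toy case, sampled from the native relative

print(best_match(query, invasive_only))  # closest hit is an invasive species
print(best_match(query, full_flora))     # exact hit is the native relative
```

Against the invasive-only library the query is assigned to invasive species X even though it differs at one site; only the fuller library reveals the exact match to a native relative.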

Future dissemination

In addition to providing sequence data for the three gene regions mentioned above and vouchers for all material collected, we plan to enhance these data with additional information. First, we plan to host images of the invasive species (herbarium specimens and live plant photographs) on the FLAS herbarium website. Second, the data will be submitted to GenBank, the international repository for sequence data. Because the Consortium for the Barcode of Life (CBOL) is not accepting barcoding sequence data for plants at this time, these data have not yet been submitted to that institution; when it does go online, we will submit these sequences. Third, all total DNAs will be placed in the cryogenic facility (FLMNH-GRR) in the museum for long-term storage, making them available to any researcher in the world and providing a ready-made starting point for any future DNA work, including following up on advances in DNA barcoding. Finally, we will continue to build on what we have started with the FLEPPC taxa, working toward DNA barcoding the entire flora of Florida.

Literature Cited

Table 1. Vouchers of species used in this study and success for the three DNA regions (Y = adequate sequence data, NA = no available data). Vouchers are deposited at the UF Herbarium, Florida Museum of Natural History (FLAS), except for two deposited at Fairchild Tropical Garden (FTG).

Voucher | Species (Categories I & II) | matK | rbcL | trnH-psbA
JR Abbott 24803 | Abrus precatorius | Y | Y | Y
JR Abbott 23682 | Acacia auriculiformis | Y | Y | Y
JR Abbott 25025 | Adenanthera pavonina | Y | Y | Y
JR Abbott 25217 | Agave sisalana | Y | Y | Y
SB Davis 0339 | Albizia julibrissin | Y | Y | Y
JR Abbott 23666 | Albizia lebbeck | Y | Y | Y
JR Abbott 24381 | Aleurites fordii | Y | Y | Y
SB Davis 0336 | Aleurites fordii | Y | Y | Y
JR Abbott 25075 | Alstonia macrophylla | Y | Y | Y
JR Abbott 25100 | Alstonia macrophylla | Y | Y | Y
JR Abbott 25122 | Alstonia macrophylla | Y | Y | Y
JR Abbott 24898 | Alternanthera philoxeroides | Y | Y | Y
SB Davis 0385 | Antigonon leptopus | Y | Y | Y
JR Abbott 23482 | Ardisia crenata | Y | Y | Y
SB Davis 0570 | Ardisia crenata | Y | Y | Y
JR Abbott 24112 | Ardisia elliptica | Y | Y | Y
JR Abbott 25155 | Aristolochia littoralis | Y | Y | Y
JR Abbott 23671 | Asparagus aethiopicus | Y | Y | Y
SB Davis 0362 | Asparagus aethiopicus | Y | Y | Y
JR Abbott 23992 | Asystasia gangetica | Y | Y | Y
JR Abbott 24907 | Bauhinia variegata | Y | Y | Y
JR Abbott 24949 | Begonia cucullata | Y | Y | Y
JR Abbott 24971 | Bischofia javanica | Y | Y | Y
JR Abbott 23815 | Blechum pyramidatum | Y | Y | Y
SB Davis 0310 | Broussonetia papyrifera | Y | Y | Y
JR Abbott 25259 | Callisia fragrans | Y | Y | Y
P Howell 1086 | Calophyllum antillanum | Y | Y | Y
JR Abbott 24924 | Casuarina cunninghamiana | Y | Y | Y
JR Abbott 24914 | Casuarina equisetifolia | Y | Y | Y
JR Abbott 24461 | Casuarina glauca | Y | Y | Y
JR Abbott 24911 | Casuarina glauca | Y | Y | Y
JR Abbott 25026 | Cecropia palmata | Y | Y | Y
JR Abbott 25258 | Cestrum diurnum | Y | Y | Y
JR Abbott 24813 | Chamaedorea seifrizii | Y | Y | Y
SB Davis 0298 | Cinnamomum camphora | Y | Y | Y
JR Abbott 23492 | Clematis terniflora | Y | Y | Y
SB Davis 1149 | Clematis terniflora | Y | Y | Y
JR Abbott 24708 | Colocasia esculenta | Y | Y | NA
SB Davis 1225 | Colocasia esculenta | Y | Y | Y
JR Abbott 24812 | Colubrina asiatica | Y | Y | Y
G Ionta 36 | Cryptostegia madagascariensis | Y | Y | Y
JR Abbott 24913 | Cupaniopsis anacardioides | Y | Y | Y
JR Abbott 25287 | Cyperus involucratus | Y | Y | Y
SB Davis 0706 | Cyperus involucratus | Y | Y | Y
JR Abbott 25162 | Cyperus prolifer | Y | Y | Y
JR Abbott 23669 | Dalbergia sissoo | Y | Y | Y
SB Davis 0903 | Dioscorea alata | Y | Y | Y
JR Abbott 23715 | Dioscorea bulbifera | Y | Y | Y
JR Abbott 23238 | Eichhornia crassipes | Y | Y | Y
SB Davis 0558 | Elaeagnus pungens | Y | Y | Y
JR Abbott 24912 | Epipremnum pinnatum cv. aureum | Y | Y | Y
JR Abbott 25209 | Epipremnum pinnatum cv. aureum | Y | Y | Y
JR Abbott 23855 | Eugenia uniflora | Y | Y | Y
JR Abbott 10684 | Ficus altissima | Y | Y | Y
JR Abbott 10686 | Ficus microcarpa | Y | Y | Y
JR Abbott 25027 | Flacourtia indica | Y | Y | Y
JR Abbott 24966 | Hemarthria altissima | Y | Y | Y
JR Abbott 25101 | Hibiscus tiliaceus | Y | Y | Y
JR Abbott 23507 | Hydrilla verticillata | Y | Y | Y
JR Abbott 25275 | Hydrilla verticillata | Y | Y | Y
JR Abbott 25274a | Hygrophila polysperma | Y | Y | Y
JR Abbott 25280 | Hygrophila polysperma | Y | Y | Y
JR Abbott 23713 | Hymenachne amplexicaulis | Y | Y | Y
JR Abbott 25272 | Hymenachne amplexicaulis | Y | Y | Y
SB Davis 0679 | Imperata cylindrica | Y | Y | Y
JR Abbott 25288 | Ipomoea aquatica | Y | Y | Y
JR Abbott 25278 | Ipomoea fistulosa | Y | Y | Y
JR Abbott 24970 | Jasminum dichotomum | Y | Y | Y
JR Abbott 24120 | Jasminum fluminense | Y | Y | Y
JR Abbott 24984 | Jasminum sambac | Y | Y | Y
SB Davis 1290 | Kalanchoe pinnata | Y | Y | Y
JR Abbott 07909 | Koelreuteria elegans ssp. formosana | Y | Y | Y
JR Abbott 22698 | Lantana camara | Y | Y | Y
JR Abbott 24948 | Leucaena leucocephala | Y | Y | Y
JR Abbott 23514 | Ligustrum lucidum | Y | Y | Y
JR Abbott 23510 | Ligustrum sinense | Y | Y | Y
JR Abbott 24951 | Limnophila sessiliflora | Y | Y | Y
JR Abbott 25021 | Livistona chinensis | Y | Y | Y
JR Abbott 23490 | Lonicera japonica | Y | Y | Y
JR Abbott 23657 | Ludwigia peruviana | Y | Y | Y
JR Abbott 22434 | Lygodium japonicum | Y | Y | Y
Hutchinson s.n. | Lygodium microphyllum | Y | Y | Y
JR Abbott 25203 | Lygodium microphyllum | Y | Y | Y
JR Abbott 08431 | Macfadyena unguis-cati | Y | Y | Y
JR Abbott 24050 | Manilkara zapota | Y | Y | Y
JR Abbott 23686 | Melaleuca quinquenervia | Y | Y | Y
JR Abbott 23439 | Melia azedarach | Y | Y | NA
K Samelson 1 | Melinis minutiflora | Y | Y | Y
P Howell 1087 | Melinis minutiflora | Y | Y | Y
JR Abbott 25024 | Merremia tuberosa | Y | Y | Y
JR Abbott 25270 | Mimosa pigra | Y | Y | Y
JR Abbott 24049 | Murraya paniculata | Y | Y | Y
JR Abbott 25284 | Myriophyllum spicatum | Y | Y | NA
JR Abbott 23516 | Nandina domestica | Y | Y | Y
SB Davis 0577 | Nandina domestica | Y | Y | Y
SB Davis 0317 | Nephrolepis cordifolia | NA | Y | Y
JR Abbott 25195 | Nephrolepis multiflora | NA | Y | Y
JR Abbott 24123 | Neyraudia reynaudiana | Y | Y | Y
Cuda s.n. | Nymphoides cristata | Y | Y | Y
JR Abbott 25017 | Paederia cruddasiana | Y | Y | Y
JR Abbott 23719 | Paederia foetida | Y | Y | Y
JR Abbott 23632 | Panicum maximum | Y | Y | Y
JR Abbott 23833 | Panicum maximum | Y | Y | Y
JR Abbott 23028 | Panicum repens | Y | Y | Y
SB Davis 1422 | Passiflora biflora | Y | Y | Y
JR Abbott 23653 | Pennisetum purpureum | Y | Y | Y
JR Abbott 25215 | Pennisetum purpureum | Y | Y | Y
JR Abbott 24732 | Pennisetum setaceum | Y | Y | Y
SB Davis 1472 | Phoenix reclinata | Y | Y | Y
JR Abbott 25171 | Phyllostachys aurea | Y | Y | Y
JR Abbott 24890 | Pistia stratiotes | Y | Y | Y
JR Abbott 25072 | Pittosporum pentandrum | Y | Y | Y
JR Abbott 24905 | Psidium cattleianum | Y | Y | Y
M.S. Frank 541 | Psidium guajava | Y | Y | Y
JR Abbott 18042 | Pteris vittata | NA | Y | Y
John Dowe 326 (FTG) | Ptychosperma elegans | Y | Y | Y
JR Abbott 23100 | Pueraria montana var. lobata | Y | Y | Y
JR Abbott 23684 | Rhodomyrtus tomentosa | Y | Y | Y
JR Abbott 22668 | Rhynchelytrum repens | Y | Y | Y
JR Abbott 23663 | Ricinus communis | Y | Y | Y
JR Abbott 24953 | Ricinus communis | Y | Y | Y
Cuda s.n. | Rotala rotundifolia | Y | Y | Y
JR Abbott 23860 | Ruellia tweediana | Y | Y | Y
SB Davis 0426 | Ruellia tweediana | Y | Y | Y
JR Abbott 24906 | Sansevieria hyacinthoides | Y | Y | Y
JR Abbott 19279 | Sapium sebiferum | Y | Y | Y
JR Abbott 24342 | Scaevola taccada | Y | Y | Y
JR Abbott 24916 | Scaevola taccada | Y | Y | Y
JR Abbott 24877 | Schefflera actinophylla | Y | Y | Y
JR Abbott 23660 | Schinus terebinthifolius | Y | Y | Y
JR Abbott 25173 | Scleria lacustris | Y | Y | Y
SB Davis 0496 | Senna pendula var. glabrata | Y | Y | Y
JR Abbott 23503 | Sesbania punicea | Y | Y | Y
SB Davis 0300 | Sesbania punicea | Y | Y | Y
JR Abbott 07901 | Solanum diphyllum | Y | Y | Y
JR Abbott 24964 | Solanum diphyllum | Y | Y | Y
JR Abbott 25186 | Solanum jamaicense | Y | Y | Y
JR Abbott 25268 | Solanum tampicense | Y | Y | Y
M.S. Frank 634 | Solanum torvum | Y | Y | Y
JR Abbott 24945 | Solanum viarum | Y | Y | Y
JR Abbott 25267 | Solanum viarum | Y | Y | Y
JR Abbott 24879 | Sphagneticola trilobata | Y | Y | Y
SB Davis 0412 | Sphagneticola trilobata | Y | Y | Y
SB Davis 0445 | Stachytarpheta cayennensis | Y | Y | Y
Larry Noblick 5272 (FTG) | Syagrus romanzoffiana | Y | Y | Y
JR Abbott 24908 | Syngonium podophyllum | Y | Y | Y
SB Davis 0922 | Syngonium podophyllum | Y | Y | Y
JR Abbott 23676 | Syzygium cumini | Y | Y | Y
JR Abbott 25125 | Syzygium cumini | Y | Y | Y
JR Abbott 25028 | Syzygium jambos | Y | Y | Y
J Possley 94 | Tectaria incisa | Y | Y | Y
P Howell 1088 | Tectaria incisa | Y | Y | Y
JR Abbott 25029 | Terminalia catappa | Y | Y | Y
JR Abbott 25030 | Terminalia muelleri | Y | Y | Y
JR Abbott 24100 | Thespesia populnea | Y | Y | Y
SB Davis 0290 | Tradescantia fluminensis | Y | Y | Y
JR Abbott 24897 | Tradescantia spathacea | Y | Y | Y
JR Abbott 24857 | Tribulus cistoides | Y | Y | NA
JR Abbott 23658 | Urena lobata | Y | Y | Y
JR Abbott 22309 | Urochloa mutica | Y | Y | Y
JR Abbott 24845 | Vitex trifolia | Y | Y | Y
SB Davis 1410 | Washingtonia robusta | Y | Y | Y
SB Davis 0243 | Wisteria sinensis | Y | Y | Y
JR Abbott 24707 | Xanthosoma sagittifolium | Y | Y | NA

Figure 1. A phylogram generated from rbcL and matK data generated during this study (the trnH-psbA spacer was excluded for this figure). Branch lengths are proportional to sequence variation (i.e., the longer the branches, the more likely the taxon is to be distinguishable).