Botanical Journal of the Linnean Society, 2021, 197, 1–14. With 4 figures.
Can plastid genome sequencing be used for species
identification in Lauraceae?
1 Plant Phylogenetics and Conservation Group, Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Kunming, China
2 University of Chinese Academy of Sciences, Beijing, China
3 Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, China
4 Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, China
5 State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
6 Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, China
7 Herbarium (KUN), Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
8 Tibet Agriculture & Animal Husbandry University, Nyingchi, China
9 Shandong Provincial Key Laboratory of Plant Stress Research, College of Life Sciences, Shandong Normal University, Ji’nan, China
10 Australian Centre for Evolutionary Biology and Biodiversity & Sprigg Geobiology Centre, School of Biological Sciences, University of Adelaide, Adelaide, Australia
11 Institute of Evolutionary Biology, Ashworth Laboratories, The University of Edinburgh, Edinburgh, UK
12 Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
13 Genetics and Conservation Section, Royal Botanic Garden Edinburgh, Edinburgh, UK
Received 23 June 2020; revised 27 December 2020; accepted for publication 25 January 2021
Using DNA barcoding for species identification remains challenging for many plant groups. New sequencing
approaches such as complete plastid genome sequencing may provide some increased power and practical benefits
for species identification beyond standard plant DNA barcodes. We undertook a case study comparing standard DNA
barcoding to plastid genome sequencing for species discrimination in the ecologically and economically important
family Lauraceae, using 191 plastid genomes for 131 species from 25 genera, representing the largest plastome data
set for Lauraceae to date. We found that the plastome sequences were useful in correcting some identification errors
and for finding new and cryptic species. However, plastome data overall were only able to discriminate c. 60% of the
species in our sample, with this representing a modest improvement from 40 to 50% discrimination success with
the standard plant DNA barcodes. Beyond species discrimination, the plastid genome sequences revealed complex
relationships in the family, with 12/25 genera being non-monophyletic and with extensive incongruence relative to
nuclear ribosomal DNA. These results highlight that although useful for improving phylogenetic resolution in the
family and providing some species-level insights, plastome sequences only partially improve species discrimination,
and this reinforces the need for large-scale nuclear data to improve discrimination among closely related species.
ADDITIONAL KEYWORDS: cytonuclear discordance – DNA barcoding – nrDNA – phylogenetics – plastomes.
*Corresponding authors. E-mails: jieli@xtbg.ac.cn; Alex.Twyford@ed.ac.uk;
jbyang@mail.kib.ac.cn; PHollingsworth@rbge.org.uk.
© 2021 The Linnean Society of London.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs
licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the
work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For
commercial re-use, please contact journals.permissions@oup.com
Downloaded from https://academic.oup.com/botlinnean/article/197/1/1/6179569 by guest on 01 March 2023
ZHI-FANG LIU1,2,13, HUI MA1, XIU-QIN CI1,3, LANG LI1,3, YU SONG4, BING LIU5,6,
HSI-WEN LI7, SHU-LI WANG1,2,8, XIAO-JIAN QU9, JIAN-LIN HU1,2,
XIAO-YAN ZHANG1,2, JOHN G. CONRAN10, ALEX D. TWYFORD11,*, JUN-BO YANG12,*,
PETER M. HOLLINGSWORTH13,* and JIE LI1,3,*
INTRODUCTION
The aim of DNA barcoding is to use standardized
DNA sequences to aid in species identification (Hebert
et al., 2003; Hollingsworth, 2011). Various regions have
been proposed for DNA barcoding in plants (Kress &
Erickson, 2007; Lahaye et al., 2008). The plastid
loci matK+rbcL were adopted as the core DNA barcodes (CBOL
Plant Working Group, 2009) and are now
widely used alongside the nuclear region ITS and
other plastid loci such as trnH–psbA (Chase et al.,
2005; Kress et al., 2005; China Plant BOL Group, 2011;
Hollingsworth, Graham & Little, 2011). Despite many
benefits of using these standardized loci for plant DNA
barcoding, it has long been recognized that no single
suite of loci will be suitable across all plant taxa. In
groups where standard DNA barcoding ‘fails’ due to
technical issues, such as mutations in primer-binding
regions, or biological issues, such as rapid divergence,
researchers must augment the standard loci with
additional sequence data. The massive improvements
in genomic sequencing technologies allow researchers
to explore many possible options for ‘genomic DNA
barcodes’ to improve plant species identification.
Whole plastid genomes (plastomes) have been
proposed as suitable targets for the next wave of plant
DNA barcoding approaches (Kane et al., 2012; Ruhsam
et al., 2015; Hollingsworth et al., 2016; Twyford & Ness,
2017; Krawczyk et al., 2018), as they can be recovered
using ‘genome skimming’ (low-coverage whole genome
sequencing), a cost-effective and scalable sequencing
approach that can be performed on a range of material
(such as degraded herbarium samples) without prior
sample optimization (Alsos et al., 2020). As well as
plastome sequences, genome skimming also typically
recovers the nuclear ribosomal DNA (nrDNA)
assembly, collectively extending the plant barcode
from a few thousand to hundreds of thousands of bases.
Genome skimming also helps to circumvent primer
issues, polymerase chain reaction (PCR) failures from
amplicon sequencing of degraded DNA and different
loci being preferred for different taxonomic groups,
because shotgun sequencing is effective in routinely
recovering plastid loci and nuclear ribosomal sequences
due to their high copy numbers in plant cells (Coissac
et al., 2016). However, there remain drawbacks to this
approach for DNA barcoding of plants, as the plastid
is a small organelle in which all loci are tightly linked
and it therefore does not necessarily reflect the diverse
history of the nuclear genome. This is compounded by
the typically uniparental inheritance of the plastid
genome and resulting sensitivity to demographic
change. Similarly, the associated nrDNA, although a
useful additional source of characters, still represents a
fraction of the nuclear genome and can be problematic
for species identification and phylogenetic analysis due
to a range of issues, such as high diversity affecting
the reliability of sequence alignments, incomplete
concerted evolution and frequent paralogy. These
issues create a tension between the technical appeal
of plastomes and nrDNA in terms of ease of use, vs.
their suitability as representatives of the evolutionary
history of a species. Although the discriminatory
power of next-generation DNA barcoding in plants
has been evaluated in some recent studies (e.g. Kane
et al., 2012; Li et al., 2015; Ruhsam et al., 2015; Ji et al.,
2019; Pang et al., 2019), few studies have undertaken
direct comparisons between the suitability of standard
DNA barcodes and plastomes for genus and species
identification using multiple individuals per species
from a specific family.
Lauraceae were established by de Jussieu (1789)
in his Genera Plantarum, based on the type genus
Laurus L. (Linnaeus, 1753). Lauraceae are evergreen
or sometimes deciduous shrubs or trees (except for
Cassytha L., which is a twining, virtually leafless,
parasitic perennial vine), often with aromatic bark
and foliage (Chanderbali, van der Werff & Renner,
2001; Li et al., 2008a). Lauraceae comprise c. 50
genera and 2500–3000 species from predominantly
tropical and subtropical regions (van der Werff &
Richter, 1996), and they are most diverse in tropical
Asia, tropical America and Madagascar (Gentry, 1988;
van der Werff & Richter 1996; Li et al., 2008a). They
are economically and ecologically important as sources
of medicines, timber, fruits, spices and perfumes
(Kostermans, 1957; van der Werff & Richter, 1996; Li
et al., 2008a), and are present in wet forests at any
elevation and are frequently forest dominants (van
der Werff & Richter, 1996). Nevertheless, despite their
importance, the classification of Lauraceae is poorly
known (van der Werff & Richter, 1996) and their broad-scale classification has depended traditionally on the
morphology of inflorescences and flowers (Nees von
Esenbeck, 1836; Rohwer, 1993; van der Werff & Richter,
1996), although many groups have species that show
exceptions. For example, whereas most Lauraceae
flowers are regular with three whorls, groups such as
Laureae have flowers that are frequently irregular,
sometimes with more than three whorls of fertile
stamens (Rohwer, 1993). Vegetative morphological
similarities between taxa and intra-taxon variability
are also causes of taxonomic confusion.
Previous phylogenetic studies have used various
plastid (matK, trnK, trnL–trnF, psbA–trnH, trnT–
trnL, rps16) and nuclear regions (26S, RPB2, LEAFY
and ITS) to study relationships in Lauraceae (Rohwer,
2000; Chanderbali et al., 2001; Rohwer & Rudolph,
2005; Li et al., 2008a, 2011; Rohwer et al., 2009;
Huang et al., 2016; Mo et al., 2017). Recently, Song
et al. (2020) showed that expanding to whole plastome
MATERIAL AND METHODS
SAMPLING AND SEQUENCING
Our data set consists of plastid genomes and nrDNA
from 80 de novo genome skims, augmented with plastid
genome-only data from GenBank and LCGDB (https://
lcgdb.wordpress.com) for a further 111 individuals
(last search 3 September 2019). The resulting 191
plastome samples represented 133 species, 131 of
which represent 25 of the 55 currently recognized
genera of Lauraceae and two of which represent the
outgroup family Calycanthaceae.
The 191 samples included 101 species with N = 1
sample, and a further 90 individuals from 34 species
with N > 1 individual plastid genome sampled (mean
three individuals per species, range two to five) from
16 genera. At the genus level, 25 genera were sampled,
with 21 genera with more than one individual species
sampled. A summary of the data set generated for
this study is shown in Table 1 with details of the taxa
sampled in the Supporting Information (Table S1).
From the complete plastid genomes, we could also
subsample regions for comparative analyses of species
discrimination. For these analyses we compared (1) the
complete plastid genome, (2) the gene regions from the
plastid genomes, and (3) the standard DNA barcode
matrix (rbcL+matK+trnH–psbA) and some barcode
regions proposed as being useful for the family, e.g.
Lauraceae-specific barcodes (ycf1+ndhH–rps15+trnL–
ycf2; for further information see Note S1).
The 80 de novo plastid genomes (including one
individual sequenced twice) were derived from samples
collected from eight provinces in China (Guangdong,
Guangxi, Hainan, Hubei, Jiangxi, Sichuan, Yunnan
and Xizang) and sites in Japan and Myanmar
(Supporting Information, Table S2). Leaf tissue for
each taxon was dried with silica gel and vouchers
were deposited at the Herbarium of Xishuangbanna
Tropical Botanical Garden, Chinese Academy of
Sciences (HITBC), Yunnan, China. The specimens
and vouchers were identified by morphological and
molecular comparisons as described previously (Liu
et al., 2017).
Total genomic DNA was extracted using a modified
CTAB method (Doyle & Doyle, 1987) with a Tiangen
DNA secure Plant Kit (DP320). Yield and integrity of
genomic DNA extracts were quantified by fluorometric
quantification on the Qubit (Invitrogen, Carlsbad, CA,
USA) using the dsDNA HS kit and by visual assessment
on a 1% agarose gel. The extracted DNA was sheared
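Table 1. Comparison of characteristics of different data sets in Lauraceae

| Data sets | Subsets | Number of taxa | Number of species | Number of genera | Number of sites | Best fit model of ML analysis |
| --- | --- | --- | --- | --- | --- | --- |
| 191 plastomes data set | Plastomes | 191 | 131 | 25 | 180 129 | TVM+F+I+G4 |
| 191 plastomes data set | Concatenated genes | 191 | 131 | 25 | 79 387 | GTR+F+I+G4 |
| 191 plastomes data set | Extracted specific matrix | 191 | 129 | 24 | 2704 | TVM+F+G4 |
| 191 plastomes data set | Extracted standard matrix | 191 | 131 | 25 | 2144 | K3Pu+F+I+G4 |
| Cytonuclear discordance data set | 80 plastomes | 79 | 71 | 21 | 172 022 | TVM+F+I+G4 |
| Cytonuclear discordance data set | 80 rDNA | 79 | 71 | 21 | 6281 | TN+F+I+G4 |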
sequences can produce better resolved evolutionary
relationships than Sanger sequencing of a few key
loci. Although overall relationships among Lauraceae
are mostly well known, species relationships in
genera are still poorly understood. Most studies to
date have only sampled a single individual per taxon,
and sampling multiple individuals per species across
diverse taxa would allow us to test for species-level
monophyly and discrimination (Ji et al., 2019). Our
previous study (Liu et al., 2017) used standard DNA
plant barcodes to resolve Lauraceae relationships and
classification by sampling multiple individuals per
species; however, the resolution was poor.
Accordingly, here we investigate whether the
plastome can improve species discrimination relative
to standard DNA barcodes in Lauraceae. Specifically,
we have four aims: (1) to determine if plastome-based
DNA barcodes improve taxonomic resolution and
the potential for species identification compared to
standard DNA barcodes; (2) to establish whether some
proposed ‘Lauraceae-specific’ plastid barcodes provide
useful information for species discrimination; (3) to
relate genetic to morphological data to detect cases
of mistaken identity and facilitate species discovery;
and (4) to reconstruct phylogenetic relationships
in Lauraceae and see if plastid genomes match the
species boundaries determined from analysis of
nrDNA sequences.
into c. 500-bp fragments for library construction
using the standard protocol for the NEBNext Ultra
II™ DNA Library Prep Kit for Illumina. All samples
were sequenced on the Illumina HiSeq 2000 and
Illumina HiSeq X at BGI and Novogene (Supporting
Information, Table S2).
AMAS (Borowiec, 2016). In addition, the specific and
standard plastid barcode sequences from the 191
plastomes were extracted and concatenated using
Geneious 11.1.4.
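The concatenation step can be illustrated with a minimal sketch. It is not the AMAS or Geneious implementation; it only assumes that each locus is already aligned and that a locus absent from a sample is padded with a missing-data symbol. The function name, the `?` padding character and the toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments, missing="?"):
    """Join per-gene alignments into one matrix, padding absent genes."""
    # Every sample seen in any locus gets a row in the final matrix.
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for gene in sorted(gene_alignments):  # fixed locus order for reproducibility
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            # A sample lacking this locus is filled with missing-data symbols.
            parts[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: two loci, one of which is absent from sample_B.
toy = {
    "matK": {"sample_A": "ATG-", "sample_B": "ATGA"},
    "rbcL": {"sample_A": "CC"},
}
matrix = concatenate_alignments(toy)
print(matrix)
```

Padding rather than dropping incomplete samples keeps every individual in the matrix, at the cost of some missing data.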
SPECIES DISCRIMINATION AND
PHYLOGENETIC ANALYSES
ASSEMBLY, ANNOTATION AND GENE SUBSAMPLING
Our measures of species discrimination success for
the genomic regions in question sampled multiple
individuals per species to assess if these individuals
were more closely related to each other than to
other species. The species discrimination statistics
only used cases where N > 1 individual sampled per
species, but singleton species samples were included
as decoys, as they occupy phylogenetic space and
can ‘disrupt’ species-level monophyly, causing
species recovery to ‘fail’, but they are not themselves
included in the discrimination statistics. Overall, we
recorded the proportion of species and genera that
resolved as monophyletic following phylogenetic
analysis.
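The scoring rule described above can be sketched as follows. This is an illustrative reading of the text, not the authors' script: a rooted tree is written as nested tuples of tip labels, a species is scored only if it has more than one sampled individual, and singleton species stay in the tree as decoys. The tree, labels and function names are hypothetical.

```python
def collect_clades(tree):
    """Return (tips under this node, list of tip sets for all clades below it)."""
    if isinstance(tree, str):  # a tip
        tips = frozenset([tree])
        return tips, [tips]
    tips, found = frozenset(), []
    for child in tree:
        child_tips, child_found = collect_clades(child)
        tips |= child_tips
        found.extend(child_found)
    found.append(tips)
    return tips, found


def score_species(tree, species_of):
    """Map each species with N > 1 individuals to True if it is monophyletic."""
    _, found = collect_clades(tree)
    clade_set = set(found)
    members = {}
    for tip, species in species_of.items():
        members.setdefault(species, set()).add(tip)
    # Singletons are not scored, but they remain in the tree and can
    # break the monophyly of other species.
    return {sp: frozenset(tips) in clade_set
            for sp, tips in members.items() if len(tips) > 1}


# Hypothetical example: species A and B have two individuals each,
# C and D are singleton decoys; D1 is nested among the B individuals.
tree = ((("A1", "A2"), "C1"), (("B1", "D1"), "B2"))
species_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C", "D1": "D"}
scores = score_species(tree, species_of)
success = sum(scores.values()) / len(scores)
print(scores, success)
```

In this toy case species A is recovered and species B fails because the singleton D1 falls inside it, giving a discrimination success of one in two.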
The utility of different data sets for species and
genus identification was investigated using two
tree-based approaches: ML (maximum likelihood)
methods using IQ-TREE (Trifinopoulos et al., 2016) and
NJ (neighbour joining) methods using Geneious 11.1.4.
The best-fit model for each data set was determined
using ModelFinder (Kalyaanamoorthy et al., 2017),
with the best-fit substitution model selected by the
option –TEST and using a tree search with 1000
bootstrap replicates in a single run (Kalyaanamoorthy
et al., 2017). The number of species or genera with
multiple accessions resolving as monophyletic was
recorded, as was the branch support for each node with
> 50% support.
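For orientation, an analysis of this kind could be launched from the command line roughly as sketched below. The exact invocation used in the study is not given in the text, so this is an assumption: it uses the IQ-TREE 1.x options `-s` (alignment), `-m TEST` (model selection followed by tree search) and `-bb` (ultrafast bootstrap) or `-b` (standard bootstrap). Whether the 1000 replicates were ultrafast or standard is not stated, and the file name is hypothetical.

```python
import shlex


def iqtree_command(alignment, model="TEST", replicates=1000, ultrafast=True):
    """Build (but do not run) an IQ-TREE 1.x command line as a list of arguments."""
    bootstrap_flag = "-bb" if ultrafast else "-b"  # assumption: bootstrap type not stated
    return ["iqtree", "-s", alignment, "-m", model, bootstrap_flag, str(replicates)]


# Hypothetical alignment file name.
command = iqtree_command("plastomes.fasta")
print(shlex.join(command))
```

The list can be passed to `subprocess.run` if IQ-TREE is installed; building it separately makes the chosen options easy to inspect and record.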
To provide a best estimate of the phylogeny of
Lauraceae, we also undertook phylogenetic analysis
using ASTRAL-III (Zhang et al., 2018), as a recent study
verified that the multispecies coalescent method for
determining phylogeny offered a high level of accuracy
with plastid data (Gonçalves et al., 2019). This approach
considers variation in the phylogenetic signal across
plastid genes, so we based our family-level phylogenetic
tree on concatenated gene regions (excluding intergenic spacers) from the 191 plastome data set,
summarized by the coalescent-based ASTRAL method.
The concatenated matrix yielded 113 gene trees (absent
genes were treated as missing data). Construction of the
species tree was performed using the separate gene ML
trees from the 191 plastomes as input for ASTRAL-III
(nodes with < 10% bootstrap support were collapsed),
with 100 bootstrap replicates generated to assess
bootstrap support using the coalescent model. The R
package phytools (Revell, 2012) was used to compare
GetOrganelle (Jin et al., 2018) was used for assembly
of plastomes and nrDNA (18S–ITS1–5.8S–ITS2–26S).
GetOrganelle uses baiting and iterative mapping to
assemble plastomes with minimal manual intervention
and integrates SPAdes (Bankevich et al., 2012), Bowtie2
(Langmead & Salzberg, 2012), BLAST+ (Camacho et al.,
2009) and Bandage (Wick et al., 2015). Comparison of
the published plastomes for Lauraceae (Song et al., 2015,
2016, 2017a, b, 2018, 2020; Hinsinger & Strijk, 2017;
Wu et al., 2017; Zhao et al., 2018) led us to choose Litsea
glutinosa (Lour.) C.B.Rob (KU382356) as the plastid
genome reference for assembly, and the pipeline reference
Embryophyta plant nuclear as the partial or complete
nrDNA (18S–ITS1–5.8S–ITS2–26S) sequence assembly
reference https://github.com/Kinggerm/GetOrganelle.
All plastid genomes were checked manually for assembly
quality, particularly at the inverted repeat boundaries.
MAFFT (Kuraku et al., 2013; Katoh, Rozewicki &
Yamada, 2017) was used for sequence alignment,
followed by a manual check using Mesquite (Maddison
& Maddison, 2018) and Geneious 11.1.4. Alignments in
FASTA format were exported for each data set.
Plastid genomes were annotated using PGA (Qu
et al., 2019) and GeSeq (Tillich et al., 2017). For
plastome annotations, as well as Litsea glutinosa, two
other well-annotated early-diverging species were
also used as references [Caryodaphnopsis henryi Airy
Shaw (MF939346) and a new unpublished species
of Beilschmiedia Nees (C4011)]. To standardize
annotation, we also re-annotated the previously
published sequences with this workflow. After
annotation, a manual check was undertaken and the
reading frame was verified in Geneious 11.1.4 (https://
www.geneious.com) by visually inspecting the start
and stop codons. The orientations of the inverted
repeats (IRs) were checked by LASTZ (Harris, 2007).
Transfer RNAs (tRNAs) were confirmed by their
specific structure predicted by tRNAscan-SE 2.0
(Lowe & Chan, 2016; de Santana Lopes et al., 2019)
compared with other published annotated genomes
(Shinozaki, 1986; Song et al., 2017b; Qu et al., 2019).
Gene extraction was performed using the script
‘get_annotated_regions_from_gb.py’ of Jin (https://
github.com/Kinggerm/PersonalUtilities) to obtain
the annotated regions, then checked manually using
Geneious 11.1.4 with the matrix concatenated using
RESULTS
SEQUENCE CHARACTERISTICS
The aligned consensus length of the 191 complete
plastomes was 180 129 bp, with the corresponding
concatenated genes, the extracted ‘Lauraceae-specific’
and ‘standard’ barcode matrices being 79 387, 2704 and
2144 bp, respectively. For the newly sequenced samples
used in the analysis of cytonuclear discordance, the
nrDNA (18S–ITS1–5.8S–ITS2–26S) alignment was
6281 bp, with the corresponding plastome length
172 022 bp (Table 1). The plastid genome sizes were
similar between accessions, except for the parasitic
genus Cassytha (Supporting Information, Table S3),
with Cassytha filiformis L. MH03, MH04 having the
smallest plastid genome (114 705 bp). Cassytha has
lost one IR region and most ndh genes, with remnants
of some ndh regions as pseudogenes. The largest
plastid genome is Syndiclis sp. ZF61 (Cryptocaryeae)
with 158 639 bp, and Cryptocaryeae overall have larger
plastid genomes (157 057–158 639 bp based on the
unaligned sequences; Table S3), due in part to multiple
large insertions in Beilschmiedia, Cryptocarya R.Br.,
Endiandra R.Br. and Syndiclis Hook.f. For example,
in Beilschmiedia, there are insertions up to 723 bp
relative to the related genus Neocinnamomum H.Liu.
However, there are also large deletions, such as a
1657-bp deletion in Beilschmiedia compared with
other early-diverging genera. For most species with
more than one individual, the genome sizes were
mostly the same, although some were variable, e.g.
Neocinnamomum delavayi (Lecomte) H.Liu and
Phoebe bournei (Hemsl.) Yen C.Yang (Table S3). These
length variations mainly relate to indels located at the
beginning of the large single copy (LSC) region.
MISTAKEN IDENTIFICATION AND SPECIES DISCOVERY
After combining DNA sequences and relating these
sequences to existing morphological characters, a few
putative species were divergent from most individuals
sequenced for their assigned species or genus, and
these were found to be nested in other taxa (labelled
red in Fig. 1 and Supporting Information, Figs S1–S9).
For the publicly available downloaded samples, we can
only speculate about potential misidentifications, as
even checking voucher specimens can leave uncertainty.
For the individuals that we sampled and sequenced,
the examination procedure of Liu et al. (2017) was
followed; we rechecked our sequences together with
the morphology of our vouchers, herbarium specimens
from HITBC, KUN and living specimens in the XTBG
and KIB botanic gardens.
Twelve individuals were found to be mislabelled or
potentially so (Table 2), including ZF14, which was
similar vegetatively to Alseodaphnopsis petiolaris
(Meisn.) H.W.Li & J.Li and initially identified as such.
However, the plastome sequence of ZF14 clustered
with Machilus Nees. Recollecting the sample and
repeating the experimental procedures and analyses
resulted in ZF14b (Supporting Information, Tables S1–
S3) still clustering with Machilus (Fig. 1; Figs S1–S9);
however, as Machilus was found to be monophyletic in
previous studies (Li et al., 2011; Song et al., 2020), we
assumed that our initial identification of ZF14 as
Alseodaphnopsis petiolaris was an error (Table 2).
An additional anomaly in Cassytha was deemed
worthy of further investigation. As there is just one
species of Cassytha (C. filiformis) described in China
(Li et al., 2008a), we treated all Cassytha samples as
C. filiformis; however, our plastome data showed that
MH01, 02 and SZ01 clustered with the GenBank
plastome of C. capillaris Meisn. 1258175302 collected
from Indonesia. A comparison of all plastomes of
Cassytha spp. showed that the plastome sizes of
the GenBank C. capillaris 1258175302 and our new
sequences of MH01, 02 and SZ01 were much larger than
those of the remaining samples assigned to C. filiformis
(MH03, 04, 1258175251, 1243302039 and 1474379909),
with clear insertion/deletion sites separating these
sample groups. Morphological comparisons were also
made and the sample fruits of SZ01 were found to be
reddish and barely strigose. These features are recorded
as distinguishing characteristics of C. capillaris (Weber,
1981, 2007). The specimens of MH01 and 02 were older,
and less suited for comparisons, particularly as the
colours have faded on their specimens.
In addition, Mo et al. (2017) described GLQ26 and
GLQ33 as a new record for Yunnan Province, China,
of Alseodaphnopsis rugosa (Merr. & Chun) H.W.Li
& J.Li. However, our results placed these samples
instead with the recently described Alseodaphnopsis
maguanensis L.Li & J.Li (Li et al., 2020).
Further comparisons of other samples using genetic
data matched against morphological characters
suggest that several of them represent potential
ML trees between the 191 plastomes and concatenated
genes trees.
To understand the relationships between plastomes
and nrDNA better, a further analysis was conducted
using only our 80 assembled whole plastid genomes
and associated nrDNA (18S–ITS1–5.8S–ITS2–26S).
This data set was used as the nrDNA and the plastome
sequence data are derived from the same individuals.
Cytonuclear discordance analysis of the 80 newly
sequenced plastomes and nuclear DNA data sets
was performed using ML and phytools was used for
comparing the resulting ML trees.
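The idea behind comparing a plastome tree with an nrDNA tree for the same individuals can be sketched by listing the clades each rooted tree contains and checking which are shared. This is a simplified stand-in for the phytools comparison, not a reproduction of it; trees are written as nested tuples and all names are hypothetical.

```python
def tip_sets(tree):
    """Return (tips under this node, set of tip sets for every internal node)."""
    if isinstance(tree, str):  # a tip contributes no internal node
        return frozenset([tree]), set()
    tips, found = frozenset(), set()
    for child in tree:
        child_tips, child_found = tip_sets(child)
        tips |= child_tips
        found |= child_found
    found.add(tips)
    return tips, found


def compare_trees(tree_a, tree_b):
    """Return clades shared by both trees and those unique to each."""
    tips_a, clades_a = tip_sets(tree_a)
    tips_b, clades_b = tip_sets(tree_b)
    if tips_a != tips_b:
        raise ValueError("trees must contain the same individuals")
    # The root clade (all tips) is trivially shared, so it is ignored.
    clades_a.discard(tips_a)
    clades_b.discard(tips_b)
    return clades_a & clades_b, clades_a - clades_b, clades_b - clades_a


# Hypothetical five-sample example in which t2 and t3 swap positions.
plastome_tree = ((("t1", "t2"), "t3"), ("t4", "t5"))
nrdna_tree = ((("t1", "t3"), "t2"), ("t4", "t5"))
shared, plastome_only, nrdna_only = compare_trees(plastome_tree, nrdna_tree)
print(len(shared), plastome_only, nrdna_only)
```

Clades unique to one tree mark where the two genomes disagree; their combined count corresponds to a rooted Robinson-Foulds-style distance.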
Table 2. Original species determinations and corrected species determinations using DNA barcodes

| Original species determination | Corrected species determination |
| --- | --- |
| Actinodaphne pilosa GLQ34 | Neolitsea sp. GLQ34 |
| Alseodaphnopsis rugosa GLQ26 | Alseodaphnopsis maguanensis GLQ26 |
| Alseodaphnopsis rugosa GLQ33 | Alseodaphnopsis maguanensis GLQ33 |
| Alseodaphnopsis petiolaris ZF14 | Machilus sp. ZF14 |
| Beilschmiedia robusta C40 | Endiandra sp. C40 |
| Cassytha filiformis MH01 | Cassytha capillaris MH01 |
| Cassytha filiformis MH02 | Cassytha capillaris MH02 |
| Cassytha filiformis SZ01 | Cassytha capillaris SZ01 |
| Cinnamomum caudiferum JP31 | Lindera sp. JP31 |
| Cinnamomum sp. ZF59 | Machilus sp. ZF59 |
| Lindera nacusua 1433040893 | Lindera communis 1433040893 |
| Persea americana var. drymifolia SY9559 | Persea americana SY9559 |
new species of Alseodaphnopsis, Beilschmiedia and
Phoebe Nees (labelled in blue in Fig. 1 and Supporting
Information, Figs S1–S9). These samples warrant
further investigation and are noted here as having
either divergent sequences or atypical morphologies
compared to their sister species in the phylogenetic tree.
COMPARISON OF DISCRIMINATION EFFICIENCY
The resolution rates of species (41.2–58.8%) and
genera (52.4–66.7%) varied by data source and
analytical method (Table 3; Fig. 2). The species and
genera successfully distinguished are indicated with
a circle and a star in Supporting Information Figures
S1–S8. The two tree-based methods (ML and NJ) have
the same discrimination ability at the species level,
whereas ML performed better than NJ at the genus
level (Fig. 2). The plastomes and concatenated genes
gave the highest resolution rates (Table 3; Fig. 2; Figs
S1–S4) and a comparison is shown in Figure S9, with
lower resolution from the Lauraceae-specific barcodes,
and the standard barcodes giving the lowest resolution
of all (Table 3; Fig. 2; Figs S5–S8).
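The percentages quoted here follow directly from the counts in Table 3. The short check below recomputes the species-level rates; the dictionary labels and function name are illustrative, and only species-level counts are used.

```python
# Counts from Table 3: (species resolved as monophyletic, species with N > 1).
species_counts = {
    "Standard barcodes": (14, 34),
    "Lauraceae-specific barcodes": (16, 32),
    "Concatenated genes": (20, 34),
    "Complete plastomes": (20, 34),
}


def success_rate(resolved, sampled):
    """Percentage of sampled taxa resolved as monophyletic, to one decimal place."""
    return round(100 * resolved / sampled, 1)


species_rates = {name: success_rate(*counts) for name, counts in species_counts.items()}

# Gain of complete plastomes over standard barcodes, in percentage points.
gain = round(100 * (20 - 14) / 34, 1)
print(species_rates, gain)
```

The Lauraceae-specific barcodes are scored on 32 rather than 34 species, so their 50% rate is not strictly on the same denominator as the others.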
Figure 1. Phylogenetic relationships of Lauraceae species
tree generated from concatenated genes of the plastid
genomes based on ASTRAL analysis.
ANALYSES OF PHYLOGENY AND CYTONUCLEAR DISCORDANCE
Along with a comparison of the previous phylogenetic
hypotheses and based on all previous studies, Figure 3
shows the current most likely phylogenetic backbone
of Lauraceae. The results of our study are shown in
Figure 1. Our results strongly support Lauraceae as
monophyletic [bootstrap support (BS) = 100%] (Fig. 1),
sister to a clade containing the outgroup species
[Calycanthus chinensis (W.C.Cheng & S.Y.Chang)
P.T.Li and Calycanthus floridus var. glaucus
(Willd.) Torr. & A.Gray; Calycanthaceae]. Lauraceae
formed seven clades (Fig. 1), agreeing with previous
phylogenetic analyses (Fig. 3). In the first-diverging
clade I, Cryptocaryeae comprised five genera, with
Eusideroxylon Teijsm. & Binn. sister to a clade with
three monophyletic genera (Cryptocarya, Endiandra
and Syndiclis) and Beilschmiedia paraphyletic
(Fig. 1). Clade II (BS = 100%) consisted of Cassytheae
with one genus, Cassytha, but displaying long
branch lengths (Supporting Information, Fig. S9).
Clade III (BS = 100%), Neocinnamomeae, included
only Neocinnamomum. Clade IV (BS = 100%),
Caryodaphnopsideae, similarly only contained
Caryodaphnopsis Airy Shaw. Seven genera of Perseeae
(Persea Mill., Dehaasia Blume, Nothaphoebe Blume,
Alseodaphne Nees, Alseodaphnopsis, Phoebe and
Machilus) were sister to a clade comprising taxa
from Cinnamomeae and Laureae (Fig. 1). Persea was
separated into two clades, with P. borbonia L. mixed
with Dehaasia, Nothaphoebe and Alseodaphne,
whereas P. americana Mill. was sister to the remaining
Perseeae. Species of Alseodaphnopsis were found
in two separate clades: one resolved at the base of
the clade containing the Phoebe–Machilus lineage,
and the other sister to Machilus (Fig. 1). The sister
relationship between Cinnamomeae and Laureae
was moderately supported (BS = 64%) (Fig. 1). For
Cinnamomeae, Nectandra angustifolia Nees & Mart.
ex Nees was sister to the remainder. Sassafras J.Presl
was nested in Cinnamomum Schaeff., making the
latter paraphyletic. In Laureae, Iteadaphne Blume
(one species only), Laurus (two sampled species),
Neolitsea (Benth.) Merr. and Parasassafras D.G.Long
(one species only) were monophyletic (Fig. 1), whereas
Lindera Thunb., Litsea Lam. and Actinodaphne Nees
were either poly- or paraphyletic.

Figure 2. Discrimination efficiency comparison of 191
plastomes and sub-sampled regions.

Table 3. Success in species discrimination based on data from the 191 plastomes data set (based on species with two or more sampled individuals per species)

| Barcode regions | Successful species/sampled species | Successful genera/sampled genera |
| --- | --- | --- |
| Plastomes-NJ | 20/34 (58.8%) | 13/21 (61.9%) |
| Plastomes-ML | 20/34 (58.8%) | 14/21 (66.7%) |
| Concatenated genes-NJ | 20/34 (58.8%) | 13/21 (61.9%) |
| Concatenated genes-ML | 20/34 (58.8%) | 14/21 (66.7%) |
| Specific-NJ | 16/32 (50%) | 11/20 (55%) |
| Specific-ML | 16/32 (50%) | 12/20 (60%) |
| Standard-NJ | 14/34 (41.2%) | 11/21 (52.4%) |
| Standard-ML | 14/34 (41.2%) | 12/21 (57.1%) |

Alseodaphnopsis was monophyletic in a previous
nrDNA study (Mo et al., 2017), but our study using
plastid genes did not support this. Phylogenetic
incongruence of plastomes and nuclear DNA
extends across the hierarchical taxonomic levels of
Lauraceae, even though the information provided by
nrDNA is limited (Fig. 4). First, although two early-diverging tribes, Cryptocaryeae and Cassytheae,
gave consistent relationships, there were conflicting
patterns on the placement of Cryptocarya depauperata
H.W.Li and Cryptocarya hainanensis Merr. Second,
Neocinnamomeae and Caryodaphnopsideae changed
their phylogenetic positions in the nrDNA tree relative
to the plastome tree, and this inconsistency was well
supported (100%) in a bootstrap analysis (Fig. 4). Third,
although Cinnamomeae, Laureae and Perseeae were
consistent at the tribe level, cytonuclear discordance
occurred within them (Fig. 4.2). In total, 19 individuals
from our 80 samples were consistent across the
plastome and nrDNA trees, with the remaining 61
individuals showing conflict between the nuclear and
plastome phylogenomic analyses.
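The kind of tip-by-tip comparison summarized above can be illustrated with a minimal sketch. It assumes two rooted trees with identical tip sets, written as nested tuples, and scores a tip as concordant when its smallest containing clade is the same in both trees. This scoring rule, the function names and the toy labels (A1, B1, ...) are illustrative assumptions, not the criterion or data used in this study.

```python
from typing import FrozenSet, Set, Tuple, Union

# A rooted tree as nested tuples; tips are strings (hypothetical labels).
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of tip labels below every internal node."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):  # a tip
            return frozenset([node])
        tips: FrozenSet[str] = frozenset().union(*(walk(c) for c in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


def smallest_clade(tip: str, all_clades: Set[FrozenSet[str]]) -> FrozenSet[str]:
    """Smallest clade containing the tip, i.e. the tip plus its sister group."""
    return min((c for c in all_clades if tip in c), key=len)


def concordant_tips(tree_a: Tree, tree_b: Tree) -> Set[str]:
    """Tips whose smallest containing clade is identical in both trees."""
    clades_a, clades_b = clades(tree_a), clades(tree_b)
    all_tips = max(clades_a, key=len)  # the root clade holds every tip
    return {
        t for t in all_tips
        if smallest_clade(t, clades_a) == smallest_clade(t, clades_b)
    }


# Toy topologies (assumed for illustration only, not taken from Fig. 4).
plastome_toy = ((("A1", "A2"), ("B1", "B2")), ("C1", "C2"))
nuclear_toy = ((("A1", "B1"), ("A2", "B2")), ("C1", "C2"))

same = concordant_tips(plastome_toy, nuclear_toy)
shared = clades(plastome_toy) & clades(nuclear_toy)
print(sorted(same), len(shared))
```

In this toy case only C1 and C2 keep the same sister group in both trees, whereas the A and B tips are grouped differently, which is the pattern described as cytonuclear discordance.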
DISCUSSION
DNA BARCODING PERFORMANCE
There remain relatively few studies assessing the power
of complete plastome sequences in plant barcoding (Ji
et al., 2019). The discriminatory power revealed by
previous studies sampling multiple individuals from
multiple congeneric species is variable. For instance,
the plastome was shown to successfully distinguish
some closely related species in Quercus L. (Pang et al.,
2019), Stipa L. (Krawczyk et al., 2018) and Taxus L. (Fu
et al., 2019). However, plastomes failed to significantly
improve species identification in Panax L. (Ji et al.,
2019) and New Caledonian Araucaria Juss. (Ruhsam
et al., 2015).
Our comparison in Lauraceae indicates that the
discrimination rate of the plastome was higher
than that of standard DNA barcodes or Lauraceae-specific
barcodes. The Lauraceae-specific barcodes
(i.e. loci selected as having potential for use in
Lauraceae) gave a modest increase in resolution,
but species-level discrimination still only improved
from 41% (for standard barcodes) to 50%. The
highest species resolution in our study was c. 60%
from the complete plastome sequences. Even here,
however, with far from complete species-level
Figure 3. Review of current phylogenetic relationships of Lauraceae based on previous Sanger and plastid genome sequences.
Figure 4. Discordance between 80 GetOrganelle-assembled plastomes and nrDNA sequences based on ML analyses.
PHYLOGENETIC RELATIONSHIPS AND INCONGRUENCE BETWEEN PLASTOMES AND NRDNA
The plastome sequence analysis recovered seven tribes
using the multispecies coalescent method ASTRAL,
with this approach giving the most detailed insight
into the relationships of Lauraceae, although we lack
samples from two remaining clades in Figure 3 for
which previous molecular data exist: Hypodaphnis
Stapf from Cameroon, Gabon and Nigeria (Rohwer,
1993); and Mezilaurus Taub. from South America
(Rohwer, 1993; Chanderbali et al., 2001; Rohwer &
Rudolph, 2005).
Our results provide the most comprehensive
plastome-based phylogenetic hypothesis for
relationships in Lauraceae. However, there are still
many non-monophyletic groups and taxonomic issues
that need to be resolved. The paraphyletic relationships
in Actinodaphne, Beilschmiedia, Lindera, Litsea and
Persea have been documented in previous studies
(Rohwer, 2000; Chanderbali et al., 2001; Li, Li &
Conran, 2007; Li et al., 2008b, 2011; Rohwer et al.,
2009), whereas the relationships of Alseodaphnopsis
seen here conflict with previous studies (Mo et al.,
2017). Mo et al. (2017) published the new genus
Alseodaphnopsis based on nuclear DNA regions, but
the genus is poorly known; the limited availability of
collections makes morphological diagnostic characters
hard to find (van der Werff, 2019), and our study
recovered the genus as polyphyletic. Similar to Rohde
et al. (2017) and Trofimov & Rohwer (2020), we found
that Sassafras was nested in Cinnamomum, with their
morphological similarities discussed above.
Cytonuclear discordance has been observed in many
plant groups (Rieseberg & Soltis, 1991; Soltis & Kuzoff,
1995; Folk et al., 2018; Liu et al., 2018, 2019; Ji et al., 2019;
Lee-Yaw et al., 2019). The relationships in Lauraceae
reconstructed using nrDNA sequences contrast with
the phylogenetic tree derived from plastomes (Fig. 4).
This incongruence between the maternally inherited
plastid and biparentally inherited nuclear DNA has
been reported in Lauraceae by Rohde et al. (2017)
using psbA–trnH plus trnG–trnS plus ITS.
The results seen here show that it is feasible to
use either plastomes or nrDNA for early-diverging
groups such as Cryptocaryeae and Cassytheae, as
the relationships were consistent with either data
source, but the conflicts in the remaining tribes
caution against using either plastomes or nrDNA
alone to infer relationships. Specifically, the well-supported
incongruent sister relationship between
Caryodaphnopsideae and Neocinnamomeae suggests
that there may have been historical reticulation or
other complex processes shaping the early evolutionary
history of these groups. In addition, the incongruent
phylogenetic relationships in Laureae, Perseeae and
sampling and 180 kb of sequence data, many species
are not distinguishable with sequence data. Further
sampling is likely to decrease this discrimination
success, as the phylogenetic space becomes more
densely occupied, and the situations where DNA
barcodes did not discriminate between species are
typically associated with higher sample density of
co-occurring congeneric species. It is not clear what
the primary driver(s) is for this discrimination
failure in Lauraceae; plastid genomes being shared
by hybridization, recent diversification or rapid
radiations, slow sequence mutation rates and/or
restricted infraspecific gene flow could be involved
(Fazekas et al., 2009; Hollingsworth et al., 2011, 2016;
Ruhsam et al., 2015).
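To make the notion of discrimination success concrete, the following minimal sketch scores a species as discriminated when all of its sampled individuals form an exclusive clade in a rooted tree, counting only species with two or more individuals. This monophyly rule is one common tree-based criterion and is an assumption here; the criterion actually applied in this study is described in the Methods and may differ. The tree, the `species|voucher` label format and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# A rooted tree as nested tuples; tips are "species|voucher" strings (assumed format).
Tree = Union[str, Tuple["Tree", ...]]


def clade_tip_sets(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of tip labels below every internal node."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):  # a tip
            return frozenset([node])
        tips: FrozenSet[str] = frozenset().union(*(walk(c) for c in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


def discrimination_success(
    tree: Tree, min_individuals: int = 2
) -> Tuple[List[str], int]:
    """Return (species whose individuals form an exclusive clade, species scored)."""
    all_clades = clade_tip_sets(tree)
    by_species: Dict[str, Set[str]] = defaultdict(set)
    for tip in max(all_clades, key=len):  # the root clade holds every tip
        by_species[tip.rsplit("|", 1)[0]].add(tip)
    # Species with a single individual cannot be tested for monophyly.
    scored = {
        sp: frozenset(tips)
        for sp, tips in by_species.items()
        if len(tips) >= min_individuals
    }
    resolved = sorted(sp for sp, tips in scored.items() if tips in all_clades)
    return resolved, len(scored)


# Toy tree (assumed): sp_a is monophyletic, sp_b and sp_c are intermixed,
# and sp_d has one individual, so it is not scored.
toy_tree = (
    (("sp_a|1", "sp_a|2"), "sp_d|1"),
    (("sp_b|1", "sp_c|1"), ("sp_b|2", "sp_c|2")),
)

resolved, n_scored = discrimination_success(toy_tree)
print(resolved, n_scored, f"{len(resolved) / n_scored:.0%}")
```

Under this rule, adding further individuals of closely related species can only break existing exclusive clades or leave them intact, which is consistent with the expectation that denser sampling tends to lower the success rate.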
Although we are focusing on the ‘failure’ of the
DNA barcoding data to discriminate among taxa in
Lauraceae, an additional factor to consider is whether
the taxonomy of the family needs further revision.
There are many species from different genera of
Lauraceae with similar morphological characteristics.
For example, some Machilus spp. are similar to
Phoebe, as are some Dehaasia spp. to Alseodaphne
and Alseodaphnopsis. Conversely, some taxa that
were thought to be easy to discriminate using
morphological characteristics were grouped closely on
the phylogenetic tree. For example, the simple-leaved
taxa of Cinnamomum showed a close relationship
with the lobed-leaved taxa of Sassafras. Building on
this example, the affinity of taxa in these genera is
further revealed by reconsideration of morphological
characters. Simple leaves also occur in Sassafras,
and these are morphologically quite similar to those of
Cinnamomum section Camphora Meissn. Likewise,
the flowers of Sassafras in the Asian species show
similarities to those of Cinnamomum, and the fruits of
Sassafras species are similar to those of Cinnamomum
section Camphora. Thus, part of the conflicting signal
between the plastid data and morphology-based
classifications may also be due to a complex history
of the interpretation of morphological characters in
the family.
Despite the data showing imperfect resolution
at the species level, our results also enabled the
detection of misidentifications as well as the
identification of cryptic species and potential new
species. Previously only Cassytha filiformis was
known in China (Li et al., 2008a). Here, through our
barcode research, we can now confirm that there is
another Cassytha sp., Cassytha capillaris, present
in China, representing a new record for the country.
Our study also highlighted other potential new taxa
in Alseodaphnopsis, Beilschmiedia and Phoebe that
warrant further investigation (Fig. 1; Supporting
Information, Figs S1–S9).
CONCLUSIONS
Our plastid genome data resulted in a modest increase
in discriminatory power in Lauraceae compared to
standard DNA barcodes or regions selected for the
family. Using plastomes as genomic DNA barcodes was
nevertheless useful in the correction of misidentified
species, the discovery of cryptic species and in forming
the foundation of the description of new species. The
plastome data set also provided a useful phylogenetic
framework for the family, but cytonuclear discordance
suggests caution is needed in the interpretation of
plastomes and/or nrDNA phylogenetic analyses of
the family. This case study reiterates the value of
accessing multi-locus information from the nuclear
genome for species discrimination and understanding
phylogenetic relationships, especially among closely
related taxa.
DATA ACCESSIBILITY
GenBank accession numbers (plastid genomes: MT621572–
MT621650; nrDNA: MT628590–MT628656 and
MT669015–MT669026) are listed in Table S2.
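For readers who wish to retrieve these records, the accession ranges can be expanded into explicit identifiers. The sketch below is a hypothetical helper (the function name is not from this study) and assumes that every number in each range is a valid accession; Table S2 remains the authoritative list.

```python
import re
from typing import List


def expand_accessions(first: str, last: str) -> List[str]:
    """Expand an inclusive accession range such as MT621572 to MT621650."""
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    prefix, digits_first = m_first.groups()
    prefix_last, digits_last = m_last.groups()
    if prefix != prefix_last or len(digits_first) != len(digits_last):
        raise ValueError("range endpoints must share prefix and width")
    start, end = int(digits_first), int(digits_last)
    if end < start:
        raise ValueError("range end precedes range start")
    width = len(digits_first)  # preserve zero padding
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


# Ranges as given in the Data Accessibility statement.
plastome_ids = expand_accessions("MT621572", "MT621650")
nrdna_ids = expand_accessions("MT628590", "MT628656") + expand_accessions(
    "MT669015", "MT669026"
)
print(len(plastome_ids), len(nrdna_ids))  # 79 79
```

The resulting lists could then be passed to any sequence-retrieval tool; no particular download method is implied here.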
ACKNOWLEDGEMENTS
The authors would like to thank Hua-Jie Zhang, Jie-Qiong Li, Xian-Hui Shen, Wei-Yue Zhao, Xue Bai,
Cai-Yu Sheng, Ji-Pu Shi, Yun-Xue Xiao, Lin-Li Zheng,
Zhi-Yuan Lu, Zhi-Xiang Liu, Jian-Hua Xiao, Ding
Xin, Chao-Nan Cai, Qin-Xi Hou, Yue-Qing Mo, Zhi-Yi
Liu, Hong-Hu Meng, Can-Yu Zhang and Jian-Wu Li
for collection assistance and Jens G. Rohwer for the
morphological identification of some species. We are grateful to
Chun-Yan Lin and Jing Yang for experimental assistance,
and Yu-Hsin Tseng, Xiang-Qin Yu, Xiao-Yang Gao,
Zhen-Shan He, Catherine Kidner, Markus Ruhsam,
Linda Neaves, Laura Forrest, Li-Na Dong, Xiu Hu,
Peng-Cheng Fu, the ICT of RBGE and the CyVerse
Atmosphere Team for data analysis assistance. Special
thanks go to Jian-Jun Jin and Wen-Bin Yu for their kind
help with plastome assembly and to Stephen Jones for
copy-editing an earlier version of the manuscript. We
also thank Jens G. Rohwer for his comments on the
manuscript. We acknowledge the China Scholarship
Council, which supported Zhi-Fang Liu as a joint PhD
student at the Royal Botanic Garden Edinburgh. This
work was funded by the National Natural Science
Foundation of China (31370245, 31770569, 31970222),
the Biodiversity Conservation Program of the Chinese
Academy of Sciences (ZSSD-013), the Science and
Technology Basic Resources Investigation Program of
China: Survey and Germplasm Conservation of Plant
Species with Extremely Small Populations in Southwest China (2017YF100100), and the 135 programmes
of the Chinese Academy of Sciences (2017XTBG-T03).
The authors have no conflicts of interest to declare.
REFERENCES
Alsos IG, Lavergne S, Merkel MKF, Boleda M, Lammers Y,
Alberti A, Pouchon C, Denoeud F, Pitelkova I, Pușcaș M.
2020. The treasure vault can be opened: large-scale genome
skimming works well using herbarium and silica gel dried
material. Plants 9: 432.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M,
Kulikov AS, Lesin VM, Nikolenko SI, Pham S,
Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N,
Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a
new genome assembly algorithm and its applications to
single-cell sequencing. Journal of Computational Biology:
a Journal of Computational Molecular Cell Biology 19:
455–477.
Borowiec ML. 2016. AMAS: a fast tool for alignment
manipulation and computing of summary statistics. PeerJ 4:
e1660.
Camacho C, Coulouris G, Avagyan V, Ma N,
Papadopoulos J, Bealer K, Madden TL. 2009. BLAST+:
architecture and applications. BMC Bioinformatics 10: 421.
CBOL Plant Working Group. 2009. A DNA barcode for land
plants. Proceedings of the National Academy of Sciences of
the United States of America 106: 12794–12797.
Chanderbali AS, van der Werff H, Renner SS. 2001.
Phylogeny and historical biogeography of Lauraceae:
evidence from the chloroplast and nuclear genomes. Annals
of the Missouri Botanical Garden 88: 104–134.
Chase MW, Salamin N, Wilkinson M, Dunwell JM,
Kesanakurthi RP, Haider N, Haidar N, Savolainen V.
Cinnamomeae emphasize the significance of hybrid
origin and reticulate evolution inside these three large
groups, with multiple examples of inter-specific and
inter-generic incongruence in these tribes (Fig. 4.2).
A particularly difficult problem with the
phylogenetics of Laureae is that relationships in the
Litsea complex remain unresolved (Li et al., 2008b).
Analyses of complete plastomes and nrDNA did not
resolve relationships in the Litsea complex, instead
splitting it into several well-supported but incongruent
subclades (Fig. 4.2). In contrast, the weakly supported
lack of monophyly for Perseeae and Cinnamomeae
in the nrDNA analyses is more likely to be due to
the limited information content of our nrDNA data.
Overall, to improve phylogenetic resolution the next
step will be to generate data from a substantial
number of nuclear markers or whole genomes for these
complex groups of Lauraceae.
Huang JF, Li L, van der Werff H, Li HW, Rohwer JG,
Crayn DM, Meng HH, van der Merwe M, Conran JG,
Li J. 2016. Origins and evolution of cinnamon and camphor:
a phylogenetic and historical biogeographical analysis of the
Cinnamomum group (Lauraceae). Molecular Phylogenetics
and Evolution 96: 33–44.
Ji YH, Liu CK, Yang ZY, Yang LF, He ZS, Wang HC,
Yang JB, Yi TS. 2019. Testing and using complete plastomes
and ribosomal DNA sequences as the next generation DNA
barcodes in Panax (Araliaceae). Molecular Ecology Resources
19: 1333–1345.
Jin J-J, Yu W-B, Yang J-B, Song Y, dePamphilis CW, Yi T-S, Li D-Z. 2018. GetOrganelle: a simple and fast pipeline for
de novo assembly of a complete circular chloroplast genome
using genome skimming data. BioRxiv 2018: 256479.
de Jussieu AL. 1789. Antonii Laurentii de Jussieu genera
plantarum: secundum ordines naturales disposita, juxta
methodum in horto regio parisiensi exaratam, anno M. DCC.
LXXIV. Paris: apud viduam Herissant et Theophilum Barrois.
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A,
Jermiin LS. 2017. ModelFinder: fast model selection
for accurate phylogenetic estimates. Nature Methods 14:
587–589.
Kane N, Sveinsson S, Dempewolf H, Yang JY, Zhang D,
Engels JM, Cronk Q. 2012. Ultra-barcoding in cacao
(Theobroma spp.; Malvaceae) using whole chloroplast
genomes and nuclear ribosomal DNA. American Journal of
Botany 99: 320–329.
Katoh K, Rozewicki J, Yamada KD. 2017. MAFFT online
service: multiple sequence alignment, interactive sequence
choice and visualization. Briefings in Bioinformatics 20:
1160–1166.
Kostermans AJGH. 1957. Lauraceae. Pengumuman Balai
Besar Penjelidikan Kehutanan Indonesia 57: 1–64.
Krawczyk K, Nobis M, Myszczyński K, Klichowska E,
Sawicki J. 2018. Plastid super-barcodes as a tool for species
discrimination in feather grasses (Poaceae: Stipa). Scientific
Reports 8: 1924.
Kress WJ, Erickson DL. 2007. A two-locus global DNA
barcode for land plants: the coding rbcL gene complements
the non-coding trnH-psbA spacer region. PLoS One 2: e508.
Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH.
2005. Use of DNA barcodes to identify flowering plants.
Proceedings of the National Academy of Sciences of the
United States of America 102: 8369–8374.
Kuraku S, Zmasek CM, Nishimura O, Katoh K. 2013.
aLeaves facilitates on-demand exploration of metazoan
gene family trees on MAFFT sequence alignment server
with enhanced interactivity. Nucleic Acids Research 41:
W22–W28.
Lahaye R, van der Bank M, Bogarin D, Warner J,
Pupulin F, Gigot G, Maurin O, Duthoit S,
Barraclough TG, Savolainen V. 2008. DNA barcoding the
floras of biodiversity hotspots. Proceedings of the National
Academy of Sciences of the United States of America 105:
2923–2928.
Langmead B, Salzberg SL. 2012. Fast gapped-read alignment
with Bowtie 2. Nature Methods 9: 357–359.
2005. Land plants and DNA barcodes: short-term and long-term goals. Philosophical Transactions of the Royal Society of
London. Series B, Biological Sciences 360: 1889–1895.
China Plant BOL Group, Li D-Z, Gao L-M, Li H-T, Wang H,
Ge X-J, Liu J-Q, Chen Z-D, Zhou S-L, Chen S-L, Yang J-B, Fu C-X, Zeng C-X, Yan H-F, Zhu Y-J, Sun Y-S, Chen S-Y, Zhao L, Wang K, Yang T, Duan G-W. 2011. Comparative
analysis of a large dataset indicates that internal transcribed
spacer (ITS) should be incorporated into the core barcode for
seed plants. Proceedings of the National Academy of Sciences
of the United States of America 108: 19641–19646.
Coissac E, Hollingsworth PM, Lavergne S, Taberlet P.
2016. From barcodes to genomes: extending the concept of
DNA barcoding. Molecular Ecology 25: 1423–1428.
Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure
from small quantities of fresh leaf tissue. Phytochemistry
Bulletin, Botanical Society of America 19: 11–15.
Fazekas AJ, Kesanakurti PR, Burgess KS, Percy DM,
Graham SW, Barrett SC, Newmaster SG, Hajibabaei M,
Husband BC. 2009. Are plant species inherently harder
to discriminate than animal species using DNA barcoding
markers? Molecular Ecology Resources 9 Suppl s1: 130–139.
Folk RA, Soltis PS, Soltis DE, Guralnick R. 2018. New
prospects in the detection and comparative analysis of
hybridization in the tree of life. American Journal of Botany
105: 364–375.
Fu CN, Wu CS, Ye LJ, Mo ZQ, Liu J, Chang YW, Li DZ,
Chaw SM, Gao LM. 2019. Prevalence of isomeric plastomes
and effectiveness of plastome super-barcodes in yews (Taxus)
worldwide. Scientific Reports 9: 2773.
Gentry AH. 1988. Changes in plant community diversity and
floristic composition on environmental and geographical
gradients. Annals of the Missouri Botanical Garden 75:
1–34.
Gonçalves DJP, Simpson BB, Ortiz EM, Shimizu GH,
Jansen RK. 2019. Incongruence between gene trees
and species trees and phylogenetic signal variation in
plastid genes. Molecular Phylogenetics and Evolution 138:
219–232.
Harris RS. 2007. Improved pairwise alignment of genomic
DNA. Ph.D. Dissertation, Pennsylvania State University.
Hebert PD, Cywinska A, Ball SL, deWaard JR. 2003.
Biological identifications through DNA barcodes. Proceedings
of the Royal Society of London. B: Biological Sciences 270:
313–321.
Hinsinger DD, Strijk JS. 2017. Toward phylogenomics of
Lauraceae: the complete chloroplast genome sequence of
Litsea glutinosa (Lauraceae), an invasive tree species on
Indian and Pacific Ocean islands. Plant Gene 9: 71–79.
Hollingsworth PM. 2011. Refining the DNA barcode for land
plants. Proceedings of the National Academy of Sciences of
the United States of America 108: 19451–19452.
Hollingsworth PM, Graham SW, Little DP. 2011. Choosing
and using a plant DNA barcode. PLoS One 6: e19254.
Hollingsworth PM, Li DZ, van der Bank M, Twyford AD.
2016. Telling plant species apart with DNA: from barcodes to
genomes. Philosophical Transactions of the Royal Society B:
Biological Sciences 371: 20150338.
Pang XB, Liu HS, Wu SR, Yuan YC, Li HJ, Dong JS, Liu ZH,
An CZ, Su ZH, Li B. 2019. Species identification of oaks
(Quercus L., Fagaceae) from gene to genome. International
Journal of Molecular Sciences 20: 5940.
Qu XJ, Moore MJ, Li DZ, Yi TS. 2019. PGA: a software
package for rapid, accurate, and flexible batch annotation of
plastomes. Plant Methods 15: 50.
Revell LJ. 2012. phytools: an R package for phylogenetic
comparative biology (and other things). Methods in Ecology
and Evolution 3: 217–223.
Rieseberg LH, Soltis DE. 1991. Phylogenetic consequences
of cytoplasmic gene flow in plants. Evolutionary Trends in
Plants 5: 65–84.
Rohde R, Rudolph B, Ruthe K, Lorea-Hernández FG,
de Moraes PLR, Li J, Rohwer JG. 2017. Neither Phoebe
nor Cinnamomum—the tetrasporangiate species of Aiouea
(Lauraceae). Taxon 66: 1085–1111.
Rohwer JG. 1993. Lauraceae. In: Kubitzki K, Rohwer JG,
Bittrich V. eds. The families and genera of vascular plants,
Vol. 2. Flowering plants. Dicotyledons: magnoliid, hamamelid
and caryophyllid families. Berlin: Springer-Verlag, 366–391.
Rohwer JG. 2000. Toward a phylogenetic classification of
the Lauraceae: evidence from matK sequences. Systematic
Botany 25: 60–71.
Rohwer JG, Li J, Rudolph B, Schmidt SA,
van der Werff H, Li H-W. 2009. Is Persea (Lauraceae)
monophyletic? Evidence from nuclear ribosomal ITS
sequences. Taxon 58: 1153–1167.
Rohwer JG, Rudolph B. 2005. Jumping genera: the
phylogenetic positions of Cassytha, Hypodaphnis, and
Neocinnamomum (Lauraceae) based on different analyses
of trnK intron sequences. Annals of the Missouri Botanical
Garden 92: 153–178.
Ruhsam M, Rai HS, Mathews S, Ross TG, Graham SW,
Raubeson LA, Mei W, Thomas PI, Gardner MF,
Ennos RA, Hollingsworth PM. 2015. Does complete
plastid genome sequencing improve species discrimination
and phylogenetic resolution in Araucaria? Molecular Ecology
Resources 15: 1067–1078.
de Santana Lopes A, Gomes Pacheco T,
Nascimento da Silva O, Magalhães Cruz L, Balsanelli E,
Maltempi de Souza E, de Oliveira Pedrosa F,
Rogalski M. 2019. The plastomes of Astrocaryum
aculeatum G. Mey. and A. murumuru Mart. show a flip-flop
recombination between two short inverted repeats. Planta
250: 1229–1246.
Shinozaki K, Ohme M, Tanaka M, Wakasugi T,
Hayashida N, Matsubayashi T, Zaita N,
Chunwongse J, Obokata J, Yamaguchi-Shinozaki K,
Ohto C, Torazawa K, Meng BY, Sugita M, Deno H,
Kamogashira T, Yamada K, Kusuda J, Takaiwa F,
Kato A, Tohdoh N, Shimada H, Sugiura M. 1986. The
complete nucleotide sequence of the tobacco chloroplast
genome: its gene organization and expression. The EMBO
Journal 5: 2043–2049.
Soltis DE, Kuzoff RK. 1995. Discordance between nuclear
and chloroplast phylogenies in the Heuchera group
(Saxifragaceae). Evolution 49: 727–742.
Lee-Yaw JA, Grassa CJ, Joly S, Andrew RL, Rieseberg LH.
2019. An evaluation of alternative explanations for
widespread cytonuclear discordance in annual sunflowers
(Helianthus). The New Phytologist 221: 515–526.
Li H-W, Li J, Huang P-H, Wei F-N, Tsui H-P,
van der Werff H. 2008a. Calycanthaceae–Schisandraceae,
Flora of China vol. 7. Beijing and St. Louis: Science Press
and Missouri Botanical Garden Press, 102–254.
Li J, Conran JG, Christophel DC, Li Z-M, Li L, Li H-W.
2008b. Phylogenetic relationships of the Litsea complex and
core Laureae (Lauraceae) using ITS and ETS sequences and
morphology. Novon 18: 4–8.
Li L, Li J, Conran JG, Li X-W. 2007. Phylogeny of Neolitsea
(Lauraceae) inferred from Bayesian analysis of nrDNA ITS
and ETS sequences. Plant Systematics and Evolution 269:
203–221.
Li L, Li J, Rohwer JG, van der Werff H, Wang ZH,
Li HW. 2011. Molecular phylogenetic analysis of the Persea
group (Lauraceae) and its biogeographic implications on
the evolution of tropical and subtropical Amphi-Pacific
disjunctions. American Journal of Botany 98: 1520–1536.
Li L, Tan YH, Meng HH, Ma H, Li J. 2020. Two new species
of Alseodaphnopsis (Lauraceae) from southwestern China
and northern Myanmar: evidence from morphological and
molecular analyses. PhytoKeys 138: 27–39.
Li X, Yang Y, Henry RJ, Rossetto M, Wang Y, Chen S.
2015. Plant DNA barcoding: from gene to genome.
Biological Reviews of the Cambridge Philosophical Society
90: 157–166.
Linnaeus CV. 1753. Species plantarum, Vol. 1. Stockholm: L.
Salvius.
Liu J, Milne RI, Moller M, Zhu GF, Ye LJ, Luo YH,
Yang JB, Wambulwa MC, Wang CN, Li DZ, Gao LM.
2018. Integrating a comprehensive DNA barcode reference
library with a global map of yews (Taxus L.) for forensic
identification. Molecular Ecology Resources 18: 1115–1131.
Liu Y, Johnson MG, Cox CJ, Medina R, Devos N,
Vanderpoorten A, Hedenäs L, Bell NE, Shevock JR,
Aguero B, Quandt D, Wickett NJ, Shaw AJ, Goffinet B.
2019. Resolution of the ordinal phylogeny of mosses using
targeted exons from organellar and nuclear genomes. Nature
Communications 10: 1485.
Liu ZF, Ci XQ, Li L, Li HW, Conran JG, Li J. 2017. DNA
barcoding evaluation and implications for phylogenetic
relationships in Lauraceae from China. PLoS One 12:
e0175788.
Lowe TM, Chan PP. 2016. tRNAscan-SE On-line: integrating
search and context for analysis of transfer RNA genes.
Nucleic Acids Research 44: W54–W57.
Maddison WP, Maddison DR. 2018. Mesquite: a modular
system for evolutionary analysis. Version 3.51. Available at:
http://www.mesquiteproject.org.
Mo YQ, Li L, Li JW, Rohwer JG, Li HW, Li J. 2017.
Alseodaphnopsis: a new genus of Lauraceae based on
molecular and morphological evidence. PLoS One 12:
e0186545.
Nees von Esenbeck CGD. 1836. Systema laurinarum. Berlin:
Sumptibus Veitii et Sociorum.
maximum likelihood analysis. Nucleic Acids Research 44:
W232–W235.
Trofimov D, Rohwer JG. 2020. Towards a phylogenetic
classification of the Ocotea complex (Lauraceae): an analysis
with emphasis on the Old World taxa and description of the
new genus Kuloa. Botanical Journal of the Linnean Society
192: 510–535.
Twyford AD, Ness RW. 2017. Strategies for complete plastid
genome sequencing. Molecular Ecology Resources 17:
858–868.
van der Werff H. 2019. Alseodaphnopsis (Lauraceae)
revisited. Blumea 64: 186–189.
van der Werff H, Richter HG. 1996. Toward an improved
classification of Lauraceae. Annals of the Missouri Botanical
Garden 83: 409–418.
Weber JZ. 1981. A taxonomic revision of Cassytha (Lauraceae)
in Australia. Journal of the Adelaide Botanic Gardens 3:
187–262.
Weber JZ. 2007. Cassytha. In: Wilson, AJG ed. Flora of
Australia, vol. 2: Winteraceae to Platanaceae. Collingwood:
ABRS/CSIRO, 117–136.
Wick RR, Schultz MB, Zobel J, Holt KE. 2015. Bandage:
interactive visualization of de novo genome assemblies.
Bioinformatics (Oxford, England) 31: 3350–3352.
Wu CS, Wang TJ, Wu CW, Wang YN, Chaw SM. 2017.
Plastome evolution in the sole hemiparasitic genus laurel
dodder (Cassytha) and insights into the plastid phylogenomics
of Lauraceae. Genome Biology and Evolution 9: 2604–2614.
Zhang C, Rabiee M, Sayyari E, Mirarab S. 2018. ASTRAL-III: polynomial time species tree reconstruction from
partially resolved gene trees. BMC Bioinformatics 19: 153.
Zhao ML, Song Y, Ni J, Yao X, Tan YH, Xu ZF. 2018.
Comparative chloroplast genomics and phylogenetics of nine
Lindera species (Lauraceae). Scientific Reports 8: 8844.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article at the publisher’s website:
Note S1. Background and notes on Lauraceae-specific barcodes.
Table S1. A summary of all taxa compared for plastid sequence variation including de novo sequences generated
for this study, and sequences retrieved from public sequence databases.
Table S2. Taxon sampling, material sources and GenBank accession numbers for de novo sequenced individuals.
Table S3. Plastid genome sizes of the 191 individuals.
Figure S1. ML tree of 191 whole plastid genomes.
Figure S2. NJ tree of 191 whole plastid genomes.
Figure S3. ML tree of concatenated genes extracted from 191 whole plastid genomes.
Figure S4. NJ tree of concatenated genes extracted from 191 whole plastid genomes.
Figure S5. ML tree of the Lauraceae-specific barcode matrix (ycf1+ndhH–rps15+trnL–ycf2) extracted from 191
whole plastid genomes.
Figure S6. NJ tree of the Lauraceae-specific barcode matrix (ycf1+ndhH–rps15+trnL–ycf2) extracted from 191
whole plastid genomes.
Figure S7. ML tree of the standard barcode matrix (rbcL+matK+trnH–psbA) extracted from 191 whole plastid
genomes.
Figure S8. NJ tree of the standard barcode matrix (rbcL+matK+trnH–psbA) extracted from 191 whole plastid
genomes.
Figure S9. ML tree comparisons of 191 whole plastid genomes and corresponding concatenated genes.
Song Y, Dong W, Liu B, Xu C, Yao X, Gao J, Corlett RT.
2015. Comparative analysis of complete chloroplast genome
sequences of two tropical trees Machilus yunnanensis and
Machilus balansae in the family Lauraceae. Frontiers in
Plant Science 6: 662.
Song Y, Yao X, Liu B, Tan Y, Corlett RT. 2018. Complete
plastid genome sequences of three tropical Alseodaphne
trees in the family Lauraceae. Holzforschung 72:
337–345.
Song Y, Yao X, Liu B, Tan Y, Gan Y, Yang J, Corlett RT.
2017a. Comparative analysis of complete chloroplast
genome sequences of two subtropical trees, Phoebe sheareri
and Phoebe omeiensis (Lauraceae). Tree Genetics & Genomes
13: 120.
Song Y, Yao X, Tan Y, Gan Y, Corlett RT. 2016. Complete
chloroplast genome sequence of the avocado: gene
organization, comparative analysis, and phylogenetic
relationships with other Lauraceae. Canadian Journal of
Forest Research 46: 1293–1301.
Song Y, Yu WB, Tan Y, Liu B, Yao X, Jin J, Padmanaba M,
Yang JB, Corlett RT. 2017b. Evolutionary comparisons of
the chloroplast genome in Lauraceae and insights into loss
events in the magnoliids. Genome Biology and Evolution 9:
2354–2364.
Song Y, Yu WB, Tan YH, Jin JJ, Wang B, Yang JB,
Liu B, Corlett RT. 2020. Plastid phylogenomics improve
phylogenetic resolution in the Lauraceae. Journal of
Systematics and Evolution 58: 423–439.
Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES,
Fischer A, Bock R, Greiner S. 2017. GeSeq - versatile
and accurate annotation of organelle genomes. Nucleic Acids
Research 45: W6–W11.
Trifinopoulos J, Nguyen LT, von Haeseler A, Minh BQ.
2016. W-IQ-TREE: a fast online phylogenetic tool for