Introduction

In the past, hybridization has often been viewed as a lapsus of nature. Under the biological species concept (Mayr 1942), reproductive isolation has been regarded as a phenomenon defining species boundaries and a basic prerequisite for the persistence of species. With the advent of molecular methods in systematics and evolutionary studies, it became obvious that hybridization is rather a common process; one quarter of plant species and one-tenth of animal species are supposed to be involved in interspecific crossing (Arnold and Hodges 1995; Mallet 2005, 2007).

Viability and fertility of F1 hybrids and backcrosses in comparison with parental species may vary. In wild outcrossing and mixed-mating plants, average fitness is typically lower in hybrids than in parental taxa, except in specific habitats (Rieseberg et al. 1999), but this is not always the case. Hybrids may both profit from somatic or adaptive heterosis (Miller et al. 2015) and suffer from outbreeding depression (Ellison and Burton 2008; Rosas et al. 2010; Sambatti et al. 2008), or the hybrid vigor is just transient and hybrid breakdown occurs in later generations (Johansen-Morris and Latta 2006; Johnson et al. 2010; Latta et al. 2007). In general, hybrid fitness can vary between environments (Leinonen et al. 2011), and hybrids and backcrosses may eventually outperform pure species (Arnold and Martin 2010; Arnold et al. 2012; Dagilis et al. 2019). Hybrid zones appearing at the secondary contact of divergent genetic lineages represent natural laboratories allowing to evaluate how gene flow and selection regulate the spread of genetic material across species boundaries (Lexer et al. 2007; Rieseberg et al. 1999). Genes can be exchanged between lineages at different rates ranging from zero (or almost zero) for the genomic regions involved in reproductive isolation to free (or almost free) exchange for the selectively neutral segments or parts of the genome under selective pressures driven by factors common for both lineages (Barton 1979; Capblanq et al. 2020; Hamilton et al. 2013; Harrison 1993).

Hybridization is commonly understood as gene exchange between different species, or at least recognized subspecific taxa. However, mating between divergent lineages within a species may lead to similar consequences as mating between closely related species (Burke and Arnold 2001; Keller and Taylor 2010; Lucek et al. 2010).

The complex of beech taxa in western Eurasia (Fagus sylvatica L. sensu lato) is a good model for the study of hybridization and introgression at the subspecific level. Native old-growth forests distributed across the whole range from the Carpathians through Dinaric mountains, Pontic mountains to the Alborz, where beech trees reach heights over 45 m, document ecological stability and production potential of beech-dominated forest communities (Holeksa et al 2009; Leibundgut 1982; Tabari et al. 2005). Although in some regions beech suffers from the ongoing climate changes (Czúcz et al. 2011; Leuschner 2020), it continues to expand in mixed montane forests on the expense of coniferous species (Keren et al. 2014). Currently, Flora Europaea (Akeroyd 1993) recognized beech populations in western Eurasia as a single species, Fagus sylvatica L. However, as a taxon with very large and partly discontinuous range, it has been object of many botanical, phytosociological and taxonomical studies, resulting in the description of several subspecific taxa, although the opinions about their number and rank are divergent, sometimes even controversial. Recent systematic revisions (Denk 1999a, b) confirmed the view of Greuter and Burdet (1981), who classified western Eurasian beeches in two subspecies: F. sylvatica subsp. sylvatica and F. sylvatica subsp. orientalis. Nevertheless, the latter subspecies is regarded as a separate species F. orientalis Lipsky not only by local botanists and foresters (Gulisashvili et al. 1975; Mayer and Aksoy 1986; Mobayen and Tregubov 1970) but also authoritative sources such as the Plants of the World Online database (https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:295615-1) or World Flora Online (WFO 2023). For brevity, the subspecies are hereafter labeled as “F. sylvatica” and “F. orientalis,” respectively. Moreover, a plenty of local or regional populations were described as separate taxa, mostly on the subspecies or variety level (for a historical overview, see Denk et al. 2002); among them, the names “F. moesiaca (K. Malý) Czeczott” and “F. taurica Popl.” are widely used by local botanists for the populations distributed in the southern Balkans and the Crimean Peninsula, respectively. Also, beech in the Caucasus and the Alborz Mountains was described as a separate species F. hohenackeriana Palib., but this name is rarely used.

Such a plethora of names is by far not exceptional in botanical taxonomy and is partly justified. The range of beech is discontinuous, especially in the eastern part, and the hitherto genetic studies have documented that individual regional populations are highly differentiated. The range of F. orientalis is divided into three fragments: the Alborz, the Caucasus and Asia Minor (the last reaching into the southeastern Balkans), which are strongly differentiated from each other, both genetically and morphologically (Denk 1999a; Denk et al. 2002; Gömöry et al. 2007). Phylogeny reconstruction based on allozyme genes even indicated that the split between the Iranian population and the remaining two regional populations of F. orientalis occurred earlier than the divergence of F. sylvatica from F. orientalis (Gömöry et al. 2018). The range of F. sylvatica is restricted to Europe. However, Pleistocene glaciations led to a disruption of the distribution range and a subsequent differentiation of populations surviving in glacial refugia also in this species. Genetic studies documented at least four refugial areas serving as sources for the Holocene recolonization: Slovenia/Istria, southern Balkans, Apennine Peninsula and Massif Central, along with several other refugia which have not expanded at the onset of the Holocene (Magri et al. 2006). While most sutures between expanding refugial populations were obliterated, and a nearly continuous distribution has been restored in some regions (western Balkans, southeastern France), big rivers such as the Danube or the Po still represent barriers for migration by seeds. Contact of populations with different gene pools offers a potential for the formation of hybrid zones, where gene flow by pollen leads to hybridization between genetic lineages and introgression. Western Balkans is an example of such a hybrid zone, where gene frequencies change clinally from Slovenia down to western Turkey (Gömöry et al. 1999).

This study focused on the identification of the extent and direction of gene exchange among different parental lineages within putative hybrid zones. At the same time, we tried to identify SNP loci with patterns of introgression deviating from most of the genome as a consequence of adaptation or reproductive isolation.

Materials and methods

Experimental material

Material for this study was partly collected from natural populations (especially F. orientalis) and partly in international provenance experiments with European beech. Provenance trials included two Slovak plots of the experiment coordinated by the Institute of Forest Genetics of the J. H. von Thünen Institute in Grosshansdorf, Germany, which was established in two series in 1995 and 1998 with 2-year-old seedlings (100 and 32 provenances covering a major part of the distribution range of beech in Europe were planted at the localities Vrchdobroč and Tále, central Slovakia, respectively). At these plots, material from 84 and 15 provenances, respectively, was collected for this study. Another provenance trial serving as a source of material was a plot of the international experiment with southeast European provenances at the locality Medvednica, Croatia, where 19 provenances were sampled. The remaining material was collected in natural populations (Fig. 1; detailed information on populations is provided at https://doi.org/10.5281/zenodo.7794256).

Fig. 1
figure 1

Distribution of the analyzed specimens: natural populations (yellow), provenance experiment Vrchdobroč (red), provenance experiment Tále (black), provenance experiment Medvednica (light blue). (Color figure online)

DNA extraction and sequencing

Twigs with dormant buds and leaves were collected from trees growing in provenance trials and natural populations, respectively. Total genomic DNA was extracted from the silica-dried leaves or buds using a modified CTAB protocol following Doyle and Doyle (1987). DNA concentration and quality were assessed with a Qubit 4 Fluorometer (Thermo Fisher, Waltham MA, USA) and NanoDrop (Thermo Fisher, Waltham MA, USA), and samples were sent for sequencing to IGA Technology Services, Udine, Italy.

Double-digest restriction-site-associated DNA sequencing (ddRAD) was applied to reveal single-nucleotide polymorphisms (SNP) in the nuclear genome. ddRAD libraries were produced by IGA Technology Services using a custom protocol, with minor modifications with respect to Peterson’s double-digest restriction-site-associated DNA preparation (Peterson et al. 2012). 300 ng of genomic DNA was double-digested with 2.4 U of both SphI and MboI endonucleases (New England Biolabs) in 30 µl reaction supplemented with CutSmart Buffer and incubated at 37 °C for 90 min and then at 65 °C for 20 min. Fragmented DNA was ligated with 180 U of T4 DNA ligase (New England BioLabs, Ipswisch MA, USA) to 2.5 pmol of overhang barcoded adapter for rare cut sites and to 5 pmol of overhang barcoded adapter for frequent cut sites in 50 µl reaction incubated at 23 °C for 60 min and at 20 °C for 60 min followed by 20 min at 65 °C. Samples were pooled and beads-purified. Targeted fragments distribution was collected on BluePippin instrument (Sage Science, Beverly MA, USA) setting the range of 450–700 bp. Gel-eluted fraction was amplified with indexed primers using Phusion High-Fidelity PCR Master Mix (New England BioLabs) in final volume of 50 µl and subjected to the following thermal protocol: [95 °C, 3 min] – [95 °C, 30 s – 60 °C, 30 s – 72 °C, 45 s] × 10 cycles – [72 °C, 2 min]. After cleanup, libraries were checked with both Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA) and Bioanalyzer DNA assay (Agilent Technologies, Santa Clara, CA, USA). Libraries were sequenced with 150 cycles in paired-end mode on NovaSeq 6000 instrument following the manufacturer’s instructions (Illumina, San Diego, CA, USA).

Data analysis

The initial bioinformatic analyses performed by IGA Technology Services consisted of demultiplexing of raw Illumina reads using the process_radtags utility included in Stacks v2.4 (Catchen et al. 2013). Alignment to the reference F. sylvatica genome (version 2 scaffold level assembly; Mishra et al. 2018; http://thines-lab.senckenberg.de/beechgenome/) using BWA-MEM (Li and Durbin 2009) with default parameters and selection of uniquely aligned reads (with a mapping quality > 4). Detection of the covered loci from the aligned reads using the gstacks program in Stacks v2.61. Filtering of detected loci using the populations program in Stacks v2.61, with option -R = 0.75 in order to retain only loci that are represented in at least the 75% of the whole metapopulation. To remove loci with possible technical errors, a cut-off max-obs-het = 0.8 was used in order to process a nucleotide site at a locus with observed heterozygosity at maximum of 80%. After the initial bioinformatic analyses in IGA Technology Services, we additionally filtered the SNPs based on several criteria: quality value Q > 13 (Q = 26 was the actual lowest value); markers with missing data in > 20% samples of any of the two species; samples with > 20% missing data; samples with read depth deviating ± 3 SD from average; < 5% minor allele frequency (MAF); and read depth further than ± 2 standard deviations from the mean. One randomly chosen SNP per contig was retained for further analysis. In total, 6789 SNP loci passed filtering; they are quite regularly distributed across the genome, maybe with the exception of chromosome 1 with a generally lower density of SNPs (Fig. 2a).

Fig. 2
figure 2

Density of SNPs per 1 Mbp distributed over 12 reference chromosomes: a all 6789 SNPs, b the 220 SNPs exhibiting allele frequency difference Δp > 0.85 between F. sylvatica and F. orientalis (b). Green points denote the positions of outlier SNPs. (Color figure online)

To reveal potential substructure, the Bayesian clustering algorithm implemented in STRUCTURE version 2.3.4 (Pritchard et al. 2000) was applied to cluster individuals into K groups, employing the admixture model with correlated allele frequencies. K values were tested from 1 to 10. A burn-in period of 50,000 iterations was followed by 100,000 iterations for estimation of the membership coefficients. Fifteen independent Markov chains were run for each K. The most likely number of clusters K was inferred from plotting log posterior probability against K, as well as the ΔK measure of Evanno et al. (2005) (Fig. 3a). Clusters were aligned using CLUMPP 1.1.2 (Jakobsson and Rosenberg 2007) with the Greedy algorithm testing 1000 random input orders.

Fig. 3
figure 3

Results of the Bayesian Structure analysis superimposed over the map of Europe and the eastern Mediterranean: a Assessment of the number of groups in the Bayesian analysis (Evanno et al. 2005), b inferred cluster membership proportions at K = 2, c cluster membership proportions at K = 3 (c), d cluster membership proportions at K = 5

Bayesian genomic clines (bgc; Gompert and Buerkle 2011) were used to analyze locus-specific ancestry using the bgc software (Gompert and Buerkle 2012). To define parental reference populations, we relied on earlier experience from allozyme and chloroplast DNA studies (Gömöry et al. 1999, 2007; Magri et al. 2006). For the putative hybrid zone in the Balkans, populations in Asia Minor were considered to be the F. orientalis parental population (12 individuals), while populations located close to the putative glacial refugium in Slovenia as well as those located farther to the North and East were considered pure F. sylvatica parental population (63 individuals). To be on the safe side, we checked whether the summary proportion of the clusters typical for either side of the hybrid zone under K = 5 exceeded 0.95; all samples in the parental zones fulfilled this criterion. In accordance with Denk (1999b) and Gömöry et al. (1999), beech populations in between (including Strandzha Mts. in eastern Bulgaria, usually classified as F. orientalis; c.f. Denk 1999a; Kandemir and Kaya 2009; Tzonev et al. 2006) were regarded as a broad introgression zone with increasing proportion of F. orientalis genes toward the southeast (admixed population; 39 individuals). For the putative hybrid zone between populations originated from glacial refugia in the Apennine peninsula and those originated from refugia in the foothills of Eastern Alps, the Po valley was considered a migration barrier (Magri et al. 2006) and gene flow (possibly bidirectional) was expected to have been accomplished through beechwoods in Western Alps. Therefore, the whole region comprising the Alps, their southern, western and northern foothills and northern Apennines (north of 43°N) were considered the admixed population (21 individuals), while the rest of the Apennine peninsula as well as samples from western Germany, the Netherlands and Luxembourg were taken as reference populations (10 and 27 specimens, respectively). Samples included in each of the hybrid zones and samples removed from the analysis because of missing data are listed in supplementary materials (Table S1).

Since each of these two hybrid zones consisted a subset of individuals from the total dataset, we again removed markers with < 5% minor allele frequency for each of these two sets of individuals. In the case of the Balkans hybrid zone 5038 SNPs were retained, while in the case of Apennine peninsula hybrid zone 4206 markers were left. Hybrid index was calculated using the bgc software (Gompert and Buerkle 2012). Analysis was run with 400,000 burn-in MCMC steps followed by 100,000 subsequent iterations and a thinning interval of 100. Genomic clines were assessed for SNPs with an allele frequency differential between parental populations Δp > 0.85; the distribution of these SNPs over the reference chromosomes was displayed using the R-package RIdeogram (Hao et al. 2020). Two parameters were of interest, namely genomic cline center (α) and cline rate (β) (Gompert and Buerkle 2011). Cline center indicates the direction of introgression, where outliers indicate excess ancestry from one or the other reference population. Cline rate is an indicator of the progression of SNP frequency change; a steeper cline indicated by positive values means a sharp transition of ancestry from one reference population to the other. Point estimates and credible intervals for parameters were generated using the software estpost (Gompert and Buerkle 2012). Outliers were considered significant if the 95% credible intervals for α or β did not overlap zero (Gompert and Buerkle 2011). To assess the sensitivity of the analysis of genomic clines, we performed resampling with replacement of individuals within each group (two reference populations and hybrid zone); 100 bootstrap replicates were created using an ad hoc R-script and the proportions of significant values were used as a support of the calculated genomic cline parameters. Inter-lineage heterozygosity (He, the proportion of loci where alleles originate from different parental populations) was also calculated using bgc. However, as the authors recommend calculation of He only if most loci are fixed or nearly fixed in the parental populations, the dataset was reduced to 220 SNPs, for which the allele frequency differential between parental populations exceeded 0.85. As this criterion was not fulfilled for any SNP in the Alpine hybrid zone, He was calculated only for the Balkans.

Basic diversity and differentiation statistics (nucleotide diversity Pi, Nei 1987; coefficient of differentiation FST, Hudson et al. 1992; pairwise net number of nucleotide substitutions per site between populations Da, Nei 1987) were calculated using DnaSP v. 6.12 (Rozas et al. 2017) for the subspecies, the Balkan hybrid zone and the Alpine hybrid zone.

Results

Population Structure

Basic summary statistics showed that nucleotide diversity is higher in F. orientalis compared to F. sylvatica (Table 1). The subspecies are also strongly differentiated (FST = 0.4722). However, the differentiation among regional populations of F. orientalis (Asia Minor, Caucasus, Alborz) is almost equally strong (FST = 0.4256). This pattern was partly confirmed also in the Balkan hybrid zone: nucleotide diversity was the highest in the reference F. orientalis population, but differentiation between parental populations belonging to different subspecies was substantially weaker (FST = 0.2677); it must be reminded that the reference F. orientalis population comprised only Turkish samples, excluding more divergent Caucasian and Iranian populations. Expectedly, the differentiation between the Apennine and Central European lineages in the Alpine hybrid zone was the weakest (FST = 0.0276). The pairwise net numbers of nucleotide substitutions per site exhibited a similar pattern. The population in Asia Minor is strongly divergent, while the distance between the Alborz and the Caucasus is not so pronounced. In both hybrid zones, parental populations showed the highest divergence, while admixture zones were expectedly less differentiated.

Table 1 Basic statistics of nucleotide diversity and differentiation

The ΔK measure after Evanno et al. (2005) clearly peaked at K = 2. However, a secondary peak at K = 5 and the course of the log posterior probability showing a decrease after K = 6 indicate that a substructure with higher numbers of clusters is also possible (Fig. 3a). The distribution of cluster proportions at K = 2 corresponded partly with the current taxonomical view of two subspecies: one cluster clearly dominated in Europe (including western Balkans), the other one in the Caucasus and the Alborz Mts, while populations classified as F. orientalis in Asia Minor and eastern Bulgaria showed a mixture of both gene pools (Fig. 3b). At K = 3, F. sylvatica remained a separate group, while F. orientalis was split into western Anatolian and east Bulgarian specimens on one side, Caucasian and Iranian specimens on the other side, and populations from the eastern part of the Pontic range and Amanus Mts. in southern Turkey containing a mixture of both clusters (Fig. 3c). With higher numbers of clusters, this Structure was retained, and additional gene pools appeared within the range of F. sylvatica (Fig. 3d), namely a cluster (black pies) with proportions of 8–27% in the specimens from the Apennines (but also dispersed across the southern and western Balkans, southern Alps and the Carpathians), and another cluster present in western Europe (Spain, France, Switzerland, Germany), but again with proportions not exceeding 15% (green pies). Increasing the number of clusters K to 6 or more did not bring any interpretable pattern (data not shown). In contrast to expectations relying on earlier studies (Gömöry et al. 2007; Kurz et al. 2023), the population from the Alborz Mts. (Iran) was not distinguished from the remaining F. orientalis regional populations at any K. In one specimen from Bosnia (Tomislavgrad), the admixture proportion (Q-value) of the cluster predominating in Turkey and eastern Bulgaria (F. orientalis) was unusually high, in spite of being located within the range of F. sylvatica. As Bosnia had been a part of the Ottoman Empire until 1905, we suspect the parental population of this specimen to be artificially introduced; therefore, the sample was excluded from the assessment of genomic clines.

Hybrid zones

In the putative Balkan hybrid zone, hybrid index changes very abruptly at the boundary between F. sylvatica and F. orientalis in eastern Bulgaria. The four specimens classified as F. orientalis showed hybrid indices close to one, i.e., absolute dominance of F. orientalis genomes (hi > 0.85). On the other hand, in the remaining Bulgarian and western Balkan beech populations classified as F. sylvatica (or F. moesiaca) hybrid indices range between 0.002 and 0.412, with a clear trend of increase toward the boundary with F. orientalis (Fig. 4a). Interspecific heterozygosities showed a partly similar pattern (Fig. 5): High heterozygosities were observed in the Bulgarian F. sylvatica populations (especially those geographically proximate to the range of F. orientalis), and rapidly decrease northwestwards, being close to zero in the Croatian specimens. However, low interspecific heterozygosity was also observed in the Bulgarian F. orientalis.

Fig. 4
figure 4

Distribution of hybrid indices across the hybrid zones: a Balkan hybrid zone, b Alpine hybrid zone. White and black diamonds: parental reference populations

Fig. 5
figure 5

Distribution of inter-lineage heterozygosity across the Balkan hybrid zone

Out of 5038 SNPs studied in the Balkan hybrid zone, 220 showed an allele frequency differential between reference populations of more than 0.85. Their distribution over the reference chromosomes was quite irregular (Fig. 2b); none was found on the longest chromosome 1 and just 4 on chromosome 5, while most of them were concentrated on chromosomes 9, 11 and 12 (33, 30 and 24, respectively). Seven SNPs exhibited genomic cline center α significantly differing from zero. Excess F. orientalis ancestry (α < 0) was identified for six SNPs; bootstrap support for α was over 80% in all cases. In addition, another SNP showed significantly outlying cline center with the highest bootstrap support (95%), but allele frequency differential in this case was 0.73 only. Excess F. sylvatica ancestry (α > 0) was found for one SNP, but the bootstrap support was quite small in this case (56%). Genomic cline rate β was non-significant for all loci, indicating a uniform genomic cline steepness across the genome (Table 2). Outlier SNPs were distributed across five reference chromosomes; three of them are localized in exons, two in introns, and three outside a gene—in these cases, Table 2 gives the localization and description of the most proximate genes.

Table 2 List of SNPs identified as genomic cline outliers and genomic clines parameters

In the hybrid zone crossing the Alps, the pattern is quite different. Even in the northernmost specimens from the Apennines hybrid index does not exceed 0.5 (Fig. 4b). Relatively high values (hi > 0.25) were also observed in the southern and southwestern foothills of the Alps and rapidly decreased toward southern Germany. Allelic frequency differential between reference populations does not exceed 0.65 for any locus; therefore, interspecific heterozygosity (in this case ‘inter-lineage heterozygosity) was not calculated. Even when we gave up the Δp > 0.85 criterion for this hybrid zone, none of genome cline centers (α) or rates (β) differed significantly from zero.

Discussion

Population Structure

Whatever number of clusters (K) was chosen for the Bayesian analysis of population Structure, the outcomes diverged from earlier studies relying on chloroplast, allozyme or microsatellite markers. The most striking difference is a lack of differentiation between the Iranian and the Caucasian populations. All studies covering the whole range of F. orientalis, both allozyme-based (Gömöry et al. 2007) and nuclear microsatellites-based (Bijarpasi et al. 2020; Budde et al. 2023; Kurz et al. 2023) clearly separated populations from the Alborz Mts. from all remaining F. orientalis. The same applies to morphometry based on leaf and fruit traits (Denk 1999a, b; Denk et al. 2002). A phylogeny simulation relying on approximate Bayesian computations and using allozyme data even suggested the Iranian beech to be the earliest diverged regional subpopulation within the whole complex of western Eurasian beeches (Gömöry et al. 2018). Another contrast to earlier studies is a lack of clear differentiation between east-Anatolian and Caucasian beech populations, observed both using allozymes (Gömöry et al. 2007) and microsatellite markers (Kurz et al. 2023). Differentiation statistics conformed well with the outcomes of the Structure analysis: A strong differentiation was observed among F. orientalis regional populations (comparable to differentiation between subspecies), suggesting that the more distant beech populations in the Caucasus and the Alborz did not contribute to the recent gene exchange with F. sylvatica; at the same time, these two regions do not strongly differ from each other. In any case, such a strong differentiation within F. orientalis implies that the population Structure best supported by ΔK (Evanno et al. 2005) is misleading, while the Structures at higher numbers of clusters than K = 2 better reflect biological reality.

In Europe, the results of the Structure analysis conformed better with what is currently known about the Quaternary history of beech (Giesecke et al. 2007; Magri et al. 2006; Magri 2008; Vettori et al. 2004). At K > 3, a substructure within F. sylvatica was revealed. Even though one main cluster remained predominant across the distribution range, two other clusters appeared at low frequencies, one shared by the Balkan and Italian populations and another one represented north and west of the Alps. Sharing a substantial portion of the genome is not surprising, as all European beech populations belong to the same subspecies, which has been distributed across a major part of the continent during most of the Pleistocene (Denk and Grimm 2009), and the current differentiation is mainly due to range fragmentation during the last glacial (Magri et al. 2006). Most of the range was colonized from a single refugium localized in eastern foothills of the Alps (Magri et al. 2006). The presence of a specific gene pool in the Apennine Peninsula shared with southern Balkans may reflect both adaptation to the Mediterranean climate and gene exchange during earlier phases of the Pleistocene. Sharing of gene pools between the southern European peninsulas was found also in other plants (Musacchio et al. 2006). Similarly, the admixture of another gene pool in the western part of the F. sylvatica range is probably related to the existence of glacial refugia in the Massif Central and the Rhône valley (Magri 2008).

Gene exchange between divergent genetic lineages

Even when the outcomes of the Bayesian population Structure analysis did not properly show it, there is enough paleobotanical and genetic evidence for the existence of a number of highly divergent genetic lineages within F. sylvatica s.l., potentially connected by gene flow. As shown by Postolache et al. (2021), the distribution of geographical barriers to gene flow (mountain ranges, seas) does not fully overlap with potential boundaries between refugial areas; in spite of this, hybrid zones may be much more numerous than just the two addressed by this study. However, a limited number of available samples and irregular sampling density (especially within the range of F. orientalis) forced us to focus on just the Balkans and the Alpine area. The results of the Structure analysis were partly disappointing from the point of view of the definition of reference parental populations and delineation of hybrid zones for the study of introgression. For this purpose, we finally decided to combine the patterns shown by Structure with earlier knowledge about the distribution of chloroplast haplotypes or lineages defined by nuclear markers (Vettori et al. 2004; Magri et al. 2006; Gömöry et al. 2007).

The information about the preferential direction of inter-taxon gene flow is not without ambiguity. A study in Greek and Turkish beech populations (Müller et al. 2019) suggested a very narrow hybrid zone. The study identified only one population with a noteworthy level of admixture, namely a Greek F. orientalis population, while in all F. sylvatica populations admixture was negligible; this implies gene flow directed primarily into F. orientalis. The comparison of the distributions of hybrid index and interspecific heterozygosity in our study also indicates that in the southeastern Balkans, interspecific gene flow seems to be quite limited and asymmetric, but with much more intense gene flow from F. orientalis to F. sylvatica than vice versa. This confirms our earlier observations with allozyme markers, when frequencies of several alleles changed abruptly between F. orientalis and F. sylvatica at very short distances and subsequently changed clinally toward central Europe (Gömöry et al. 1999, 2007). Hypothetically, a phenological gap may hamper gene exchange in this area. In fact, we have no exact data about differentiation in floral phenology among beech populations resulting from direct observations. However, in beech, the development of vegetative and generative Structures within the annual cycle is synchronized: Male and female flowers appear concurrently with bud flushing (Packham et al. 2012). Spring phenology observations are thus relevant for assessing the floral phenology overlap between beech subspecies. Several provenance experiments contain a mixture of F. sylvatica and F. orientalis provenances. In general, they show that F. orientalis provenances belong to early flushers (Robson et al. 2011; Gömöry and Paule 2011); however, delay in flowering of eastern F. sylvatica provenances is generally small. Recently, Kurz et al. (2023) assessed spring phenology of the introduced F. orientalis individuals admixed in German and Swiss beechwoods and found a phenological gap of more than 4 days. Such temporal difference is too small to hamper gene flow from F. orientalis to F. sylvatica in mixed stands, where both subspecies grow under identical climate. However, it remains questionable to what extent these observations are relevant to the potential of gene exchange between subspecies in natural populations. Even though in the contact zone (eastern Bulgaria and Greece), the stands of both subspecies grow close to each other, F. orientalis occurs typically at low to medium altitudes in warm and dry habitats related to deciduous Quercus forests, while F. sylvatica populations grow at higher elevations in moderate to cool climates (Bergmeier and Dimopoulos 2001). A contrasting floristic composition is also a strong indicator of a difference in climate requirements (Tzonev et al. 2006). Temporal shift of flowering between subspecies caused by climatic contrast linked along with a lack of stepping stones for gene flow may thus be much larger in situ than in artificially established stands. Given a slight protogyny of beech (Nielsen and Schaffalitzky de Muckadell 1954), earlier male flowering of F. orientalis induced by warmer microclimate combined with protogynous mating in the proximate F. sylvatica populations may explain the observed unidirectional gene flow. Excess F. orientalis ancestry was found in six cases, also confirming that gene exchange between subspecies is not equal in both directions. Asymmetry of hybridization rates is not exceptional in the Fagaceae, it was also observed in mixed populations of Quercus robur and Q. petraea. Different fertility rates were also observed in controlled crosses; unequal success of artificial crossing in different directions suggests genetic incompatibilities (Aas 1990; Steinhoff 1993), as factors relevant in natural populations such as phenological gap, differences in pollen cloud density, spatial setup, etc., are eliminated in this case. For beech, no such direct evidence coming from crossing experiments is available. Experience with the occurrence of hybrids in the plantations of F. orientalis in central Europe gives just partial information about the frequency and direction of hybridization. Kurz et al (2023) identified a higher proportion of hybrids in the vicinity of F. orientalis maternal trees than under F. sylvatica trees, but they admit that this may be due to a dominance of European beech in the pollen cloud. Budde et al. (2023) considered interspecific gene flow to be bidirectional, but most hybrids identified based on nuclear nSSRs in their study had chloroplast haplotypes of F. sylvatica, indicating the same asymmetry as that observed in our study.

Nevertheless, genetic isolation remains an alternative explanation of the observed introgression patterns. The outliers in genomic clines imply the existence of some type of genetic reproductive barriers. We found eight loci with genomic cline centers significantly shifted compared to neutral expectations, out of which three SNPs are localized in exons, thus potentially affecting functionality of gene products. The functional annotation of loci is known, but detecting the mechanisms underlying their contribution to reproductive isolation between subspecies is hardly possible. Even if we knew the exact function of the expressed proteins, it would be difficult to assess how exactly they contribute to physiological traits affecting fitness of inter-lineage hybrids. Irregular distribution of the outlier SNPs and their concentration on chromosomes showing high numbers of highly differentiated loci suggest that certain regions of the beech genome may resist introgression more than others (Mallet 2005; Gompert and Buerkle 2011).

In the case of the Alpine area, it is difficult to talk about a true hybrid zone. The populations on both sides of the Po valley are classified as the same subspecies. Even when both nuclear and chloroplast markers showed that the genetic lineage (more exactly, the mixture of genetic lineages) spreading across the Apennine Peninsula is differentiated from the rest of the distribution range (Magri et al. 2006; Vettori et al. 2004), there is no indication that they are reproductively isolated, even partly.

Implications for forestry

Under the progressing climate change, European beech has become a species with uncertain future because of a limited drought tolerance (Leuschner 2020; Martinez del Castillo et al. 2022; Rasztovits et al. 2014). Transfer of putatively pre-adapted forest reproductive materials from southern sites (assisted migration) and future introgression of established populations with the local ones (assisted gene flow) are often suggested as mitigation measures (Aitken and Whitlock 2013). In the case of beech, the southern European peninsulas (Iberian, Apennine, Balkan) or even F. orientalis are suggested as donor regional populations (Mellert and Šeho 2022; Kurz et al. 2023; Petrík et al. 2023). The hitherto experience from provenance experiments or plantations of F. orientalis in central Europe justifies this concept at least partly; on most sites, the performance of southern provenances or F. orientalis in large-scale international experiments or in the plantations is similar or at least not substantially worse compared to the local ones (Alía et al. 2011; Bogunović et al. 2020). Nevertheless, assisted gene flow requires that the introduced provenances hybridize with the local beech. Along with the first information about the extent of hybridization and survival of hybrids in juvenile stages provided recently by Kurz et al. (2023) and Budde et al. (2023), our results suggest that hybridization and introgression between divergent lineages within F. sylvatica s.l. have been ongoing since the contact between them was established, without detrimental effects on fitness.