- © 2014 The Paleontological Society. All rights reserved.
The record of the taxonomic evolution of North American ungulates is critical to our understanding of mammalian evolution and environmental change throughout the Cenozoic. The distribution of sampling in the ungulate fossil record over time and geographic space and the degree to which this biases the observed patterns of taxonomic evolution is poorly understood. To address these issues, I placed fossil collections and occurrences drawn from the Paleobiology Database into 2-Myr time intervals between 55 and 1 Ma. I determined the variation in numbers of fossil collections and occurrences, using three metrics to measure geographic variation: first, the area of the convex hull containing all collections in an interval, to determine the areal coverage of sampling; second, the mean pairwise geographic distance among collections as a measurement of the dispersion of collections within that area; and third, the interval-to-interval migration of the geographic centroid of all collections, to calculate changes in the geographic location of sampling. Each of these showed considerable variation over the Cenozoic, and both the area of the convex hull (ACH) encompassing all collections in an interval, and mean pairwise distance (MPWD) among them showed increasing trends over time.
To minimize the effect of variation in numbers of fossil samples over time, I used standard sample-standardization procedures. To minimize the effect of geographic variation in sampling over time, I standardized the area of sampling among intervals. I also employed both standardizations sequentially. Each standardization procedure had surprisingly little effect on observed patterns of taxonomic richness and rates. This indicates that, for North American ungulates, neither variation in number nor geographic distribution of fossil samples exerts an overwhelming influence on perceived macroevolutionary patterns. These results confirm the ungulate fossil record as a critical and faithful record for our understanding of Cenozoic environmental change and the mammalian evolutionary response.
Mammalian taxonomic richness (i.e., tallies of species, genera, or other higher taxa) is frequently used as a currency with which to measure evolutionary and ecological responses to environmental change (e.g., Prothero 1999; Alroy et al. 2000; Barnosky and Carrasco 2002). In particular, the fossil record of mammalian ungulates (Orders Artiodactyla and Perissodactyla) remains among the most compelling lines of evidence of environmental change in the Cenozoic and of the faunal response (e.g., Matthew 1926; Janis 1993; Webb et al. 1995; Jacobs et al. 1999; MacFadden 2000; Janis et al. 2002a; Bobe and Behrensmeyer 2004). The prominent role of ungulates is a manifestation of their broad geographic distribution, the relative abundance of their fossils, their relatively continuous temporal record, and the demonstrably close relationship between ungulate morphology and habitat (e.g., Kappelman 1988; Janis 1990, 1995; Spencer 1995; Mendoza et al. 2002; DeGusta 2005). For example, in North America, recent studies have cited a peak in ungulate richness in the middle Miocene as evidence of an ecologic and evolutionary response to vegetational (Janis 2000a; Janis et al. 2002a) and tectonic change (Kohn and Fremd 2008).
The determinants of taxonomic richness of living mammals within assemblages or geographic regions remain under intense study by ecologists (e.g., Badgley and Fox 2000; Isaac et al. 2005; Hortal et al. 2008). On evolutionary time scales, forces that directly promote or inhibit the processes of lineage origination and/or extinction combine to influence taxonomic richness indirectly. Therefore, studying the link between environmental change and taxonomic evolution requires analysis of the underlying taxonomic rates, and not richness alone. Furthermore, different combinations of origination and extinction rates can produce indistinguishable patterns of taxonomic richness; patterns of taxonomic rates might better illustrate evolutionary responses to extrinsic forces that would be indiscernible if only patterns of richness were analyzed. In spite of this, rates of taxonomic evolution have received less attention in ungulates than taxonomic richness alone (with notable exceptions, e.g., Cifelli 1981; Hulbert 1993; Alroy 1996).
The degree to which the ungulate fossil record faithfully chronicles these macroevolutionary patterns is crucial to our understanding of the evolutionary response of animals to changing environments. Our perception of taxonomic richness and corresponding rates of taxonomic evolution can be distorted by disproportionate sampling among time intervals. All else being equal, intervals with higher sampling intensity are more likely to yield higher numbers of taxa, as well as first and last occurrences of taxa (Foote 2000). It has now become commonplace to use sample-standardization techniques to minimize the effect of sample-size variation among intervals in macroevolutionary studies (e.g., Alroy 1996, 2000, 2009; Alroy et al. 2001, 2008; Bush et al. 2004; Barnosky et al. 2005; Kohn and Fremd 2008; Peters and Ausich 2008). Such techniques draw samples of equivalent size from all time intervals then estimate taxonomic parameters from these more commensurate subsamples.
Variable sampling intensity among intervals is an obvious potential bias in our understanding of taxonomic evolution, but there are other possible biases. Since the first global tabulations of global richness of fossil marine macroinvertebrate taxa throughout the Phanerozoic (e.g., Raup 1976a), researchers have documented a strong correlation between geographic area of sampling and apparent taxonomic richness (e.g., Raup 1976b; Sepkoski 1976; Flessa and Sepkoski 1978). This correlation results both from biological factors that once determined the distribution of the organisms in life (e.g., the species-area relationship [e.g., MacArthur and Wilson 1967; Rosenzweig 1995]) and from subsequent geological, taphonomic, and sampling factors that determine the distribution of known fossil occurrences (Raup 1976b; Behrensmeyer et al. 2000). Recent studies demonstrate a positive relationship between the area over which fossil occurrences are distributed in an interval and sampled taxonomic richness in both the marine (Wall et al. 2009) and terrestrial (Barnosky et al. 2005) fossil records. Moreover, these studies suggest that much of the apparent temporal variation in taxonomic richness results solely from variation in the geographic area over which fossil occurrences are distributed, even after standardizing the numbers of samples among time intervals.
The goal of this study is to evaluate the fossil record of North American ungulates quantitatively, and to provide the most robust estimates to date of their macroevolutionary history. I first document variation in both the numbers and geographic distribution of fossil occurrences. I then attempt to standardize both of these simultaneously among time intervals, thereby reducing their distortion of the underlying biological patterns of ungulate taxonomic evolution.
Data and Methods
Taxonomic Occurrence Data
Data used in this study are derived from the Paleobiology Database (PaleoDB; http://paleobiodb.org). The data were downloaded on 15 January 2009; I used the group names “Artiodactyla” and “Perissodactyla,” and the following parameters: time intervals = 0 to 65 Ma, region = North America, paleoenvironment = terrestrial. I restricted this study to Artiodactyla and Perissodactyla because of their similar ranges of body size and ways of life (Van Valen 1971). Furthermore, they are very closely related clades (e.g., Springer et al. 2004; Matthee et al. 2007; Murphy et al. 2007) and share nearly contemporaneous dates of origin. Following previous studies (Alroy 1996, 2000, 2009), I constrained the geographic scope of this study to western North America, between 27.00°N and 55.70°N latitude, and 93.40°W and 126.98°W longitude (see Fig. 1 for a map of this region).
Individual collections have assigned maximum and minimum absolute ages in the PaleoDB (see Alroy 2000, 2009). To account for this uncertainty in collection ages, in all procedures described below, I performed 1000 pseudoreplicate analyses, and in each, I randomly assigned each collection a date from a uniform distribution within its designated age range. I randomly drew these age assignments in each pseudoreplicate analysis, so no two pseudoreplicates included exactly the same set of collection age estimates. For example, reported face-value richness and rates are based on median ages from 1000 pseudoreplicates of this collection dating procedure using the full data set of all sampled occurrences. The same collection dating procedure was also applied to standardization analyses (see below). The error reported for estimates of taxonomic richness and rates, therefore, includes uncertainty in collection ages.
I assigned each collection to a corresponding 2-Myr time interval. There are 32 such intervals between 1.0 and 65.0 Ma. Intervals with unequal durations (e.g., Cenozoic subepochs, stages, NALMAs) can distort estimates of richness and taxonomic rates (Foote 2000; but see Butler et al. 2011; Mannion et al. 2011). I included a terminal interval (1.0–0.0 Ma) for analysis to reduce the impact of “edge effects” (Foote 2000) and the “Pull of the Recent” (Raup 1979) would have on the last interval in my window of observation. Note that I do not report results from this interval because of the distortions these phenomena impart upon estimated evolutionary patterns. Henceforth, I will refer to individual intervals by their midpoint ages (e.g., 22 Ma).
I performed all analyses at the genus rather than species level for several reasons (see Sepkoski 1998), particularly, genera are more evenly sampled over time and space than species. I also used genera for practical reasons specific to this study. The choice of an optimal analytical resolution is a balance between making intervals short enough to distinguish true underlying biological patterns and keeping them long enough to maximize the number of occurrence data in each (see sensitivity analyses by Alroy 1996, 2009). The median stratigraphic range of a North American ungulate genus is roughly 3.6 Myr, whereas that of a species is only 0.75 Myr. The faithful documentation of species-level taxonomic evolution therefore would require the use intervals shorter than 1 Myr, or else render most taxa singletons (i.e., with first and last occurrences within an interval) and not readily amenable for analysis (see Foote 2000). Furthermore, such short intervals leave few occurrences in each, thus entailing a low sample-standardization quota. Lower quotas necessarily limit the maximum richness (and numbers of first and last occurrences) that can be observed in these intervals, thereby analytically distorting the resulting standardized patterns. Finally, there are, on average, only 2.5 species for each North American ungulate genus, further suggesting that genus-level analyses are not missing a large proportion of species-level dynamics. In spite of these justifications, I repeated all analyses using species (not shown), and their results are qualitatively similar, albeit with more apparent variation among intervals.
Measuring Geographic Variation of Sampling
There is no single metric that adequately characterizes the geographic distribution of fossil localities, and interval-to-interval changes thereof. Accordingly, I describe below three quasi-independent measurements easily calculated by using latitude and longitude coordinate data from fossil collections within the PaleoDB: (1) the area of the convex hull (ACH) encompassing all collections in an interval, (2) the mean pairwise distance between all collections in an interval, and (3) the migration of collection centroids from interval to interval (see Mohr 1947 for alternatives). I obtained latitude and longitude coordinates for all collections from the Paleo-DB; and excluded from analysis any collections lacking these coordinates. I calculated all linear distances between coordinate pairs as Great Circle distances measured in kilometers.
Unless otherwise stated, I performed all calculations and analyses for this study using programs I had written in Java. All source code and data sets are available upon request.
Area of Convex Hull
The convex hull (also commonly referred to as the minimum convex polygon) is the polygon with the smallest area that encloses all points, and commonly is used to estimate geographic range (e.g., Mohr 1947; Worton 1995; Lyons 2003; Clapham et al. 2009; Hadly et al. 2009; DeSantis et al. 2012) and morphological diversity (e.g., Foote 1999). I first projected the latitude and longitude of each collection as Cartesian coordinates in an Albers Equal-Area projection (Snyder 1987), then calculated the area of the convex hull surrounding all collections, using the algorithm of Bevis and Cambareri (1987). For example, Figure 1A shows an interval in which the ACH of fossil localities is small, relative to that in Figure 1B. As stated above, taxonomic richness is expected to be positively correlated with geographic area both for biological reasons (i.e., the species-area relationship) and simply because more fossil localities are expected from a larger geographic area (e.g., Wall et al. 2009).
Mean Pairwise Distance between Collections in an Interval
Estimates that use extreme points such as ACH typically are sensitive to outliers. An alternative measure of the geography of sampling is the mean pairwise distance between all pairs of collections (henceforth, MPWD). MPWD is analogous to the variance around a mean of a set of univariate measurements and is commonly used as a measure of morphologic diversity (e.g., Foote 1992, 1993; Ciampaglio et al. 2001). MPWD summarizes the geographic dispersion of collections and density of sampling within an interval. For a given geographic area (described above) and number of collections, low values of MPWD would correspond to intervals in which most collections were concentrated around one or a few regions (e.g., southwestern Wyoming in Fig. 1A), whereas high values would represent intervals in which localities were broadly distributed (e.g., Fig. 1B). However, MPWD should decline as the density of sampling increases and new collections fill in geographic space between existing collections.
The relationship between MPWD and taxonomic richness will be determined (at least in part) by the variation in taxonomic composition over space and the density of sampling. For example, if the taxonomic composition of ecological communities changes rapidly over geographic space (i.e., high beta-diversity), sampling that is more evenly distributed (i.e., high MPWD) will tend to sample a greater number of distinct taxa than if sampling were concentrated in a particular subregion (i.e., low MPWD). Therefore, in this example, for a given number of collections and ACH (above), sampled taxonomic richness should be positively correlated with MPWD. An extreme example on the other end of the spectrum would be one in which all taxa were distributed evenly over their geographic ranges and the geographic scale of sampling was smaller than the geographic range sizes of species. In this example, any collection is just as likely to sample the all taxa in the region, and taxonomic richness correspondingly should be independent of MPWD.
Interval-to-Interval Shifts in Collection Centroids
Even if ACH and MPWD were constant among intervals, apparent rates of origination and extinction can be distorted if identically distributed samples from different intervals are drawn from different geographic regions. In this study, I measure distances between collection centroids from subsequent intervals to indicate the magnitude of geographic shifts in the centers of sampling (Lyons 2003). The collection centroid of an interval is the pair of geographic coordinates in the geometric center of all contained collections, and is analogous to the “mean” locality within an interval. I calculate the centroid of sampling as the mean latitude and longitude of all collections in an interval. Note that it does not necessarily correspond to an actual fossil locality. I calculated the distance between centroids of subsequent intervals as their Great Circle distance. This distance is low when there is little change of geography of sampling between subsequent time intervals, but large shifts indicate sampling of new regions. Note, however, that changes in the geographic distribution of collections between intervals can result in movement of the centroid even if the boundaries of the total region from which samples are drawn (e.g., the convex hull) remains constant.
Large changes in geographic centroids could potentially lead to large numbers of apparent originations and extinctions, as taxa are recovered from previously unsampled regions and previously sampled regions go unsampled in subsequent intervals. Therefore, large changes in collection centroids could affect apparent rates of evolution, perhaps mirroring dramatic taxonomic turnover. There is no expectation that centroid migration would yield more apparent originations than extinctions, or vice versa, so it should have no predictable effect on richness.
In this study, I attempt to standardize sampling among time intervals by subsampling fossil collections and/or occurrences on the basis of numbers of collections and/or occurrences in intervals, or the geographic distribution of collections, or both. I subsample within intervals by randomly drawing collections and/or occurrences without replacement until a sampling quota (see below) is met. Because the included collections are drawn at random, I repeated these procedures 1000 times, and report the median statistics of these replicates.
Standardizing Geographic Area of Sampling
I standardize the geographic area over which samples were drawn by determining a fixed geographic area over which to draw samples from each interval (henceforth, the G standardization approach). To do this, I first determine the interval with the smallest ACH (described above), which empirically showed the most restricted geographic area of sampling. I use the ACH value of this interval to set the geographic “quota” over which collections could be drawn, such that in each replicate analysis, each interval would have an area less than or equal to the “quota.” To achieve this, for every interval other than that with the lowest ACH, I choose a single initial collection at random. I draw a random integer, i, from a uniform distribution between 1 and the total number of collections in that interval, and then select the ith collection as the initial collection. I then sequentially add collections to this subsample in increasing order of distance from this initial collection until the next collection to be drawn would exceed the “quota” ACH. Note that although the area of sampling in each replicate will be less than or equal to the “quota,” the random selection of the reference collection in each allows the subsample membership, centroid location, and MPWD to vary among replicates.
Standardizing Numbers of Samples among Intervals
In this study, I attempt to minimize interval-to-interval variation in raw numbers of samples by subsampling collections to a quota determined by the minimum sum of squared occurrences from all collections within an interval (O2W of Alroy et al. 2001). Other methods of setting the sampling quota, including unweighted (UW) and occurrences-weighted (OW), and subsampling of occurrences (as opposed to collections), returned qualitatively identical results (not shown). I report the results for the O2W method because it is based on the empirical relationship between mammalian fossil occurrences and specimens, following Alroy (2000, 2009).
Joint Standardization of Geographic Area and Numbers of Samples among Intervals
Because collections are not evenly distributed over geographic space, this geographic standardization still results in variation in raw numbers of samples among intervals. Therefore, in separate analyses, I also performed both standardizations sequentially, first standardizing by geographic area, then by numbers of collections using the O2W method (henceforth, the G+O2W standardization approach). In principle, this approach simultaneously addresses two of the most conspicuous sampling biases of taxonomic patterns.
Estimation of Taxonomic Parameters
In this study, I measure taxonomic richness in two ways. First, I measure the total number of taxa with a fossil occurrence within a particular interval (hereafter, sampled-in-bin [SIB] richness). Second, I add to this SIB richness the number of taxa that are known to range through the interval (e.g., first and last occurrences before and after the interval, respectively) but do not have an occurrence within it to yield the minimum number of taxa that must have been extant in each interval (hereafter, range-through [RT] richness). Note that both of these measures are totals across the entire sampled region of western North America, and so are not identical to some previous studies that estimate taxonomic richness within subregions (Kohn and Fremd 2008) or individual assemblages (e.g., Janis et al. 2000, 2002a).
Rates of Taxonomic Evolution
I estimate rates of taxonomic origination and extinction with the per-taxon and per capita equations of Foote (2000), using first and last intervals of occurrences. In all cases, results from each of these equations were qualitatively identical, aside from different (but predictable) distortions due to edge effects. I therefore report only the results using the per capita equations, because these are less susceptible to distortion due to temporal variation of preservation (see Foote 2000). Note, however, that the close correspondence between estimates from each set of equations suggests that potential distortions of per-taxon rates are of low magnitude.
Correlating Sampling and Taxonomic Richness and Rates
One goal of this study is to determine the correlation between measured sampling variables (e.g., numbers or geographic distribution of occurrences and collections) and taxonomic richness and evolutionary rates. I measure these correlations with standard, nonparametric Spearman's rank correlation (SRC). To directly address the influence sampling parameters have on evolutionary parameters, and to avoid any autocorrelation within the time series, I also detrend the time-series data by taking first-differences (i.e., differences between pairs of subsequent intervals; Raup and Crick 1982; McKinney 1990), and perform SRC on the interval-to-interval changes in the sampling and taxonomic parameters.
Variation in Raw Numbers of Samples
The numbers of taxonomic occurrences and fossil collections show appreciable variation over the Cenozoic (Fig. 2). In particular, there are peaks in numbers of occurrences of 948 in the early Eocene (54 Ma), 446 and 481 in the latest Eocene (36 and 34 Ma, respectively), and 1206 in the middle Miocene (12 Ma). Lowest total numbers of occurrences occur in the late Oligocene (28 Ma) with 86 and the late Miocene (6 Ma) with 63.
Variation in Geography of Sampling
The ACH enclosing all collections within an interval (Fig. 3A) and the MPWD among those collections (Fig. 3B) also show considerable variation among intervals. Peak ACH (3.29 × 106 km2) occurs in the early Pleistocene (2 Ma), but note that ACH is near 3 × 106 km2 for eight of the 11 Neogene intervals. Maximum MPWD (1118.3 km) occurs in the late Miocene (8 Ma; Fig. 3B). The lowest ACH (∼4 × 106 km2) occurs in the first three intervals of the early Eocene (Fig. 3A), and the minimum MPWD (189.6) appears in the early Oligocene (30 Ma; Fig. 3B). In addition to this variation among intervals, both of these measures also show secular trends increasing toward the present (ACH vs. time, Spearman's rank correlation [SRC] ρ = 0.825, p << 0.00001; MPWD vs. time, ρ = 0.657, p < 0.0003). This indicates that sampling generally is over a larger area and more dispersed in younger time intervals.
There is also considerable migration of the collection centroids between subsequent intervals (Fig. 3C,D). Maximum observed centroid migration (709.7 km) occurs in the late Eocene (38 Ma), and the minimum (43.6 km) occurs during the earliest Eocene (55 Ma). Unlike the other measures, there is no significant trend in migration distance over the entire interval in this study (centroid migration vs. time SRC ρ = 0.051, p > 0.795).
None of the geographic measurements have a temporal pattern that is congruent with that of the absolute numbers of collections (first-differences: ACH vs. number of collections, SRC ρ = 0.101, p > 0.613; MPWD vs. number of collections, ρ = −0.163, p > 0.416; centroid migration vs. number of collections, ρ = −0.028, p > 0.889). Most notably, the dramatic peak of collections and occurrences observed in the middle Miocene (12 Ma) does not correspond to any substantial changes in ACH, MPWD, nor centroid migration, indicating that it is not simply the product of widespread sampling. The lack of correlation between absolute numbers and geographic distributions of collections supports the intuitive conclusion that collections are nonrandomly distributed in space as a result of both geology (i.e., the location of sedimentary basins) and sampling by paleontologists. To provide a hypothetical example, dense sampling of a particular sedimentary basin or deposit would yield numerous collections, but their close proximity would not greatly expand ACH, and actually would reduce the MPWD.
Effects of Geographic Standardization
The G standardization procedure dramatically reduces the temporal variation of ACH (Fig. 4A) and MPWD (Fig. 4B). Some intervals in the middle Eocene (particularly 44 Ma) with high face-value ACH and MPWDs have among the lowest values when standardized. This suggests that a few outlier collections account for the large face-value ACH and MPWD, and are frequently excluded by stochastic chance during collection subsampling. The opposite pattern is observed for some of the early Oligocene intervals (e.g., 32 Ma). Importantly, increasing trends observed when the unstandardized collections are used no longer exist for either ACH (SRC ρ = 0.055, p > 0.780) nor MPWD (ρ = −0.257, p > 0.187) after standardization. In contrast to ACH and MPWD, the observed migration of collection centroids among intervals showed no reduction, and was in fact increased by the G standardization (Fig. 4C). Therefore, whatever biases this migration might impart to the estimates of taxonomic evolution have not been mediated in these analyses.
The raw temporal pattern of face-value SIB richness is significantly correlated with both ACH and MPWD but, surprisingly, not with the total numbers of collections nor occurrences (Table 1). The O2W standardization only slightly diminishes the correlation between richness and the two measures of geography, whereas the G standardization eliminates the significant relationship entirely. Interestingly, the correlation remains significant when both standardizations are used sequentially (G+O2W). Because the G standardization, which eliminates this correlation, is applied before the O2W standardization, this correlation must re-emerge from the combination of the nonrandom geographic distribution of collections and their stochastic selection during the O2W subsampling.
When these time series are detrended by analyzing interval-to-interval changes (i.e., first-differences), the pattern of face-value SIB richness is significantly correlated with that of only numbers of occurrences, and not number of collections (Table 1), indicating that, of the variables examined, these are the primary determinants of interval-to-interval changes in the face-value variation in richness. This correlation is removed by the O2W and G+O2W standardizations, but is slightly exacerbated by the G standardization alone. There is no significant correlation between first-differences of either face-value or any standardized estimates of SIB richness and first-differences of either ACH or MPWD. This suggests that these geographic parameters do not strongly influence face-value variation in richness. For example, increases in either ACH or MPWD from one interval to the next do not necessarily induce an increase in taxonomic richness. The apparent correlation between raw (i.e., not detrended) face-value richness and these geographic parameters may be spurious, because although all show generally increasing trends, the causes are likely unrelated.
Figure 5 depicts the temporal pattern of ungulate taxonomic richness, and the effects that the different sample-standardization strategies have on it. Face-value RT and SIB richness (Fig. 5A) both show a local maximum in the late Eocene (36 Ma) followed by a sharp decline into the Oligocene. An increasing trend appears to begin at the end of the early Oligocene (28 Ma), culminating in the peak of richness in the middle Miocene (12 Ma). This maximum value is then followed by a dramatic decline throughout the remainder of the Miocene, to relatively low richness that remains fairly constant throughout the Pliocene.
SIB richness closely follows RT richness, and in fact, in some cases (e.g., late Eocene, 36 Ma, and middle Miocene, 12 Ma) SIB richness equals RT richness. This close correspondence is due to relatively complete sampling. This good sampling is also indicated by the fact that taxa with first occurrences before and last occurrences after each interval are typically preserved as well (the median proportion of range-through taxa preserved in an interval is 95.9%), and only a small proportion of taxa in any interval are singletons (median = 20.5%, maximum = 35.7%). Other intervals (e.g., late Oligocene, 30 Ma) show particularly large discrepancies between the two measures, suggesting lower sampling during these times.
Separate inspection of the standardized richness curves for the O2W (Fig. 5A) and G (Fig. 5B) approaches indicates the effect each contributes in the combined G+O2W approach. The O2W-standardized curve shows a pattern concordant with face-value richness (first-differences of standardized vs. face-value RT richness, SRC ρ = 0.575, p < 0.002; first-differences of SIB richness, SRC ρ = 0.431, p < 0.03), but some discrepancies exist. For example, the Oligocene exhibits almost no apparent drop in standardized RT richness from late Eocene values, suggesting that the apparent trough in face-value richness might have been due in large part to a paucity of fossil samples in the earliest Oligocene, rather than a true change in diversity. Furthermore, the Neogene peak in richness shifts from the middle Miocene (12 Ma) to near the boundary between the early and middle Miocene (16 Ma). This suggests that the apparent face-value peak in richness at 12 Ma is a result of the unusually high number of collections and occurrences found in this interval (Fig. 2). Maximum richness at 16 Ma is followed by a more gradual decline in richness until the end of the study interval, rather than the more precipitous drop displayed in the face-value pattern.
The G-standardized richness curve also shows a fairly concordant pattern with face-value richness (first-differences of RT richness, SRC ρ = 0.728, p ≪ 0.001; first-differences of SIB richness, SRC ρ = 0.753, p ≪ 0.001). The largest deviations appear in the middle to late Eocene (42–38 Ma), and again throughout the early to middle Miocene (22–14 Ma), suggesting that these intervals have an excess of collections drawn from a disproportionately large area. These results demonstrate that although ACH and MWPD might not have a strong influence on SIB richness over the entire time series (Table 1), standardization of geographic area still has large effects in specific intervals, and therefore important implications for our interpretation of long-term patterns of taxonomic evolution.
Given the effects of geographic area and sampling intensity on different parts of the time series, the combined approach (G+O2W) likely provides a more accurate assessment of richness than either does alone. The general pattern (Fig. 5C) shows a very rapid increase in richness throughout the early Eocene (54–48 Ma) that appears to level off for the remainder of the Eocene, with some variation, including a sharp, one-interval decline in the early late Eocene (36 Ma). The Eocene/Oligocene boundary is followed by a gradual and nearly monotonic increase in richness thereafter, until the maximum richness is reached in the middle Miocene (16–14 Ma). A steady decline in richness follows this maximum until the end of the study interval (12–2 Ma).
Rates of Taxonomic Evolution
The rates of taxonomic evolution underlying these patterns in richness are shown in Figure 6. Time series of rates under all standardization approaches are virtually identical (not shown), and highly congruent with face-value rate estimates (first-differences, face-value vs. G+O2W origination rates, SRC ρ = 0.657, p < 0.0003; face-value vs. G+O2W extinction rates, SRC ρ = 0.862, p ≪ 0.0001). This suggests that neither variation in numbers of samples nor their geographic distribution has a great effect on our perception of ungulate taxonomic rates.
As the geographic area (i.e., ACH) or dispersion of sampling (i.e., MPWD) increases from one interval to the next, it is likely that new species will be sampled from previously unsampled areas, increasing the apparent origination rate. Conversely, if the area or dispersion declines, taxa sampled in one interval might not be sampled in the next, increasing the apparent extinction rate. In general, these predictions are observed in this study; origination rate is positively correlated and extinction rate is negatively correlated with changes in ACH or MPWD, although only the positive correlation between origination rates and MPWD is statistically significant (first-differences of MPWD vs. G+O2W origination rates, SRC ρ = 0.408, p < 0.039). Furthermore, sample-standardized rates of origination are positively correlated with migration distance of collection centroids, but not significantly so (first-differences, SRC ρ = 0.075, p < 0.712). The positive correlation between collection centroid migration distance and extinction rate in the preceding interval is slightly stronger, although still not statistically significant (SRC ρ = 0.151, p < 0.461). This indicates that the observed shifts in sampling location might have an effect but are not sufficiently great as to dominate the face-value pattern of taxonomic rates.
Origination rates in most intervals are low throughout most of the study interval, but show peaks in the early Eocene (52 Ma), late Eocene (38 Ma), early Miocene (22 and 20 Ma), and Pliocene (4 Ma). Extinction rates show only slightly more variation, with peaks in early Eocene (52 Ma), middle Eocene (46 Ma), late Eocene (40 Ma), late Oligocene (24 Ma), early Miocene (20), middle Miocene (14 Ma), and the latest Miocene (6 Ma). Interestingly, most of the peaks in extinction occur in intervals that immediately precede intervals with peaks in origination. In fact, the time series of origination rates is strongly correlated with extinction rates from the previous interval (first-differences, SRC ρ = 0.643, p < 0.0004).
North American Ungulate Evolution
Reassuringly, sample-standardized patterns of taxonomic richness of North American ungulates largely corroborate other recent studies (Janis 2000b; Janis et al. 2000, 2002a, 2004). In discussing the general patterns below, I will highlight some notable differences between the results of this study and previous findings. Nevertheless, the strong correspondence between the face-value and sample-standardized patterns of taxonomic richness and rates suggest that North American ungulates have a robust and reliable fossil record. This should reinforce our confidence in the research over the past century that has used these patterns to understand environmental change and the mammalian response.
The early diversification of North American ungulates is characterized by a rapid increase in generic richness over their first 6 Myr (Fig. 5C). This increase predictably results from high rates of origination (Fig. 6A). However, simultaneously high rates of extinction (Fig. 6B) indicate this was also a period of rapid taxonomic turnover, during which genera with relatively short durations quickly replaced one another. This diversification takes place during the latter half of a 8–10 Myr trend of increasing global temperature that culminated approximately at 50 Ma in the early Eocene Climatic Optimum (EECO), the highest global temperature of the Cenozoic (Zachos et al. 2001, 2008).
Ungulates first rise to prominence in North America during the middle Eocene, when their generic richness reaches a plateau that persists throughout much of the remainder of the Eocene. This pattern from the sample-standardized analysis differs somewhat from the face-value pattern (Fig. 5A) and previous estimates of ungulate generic richness (Janis 2000b), suggesting monotonically increasing richness throughout the middle Eocene. In the current study, the sample-standardized pattern suggests that ungulate richness was at or near its Paleogene maximum much earlier in the Eocene. This peak richness in North America early in the middle Eocene coincides with a peak in ungulate richness in Europe at roughly the same time (Blondel 2001). The plateau in generic richness throughout the middle and late Eocene is intriguing considering it persists through a period of climatic and environmental change. This interval is characterized by a persistent trend in declining global temperatures following the EECO (Zachos et al. 2001, 2008). Furthermore, considerable evidence suggests that terrestrial environments were also changing over this interval, as relatively homogenous tropical or subtropical forests were giving way to more heterogeneous environments with at least some open canopy (e.g., Webb 1977; Janis 1993; Sheldon and Hamer 2010; Townsend et al. 2010).
The one deviation in this plateau occurs near the end of the middle Eocene (40Ma), when there is a reduction in richness (Fig. 5C) accompanied by increased rates of extinction (Fig. 6B) (see also Mannion et al. 2011). Interestingly, this reduction in richness is not apparent in either the face-value or the O2W (alone) pattern (Fig. 5A), but rather is revealed by the G standardization (Fig. 5B,C). Neither ACH nor MPWD is substantially higher than in either the preceding or subsequent interval, suggesting that the G standardization is picking up on a more complex aspect of the nonrandom geographic distribution of the collections. This apparently transient deviation from the pattern of otherwise relatively constant taxonomic richness follows a brief period of global warming (the Mid-Eocene Climatic Optimum), at approximately 42 Ma. A peak in origination rate follows these extinctions (Fig. 6A), with immigrations from Asia (e.g., Mannion et al. 2011) complementing the evolution of new endemic genera.
No substantial changes in either sample-standardized richness (Fig. 5C) or rates (Fig. 6) are evident across Eocene/Oligocene boundary, in spite of substantial global (Katz et al. 2008; Liu et al. 2009) and regional cooling (Zanazzi et al. 2007) and corresponding vegetational changes (e.g., Wolfe 1971; Retallack 1992; Wing 1998; Retallack et al. 2004). This lack of response to climate change matches previous studies of North American mammals (Prothero and Heaton 1996; Prothero 1999; Alroy et al. 2000; Prothero 2004; Alroy 2009), but contrasts with the broad-scale Eocene and Oligocene faunal changes observed in Europe (e.g., Bevis and Cambareri 1987; Blondel 2001; Willig et al. 2009) and Asia (e.g., Meng and McKenna 1998; Lyons 2003; Tsubamoto et al. 2004).
Much of the Oligocene is characterized by gradually increasing taxonomic richness (Fig. 5C) and uniformly low rates of origination and extinction. The sample-standardized pattern (Fig. 5C) suggests generic richness in the Oligocene was nearly equivalent to that in the Eocene. This contrasts with the face-value (Fig. 5A) and previously published patterns, which both indicate that Oligocene richness was substantially lower than in the late Eocene.
This period of stable taxonomic richness and low taxonomic rates throughout most of the Oligocene is followed by two distinct episodes in which pulses of high extinction rates (24 Ma and 20 Ma) are immediately followed by intervals of elevated origination (22 Ma and 18 Ma, respectively; Figs. 6A,B). Richness increases modestly throughout this interval, as extinctions are offset by even greater pulses of origination indicating considerable taxonomic turnover. One of these pulses of origination (22 Ma) is characterized by the immigration of Old World ruminants (e.g., blastomerycines, dromomerycines, antilocaprids [Webb et al. 1995]). The initiation of this turnover was apparently coincident with the shift in abundances of closed- vs. open-habitat grass phytoliths (Strömberg 2002, 2004, 2005) in the Great Plains of North America, suggesting that the origin of grass-dominated ecosystems environments might have prompted reorganization of the ungulate communities. The spread of grasslands in Africa similarly has been implicated in the explosive diversification of the ungulate family Bovidae in the Pliocene of Africa (e.g., Vrba 1992; Bobe 2006; Derry and Dougill 2008). The origin of grass-dominated ecosystems around the Oligocene/Miocene boundary likely influenced two conspicuous determinants of ungulate taxonomic richness and biomass within modern ungulate communities (Hopcraft et al. 2010): habitat heterogeneity (e.g., Turpie and Crowe 1994; du Toit and Cumming 1999; Cromsigt et al. 2009) and primary productivity (Janis et al. 2000; Olff et al. 2002; Pettorelli et al. 2009). The transition of terrestrial ecosystems throughout the Cenozoic from closed environments (e.g., forests) to open environments (e.g., grasslands) was likely a gradual, and geographically heterogeneous process. Throughout, ungulate habitats were likely a mosaic of open grasslands and moderately dense woodlands, perhaps supporting large numbers of ecologically diverse taxa (e.g., Janis et al. 2000, 2002a, 2004; Prins and Fritz 2008), and reproductively isolating them from one another (e.g., Vrba 1992). The transition from forest to grassland also moves the bulk of primary productivity from the canopy, out of reach of most ungulates, to ground level, where it becomes usable by ungulates (e.g., Bodmer 1989). This, in turn, may have supported ungulates in abundances great enough to maintain the viability of reproductively isolated subpopulations. Similar coevolutionary hypotheses of coevolution between dinosaurs and plants have been tested (Barrett and Willis 2001; Butler et al. 2009a,b; Butler et al. 2010). Unfortunately, occurrence data alone are insufficient to test mechanisms underlying the observed changes in taxonomic rates (e.g., Vrba 1987), and additional analysis of ecologically relevant morphology is required. Intriguingly, ecomorphologic analyses of limbs (Janis and Wilhelm 1993; Janis et al. 2002b) also indicate changes immediately following the Oligocene/Miocene boundary.
Higher rates of origination and extinction early in the Miocene culminate in the maximum observed richness of ungulates in the middle Miocene (14–16 Ma; Fig. 5C). Interestingly, this peak in richness throughout the study area coincides with previously reported peaks in ungulate richness within assemblages and regions (Janis et al. 2000; Kohn and Fremd 2008). This might suggest similar underlying controls on richness at local, regional, and continental scales, and it corroborates the view of middle Miocene as a unique interval in the evolutionary history of North American ungulates (e.g., Janis et al. 2004). The unusually high ungulate taxonomic richness in the middle Miocene resulted from the accumulation of lineages that originated in a series of pulses beginning immediately after the Oligocene/Miocene boundary, several million years earlier (e.g., Janis et al. 2002a).
The middle Miocene peak in richness is followed by a decline in richness throughout the remainder of the study interval (Fig. 6). During this period, extinction rates consistently outstrip origination rates except for a peak in origination between 3 and 5 Ma, to which immigrations from the Old World contribute (e.g., cervids and bovids). This protracted period of extinction roughly coincides with global cooling and further aridification and widespread opening of environments (e.g., Webb 1977, 1983; Webb et al. 1995).
Bias of Geographic Variation in Sampling
The overall lack of correlation between SIB taxonomic richness and geographic area of sampling is of particular interest, as it runs counter to the conclusions of Barnosky et al. (2005), who suggested that much of the apparent temporal variation they observed in mammalian richness within particular geographic regions could be explained by changes in the geographic area of sampling. Note that the conclusions of Barnosky et al. are based on their analyses of a data set with broader taxonomic scope and more restricted geographic ranges (i.e., biogeographic provinces). As medium- to large-bodied mammals, ungulates typically have greater home ranges and dispersal abilities than smaller bodied mammals (e.g., Kelt and Van Vuren 2001). The typical home range of an ungulate genus therefore likely covers a larger proportion of any total study area (e.g., in this study, western North America), and hence the bias of geographic area might affect ungulates less. If this is true then the relationship between home range size and the potential biasing effect of the geographic area of sampling is worthy of future research, because variation in home range size among taxa could lead to taxon-specific (as opposed to interval-to-interval) biases in these types of analyses.
As previously noted, the geographic standardization based on ACH attempted here is very simplistic. Standardization of MPWD is also theoretically possible by iteratively excluding randomly selected collections from intervals with higher MPWD than that observed in the interval with the minimum. In practice this might prove difficult, because the relationship between number of collections and MPWD is not as straightforward as that with ACH. For example, subsampling collections cannot increase ACH, but can yield either higher or lower MPWD depending on the geographic distribution of the collections. Furthermore, sequential standardizations run the risk of reducing the number of data to the extent that the biological patterns are no longer apparent. Lastly, subsequent standardization procedures can reintroduce biases addressed by previous procedures, as was apparent when variation in ACH was reintroduced by the O2W standardization of the combined (G+O2W) approach. Therefore, no single or even combined standardization approach is likely to entirely eliminate all temporal variation in sampling. However, the results of this study show that the effects of multiple standardization approaches should be explored independently to determine differences in timing and magnitude among specific intervals (cf. Mannion et al. 2011).
Distributions of Collections and Geographic Ranges of Taxa
A nearly identical set of analytical methods has recently been applied to fossil occurrences to study the geographic range of taxa (e.g., Lyons 2003; Hadly et al. 2009; DeSantis et al. 2012). For example, these studies have used ACH on fossil collections to estimate the expansion and contraction of taxon geographic ranges, and to compare ranges among taxa and time intervals. Geographic distributions of fossil collections, and therefore the parameters estimated in this study (ACH, MPWD, and centroid migration), are complex functions both of the geographic ranges of the taxa in life and of the geography of the various components of the sampling process (e.g., taphonomy, distribution of sedimentary basins, human effort). Clearly, our ability to accurately estimate the true geographic range of taxa in the geologic past can be distorted by the geographic distribution of fossil collections.
Previous studies (e.g., Lyons 2003; Hadly et al. 2009; DeSantis et al. 2012) have well recognized this and attempted to mediate this potential bias by using proportional geographic ranges. This proportion is calculated as the ACH of collections bearing a taxon of interest, divided by the total possible geographic range (i.e., the ACH of all collections). The G standardization proposed herein can also be used in such studies, by first standardizing the geographic area over which the taxa could be found, and then determining either the absolute or proportional area in which they are actually found. Although it is beyond the scope of this study, it would be worthwhile to evaluate the ability of both the proportional area and the G standardization used here (and perhaps their simultaneous use) to faithfully reveal changes in geographic range sizes. This could perhaps be addressed through simulation study, or by artificially imposing a nonrandom distribution of collections (as might be found in the fossil record) upon taxa with empirically known geographic ranges (i.e., extant taxa).
The taxonomic evolution of North American ungulates plays a pivotal role in our understanding of Cenozoic environmental change and the mammalian response. Numerous recent studies point out that uneven sampling from the fossil record over time and across geographic space can distort perceived patterns of evolution. The results of this study suggest that, at least for the taxa and geographic region analyzed in this study, neither of these potential biasing factors exerts a strong influence on the overall temporal pattern of taxonomic richness. There are, of course, individual intervals in which one or both have clear effects, particularly on patterns of taxonomic richness. These intervals underscore the importance of standardization procedures for robust interpretation of taxonomic evolution from the fossil record, even from well-sampled groups like ungulates. But the close concordance of face-value and sample-standardized patterns of taxonomic evolution suggests that the fossil record of North American ungulates is remarkably complete, and relatively free from the biasing effects of temporal variations in sampling. These results validate the prominent role of the ungulate fossil record as a reliable chronicle of mammalian evolution.
I thank S. K. Lyons for indispensable theoretical and programming advice and assistance regarding the measurement of convex hulls. M. Kosnik, S. Peters, and P. Wagner also provided suggestions that greatly improved the analytical design and interpretation of this study. I am very grateful to A. Miller, P. Mannion, and two anonymous reviewers for exceptionally insightful, helpful and thorough reviews of an earlier version of this manuscript. This is Paleobiology Database official publication number 192.
- Accepted 18 August 2013.