Early Season Speed Ratings and a Brief Overview of Speed Ratings
Bill Meylan (September 3, 2005)
This article is somewhat repetitive with regard to other articles already posted ... It addresses some recent inquiries concerning the application and accuracy of my speed ratings ... As requested, a brief summary overview of speed ratings is presented ... Remember, the purpose of this methodology is to determine how fast people are running relative to each other at different courses.
Early Season Speed Ratings ... In general, the most difficult time of the season to calculate comparative speed ratings is the beginning of the season ... The reason is lack of data ... The overall process is data-dependent, which basically means the accuracy of speed ratings improves as more and more data become available (which can only occur as the season progresses) ... In early season, data are limited and statistical "margins-of-error" may be high, resulting in speed ratings that are not as accurate as I would like.
Speed ratings can be generated ONLY if one (or a combination) of the following is available:
(1) A good Course Conversion Table for the cross country course being considered (more about this later in the article) ... This method is useless for new courses, courses that have been modified, or courses where I have inadequate (or no) prior data evaluation.
(2) Speed Data on Individual Runners ... For the first race of the season, there are no prior speed data for the current season ... I will look at XC speed data from the previous year, but those are last year's data and must be used with reservation (they might be helpful, but they can also be inaccurate for the current season).
(3) Complete Race Results containing appropriate groups of runners for statistical sampling ... In practice, I graph the race results and try to identify various sub-groups of "average" runners that correspond directly to sub-groups of "average" runners of known speed ability (I can also use "above-average" runners) ... This can be very difficult in small invitational races!! (and may be impossible) ... Early season races add another problem - they contain a mix of runners ready to race good times (because they have been running much of the summer) and runners who are not in shape (but will be later in the season), and this makes race graphs difficult to interpret.
Bottom-Line for Early Season ... The most "inaccurate" speed ratings are likely to occur early in the season because the statistical "margins of error" can be high ... Often, I make my best "educated guess" and go from there ... As the season progresses and more teams and individuals cross over in invitational races (racing against teams from other leagues and sections), the accuracy improves ... Also as the season progresses, the overall process becomes increasingly a statistical number-crunching operation with few educated guesses on my part.
Noteworthy Side Consideration ... I receive e-mails from coaches and runners wondering why some teams or individuals are NOT included in my database or ranking lists ... Reason One - some sections and leagues historically have limited (or NO) results available ... I can NOT speed rate races with insufficient data (and this occurs frequently within NY State, especially within certain sections) ... When I do not know the course, the individuals, or the quality of a race (with some degree of acceptability), I do NOT rate it ... Also - available race results for some races may go only 10 or 25 runners deep; therefore, I do NOT have results for the other runners in the race ... "Available race results" means results "readily" available through Armory Track, the various sectional web-sites, the timing-company web-sites, on-line newspapers, or other web-sites making it "readily" known that results are being posted there.
Brief Overview (Speed Ratings)
Where do Speed Ratings come from?? ... Speed Ratings for an individual race are generated by the following method:
(1) Get the actual final times for a race.
(2) Determine how fast or how slow the race times are compared to a standard base-line race of known speed ... the determination is a number of seconds (plus or minus) which is known as the correction.
(3) Add or subtract the correction to each of the actual times ... for example, if the correction is 10 seconds faster than the standard base-line race, then add 10 seconds to all of the actual race times ... this produces the "adjusted" race times for each runner.
(4) The method could stop at (3) above ... however, I use the "adjusted" times for a variety of comparisons and other processes involving mathematical calculations ... performing math on race times is a "pain-in-the-neck" compared to using regular numbers, so I convert the "adjusted" times to a number called a "speed rating" (for those familiar with horse racing and the Daily Racing Form, this is similar to the Beyer Speed Figures) ... My speed ratings use the following conversion equation:
Speed Rating = (1560 - Adjusted Time (in seconds)) / 3
1560 is the number of seconds in 26 minutes ... so in effect, the speed rating is the number of seconds by which the adjusted time beats 26 minutes, divided by three ... the "divided by three" simply means that one speed rating point equals three seconds.
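For anyone who wants to experiment with the conversion, here is a minimal Python sketch of steps (1) through (4) ... this is only an illustration (my actual programs, mentioned at the end of this article, are written in Visual C++), and the function names here are hypothetical:

    def to_seconds(race_time):
        # Convert a "MM:SS.s" race time string to total seconds.
        minutes, seconds = race_time.split(":")
        return int(minutes) * 60 + float(seconds)

    def speed_rating(race_time, correction=0.0):
        # Step (3): apply the race correction in seconds (positive means the
        # race was faster than the base-line race, so seconds are added).
        adjusted = to_seconds(race_time) + correction
        # Step (4): convert the adjusted time to a speed rating,
        # where one rating point equals three seconds.
        return (1560 - adjusted) / 3

    # A 17:00 time in a race that was 10 seconds faster than the base-line race:
    print(round(speed_rating("17:00", correction=10), 1))  # (1560 - 1030) / 3 = 176.7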
Determining Race Corrections ... The entire speed rating methodology depends on accurate race corrections ... The determination of how fast or slow a race is compared to a standard race can use three separate and independent methods ... when possible, I use all three to see how well they agree:
(1) Course Conversion Tables ... course comparison tables compare the speed of different race courses to each other based upon statistics or historical observation ... I use the SUNY Utica course as my standard or base-line course (and compare all other courses to it) ... Some current course comparisons on my tables are:
    SUNY Utica                     0   (base-line course)
    Bowdoin Park                  -5 to -15
    Sunken Meadows                -3 to -13
    Saratoga Park                -70 to -80
    Bear Mountain 2004           -63 to -78
    McQuaid Invite               -75 to -85
    Van Cortlandt Park (2.5)    -230 to -238  (boys)
    Van Cortlandt Park (3.1)     -13 to -25   (Footlocker)
    Elma Meadows                  -5 to -20
    Chenango Valley St Park      -20 to -30
    Bethpage St Park             -25 to -35

The negative numbers mean faster than the SUNY Utica course ... I actually have separate tables for boys and girls (the table above is primarily boys ... the girls' ranges are similar for most courses, but do vary).
Limitations of Course Comparison Tables ... My comparison tables (and anybody else's) are only approximations ... they typically consider only normal/good running conditions ... and they assume the course distance doesn't change from year to year ... Daily variations with respect to weather and other factors MUST be considered.
Conversion tables are NOT my first choice in determining race corrections ... I prefer using them as a check on calculations from the other methods ... However, the tables are useful for courses that are consistent in speed ... I commonly use the average table correction for Van Cortlandt Park (2.5 mile) ... Saratoga Park typically has a race correction of 75 seconds under normal conditions (for both boys and girls) - for example, Nicole Blood broke the Saratoga course record at the 2004 Suburban Council Championships by running 16:41.9 ... the speed rating was calculated as follows:
16:41.9 + 75 seconds = 17:56.9 ... 17:56.9 = 1076.9 seconds ... Speed Rating = (1560 - 1076.9) / 3 = 161
I do not know the Saratoga course record for boys; however, a 15:00 time would typically yield the following:
15:00 + 75 seconds = 16:15 ... 16:15 = 975 seconds ... Speed Rating = (1560 - 975) / 3 = 195
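In Python form (a self-contained snippet using the same conversion equation), the two Saratoga examples work out as follows:

    def speed_rating(seconds, correction=0.0):
        # Conversion equation with the race correction applied in seconds.
        return (1560 - (seconds + correction)) / 3

    print(round(speed_rating(16 * 60 + 41.9, correction=75)))  # 161 (the Blood example)
    print(round(speed_rating(15 * 60, correction=75)))         # 195 (hypothetical 15:00)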
(2) Reference Runners ... This increasingly becomes the method of choice (within NY State) as the season progresses ... My cross country databases contain results for many individual runners ... as the invitational season progresses, more and more individuals compete against runners from other leagues and other sections (and other states) ... As more results become available, better "cross comparisons" can be made between any individual in my database and any other individual, and more importantly, between groups of specific individuals (such as teams) ... The "cross comparison" is the time difference between individuals at both the same and different races (in conjunction with the actual final times).
For every race, I keep track of how fast every runner runs relative to every other runner on the same day at the same course ... I then extend it with results from other race courses so I can compare how fast any runner has run relative to any other runner on a single day or throughout the season ... It is just keeping track of time differences and final times ... It requires a computer program for the scale of comparison I do within NY State, and it's done on a race-by-race basis to see how fast a particular race is compared to other races based on how fast individuals have run ... Hence, the individual runners are the reference for determining the race speed ... the time differences and final times are statistically evaluated to find median course correction values.
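The actual program is more involved, but the core of the cross-comparison can be sketched like this (all names and times below are made up): for two races that share runners, the median of the common runners' time differences estimates how much faster one race was than the other.

    import statistics

    # Hypothetical finish times (in seconds) for two races sharing three runners.
    race_a = {"Runner 1": 930.0, "Runner 2": 940.0, "Runner 3": 952.0, "Runner 4": 965.0}
    race_b = {"Runner 2": 1015.0, "Runner 3": 1020.0, "Runner 4": 1049.0, "Runner 5": 1060.0}

    def relative_speed(first, second):
        # Median time difference (seconds) among runners common to both races;
        # a positive value means the first race was faster than the second.
        common = first.keys() & second.keys()
        return statistics.median(second[name] - first[name] for name in common)

    print(relative_speed(race_a, race_b))  # 75.0 -> race_a was ~75 seconds faster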
In addition, I frequently apply a modified "Reference Runner" methodology as follows (because it is easier and takes less time) ... (a) Use the actual race times to calculate "unadjusted speed ratings" by assuming zero correction ... (b) Get the existing overall speed rating for each individual from the current Rankings (or database) ... (c) Find the difference between the "unadjusted speed rating" and the current overall speed rating ... (d) Find the statistical median of the differences - this is the correction ... (e) Apply the correction to the "unadjusted" ratings ... Here is an example:
                      Unadjusted      Overall Seasonal   Rating       SPEED
             Time     Speed Rating    Speed Rating       Difference   RATING
             -----    ------------    ----------------   ----------   ------
 Runner 1    15:30       210.0              195             +15.0      186.0
 Runner 2    15:40       206.7              183             +23.7      182.7
 Runner 3    15:43       205.7              185             +20.7      181.7
 Runner 4    15:45       205.0              180             +25.0      181.0
 Runner 5    15:50       203.3              171             +32.3      179.3
 Runner 6    15:52       202.7              176             +26.7      178.7
 Runner 7    16:00       200.0              165             +35.0      176.0
 Runner 8    16:05       198.3              173             +25.0      174.3
 Runner 9    16:10       196.7              180             +16.7      172.7
 Runner 10   16:12       196.0              172             +24.0      172.0

For this made-up example, the statistical median of Rating Differences is about 24 speed rating points (or 72 seconds) ... The race correction becomes 72 seconds faster than the standard course, therefore, 72 seconds is added to each actual race time ... and the "adjusted race times" are used to calculate the normal Speed Rating.
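Here is the same calculation as a short Python sketch (illustrative only - the numbers are the made-up ones from the table above):

    import statistics

    def to_seconds(t):
        m, s = t.split(":")
        return int(m) * 60 + float(s)

    # (time, current overall seasonal speed rating) for each runner in the table
    runners = [("15:30", 195), ("15:40", 183), ("15:43", 185), ("15:45", 180),
               ("15:50", 171), ("15:52", 176), ("16:00", 165), ("16:05", 173),
               ("16:10", 180), ("16:12", 172)]

    # Steps (a)-(c): unadjusted ratings (zero correction) and their differences
    # from the current overall seasonal ratings.
    unadjusted = [(1560 - to_seconds(t)) / 3 for t, _ in runners]
    diffs = [u - seasonal for u, (_, seasonal) in zip(unadjusted, runners)]

    # Step (d): the statistical median of the differences is 24.5 here,
    # rounded to "about 24" rating points (72 seconds) as in the text.
    correction = int(statistics.median(diffs))

    # Step (e): apply the correction to get the final speed ratings.
    print([round(u - correction, 1) for u in unadjusted])
    # [186.0, 182.7, 181.7, 181.0, 179.3, 178.7, 176.0, 174.3, 172.7, 172.0]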
This example Reference Runner method assumes the majority of runners are consistent in speed ... this assumption is reasonable after the majority of individuals have had the opportunity to achieve a decent level of fitness ... In the early season, many runners have not achieved an appropriate level of fitness (so this must be taken into consideration).
Additional information on Reference Runners is available in a separate article.
(3) Statistical Sampling of Groups ... This method does NOT use any individual runner data ... Often, it is the only method available to compare races from different sections of NY State or other states ... It requires complete race results (and sometimes that's not enough to adequately compare races) ... It requires statistical assumptions (as will be explained) ... Generally, it involves graphical interpretation of race results.
The Articles Page contains a list of the articles on this web-site, and several articles have examples of graphing races (so I am not going to repeat that here) ... the most recent article has some basic sampling information and examples concerning NTN 2004.
Here is a quick example of Statistical Sampling of Groups ... Consider a large group of high school students from many different schools (small, medium, large ... good, average, bad) ... Results of standardized exams allow educators to identify the "A", "B", "C", "D" and "F" students based on relative scores ... the "C" students are the largest group and can be visually identified on the classic "bell-shaped" graphs ... Results of cross country races are similar - in any large invitational race, the majority of runners fall in the "average" class (or "C" class) ... since the "C" class is large, it can be further divided into "C-plus", "C-average" and "C-minus" classes ... Assuming we already have the results of a large standardized race, then we already know how fast the "C-plus", "C-average" and "C-minus" runners will run.
Therefore, in any large invitational race (similar in quality to the standardized race), if we can identify the "C-plus", "C-average" and "C-minus" classes, then we can calculate with reasonable precision how fast one race is compared to another ... and we do NOT need to know anything about the individual runners or the race courses!
The assumption is that the identified groups of runners are equal in quality ... and the primary focus is identifying the classes ... Direct graphical comparison of races is NOT possible unless that assumption is accurate.
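Numerically, the idea can be sketched as follows (a rough Python illustration with made-up cut-offs and race times): take a middle slice of each field as a stand-in for the "C-average" class and compare the median times of those slices between the standardized race and the new race.

    import statistics

    def c_average_median(times, low=0.40, high=0.60):
        # Median time of the middle band of the field - a crude stand-in for
        # visually picking the "C-average" class off a race graph.
        ordered = sorted(times)
        band = ordered[int(low * len(ordered)) : int(high * len(ordered))]
        return statistics.median(band)

    # Hypothetical complete results (seconds) for a standardized race and a
    # new race assumed to be similar in overall quality.
    standard_race = [955, 960, 968, 975, 983, 990, 998, 1005, 1014, 1025]
    new_race = [935, 942, 950, 955, 962, 971, 978, 987, 996, 1008]

    # Positive correction -> the new race was faster than the standardized race.
    print(c_average_median(standard_race) - c_average_median(new_race))  # 20.0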
Note ... Graphical interpretation of race results is part statistical and part "art" (which means experience is necessary) ... Sometimes the graphs are obvious in their comparisons (those are the examples posted on this web-site) ... but occasionally, the appropriate interpretation is NOT obvious (usually due to an unbalanced or non-homogeneous mix of "A", "B", "C", "D" and "F" runners).
For those viewers who have asked about the computer software I use ... For statistical evaluation, I use a combination of ProStat (Poly Software International), which is also my primary graphing software, and Microsoft Excel ... The actual cross country databases are maintained as WordPerfect Corel Paradox databases ... I have written a series of programs (in Microsoft Visual C++) that perform a variety of operations, such as generating the actual output for placement on web-pages, calculating speed ratings for entire races, and generating output for uploading into ProStat or Excel.