Decathlon statistics


  • olorin
    replied
    Bottom line -
    The sprinter decathlete does indeed exist: decathletes who are good in the 100 are also good in the LJ, 110h and 400. On the other hand, there is a negative relation between the sprint events and the 1,500. So Eaton was the exception rather than the rule.


    The jumper decathlete is largely fake news. There is only a low correlation between the HJ and the LJ and virtually no relation between these two jumps and the PV.

    The relation between the SP and the DT is the strongest of any pair in the decathlon, but there is only a very weak relation between the JT and the other two throws.

    If we combine the results of the first two questions, we can see how sprinters and non-sprinters alike can be competitive in the decathlon. Sprint events give relatively few points for excellence, but being good in one makes a decathlete more likely to be good in three other events. The technical events give a much higher reward on their own, but they are uncorrelated with other events. Thus, both Warner and Kaul can compete for gold even though Warner is almost a second faster than Kaul in the 100.


  • olorin
    replied
    Question 2 - Correlation between decathlon events
    The traditional way is to divide decathletes into three categories: those who are good in the running events, those who are good in the jumping events and those who are good in the throwing events. Some decathletes fit this categorization almost perfectly. Eaton is considered a runner-jumper decathlete, and indeed he was good in 6/7 of those events (all except the HJ), whereas Sebrle and Pappas are more of the thrower-jumper type. But are these categories real? In other words, do decathletes who are good in the HJ also tend to be good in the PV?
    The straightforward method is to examine the correlation between each pair of events. However, the difference between a correlation of 0.3 and one of 0.5 is hard to interpret. So instead, I use the same method as for the first question. That is, I divide each event into ten equal groups (deciles) and then examine the performance of the highest vs. the lowest decile in the other nine events. For example, decathletes in the highest decile in the 100 score 74 more points in the LJ than decathletes in the lowest decile of the 100, suggesting (not surprisingly) a rather strong positive correlation between the two events.
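A minimal sketch of the decile comparison described above, assuming each decathlon is stored as a dict of event name to points. The function name `decile_gap`, the data layout and the sample numbers are all made up for illustration; they are not the data or code behind the post.

```python
from statistics import mean


def decile_gap(results, sort_event, target_event):
    """Mean target_event points of the top decile in sort_event
    minus the mean for the bottom decile in sort_event."""
    ranked = sorted(results, key=lambda r: r[sort_event])
    size = len(ranked) // 10  # number of decathlons per decile
    if size == 0:
        raise ValueError("need at least 10 decathlons")
    bottom = ranked[:size]
    top = ranked[-size:]
    return mean(r[target_event] for r in top) - mean(r[target_event] for r in bottom)


# Invented points, purely illustrative (20 decathlons, two events each).
sample = [{"100": 800 + 10 * i, "LJ": 850 + 5 * i} for i in range(20)]
gap = decile_gap(sample, "100", "LJ")
print(gap)
```

A large positive gap would point to a positive relation between the two events, a gap near zero to no relation, and a negative gap to a trade-off.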


  • olorin
    replied
    Originally posted by polevaultpower
    Olorin if you get bored after Tokyo, we will be having our Women's Decathlon Association Championship on August 21-22. Forecasting for our event has always been a challenge, with so few decathlons offered and so many athletes trying events for the first time; now we can add extra time off due to COVID to the mix!
    Thanks for the compliment, but I am afraid that you overestimate my abilities (and/or underestimate the difficulty of the task at hand).
    I have followed men's decathlon closely for over a decade and still can't come up with a good way to predict scores, so doing this for women with all the problems that you mentioned is way beyond my capabilities.
    Regardless, good luck with the event and hopefully Jordan Grey (someone else?) will break the AR.
    I would also like to ask other posters to avoid the never-ending debate about women's decathlon and keep this thread for its original purpose.


  • polevaultpower
    replied
    Olorin if you get bored after Tokyo, we will be having our Women's Decathlon Association Championship on August 21-22. Forecasting for our event has always been a challenge, with so few decathlons offered and so many athletes trying events for the first time; now we can add extra time off due to COVID to the mix!


  • olorin
    replied
    Originally posted by cigar95

    If the technical events *didn't* show more variability in their scores, we might suspect an issue with the tables themselves.

    I'm wondering if this might be an issue with the tables for the high hurdles. Whacking a couple of hurdles can be really costly in terms of performance - you'd think that would be reflected in variability of the scores, and at least keep the hurdles away from the bottom rungs of the ladder. One possibility is that the hurdles are an event where decathletes are generally going to be pretty good, so these bad mistakes aren't as frequent. Another possibility is that when they make really *bad* mistakes, the score suffers badly enough that they don't score over 8200 and don't make it into the database. Certainly this would be the case for a fall and dnf.
    Good post.

    Re 110h -
    I agree that there is a problem in the fact that I included only good performances (above 8200), so we don't see the entire distribution. This bias especially affects the 110h, as disasters are more frequent there than in most other events in the decathlon. Unfortunately, I can't see a solution to the problem, as collecting the data of all decathlons is virtually impossible. However, I don't think that this bias is the entire story, as there are other events with a high probability of a really bad outcome (e.g. LJ, PV).

    I think that the low scoring in the 110h comes from two related aspects.
    1. I think that there is a problem with the scoring tables. An improvement of 0.1 in the 100 is worth almost twice as much as a similar improvement in the 110h (25 vs 13 points). While I agree that it is harder to drop 0.1 from your time in the 100, I think that the reward for improvement in the 110h is too "stingy".
    2. The "disaster" effect - because of the risk of falling, decathletes typically use a more conservative approach to hurdling. For example, Eaton typically ran 13.6 - 13.8 during a decathlon, while in reality he was 0.3 seconds faster. The low reward for an improvement in the 110h (see point 1) further pushes decathletes towards a more conservative approach, rather than mastering the art of tight hurdling at high speed. As a result, the decathletes in the "good" group are not as good as they are supposed to be based on their potential abilities.
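To put rough numbers on point 1, here is a small sketch of the track-event scoring formula, points = floor(A * (B - T)^C). The A, B, C parameters below are the ones I know from the standard combined-events tables, not something stated in this thread, so treat them as an assumption; the function name and the sample times are mine.

```python
import math

# (A, B, C) per event for points = floor(A * (B - seconds) ** C).
# Assumed parameters from the standard men's combined-events tables.
TRACK_PARAMS = {
    "100": (25.4347, 18.0, 1.81),
    "110h": (5.74352, 28.5, 1.92),
}


def track_points(event, seconds):
    """Decathlon points for a running time in seconds."""
    a, b, c = TRACK_PARAMS[event]
    return math.floor(a * (b - seconds) ** c)


# Points gained by running 0.1 s faster at typical decathlon times.
gain_100 = track_points("100", 10.4) - track_points("100", 10.5)
gain_110h = track_points("110h", 13.9) - track_points("110h", 14.0)
print(gain_100, gain_110h)
```

Under these parameters the 100 gain comes out at roughly twice the 110h gain, in line with the 25 vs 13 figure above; the exact values shift a little with the starting time.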

    Unfortunately, I cannot think of a way to test this with my data, so I will sign off.
    Last edited by olorin; 07-21-2021, 05:09 AM.


  • Trickstat
    replied
    Originally posted by noone
    Very interesting. Intuitively, I would have pretty much guessed the order correctly. Before the 400, I tell myself “who cares, they’ll all run between 48 and 50 seconds”.
    So the last four events are the most important ones! This explains the Niklaus Kaul effect!
    Another reason why the 400 doesn't tend to lead to many changes in order is that those who score highly in it tend to have a strong first 4 events anyway.


  • Trickstat
    replied
    Originally posted by noone
    Kevin Mayer ranked the events according to "danger", the risk of getting 0 points. I believe the order was PV, 110, LJ, DT, JT, HJ, SP, the flat runs. I would guess there have been countless 0's in the 1500 but due to athletes being out of contention
    I think in most cases, they didn't start the 1500 so they would be DNFs for the whole decathlon.


  • noone
    replied
    Kevin Mayer ranked the events according to "danger", the risk of getting 0 points. I believe the order was PV, 110, LJ, DT, JT, HJ, SP, the flat runs. I would guess there have been countless 0's in the 1500 but due to athletes being out of contention

  • cigar95
    replied
    Originally posted by olorin

    The results are driven not only by the poor group being extra poor in the technical events but also by higher rewards for being good.
    For example, when we calculate the difference between the best group (highest decile) and the median (so there is no effect of weak performances), the PV and the JT still have the highest spread (146 & 137). Most other events have a spread of ~100 points, and the 400 & 110h only ~80 points.
    I didn't fully think through my observation, nor phrase it very well. Let me try again, while acknowledging there will be some degree of over-generalization just to keep from typing several pages. Perhaps revising my remarks to the point of becoming Captain Obvious - though I hope not.

    Performance in each of the 10 events depends critically on both the innate ability and the fitness of the competitor. But for the jav and the vault, the mastering of the technical skills and executing them properly each time are also paramount. There are not as many "fine-grain" technical skills needed to optimally run the 400 - with apologies to Michael Johnson, who ran it quite optimally. So the technical events add a third degree of variability in each performance, which is something that will contribute at both the high *and* the low end.

    If the technical events *didn't* show more variability in their scores, we might suspect an issue with the tables themselves.

    I'm wondering if this might be an issue with the tables for the high hurdles. Whacking a couple of hurdles can be really costly in terms of performance - you'd think that would be reflected in variability of the scores, and at least keep the hurdles away from the bottom rungs of the ladder. One possibility is that the hurdles are an event where decathletes are generally going to be pretty good, so these bad mistakes aren't as frequent. Another possibility is that when they make really *bad* mistakes, the score suffers badly enough that they don't score over 8200 and don't make it into the database. Certainly this would be the case for a fall and dnf.


  • IloveFelix
    replied
    I found this very interesting, Olorin!

    Thanks a lot!


  • olorin
    replied
    Originally posted by cigar95
    My sense is that with the technical events, it's "easier to be bad". To "werf the speer" or "spung the stab hoch", you have to do basic things right every single time. Whereas once you're at some level of fitness and with your own natural abilities - and all of the top guys are very fit - you're going to run more or less the time for 400 that everyone expects you to run.
    I'm a little surprised that the 110h isn't higher because it's easy to make mistakes in the hurdles.
    The results are driven not only by the poor group being extra poor in the technical events but also by higher rewards for being good.
    For example, when we calculate the difference between the best group (highest decile) and the median (so there is no effect of weak performances), the PV and the JT still have the highest spread (146 & 137). Most other events have a spread of ~100 points, and the 400 & 110h only ~80 points.
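The top-decile-versus-median spread can be sketched like this, assuming a plain list of points for one event. The function name `top_decile_spread` and the sample lists are invented for illustration and are not the actual 8200+ database.

```python
from statistics import mean, median


def top_decile_spread(points):
    """Mean of the best 10% of scores minus the median score,
    so weak performances do not influence the result."""
    ordered = sorted(points)
    size = len(ordered) // 10  # number of scores in the top decile
    if size == 0:
        raise ValueError("need at least 10 scores")
    return mean(ordered[-size:]) - median(ordered)


# Invented points: a widely spread event and a tightly packed one.
sample_pv = [700 + 15 * i for i in range(20)]
sample_400 = [850 + 6 * i for i in range(20)]
print(top_decile_spread(sample_pv), top_decile_spread(sample_400))
```

Because only the upper half of the distribution enters the calculation, a larger spread here reflects a bigger reward for being good rather than a bigger penalty for being bad.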
    Last edited by olorin; 07-20-2021, 07:59 AM.


  • cigar95
    replied
    My sense is that with the technical events, it's "easier to be bad". To "werf the speer" or "spung the stab hoch", you have to do basic things right every single time. Whereas once you're at some level of fitness and with your own natural abilities - and all of the top guys are very fit - you're going to run more or less the time for 400 that everyone expects you to run.
    I'm a little surprised that the 110h isn't higher because it's easy to make mistakes in the hurdles.


  • olorin
    replied
    Originally posted by noone
    Very interesting. Intuitively, I would have pretty much guessed the order correctly. Before the 400, I tell myself “who cares, they’ll all run between 48 and 50 seconds”.
    So the last four events are the most important ones! This explains the Niklaus Kaul effect!
    If we calculate each performance in the decathlon relative to the average performance of all decathletes, we have a good measure of the quality of the performance.
    For example, Warner's recent 10.14 is worth 1062 points.
    The average of all decathletes (above 8200) in the 100 is 888 points.
    This suggests Warner's mark is worth 174 points more than the average decathlete's.

    Kaul (2019) gained 69, 35, 255 and 130 points compared to the average decathlete in the last four events.
    His gain in the JT (255*) is the second-best mark, behind Tim Bright's 5.70 PV in Seoul (where he scored 8216).
    * Kaul's mark was 79.05 and he got 1028 points; the average score of all decathletes is 773, leading to a 255-point gain.
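The arithmetic above can be reproduced with a short sketch. The averages (888 and 773) are the ones quoted in the post; the scoring parameters are the standard men's table values as I know them (an assumption, not something given in this thread), and the function names are mine.

```python
import math


def points_100(seconds):
    """100 m points: floor(A * (B - T) ** C), assumed standard parameters."""
    return math.floor(25.4347 * (18.0 - seconds) ** 1.81)


def points_jt(metres):
    """Javelin points: floor(A * (D - B) ** C), assumed standard parameters."""
    return math.floor(10.14 * (metres - 7.0) ** 1.08)


# Average points of all 8200+ decathlons, as quoted in the post.
AVERAGE = {"100": 888, "JT": 773}

warner_gain = points_100(10.14) - AVERAGE["100"]
kaul_gain = points_jt(79.05) - AVERAGE["JT"]
print(warner_gain, kaul_gain)
```

With these parameters the two gains should match the 174 and 255 points given above.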
    Last edited by olorin; 07-20-2021, 03:22 AM.


  • deca
    replied
    That is a very cool analysis. Never looked at it that way. Almost makes it worth playing around with an entirely different training paradigm! As a Master's athlete, I could totally get into training for events in that order of priority as a novel approach.


  • olorin
    replied
    Originally posted by bambam1729

    What about the shot put?
    Thanks Bambam,
    Added (190)
