    HJ and JT is a combo I'm not particularly aware of. However, 1984 Olympic women's JT champ Tessa Sanderson was a 6000+ heptathlete and actually first competed at the English Schools' Championships in the high jump. Her second best heptathlon event, though, was probably the 100H, where she came second in the English champs in 1981. In fact, I have heard people talk of a correlation between hurdles and javelin, but I think it is more the case that good hurdlers are quite often decent javelin throwers rather than the other way around. I know Colin Jackson was nationally ranked in his age group as a javelin thrower in his early to mid teens.
      One of the questions I plan to ask is whether decathletes have changed through the years.
      The problem is that the number of observations before 1984 is relatively small (roughly 100).
      I'm trying to increase the number by collecting the data for lower marks (the ultimate goal is 8000). If anyone has the data and doesn't mind sharing it, then I promise a quick answer.
      I assume you've already scraped the results from the top lists at the page? They appear to be missing many performances, however. They have 1638 performances over 8000, but there have been about 2500.


        So, you had a good start – so what?

        The last question before the Olympics (T&F) start concerns the importance of a good (or bad) start to the overall decathlon score. In other words, can we predict the results of the other events based on the score in the 100?
        To examine this question, I look at the performances of all decathletes who had a PB of 8,200 or above. For these decathletes I compare the result in the 100 with the result in the 100 in the decathlon where they set their PB (as it stood at the time). Then I compare their performance in the next eight events with their performance in those events when they set their decathlon PB. Because of data availability I only have performances above 8,000 points.

        The bottom line
        The performance in the 100 is correlated with the performance in the next eight events.
        The relation is much stronger for bad scores than for good scores. A really bad score in the 100 on average suggests a decrease of ~150 points in the next eight events compared with an average one, whereas a good performance suggests an increase of ~90 points.
        The relation is stronger for the other three sprint events (LJ, 400, 110H) than for the other five events. While wind may be a possible reason behind this result, the relation between the 100 and the 400 is very strong.
        There is no relation between the performance in the 100 and the performance in the PV.
        Finally, the relation between the 100 and the other day 1 events is stronger than with the day 2 events.
          Technical notes

          As with the previous questions, I concentrated only on the performances of decathletes with a PB of 8,200. That is, I include only performances where the decathlete had a PB of 8,200+ at the start of the decathlon. Due to data limitations (the data come from the decathlon2000 site), only decathlons with a final score of at least 8,000 points are included. Additionally, many of the old-timers are not included. The total number of performances that I analyze is ~1,000.
          For each of these I compared two performances: the performance in the current decathlon vs. the performance in the decathlon where the athlete set his decathlon PB (as of the time of the performance). I first calculated the difference in points in the 100. The average difference is -24 points, which suggests (not surprisingly) that decathletes typically run about 0.1 s slower than they did when they set their decathlon PB. Then I calculated the difference for the next eight events (ignoring the 1,500) and examined the relation between the two.
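
As a rough check on the link between points and time in the 100, the score can be computed with the standard decathlon track formula, points = A * (B - T)^C. The constants below (A = 25.4347, B = 18, C = 1.81) are what I understand the current scoring tables use for the 100; treat them as an assumption and verify them against the official tables. The two times are example values only, not taken from the data.

```python
import math

# Assumed 100m constants for the decathlon scoring formula
# points = A * (B - T) ** C, truncated to a whole number.
A, B, C = 25.4347, 18.0, 1.81


def points_100(seconds: float) -> int:
    """Decathlon points for a 100m time in seconds (0 if not faster than B)."""
    if seconds >= B:
        return 0
    return math.floor(A * (B - seconds) ** C)


# Example times only: what does a tenth of a second cost around 10.8 s?
loss = points_100(10.80) - points_100(10.90)
print(points_100(10.80), points_100(10.90), loss)
```

Under these assumptions a tenth of a second around 10.8 is worth a little over 20 points, which is in line with reading the -24 average as roughly 0.1 s.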

          To illustrate what I have done, let's look at a hypothetical decathlete, Kevin, who has five performances above 8,000. Each of these performances is divided into three parts: the score in the 100, the score in the next eight events, and the score in the 1,500. The following table presents these performances in chronological order.
          #   Total score   Score 100   Score 2-9   Score 1500
          1   8,100           950       6,400       750
          2   8,250         1,000       6,470       780
          3   8,050         1,050       6,250       750
          4   8,400         1,030       6,550       820
          5   8,150         1,045       6,400       705
          Kevin's first two performances are ignored, because at the start of each of those decathlons he did not yet have a PB of 8,200. After the second decathlon, where Kevin scored above 8,200, he enters the analysis and his performance in decathlon 2 becomes the benchmark. In decathlon 3 Kevin scored 1,050 points in the 100 compared with 1,000 in his benchmark, hence his relative mark is 50. The result in the next eight events is 220 points lower (6,250 vs. 6,470). I divided this number by 8 to get the average relative performance in these eight events (from LJ to JT), so the score is -27.5. The following table presents the calculation for Kevin's decathlon performances.
          #   Relative 100   Relative ave. 2-9
          1   N/A            N/A
          2   N/A            N/A
          3   50             -27.5
          4   30             10
          5   15             -18.75
          Note that after performance 4, where Kevin broke his decathlon PB, that performance becomes the new benchmark that performance 5 is compared to.
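
The Kevin example can be reproduced with a short script. This is only a sketch of the bookkeeping described above, not the code used for the actual analysis; the function name `relative_marks` and the tuple layout are made up for the illustration.

```python
# Kevin's five hypothetical decathlons from the table above:
# (total, score in the 100, score in events 2-9, score in the 1,500)
kevin = [
    (8100, 950, 6400, 750),
    (8250, 1000, 6470, 780),
    (8050, 1050, 6250, 750),
    (8400, 1030, 6550, 820),
    (8150, 1045, 6400, 705),
]

PB_THRESHOLD = 8200  # an athlete enters the analysis once his PB reaches this


def relative_marks(performances, threshold=PB_THRESHOLD):
    """Return (relative 100, relative average 2-9) for each decathlon, or
    None when the athlete had no PB of `threshold`+ at the start of it."""
    results = []
    benchmark = None  # the PB decathlon as of the start of each performance
    for perf in performances:
        total, s100, s2_9, _s1500 = perf  # the 1,500 is ignored
        if benchmark is not None and benchmark[0] >= threshold:
            rel_100 = s100 - benchmark[1]
            rel_avg = (s2_9 - benchmark[2]) / 8  # average over eight events
            results.append((rel_100, rel_avg))
        else:
            results.append(None)
        # A new PB becomes the benchmark for the decathlons that follow.
        if benchmark is None or total > benchmark[0]:
            benchmark = perf
    return results


for number, marks in enumerate(relative_marks(kevin), start=1):
    print(number, marks)
```

The benchmark is updated only after the relative marks are computed, which is why decathlon 4 is still compared with decathlon 2 and only decathlon 5 is compared with decathlon 4.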

          Lastly, I ignore the performance in the 1,500. The reason is that the performance in the 1,500 is often related to the decathlete's standing in the competition rather than the shape he is in. In other words, decathletes will be willing to punish themselves more when chasing a medal than when they are not in contention.

            As in previous analyses, I divided the relative 100 into equal groups, only this time I used five (quintiles), as the more interesting results (IMO) are in the middle three quintiles rather than the two extreme ones. Then I calculated the average relative performance in events 2-9 (LJ through JT). In the next two columns I split this performance into the three sprint events (LJ, 400, 110H) and the other five.
            Group   Relative 100   Relative 2-9   Rel sprint   Rel other
            1       Below -56      -33            -55          -21
            2       -56 to -32     -21            -30          -16
            3       -30 to -11     -16            -21          -13
            4       -10 to 10      -10            -12          -10
            5       Above 10       -4             -4           -5
            4-2                    11             17           6
            The first column is the group number (1 is the worst). The second column is the relative performance in the 100 (see technical notes). The next three columns are the important ones. Column three is the average relative result in the next eight events (LJ-JT). The final two columns split these eight events into the sprint events (LJ, 400, 110H) and the other five events (SP, HJ, DT, PV, JT). The last row is the difference between groups 4 and 2.
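
For anyone who wants to repeat the grouping on their own data, here is a minimal sketch of the quintile step. It assumes the input is a list of (relative 100, relative 2-9) pairs; the function name `quintile_averages` and the sample pairs are made up purely to show the mechanics and are NOT real results.

```python
from statistics import mean


def quintile_averages(pairs, groups=5):
    """Sort (relative 100, relative 2-9) pairs by the relative 100, cut them
    into `groups` nearly equal groups (1 = worst), and average each group."""
    ordered = sorted(pairs, key=lambda p: p[0])
    n = len(ordered)
    table = []
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer observations than groups
        table.append({
            "group": g + 1,
            "rel_100_min": chunk[0][0],
            "rel_100_max": chunk[-1][0],
            "rel_2_9": mean(p[1] for p in chunk),
        })
    return table


# Made-up pairs, only to show the mechanics; NOT real results.
sample = [(-80, -40), (-60, -30), (-45, -25), (-35, -15), (-25, -20),
          (-15, -10), (-5, -12), (5, -8), (15, -6), (40, 0)]
for row in quintile_averages(sample):
    print(row)
```

The same function can be run with the sprint-only or other-events averages as the second element of each pair to fill the last two columns.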

            When I started the test, I expected that the two extreme quintiles would show the effect that I found. I have to say I am more surprised that there is a relatively large gap between the middle quintiles (see last row). Note that the numbers are per-event averages, so a difference of 11 points is actually 88 points over the next eight events.


              I assume you've already scraped the results from the top lists at the page? They appear to be missing many performances, however. They have 1638 performances over 8000, but there have been about 2500.
              And the results they miss are typically old ones.
              There are three sites that I use:
              The most comprehensive one is decathlon2000, but it is also the least friendly to extract data from.
              All-time athletics is (as always) wonderful. Of all 8,200+ scores there are only four or five results they miss. The problem is that the site only has the overall results and not the individual events. When I finish updating (probably in 2023) I will offer my data to Peter Larsson.
              World Athletics has so many omissions (and mistakes) that it is really pathetic. To illustrate, between 8,100 and 8,140 there are 288 performances, of which only 164 (less than 60%) are in World Athletics.


                Finally, a comparison between the performance in the 100 and the two technical events, PV and JT. As in the previous table, the relative performance in the 100 is divided into five groups, and then I calculate the average relative performance in these two events.
                Group   PV    JT
                1       -8    -29
                2       -14   -21
                3       -15   -23
                4       -12   -11
                5       -9    -3
                The only explanation I can come up with is that decathletes treat the PV as a different animal and their success in it is more about technique than their overall abilities. However, I would then expect the JT to behave much the same. Any other ideas are welcome.