When Did US Fencing Begin Using Ratings?

Discussion in 'Fencing Discussion' started by sdubinsky, Mar 2, 2015.

  1. MrTwister

    MrTwister Made the Cut

    Joined:
    Apr 1, 2015
    Messages:
    146
    Likes Received:
    0
    I agree with this statement and would even generalize it. The best N fencers (based on their fencing history) should not start in the same pool. N could be some small number like 8 or 16, provided the number of pools is at least twice N.

    Here is a proposed metric for estimating how evenly matched the pools are. Let's assume that the current relative strength of the fencers is reflected in the final results of the tournament, and disregard all previous history. Take the number of pools (P) and assign rank 1 to all fencers who finished in places 1 to P, rank 2 to those who finished in places P+1 to 2*P, and so on to the end of the list. The total number of ranks then equals the maximum number of fencers in a pool.
    Now, for any seeding system, calculate the average rank of the fencers in each pool and take the standard deviation of these averages.
    An ideal seeding system (one that could predict the end result perfectly) should give a standard deviation of these pool averages close to 0.
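
    The metric described above can be sketched in a few lines of Python. This is only an illustration: the function name, the fencer labels, and the final places are made up, and the population standard deviation is an arbitrary choice between the two usual definitions.

```python
from statistics import mean, pstdev

def pool_balance(final_places, pools):
    """final_places: dict mapping fencer -> final place (1 = winner).
    pools: list of pools, each a list of fencers, as seeded.
    Returns the standard deviation of the pools' average ranks."""
    num_pools = len(pools)
    # Places 1..P get rank 1, places P+1..2P get rank 2, and so on.
    rank = {f: (place - 1) // num_pools + 1 for f, place in final_places.items()}
    pool_averages = [mean(rank[f] for f in pool) for pool in pools]
    return pstdev(pool_averages)

# Hypothetical 6-fencer event with 2 pools; names and places are invented.
places = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
balanced = [["A", "D", "E"], ["B", "C", "F"]]
lopsided = [["A", "B", "C"], ["D", "E", "F"]]

print(pool_balance(places, balanced))  # both pools average rank 2
print(pool_balance(places, lopsided))  # the pool averages differ
```

    A lower value means the seeding spread the eventual top finishers more evenly across pools.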

    Of course, the current seeding system has an advantage over any other system because it is the one actually used in tournaments, and the initial seeding obviously affects the results.

    I will check whether my Elo ratings have any edge over the current system and will publish the results here.
     
  2. MrTwister

    MrTwister Made the Cut

    Joined:
    Apr 1, 2015
    Messages:
    146
    Likes Received:
    0
    I have run some numbers and have to honestly report that my calculated Elo ratings don't predict final placement better than the current letter seedings.
    They do especially badly when there are a lot of fencers with provisional ratings. But even if I exclude the tournaments where U-rated fencers make up about 40% of participants, the Elo seedings still give a slightly greater standard deviation of pool strength.

    I suspect that, as I said above, the current system has the advantage of affecting the final results, and we don't know what would have happened if Elo had been used for seeding. I guess we will not know unless another system gets adopted.
    Another factor that could affect these results is that my Elo calculations are not tuned yet: there are no provisional ratings, the coefficients I use are not tuned, and the probabilities for pool and DE bouts are treated as the same even though they are not.
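
    For readers unfamiliar with Elo, the textbook expected-score and update formulas look roughly like this. This is a generic sketch, not the calculation used for the numbers above: the function names, the 400-point scale, and K = 32 are conventional defaults assumed here for illustration, and pool and DE bouts are treated identically.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, a_won, k=32):
    """Return both ratings after one bout. k is the tunable coefficient;
    32 is only a common default, not a value taken from this thread."""
    e = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - e)
    return rating_a + delta, rating_b - delta

print(expected_score(1500, 1500))  # evenly matched fencers
print(update(1500, 1500, True))    # the winner gains what the loser drops
```

    Tuning would mean choosing k (and possibly a different k or scale for pool bouts versus DE bouts) so that the predicted probabilities match observed results.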

    What do you guys think about the metric I used?
     
  3. Privateer

    Privateer DE Bracket

    Joined:
    Feb 8, 2009
    Messages:
    951
    Likes Received:
    106
    Those are valid points, thank you. I'm applying my experience to all the rest of us unwashed masses, and my experience may not be everyone's. I'm comparing ~30 E/U's at an E-under to the ~10 that show for an Open, which may not be accurate.
     
  4. mfp

    mfp Podium

    Joined:
    May 10, 2002
    Messages:
    1,828
    Likes Received:
    221
    I'd suggest reading the paper below and thinking about what your metric is really measuring, or at least what affects it:


    The paper explores a small DE of only 8 competitors. Consider what happens in a DE format when you have more than 8 competitors, perhaps many more like at a NAC. Or consider what happens with different distributions of players' strengths. Did the numbers you ran come from tournaments of various sizes and with various competitor strengths?

    Basically, the metric you used has the same problem the metric K'ON used has: its first assumption.
     
  5. MrTwister

    MrTwister Made the Cut

    Joined:
    Apr 1, 2015
    Messages:
    146
    Likes Received:
    0
    Interesting paper! But what does it have to do with comparing ranking systems? It looks at the case where the strengths of all participants are known ahead of time and only one participant's strength is varied, for example by adding a stronger player to a team. It also asserts that single-elimination tournaments are not entirely fair.
    Our case is totally different. We are seeding people into pools and trying to make the pools as even as possible, so as to get a "fair" seeding after pools. The problem is how to estimate the strength of fencers before pools. Is the current estimator the best we can have, or could we find a better one?

    The main problem I see in comparing any metrics or ranking systems is the fairness of the comparison. We can't go back to already completed tournaments and rerun them with seeding done by a different system, and it's problematic even to change it for future tournaments.

    In my calculations I used all the epee tournaments with results on askfred since 2006. When I noticed that events with a large number of provisional ratings ("U") have a very large deviation in results, I tried excluding those.

    Perhaps a narrower metric could be used, like "make sure that the top 10% of fencers based on final results were not seeded into the same pools".
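
    One possible reading of that narrower metric is to count the pools that ended up with more than one eventual top-10% finisher. The sketch below assumes that reading; the function name and the sample event (fencers numbered by their final place) are invented for illustration.

```python
def crowded_pools(final_places, pools, top_percent=10):
    """Count pools containing more than one fencer from the top
    `top_percent` of the final results (at least one fencer is 'top')."""
    top_n = max(1, len(final_places) * top_percent // 100)
    top = {f for f, place in final_places.items() if place <= top_n}
    counts = [sum(1 for f in pool if f in top) for pool in pools]
    return sum(1 for c in counts if c > 1)

# Hypothetical 20-fencer event: fencer i finished in place i.
final_places = {i: i for i in range(1, 21)}
separate = [[1, 5, 9, 13, 17], [2, 6, 10, 14, 18],
            [3, 7, 11, 15, 19], [4, 8, 12, 16, 20]]
together = [[1, 2, 9, 13, 17], [5, 6, 10, 14, 18],
            [3, 7, 11, 15, 19], [4, 8, 12, 16, 20]]

print(crowded_pools(final_places, separate))
print(crowded_pools(final_places, together))
```

    If the top group is larger than the number of pools, some crowding is unavoidable, so the count is only meaningful relative to that floor.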
     
  6. mfp

    mfp Podium

    Joined:
    May 10, 2002
    Messages:
    1,828
    Likes Received:
    221
    It has to do with the assumption upon which your metric is based which is:

    The metric falls apart there with the first assumption. The final results of DE tournaments are affected by many factors, some stated in the paper, others hinted at.

    DE format tournaments aren't very effective at ordering competitors by relative strength beyond a small number of places. And many might be dismayed at how the DE format does for even those top places.

    Yes, that metric is an unfair comparison as it starts by assuming something that's false.
     
  7. MrTwister

    MrTwister Made the Cut

    Joined:
    Apr 1, 2015
    Messages:
    146
    Likes Received:
    0

    You are right if you are looking at just one tournament. It's a probability after all, meaning that the final placement is not predetermined by strength alone. But we are not talking about a single tournament. It's obvious that stronger fencers have a higher probability of winning when we consider a single bout. The paper shows that the strongest fencer doesn't have the highest probability of winning if fencers are distributed randomly. But that holds only when the difference in strength between the first and second fencers is not large, and only for one tournament. If you run the tournament with the same fencers over and over again using random seeding, you will find that the strongest fencer comes out on top more often than anyone else.
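
    The "run it over and over" argument can be tried out with a small simulation. This is a sketch under assumptions that are not from the thread: a single-elimination bracket of 8 drawn at random each time, made-up strength values, and a Bradley-Terry style bout model in which a fencer of strength s beats one of strength t with probability s / (s + t).

```python
import random
from collections import Counter

def play_bracket(strengths, rng):
    """One single-elimination event with a random draw.
    The field size must be a power of two. Returns the winner's index."""
    alive = list(range(len(strengths)))
    rng.shuffle(alive)  # random seeding
    while len(alive) > 1:
        next_round = []
        for a, b in zip(alive[::2], alive[1::2]):
            p_a = strengths[a] / (strengths[a] + strengths[b])
            next_round.append(a if rng.random() < p_a else b)
        alive = next_round
    return alive[0]

def win_shares(strengths, trials=20000, seed=0):
    """Fraction of simulated events won by each fencer."""
    rng = random.Random(seed)
    wins = Counter(play_bracket(strengths, rng) for _ in range(trials))
    return [wins[i] / trials for i in range(len(strengths))]

strengths = [10, 8, 6, 5, 4, 3, 2, 1]  # invented values, strongest first
shares = win_shares(strengths)
print(shares)
```

    Changing the strength values, in particular narrowing the gap between the top two, shows how sensitive the result is to the assumed distribution.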

    I believe that my assumption is not false if you consider all the tournaments and not just one. Of course you can make it false if you pair #1 and #2 in the first round of DE every time, but that's not the case in the current tournament format.

    I probably went too far in suggesting that "relative strength of the (all) fencers is reflected in final results of the tournament", which is why I'm revising it as
     
  8. MrTwister

    MrTwister Made the Cut

    Joined:
    Apr 1, 2015
    Messages:
    146
    Likes Received:
    0
    I've got an interesting chart I want to share with you all.
    It shows the probability of winning 5- and 15-touch bouts as a function of the probability of winning a single touch.
    Take a look.

    Fencing Probabilities.jpg
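
    A curve of this kind can be computed directly. The sketch below is not necessarily how the chart was produced; it assumes every touch is independent, is won with the same probability p, and that there are no double touches or time limits, so a bout is simply a race to 5 or 15 touches.

```python
from math import comb  # requires Python 3.8+

def bout_win_probability(p, touches):
    """Probability of reaching `touches` first when each touch is won
    independently with probability p (the opponent scores k < touches)."""
    q = 1.0 - p
    return sum(comb(touches - 1 + k, k) * p ** touches * q ** k
               for k in range(touches))

for p in (0.50, 0.55, 0.60):
    print(p,
          round(bout_win_probability(p, 5), 3),
          round(bout_win_probability(p, 15), 3))
```

    Under these assumptions a small per-touch edge is amplified more in a 15-touch bout than in a 5-touch bout.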
     
