When Did US Fencing Begin Using Ratings?

Discussion in 'Fencing Discussion' started by sdubinsky, Mar 2, 2015.

  1. fdad

    fdad Podium

    Joined:
    Jul 9, 2008
    Messages:
    4,129
    Likes Received:
    214
    It's easy enough to create an algorithm that treats avoiding a competition as slightly worse than finishing last.
     
  2. DangerMouse

    DangerMouse Podium

    Joined:
    Mar 13, 2003
    Messages:
    3,184
    Likes Received:
    196
    How do you distinguish between avoiding competition and choosing a limited number of competitions as part of a comprehensive training plan?
     
  3. mrbiggs

    mrbiggs Podium

    Joined:
    Mar 25, 2004
    Messages:
    7,837
    Likes Received:
    477
    Literally anything?

    To start:
    Seeding youth events, veterans events, classification-restricted events, large events, strong events, almost any significant men's epee event.

    Poor at comparing fencers across regions, inconsistent fencers, fencers who have taken time off, fencers who are fencing a secondary weapon, fencers who primarily fence at university, and fencers who primarily fence out of the country.

    Pretty much completely useless for seeding any kind of NAC.

    Mostly useless for qualifications and partitioning fencers into Div1, Div2, Div3.


    Some of the above is hard to solve (like people who have taken time off). Some seem to be obvious signs that the current system is a half step above useless. (Shouldn't a reasonable Div1/2/3 be a prerequisite of any classification system? Shouldn't we have good seeding for youth fencing considering its popularity and revenue? Why are inconsistent fencers rewarded with a high-water-mark system?)

    Don't get me wrong, this isn't priority #1 for US fencing. Initial seeding isn't THAT important, and our terrible system at least does a better job than no system. But I find it really bizarre that such a deeply flawed system is so staunchly defended by fencers.
     
    Last edited: Mar 14, 2015
  4. fdad

    fdad Podium

    Joined:
    Jul 9, 2008
    Messages:
    4,129
    Likes Received:
    214
    There are a number of ways, including using a decay factor (or only the top N recent results) that maintains most of the value of a "good" result for a "reasonable" amount of time while minimizing the impact of a "bad" result.
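A minimal sketch of the decay / top-N idea, assuming each result has already been converted to a non-negative point value (higher is better). The function name, the one-year half-life, and the choice of N are illustrative assumptions, not part of any existing USFA system.

```python
from datetime import date

def decayed_top_n_score(results, today, half_life_days=365.0, top_n=3):
    """Sum the top_n time-decayed results.

    results: list of (date, points), points >= 0, higher is better.
    half_life_days and top_n are hypothetical tuning knobs.
    """
    weighted = []
    for when, points in results:
        age_days = (today - when).days
        # A result loses half its value every half_life_days.
        weight = 0.5 ** (age_days / half_life_days)
        weighted.append(points * weight)
    # Keeping only the best N means one bad day cannot pull the score down.
    return sum(sorted(weighted, reverse=True)[:top_n])

today = date(2015, 3, 1)
good_only = [(date(2015, 3, 1), 100), (date(2014, 3, 1), 100)]
with_bad_day = good_only + [(date(2015, 2, 1), 10)]
print(decayed_top_n_score(good_only, today, top_n=2))     # 150.0
print(decayed_top_n_score(with_bad_day, today, top_n=2))  # 150.0
```

Because the points are non-negative and only the best N count, entering an event can never lower the score, while an old result gradually fades.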
     
  5. DangerMouse

    DangerMouse Podium

    Joined:
    Mar 13, 2003
    Messages:
    3,184
    Likes Received:
    196
    Let me rephrase K O'N's question. How would you define and measure the goals of a seeding system? There have been lots of discussions about this in the past and while there are a lot of people arguing that the current system doesn't work well, they can't clearly define the goals of a seeding system or how to measure the success of a system at meeting those goals.
     
  6. fdad

    fdad Podium

    Joined:
    Jul 9, 2008
    Messages:
    4,129
    Likes Received:
    214
    IMO, the goal of seeding is to evenly balance the strength of the fencers in the various pools. Without putting too much effort into the definition, I would state the goal of a seeding system as: Fencer A should be seeded above the set of fencers against whom, given a reasonable number of (recent) bouts between them and against common opponents, fencer A has won more bouts than lost. Conversely, fencer A should be seeded below the set of fencers against whom, given a reasonable number of recent bouts between them and against common opponents, fencer A has lost more bouts than won.
     
  7. DangerMouse

    DangerMouse Podium

    Joined:
    Mar 13, 2003
    Messages:
    3,184
    Likes Received:
    196
    The problem with this is the common opponents. What if there are no common opponents?

    Other than that, I agree with this as a goal. I think the current system does a decent job of doing this. Where I see seeding that is really off from ability level is when a long time has elapsed and either the fencer has progressed or their rating has expired. For example, at the Portland NAC this season in the D1ME, Laurie Shong (Canadian Olympian) was seeded dead last because he hadn't fenced in the US for a while.
     
  8. sdubinsky

    sdubinsky Made the Cut

    Joined:
    Jun 15, 2012
    Messages:
    169
    Likes Received:
    18
    Does anyone remember any stories about the last time US Fencing switched systems?
     
  9. K O'N

    K O'N Podium

    Joined:
    Aug 14, 2006
    Messages:
    3,593
    Likes Received:
    458
    There are a few issues with that, I think. As DM points out, in local mixed events you often will not have many common opponents. A Cadet ME fencer may not have fenced any of the people a Vet50 WE fencer has fenced this year, for example. But ok, I should be seeded above the people I usually beat and below the people I usually lose to. In the current system, I think I almost always am.

    You don't address at all either how the current system is doing or how to avoid any bad effects. By the current system, I mean ratings with four-year decay at local stuff and Div II and other events that are just for fun, and points for the top end of Div I NACs and other events with national team implications.

    Do you have any evidence that the current system is seeding events badly?

    Do you have any bad effects you think a new seeding system should avoid? I pointed one out to you above, and you seemed to think it would be easy to avoid, but you didn't actually mention such bad effects in your goals.
     
    Last edited: Mar 13, 2015
  10. Ancientepee

    Ancientepee DE Bracket

    Joined:
    Jun 12, 2010
    Messages:
    951
    Likes Received:
    181
    A rating system could have the following attributes:

    • Allow a fencer’s rating to change regardless of where the fencer finishes in the competition. In other words, it should not restrict rating changes to the top finishers. If two fencers are eliminated in the first round, one fencer with two victories and the other with zero victories, the rating system should reflect that the former fencer had a better result than the latter. This will enable fencers to gauge the progress that they’re making in the sport as soon as they start competing and not until they start making finals. It is especially important that complete results of national-level competitions result in rating changes since this is how rating strengths will be synchronized nationwide.

    • Independent of the format of the competition. Here are just a few formats that could be used for a competition with 24 entries: a) four pools of six with everyone qualifying to a DE without repechage, b) four pools of six with sixteen qualifying to a DE with repechage, c) four pools of six with twelve qualifying to two pools of six with six qualifying to a final pool of six, d) the same as the preceding but with eight fencers qualifying out of the semifinal to a final pool of eight, e) a complete round robin of all 24 fencers, f) four pools of six with fifteen qualifying to three pools of five with nine qualifying to a final pool of nine. A rating system that assumes that all competitions will have a final of eight fencers and so rewards the eighth place finisher much more than the ninth place finisher will give bad ratings changes for many local competitions.

    • Ratings indicate the fencers’ current results. If a fencer is fencing worse than their rating would indicate, that fencer’s rating should decrease. If a fencer is fencing better than their rating would indicate, that fencer's rating should increase. The actual amount of increase or decrease should be proportional to how much better or worse the fencer's result is from what would be expected by their current ratings. As a result, the seeding of competitions will be more accurate since it will be based on the fencers’ most current performance level.

    • The amount by which a rating changes based on sound mathematical formula and not on someone thinking that the numbers “look about right”.

    • The rating system is self-contained. That is, the calculation of rating changes is based solely on the results of the competition and the fencers’ ratings and not on external factors like international or national rankings because that could present timing problems if fencers are competing in competitions on the same day that there are competitions which affect national or international rankings.

    • Centrally administrated so that fencers cannot misrepresent their rating and divisional chairs cannot arbitrarily award them.

    I would also like to point out that in order for a rating system to work, the "common opponents" don't have to be everyone in the same competition. What's required is that the group of people being compared can be linked to each other via some string of competitions (like the six degrees of separation theory). This is why the first point listed above is so important. If rating changes are limited to only the top "N" finishers, the remaining competitors are excluded from the group of common opponents and the ratings will not be as accurate.
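The "string of competitions" point can be sketched as a graph problem: treat every bout as a link between two fencers, and two fencers are comparable if some chain of bouts connects them. This is only an illustration; the function name and the input format (a flat list of bout pairs) are assumptions.

```python
from collections import defaultdict

def comparable_groups(bouts):
    """Group fencers who are linked by any chain of bouts.

    bouts: iterable of (fencer_a, fencer_b) pairs (hypothetical format).
    Returns a list of sets; ratings are only comparable within a set.
    """
    graph = defaultdict(set)
    for a, b in bouts:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    groups = []
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first walk over everyone reachable from this fencer.
        group = set()
        stack = [start]
        while stack:
            fencer = stack.pop()
            if fencer in group:
                continue
            group.add(fencer)
            stack.extend(graph[fencer] - group)
        seen |= group
        groups.append(group)
    return groups

bouts = [("Ann", "Bob"), ("Bob", "Cy"), ("Dee", "Eli")]
print(sorted(sorted(g) for g in comparable_groups(bouts)))
# [['Ann', 'Bob', 'Cy'], ['Dee', 'Eli']]
```

Ann and Cy never fenced each other but are still linked through Bob. If only the top finishers' bouts were recorded, more fencers would fall into separate, unlinked groups.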
     
  11. DangerMouse

    DangerMouse Podium

    Joined:
    Mar 13, 2003
    Messages:
    3,184
    Likes Received:
    196
    While I know that you have spent way more time thinking about this than the rest of us combined, this is pretty much a description of how you would like a ratings system to function. I don't see what GOALS it is trying to meet or how we might be able to measure if it meets those goals. To me, this is putting the cart before the horse by designing a system without clearly defining what it seeks to achieve.
     
  12. Ancientepee

    Ancientepee DE Bracket

    Joined:
    Jun 12, 2010
    Messages:
    951
    Likes Received:
    181
    GOAL: "...enable fencers to gauge the progress that they’re making in the sport as soon as they start competing and not until they start making finals."
    GOAL: "...rating strengths ... be synchronized nationwide"
    GOAL: The rating system is "Independent of the format of the competition" so that (1) tournament organizers are free to select the best format given the number of competitors, strips/referees available, purpose of the competition (e.g., youth events have more bouts), (2) all formats can be accommodated, (3) the system doesn't need confusing complexity to accommodate different formats, and (4) the rating system does not need changes as (inevitably) new competition formats are implemented.
    GOAL: "Ratings indicate the fencers’ current results" rather than the current system which can indicate only the one best result the fencer had four years ago.
    GOAL: "the seeding of competitions ... be more accurate"
    GOAL: The "rating changes [be] based on sound mathematical formula" so that the ratings it determines are accurate.
    GOAL: "The rating system is self-contained" so that the means to change one's rating are available to most competitors and can be verified by them.

    MEASURABLE: Are fencers able to "gauge the progress that they’re making in the sport [i.e., get a rating and see it reflecting their results] as soon as they start competing" or are ratings earned only after months/years of competition and don't give any real indication of how they'll do against other fencers?
    MEASURABLE: Is the rating system producing reasonable results regardless of the format of the competition itself? If it works for NACs but gives totally meaningless results for Divisional competitions (or vice versa), then it's useless.
    MEASURABLE: Are the ratings accurately predicting results? For example, if the rating difference between two fencers indicates that the higher ranked fencer should be winning 75% of their bouts, if we look at the bouts where that difference of rating existed, did the higher ranked fencer win about 75% of the bouts?
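The last check could be run roughly as follows: group past bouts by the win probability the ratings predicted for the higher-rated fencer, then compare that to how often the higher-rated fencer actually won. This is a sketch only; the input format and the 5%-wide buckets are assumptions.

```python
from collections import defaultdict

def calibration_table(bouts, buckets_per_unit=20):
    """Compare predicted and observed win rates.

    bouts: iterable of (predicted_prob, favorite_won), where predicted_prob
    is the rating system's win probability for the higher-rated fencer.
    Returns {bucket_prob: (n_bouts, observed_win_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [bouts, favorite wins]
    for predicted, favorite_won in bouts:
        # Round to the nearest 1/buckets_per_unit (5% steps by default).
        bucket = round(predicted * buckets_per_unit) / buckets_per_unit
        counts[bucket][0] += 1
        counts[bucket][1] += 1 if favorite_won else 0
    return {b: (n, wins / n) for b, (n, wins) in sorted(counts.items())}

history = [(0.75, True), (0.75, True), (0.75, True), (0.75, False)]
print(calibration_table(history))  # {0.75: (4, 0.75)}
```

A well-calibrated system shows observed rates close to each bucket's predicted probability once there are enough bouts per bucket.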
     
  13. Zebra

    Zebra DE Bracket

    Joined:
    Dec 25, 2013
    Messages:
    771
    Likes Received:
    181
    How would you deal with the situation of a vet who is skilled and experienced enough to dominate other vets, but no longer has the speed or strength to keep up with top-ranked young adults?
     
  14. DangerMouse

    DangerMouse Podium

    Joined:
    Mar 13, 2003
    Messages:
    3,184
    Likes Received:
    196
    Ok, now we're getting somewhere!

    I agree with most of your goals except the goal that the seeding of competitions be "more accurate", with the measure of this being that initial seeding accurately predicts end results. This is at odds with ratings being a gauge of progress. If a fencer improves between tournaments, then their rating should NOT be an accurate prediction of their results.

    I think the current system does a great job of gauging progress in a simple way that various math based numerical systems don't. Kids understand the challenge of moving from an E to a D in a different way than moving from a 500 to a 520 rating.

    I think the current system has a lot of issues with consistency across geographic regions, but I don't see how changing rating systems is going to meet the goal of being synchronized nationwide without limiting ratings awards/adjustments to national tournaments. The fencer who only fences in their region will be very difficult to "accurately" seed in a very different region because there really isn't an accurate way to compare ability levels without fencers having faced each other multiple times.

    I also don't think that having ratings determined by a mathematical formula is a good goal. I think that it may be something that makes sense once the other goals are assessed, but that putting the use of a mathematical formula as a prerequisite could preclude other solutions that are not based on a formula that meet all of the other goals better.

    I would also add to the goals that any system change works across scales so that the same system can be used at local and national events. The current system does this, although at the national level it has some issues for ME; the other weapons are a bit better. This is similar to your goal of working across formats.
     
  15. mfp

    mfp Podium

    Joined:
    May 10, 2002
    Messages:
    1,828
    Likes Received:
    221
  16. piste off

    piste off Podium

    Joined:
    Jul 13, 2006
    Messages:
    3,324
    Likes Received:
    443
    The same way that we would deal with cadets/juniors who dominate their peers, but don't have the speed or strength to keep up with top-ranked vets.
     
  17. Ancientepee

    Ancientepee DE Bracket

    Joined:
    Jun 12, 2010
    Messages:
    951
    Likes Received:
    181
    The current classification system is no better. If, for example, a fencer is currently a "C" fencer but then "improves between tournaments" to the level of a "B" fencer, that fencer will still be seeded as a "C" fencer at the next competition that that fencer enters which means that the classification system is "predicting" that that fencer will finish among the other "C" fencers. The classification system does not and can not determine how much fencers improve between tournaments and adjust their classifications accordingly before that next competition. Their classifications change only because of fencers' results in competitions.

    A numeric rating system would work in exactly the same way. If a fencer has not competed in a competition during the time that that fencer's improvement takes place, that improvement will be reflected in the fencer's rating only after the improvement is shown in a competition. If the improvement is real, then the "prediction" of what that fencer's results will be in that competition will be off. The goal is to make the prediction more accurate, not to guarantee absolute accuracy. If any system were capable of that, there'd be no need to hold the competition.

    The current classification system fails miserably in seeding fencers because it is "predicting" their results based on the one best result that the fencers have had in the previous four years which is an extremely poor measure of current performance. That's why the classifications play only a minor role in the seeding of the national-level events.
     
    Last edited: Mar 15, 2015
  18. Ancientepee

    Ancientepee DE Bracket

    Joined:
    Jun 12, 2010
    Messages:
    951
    Likes Received:
    181
    I should add that I am no longer in favor of the numeric rating system I first proposed in 1995 nor the revised version in 2010. I think that the USFA now has the ability to adopt a system that's closer to the one that Elo proposed and the ones that the US Chess Federation and international chess federation have adopted. That is, it can base the ratings on the results of individual bouts rather than on final placement in the competition. The biggest problems will be how to assign initial ratings and how to handle the difference between 5-touch pool bouts and 15-touch DE bouts.
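For reference, a minimal sketch of a per-bout update using the standard Elo formulas. The 400-point scale and K = 32 are common chess conventions used here only as placeholders; suitable values for fencing, and how to set initial ratings, are exactly the open questions.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo expected score (win probability) for fencer A."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))

def update_after_bout(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after a single bout.

    k is a placeholder value borrowed from chess, not a fencing standard.
    """
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so total rating points are conserved.
    return rating_a + delta, rating_b - delta

print(update_after_bout(1500, 1500, a_won=True))  # (1516.0, 1484.0)
print(round(expected_score(1900, 1500), 3))       # 0.909
```

An upset moves ratings more than an expected result, because the change is proportional to the gap between the actual and expected score.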
     
  19. 40lbsheavier

    40lbsheavier Rookie

    Joined:
    Sep 13, 2013
    Messages:
    32
    Likes Received:
    1
    In my short time back competing, the current system allows fencers who received ratings/classifications in Vet and women-only events to carry them over to Open and rating-restricted events.
     
  20. K O'N

    K O'N Podium

    Joined:
    Aug 14, 2006
    Messages:
    3,593
    Likes Received:
    458
    It seems like some movement has gone on behind the scenes here, which is very encouraging. Looking at evidence! Changing ideas! Growth, movement! All very encouraging.

    For me, a seeding system should:

    1) Seed pools at all levels relatively evenly. Seed pools at important events (Div I, for example, or anything with national team implications) as evenly as possible.

    2) Provide a way for new fencers to gauge progress.

    3) Provide cutscores for events like Div II and Div III.

    4) Incentivize not fencing and losing on purpose as rarely as possible. I think this is really important.

    I have seen no evidence that the current system (A15, A14, ... U, points for Div I) does not do 1) very well. I'm open to seeing such evidence, I just haven't. Is there a selection of events on askfred we can point to that were systematically badly seeded (ie, pool A did much, much better than the mean, and this happens a lot)?

    The current system does just ok on 2), I think. But that's a bit down to local event organizers. If I wanted to generate more E rated fencers I could run a bunch of U events in my gym. Six Us make an E, right? I just don't care much, nor do most people or they'd do it. So yes, lots of U rated fencers, I agree. But apparently not very important or we'd fix it with tools we have at hand right now.

    The current system does 3) pretty well.

    The current system does 4) almost perfectly well, perhaps as well as possible. There is almost never, under the current system, a reason to not fence and there is never a reason to throw a bout. If I am a B, I cannot go lose and reduce my rating, so I cannot go lose and make myself qualify for Div II, for example. The only time I am incentivized to not fence is if I'm a B and about to drop down to a C, or if I'm a strong C and don't want to earn a B before SN or something. This is pretty rare, but when it happens I do in fact see fencers not fence to prevent themselves from advancing. A system that let them lose ratings points would have the potential to magnify this effect quite a lot.

    Does it? I'm not trying to be a jerk here; has it been demonstrated that the current system is producing uneven pools on a regular basis? As you note in your post, any system will on occasion have someone under-rated due to improvements since the last competition. So any system will on occasion produce uneven pools, just due to this. Does the current system produce uneven pools regularly? Has anyone looked at askfred data to see if this effect is real?

    It's kind of important. If the current system is producing even pools, most of the reason to change anything is gone.
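One rough way to look at results data for this, sketched under assumptions: for each event, take the initial pool assignments and the final placements, compute the mean final place per pool, and see how far the pools spread. The data layout and function name are hypothetical; this does not describe any existing askfred export or API.

```python
from statistics import mean, pstdev

def pool_strength_spread(pools, final_place):
    """Measure how evenly the pools turned out.

    pools: dict of pool name -> list of fencers (hypothetical layout).
    final_place: dict of fencer -> final placement (1 = winner).
    Returns (mean final place per pool, std deviation of those means).
    """
    pool_means = {
        name: mean(final_place[f] for f in members)
        for name, members in pools.items()
    }
    return pool_means, pstdev(list(pool_means.values()))

places = {"a1": 1, "a2": 4, "b1": 2, "b2": 3}
even = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
uneven = {"A": ["a1", "b1"], "B": ["b2", "a2"]}
print(pool_strength_spread(even, places)[1])    # 0.0
print(pool_strength_spread(uneven, places)[1])  # 1.0
```

A spread near zero across many events would suggest the seeding is producing even pools; a consistently large spread would be the evidence being asked for.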

    Well, but let's not conflate pure classifications with "the current system". The current system is to use classifications for non-serious stuff like local events and Div III stuff, and national points for serious stuff like Div I. The reason we need points is because there are too many A15s in some events, like ME.

    Naively, I would think some small K for a five touch bout, and 3K for a fifteen?
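A sketch of that naive weighting, assuming a standard Elo expected score and a hypothetical base K of 8 for a five-touch bout (so 24 for a fifteen-touch bout). The specific numbers are placeholders.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for fencer A."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def k_for_bout(touches, base_k=8.0):
    """K for a five-touch pool bout, 3K for a fifteen-touch DE bout."""
    if touches == 5:
        return base_k
    if touches == 15:
        return 3.0 * base_k
    raise ValueError("only 5-touch and 15-touch bouts are handled here")

def rating_change(rating_a, rating_b, a_won, touches, base_k=8.0):
    """Points fencer A gains (or loses, if negative) from one bout."""
    score_a = 1.0 if a_won else 0.0
    k = k_for_bout(touches, base_k)
    return k * (score_a - expected_score(rating_a, rating_b))

print(rating_change(1500, 1500, a_won=True, touches=5))   # 4.0
print(rating_change(1500, 1500, a_won=True, touches=15))  # 12.0
```

With this rule a DE win between equally rated fencers is worth three pool wins, which is one simple way to reflect that a longer bout carries more information.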

    At any rate, I think this is a much improved idea. A pure Elo system would work more or less as it does in chess, I think, which is to say that many people would like it. I think it would be attractively granular, would show progress right away, but would have some predictable bad effects that we should consider:

    I think that, since fencing performance is more variable than chess performance, there would be a higher incentive than we see in chess to not fence if one had lucked into a high rating, just for the sake of pride. I admit I'm just projecting here, I don't have any real evidence that this would happen, but it feels right to me.

    On the other hand, an observed effect: if we're going to preserve something like Div II and Div III and so on, we'll get the same effect they get in chess when they have tournaments below a certain Elo rating; people who are near to and below the cutscore will not play in order to avoid winning so they can preserve a low rating to get into a restricted event, and people who are near to but above the cutscore will go find an obscure event somewhere and 'work on weird openings' for a weekend, in order to lose and lower their ratings so they can play in the restricted event. This happens in chess, and it would happen in fencing. We'd have people losing on purpose, right before SN for example to drop down to Div II. Div II and Div III are much bigger and more important in fencing than ratings-restricted events are in chess, so predictably we'd see this even more in fencing than we do in chess, and we see it a lot in chess.

    I realize that many people find the punitive nature of Elo and similar systems appealing; if you lose your rating should go down. Ok. I don't, but that's an aesthetic judgement. But if we're going to consider that we should at least acknowledge the downside: If you can reduce your rating by fencing, there will be situations in which ratings considerations will incentivize people to lose on purpose.

    The current system, much as many people seem to dislike it, very rarely does the first and never does the second. Never. We would be trading a system in which there's never an incentive to lose on purpose for a system in which there clearly is.

    I think predictable bad effects like this are worth considering before we go messing with stuff.

    ETA: Chess has such a problem with people tanking bouts that they have had to go to a sort of hybrid Elo-high water mark system to counter it:

    http://www.indianchessnews.com/2014/06/aicf-rule-to-check-sandbagging-cheating.html

    for example, there are lots of other examples. This is a known failure mode in games with Elo ranking systems where you also have ratings restricted events, like Div II events in fencing.
     
    Last edited: Mar 15, 2015
