Fencing ratings

Discussion in 'Fencing Discussion' started by gillaspy, Apr 1, 2007.

  1. Inquartata

    Inquartata Podium

    Joined:
    Jul 12, 2001
    Messages:
    36,189
    Likes Received:
    1,183
    It's not that I am "scared" of math, rather that I dislike it intensely. :)

    It's all very well to say that everything could be calculated automatically, but I would feel compelled to verify changes myself, not so much because I am wary of deliberate cheating by those in charge of the system but because I am all too familiar with the mistakes and screw ups that abound wherever human beings are involved.

    I suspect that we all know of a number of occasions when people did not get ratings upgrades they earned because a tournament organizer did not send in results, or did not cross all the t's when they sent them in, or because there was a mix-up or omission upstream somewhere. I know two people who should be Cs right now but for such problems...and I was at the events where they got them and where the upgrades were announced at the medal presentations. An ELO-type system is not going to make such things stop happening. Hence I'd consider verification to be essential. And I would find being forced to do it very distasteful. I'm a fencer, Jim, not an accountant! :mad:


    I don't understand what makes people so eager to leap off a cliff to test the spiffy new pair of wings they just made themselves. Or why they are so keen on fixing what isn't broken.
     
    Last edited: Apr 5, 2007
  2. eac

    eac Rookie

    Joined:
    Oct 12, 2005
    Messages:
    1,374
    Likes Received:
    117
    The point, Inq, is that it is broken. It has the following problems:
    1. It's entirely based on peak results, so ratings are quite often based on flukes, thus a) inaccurately seeding fencers within tournaments for 4 years, and b) misplacing fencers into and out of the divisions (I/II/III) that they should be in.
    2. It is highly subject to inflation. This means that an A ten years ago in epee means much more than an A now in epee, and so the standards for tournaments have to keep changing-- and especially the ratings restrictions between the divisions become quite different from what was intended.
    3. Partly because of said inflation problem, the letters become wildly, wildly out of sync in meaning between the weapons. A B in WS is ridiculously hard to get, while a B in men's epee is comparatively easy to get. Trying to calibrate a national set of divisions for these two at the same time is effectively impossible.
    4. Ratings become geographically out of sync. Within a given weapon, it can be very easy to earn an A in section X, but very difficult in section Y, for the same level of fencer. This means that fencers from different geographical regions are inaccurately seeded in tournaments with each other, producing unfair results.
    5. The low level of granularity means that tournaments of 150 fencers or more are seeded effectively randomly, particularly in places like Div I Men's Epee, where the field (IIRC) is mostly A's, most of which do not have points-- then the pools are almost completely randomly assigned. This leads to inaccurate seeding and unfair results (unfair to the people with the randomly hugely strong pools).
    6. Particularly in men's epee, the gulf between an A and a pointholder is so large that fencers who are continuously improving have nothing to show for their efforts, and may be motivated to leave the sport.

    An Elo system would fix all of these problems. It would also provide an accurate and useful metric by which improving fencers could measure their progress.

    Now, your main objection to an Elo system seems to be the difficulty of verification. With either an online or mail-based system, it would be fairly easy to list the competitions which changed your rating along with the rating itself, so you would not have to verify yourself whether certain results were entered.

    Then it comes down to you doing the actual math yourself, which I can tell you is not amazingly hard, particularly with practice. However, I don't think the fencing world should be held back because of the combination of your personal unwillingness to do math juxtaposed with your demand for individual verification. Do you currently verify your exact number of national points? Would you continue to do so if you had to calculate strength factors and so on for a number of world cups each year? That's approximately the same level of math involved in verifying an Elo calculation. Do you think it's unfair to those fencers that they should have to do a little arithmetic to verify their team placement?

    Any system that fixes the above problems will have slightly more arithmetic involved than the current system. I don't think your distaste for said subject is a cogent argument against an improvement that fixes said problems, and introduces no significant new problems aside from distasteful arithmetic.
     
    Last edited: Apr 5, 2007
  3. Inquartata

    Inquartata Podium

    Joined:
    Jul 12, 2001
    Messages:
    36,189
    Likes Received:
    1,183
    Yeah...I don't buy it, sorry. :)

    I can't think of anyone I know who has gotten a rating which he has then not re-earned. True flukishness is much more likely to crush one underfoot than to hoist one up onto its shoulder.



    You know, looking at your list, and then going back over this thread, I think I see a pattern: Most of the complaints with the current system seem to involve epee. Now, this may come as a shock to epeeists, but there are actually two other weapons in fencing, and neither of them is probably very inclined to let what is best for epee drive changes in fencing overall...

    Aside from that, I'm not really sure what you mean by "an A ten years ago in epee means much more than an A now". Care to elaborate?



    Again, what does this mean? "Calibrate"? :confused:

    Are you arguing that the weapons should all be made more alike? Brr! Count me out! Vive la difference, I say!




    Fairness...as I like to say, isn't that one of those mythical creatures, like griffins and unicorns? :)

    The system works well enough, IMO. It has been getting us by for decades without noticeable harm to anyone. Seeding is just seeding. The cream will rise to the top even if it is poured from a carton labelled "skim milk"...




    Randomness also assures us that this will disadvantage everyone equally, given enough time.

    But even if one admits that the present system has some problems, "has some problems" is not synonymous with "broken".


    There it is again! Tsk! Let me reiterate: The world does not and cannot revolve around epee! :D

    A great assertion, but I am not sure why it should be accepted as true...

    And no mistakes or omissions are possible? Come now!

    You make this assessment based upon your own abilities and preferences. I assure you that when it comes to math your idea of what is "not amazingly hard" is very likely to differ a great deal from mine...

    And even were it not so, the whole unpleasantness factor remains unaddressed. Cleaning the toilet bowl is real easy, but that doesn't mean I want to have to do it more often.



    By the same token, neither ought it to be "advanced" because you personally wish it to be, based upon what you are comfortable doing and what you desire. Right?

    I'll leave aside the issue of why you should get to be spokesman for "the fencing world", of which world I am as much a part as you. Maybe we should just speak for our own individual selves? What do you say?


    Yes. Although I don't like it and more importantly am not sure even now that I understand how it should be done or that I would recognize an error if I saw it. And there's the rub. ( Well, one of them. )


    Not unless I had to do so. And that's the point: I don't want to be made to have to do so.



    I'm not going to be drawn into a discussion of "fairness". I will only reiterate that I, personally, do not like doing math, do not want to be made to do more of it in pursuit of some IMO very dubious "gains" such as "granularity", and see no reason to change a system which in my estimation works just fine.
     
    Last edited: Apr 5, 2007
  4. FoilyDeath

    FoilyDeath Rookie

    Joined:
    Mar 2, 2005
    Messages:
    330
    Likes Received:
    10
    Some people appear to still be under the illusion that you're going to be "doing the math". As Peet so rightly put it earlier, an online system, such as FRED, could not only do it all for you, but also post results, and sync them to the national scores. If we do make a more sophisticated ratings system, something like this, IMO, should be our first priority.
     
  5. Inquartata

    Inquartata Podium

    Joined:
    Jul 12, 2001
    Messages:
    36,189
    Likes Received:
    1,183
    Nope. Still going to have to check the checker. Because if you want something done right you must do it yourself.

    Yes, I do my own taxes, too. Bleah, but I'm not going to pay an "expert" to do it. He hasn't the motivation to be sure it's done right that I have; mistakes affect me, not him.
     
  6. tbryan

    tbryan Podium

    Joined:
    May 6, 2005
    Messages:
    1,984
    Likes Received:
    238
    eac, or someone else who understands ELO, wouldn't this problem also present a serious issue for an ELO-based system? That is, say Cville attended tournaments four weeks in a row.

    Week 1, Virginia Kickoff, he has a phenomenal day and ends up in the top 8
    Week 2, fences a small, local event in NC, he places 3rd behind some decent fencers who are ranked quite a bit lower than he is
    Week 3, fences in a large, local event in Atlanta, GA, he places 2nd
    Week 4, fences in a local event in Virginia, he is knocked out early by a mediocre fencer (hey, four weeks in a row, maybe he's tired)

    Now, to compute his ranking and the ranking of the fencer who beat him in the GA tournament, for example, we have to compute the ranking in order of the events, right? That is, the fencer's rank changes after each event, and that new rank should be fed into the next event. Now, if the organizer from the GA tournament turns in the results right away, but the organizer from the NC tournament waits 3 weeks to turn in their results, the USFA cannot accurately compute ratings for these events on week 3 and week 4. What's worse, the USFA might not know that an event had taken place on week 2. They might compute the results for week 3 only to have to go back and recalculate once results from week 2 arrive.

    The National point system works since the USFA knows exactly which events it's going to be counting. It knows when those events take place, and there is a gap of time between those events. The only way I can imagine a point system for all USFA events is for the USFA to compute points periodically with a rolling window. If an organizer's results are not certified in time, the results are not counted. For example, on January 14, perhaps the USFA computes the points for all December events across the US. If a December event didn't turn in its results by January 14, the results won't be counted. So, a fencer will fence with the same rating from December 15 to January 14. Then, rating changes from December 1 to December 31 will be computed. The fencer has a new rating, which he will use from January 15 to February 14. Of course, he doesn't have an updated card with his new rating. (I can't imagine that the USFA would want to send everyone a new card every month.) So, all tournament organizers will need to get a copy of the new points list for the entire USFA membership to properly seed their tournaments during that time.
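    The monthly cycle described above can be sketched in a few lines. This is only an illustration of the workflow, not anything the USFA actually runs: the `Event` record, the `events_to_count` function, and the sample dates are all hypothetical. It keeps the events held inside the window whose results arrived by the cutoff, and returns them in the order they were fenced so that each rating change can feed the next event.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    held_on: date       # day the tournament was fenced
    received_on: date   # day certified results reached the national office


def events_to_count(events, window_start, window_end, cutoff):
    """Return events held in the window and received by the cutoff,
    sorted chronologically so ratings can be updated event by event."""
    counted = [
        e for e in events
        if window_start <= e.held_on <= window_end and e.received_on <= cutoff
    ]
    return sorted(counted, key=lambda e: e.held_on)


# Four December events; the NC results arrive after the January 14 cutoff.
december = [
    Event("Atlanta local", date(2006, 12, 16), date(2006, 12, 17)),
    Event("NC local", date(2006, 12, 9), date(2007, 1, 20)),
    Event("Virginia Kickoff", date(2006, 12, 2), date(2006, 12, 5)),
    Event("Virginia local", date(2006, 12, 23), date(2007, 1, 2)),
]

counted = events_to_count(
    december, date(2006, 12, 1), date(2006, 12, 31), cutoff=date(2007, 1, 14)
)
for event in counted:
    print(event.held_on, event.name)
```

    The late NC event simply drops out, which is the trade-off being described: the order of the remaining events is never in doubt, at the cost of ignoring results that miss the deadline.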

    Does this process sound about right? I'm confident that we can write software to compute the points correctly. I'm sure that we could simulate results from thousands of tournaments or take a DB like askfred.net to simulate the point changes based on his results. I'm not confident that we will come up with a process and a workflow that people will like. :)

    Really, if I were in charge of putting a point system in place at the National Office, the first thing that I would do would be to mandate the new workflow with the current rating system. Get people used to the rating update cycle. If I'm going to require results to be uploaded, make divisions start doing that now. If I'm going to permit some results to be sent by post, mandate the cut off date now and figure out how many people send their results non-electronically. That will help to define the expected burden of re-entering results. Or maybe I push that work to the divisions, and I require electronic submission of results. Whatever. I need to put the process in place now to see how it works in practice. How often do I get results late (after the cut off)? How often do I get results that have to be restated? How much work do I have to do if someone restates tournament results from 3 months ago? That is, they were submitted on time 3 months ago, but someone only just now noticed an error.

    If that whole process is working with the current rating system, then switching the computation of the ratings will be much easier if I can do it without changing the ratings submission process/business rules. In fact, I can do a dry run where I'm computing the new points alongside the current rating system just to check for quirks and problems in the (computer) system and to get fencers used to seeing their point rating. Perhaps I could mandate that organizers use the new points to break ties between fencers with the same letter-year rating.

    Just thinking out loud here. The hardest part about deploying new software/algorithms in a system involving people is not the math. ;)
     
    Last edited: Apr 5, 2007
    grotto, keith and CvilleFencer like this.
  7. Sciurus-Rex

    Sciurus-Rex Rookie

    Joined:
    Jun 15, 2005
    Messages:
    1,230
    Likes Received:
    234
    "... Why, I remember when we kept track of results with a pile of rocks. When we could find the rocks, that is. Because we were lucky to have light to see by; it hadn't been invented yet. ... And weapons? Hah! We pointed fingers at each other and made 'clang clang' sounds until one of us gave up. Ahh, those were the good ol' days..."
     
  8. FoilyDeath

    FoilyDeath Rookie

    Joined:
    Mar 2, 2005
    Messages:
    330
    Likes Received:
    10
    Nobody says you can't check it...just saying it would be rather simple to adapt a version of EnGarde or FT to automatically calculate points and submit it online automatically to a website as the day goes on. Doesn't mean it can't put the individual pools online.
     
  9. Sciurus-Rex

    Sciurus-Rex Rookie

    Joined:
    Jun 15, 2005
    Messages:
    1,230
    Likes Received:
    234
    I can.

    Be wary of patterns you think you see.

    Randomness doesn't assure us of anything of the sort. That's an assumption of averaging, given an infinite amount of time. Over a relatively short period of sampling -- say, for example, one or two years in which results are garnered every couple of months -- randomness allows for incredibly huge spikes that do not (as you suggest) disadvantage everyone equally.
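    To put a rough number on that, here is a small sketch using the ordinary binomial formula. The inputs are assumptions picked purely for illustration, not measured data: a 25% chance of drawing an unusually strong pool at any one event, and 8 events over the sampling period. The function name `prob_at_least` is likewise made up.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k 'bad draws' in n independent events,
    each with probability p (exact binomial tail sum)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed inputs: 8 events, 25% chance of a very strong pool at each one.
unlucky = prob_at_least(4, 8, 0.25)
print(f"Chance of 4 or more strong pools in 8 events: {unlucky:.3f}")
```

    Under those assumptions the chance of being hit in at least half of the events is about 11%, so in a field of a hundred fencers a handful would expect that kind of run over the period while plenty of others see one strong pool or none.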

    It's amusing how you keep trying to tie this to a weapon. Almost like the way a certain political party tries to connect every ill in society to the war in Iraq. That's disingenuous at best.

    Amusingly, though, your rabid anti-epee position inherently embraces an equally unbalanced basis that one weapon should be ignored because the other two are (in your mind) OK. For now. IYHO. ... You are, in effect, suggesting, "We must choose between Action-A for Party-X OR Action-B for Party-Y," where A and B are to change or not change. The bias comes out regardless of how you define A, B, X and Y.

    Not that the weapon-vs-weapon illusion is actually valid anyway. It's a distraction you're getting very good at projecting.

    Fortunately, Inq, the world does not revolve around your math skills. You're suggesting decisions be made on the lowest common denominator? YOU?

    So speaketh the self-declared spokesman for the collective of individuals ... Bravo!

    And therein lies the best argument for serious consideration of all: A grumpy ol' fart who doesn't like to discuss 'fairness' and thinks numbers are all hocus-pocus, likes things Just The Way They Are. That's usually a big red flag to take a look at refining or correcting any system.
     
  10. Durando

    Durando Rookie

    Joined:
    Aug 29, 2005
    Messages:
    783
    Likes Received:
    111
    Epee, Weapon of Mass Destr---oh f8u8c0k it.
     
    Sciurus-Rex likes this.
  11. tbryan

    tbryan Podium

    Joined:
    May 6, 2005
    Messages:
    1,984
    Likes Received:
    238
    How many times do we have to say it? It's not a "weapon." It's "sporting equipment." ;)
     
  12. Inquartata

    Inquartata Podium

    Joined:
    Jul 12, 2001
    Messages:
    36,189
    Likes Received:
    1,183
    Saith the epeeist. :blah:



    If a thing is truly random, it will not afflict the same person over and over while others escape it altogether. It's like referee error: in the long run as many will go for you as against you, and things will even out. The argument that one is going to be consistently horribly disadvantaged by the "problems" of a seeding system which is "insufficiently granular"---singled out for "unfair" treatment---is very thin.


    I call them like I see them. :)



    False analogy; poisoning the well. ( "Hmm, I'll compare his argument to another of known unpopularity, and hope the unpopularity will transfer". )

    It does? Hmm, sounds like a straw man to me.

    Please restrict yourself to criticizing arguments I actually make. :)






    Reread my last post: The fencing world does not revolve around any one individual's preferences or opinions*---mine, yours or eac's. My view that the present system suffices does not automatically get elected as the most sensible choice; neither does the view that an ELO-like system would be better or should be selected and foisted upon the fencing "world".

    *Well, unless your name is Rene Roch.



    Well, now you're just making stuff up...



    Only when it suits my ends. Otherwise it is as I have said a mythical creature, existing nowhere in the real world. Or as La Rochefoucauld said, is "like ghosts, which everyone talks about but no one has seen". :)
     
  13. Sciurus-Rex

    Sciurus-Rex Rookie

    Joined:
    Jun 15, 2005
    Messages:
    1,230
    Likes Received:
    234
    You really don't understand the concept of randomness, then, if that's your assumption. One person CAN be "afflicted" over and over again, especially in a relatively small sample pool, while others escape--

    Ah, screw it. Probably one of those concepts that was addressed in a mathy statistics class that you don't like so much.
     
  14. notalent

    notalent Rookie

    Joined:
    Jun 24, 2004
    Messages:
    1,186
    Likes Received:
    60
    ah but those mistakes don't reduce someone's rating.
     
  15. contre-Sixte

    contre-Sixte Rookie

    Joined:
    Sep 6, 2006
    Messages:
    213
    Likes Received:
    17
    Two problems (at least) with the eac-ELO system :

    1. Fencer A and Fencer B are both highly rated. So they are seeded at the top of their pools. In one scenario Fencer A crushes everybody and ends up seeded #1 for the DEs. He goes on to win the tourney most likely facing the following in DE tableau: 32,16,8,4,2. His rating doesn't increase much.
    In another scenario, Fencer B wins all of his pool bouts 5-4. His placement in the DE tableau is #4. If he wins the tourney he most likely would have faced the following opponents: 29,13,5,1,2. He potentially gets a higher ratings bump than Fencer A would have. Why? Because the current system rewards you for decimating the opposition early and decisively. [I like that]. And it rewards you for winning against the field. [I really like that]. The eac-ELO system rewards you for constantly beating tougher opponents one-on-one. Not the same thing at all!


    2. The concept of practice tournaments is a nightmare in the making. If elite Fencer A can designate a local tourney as "practice", thereby not affecting his ratings, then he can simply throw bouts against his weaker clubmates giving them unfair rating boosts. Or is the "practice tourney" designation a 2-way street? My ratings don't change, and the ratings of the person who fences me don't change either. If so, then many of the local elite fencers will simply designate all of the local tourneys as practice, leaving developing local fencers little chance to improve their ratings. Practice Tourney - a real bad idea.
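    The DE paths quoted in point 1 can be checked with a short sketch. It assumes a standard seeded tableau of 32 in which seed s meets seed (size + 1 - s) in each round, that the fencer in question wins every bout, and that every other bout goes to the higher seed. The function name `expected_opponents` is made up for illustration.

```python
def expected_opponents(seed, table_size):
    """Seeds a fencer would meet on the way to winning a seeded DE tableau,
    assuming all other bouts are won by the higher seed."""
    if table_size < 2 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    if not 1 <= seed <= table_size:
        raise ValueError("seed must lie within the tableau")

    slot = seed            # bracket slot the fencer currently occupies
    size = table_size
    opponents = []
    while size >= 2:
        opponent = size + 1 - slot     # standard pairing: 1 v size, 2 v size-1, ...
        opponents.append(opponent)
        slot = min(slot, opponent)     # the winner carries the better slot forward
        size //= 2
    return opponents


print(expected_opponents(1, 32))
print(expected_opponents(4, 32))
```

    Under those assumptions the two paths come out as 32, 16, 8, 4, 2 for the top seed and 29, 13, 5, 1, 2 for the fourth seed, matching the figures above.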
     
  16. oiuyt

    oiuyt Podium

    Joined:
    Apr 26, 2000
    Messages:
    10,240
    Likes Received:
    974
    Read the section of George Masin's proposal discussing when people are not included in the ratings change calculations. Specifically look at the hors concours section. Applies if you have an extreme outlier, although it doesn't address the concern of typical fencers using tournaments in which they would normally be competitive, as sharp practice.

    -B
     
  17. keith

    keith Rookie

    Joined:
    Aug 19, 2004
    Messages:
    4,092
    Likes Received:
    262
    tbryan raises an important question: What happens to late results?

    The system would probably have to use a trailing rating (1 month - 2 weeks?) so what happens if 'old' data hits the system after it has been used to score rank competitions/competitors? Given that it is possible to fence a series of strong competitions in a short time frame this really matters.

    The devil is in the detail - George Masin's proposal (or another) makes perfect sense, and with tuning would work. Of course the current system also works, and would work better with tuning :dunce:

    There seems to be a lot of enthusiasm for a cool numerical system rather than a focus on what exactly the USFA's problem really is and what is the easiest way to solve it.

    Despite Mr fuzzy tail's objections the insights from croquet* are important - the competition formats the USFA uses are not the best formats for ensuring that the best competitor wins (even if seeding is perfect). Those concerned with ensuring that the best fencer wins should be demanding all sanctioned competitions get run as double round robins.

    *I could also post an analysis of the UEFA formats if that would make people feel better and a tad more masculine?
     
  18. oiuyt

    oiuyt Podium

    Joined:
    Apr 26, 2000
    Messages:
    10,240
    Likes Received:
    974
    Please do. More information would be appreciated. And discussing competitive croquet doesn't raise any concerns with me about my masculinity. If you have similar analysis papers from a variety of sources and sports I'd love to see them.

    -B
     
  19. keith

    keith Rookie

    Joined:
    Aug 19, 2004
    Messages:
    4,092
    Likes Received:
    262
    okay so here are two more; the UEFA analysis and also one more geared to the americans. Both links are to PDFs, for the statistically challenged the discussions in both are in plain English.

    UEFA

    NCAA

    (should anyone actually want a PDF of the Appleton paper I can get it via JSTOR but no forum links).

    Note that this is thread drift; Is the problem with the competition format rather than with seeding? A competitive format that does not give a high chance of the 'correct' competitor winning might lead some to conclude that there is a problem with seeding.

    The NCAA paper covers situations where seeding is difficult because of the lack of information.
     
  20. mrbiggs

    mrbiggs Podium

    Joined:
    Mar 25, 2004
    Messages:
    7,837
    Likes Received:
    477

    That's because you fence sabre. Epee and foil (especially with the new timings) are more prone to this sort of thing.

    Plus, as I pointed out earlier, reearning a rating is a lot easier than getting it in the first place.


    This is just ridiculous. Should epee have a completely different system than foil and sabre? Sometimes it's fun to just cover your ears and go "NANANANANATHAT DOESN'T AFFECT ME" but that doesn't mean it's a good national policy. Not to mention that in ten or twenty years, foil and sabre will be in the same boat as epee.

    We've had this system long enough to realize that:
    1. Ratings are inflating
    2. They show no sign of stopping
    3. They will probably keep inflating until they are worthless.

    Obviously, a change needs to happen someday. With the speed the USFA moves, I don't see why we shouldn't start now.


    It makes more sense if a B in women's sabre is about as difficult to earn as a B in men's epee. I think it's preferable for us to have a ratings system that makes sense.

    It's not broken...yet. In five years? We'll see. I don't see why we have to wait until something is completely worthless to fix it.


    As to not wanting to do math, one of the benefits of the Elo system is that, in its pure form at least, it's actually pretty easy to calculate. You could figure out your new rating in under a minute with a piece of paper and a handheld calculator. It's not an optimal level of simplicity, but simplicity comes at the price of permanence in this case. Would you rather have the rating system completely reset every twenty or so years? Because that's basically the other option...
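    For anyone curious what that one-minute calculation looks like, here is a minimal sketch of the pure Elo update as used in chess. The 400-point scale and the K-factor of 32 are the usual chess conventions and are assumptions here; any fencing proposal (George Masin's included) might choose different constants or add extra rules.

```python
def expected_score(rating, opponent_rating):
    """Expected result (between 0 and 1) against one opponent, classic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating, opponent_rating, score, k=32.0):
    """New rating after one bout. score is 1.0 for a win, 0.0 for a loss.
    k controls how fast ratings move; 32 is an assumed chess-style value."""
    return rating + k * (score - expected_score(rating, opponent_rating))


# Beating an equally rated opponent versus beating one rated 400 points higher.
even_win = updated_rating(1500, 1500, 1.0)
upset_win = updated_rating(1500, 1900, 1.0)
print(round(even_win, 1), round(upset_win, 1))
```

    A win over an equal opponent moves the rating by half of K, while a win over a much stronger opponent moves it by nearly all of K, and a loss to that same opponent costs very little. That is the whole of the arithmetic someone checking their own rating would need to repeat.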
     
