I don't understand the objection to using individual matches in the ELO calculation.
...because winning is winning under the current system. Under an 'each touch counts system' doubling out an epee bout becomes a less good idea.
My objection to the chess-fencing comparison is that height, handedness and squirminess are not factors that (generally) impact the result in chess; unless fidgeting really puts off your concentration of course.
Also while we may all know that letter chasing is the bane of technical progression (or some such) it does matter to a lot of fencers who do not go to NACs or large regional events.
I think one problem may very well be one of scale. ...
The USFA has something like 22000 active competitive members, many of whom fence multiple weapons, whose scores would have to be tracked, recorded and updated in a timely manner. By a national office that still demands things be faxed or mailed in to them... Not the most confidence-inspiring, as Allen mentions.
I think most folks who contemplate a ratings system any more complex than the current one take it as a given that it would require a computerized, centralized system for results submission & validation, and ratings calculation & publishing.
As mentioned, this kind of automatic calculation is well within technological capability, and is already being done for many local/regional points lists:
And the same with the current system. A letter degrades every year if you don't renew it. After 4 years it drops to the next letter, as you know very well. I would like to see the cap be 2 years instead of 4, but it seems that this point is already served by the current system.
However, it's very rare that someone will have their letter degrade by time. Most of those who do are those who have stopped fencing or stopped training and only fence recreationally. oiuyt posted in a very old ratings thread that at the time, there were 5 men's sabre fencers in the country with ratings about to decompose. That's a pretty small number.
To get the rating, you only have to fluke a result, and fluke it again in the next four years. Once you have a rating, re-earning it is much easier than getting it in the first place. For example, to win a simple A1 tournament, you have to come in ahead of two As in the final standings, which generally means beating one in DEs and one in pools. If you already have an A, you have to come in ahead of one A, but he's not in your pool, and the odds are higher that you won't have to fence him in DEs.
And even if it does eventually decompose, the fencer is messing up seedings for four years.
The main problem I see with bout-by-bout analysis for Elo ratings is that individual bout results are not required to be recorded at this point. The system would be more prone to error, more difficult to error check, more difficult for the bout committee, would require greater communication with the USFA, and would make it nearly impossible to run a tournament without a computer.
If it's based simply on results, the bout committee could do the tournament on paper or whatever, then just run the final results through a simple program. The results could also be faxed to the USFA with relative simplicity.
I am not bulleting these off to be argumentative, I just have several responses/questions and if I try to type them straight, I am afraid they will lose their context or not make sense...
That's totally fine-- IMO bullets are clearer when you're responding point by point.
How so? Right now most tourneys for me that are not NAC's, huge super regional events or qualifiers are practice tourneys. I can go there, try new things and not worry about my standings except on our local points list. I don't see how that could be the same under an ELO system. Especially since many of the ones I have seen count each touch or each pool bout as a win/lose!
Generally, it seems to me that if you want a practice tournament, you should have a practice tournament, and not have it be entered in the rating system, causing inaccurate results (say if you don't pay attention and give someone who should really be a C their B). I'm all in favor of practice, and practice tournaments, but I think an accurate rating system shouldn't include those results.
As noted below, an Elo rating system need not take into account every bout/touch, and I personally think that it would totally screw with the system if it did.
Also people who were chasing the points total for team placement and such, I think they would determine what events would give the most points for the least effort/risk (like some do with NAC's or FIE events) and avoid the rest.
The whole concept of a tournament which is worth more points for least effort/risk is an artifact of a highly inaccurate rating system. With our current rating system, and even with our current point system, there are certainly tournaments that vary in the ratio of points to effort/risk. With an Elo system, there isn't such a distinction. Almost universally, the amount of reward is directly proportional to the amount of effort and the amount of risk. So you needn't worry about that.
Also, the system for team selection could easily only calculate ratings based on a designated set of tournaments (just like the team points standings are now calculated for a designated set of tournaments). Team selection need not be linked to your general seeding/ranking, just like the rolling points are not directly linked to the team points.
Further I am worried that it would hurt local attendance. I already see some of the "I already have an A or B, why go out to a local tourney and give them practice against me" mindset.
That's another artifact of a system with too little granularity. In the current system, if you have an A07, you can get absolutely nothing rating-wise out of a local tournament. However, if you are a 2000 fencer in a division full of 1800 fencers, you CAN get rating points out of winning tournaments, improving your seeding for NACs. The whole worry about risk vs. reward is just the result of never having used an Elo system-- in chess, people *never* worry about that, because the number of points you could lose at a tourney is always proportional to the safety that you won't lose it-- that is, if you COULD lose 50 rating points at a tourney, that means that you can be pretty well assured that you're safe. Everything is totally proportional and exactly as risky as it should be.
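For anyone who wants to see the arithmetic behind "exactly as risky as it should be", here is a minimal sketch. It assumes the standard chess Elo formula and K = 32; neither is something the USFA or anyone in this thread has specified, and the function names are just illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B, between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_change(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """Points A gains (negative means loses) for one result against B.

    score_a is 1.0 for a win and 0.0 for a loss. K = 32 is an assumed value.
    """
    return k * (score_a - expected_score(rating_a, rating_b))


# The 2000 fencer against one of the 1800 fencers from the example above.
p_win = expected_score(2000, 1800)        # chance the 2000 fencer wins
gain = rating_change(2000, 1800, 1.0)     # small reward for the likely win
loss = rating_change(2000, 1800, 0.0)     # larger penalty for the unlikely loss
average = p_win * gain + (1 - p_win) * loss

print(f"win chance {p_win:.2f}, gain {gain:+.1f}, loss {loss:+.1f}")
```

Under these assumptions the favourite wins about 76% of the time, picks up roughly 7.7 points for a win and drops roughly 24.3 for a loss, and the two average out to zero if the ratings are accurate. The bigger the possible loss, the less likely it is to happen.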
And the same with the current system. A letter degrades every year if you don't renew it. After 4 years it drops to the next letter, as you know very well. I would like to see the cap be 2 years instead of 4, but it seems that this point is already served by the current system.
I believe a rating goes inactive in chess much more quickly, and instead of just being a B06 instead of a B07, IIRC you have to totally re-earn your rating after it goes inactive. So there's more incentive to keep fencing frequently.
Not to put too fine a point on it, but Chess is a board game. It is not a sport. Fencing is a sport with all the injuries, conditioning peaks and so forth that go with a sport. I can play chess with a pulled hamstring and a hangover. I could fence with a pulled hamstring and a hangover also (hell, I do...). The difference between now and an ELO future is that if I lost to a scrub because my leg gave out, I would not lose my rating, or have it lowered.
If your leg really gave out, I would think you'd medically withdraw. However, this whole mindset of "oh no! I'll lose my rating!" is a holdover from a peak-performance-based rating system. With an Elo system, if you lose 50 points at a tournament because your leg gave out and you lost to a scrub, you gain them back by winning the next tournament. Ratings aren't determined by big outlying events nearly as much as by your average performance, and so you needn't worry about fluke bad performances ruining your rating. One loss to a scrub will maybe make you underrated at most until the next local tournament.
Physical factors, injuries and peak times in training cycles as well as stress from life/school events all seem to point to people skipping more events under an ELO system, IE only fencing in events where they are at a strong time in their training/development.
People vary in performance in chess much like they do in fencing, and it doesn't stop them from competing. Here's the other thing, too-- if you fence in a bunch of tournaments during a low point in your training cycle, your rating will go down some to reflect that, which is an accurate assessment of your performance at the time. But, as your cycle improves, your rating will follow it-- since it's not based on outlier events, you needn't be nervous about whether you'll be able to win an A tournament to get your A back again. It isn't a big mountain like that. Your rating follows your increase in performance as you get back to fighting strength. This is what's great about using a rating system that's actually a reasonable estimator of competitive strength-- you have to worry much less about people gaming the system or fencing in this tournament or that tournament to improve their rating so it doesn't accurately reflect their strength, because the system doesn't work that way.
I think we would all be better served by something simple (just spit balling here...) like expanding the points list to 100 for each weapon and decree that national points must be factored into seeding for all events. Between the degrading of ratings over time, the inclusion of expanded national points and the system as it stands, I think the system would be more accurate than now without all the chaos and problems of an ELO type system.
Dunno what chaos and problems you're referring to, and I don't know why an Elo system wouldn't be better than an expanded points list.
Regardless, scared of math or not, the fact remains that it would seem rather difficult to have this done in a transparent fashion, to assure that things were done correctly at a local and national level and to ensure that someone in the USFA or a Division were not, say, skewing the results so that their kids/students would make the national team. Now it is pretty easy, even my math skills are up to it. If it is done by some archaically coded database that only the USFA has access to, that means that we have to have total faith in the USFA. For me, that would require overhauling a hell of a lot more than just the ratings system!
This is a slightly more cogent point. In general, the average chess player does not want to and would have difficulty verifying the exact change in ratings due to each tournament. However, the current formulae for team selection are pretty goddamn arcane. Particularly the strength factors for various international tournaments can be pretty difficult to calculate, and I'd say that the math for that is approximately equivalently difficult, if not more so, to the math necessary to calculate Elo ratings. So in terms of team selection corruption, I'm not worried.
The other thing that prevents this from being an issue is that the USFA can easily publish the formula, and even the source code, of its algorithm for ratings adjustments. So anybody who is interested enough that they'd calculate the current team standings for themselves could also calculate ratings adjustments for themselves. Any geeks who were interested could also come in and check the correctness of the results by looking at the actual code, too.
I think I would love to see a mock up system that took those sorts of things into account. I have not seen that so far and that might be part of the reason for my resistance.
Perhaps when I am not quite as burdened with homework as I am now, I will put up a mockup on my website. I'm sure Peet would also have little or no difficulty doing the same, with his gigantic amounts of data.
Edit: To reiterate, I'm 99.99% sure that it's totally easy to make an Elo system that takes only results lists, not individual bouts, and I agree that taking individual bouts and scores would totally f*** everything up.
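To make "takes only results lists" concrete, here is one possible sketch. It is an assumption, not a worked-out proposal from this thread: it treats the final ranking as if everyone had beaten each fencer placed below them, compares that with the standard Elo expectation, and averages over opponents. The names, ratings and K = 32 are all made up for illustration.

```python
from typing import Dict, List


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_from_results(ratings: Dict[str, float], final_order: List[str],
                        k: float = 32.0) -> Dict[str, float]:
    """Return new ratings computed from a final results list only.

    final_order lists fencers from first place to last. Each fencer is scored
    as having beaten everyone placed below them (an assumed simplification).
    """
    n = len(final_order)
    updated = dict(ratings)
    if n < 2:
        return updated  # nothing to compare against
    for i, fencer in enumerate(final_order):
        surplus = 0.0  # actual minus expected score, summed over opponents
        for j, other in enumerate(final_order):
            if i == j:
                continue
            actual = 1.0 if i < j else 0.0
            surplus += actual - expected_score(ratings[fencer], ratings[other])
        updated[fencer] = ratings[fencer] + k * surplus / (n - 1)
    return updated


# Hypothetical four-person event where the third-rated fencer wins.
before = {"Ann": 1900.0, "Bob": 1800.0, "Cat": 1700.0, "Dan": 1600.0}
after = update_from_results(before, ["Cat", "Ann", "Bob", "Dan"])
for name in before:
    print(f"{name}: {before[name]:.0f} -> {after[name]:.1f}")
```

The bout committee only has to hand over the final placings, which fits the "run the tournament on paper, then feed the results through a simple program" idea. Points gained and lost cancel out across the field, so the total stays the same.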
On an aside I just reminded myself of chess competition format - it's been a while. Anyhow I wonder if part of the problem of transferring an ELO system to fencing (as per the VA division forum discussion) is the differences in the competition format.
Have decided that running a 'swiss format' fencing competition might be an interesting exercise.
Thanks EAC. I am going to take some time to mull over all that. You do make it sound very reasonable. There are some assumptions that I am not sure I jibe with, but I am going to try and think it out. I hate it when I have my mind all made up and someone comes along with silly things like facts and rational arguments that make me go and have to think about it again...
Just another lost soul saved by the (hit) First Church of EPEE!
Reasonable. My issues:
1) The thing doesn't do provisional ratings, which as above I think it should.
2) Ratings don't go inactive, which as above I think they should.
3) I am suspicious of the whole size factor/credited-with-3-wins thing. I think you should be rewarded mainly by the DE round you made it to, because everybody knows (and the current classification & point systems reflect) that the size of the DE round you made it to is much more meaningful than which place you had within that DE round.
the catch with this is that you don't want the ratings to be too sticky or too fluid - until the system is instituted there is no way of knowing how it will work.
The USFA should give each division an algorithm and tell them to come back in three years
My question was "I don't understand the objection to using individual matches in the ELO calculation."
Originally Posted by keith
...because winning is winning under the current system. Under an 'each touch counts system' doubling out an epee bout becomes a less good idea.
and I don't understand this answer. In chess each match is counted. It doesn't matter if the winner is a pawn ahead or two queens ahead at the end. It's just 1-0. If a fencer wins a match 5 to 0 or 2 to 1 (clock expires) it's still one win against that particular fencer, not every fencer who happens, in the end, to finish lower in the standings. Neither the winner of that match nor the loser of that match may have fenced ANY of the lower finishers in a large tournament. Why should their standing count? Fencing, even more than chess, is Mano a Mano (excuse me ladies, I include you.)
gillaspy: Fencers fence nowadays to maximize their final ranking. They don't fence to beat particular people. If they did, then for instance if you got into a repechage, you could intentionally lose your first bout with the intention of winning the next four bouts, and do better ratings-wise than if you had won three bouts in a row straight through. Basically, the whole incentive structure is based on people trying to maximize their final ranking, and especially their final DE round, not on trying to beat particular people. We don't want to mess with the structure of incentives, and so we pay attention to final results, not individual bouts.
Furthermore, it would be somewhat more difficult and time-consuming to report all bouts, rather than just a final ranking, to a national office.
keith: I don't understand what you're saying, (what do "sticky" and "fluid" mean?) and I don't agree that different divisions should try things, because a) divisions are badly run, and b) the system that will work is fairly clear, and a well-run implementation by the national office will be much better received by the fencing community than several badly-run implementations by divisions.
and I don't understand this answer. In chess each match is counted.
apologies - some of the points systems include hits scored/received; then again some don't.
If you don't include bout scores then you have two choices:
assemble the finishers in tranches depending on which round they got to.
or
assign points based on finishing position (see the Masin example).
Clearly if you assign/remove points based on finish position you are comparing fencers based on weak data; in the 128 one fencer lost to the No.1 seed another to the 64th.*
On the other hand if you use tranches then ratings may become rather sticky - you need a real jump/collapse in performance to move up/down.
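A rough sketch of the tranche option, to show where the stickiness comes from. This is only an assumed variant of the standard Elo update: fencers eliminated in the same round count as having drawn with each other (score 0.5), and anyone who went a round further counts as having beaten them. Names, ratings and K = 32 are hypothetical.

```python
from typing import Dict


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_by_tranche(ratings: Dict[str, float], tranche: Dict[str, int],
                      k: float = 32.0) -> Dict[str, float]:
    """Return new ratings using only the round each fencer reached.

    tranche maps fencer -> tableau size they finished in (1 = winner,
    2 = lost the final, 4 = lost a semifinal, ...). Smaller is better.
    """
    names = list(ratings)
    updated = dict(ratings)
    if len(names) < 2:
        return updated
    for fencer in names:
        surplus = 0.0
        for other in names:
            if other == fencer:
                continue
            if tranche[fencer] < tranche[other]:
                actual = 1.0   # went further in the tableau
            elif tranche[fencer] == tranche[other]:
                actual = 0.5   # same tranche, treated as a draw
            else:
                actual = 0.0
            surplus += actual - expected_score(ratings[fencer], ratings[other])
        updated[fencer] = ratings[fencer] + k * surplus / (len(names) - 1)
    return updated


before = {"Ann": 1900.0, "Bob": 1800.0, "Cat": 1800.0, "Dan": 1700.0}
rounds = {"Dan": 1, "Ann": 2, "Bob": 4, "Cat": 4}
after = update_by_tranche(before, rounds)
for name in before:
    print(f"{name}: {before[name]:.0f} -> {after[name]:.1f}")
```

Bob and Cat come out with identical adjustments no matter which of them would have been listed third and which fourth, so placing within a round moves nothing. You only gain or lose relative to someone by finishing in a different round.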
One difference between a fencing competition and a chess competition (swiss) is that the initial rating gap in the chess table is equal - so the best has to beat the median and the median(+1) the worst. This of course is not true in a DE tableau.**
While all this is fascinating it does kind of ignore the point that all you need to do is seed to pools - not predict the outcome of the competition. Unless the USFA is planning on getting into the gambling business of course.
There is also the question of whether the poor seeding people complain of is a function of the rating system or the fencers.
*assume a perfect competition.
**which raises the question of whether one's ideal finishing position should be based on your rating on competition entry or after seeding the tableau (pool result based).
I don't understand what you're saying, (what do "sticky" and "fluid" mean?)
how easy it is to gain/lose points. If the algorithm gives too much point movement it means that fencers are bouncing up and down the seeding table so much that you don't really have any increase in resolution for seeding purposes.
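In standard Elo the knob that controls this is the K-factor. The sketch below is only an illustration with made-up numbers: the same hypothetical run of results against equally rated opponents is fed through a small K and a large K, assuming the usual chess formula.

```python
from typing import List


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def trajectory(results: List[float], k: float,
               start: float = 1800.0, opponent: float = 1800.0) -> List[float]:
    """Rating after each result (1.0 = win, 0.0 = loss) for a given K."""
    rating = start
    history = []
    for score in results:
        rating += k * (score - expected_score(rating, opponent))
        history.append(rating)
    return history


# Hypothetical mixed run of results against 1800-rated opponents.
results = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
sticky = trajectory(results, k=8.0)    # small K: rating barely moves
fluid = trajectory(results, k=64.0)    # large K: rating bounces around

sticky_swing = max(sticky) - min(sticky)
fluid_swing = max(fluid) - min(fluid)
print(f"K=8 swing: {sticky_swing:.1f} points, K=64 swing: {fluid_swing:.1f} points")
```

Too small a K and ratings lag behind real changes in form; too large and the seeding table reshuffles after every event. Which value suits fencing is exactly the kind of thing that would need trial data.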
Originally Posted by eac
and I don't agree that different divisions should try things, because a) divisions are badly run, and b) the system that will work is fairly clear, and a well-run implementation by the national office will be much better received by the fencing community than several badly-run implementations by divisions.
Trouble is until you use the system you don't know what the problems are - or if pool seeding improves over time.
I have a problem with the ref factor. To lose rating points because the local scrub ref is having a bad day is a problem. All the various potential problems with the system can be worked out except this one. The only way we will know is to actually implement a system, track the results and decide after 2-3 seasons if it is working.
Go to the well until the well is dry. When the well is dry find a new well.
Every rating system will be in error to the extent that the refs are in error. People get letters all the time because of ref error. Ref error, however, does average out quite well if you are including small rating changes from every tournament. In fact, there's less reason to worry than there is in the current system, because in the current system, someone can be mis-seeded for 4 years because of one ref error, while in the Elo system, said ref error will be totally gone in a few tournaments.
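Here is a small sketch of why a one-off error washes out, under some loud assumptions: standard chess Elo, K = 32, updates applied per bout, opponents all rated 1800, and results that on average follow the fencer's true strength. It tracks the average drift only, not the bounce of individual results, and the 30-point error is invented.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def average_drift(rating: float, true_strength: float, opponent: float,
                  k: float = 32.0) -> float:
    """Average rating change per bout when results follow true_strength.

    The fencer scores expected_score(true_strength, opponent) on average,
    while the system predicted expected_score(rating, opponent).
    """
    return k * (expected_score(true_strength, opponent)
                - expected_score(rating, opponent))


true_strength = 1800.0
rating = 1770.0  # hypothetical: 30 points lost to one bad call
errors = []      # how far the rating sits below true strength after each bout
for bout in range(30):
    rating += average_drift(rating, true_strength, opponent=1800.0)
    errors.append(true_strength - rating)

print(f"error after 10 bouts: {errors[9]:.1f}, after 30 bouts: {errors[-1]:.1f}")
```

The gap shrinks a little with every bout because an underrated fencer keeps outperforming what the system expects. How many tournaments that takes depends on K and on how many results count per event.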