lichess.org

Chess Coaching Metrics

So you're in an arms race with your buddy or that annoying guy in your office to learn faster/easier/better so you can wipe that smirk off their face?

So how in the world do you rank a chess coach? Here's how we do it now:

* Titles
* Online ratings
* Number of reviews on a coaching profile
* 5-star ratings

What does that really tell you though?? I guess it tells you that they play well and got people to write things. But can they actually coach and teach well?

I can think of a few metrics that might tell a better story... what if there were a way to:

* Track student rating increase/decrease during and after working with a coach. <- Indicates improvement or lack thereof - did the coach motivate them to improve and give them the tools to do it? I guess that would show up here.

* Longevity of teaching each individual, or just an average length. <- Would indicate likeability, I suppose, but not necessarily more.

So how should we do it!? You guys are smarter than me. What's the real answer to ranking a chess coach?
A "Better Business Bureau" idea for Chess coaches? :-)

I see no easy solution.

- Just because a chess coach was good for one person does not mean they'd be good for another person. A good match is desired.
- A chess title says nothing at all about teaching ability.
- Chess rating says nothing at all about teaching ability.
- Reviews seem like they could be helpful in finding a good match - if the reviews are well written. Just the stars are not enough info.
- Student stats are an interesting idea. However, suppose a coach gets students that don't perform well for reasons outside the coach's control?
I thought that too, @jomega. I was just playing with my own stats today to try to figure stuff out, and that's where this post came from.

**Before you read my stats below: I truly believe that some other coaches would be highlighted better by other stats from students, which is why I know what I did below can't hold up to scrutiny -> it needs more.

In 2020, I taught 39 different players who paid me for at least 1 lesson over the course of the year. Those players gained on average 201 points from the moment they hired me (or from Jan 1, 2020, if they began in 2019) until a rating check-in on December 31, 2020. I used whichever rating was the most active for this statistic. The biggest change was 776 points on the year, with 6 of 39 students gaining over 500 points in 2020. Oh, and nobody took just 1 lesson. The average duration that I worked with a given student was 148 days between the first lesson (or Jan 1 if they began in the previous year) and either their last lesson or a December 31 cutoff.
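The bookkeeping behind stats like these is simple enough to sketch. This is a hypothetical illustration, not the real data: the field names and the two example students are invented, and the Jan 1 / Dec 31 cutoffs follow the measurement window described above.

```python
# Hypothetical sketch of the per-student measurement described above.
# The records below are invented examples, not real lesson data.
from datetime import date

YEAR_START, YEAR_END = date(2020, 1, 1), date(2020, 12, 31)

# One record per student: first paid lesson, last lesson, rating when
# hired (or at Jan 1 for 2019 starters), and the Dec 31 check-in rating.
students = [
    {"first": date(2019, 11, 5), "last": date(2020, 6, 20), "start": 1400, "end": 1620},
    {"first": date(2020, 3, 1),  "last": date(2020, 12, 10), "start": 900, "end": 1676},
]

gains, durations = [], []
for s in students:
    begin = max(s["first"], YEAR_START)   # Jan 1 cutoff for students who began in 2019
    end = min(s["last"], YEAR_END)        # Dec 31 cutoff for the measurement
    gains.append(s["end"] - s["start"])
    durations.append((end - begin).days)

print(sum(gains) / len(gains))            # average rating gain -> 498.0
print(sum(durations) / len(durations))    # average days worked together -> 227.5
```

With the real 39 records, the same loop would produce the 201-point and 148-day averages quoted above.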

I believe there is a solution to be found.
I don't believe you can drill down into specific quantifiable attributes like these to assess coaching skill.

Did you explain really difficult concepts in such a way that I suddenly saw the light, even though I've read similar material in a book?

Do I feel like when I look at a complicated position, I just have a much better and clearer understanding of what's going on?

I wouldn't really care about changes in rating levels of your students. Is that from 600 to 1200 or 1500 to 2100?

Longevity isn't important to me. If you can be an extraordinary catalyst in your student's learning curve and propel someone up to the next level by helping him see and solve the problems with a different perspective, and that only takes a month, that would be far more attractive to me than someone that takes a year to accomplish the same thing.
This is inherently a subjective thing and I doubt you can really devise a way to accurately measure who is a good coach and who isn't.

I've trained a few people (about 3), all of whom started at about the same level, 1200-1400. Within two months all of them got to at least 1800 (the biggest outlier went from 1500 to 2100 in a little under 4 months, but that's just because he is a ridiculous player; I do not expect to ever see that again, and it's not really because I am a teaching legend). Is that because of me? Maybe, hard to tell. I can't magically make someone a better chess player. I don't own their improvement. It is strictly because all of them were willing to put in the effort and time to get better. Only they can take full credit for their improvement.

That being said, there are a few things I like to try to apply to my own teaching because I think they are important to efficacy.

1. Inspiration. This is the single. most. important. thing. a teacher can give a student, in any field. Your job is to get them excited about the game, excited about improvement. Your job is to motivate them to do the best that they can do.

2. Attention to your student's needs. At all times you should know what specific area of their game needs the most improvement. You need to find creative ways to help them surmount these barriers. Not everyone is the same, and not everyone struggles with the same thing. So it is very important not to be a system teacher.

3. A level of competency in the field you are teaching in. You don't have to be an expert, but you want to be absolutely sure that the information you are giving them is correct. You do not want to instill bad habits in your students, or bad thought processes.
Any good metric, when it becomes a target, ceases to be a good metric.

> Track Student rating increase/decrease during and after working with coach.

Suppose this was used to rank coaches. Then coaches would be less willing to take on strong players who would improve relatively slowly compared to a new player.

> Longevity of teaching each individual or just an average length.

Coaches should be able to drop students who aren't putting in their half of the effort to improve, or who are disagreeable. Having a longevity metric as a target adds some toxicity because the coach is motivated to keep on students which they otherwise would drop.
So add a metric that weights the starting level of the player? @ericmsd @phobbs5

That seems attainable. Nice idea.

The longevity thing is interesting to me. It feels like there must be a way to make that metric meaningful. I've stopped teaching players this past year due to computer abuse (I hate when people do that so much), but not really for other reasons. I had a few times when I wanted to quit on students during the year, but figured if the student had the need and desire then I would press on and figure it out.

I guess I never met somebody this year whom I deemed disagreeable or unmotivated. Though the last coach I hired several years ago cut me because he thought I was disagreeable. IMO it still says something about that coach that he did that... I asked for a refund because he didn't show for a lesson, and his response was to flame me and tell me he wouldn't teach me anymore. The metric would mean a ton on that one.. and I don't wish that kind of interaction on anyone anywhere.
The thing phobbos was pointing out is that those metrics are useless if you go into coaching trying to pad them. They only have value strictly because they aren't used as metrics, and if they were used that way, they would lose all value.
@phobbs5 "Any good metric, when it becomes a target, ceases to be a good metric".....where do you work??..cause that's hilariously wrong. You measure things to drive outcomes....furthermore...you often measure things to improve them...aka the Hawthorne effect...Yes, it would have to be weighted somehow with a counterbalancing metric or 2...but still...metrics drive goals...the mindset that you can't have metrics that become targets is why businesses fail. Your other points are moot.

You can measure outputs and assume the inputs are valid if outputs are consistent....So if you measured outcomes, of ELO increase ....the "how" becomes irrelevant. Obviously you would need significant sample sizes before even being considered relevant data points. The slow learners & outliers in this type of data review are nearly irrelevant...just like fake reviews on amazon...it would eventually catch up to the data producer aka coach.
@HarmlessChessNub This is a reference to Goodhart's Law (www.lesswrong.com/posts/YtvZxRpZjcFNwJecS/the-importance-of-goodhart-s-law) and the main issue is with respect to how the measure relates to incentives. Quoting the linked article: "The most famous examples of Goodhart's law should be the soviet factories which when given targets on the basis of numbers of nails produced many tiny useless nails and when given targets on basis of weight produced a few giant nails. Numbers and weight both correlated well in a pre-central plan scenario. After they are made targets (in different times and periods), they lose that value."

The point is not to measure nothing, but to understand how targets will affect incentives and to use that deeper understanding to define measures that are resistant to gamification. The book "Measure What Matters" by John Doerr describes putting this into practice effectively via OKRs.

@DrHack -- I think ELO gain over time is probably good enough as a proxy metric for coaching efficacy though I think it may be a little tricky to measure effectively. For example, I'd be wary of the counterfactual (e.g., what would the student's ELO gain have been without the coach), and to understand that you would need to consider the selection bias inherent to the population of players seeking coaching to begin with. Here's how I'd start thinking about it:
-- A) Determine the base rate of avg. ELO growth over N time within given rating ranges: The assumption is that (on average) growth follows a Pareto principle (e.g., law of 80/20) within certain rating ranges and then drops into diminishing returns after a consistent threshold.
-- B) Bisect A into 2 groups: 1) chess players who proactively seek coaching, 2) chess players who do not. The assumption here is that we only really care about group 1. The goal is to understand how coach performance affects growth rate among players who are actually trying to grow, not this pool + random players.
-- C) Track student performance relative to the avg. growth rate of the pool of players being coached within the student's rating range.
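Steps A-C can be sketched in a few lines. This is a minimal illustration under stated assumptions: the pool data, the band width, and all function names are invented; a real version of step A/B would need large samples split into coached vs. non-coached populations.

```python
# Hypothetical sketch of the base-rate-adjusted metric (steps A-C above).
# The pool data and the 400-point band width are invented for illustration.
from statistics import mean

# (starting rating, rating gain over the period) for players in group B1:
# players who proactively sought coaching.
pool = [(800, 250), (850, 310), (1200, 140), (1250, 160), (1600, 60), (1650, 80)]

def band(rating, width=400):
    """Bucket a rating into a band, e.g. 1220 -> the 1200-1599 band."""
    return (rating // width) * width

def base_rates(pool, width=400):
    """Step A/B: average gain per rating band among coached players."""
    by_band = {}
    for start, gain in pool:
        by_band.setdefault(band(start, width), []).append(gain)
    return {b: mean(g) for b, g in by_band.items()}

def adjusted_gain(start, gain, rates, width=400):
    """Step C: a student's gain relative to the base rate for their band."""
    return gain - rates.get(band(start, width), 0)

rates = base_rates(pool)
# A student who started at 1220 and gained 200 points, measured against
# the invented 1200-band base rate of 150 points:
print(adjusted_gain(1220, 200, rates))  # -> 50
```

A coach's score would then be the average adjusted gain across their students, which answers the weighting concern raised earlier: a 200-point gain counts for more in a band where the base rate is low.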
