truechess.com also did this, in a more comprehensive way than the authors of the paper(s) reported by Chessbase.
Ken Regan has also compiled some results like this in his attempts to devise an automated cheating detection framework.
The problem is that the analysis is fairly shallow, out of necessity.
Once you start comparing classical time control games between very strong players, that becomes an issue, because the engine at that shallow level may well be weaker than the players it's evaluating.
Matej Guid and Ivan Bratko (the authors of several such papers, including the one that is the subject of that Chessbase article) have argued that it doesn't matter if the engine is weaker than the players, but I disagree.
Going into their argument and why I think it fails is beyond the scope of this post, but I'll gladly discuss it if anyone wishes to pursue it.
There are two other big problems with this general approach. The first is that centipawn loss is really a function of two things: your strength and your opponent's strength.
That comes up quite frequently here: a player posts a game with almost no centipawn loss, but only because their opponent played quite badly.
Even top GMs who keep centipawn loss very low in general see it jump up significantly when playing stronger GMs or engines.
Both Guid and Bratko (in their papers) and truechess.com try to account for this by calculating the complexity of a position and factoring it in, but the methodology is a bit suspect in both cases (there's just no easy and accurate way to do this with automated engine analysis).
The second problem is that centipawn loss, useful as it may be as a rough guide, is an average over all your moves.
In chess, as it turns out, your strength of play depends much more on your weakest moves than on your average moves, a point made by Tord Romstad while discussing how to design engine evaluations in his rightfully famous post at
http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=135133&t=15504 .
That is to say, losing about 5 centipawns every move will score much higher than losing 0 centipawns except for every 20th move when you lose a full pawn (that's actually relatively easy to simulate by fiddling with the source of some open source engines and running a match).
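To make the arithmetic concrete, here's a tiny Python sketch (not from any of the papers, just the hypothetical numbers above) showing why averaging hides this: the steady player and the occasional blunderer end up with exactly the same average centipawn loss, even though their worst moves differ by a factor of twenty.

```python
# Two hypothetical move-loss profiles over a 100-move game.
moves = 100

# Player A: loses ~5 centipawns on every single move.
steady = [5] * moves

# Player B: perfect play, except a full-pawn (100 cp) blunder
# on every 20th move.
spiky = [100 if (i + 1) % 20 == 0 else 0 for i in range(moves)]

avg_steady = sum(steady) / moves   # 5.0
avg_spiky = sum(spiky) / moves     # 5.0 -> identical averages
worst_steady = max(steady)         # 5
worst_spiky = max(spiky)           # 100 -> very different worst move

print(avg_steady, avg_spiky, worst_steady, worst_spiky)
```

Average centipawn loss sees these two players as identical, yet in practice player B hangs a pawn five times a game and player A never does, which is exactly Romstad's point about weakest moves mattering most.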
That becomes a big problem when the search is so shallow, because shallow analysis will miss some deep tactical mistakes and class other tactically sound combinations as blunders (truechess.com tries to mitigate this when determining blunder rate, but the mitigation is pretty limited).
That throws off both centipawn loss and blunder rate, which makes any comparison on that basis a bit suspicious, especially when combined with the fact we can only very crudely account for the strength of opposition/difficulty of the positions.
While SF is a stronger engine than any used in the previous studies, the analysis here at lichess is of such limited depth that I don't think it would really be an improvement on previous attempts.
The results would still be interesting though, so I might just upload the games from some matches and see what the results look like.
It's all still very interesting of course, and obviously none of these flaws have prevented me from spending far too much time reading about it, but it's worth taking all these attempts with a couple pounds of salt. :)