Thursday, December 29, 2011

Is Tiger Woods Loss Averse?: A Debate

In June of 2009, Alan Schwarz of the N.Y. Times reported on a draft study that used putting statistics to conclude Tiger Woods was loss averse. I thought events of the previous November had pretty much proved he was not loss averse.  In any event, I read the draft study and did not find it convincing.  I wrote the author about the problems I saw, and so began a long and continuing debate on who was right.  Let’s start with the letter:   

June 30, 2009

Professor Devin Pope
Wharton School
544 Huntsman Hall
3730 Walnut Street
Philadelphia, PA 19104-6340

Dear Professor Pope,

I read an article (June 17, 2009) by Alan Schwarz of the N.Y. Times summarizing your research.[1]   Schwarz  wrote your research had two main conclusions: 1) Golf professionals  had a slightly better chance of making a par putt than a birdie putt, all other things being equal, and 2) Loss aversion costs a top player $1.2 million in prize money per year.    
In reviewing a draft of your research, it is possible your first conclusion is due to an error in variables rather than “loss aversion” as claimed in the paper.  Your second conclusion is based on an analysis rife with errors and should either be redone or omitted from the paper.  Let me address each concern:
Error in Variables – Your research indicates  all 188 professional golfers in your study exhibit bias by hitting birdie putts less accurately than they hit otherwise similar par putts.  I found it strange that not one player was better at birdie putts.  Even though golf professionals come from all races, nationalities, and socio-economic backgrounds, they all putt like a girlie-man.
One possible explanation for this curious result is a missing variable.  As stated in your article, par putts may be more accurate than birdie putts because golfers putting for par may have learned important information.  You argue it is possible players learn important information from watching their partners (technically their fellow competitors) putt on the green. To control for this effect, you include separate dummy variables for the number of putts already attempted on the green by the player and his playing companions.[2]  The research showed such information had a significant effect on whether a putt was made.  I do not believe, however, your variables measure all of the information the player has.
A PGA Tour player misses the green about 35 percent of the time.  On most of those occasions, the player picks up valuable information when he chips or putts from off the green.   The player typically only has this information on par putts, and this may account for your conclusion that par putts are more accurate than birdie putts.  I do not believe your variables measuring the putting history of a hole account for the information gained through short approach shots.  To refute my claim, you could redo your analysis using only putts where a player hit the green in regulation.  (Note: This would probably decimate your sample, since there would not be very many long par putts.)
The value of information gained from chipping or putting from off the green should decrease as the tournament progresses.  By round 4 the players and caddies should have a better idea of the breaks and speeds of each green than they did in round 1.  Therefore, the difference in the probability of making birdie and par putts should decrease as the tournament progresses.   Such a decrease is reported in Table 5 of the paper.  (A corollary would be that the absolute probability of making each type of putt should increase as the tournament progresses.   I do not believe your paper examines that issue.)
Errors in the Economic Impact Analysis – The claim that loss aversion costs a top player an average of $1.2 million per year is based on the data in Table 8 of the paper.  A cursory examination of this table shows it is rife with errors and calls any conclusion into question.  An abbreviated form of this table is reproduced below.  I will use Tiger Woods as an example of the errors.
Abridged Table 8-Understanding the Costs of Missing Birdie Putts

Golfer | Tournaments Played | Scoring Average (72 Holes) | Tournament Earnings (2008) | Additional Earnings if Scored 1 Stroke Higher (sic) | % Earnings Increase if Scored 1 Stroke Higher (sic)
Tiger Woods | 6 | 67.7 | $5,775,000 | $410,000 | 6.6%


1.       Tournaments Played- Tiger did play in 6 events, but only 5 stroke-play events which were used in the study.
2.       Scoring Average (72 holes) – The heading should be the scoring average for 18 holes, not 72 holes.  Tiger Woods’ scoring average in 2008 was 68.9,[3] not 67.7 as reported in the table.
3.       Tournament Winnings – Winnings from the Accenture Match Play Tournament were mistakenly included in the total.  This tournament was not part of the study.   Mr. Woods’ total in the five stroke-play events was $4,425,000.
4.       Additional Earnings if Scored 1 Stroke Higher- The heading should read one stroke “lower.”  Assuming a one stroke lower score was meant, the results are still in error.  As shown in the table below, Mr. Woods won three events (Buick, Arnold Palmer, and the U.S. Open).  Scoring one less stroke would not have increased his earnings in these events.  Mr. Woods placed solo second at the Masters.  Since he trailed the winner by three strokes, however, one less stroke would not have increased his earnings at this event.  One less stroke at the WGC-CA event would have pushed Mr. Woods from solo fifth place to a four-way tie for second.  He would have won an additional $183,750, not the $410,000 shown in Table 8.
Table-Tiger Woods’ 2008 Money Winnings

Tournament | Placement | Winnings | Placement With 1 Less Stroke | Winnings With 1 Less Stroke
Buick Invitational | 1 | $936,000 | 1 | $936,000
Arnold Palmer Invitational | 1 | $1,044,000 | 1 | $1,044,000
World Golf Championship | 5 | $285,000 | T2 | $468,750
Masters | 2 | $810,000 | 2 | $810,000
U.S. Open | 1 | $1,350,000 | 1 | $1,350,000
Totals | | $4,425,000 | | $4,608,750


5.       % Earnings Increase if Scored 1 Stroke Higher- This heading should also read “1 stroke lower.” Table 8 reports a 6.6% increase in prize winnings for Mr. Woods.  Using the corrected numbers for Tiger Woods ($183,750/$4,425,000), his increase in earnings is 4.2%.  This is an error of 36%.  (Using the numbers shown in Table 8 ($410,000/$5,775,000), the percentage increase should be 7.1% not the 6.6% reported. The calculated percentages for the other players also appear to be in error.) 
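The corrected figures in points 3 through 5 can be checked with a few lines of arithmetic. The sketch below is only a sanity check of the numbers quoted in this letter. The variable names are illustrative, and the dollar amounts are the ones listed in the table of Mr. Woods' 2008 stroke-play winnings.

```python
# Sanity check of the corrected Tiger Woods figures quoted in this letter.
# Each entry: (actual 2008 winnings, winnings with one less stroke),
# taken from the table of the five stroke-play events.
events = {
    "Buick Invitational": (936_000, 936_000),
    "Arnold Palmer Invitational": (1_044_000, 1_044_000),
    "World Golf Championship": (285_000, 468_750),
    "Masters": (810_000, 810_000),
    "U.S. Open": (1_350_000, 1_350_000),
}

actual_total = sum(actual for actual, _ in events.values())
improved_total = sum(improved for _, improved in events.values())
additional = improved_total - actual_total

# Corrected percentage increase from scoring one stroke lower.
corrected_pct = 100 * additional / actual_total

# Percentage implied by Table 8's own (uncorrected) dollar figures.
table8_implied_pct = 100 * 410_000 / 5_775_000

# Relative error of the 6.6% printed in Table 8 against the corrected figure.
reported_pct = 6.6
relative_error_pct = 100 * (reported_pct - round(corrected_pct, 1)) / reported_pct

print(f"Actual total:            ${actual_total:,}")
print(f"Total with 1 less stroke: ${improved_total:,}")
print(f"Additional earnings:      ${additional:,}")
print(f"Corrected increase:       {corrected_pct:.1f}%")
print(f"Table 8 implied increase: {table8_implied_pct:.1f}%")
print(f"Relative error of 6.6%:   {relative_error_pct:.0f}%")
```

The only event where one stroke changes the payout is the World Golf Championship, so the whole correction rests on that single row.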
I have a couple of minor editorial points.  You write the bottom third of the players are eliminated after two rounds.  Actually, it is closer to the bottom half.  Most tournaments have approximately 150 entrants.  The low 70 and ties make it to the weekend.  I assume your sample included the International Tournament which used modified Stableford scoring (bogey = -1, par = 0, birdie = 2, eagle = 5).  This tournament should have been excluded from your sample.  It would be interesting to examine, however, if the probabilities remain the same when the value of a birdie or eagle is greatly increased relative to par.
You certainly have produced a thought-provoking piece.  I hope these comments are of some assistance as you work through the peer review process.
Sincerely,

Laurence Dougharty
Cc: Alan Schwarz



Mr. Schwarz wrote the following response after conferring with Dr. Devin Pope, one of the authors.  Dr. Pope’s comments are in blue and identified by his initials “dp.” I have interjected my comments in italics and identified them by the initials “ld.”

I corresponded with the author of the paper and he (Dr. Pope) said the following:
dp -His second concern (about how we calculate the 1.2 million number) has a few valid points and a few not-so-valid points.  This was a simple back-of-the-envelope calculation that an RA put together for us in order to try and give people a feel for how much 1 stroke per tournament is worth.  It appears as if there are a few errors in the table.  Although not all the things he pointed out are errors. For example, he thinks we are doing this just for the tournaments in our data whereas we are actually doing it for all of the tournaments whether in our data or not.  We assume that our effect occurs in all of the tournaments and we then calculate using the entire sample.
ld – I did not think the tournament winnings calculation was restricted to tournaments where Shotlink data was available.  That would have excluded the Masters and U.S. Open.  The Accenture Match Play Tournament should have been excluded because it is impossible to measure the impact of one stroke on the final tournament standings.

dp -In other words, his second concern will probably cause us to have another RA go through and update the Table 8 for errors and then we will clean up the language so people aren't confused.  But no matter how you do it, 1 stroke = large number.
ld – An RA is doing research and his work is not checked? It makes me wonder if an RA was responsible for preparing the data for the regression analyses.

dp -His first concern regarding learning from chip shots is something we could potentially rule out that currently is not discussed in the paper.  It is extremely unlikely that this is causing our effect given that it does not explain the round-to-round differences and it also doesn't explain why people would leave it up short of the hole for birdie which is a major part of the paper.  That being said, we can hopefully rule out this concern with the data itself rather than theorizing.

Hope you find that helpful. And thanks for reading.

Alan Schwarz
The New York Times

I thought Mr. Schwarz was still convinced the research was sound, and I was only noting trivial errors.  I wrote the following e-mail to buttress my case.  He passed it along to Dr. Pope who made comments (in blue and marked as “dp”). My counterarguments are italicized and identified as “ld.”
Mr. Schwarz,
Thank you for your courteous response.  I am still puzzled why you would base an article on an unpublished paper with questionable findings.  You may have thought the Wharton professor was the next Bill James of golf.  I do not believe that to be the case.  Moreover, a strong case can be made from the study’s own results that the “effect” (higher probability of making a par putt than a birdie putt) does not exist.      
In my original review of the study, I proposed that information gathered from a short approach shot could explain the small difference in probability for making par and birdie putts.  In his reply to you, the author wrote “it is extremely unlikely that this (my hypothesis) is causing our effect…”  This must be another “back of the envelope” probability estimate since the author has not performed any empirical test of my hypothesis.  Here are some reasons why the hypothesis could be true: 
Common sense No. 1 – Assume Player A hits an approach shot from 180 yards that stops 6 feet from the flagstick.  Assume an identical Player B putts from off the green to the same point 6 feet past the flagstick.  Who has the better chance of making the putt?  (I am assuming the players are in different groups, so there is no learning from one another.)  I suspect every Tour player you ask would choose Player B.  Missing a green should have a measurable effect on the probability of a player making his first putt. 
Common Sense No. 2 – The author reported that all 188 golf professionals in the study suffered from loss aversion.   (Was the entire gene pool of risk takers decimated by saber tooth tigers?)   It seems unlikely that not one player would be better at making birdie putts than par putts.  A more logical explanation is an important variable is missing from the regression model.  My hypothesis is a variable measuring “greens in regulation (GIR)” would eliminate the disparity between the probability of making birdie and par putts.  
dp -I can give lots of examples of biases that all people exhibit.
ld -It would have been nice if Dr. Pope had named a universal bias.  Not to jump ahead, but in the published version of the article, Dr. Pope did indeed find players who were not loss averse. 
Empirical Evidence No. 1 – If my hypothesis is true, players who have a high GIR percentage should have a smaller difference in the probability of making a par or birdie putt.  The author found the difference went up with a player’s rank (Figure 8b); i.e., the 50th ranked player would be expected to have a larger difference than the 1st ranked player.  I assume a player’s GIR ranking is positively correlated with his ranking.  If so, the higher the GIR ranking, the smaller the difference in the probability of making par and birdie putts.  This is exactly what my hypothesis would predict.
                  dp -For other reasons, I have calculated the GIR percentage in our golf dataset.  I went ahead and quickly ran the correlation between GIR Perc and our par-birdie differential using the 188 golfers in our sample.  The correlation was almost zero (t-stat: .52; p = .65). 
ld – Dr. Pope does not present his work so it is hard to critique.  Did he correlate GIR and the par-birdie differential by tournament, by year, or use averages over the study period?  He did plot the par-birdie differential against a player’s 2007 world ranking.  Since rankings change during the study period, Figure 8b should be viewed with skepticism.
Empirical Evidence No. 2 – In your article, you cited Paul Goydos and Geoff Ogilvy as two players who made par putts much more often than birdie putts.  They also ranked low in GIR in 2007: Goydos was 192nd and Ogilvy was 158th.  This would lend more credence to the hypothesis.
 dp -See above result.  Also, note in the article that Tiger Woods was about average in par-birdie differential even though he is by far the best at GIR Perc.
Theoretical Evidence – The author argues it is highly unlikely my hypothesis is true since it does not predict the decline in the “effect” over the tournament.  In fact, it does predict such a decline.  By the fourth round, the player and his caddie have probably seen 8 to 9 putts on each green.  The value of information gleaned from a short approach shot diminishes –i.e., you would pay more for information on a green you had not seen before than one you play regularly.  As the value of “approach knowledge” diminishes, my hypothesis would predict the decline in the effect.
dp -This is possible, but the evidence suggests that this is not what is driving the results. 
ld – What evidence?
The author argues my hypothesis does not explain why “players leave it up short of the hole for birdie which is a major part of the paper.”  He is correct.  But do players leave it short?  The author’s methodology is questionable; confusing may be a better word.  He writes “birdie putts are hit .29 inches less hard than par putts.”  This is confusing since “inches” is not a unit of force.  A reasonable assumption is he found that on putts of equal length, missed par putts went .29 inches further than missed birdie putts.  This is only 5 percent of one revolution of a golf ball.  If my assumption is correct, the reported difference in distance is physically insignificant even if not statistically insignificant. (Note: The paper does not explicitly state the equation being estimated, define the independent variables, or specify how those variables are measured.  Given the peculiar estimating model, I do not have much confidence in the results.)
 dp -Yes, inches is not a unit of force…  the meaning of the sentence, however, is obvious.  The paper does provide the equation being estimated and defines the variables.  The specification used in Table 6 is the same as Column (4) of Table 3.  Furthermore, .29 inches is statistically significant (the standard error is reported in the table for everyone to see).
ld – If there was an equation, Dr. Pope could have referenced it.  An “obvious” meaning of the sentence is missed par putts travel .29 inches further than missed birdie putts. Shotlink, however, only promises an accuracy of 1 inch when measuring ball position.  The accuracy diminishes if the player taps in before the operator can sight the position of the ball.  In that case, the laser operator is instructed to shoot the ground where the ball was.  Given the accuracy of Shotlink, a difference of .29 inches may be statistically significant but still physically insignificant.
Table 6 of the study presents estimates of the probability that the ball will be hit short of the hole.  Again, the units attached to the findings are not clear.  I will assume the author claims a birdie putt is 0.5 percent more likely to come up short than a par putt.  If this is the finding, I find the percentage difference negligible.
dp -.5% is correct and this is significant with 99.99% confidence (again, see the standard error in the table).
ld -Let’s assume this is true.  Out of 1,000 putts of each kind, a player will leave three more birdie putts short.  That may be statistically significant, but I do not see such a small difference as important in explaining human behavior.
An important finding of the study is buried in footnote 12.  The author writes “for putts less than 200 inches (16.7 feet), we find no differences.  Notably, the risk aversion account cannot explain the entire par-birdie accuracy gap.”  I assume this means birdie and par putts within this length are hit with the same force and come up short with equal probability.  Then what can explain the difference in the probability of making par and birdie putts shown in Figure 2?  Possibly GIR?  If the author’s theory cannot explain the “effect” in a region where most putts occur, the theory should be rejected.
dp -As any golfer should expect, no one is going to leave a 1 foot putt short of the hole.  The point of this footnote and the whole point of Columns 3 and 4 of Table 6 are to show that where you really start to see the difference between leaving a par and a birdie putt short are in the longer putts.  This is what one would expect!  For longer putts, Table 6 shows that birdie putts are 2.3% more likely on average to be left short of the hole (extremely statistically significant).
ld – I am not talking of 1-foot putts.  The draft is saying there is no effect on putts shorter than 16.7 feet.  The vast majority of par putts are from within this distance.  The author should have stratified his findings over narrower segments (e.g., 5-10 feet, 10-15 feet).  The sample size of each type of putt should also have been reported (e.g., How many par putts of over 270 inches were there?).   
Finally, the author argues for inclusion of the Accenture Match Play in Table 8 since “our effect occurs in all of the tournaments.”   The effect, however, in match play would not be measured in relationship to par, but a player’s standing with his opponent—a completely different loss function than stroke play.  But if the author wants to include all tournaments, he should also throw in winnings from the European Tour.
dp -Admittedly, this table could be done in a variety of ways.  Our numbers would be even larger if we included winnings from the European Tour.  It is just an attempt to give people who don’t understand what 1 stroke per tournament means a way of understanding the magnitude. 
ld – I don’t believe the author understands the difference between match and stroke play.
I do not mean to be too harsh on the study.  It is after all only in the draft stage.  My problem is with the N.Y. Times running with the results without fact checking.  I suspect the allure of having Tiger Woods’ name in the study’s title made it difficult to pass up.  This is not Jayson Blair type stuff, however.  The only damage would be you placing one more extraneous thought in Jim Furyk’s head while he putts.  Furyk should be able to overcome this, but I have taken him off my fantasy golf team just in case.

Laurence Dougharty

In 2011, the study was published in the American Economic Review.  This study will be critiqued in the next post.





[1] Pope, Devin and Maurice Schweitzer, “Is Tiger Woods Loss Averse?  Persistent Bias in the Face of Experience, Competition, and High Stakes,” June 2009, Unpublished.
[2]  I did not realize the PGA Tour recorded the order of putts.  It must have been very time consuming to label each putt by two dimensions (the number of preceding putts by Player A, and the number of preceding putts by all other players). (Note: After writing this letter I learned Shotlink records the time of each shot.  I assume it is up to the researcher to develop a program that would sort the shots by each group on each green by time.)
[3] See PGATOUR.COM, Tiger Woods, Career Details, 2008.

Monday, November 21, 2011

Golf Digest: Press Adjunct of the USGA


     One reason for the paucity of research on golf handicapping is the USGA's virtual monopoly of the issue.  This monopoly guarantees there will be no forum for competing ideas.  It is like a world in which Google, Apple, and Microsoft do not compete, and only Microsoft exists. The press could play a role in questioning this monopoly, but it has not.  The reasons are probably twofold. First, golf writers are not trained in the subject matter, and find it easiest to parrot the USGA line.  Second, golf publications have a symbiotic relationship with the USGA so there is little to be gained by challenging their host organism.  Let's go back in time to document one incident in the fawning relationship between Golf Digest and the USGA.
         Nothing demonstrates the insular mentality that resides at USGA headquarters better than a letter written to Golf Digest by Dean Knuth, Director of Handicapping. The letter demonstrates both the arrogance and ignorance by which the USGA rules the game.  It started when a reader wrote to Golf Digest (December 1996) challenging Knuth’s analysis:

In the October Digest (“The Gatescate Scandal”), Dean Knuth is quoted as saying the odds are 1 in 200 that you beat your handicap by three strokes, 1 in 570 that you beat it by five strokes, and 1 in 82,000 that you beat it by 10 strokes.  I suggest Dean check the batteries on his calculator.  If the odds are 1 in 200 that you beat your handicap by three strokes then Greg Norman would shoot a 66 once in every 200 rounds.  It would also mean that your garden variety, 18-handicapper would be a model of consistency, shooting within two strokes of his handicap more than 90 percent of the time.
            As far as 10 under goes, I’ll bet nearly everyone with a handicap of more than 15 has done it at least once and in a lot fewer rounds than 82,000.  I don’t have any idea what Bill Gates’ handicap is, but it’s pretty clear ol’ Dean is the wrong guy to calculate it.  He is statistically challenged.

            Golf Digest let Knuth reply directly below the reader’s letter--a privilege very few if any are afforded:

First off, Greg Norman had a USGA handicap index in 1995, his level of play equated to a plus 7.5.  That is to say, 7.5 strokes better than a scratch golfer.  The average USGA course rating is approximately 71, so Greg’s better-half scoring average would be 63.5.
            On tour, the courses are set up to a course rating of 76 on the more difficult stops.  There his best average would be 68.5.  To beat his handicap index by three strokes, he would have to shoot 60.5 on the average course (that we play), or 65.5 on the strong tour course.  (Note: The average tour player is a plus 3.5).
            With respect to the garden-variety 18-handicapper, he or she averages three strokes over his or her course handicap and plays to it 25 percent of the time.  Beating your handicap by three strokes or more – twice in tournaments – becomes such a rare event that Section 10-3 of the USGA handicap system automatically reduces the player’s USGA index.  Less than 1 percent of golfers are reduced under that procedure, so it’s an uncommon event except by the sandbaggers of the links.  However, it is true that the size of the handicap index does affect the probability of making a low net score.
            I have been called a number of things, but never before “statistically challenged.”  I scored a perfect 800 on the math section of my SAT test in high school, graduated from the Naval Academy and got a masters in system technology.  USGA statistics are based on a database of millions of scores and were worked out by our handicap research team of statisticians, mathematicians, and professors.

            Knuth argues that he is not “statistically challenged” because he scored 800 on his SAT and had a masters in system technology.  Arguing “credentials” is never an effective or reasonable method to quiet one’s critics.  This is especially true when those credentials consist of a 30-year-old test score and a degree that is the academic equivalent of making the cut at the B.C. Open.
            Knuth commits so many errors in his short letter, he makes the reader’s case for him.  Knuth wrote that beating your handicap by three strokes or more –twice in tournaments – automatically reduces a player’s USGA index.  Knuth’s assertion is not true.  A player who has only two tournament scores must average at least four strokes below his index before his index is reduced.  It is disconcerting that the USGA Director of Handicapping demonstrates a lack of understanding of the system he governs (See section 10.3 of The USGA Handicap System). 
            Knuth mistakenly wrote beat your “handicap” when he should have written beat your “index.”  The USGA procedure reduces a player’s index for performance that beats his index, not his handicap.  A player who beats his handicap twice by four strokes may or may not get a penalty reduction depending on the Slope Rating of the course.  This can be easily demonstrated.  The player’s score is:

            Score = Course Rating + (Handicap - 4)

The player’s index differential for this round is:

Index Differential = (Score - Course Rating) (113/Slope Rating)

Substituting,
                                         = (Handicap - 4) • (113/Slope Rating)
                                         = (Index • Slope Rating/113 - 4) • (113/Slope Rating)
                                         = Index - 4• (113/Slope Rating)

The penalty for exceptional tournament performance is determined in part by the difference between a player’s index and the index differential:

            Index - Index Differential = 4 • (113/Slope Rating)

If the Slope Rating were 155, the player would only beat his index by 2.9 strokes, and no penalty would be assessed under Section 10.3.  If the player beat his handicap twice by four strokes on a course with a low Slope Rating (e.g., Slope Rating = 80), the player would be penalized.
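The dependence on Slope Rating is easy to tabulate. The sketch below only evaluates the expression 4 • (113/Slope Rating) derived above. The function name is my own, and the handicap is treated as unrounded, as in the derivation.

```python
STANDARD_SLOPE = 113  # Slope Rating at which handicap and index coincide


def index_margin(strokes_under_handicap: float, slope_rating: float) -> float:
    """Strokes by which a round beats the player's index, given the strokes
    by which it beats his (unrounded) course handicap."""
    return strokes_under_handicap * STANDARD_SLOPE / slope_rating


# A round four strokes better than handicap on courses of varying Slope Rating.
for slope in (80, 113, 155):
    margin = index_margin(4, slope)
    print(f"Slope Rating {slope}: beats index by {margin:.2f} strokes")
```

The same four-stroke performance counts for less against the index as the Slope Rating rises, which is why the high-slope round escapes the penalty and the low-slope round does not.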
            Knuth mysteriously omits the effect of the Slope System in estimating Greg Norman’s scores.  It is mysterious since Knuth is the major proponent of the System.  Knuth argues that if Norman had an index of +7.5, his best average would be 68.5 on a tour course with a rating of 76.  Knuth’s assertion would be true if the Slope Rating of the course were 113.  More than likely, however, the Slope Rating would be at least 145.  In that case Norman’s best average would be 66.4 (i.e., Norman's unrounded handicap would be plus 9.6).  This illustrates a troubling paradox of the Slope System: the higher the Slope Rating, the lower the score Norman would be expected to shoot.
            Golf Digest was made aware of these errors, but never ran any correction.  Knuth also did not respond.  Knuth knew he did not have command of the facts or the theory.  If his opponent could not be cowed by his impressive (but not verified) high school test scores, Knuth knew he would lose the argument on the merits.
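The Norman figures can be reproduced the same way. This is a minimal sketch assuming the usual conversion of an index to an unrounded course handicap (index times Slope Rating divided by 113) and treating the plus index as strokes subtracted from the Course Rating. The function name is hypothetical.

```python
def best_average(course_rating: float, plus_index: float, slope_rating: float) -> float:
    """Expected better-half scoring average for a plus-index player:
    Course Rating minus the unrounded plus handicap for that Slope Rating."""
    plus_handicap = plus_index * slope_rating / 113
    return course_rating - plus_handicap


# Knuth's figure implicitly assumes a Slope Rating of 113.
print(f"Rating 76, Slope 113: {best_average(76, 7.5, 113):.1f}")
# A more likely tour setup with a Slope Rating of 145.
print(f"Rating 76, Slope 145: {best_average(76, 7.5, 145):.1f}")
```

Raising the Slope Rating from 113 to 145 increases the plus handicap from 7.5 to about 9.6 and so lowers the expected score.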
            Golf Digest as a major publication could be a countervailing force to the USGA.  Like the press in other settings, it could serve as a watchdog.  Golf Digest has consistently rejected that role believing its institutional health is better served by being compliant rather than critical.

Wednesday, October 12, 2011

The USGA Responds

The USGA was sent a paper critical of its handicap research in general and Appendix E of the Handicap System in particular (see The Reliability and Accuracy of USGA Handicap Research, posted 8/22/2011).  Mr. Hovde, the USGA’s Manager of Course Rating and Handicap Education replied:

We received your letter pertaining to Appendix E of the USGA Handicap System.  We are actually in the process of final edits to the Handicap manual for 2012, and we did make some changes to Appendix E including some re-labeling and correct terminology regarding odds/probability.  The table is not used in any part of the calculation of a player’s Handicap Index and is for informational purposes only.  It is not an exact table, as both the net differential row and Handicap Index columns are ranges and not down to the tenth, which is the value they would be calculated to.
I haven’t had the chance to fully study the other documents you submitted, but will do so and get back to you in the near future.  As many of the people involved in this research have moved on or passed away, some of the answers take a bit of digging to find.
Mr. Hovde maintains “the table is not used in any part of the calculation of a player’s Handicap Index.”  I suspect this statement may be wrong.  The table was probably used in formulating the reduction in index for exceptional tournament performance (Sec. 10-3) which is part of the handicap system. My suspicion is based on an example on the USGA website where the table is used to calculate how many strokes a player's handicap should be reduced for exceptional performance. (This brings up another point.  Appendix E shows a player with a high index is much more likely to get a reduction in index for exceptional tournament performance than a player with a low index.  How is that equitable?)

I am not concerned whether the table is "exact."  My concern is whether the tables in question are accurate.  There should be an "Appendix E" file documenting the research behind the table.  If the USGA has to run down ex-employees to answer questions, the state of handicap research at the USGA may be worse than I presumed.