Why I Changed My Mind About the Caster Semenya Case

September 18, 2020
(**This Range Report is a little different, a little longer, and co-published with Slate.**)
Last week, Caster Semenya lost her appeal of a rule by World Athletics (the governing body for international track and field) that established a testosterone limit for certain women in certain events. Semenya has a rare condition that elevates her natural testosterone level above that limit. She will not be allowed to defend her Olympic title in the women’s 800 meters unless she takes testosterone suppression medication, which she has said repeatedly she is not willing to do. The me who first wrote about Semenya in 2009, when she literally ran away from the field at the World Championships, would have felt last week’s ruling was the right one, albeit unfair to Semenya. Now, I feel differently: still torn, but more amenable to the argument that testosterone regulation should be off the table for athletes like her.
In my book, The Sports Gene, and in Sports Illustrated, I delved into the voluminous scientific rabbit holes of this debate. I won’t do that here. Instead, I’m going to share a few high-level concepts, and try to explain why I changed my mind. I want to start by describing two friends of mine, a pair of scientists on opposite sides of the Semenya legal battle. Joanna Harper is a transgender woman and medical physicist. She’s also an accomplished age-group distance runner. I first got in touch with Joanna for a Sports Illustrated article on transgender athletes, coauthored with Pablo S. Torre. When she began hormone therapy to suppress her body’s testosterone in 2004, she started collecting data. She was getting slower and weaker by the end of the first month. In 2003, before hormone therapy, Joanna ran a half-marathon in 1:23:11. In 2005, after a year of testosterone suppression, she ran the same half-marathon in 1:34:01. Relative to her age and gender category, those were practically identically good performances. In 2007, she won the women’s 50-54 age group at the U.S.A. Track and Field Club Cross Country Championships, but age and gender-graded performance standards continued to show that Joanna was similarly competitive against women after testosterone suppression as she had been before it against men. Joanna has compiled (and published) data from other distance runners who transitioned, and found the same pattern. Beyond the SI article, Joanna and I bonded over running and a shared affinity for Irish playwright Martin McDonagh. I cherish our friendship, and like no one I have ever seen, Joanna has an ability to engage in controversial conversations without pandering or making others feel attacked. She managed to insult Tucker Carlson on his own show without starting an argument, and left him with what appeared to be grudging respect. I wrote the foreword to Joanna’s authoritative and fascinating book, Sporting Gender. I could hardly admire her more. 
Joanna believes that, absent medical intervention, Caster Semenya should not be allowed to compete in the women's 800 meters. She testified for World Athletics in the Court of Arbitration for Sport, or CAS, hearings that upheld the testosterone regulation.
Ross Tucker (aka @scienceofsport on Twitter) is a South African sports scientist. The more controversial the topic, the more likely Ross is to be part of the discussion, usually highlighting an angle no one else has spotted. For years, he pointed to evidence suggesting that the carbon-fiber blades worn by double-amputee sprinter Oscar Pistorius gave him an unfair advantage in Olympic competition. When Ross began arguing this point, Pistorius was still an inspirational and beloved A-list celebrity in South Africa. (This was before Pistorius shot and killed his girlfriend in 2013.) You can imagine how kind the nation’s media and sports fans were to Ross, but he stood his ground. I’m a massive admirer of his, and have stayed at his home in South Africa, even when (perhaps especially when) I disagree with him. In the Court of Arbitration for Sport hearings, Ross was an expert witness for Athletics South Africa, which argued, along with Semenya’s team, to throw out the World Athletics testosterone rule.
Joanna and Ross both agree that:
- testosterone is the primary source of the male advantage in elite sports (the CAS decision upholding the testosterone rule was 2-1, but all three judges agreed that “testosterone is the primary driver of the sex difference in sports performance”)
- the lower limit of the typical male testosterone range is about four times higher than the high end of the typical female range
- testosterone levels are the best (that is, the least imperfect) marker for separating the men’s and women’s competitive classifications
Both Joanna and Ross believe that the female classification in sport is important to maintain and that there should be some criteria for entry into it that protect the integrity of women’s competition. Part of the challenge is that women like Semenya with differences of sex development (DSDs) sometimes have testosterone levels in the typical male range, or levels that fall between the typical male and female ranges.
(One common pro-Semenya argument is that Michael Phelps’ long arms gave him an advantage over other swimmers, but nobody talked about kicking him out of the Olympics on account of his oversized limbs. I find that unconvincing. We don’t separate swimmers by arm length. There are, though, separate competitive classifications for men and women in track and field. Given that those classifications exist, they need to have some meaning, and there needs to be some dividing line between them.)
When Joanna and Ross testified for opposite sides in the Semenya proceedings, it was regarding the research that World Athletics relied upon to create its particular testosterone rule. Ross felt that research was insufficient. Joanna supported World Athletics, and later wrote that “the verdict is congruent with my belief that both inclusion and meaningful sport for all women are important.”
It’s important to understand that the World Athletics rule doesn’t set a testosterone limit for all female competitors. It sets a limit specifically for women with certain DSDs: those who have XY chromosomes, internal testes and not ovaries, testosterone levels in the typical male range, and whose bodies can respond to that testosterone. It also doesn’t set a testosterone limit for these women in all events.
The rule only forces them to lower their testosterone levels if they want to compete against women in the 400, 800, and 1500 meters. Those are the events, World Athletics says, in which testosterone levels and performance, combined with other evidence—like the historical over-representation of women with DSDs in the 400, 800, and 1500—suggest the clearest testosterone-linked advantage. If Semenya opts to lower her testosterone, she can compete against women in those events. If she doesn’t, she can compete against women only in other events. But the World Athletics research also shows an advantage for women with elevated testosterone levels in the hammer throw and pole vault. Those events have not been regulated, at least not yet. Why? In a paper that Joanna co-authored, researchers wrote that World Athletics may not have included the hammer and pole vault because athletes with DSDs have not been significantly overrepresented on the Olympic medal stand as they have been in the other events. On the one hand, it is responsible scientific thinking to apply a rule only where the confluence of evidence is strongest. It also suggests that World Athletics realizes it is not drawing some bright line between men and women for all sports. On the other hand, it’s an awkward fit with the idea that testosterone is the primary driver of the male advantage in all events, and with the pretty consistent 10 percent performance gap between the fastest men and the fastest women at all distances. (In the paper, Joanna and her coauthors speculate on some of the potential reasons that women with DSDs may not have been as historically successful in the hammer throw and pole vault.) For most people, their genes, chromosomes, reproductive organs, and gender identity align into the categories typical for cisgender men or women. But that isn’t true in every case. Dividing athletes into male and female groups requires imposing a clear binary on a messy reality. 
There is no science that will change that. As Ross has said: “This is easily the most complex issue that sports has had to deal with, perhaps ever.” It’s so complex that Joanna and Ross, who are also friends, can agree on the social and scientific principles and yet disagree about the bar for evidence that leads to regulations. Over time, I’ve become more drawn to the argument that athletes, like Semenya, who have been raised as women and who have lived as women and who identify as women should be allowed to compete as women. But Joanna wouldn’t be Joanna if she didn’t push me to think very hard about why I feel that way. Joanna also pressed me to think about how the “ ‘born female, raised female’ argument,” as she refers to it in her book, would even work. We can’t know gender identity at birth, so in practice people are assigned female when they’re born without a penis. But external genitalia aren’t a good way to divide male and female athletes. Until the guidelines were changed in 2016, the International Olympic Committee said that transgender women who have penises should have them surgically removed to compete with cisgender women. Given that nobody thinks a penis is the source of the male advantage in sports, that made no sense, along with being inhumane. Christie Aschwanden, one of the best science writers in the world, a former elite cross-country skier, and another friend of mine, has argued that “recogniz[ing] Semenya under the gender identity that she has inhabited since birth” is sound policy. Any kind of testosterone testing, she argued, will harm not only women with high testosterone, “but also every other woman athlete who looks too ‘manly’ or otherwise does not conform to someone else’s notions of what a woman should be.” I can count on my hands the number of people whose opinions I value as much as Joanna’s and Christie’s. And I don’t think either one is necessarily wrong. 
Our society—and, as a consequence, our sports—hasn’t been set up to accommodate gender nonconforming individuals. That is not fair or just, and it puts us in the impossible position of selecting which kind of unfairness and injustice we believe is the least harmful to the fewest individuals. As Christie put it: This is a trolley problem, and World Athletics—and anyone who cares about Semenya and track and field—has to acknowledge that no matter what decision gets made, someone is going to get run over. I started running the 800 meters late in high school; within a few years my best time was faster than the women’s world record. It doesn’t seem fair for the most-accomplished, hardest-working women in the sport to compete against someone with my physiology for their livelihoods and podium spots. World Athletics has argued that, as it pertains to the 800, Semenya has all the physiological advantages of a male runner. (Her times in the event wouldn't be nearly world class in men's competition, but they would be good enough for her to compete on a men’s college team.) I think it’s good for athletes, sports fans, and society to have fair women’s competition at the elite level. The Court of Arbitration for Sport ruling acknowledged the harm of the testosterone rule to athletes like Semenya, but said that “such discrimination is a necessary, reasonable and proportionate means of achieving the legitimate objective of ensuring fair competition in female athletics in certain events and protecting the ‘protected class’ of female athletes in those events.” Semenya has been an exemplary champion, performing admirably on and off the track in the face of tremendous pressure and scrutiny. The initial handling of her case was abysmal and inhumane. After she won the Africa Junior Championships in 2009, Athletics South Africa—at the request of World Athletics—conducted sex-verification testing on Semenya; she was not given an explanation, and thought the tests would be for doping. 
When she became world champion in Berlin that same year, Semenya said that World Athletics took her to a local hospital where she “had no choice but to comply” with more testing. Leaks of medical information and tin-eared public remarks from World Athletics officials enmeshed Semenya in what she called “the most profound and humiliating experience of my life.” World Athletics’ approach has improved enormously in substance and in tone since then, but I still have lingering mistrust. I’m inclined to have the governing body proceed by the precautionary principle, and get involved in sex testing as little as possible. That is, given that World Athletics is the one driving the trolley, we should make sure that it’s moving along the tracks incredibly slowly and deliberately. My experience doing investigative reporting on medicine has also made me feel skittish about medical interventions for asymptomatic people, even if they’re generally safe. If Semenya wants to try to defend her Olympic title, she would have to undergo testosterone suppression for no medical reason. I think it’s reasonable to suspect that that could change her life in ways that she finds undesirable. In fact, she has said as much about a five-year period when, at the behest of World Athletics, she underwent testosterone suppression; she remained world class but got steadily slower. (Alternatively, she could compete in other events. Sprinter Aminatou Seyni, who hails from Niger, would have been a World Championships medal contender in the 400 this year without a testosterone rule; she opted not to lower her testosterone, and instead competed in the 200, where she made the semi-finals.) And now for one more layer of complexity. Joanna and Ross may soon be on different sides of another legal battle in sports, about transgender athletes in rugby. And this time, they’re on sides opposite to what you might guess based on their roles in the Semenya case. 
Ross collaborated on a report that concluded that allowing transgender women to play on women’s teams carries at least a “20 percent to 30 percent greater risk” of injury to women who never went through typical male puberty. Based on that report, World Rugby just proposed banning transgender women from women’s competition. Joanna once told me that, “for cardiovascular factors, trans women go from typical men to typical women after transition, but with strength, they go from typical men to somewhere in between typical men and typical women.” Still, she thinks the evidence for a ban is shoddy. Joanna and Ross—and I—all think that having guidelines in Olympic sports and the NCAA that require athletes transitioning from male to female to undergo testosterone suppression (if they want to compete in the women's category) is reasonable. In that scenario, the athlete is proactively seeking to switch her competitive classification, and I think it makes sense to have rules for managing that. But even there, I’m not perfectly consistent; I don’t think the same guidelines should apply to Olympians as to youth athletes. Joanna has floated the idea of allowing transgender girls who have not undergone testosterone suppression to compete against girls in high school track, but not at the state championship level. Last week, Joanna explained to me, “I think that in most—but not all—matters, we should use gender identity to divide people into male and female categories.” I used to be troubled by this feeling of inconsistency. I don’t love it now, but I’m more comfortable with it. I think Joanna and Ross must be comfortable with it too, which is why they can take positions in track and rugby that might look contradictory from the outside. They’re not actually contradictory at all. 
In its decision, CAS called the testosterone regulations a “living document”; experience may show the rules need to be altered for appropriate implementation, or new evidence may show that fewer or more events should be regulated. I’m sure that both Joanna and Ross would love to see new research, and that both would be open to changing their views if warranted. They’re just trying to follow the best available (necessarily imperfect) evidence in each individual case, while keeping humanity and the values of sports in mind.
Thank you for reading, until next time….
David
p.s. If you have a friend who might enjoy this free newsletter, please consider sharing. They can subscribe here.

Risk Estimation and the Pandemic, with Dr. Kelly Fradin

September 8, 2020
The last Range Report focused on the utility of “Fermi estimation” for calling bullshit on the news, or on your neighborhood Twitter know-it-all.
This report is part one of two in my back-to-school-in-a-pandemic special, and I want to start with some estimation that I would've found really interesting even if I weren't a new parent. Early in the pandemic, we learned that one of the few silver linings of this coronavirus is that children rarely develop a severe case of Covid-19. And then came periodic reports of kids who had a wildly overactive immune response to the virus (akin to Kawasaki disease) that was fatal or caused serious heart damage. Silver lining officially tarnished. Those early reports didn’t have much numerical context, so I went searching to see if anyone tracking the issue had done some Fermi estimation of the likelihood of a child having that disastrous immune response. Sure enough, Dr. Kelly Fradin, a New York pediatrician and author of the newly self-published Parenting in a Pandemic, had already done this weeks before I even saw (and had a chance to worry about) those news reports. Here’s the thinking she shared on Twitter:
Range Report Question: On average, people who develop symptoms start feeling them five days after exposure. Looking at this chart, it looks like nearly 40% of them would still test negative on day five, even while they’re spreading the virus and getting sicker! What course of action would you recommend for someone whose kid comes down with very suspicious symptoms, but tests negative, and isn’t sure when the first exposure occurred?
David
p.s. If you have a friend who might enjoy this free newsletter, please consider sharing. They can subscribe here.

Piano Tuners and the News in Beirut

August 11, 2020
At the time, I thought the questions were just ridiculous.
Every exam contained one problem that had nothing at all to do with first-year college chemistry. They came completely out of nowhere, like a piano falling from the sky. Or a piano tuner. "How many piano tuners are there in New York City?" That's the one I remember most vividly. My thought process went approximately like this: No fair! WTF is this?? I have no idea… 10,000?
I was very wrong with that answer. In retrospect, I was very wrong about the question, too. Far from ridiculous, that question and others like it led to one of the most useful lessons I have ever learned in a classroom. The questions were examples of so-called “Fermi problems.” Enrico Fermi, who created the first nuclear reactor beneath the University of Chicago football field, often made (and asked others to make) back-of-the-envelope estimates to help approach problems. Part of his lesson was that a certain mental strategy is often more important for attacking a problem than is detailed prior knowledge. The strategy entails breaking the problem down into many pieces and making estimates for each one; none of the estimates has to be particularly accurate for the end result to be sensible.
Here’s how I should have addressed the piano tuners:
- How many people do I think live in New York City? About 9 million.
- How many people are in the average household? Let’s say three. So 3 million households.
- What portion of households do I think have a piano? Not really sure...some Upper East Siders probably have two, but let’s say one in ten overall. So 300,000 pianos.
- How often is a piano tuned? I certainly don’t know, but sounds like an annual thing, or at least in that ballpark.
- How many pianos can one tuner service in a day? Hmmm...let’s say an hour per piano, plus transit and breaks and all that, so let’s say five per day.
- How many days does a piano tuner work? There are about 260 weekdays in a year, minus a few weeks of holidays, so say 240 workdays, which would mean that one piano tuner can tune about 1,200 pianos a year.
So 300,000 pianos divided by 1,200 pianos-per-tuner-per-year means we need 250 piano tuners to keep NYC in tune.
Now what’s the actual answer? I don’t know exactly (I just redid that in real time with a series of guesses, as I should have the first time around) but I’m pretty sure I got the right order of magnitude. According to WIRED, Google has used “How many piano tuners are there in Chicago?” as a question in job interviews. WIRED reported that the Yellow Pages lists 83 tuners in Chicago, with a few duplicates, so WIRED thinks about 60. I prefer to stick with 83, because New York City has about three times the population of Chicago, and 83 x 3 is 249, so that’s more favorable to my estimate. (Future Range Report topic: confirmation bias!)
I don’t know anything about piano tuning, but using the Fermi strategy I could easily have realized that my 10,000 answer was nonsensical. More importantly, understanding and practicing Fermi estimation ultimately became one of the most useful tools in my cognitive garage. I use Fermi estimation constantly to consider whether news articles and academic papers even make sense on their face before I bother to dive deeper; some of my articles and parts of my books were born when Fermi estimation helped me realize that certain scientific papers or popular news articles made no sense and couldn’t possibly be accurate. In chapter two of Range, I talk a bit about Fermi estimation, and mention that it’s one of the topics in the University of Washington course INFO 198/BIOL 106B, or, by its more popular name: “Calling Bullshit.” The instructors use a news case study to demonstrate “how Fermi estimation can cut through bullshit like a hot knife through butter.”
Last week, I saw an important example of Fermi estimation in the wake of the horrendous explosion in Beirut.
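(A quick aside for readers who like to see the arithmetic spelled out: the piano-tuner estimate above can be sketched in a few lines of Python. Every default value is one of the guesses from the walk-through; the function name and parameter names are made up purely for illustration.)

```python
def estimate_piano_tuners(
    population: int = 9_000_000,          # guess: people in New York City
    people_per_household: int = 3,        # guess: average household size
    households_per_piano: int = 10,       # guess: one household in ten has a piano
    tunings_per_piano_per_year: int = 1,  # guess: tuned about once a year
    pianos_per_tuner_per_day: int = 5,    # guess: five tunings in a working day
    workdays_per_year: int = 240,         # guess: 260 weekdays minus holidays
) -> float:
    """Chain the rough guesses together into one back-of-the-envelope answer."""
    households = population / people_per_household
    pianos = households / households_per_piano
    tunings_needed = pianos * tunings_per_piano_per_year
    tunings_per_tuner = pianos_per_tuner_per_day * workdays_per_year
    return tunings_needed / tunings_per_tuner


# The guesses from the walk-through.
print(estimate_piano_tuners())  # 250.0

# Halve one guess (one piano per twenty households) and the answer
# still lands in the hundreds, nowhere near 10,000.
print(estimate_piano_tuners(households_per_piano=20))  # 125.0
```

Changing any single guess by a factor of two moves the answer by the same factor, which is why the result can be trusted to the right order of magnitude even though no individual input is precise.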
I first learned of the explosion on Twitter when my timeline filled up with authoritative descriptions alleging various explosive materials and photographs with red circles around some dot in the sky that the Twitter user claimed was a missile or fighter plane. Some tweeters were arguing about the recent political motivations of several countries, in an attempt to demonstrate that it was most likely an attack. And then I came across this tweet below:
- At the first atomic bomb test, Enrico Fermi dropped pieces of paper “before, during, and after the passage of the blast wave,” and used the distance the paper traveled to estimate the explosion strength.
- Speaking of the course "Calling Bullshit," the instructors wrote a book of the same title that came out last week, and I'm loving it so far.
- A randomized trial found that scientific articles that get tweeted end up with a lot more academic citations. (Tip for my scientist readers: tag the journal's account when you tweet, but if you start your tweet with the journal's account name, put a "." before it or else only people who follow both you and the journal will see it.)
- The pandemic has disrupted the illicit drug trade and in turn led to a surge of overdoses. (For background, I wrote about that disruption in a previous Range Report.)
- Maria Konnikova is one of my favorite writers. Her new book, The Biggest Bluff, details her one-year journey from poker novice to pro. Maria has a psychology PhD, and the book is really her examination of the impact on decision making of risk and luck and skill and focus and pressure and practice and cognitive bias. It also describes a game created by poker pros called “Lodden Thinks.” The game is sort of a mashup of Fermi problems and psychological profiling. Competitors pick a question, and then they have to guess what some other person would answer. (Poker pro Johnny Lodden was the first other person, hence the name.) Getting the right answer is irrelevant; you just have to figure out what that other person would say.
- An epidemiologist who sounded the pandemic alarm early just noted: “We’re drinking from a fire hose right now in terms of new [scientific] information…there’s a downside to that, because with that comes an increasing amount of marginal if not potentially erroneous information.” The last Range Report, on blood type and Covid-19, was about exactly that.
Thanks so much for reading. Until next time….
David
p.s. If you like this post, please consider sharing it. (Here's the link.) And if you have a friend who might enjoy this free newsletter, they can subscribe here.

Type A Blood and Covid: Danger! …Wait, Never Mind

July 28, 2020
In early June, my family had a distressing dinner-table conversation about medical research that was making headlines.
The study, eventually published in the venerable (and retraction-prone) New England Journal of Medicine, found that type A blood was associated with more severe Covid-19. Specifically, it found that patients with type A blood had a 50 percent increased likelihood of needing oxygen or a ventilator. That’s not good; early research in this pandemic suggested that one-third to one-half of patients who end up on ventilators die. My entire family is type A, including me. (Blood, not personality. As a writer, my personality is naturally type O. ...Oof, bad pun.) Obviously, this was unwelcome news. And it was the second study that had found a bad association with type A blood, although no one knew exactly why. “That is haunting me, quite frankly,” is how the New York Times quoted a German molecular geneticist who co-authored the NEJM study. Scary stuff. Doctors popped up in articles and on YouTube theorizing (usually cautiously) why people with type A blood might have a different immune response to the coronavirus. The explanations were what my friend Mike Joyner of the Mayo Clinic calls “bioplausible.” That is, they are entirely logical, but also probably wrong. While I wasn’t excited to hear the results of the blood type study, thanks to lessons I learned while reporting my first book, The Sports Gene, I was very skeptical. My guess was that subsequent studies would either find a much smaller influence of blood type, or none at all. In the two years I spent going through research on genetics and physiology, I came across a lot of studies that associated some physical trait with blood type. This, I learned, is how most of that body of research was created: a lab would be studying the genetic contribution to some physical characteristic, let’s say height, just for example; the lab collected blood from all subjects; as long as the researchers had blood, they decided they might as well get blood type data. 
Later on, when they analyzed all their data, they noticed a correlation between height and a certain blood type, and so they published it. It wasn’t the study they set out to do, but it’s an easy way to get another publication. Fine, nothing wrong with that in and of itself. Except eventually I learned that a lot of labs were doing that because it’s so easy to do, and those that didn’t find an association just didn’t publish it. So all the positive findings got published, and few of the negative findings (i.e. those that found nothing) got published. This is what scientists know as “publication bias,” or, colloquially, “the file drawer problem,” so-called because studies that find no relationship end up stuffed in a file drawer, never to see the light of publication. In the topics I was probing for The Sports Gene, I saw this pattern several times: a study finds a strong association of some physical trait to blood type, then another study does too; then a few studies start to trickle in that show a much weaker association; then come the studies that show no association at all. Ultimately, the conclusion is that the early studies were false positives, and only scientists getting false positive results were initially publishing. (As psychologist Drew Bailey taught me, this “decline effect” — the gradual drop in a reported effect over time as more studies are published — is an area of study unto itself.) The good thing is that science often worked the way it should, eventually correcting the record. It just took a while. Amid the breakneck pace of coronavirus research and news, that’s kind of a problem, even when “a while” is measured in weeks. Six weeks after my dinner-table conversation, a new round of studies found that blood type has little or nothing to do with Covid-19 severity. Unfortunately and unsurprisingly, the new findings received less attention. (But props to the New York Times for following up its initial story. 
In my opinion, when this happens, the follow-up article should be linked at the top of the original story, so that anyone who sees the first piece also sees the corrective.) Here’s the moral of the story for this moment in time: tons of data on Covid-19 and patient characteristics is piling up all over the world, and scientists will be looking for and sharing all sorts of correlations. Many of those will be false positives, the result of statistical randomness. If the correlations are dramatic, they’ll grab headlines. Other researchers will (hopefully) subsequently try to replicate those findings, and often fail. The negative results will be less likely to get published. When they are published, they’ll be less likely to garner expansive news coverage. My advice: if a particular Covid finding — say, the supposed curative effect of hydroxychloroquine — grabs your attention, first treat it like a hypothesis, not a rock-solid conclusion. To use a phrase from chapter 11 of Range, treat it like a “hunch held lightly.” Then set up a news alert so you have a better chance of noticing if the original study is contradicted. And keep in mind that the initial positive results are likely to be the most dramatic that are ever found, which is why they were published in the first place. Finally, this lesson applies to all research, but I think it's especially worrisome in drug trials. A recent examination of 105 clinical trials of certain antidepressants showed that 53 of the trials found the drugs to be effective, and 52 of the trials found them to be ineffective. But while 52 of the 53 positive trials were published, only 25 of the negative trials were published. So the body of published research is badly distorted compared to the actual scientific findings. Even an extremely conscientious doctor — one who pores over that entire medical literature — may well conclude that the drugs are more effective than they really are. 
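To put a number on that distortion, here is a minimal sketch using only the trial counts quoted above. The helper name is hypothetical, and the calculation is just a ratio, not anything taken from the examination itself.

```python
def share_positive(positive: int, negative: int) -> float:
    """Fraction of a set of trials that reported a positive result."""
    return positive / (positive + negative)


# All 105 trials that were run: 53 positive, 52 negative.
conducted = share_positive(positive=53, negative=52)

# The trials that were published: 52 positive, 25 negative.
published = share_positive(positive=52, negative=25)

print(f"Positive share of all trials run:   {conducted:.1%}")  # 50.5%
print(f"Positive share of published trials: {published:.1%}")  # 67.5%
```

Under those counts, roughly half of the trials actually run were positive, but about two-thirds of the trials a doctor could read were.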
BONUS IN-THE-WEEDS TIP: The Funnel Plot (You already read the main point, so feel free to skip!)
There’s a really neat visualization called a “funnel plot” that helps demonstrate the publication bias issue I just described. Below is a funnel plot from a study that analyzed other studies of whether probiotics prevent gastrointestinal disease. Every dot in the chart represents a single study of probiotics and GI disease. On the x-axis is a measure of whether probiotics increase or decrease GI disease risk. A negative number means probiotics decrease risk, so that’s what you want. The y-axis is a measure of how reliable a study is (basically how large the study is); the higher up the y-axis a datapoint is, the more reliable that particular study. Ok, take a look:
[Figure: funnel plot of studies of probiotics and GI disease]
David
p.s. The last Range Report, a remembrance of the so-called "father of the 10,000-hours rule," evoked many more responses than I expected. In case you missed that one, here it is.
p.p.s. If you have a friend who might enjoy this free newsletter, please consider sharing. They can subscribe here.

Remembering the “Father of the 10,000-hours rule”…(p.s. he hated that title)

July 2, 2020
On March 25th, psychologist Anders Ericsson and I were both supposed to attend Angela Duckworth’s class at Penn, where she would start the day by discussing her famous grit research. I critiqued certain extrapolations of grit research in chapter six of Range, and the idea was that Anders and I would share thoughts on the relative importance of things like grit, deliberate practice, early specialization, and talent. And by “share,” I mean probably debate. Anders did not believe in talent and was a proponent of the idea that a head start in narrowly focused practice was the ultimate advantage; I have argued for the importance of sampling, exploring different talents, and not specializing too early. When the pandemic intensified, we had to cancel our trips to Penn. It didn’t seem like a big deal. Anders and I had had feisty exchanges before, and I figured we had more to come. So the news of his passing last month came as a total shock. I was crestfallen, which might seem strange given that our relationship was based on public disagreement. My first book attracted attention partly because it criticized Ericsson’s work that led to the “10,000-hours rule,” particularly the argument that there is no such thing as talent. My second book attracted attention partly because it criticized the push for early hyperspecialization as the path to excellence in any endeavor, an argument in a raft of best selling books that focused on Ericsson’s work. And yet, Anders not only enriched the field of expertise research, he enriched my life. Before I explain how, I want to note that Ericsson hated the moniker “10,000-hours rule,” even though it made his work world famous. He hated it so much that he wrote an open letter about it in 2012 and posted it on his faculty web page at Florida State. The title speaks for itself: “The Danger of Delegating Education to Journalists.” Yikes. He expressed frustration at the idea of 10,000 hours as some magical threshold. 
The number came from a 1993 paper that Ericsson co-authored. It featured 30 violinists at a music academy, the 10 best of whom had accumulated 10,000 hours of “deliberate practice” on average by the age of 20, and were deemed by their instructors to have international soloist potential. The next 10 musicians were deemed potential pros as part of a symphony, and the bottom 10 were categorized as “music teachers.” In a nutshell, the paper concluded that the superior musicians had spent more time in what the researchers characterized as deliberate practice: solitary, cognitively engaged, effortful practice focused on error correction that “is not inherently enjoyable.” The paper — which it’s safe to say is the most influential modern paper on skill development — was already a big deal when Malcolm Gladwell’s mega-smash Outliers made it world famous as the “10,000-hours rule,” the “magic number of greatness.” (You know an academic paper has reached crossover-star status when it inspires a Macklemore tribute. Which, I have to say, is catchy and motivating during interval sessions.) In his “Danger of Delegating” letter, Ericsson wrote: “10,000 hours was the average of the best group; indeed most of the best musicians had accumulated substantially fewer hours at age 20. Our paper found that the attained level of expert music performance of students at an international level music academy showed a positive correlation with the number of solitary practice hours accumulated in their careers and the gradual improvement due to goal-directed deliberate practice. In contrast, Gladwell does not even mention the concept of deliberate practice.”
The fact that Gladwell doesn’t mention deliberate practice is a nitpick that probably wouldn’t have made a difference to non-scientist readers anyway. That said, the concept of deliberate practice is a huge contribution that Ericsson made to the study of skill development. 
Even though the definition of deliberate practice seemed to morph at times, Ericsson put an all-practice-is-not-created-equal stake in the ground.
Just mindlessly swatting balls at the driving range, for example, isn’t as useful as watching where a shot goes, noticing that it sliced, adjusting the club head and trying again to see if you can improve. To use the terminology of psychologist Robin Hogarth, which I featured in Range, I think part of what Ericsson was advocating was making a learning environment as “kind” as possible by looking for feedback after every attempt that informs the next step. He often said that, ideally, you should have a world class coach telling you what to do after each attempt at something. Speed typing offers a fun example, I think, of Ericsson’s larger, conceptual principle. It turns out that if you just type a lot without thinking about it, you’ll get better, until you settle at a plateau in the 50-80 words-per-minute range. When I spoke with court reporters (they use different keyboards) who competed in speed contests, I learned that they too hit a plateau. To get off of it they’ll set a metronome a little faster than they can currently type and then keep up with it no matter how many mistakes they initially make in practice. Soon, they make fewer mistakes and get a bit faster and move the metronome again. After a year of tiny improvements, they’re way faster. Rather than settling at a comfortable plateau, they find a way to make practice effortful again. What I took from Ericsson was the need to find ways to engage in effortful activity when you reach a plateau, rather than assuming you’ve topped out. Ericsson’s focus on the type of practice was a huge contribution. He set a research agenda that will bear fruit for years to come. Still, as psychologist and creativity expert Scott Barry Kaufman (who, like me, had affection for Ericsson) pointed out, there was a huge caveat to the deliberate practice framework that was typically downplayed or left out entirely. 
Ericsson acknowledged in Peak, Kaufman wrote, that “the techniques of deliberate practice are most applicable to ‘highly developed fields’ such as chess, sports, and musical performance in which the rules of the domain are well established and passed on from generation to generation.” In Peak, Ericsson and his coauthor explain, for the first time that I had seen, that certain areas simply “don’t qualify” for the deliberate practice framework, including “many of the jobs in today’s workplace– business manager, teacher, electrician, engineer, consultant, and so on.” Kaufman added that it leaves out “almost any creative domain.”
~~~~~
My most frequently recurrent disagreement with Ericsson was over what his work actually proved. Embedded in Ericsson’s framework was something known as the “monotonic benefits assumption,” which in this context means that there should be a perfect correspondence between the number of hours of deliberate practice someone has put in and their skill level. So if two people who started from scratch each put in a thousand hours of deliberate practice, they should be at the exact same performance level. But in skill acquisition research, this turns out never to be the case except in simple tasks that never change. People tend to progress at different rates even given the same practice. And there is evidence of that even in Ericsson’s famous paper. Think back to the “Danger of Delegating” letter. Ericsson’s own criticism of the “10,000-hours rule” explains that most of the top musicians in his study had not accumulated 10,000 hours. But the original paper did not include raw data or any measure of variance, so it was impossible to tell how much they actually accumulated. It only gave an average, which by its very nature obscures individual differences, so how was anyone to know how much the musicians had actually practiced? To give chess as an example, a study found that it takes 11,053 hours on average to reach international master status in chess, but one player made it in 3,000 hours, and another needed 23,000 hours. So you can have an “11,053-hours rule,” but it doesn’t tell you much about the reality of human skill acquisition. Giving only an average tells a misleading story. In 2012, the same year he wrote the “Danger of Delegating” letter, I invited Ericsson to join a panel on talent and skill development that I co-organized at the American College of Sports Medicine annual meeting. He graciously accepted. That day, he was asked to explain how much variation there was around that 10,000 hours average for the top musicians in the famous study. 
He answered: “Well, what I would say is that obviously when you’re only collecting data on 10 individuals, and also it’s shown that when we had them keep [practice] diaries, and also do some of the retrospective estimates several times, that there’s no perfect agreement.” That is, the musicians were inconsistent in their accounts of how much they had practiced. The person asking the question, physiologist Tim Lightfoot, replied: “Many of us in this field have dealt with the validity of recall surveys as well, but I don’t ever recall a publication in any of our journals leaving out standard deviation because they said it wasn’t valid. And so I guess I’m still kind of curious as to what is, even if it is flawed, what was the variation that you saw around that 10,000 hours?” Ericsson replied that “it was certainly more than 500 hours.” It later emerged that it was much more. Not only that, but some of the merely good musicians had actually logged more deliberate practice than some of the great ones, so the idea of perfect correspondence between deliberate practice and performance level did not hold. After that ACSM panel, despite the contentious back and forth, Ericsson joined all the panelists for a dinner where the discussion continued. It was energetic, fascinating, and civil. The day was just about my fantasy of productive discourse: earnest disagreement in a formal setting followed by earnest disagreement coupled with humor and food and promises to follow up with data in an informal setting. I went home with a massive reading list. As he did that day, Ericsson put practice at the center of research on skill development, and inspired an enormous amount of subsequent work. He always argued passionately, and he argued clearly so that one knew what they were up against. I don’t once remember in either our public or private exchanges seeing him resort to ad hominem attacks. 
Sometimes he frustrated me by ignoring contradictory work; other times I eagerly used principles from his work to improve myself, especially my memory. I valued both of those sides, and made use of them. Once, when I got stuck in the writing of Range, I took an online beginner’s fiction writing course. I suddenly felt completely incompetent, and it was a mini-revelation for me. It made me realize that I was using way too many quotes in my book manuscript. I went back and replaced tons of quotes with more clear narration, and the book is better for it. Ericsson wouldn’t have counted a fiction class as deliberate practice because it wasn’t tightly focused on my nonfiction goal, but, to me, it adhered in a way to the conceptual spirit of his work; I was seeking discomfort in order to get off a performance plateau. I wish I could argue over that experience with Anders. I liked knowing that we’d inevitably cross paths now and again. I hope he would have smirked to know that my idea of honoring him was to do what I just did above — continue grappling with our generative disagreements. Rest in peace, Anders Ericsson.
Thanks so much for reading the Range Report. If you have a friend who might enjoy this free newsletter, please consider sharing. They can subscribe here. Until next time….
David

Nerdiness and Fitness in the Time of COVID

June 16, 2020
The coronavirus has all of us spending more time at home than usual. So I thought this would be a perfect moment to get tips from Steve Kamb, the founder of Nerdfitness.com, which has helped fitness beginners all over the world start on their quests toward better health from the comfortable confines of their homes.
Since he started the site in 2009, Steve has been passionate about showing that nerdiness and fitness need not be in zero-sum competition, and can instead be synergistic. The articles and videos on Nerdfitness.com are as delightful for their Ghostbusters references and Star Wars-themed workouts as for their pointers on the perfect bodyweight squat (which is easier than you think). Steve is an absolutely lovely guy, and shares some great (and practical) thoughts for homebound fitness in the Q&A below. My personal favorite: “exercise snacking.”
David Epstein: Can you first share a bit about the impetus for starting Nerd Fitness?
Steve Kamb: I was a very active kid growing up, playing sports and building tree forts and having neighborhood-wide water balloon fights, but I also spent my nights and weekends devouring science fiction, like the Redwall series and Lord of the Rings, and playing Super Nintendo. Despite working out five days a week through all of high school and college, I never shook my identity as a “scrawny, weak kid.” After college I finally discovered how important nutrition is with regards to building strength and improving one’s health. Thanks to some small adjustments — like learning not to hate, and even to like, vegetables — I made more progress in six weeks than I had in the previous six years. So I started absorbing everything I could about strength training and nutrition. It was right around this time I grew disillusioned with my entry level sales job (which I was terrible at), and I stumbled across Tim Ferriss’s 4 Hour Workweek. I read the book in two days, and fell in love with the idea of creating a tiny business by combining two seemingly unconnected ideas and owning it. Those two ideas hit me just a day later: nerd culture and fitness novices. So I googled “Nerd” and “fitness,” and not a single thing popped up. 
I quickly bought the domain, got certified as a trainer, and eventually started writing simple blog posts that encouraged nerds not to feel embarrassed about asking beginner questions or learning the basics about exercise and nutrition.
DE: One reason I wanted to talk to you now is that people are spending a lot of time inside, without exercise equipment, and they’ve lost the normal structure of their day. Can you share a few basic tips regarding the obstacles people face when they’re thinking about starting to do home workouts?
SK: I’ve been writing about how to get in shape at home since day one, as many beginners prefer to start their fitness journey in the comfort of their homes rather than feeling self-conscious in a gym. Now that we’re in the middle of the actual apocalypse, everybody has to quickly adapt to, “How the heck do I get in shape without equipment?!” Here are the three most common challenges people face in the “home gym” setting:
- You’re on your own, with no trainer, class, or machines with helpful diagrams.
- Your “gym” is also your living room, basement or backyard; your kids might be bugging you, and you might get distracted by the TV or your phone.
- It’s harder to see progress when you aren’t picking up heavier dumbbells, and along with that it’s harder to have accountability.
- Count any exercise as exercise. Bodyweight training or gymnastics, roughhousing with your children, jumping rope, or even just moving around doing household stuff, known as “non-exercise activity thermogenesis,” or NEAT.
- Establishing a routine is crucial for behavioral change. Specifically, putting workouts in your calendar, having an accountability partner to text or check in with, changing into workout clothes to signal to your brain (and your family) that this is exercise time. If possible, having a part of your house with any equipment you use already laid out and in plain view is huge. Do whatever you can to minimize the steps between you and your new routine.
- If you’re finding your homelife way messier right now, consider “exercise snacking.” For example, what if you had to do a pull-up (or hang) every time you walked under your door frame pull-up bar in the living room? Or you have to do a quick circuit of push-ups and squats in between every 22-minute episode of “Schitt’s Creek” on Netflix. I find doing small movements throughout the day keeps you aligned with the goal of thriving during the apocalypse.
This is why we say: “We don’t care where you came from, only where you’re going.” Whatever brought you to Nerd Fitness, great!
We use each person’s goal as a north star to help build a daily routine. And over time, the mental shift from aesthetic based goals (“Are we there yet?”) to performance based goals (“What am I capable of now?”) is where the magic happens.
So if you want to run a marathon, let’s focus on building the daily habit of running and fueling your body for long runs. If you want to do a handstand, here’s a plan you can follow each day and how to eat to build muscle.
I’d say in 95% of our success stories, the most common thread is: “I started this journey to lose weight, and I don’t know how the heck it happened, but now I actually look forward to exercising.”
DE: What has been your most popular post or program ever, and has a particular post or program seen a surge of interest during social distancing?
SK: The most consistently popular article is my Beginner Bodyweight Workout. I filmed the workout back in 2009 when I started the site. The best part? My shorts are on backwards in the video, [fact check: true ✓] which I didn’t notice until 5 years later. Since the pandemic began, the most popular video, unsurprisingly, has been “7 At Home Workouts.”
DE: Another new obstacle for a lot of people is that not only do they have to work out at home, but they have kids around at the same time. Just, how? Drop some knowledge on me...
SK: Parents, congratulations! You are playing Apocalypse Simulator 2020 on Hard Mode. Very early on when I started Nerd Fitness, I recognized that I was just one guy on a fitness journey, and that other people had other challenges. So I’ve always tried to include expertise from people in different life situations so that everybody reading the site had somebody they could relate to. Now we have 15 full-time coaches at Nerd Fitness who are available to train online clients, and many of the coaches are either married with kids at home right now, or they’re single parents. It’s why we put together this resource, “How to exercise with kids at home.” Not only how to get them involved, but even how to have them serve as giggling and wiggling weights too!
DE: You’ve inspired a lot of people who didn’t have an exercise history to get started. So can you leave us with a few inspiring words?
SK: We’re all just figuring this stuff out as we go, right? There’s no playbook on how you should feel or react when quarantined during a global pandemic.
So don’t compare your health baseline to what you were like pre-pandemic. That is gone. Done. Your new baseline is today. Tomorrow, see if you can find a way to sneak an extra vegetable onto your plate. Knock out a set of 10 push-ups right now on the side of your desk. Go up and down your stairs one extra time. “Better than yesterday” starts to get pretty meaningful when you can do it consistently.
DE: Steve, thanks so much for taking time for me, and I know if people want to learn more about your journey from construction-equipment salesman to writer/personal trainer/extremely fit nerd, they can read about it in Level Up Your Life. The book is hilarious. It starts out with a description of growing up in a town called Sandwich, before moving on to how Steve's boss at his sales job put a GPS device on his vehicle and could see when Steve started late. Steve nonetheless managed to spend a lot of time on the road reading Harry Potter. And then he had a panic attack and decided it was time for a change.

LIGHTNING ROUND
- The people I wrote about in chapter 10 of Range have been making “eerily accurate predictions about COVID-19.” Or, as one scientist put it: “How does he keep doing this???”
- “What Protests Can (and Can’t) Do.” This FiveThirtyEight story takes a look at what political science can tell us about protests, and includes an interesting graphic of Google search terms that are on the rise.
- “Are Americans As Stupid As We Seem on Twitter?” (h/t @nuclearbdgr)
- Even the news version of a new chicken-or-egg scientific publication is pretty technical, but here’s the upshot: both DNA and RNA probably coexisted before life on Earth.
- Travel watch: the number of passengers going through TSA has doubled in the past month, but is still down 80% from a year ago.
George Floyd Before George Floyd: David Cornelius Smith

June 4, 2020
When I called my brother this week, he wanted to talk about a black man in Minneapolis who died after police officers pinned him down with their knees, for several minutes, until he stopped breathing. The man’s name: David Cornelius Smith; he died in 2010.
A decade ago, the city of Minneapolis was grappling with an event eerily similar to the recent death of George Floyd. I had never heard of David Cornelius Smith, and you probably haven’t either. But my younger, taller brother, Daniel Epstein, is a 34-year-old attorney with a long-running interest in criminal (and civil) justice reform. (He recently used his life’s savings to run for Supreme Court of Illinois on a platform to reform judicial rules; in Illinois, Supreme Court justices wield significant power by setting rules for everything from bail procedures to, well, their own conflicts of interest.)
In telling me about Smith’s death, Daniel taught me a few things that I’ve been thinking about since, so I invited him to do a Q&A. Below is an edited version of our conversation. First, a quick intro to the death of David Cornelius Smith:
Smith was a 28-year-old black man who was mentally ill and experiencing homelessness. On September 9, 2010, he entered a sixth-floor basketball court at the YMCA in downtown Minneapolis, where he behaved strangely, alarming patrons and staff. Staff members called the police, who tried to talk Smith into leaving. Smith refused. Video shows that officers tased him several times. (One officer tells the other that he has never used his taser before, and asks if he should.) The officers wrestled Smith to the ground and knelt on his back, an act known as “prone restraint.” After four minutes, they realized he wasn’t breathing. The officers attempted to resuscitate Smith and called paramedics. The paramedics restarted Smith’s heart, but he was comatose and died a week later.
David: One difference between Smith’s and Floyd’s deaths is that the officers don't appear to have been criminally charged in the first case. But something else you said about the aftermath of Smith’s death struck me — about the civil suit his family brought and how it was resolved. Can you describe what happened?
Daniel: Sure. The city of Minneapolis settled in 2013 for $3 million, and agreed to a requirement for more training on proper restraint techniques. But the officers themselves didn’t pay anything out of their own pockets, so there was no direct punishment. As an aside, Derek Chauvin, the officer who knelt on George Floyd’s neck, is a 19-year veteran, so I’d be interested to know if he got the extra training about prone restraint. And, if so, why didn’t it work?
David: That makes me think of a book I read recently, The End of Policing. The author, sociologist Alex S. Vitale — coordinator of the Policing and Social Justice Project at Brooklyn College — writes that “researchers have found no impact” of extra training. (And, by the way, the Minneapolis PD has spent millions on everything from implicit bias training to body cameras.) With regard to increasing diversity (which the Minneapolis PD also did), Vitale writes: “There is now a large body of evidence measuring whether the race of individual officers affects their use of force. Most studies show no effect.” That book really stuck with me. When I was the overnight crime reporter at the New York Daily News, I assumed that better training and hiring more officers of color would alleviate a lot of the bad behavior. But the research that Vitale cited suggests that my assumptions were wrong.
Daniel: Yeah, I think that's an argument for taking a closer look at the rules and incentives that influence police and policing.
David: Right. 
I’m reminded of the concept of “wicked learning environments,” in which individuals or organizations repeatedly fail to improve with experience because there is no system that forces them to recognize and learn from mistakes. Do you think the justice system, with respect to police misconduct, is a wicked learning environment?
Daniel: I think so. It seems like what one legal scholar calls a “recurring miss.” The system is designed in a way that allows bad behavior to go unpunished, so it repeats. The officers who killed Smith by putting sustained weight on him while in a “prone restraint” not only didn’t face criminal charges, they themselves didn’t have to pay the victim’s family.
David: From your attorney perspective, can you offer any potential solutions?
Daniel: There are a bunch, but one of the more clever accountability solutions I’ve seen proposed is to require officers to carry malpractice insurance. The idea is: in the same way that insurance promotes safer cars and safer driving through premium adjustments, it would promote safer policing. Officers with disciplinary records (Derek Chauvin had received two letters of reprimand) would be costly to insure, maybe prohibitively so. There is actually some research that has shown that, with systems like this, insurance companies put pressure on police departments to reduce the use of force and to fire bad cops.
David: That reminds me of one more quote from The End of Policing, that “use of force is highly concentrated in a small group of officers who tend to be male, young, and working in high-crime areas.” So maybe steep financial repercussions could help weed out repeat offenders. Of course, they would have to repeatedly offend in the first place, but maybe it would catch some of them before anything truly devastating happens. But financial penalties aside for a second, what about criminal penalties?
Daniel: Law enforcement officers rarely face criminal charges for use of force, and when they do, they usually win. 
Prosecutors have to prove that an officer acted unreasonably when viewed “from the perspective of a reasonable officer on the scene, and [that] calculus must embody an allowance for the fact that police officers are often forced to make split-second decisions about the amount of force necessary in a particular situation.” Here’s how that plays out in practice: officers testify that they feared for their lives when they saw the person they shot make a suspicious movement, and juries tend to give a lot of leeway to officers making those split-second decisions. That’s usually enough for acquittal. But these prone restraint cases have a unique feature: there is no split-second decision. The two officers who dealt with David Cornelius Smith pressed down on him for four minutes. They don’t appear to have been charged. If they had been, I wonder if it would have stuck in officer Chauvin’s mind and whether he would have been more likely to remove his knee from George Floyd’s neck.
David: Ok, so police are rarely criminally charged, and when they are, the burden of proof is extremely high. What about civil charges against individual officers, as opposed to against the city?
Daniel: Cops don’t have to worry about that. For the most part, they enjoy what's known as “qualified immunity,” which means they’re so thoroughly protected from civil lawsuits that judges and juries don’t even get the chance to decide whether they violated someone’s constitutional rights, whether by using excessive force or in some other way.
David: I’ve heard the term qualified immunity for years, but don’t know the origins or level of protection. So what exactly is it, and why does it exist?
Daniel: You can thank the Supreme Court for qualified immunity. You won't find it in the Constitution. The Court established the defense in 1967, and expanded it 15 years later. 
So if you want to sue an officer today for violating your constitutional rights, you have to show that your right was “clearly established,” which in practice means that you pretty much have to find an old case that’s almost identical to yours, and in which the victim won. Turns out that’s a tall order that leads to some bizarre results. For example, a recent lawsuit alleged that Fresno cops who executed a search warrant on a suspected illegal gambling operation pocketed $225,000 that should have gone into evidence. The 9th Circuit Court of Appeals held that the officers could not be taken to trial because they enjoyed qualified immunity and it wasn’t “clearly established” whether cops stealing property that was seized under a warrant violates the Constitution.
David: That seems very “turtles all the way down” — you can’t prove that your constitutional rights were violated unless someone else proved it in the same situation, which would have required that person to find someone else who proved it in the same situation…and so on. So did that stop David Cornelius Smith’s family members from bringing a civil suit against individual officers?
Daniel: It certainly could have. It might sound strange, but if they couldn't find a past case that ruled that killing via prone restraint while in custody violates the Constitution, they would be in a tough spot.
David: So let’s say that was indeed the case. Will that same reasoning possibly apply in the death of George Floyd?
Daniel: Officer Chauvin and the other officers could conceivably get qualified immunity because Floyd’s family can’t point to the very similar case of David Cornelius Smith as one where constitutional rights were found to have been violated. Basically, it’s really hard to deter police misconduct by hitting cops in the pocketbook. But remember, qualified immunity only applies to civil suits, and in Floyd’s case the officers have been charged criminally. 
But officers have protections that you and I don’t have, and qualified immunity isn’t the only one.

David: Do tell…

Daniel: If you and I were arrested right now on suspicion that we shot someone, the cops would take us to separate rooms and interrogate us immediately. They wouldn’t tell us what they actually know, and they might lie to us about what they know, which is allowed in a lot of jurisdictions. But now let’s say that you and I are police officers and we're being investigated for disciplinary purposes after we shot someone: totally different procedure. In many cities, because of collective bargaining agreements negotiated by a police union, there’s a buffer period of a day or a few days during which time officers David and Daniel cannot be interrogated. When the buffer period is over, you and I will learn the allegations against us. When we finally get interrogated, we can be in the same room to respond. In some cities, we’d get to see video of the incident before making statements. In others, we could make statements, see the video, and then change our statements. So basically the process often appears to prioritize officer preparation and coordination rather than fact finding. Again, wicked learning environment: the officers are somewhat insulated from many of the repercussions that would teach them and their colleagues a lesson, so to speak. If we don’t change these paradigms, we shouldn’t expect a change in outcomes.

David: Dan, thanks very much for bringing this to my attention, and sharing a few important lessons.

And thank you for reading. Until next time….

David

p.s. You can find previous Range Reports here, and if you have a friend who might enjoy this newsletter, please consider sharing.
They can subscribe here.

Antibody Tests: When 95% Accurate Isn’t Enough

WHEN 95% ACCURATE IS REALLY INACCURATE
Last week, a primary care provider offered to order me an antibody test that could determine if I've had COVID-19. I had a suspiciously timed dry cough in late February, so I've been looking forward to ruling COVID-19 in or out. Thus, my initial reflex was to get excited that I could finally get a test. Before agreeing, though, I asked the doctor about test accuracy. The result of that conversation was that I decided not to get the test. Allow me to explain. The doctor told me that the test has greater than 90% accuracy, by which he meant that the rate of false positives is less than 10%. That sounds great. But it immediately reminded me of a study I described in the last chapter of Range, in which a question was posed to physicians and med students. Here it is: If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs? The doctors and med students were also told to assume that the test had perfect "sensitivity," i.e. that it detected all true positives. The most common answer the doctors and med students gave was that the patient has a 95% chance of actually having the disease. The correct answer is that there is only about a 2% chance that the patient actually has the disease, or 1.96% to be exact. Say you're testing 10,000 people. Because the disease prevalence is 1 in 1,000, 10 people in the sample have the disease. The test has perfect sensitivity, so all 10 of those people get a true positive result. But remember that the false positive rate is 5%, so 5 out of every 100 people tested will get a false positive. In total, 500 people out of 10,000 tested will get a false positive. So the chance of a patient who tests positive actually having the disease is 10/510, or 1.96%. Only a quarter of the physicians and physicians-in-training who were given this quiz answered it correctly. 
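For anyone who wants to check that arithmetic, here is a minimal Python sketch of the same calculation. The function name `positive_predictive_value` and its parameter names are my own labels, not anything from the study, and the default of perfect sensitivity simply mirrors the quiz's stated assumption.

```python
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """Chance that a person with a positive result actually has the disease.

    All names here are illustrative. sensitivity=1.0 mirrors the quiz's
    assumption that the test catches every true case.
    """
    # Share of everyone tested who is sick AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of everyone tested who is healthy AND tests positive anyway.
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# The quiz: prevalence of 1 in 1,000, false positive rate of 5%.
quiz = positive_predictive_value(prevalence=1 / 1000, false_positive_rate=0.05)
print(f"{quiz:.2%}")  # prints 1.96%
```

The sketch applies the 5% false positive rate only to the 9,990 healthy people rather than to all 10,000, so it counts 499.5 false positives instead of a round 500. The answer still rounds to 1.96%.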
Obviously, those who missed it didn't do so for lack of intelligence. It's just that this kind of thinking, the idea that the diagnostic value of the test to any individual is predicated on the base rate of the disease in the tested population, is deeply counterintuitive. That's exactly why I think the concept should be very explicitly hammered home in medical education (or, better yet, all education), given that healthcare providers use diagnostic tests constantly.

Now let's say a coronavirus antibody test has only a 1% false positive rate (or 99% "specificity," in medical lingo). If 5% of the tested population was infected, a positive antibody test means you probably had the disease. And yet, even with perfect sensitivity, there's still about a 16% chance that you got a false positive. Do that test on 50 million people, and roughly 475,000 of them will wrongly assume they were infected. If we're counting on antibody tests to tell people if they have some level of immunity, that could be a problem.

At some level, this actually can become intuitive. For instance, pretend that only 1 in a billion people has Hogwarts disease; so just 7 people on Earth have Hogwarts.** (Hog warts?) If your doctor told you that you tested positive for Hogwarts, and that the diagnostic test had only a 1% false positive rate, would you assume you have Hogwarts? Probably not. After all, if we tested everyone on Earth, 70,000,007 people would get positives, and only 7 would be true.

**If you test positive for Hogwarts, it's definitely a false positive. Hogwarts is not a real disease.

LIGHTNING ROUND
- A corollary to the issue above is that it's important to test a representative sample of the population in order to establish the prevalence of infection. On this page, the FDA estimates the predictive value of antibody tests that it has authorized, but adds in bold letters: "We do not currently know the prevalence of SARS-CoV-2 antibody positive individuals in the U.S. population." Therefore, the FDA notes, an individual should be very cautious about making decisions based on the results of a single antibody test.
- If you want to think about conditional probability in maybe the simplest possible example, check out this short video. The question: Given that a couple has two kids, and you know that one is a boy, what's the probability that the couple has two boys? (Hint: it's not 50%.) The second part asks the same question, except now you know specifically that the older child is a boy. Does it seem like that knowledge should matter? Well, it does.
- For a few more conditional probability examples (including a tidbit of insight into search engines) check out the "Bayes' Theorem" page of Math Is Fun.
- For a deeper dive, check out chapter 3 of Judea Pearl's Book of Why. It's written for a wide audience, and includes only basic math. (The book is fascinating. It has some useful diagrams, so I don't recommend the audio version.) Even if you gloss over the math, you'll still come away with important concepts. The book includes the medical-test problem (via mammograms), and chapter 3 delves into the question: "How much evidence would it take to convince us that something we consider improbable has actually happened?"
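If you'd rather count than watch, the two-kids puzzle from the list above is small enough to enumerate by hand or in a few lines of Python. This is only a sketch under the usual textbook assumption that each child is independently equally likely to be a boy or a girl; the helper name `prob_two_boys` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Every (older, younger) combination; assumed equally likely.
families = list(product("BG", repeat=2))  # BB, BG, GB, GG


def prob_two_boys(condition):
    """Probability of two boys among the families that satisfy `condition`."""
    matching = [family for family in families if condition(family)]
    both_boys = [family for family in matching if family == ("B", "B")]
    return Fraction(len(both_boys), len(matching))


# "One is a boy," read as "at least one of the two is a boy."
at_least_one_boy = prob_two_boys(lambda family: "B" in family)
# "The older child is a boy."
older_is_boy = prob_two_boys(lambda family: family[0] == "B")

print(at_least_one_boy)  # prints 1/3
print(older_is_boy)      # prints 1/2
```

Knowing only that at least one child is a boy leaves three possible families (BB, BG, GB), while knowing the older child is a boy leaves just two (BB, BG). That is why the extra detail changes the answer.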
People Are Moving, But Not Enough for Cartels

PEOPLE ARE MOVING AGAIN, SLOWLY
When I last shared the link to the number of passengers going through TSA, daily volume was consistently about 4% of the previous year's. That figure has started to rise, and now it's hovering around 8%. That might sound small, but it's a 100% increase in daily TSA passengers in just three weeks. Will we ever return to 2.5 million passengers per day? Warren Buffett, at least, thinks that we won't see those travel numbers again in the next few years, which is why Berkshire Hathaway sold all of its shares in airlines. (Buffett's company held around 10% of American, Delta, United, and Southwest.) Data on the number of Apple Maps requests for directions also suggest the beginnings of a slow return to mobility. I found the Miami data particularly interesting. Much has been made of the fact that Florida did not implement blanket shutdowns, and yet thus far doesn't seem to have suffered for it. (Wall St. Journal: "Smart or Lucky? How Florida Dodged the Worst of Coronavirus.") According to the Apple Maps data, requests for walking and driving directions in Miami were down about 70% by late March. That's very similar to the data from New York City. A reasonable hypothesis: citizens in large urban areas stayed put whether or not the government mandated it. That said, search a big city today and you'll almost certainly see an upward trend, but only for walking and driving directions; public transit requests have remained 70-80% below baseline in Boston, Chicago, DC, New York City, and San Francisco, cities with some of the largest public transit systems in the country. The future of public transit will be an interesting issue to follow. Last week, for the first time in its 115-year history, New York City's subway system underwent a planned overnight shutdown, during which every single car was disinfected. I was in Manhattan on 9/11, and recall that the subway only completely stopped for about two hours.
For a sense of the gravity of last week's shutdown: daily NYC subway ridership is normally more than double the number of daily airline passengers in the entire country. And contrary to popular Twitter opinion, this does not mean that the subway was cleaned for the very first time and Andrew Carnegie's fingerprints were finally wiped off the subway poles. Subway cars go out of service for cleaning every single night; it's just not usually all of them at once. Plus, as if Carnegie took the subway. The man had a five-electric-car garage.

DRUG-TRADE DISRUPTION
Before I did some cartel reporting at ProPublica, I didn't know anything about the drug trade. I certainly had no idea that most of the drugs that supply U.S. customers come through legal points of entry. Maybe the image in my head was a little too Hollywood — drug-loaded airplanes landing on makeshift runways in remote deserts.** The reality is that most imported drugs come right through border checkpoints. Here's how one cartel lieutenant described a methamphetamine smuggling tactic to me: the gas tank of a car is removed and replaced with a smaller tank that is surrounded by a larger, outer container; meth is liquified and placed in the outer container; gas goes in the inner, smaller gas tank; if a border agent uses a gas tank scope (basically a cord with a mirror lens on the end) to look in the tank, they see gas, and if a dog sniffs the tank, it smells gas. Should that fail and the smuggler get caught, c'est la vie, eight of the next ten cars will get through. That's just one method I remember from my education about how drug traffickers would much rather use existing transportation and commerce systems than create their own. So it's not terribly surprising that COVID-19 transit restrictions have seriously disrupted the drug trade, according to a United Nations report. Heroin tends to move by land, cocaine by sea; synthetic drugs like meth use a mix, but with more air trafficking (often just carried in luggage) than other types of drugs. With international land and air traffic way down, drug smugglers are having a difficult time moving heroin and meth between countries. Ok, so, just for kicks, take a moment to guess what's happening to drug prices.... Countries where drugs are produced are reporting lower prices; producers have more supply than can be moved, so they're selling cheaply to traffickers. 
Meanwhile, countries that comprise the major demand markets for heroin and meth are reporting increased prices; difficulty trafficking has led to a shortage for retailers, so the drugs that do arrive carry a higher price tag. The U.N. analysis also notes that countries are reporting a large-scale switch of drug users from heroin to synthetic opioids, because of both price and availability. That's precisely opposed to the plan of cartels in recent years, which has been to capitalize on the growing opioid addiction problem (particularly in the United States) by increasing the supply of heroin and selling it at better prices than legal painkillers. (Another tidbit learned in reporting.) My main, high-level takeaway from cartel reporting was that the drug trade is part of a very complex economic system, and that altering one aspect, often via narrowly targeted law enforcement, will frequently backfire if nobody is keeping an eye on the overall system. In a concerning conclusion, the U.N. report notes that drug trafficking is way down right now, but the world may soon face a situation in which producers have massive stockpiles of drugs they want to move, along with hordes of impoverished people who desperately need work. As Robin Bell, a brilliant scientist at the Lamont-Doherty Earth Observatory (where once upon a time I worked in a lab), recently told the New York Times: "You wouldn't want a doctor who just worried about one part of you; you want somebody to look at your entire system." She was talking about the climate system, but the lesson holds for the drug economy, too.

---------------------
**That does happen, sometimes. The Arellano-Felix Organization once landed a commercial jet filled with 10 tons of cocaine on a makeshift runway in the desert near La Paz, Mexico. Unfortunately for the AFO, the plane got stuck in the sand. They unloaded it and then tried to blow it up, which didn't work at all and is definitely an OSHA violation. So they brought construction equipment and hacked it up and tried to bury it. Also didn't really work, and drew the attention of the Mexican military. Click on the image of said plane below to check out the ProPublica story.
LIGHTNING ROUND
- Speaking of opioids, here's an unusual side effect of the pandemic: "Since March, federal officials have arguably done more to reform addiction medicine in the U.S. than they had in the two decades prior."
- My former ProPublica cubicle neighbor, Megan Rose, was part of a team that just won a Pulitzer for an incredible investigation into a series of deadly naval accidents in the Pacific. (Links to every installment here.) Whether you're into military performance, or some other type of work performance, this investigation into poor training and recovery practices should resonate. (When I traded my nice office at Sports Illustrated — with a big ole window 32 floors above 6th Ave. — for a crowded cubicle at ProPublica, it was to learn new skills by sitting near the Megans of the world.)
- More on recovery practices: elite athletes may have superior "sleepability."
- Cory Doctorow's wise conclusion on rules for writers: they're never actually rules, but rather a guide to things that are really hard to do well.
- The College Board arranged this special virtual AP History lesson with Hamilton creator Lin-Manuel Miranda.
- Author Dave Eggers captured the confusion about COVID-19 public information in a satirical Q&A.
From Nike Corporate Strategist to Award-Winning Novelist
