3blue1brown

patreon


Medical test probabilities draft

Hey everyone,

Here is a draft for a video about medical test probabilities that I plan to publish soon. Let me know if you catch any errors or have other feedback that can be incorporated.

I've talked a bit about plans I had for this topic in previous posts, and this video is one I sort of split out from an earlier draft.  The target here is a viewer who has either not come across Bayes rule yet, or who has a little but not enough to have become familiar with how it can be framed using odds.  It seemed worth trying to pair this with the question of why medical test statistics can be prone to the kinds of fallacies they are.

I still have some open questions about this particular presentation.  For example, when you start applying statistics to models that have some kind of continuous parameter to them, rather than just being a binary "you have it or you don't" kind of question, there is a distinction people draw between what Bayes factors are and what likelihood ratios are.  Is that worth going into at all?

If you have any thoughts, either on this draft or what you would like to see in any follow-ups, please do feel free to share.

-Grant

Edit: I've swapped the video linked above for the final one now.  Thank you for the helpful comments, I rearranged/shortened the intro, swapped around the order of some other parts, and added a note for more info at the end.


Comments

Have you thought about doing a video on AlphaZero? Potentially explaining first Tree Search, then MCTS, and finally how these can be improved by a neural network?

Love this video. The prevalence of breast cancer among women in the US is 1 in 12, so I'm not sure I would change much about that presentation. I like the concrete example of the guy who asked the doctors what they would tell the patient. A small improvement would be to choose either TPR/TNR and FPR/FNR for the discussion or to go with Sensitivity/Specificity for the duration. I think this would help a little in the overall flow and equations, but this is a very minor comment on a fantastic video. Congrats.

Huh. I read about this problem back in my youth, out of curiosity, and yet somehow I can't remember having read about the odds approach. The odds calculation feels so much better to the physical chemist in me now (back before I switched to IT). I loved this brilliant surprise^^

L4m3ness

I like the video a lot. It might be helpful, given the current anxiety, to comment that 10/1000 women who have breast cancer differs from 10/1000 people who have COVID. Which is infectious and changing, underscoring the utility of quick recalculation given a change in prevalence.
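The "quick recalculation given a change in prevalence" mentioned here can be sketched in a few lines. This is only an illustration: the sensitivity and false positive rate below are assumed round numbers, not figures taken from the video, and `posterior_probability` is a hypothetical helper name.

```python
def posterior_probability(prevalence, bayes_factor):
    """Update a prior probability (the prevalence) using the odds form of Bayes' rule."""
    prior_odds = prevalence / (1 - prevalence)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# Assumed, illustrative test characteristics (not from the video).
SENSITIVITY = 0.90
FALSE_POSITIVE_RATE = 0.09
BAYES_FACTOR = SENSITIVITY / FALSE_POSITIVE_RATE  # about 10

# The Bayes factor stays fixed; only the prior changes with prevalence.
for prevalence in (0.001, 0.01, 0.10):
    p = posterior_probability(prevalence, BAYES_FACTOR)
    print(f"prevalence {prevalence:.1%} -> P(disease | positive) = {p:.1%}")
```

The same factor gives very different posteriors as the prevalence moves, which is the point the comment raises about a changing infection rate.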

I really like this video. I think it provides a quick and concrete example of Bayes' formula and how to apply it quickly and intuitively. A couple of things that I think could be improved: when FPR and FNR are first presented, they are demonstrated concretely, but I don't remember seeing the formula for them. I think presenting that visually (it doesn't need to be addressed verbally) would be helpful so that they are more recognizable when used in formulas later. Overall, great video. I hope this helps people better understand statistics, and especially the kind of things where misunderstanding could hurt us all.

James Cooper

I actually think this comment is 100% spot on. I think maybe I've felt some nebulous pull this year towards content which it feels like I "should" make, but that actually has a habit of becoming a source of writer's block, and results in videos that I feel only okay about, much more so than ones with more niche topics. It's something I've reflected on more recently, and I think through next year I'll plunge much more into math topics which I personally feel excited about, without getting caught up on the idea of math-as-PSA.

3blue1brown

Oh, whoops! Not intentional..

3blue1brown

Thanks for the details Tom! I've seen "Bayes factor" used several places even for the simpler context of binary-hypothesis odds-updating, would you say it is incorrect to use the term for anything other than a case with free parameters? The distinction you bring up is what I was referencing in the post as the added detail for how this looks as you extend to continuous cases. I decided to add an on-screen note referencing that there is a distinction, with added links in the description for those curious to learn more, do you have any favorite references on the matter?

3blue1brown

It seems like you're struggling with the balance between making the paradox confusing and making it easy to understand what's going on. I'm wondering if it works better (i.e. seems more paradoxical) if you start by describing the specificity and sensitivity and then show what happens when it's applied to a population with low prevalence, instead of the other way around. Then the stage is set for a "how can a test which is more than 90% accurate only give 10% confidence?" kind of line. It's a little pedantic, but since the crux of the paradox is about what "test accuracy" means (in a colloquial sense), I would rather have something like "test accuracy is usually measured with two numbers" than "test accuracy is two numbers" like the video has at around 3:00.

Nate

4:05 is it intentional to say "What does really this mean"? I may not be up on my memes or maybe this construction is more common than it sounds to me. Or maybe just a word order issue.

This video was fantastic! Before watching this I was looking for an exact Bayes rule simplification just like this. Is the idea to use “odds” based on some pre-existing material that you’re aware of? Anyway, I have a couple thoughts. 1) In an earlier comment you asked what the quickest way to get math with odds to “click” might be. For me personally it clicked when I considered that a probability of 1/X corresponds to an “odds fraction” of 1/(X-1), or an “odds fraction” of 1/Y corresponds to a probability of 1/(Y+1). You imply that “fractional” relationship with the graphic at 12:08, but aren’t saying it out loud. I get that you want to use “:” to signify odds as separate from probabilities, but maybe when you’re actively doing the math, you can morph “:” into a differently-colored “/” or something so people intuitively know to treat odds as fractions during plug-n-chug. In short, I think people generally understand math formulas better with “/” than with “:”. 2) The audio transition at 16:42-16:46 sounds kinda ominous out of nowhere, not sure if it’s just in my head, or if that’s an accident, or if you intentionally wanted to embed viewers with “ugh, conventional Bayes rule YUCK 🤮”

Andrew Alvarez

This one is a REALLY nice complete lesson on Bayes. I think it will be extremely instructive for students. Small presentation comments: 1. Volume blows out at 7:11. 2. I found the formulas (and then the "maybe this isn't the best way" discussion about them @1:00) kind of distracting from the point. I would suggest you lean into the formulas without the disclaimer. Or use the "quick calculation" motivation instead of (or in addition to) the formulas themselves. Again - really nice!

Gabe

On the common factor being independent from the priors, I'd swap steps 1 and 2 in the "Bayes Rule the Snazzy Way" sequence, so that, given that you FIRST compute the factor once in new step 1, you can swap in different new steps 2... Seems more intuitive

Great video! A very minor visual correction: At 12:30, a "P" on the right-hand side of the screen is visible before the fade-in begins at 12:32

Those technical quibbles aside, I'll be providing a pointer to this video to my students when I next teach BDA. Well done.

Tom Loredo

It's beyond the regime addressed in this video, but FWIW: We tend to talk about odds ratios and posterior odds = prior odds X Bayes factor when summarizing inference for *discrete* alternatives only. When there are continuous alternatives, the odds formulation isn't very useful, because typically you'll want to do things like integrate over regions of the continuous parameter space (e.g., to find a 90% credible region for an uncertain parameter). Probabilities sum (the law of total probability!); the odds ratio does not.

Tom Loredo

I think starting with the quiz and right away stating that there is an easy and intuitive way to arrive at the right answer is a very good idea.

Daniel Armesto

I teach Bayesian data analysis at Cornell (where I have a joint appointment in astronomy and statistics). What you call the Bayes factor here most statisticians would just call a likelihood ratio. The term "Bayes factor" is usually reserved for problems where one or both of the hypotheses that are compared have free parameters. In that case, the Bayes factor is found by taking the ratio of *average* likelihood functions for the two hypotheses (averaging WRT the priors for the parameters; the averaged likelihood is called the marginal likelihood). This is in contrast to conventional "frequentist" statistics, where the most popular procedure is to look at the ratio of *maximum* likelihoods ("maximum likelihood ratio hypothesis testing"). If you've been watching the recent controversy over p-values, you may have noticed that Bayes factors are often suggested as an alternative, but they are considered controversial. No one disagrees that likelihood ratios are what's relevant with simple hypotheses. The controversy arises because for compound hypotheses, Bayesian inference identifies the parameter-averaged likelihoods (not maximized likelihoods) as what's relevant for inference, but such averaging doesn't appear in a natural way in frequentist statistics. Reserving "Bayes factor" for compound hypothesis settings helps us keep track of whether one is really looking at things from a Bayesian perspective, where parameter averaging is demanded. Technically you could use "Bayes factor" in the simple hypothesis case, where averaging the likelihood over *no* free parameters means you just use the simple likelihood. But that doesn't distinguish what approach one is using. Incidentally, your P(+) is doing a similar kind of marginalization. But in this setting, frequentists and Bayesians would agree that the prior (over the two compared hypotheses) is objectively available, and both would use it for that calculation.
Basically, Bayes factors and (maximum) likelihood ratios are different, but happen to agree for this simple class of problems.

Tom Loredo
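For readers who want the distinction above in symbols, here is one way it might be written down. The notation is an assumption introduced for illustration ($D$ for the data, $\theta_i$ for the free parameters of hypothesis $H_i$, $\pi$ for the prior over those parameters, $B_{12}$ and $\Lambda$ for the two ratios); it is not taken from the video or the comment.

```latex
% Simple hypotheses H_1, H_2 (no free parameters): the likelihood ratio
\mathrm{LR} = \frac{P(D \mid H_1)}{P(D \mid H_2)}

% Compound hypotheses: the marginal (prior-averaged) likelihood
% and the Bayes factor built from it
P(D \mid H_i) = \int P(D \mid \theta_i, H_i)\, \pi(\theta_i \mid H_i)\, d\theta_i,
\qquad
B_{12} = \frac{P(D \mid H_1)}{P(D \mid H_2)}

% Frequentist counterpart: the ratio of maximized likelihoods
\Lambda = \frac{\max_{\theta_1} P(D \mid \theta_1, H_1)}
               {\max_{\theta_2} P(D \mid \theta_2, H_2)}
```

With no free parameters there is nothing to average or maximize over, so $B_{12}$ and $\Lambda$ both reduce to $\mathrm{LR}$, which matches the remark that the two agree for this simple class of problems.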

It is clear that the matter covered in this video has very clear and immediate medical, social and psychological ramifications. Still, I think that this channel excels as a maths divulgation channel, and should stick to that. There are many others who produce excellent content in all those areas that could build on the idea of this video for those medical, social or psychological purposes, but I think that at one point it is better to pass the baton to them.

Daniel Armesto

At 7:10 there's a sudden, brief jump in volume.

Tom Loredo

I thought the explanation was fantastic, and I disagree with other viewers who suggested changing the beginning.

Great presentation. I wouldn't change a thing. Well, actually, maybe the idea of starting with a quiz for the viewer (the same quiz given to the medical doctors), showing how the intuitive response is not the right one and stating right away that there is an easy and intuitive way to get the right answer is a good idea. I realize intuitively I used to think of Bayes rule in this way (prior belief x update factor = updated belief. By the way, I like the term "update factor"; it's self-explanatory), but had never actually formalized it. You have done it perfectly. This video should be mandatory in all medical and journalism schools. In a more general sense, it is a great example of how formulas are not dry recipes, but can embody wisdom. I have never been able to memorize Bayes' standard formula, but your alternative presentation reflects the Bayesian rules so clearly (prior x update factor = posterior), it is so intuitive, it just sticks.

Daniel Armesto

I'll try restructuring the intro slightly, but it might be a bit much to completely reorient it.

3blue1brown

Hi Phil, Thank you for the thoughtful remarks, and I'm sorry it missed the mark. I like the points you're bringing up, particularly 2 and 3, although I can't help but feel they would be most suitable as the backbone of a new video entirely. I've thought a lot about very many ideas that might be worth exploring concerning medical tests, very rarely with the result of a conclusion I felt comfortable centering a video on, but this video is something I pulled out from the many attempted scripts on adjacent topics because I regularly felt myself wanting to quickly reference a Bayes factor/likelihood ratio, but not wanting to divert into too long a tangent explaining it. I'll keep what you say in mind as I comb through those scripts. It's interesting to hear that it comes off like the target is people in a position to change test reporting; the actual intent is for people to feel like they know how to interpret their own results, or more generally how to think about evidence updating belief. Maybe that could be made more clear...

3blue1brown

The audio is a bit hot/distorted at 7:10. Excellent content, as usual. I’d love continuous parameter info, but maybe as a separate video to minimize confusion.

Jeremy

Placebo^-1 (operative phrase: "they were told that")

I was thinking of using Bayes in a personal situation. My son was diagnosed with Covid last week with a swab test, as part of a program for returning university students. He has to quarantine for 10 days. Same with a friend of his. They were then told not to retest after 10 days, since they will likely get a positive again. Known as a long positive, because the test is sensitive. They both thought they had covid in October (lost smell and taste for 10 days). But a spit test done in Thanksgiving was negative (not so sensitive). But the swab is extra sensitive and may detect covid from up to four months ago. So, he did an antibody test. A long and short test, meaning it can tell if a recent infection or months old. He was positive for antibodies from likely a few months ago. But his friend was negative for antibodies, yet both had loss of smell around same time. So, using Bayes what is the chance either is non-contagious or should they isolate. I was confident he was safe based on his tests alone, but when his friend had different antibody test, I was not so sure. Is my inner-Bayes intuition correct?

While I love the didactic approach, THIS video has a PSA purpose too. As such, I'd go with the BLUF (bottom line up front) method: Start with the end in mind in the first 20 seconds or so of the video; you want to explain the simplified factor method FIRST, then explain as a paradox that most people don't understand it. You SAY in the intro you wrote about this video that you assume the audience might not have been exposed to Bayes. Ok, then TELL THEM FIRST HOW EASY IT IS TO USE THE FACTOR. You want viewers of this video to be evangelizers within their own communities and echo chambers in particular (parler.com anybody?;-), about how to really make fast accurate decisions about COVID. This is one of those cases where a good rule of thumb upfront may make people curious about how everybody else "doesn't get it right" since "it was so obvious for them when they heard you say this"... Just my 2 cents

I really appreciate the high quality of your videos. So, in the interests of maintaining that, I thought that I should share with you that I felt that this particular video was a bit out-of-family. What I mean by this is that most of your videos take a complicated concept and make it seem enormously more intuitive. In this case I felt that the original idea was simpler and that the video was at risk of over-complicating it. Also, I think the video is currently targeting too small of an audience - that is, people who may be convinced by your arguments to change the way that test results are reported and represented. But your video should target the wider audience of viewers who really can't do much to change that. Some specific suggestions: 1) Early on, could you explain the "P(+ | )" notation that you used later in your formulas? 2) Could you pose and explore the question: "Ultimately, will taking this test do me more harm than good?" I recall that there is a test called an "amniocentesis" where this math is very relevant. Taking the test came with some (small) risks, so, depending on the age of the pregnant woman, it may or may not make sense to take the test. Of course, this is a controversial topic, so it may be best to steer clear. However, I suspect that there are similar examples for other diseases, such as heart disease. The first test may provide some guidance, but perhaps in many cases it will cause people to undergo expensive and risky invasive procedures as a next step. 3) The following question is really "So if I decided to take the test, how do I decide what to do after I receive the test results? Should I then take an even more dangerous test or undergo a potentially dangerous procedure?" I think that your video could and should explore these broader topics more. To carve out time, I suggest spending less time working through various examples involving probabilities, odds, and the application of the Bayes factor. Hope this helps!
As always, love your videos!

BTW, I do like the "trick" of using odds to overcome wrong intuition. However, we cannot blindly apply it to "update prior probabilities" that come from a set not representing the whole population.

Edith Dubiner

The Bayes factor works for the whole population, but might not be applicable on specific subsets of the population. For example: if an individual has prior probability of being sick, because they are 70 years old (or because they already took the test and got a positive result), multiplying by the general population factor could be wrong. Being in the subgroup and the test-result may be dependent.

Edith Dubiner

So I like the material, but I'm finding the presentation a bit hard to follow, especially at the beginning. Starting from showing the equations, and then the example, and only then the paradox, feels like it's burying the lede and forcing people to understand the problem. It might be clearer to follow the path Gigerenzer did: first pose the question, then after the viewer has gotten it wrong, show the populations and the surprising result, then introduce the concepts of PPV and NPV and go into "what are you designing your test to answer." From there that leads into how you relate the things that are easy to measure (specificity etc) to the things you actually care about. That would let Bayes' Law show up as a part of the explanation, rather than as the header for the whole thing, and I think that would open this up to a much wider range of viewers. (Agreeing with what William Smith said in his comment above)

Yonatan Zunger

Hmm, that's a good point to be aware of. What do you think the quickest way to make it "click" is?

3blue1brown

In an earlier draft I was thinking of making it all about covid tests, but ultimately realized that's a complete minefield of uncertainty, and also might make the video feel less relevant, say, 10 years from now.

3blue1brown

Repeating tests or looking at other evidence is really important to help people understand how updating works but there are other videos that Grant has made about this previously.

Joshua Davis

Some slip in the editing bumped that part up artificially by 5 dB, thanks for the catch!

3blue1brown

I think adding the idea of a continuous application of Bayes' law adds an extra concept, and this video is already pretty full of concepts. My understanding of this video’s message is: “you can calculate the usefulness of a medical test with some simple math and the result is definitely not intuitive”; “thinking in terms of odds instead of probability lets you use that simple math confidently”; “what you know about your initial odds of having the disease makes a difference in how ‘useful’ the test results are”. That’s a lot of concepts for a wide audience to take in. Even the probability form of Bayes' law is an additional concept, and that (by itself) might substantially reduce the audience of people who watch the whole video or try to understand the first three concepts.

William Smith

Great video, it gives me a more quantifiable sense of Bayes rule, other than to remember that I have to remember that priors are a thing. At one point I think it was a bit tricky to follow, because I got distracted figuring out how to calculate the odds from the probability, I think giving the formula there would be nice. Also what it means to multiply the odds by something. You say it's just a fraction, but still when you hear "multiply 1 to 4 by 5", i.e. (1:4)*5 it's not intuitive because 1:4 seems so symmetric, whereas 1/5 doesn't. Thus I could imagine (1:4)*5 meaning 5:4, or 5:20 or 1:20. It becomes clear very quickly when you think about it, but for that one has to pause the video.
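The conversion and the "(1:4)*5" question raised here can be made concrete with a small sketch. The function names are hypothetical, and exact fractions are used only to keep the arithmetic readable; the convention shown (scale only the "for" side of the odds) is the usual reading of multiplying odds by a factor.

```python
from fractions import Fraction

def probability_to_odds(p):
    """Probability p corresponds to odds p : (1 - p), returned as the fraction p / (1 - p)."""
    p = Fraction(p)
    return p / (1 - p)

def odds_to_probability(odds):
    """Odds a : b, written as the fraction a / b, correspond to probability a / (a + b)."""
    odds = Fraction(odds)
    return odds / (1 + odds)

def scale_odds(for_count, against_count, factor):
    """Multiply odds for:against by a factor; only the 'for' side is scaled."""
    return (for_count * factor, against_count)

# A probability of 1/5 is odds of 1:4, i.e. the fraction 1/4.
odds = probability_to_odds(Fraction(1, 5))

# (1:4) * 5 means 5:4, not 5:20 or 1:20.
new_for, new_against = scale_odds(1, 4, 5)

# Odds of 5:4 correspond to a probability of 5/9.
prob = odds_to_probability(Fraction(new_for, new_against))
print(odds, (new_for, new_against), prob)
```

Treating a:b as the fraction a/b is what makes the multiplication unambiguous: 5 × (1/4) = 5/4, which reads back as 5:4.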

Well, that kind of depends on the prior odds of it being different, doesn't it? ;-)

Paul Brekke

I think this was a very clear and understandable summary, Grant. Thank you! I've taught this subject to medical PhD students a few times, and I wish I could have told them to watch this video :-) I really like how this reframing explicitly separates the prior, which I find is the thing most people - myself included - stumble on when confronted with this kind of question. The only thing I can think of changing is that for people with _some_ medical background, I think likelihood ratio is a more familiar term (it's relatively commonly used in evidence-based medicine textbooks and papers), and LR could perhaps be introduced the first time you mention Bayes factor, so that we/they understand the relevance of the video immediately? Thank you again for making an effort to explain these important concepts to the world - I am really proud to be a supporter! :)

Paul Brekke

Awesome stuff! I feel like COVID was handled well here, as you have the animations from the previous videos on it but aren't framing it in that limited scope

Kyle Begovich

The killer (terrible word choice) idea that jumps out is your reframing of medical tests as updating a prior. Central to breaking the misconception the physicians had (I think this example also came up in The Drunkard's Walk) is that tests operate on an existing probability/odds. With respect to addressing a continuous parameter, I think that would bury someone learning about Bayes for the first time. But you could show what happens with two Bayes factors, i.e. repeating the same test.
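A minimal sketch of what "two Bayes factors" might look like in practice. It assumes the repeated results are conditionally independent given the true disease status, which is a strong assumption that often fails for real repeated tests; the prior of 1% and factor of 10 are illustrative values, and `update` is a hypothetical helper name.

```python
def update(prior_probability, *bayes_factors):
    """Apply zero or more Bayes factors in sequence, working in odds.

    Assumes each result is conditionally independent of the others
    given the true disease status.
    """
    odds = prior_probability / (1 - prior_probability)
    for factor in bayes_factors:
        odds *= factor
    return odds / (1 + odds)

# Illustrative numbers only: a 1% prior and a Bayes factor of 10 per positive result.
after_one = update(0.01, 10)
after_two = update(0.01, 10, 10)
print(f"one positive: {after_one:.1%}, two positives: {after_two:.1%}")
```

Because the update is a plain multiplication in odds, applying the factor twice is the same as applying its square once.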

The sound at https://youtu.be/dCn0NjabEZM?t=426 "What would you tell people in the seminar" is clipped


Maybe rerecord the audio at 7:10

I got myself confused there because I'm ignoring the predictions of non-rain ... hrrm.

Jason Olshefsky

My $0.02 (readers, discount appropriately!): I think what you have here would be a really cool way to tie together many Frequentist-y ideas (e.g., hypothesis testing that most folks come across in high school) with their Bayesian counterparts. Maybe the odds form of Bayes rule (which you touched on here) is the bridge to get you there? Anywho, always enjoy the videos, Grant...particularly anything probability/statistics related :)

I have been looking for a way to more easily integrate Bayes into my daily life, and this helps a lot. I take away that the simplest way to understand the Bayes factor is the ratio of the probability that a test gave one answer and it was right divided by the probability that the test gave the same answer but was wrong. So when someone predicts it will rain, and it does 9 out of 10 times, that's a Bayes factor of 9. If it actually rains 1/10th of the days, then the odds of rain is 1:9, so a prediction of rain means the odds of rain are 9:9 or even. In the video, I felt like the Bayes Factor * odds came on a little like a shortcut to the answer. I haven't done the math, but does it make sense to do the algebra to weed the Bayes Factor out of the traditional Bayes formula so the "odds equation" is proven?

Jason Olshefsky
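On the question of whether the "odds equation" can be derived from the traditional formula: it can, in a couple of lines. The sketch below writes $D$ for "has the condition" and $+$ for "tested positive"; the symbol $D$ is notation assumed here for illustration.

```latex
% Bayes' rule applied to each hypothesis separately
P(D \mid +) = \frac{P(+ \mid D)\, P(D)}{P(+)},
\qquad
P(\neg D \mid +) = \frac{P(+ \mid \neg D)\, P(\neg D)}{P(+)}

% Dividing the first by the second cancels P(+)
\underbrace{\frac{P(D \mid +)}{P(\neg D \mid +)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(D)}{P(\neg D)}}_{\text{prior odds}}
\times
\underbrace{\frac{P(+ \mid D)}{P(+ \mid \neg D)}}_{\text{Bayes factor}}
```

Taking the comment's numbers as given, prior odds of 1:9 multiplied by a Bayes factor of 9 give 9:9, which is a probability of 1/2.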

I really think that the suggestion you made at the end of the lesson was very important. Our understanding needs to be tested with the goal of developing an intuition that can be trusted. The course on probability and statistics on Brilliant.org was disappointing in this regard. It never helped me to develop a real ability to intuitively understand Bayes's theory. I'd like to have a piece of software that could keep drilling me and giving me feedback until I felt confident that I really did understand it. This is really important to me because if I only had this intuitive ability years ago my life would be very different right now :(.

Joshua Davis

Super intuitive! Ty. I also fit your target group exactly.

jonas.app

