When to use Bayesian - Towards Data Science


5 Scenarios Where Bayesian Modeling Should Be Considered

By Max Reynolds. Published in Towards Data Science.

Introduction

Most statistical models have a frequentist and a Bayesian version. The decision between the two approaches is not just a choice between models; it is more a choice between statistical "languages". Frequentist statistics is based on hypothetically resampling the data from an underlying true model. Take maximum likelihood estimation, in as plain English as possible: the likelihood is the probability of the observed data given (conditional on) the underlying model, and the method asks, "Which of all the possible underlying models gives us the highest probability of seeing the data?" The Bayesian equivalent of this question is "Which model, or set of parameter values, are we most certain is the true model?"

Philosophies aside, there are some practical reasons to choose one approach or the other. The frequentist approach is often the go-to, so in this article, I'll go through some scenarios where it may make sense to go with Bayesian modeling.

Note: for a primer on the intuition of Bayesian Inference, see this piece.

Prior Information

Bayesian statistics is all about belief. We have some prior belief about the true model, and we combine that with the likelihood of our data to get our posterior belief about the true model. (I won't go too into the math here, but again see here if interested.)

In some cases, we have knowledge about our domain before we see any of the data. Bayesian inference provides a straightforward way to encode that belief into a prior probability distribution. For example, say I am an economist predicting the effects of interest rates on tech stock price changes. My beliefs about economics say that low interest rates generally boost prices by some amount, probably by around 5% per interest point but probably not by more than 25% per interest point. Because a higher rate means lower prices, the expected effect is negative, so we might use a normal distribution for our prior, centered at -5 (percent per interest point) with a variance of 40 (a standard deviation of about 6.3). It would look something like this:

[Figure: Prior belief ~ Normal(-5, 40) about the effect of interest rates on stock prices. Image by author.]

If I'm worried about the effect that my potentially biased prior beliefs have on the posterior, I can either:

1. Use a "weaker prior", e.g. a normal distribution with mean 0 and variance 1600 (i.e. standard deviation = 40), or
2. Do a sensitivity analysis where I try a whole bunch of different priors and see how much the choice actually affects the posterior distribution.

[Figure: Weak prior belief ~ Normal(0, 1600) about the effect of interest rates on stock prices. Image by author.]

Limited Data

This scenario is related to the previous one. If our dataset is really small, a few outliers could lead to an incorrect fit in a frequentist model (e.g. ordinary least squares linear regression in sklearn). However, if we use a Bayesian approach, a prior distribution over the parameters can act as a regularizer that prevents unlikely extreme values.

In a situation where data is obtained over time, you can do the Bayesian inference with the data you have, obtain a posterior distribution, and then use that posterior as the prior when new data are obtained. This process, called "Bayesian Updating", can be repeated over and over.

Uncertainty Measurement

Bayesian inference provides intervals that are easier to interpret than frequentist confidence intervals, usually referred to as "credible intervals." A credible interval is read straight off the posterior distribution, which directly represents our belief about a parameter in the model or a prediction given new data. You can say "I am 95% sure that parameter θ is between 2.2 and 3.6." Compare this to the frequentist confidence interval, which can only say "in a large number of repeated samples, intervals calculated the same way as ours (between 1.7 and 3.4) would contain the true value 95% of the time." If you care about conveying uncertainty (especially to non-statisticians), the Bayesian approach probably makes more sense.

Limited Test Data

If you are training and testing on two different distributions, a likely scenario is that the amount of data in the testing distribution is far smaller. Say we want to train a computer vision algorithm to locate a rare cancer type from a whole-body CT scan. We incorporate scans from patients with a more common cancer type for training and save some of the rare cancer patients for testing. We can get an accuracy point estimate from our evaluation, but a more appropriate method may be to use a Bayesian approach to evaluation [1]. We can create a relatively weak prior for the accuracy of our classifier, add evaluation trials as data, and obtain a posterior distribution where we can say "I am 95% confident that the classifier accuracy on the test distribution is between 65% and 82%".

Hierarchical Models

Hierarchical models work well in the Bayesian framework. In a hierarchical model, there are multiple levels of random variables. For example, suppose you are modeling standardized test scores of students. Each student's score comes from a distribution of scores in their school district, and the parameters of each school district come from a more global distribution of school district average scores in the nation.

[Figure: Hierarchical model. ϕ represents global parameters. θ are group-specific parameters, coming from a global distribution defined by ϕ. y are specific data points belonging to one of the s groups. Image by author.]

A benefit of the hierarchical approach is that you can model properties of all of the clusters, even if there are very few data points from a given cluster. In this example, if we have very few data points from a given school district, its distribution will be wider and will more closely resemble the overall global distribution (ϕ). Again, a hierarchical model is more resistant to outliers and limited data than creating separate models for every cluster.

Frequentist approaches to hierarchical linear models might look for the mode of the likelihood (the counterpart of the posterior mode), which, often in hierarchical models, can be on the edge or boundary of the parameter space [2]. This can give you an incorrect result when learning model parameters. With Bayesian inference, we obtain a whole posterior distribution and we can compute statistics that are more appropriate for a complex distribution, like the mean, median, and 95% credible intervals.

With recent computational and algorithmic advances, Bayesian inference is more feasible for larger models and more data. While in practice frequentist approaches are often the default choice, there are some scenarios where a Bayesian approach can be a better option, most frequently when:

- You have quantifiable prior beliefs
- Data is limited
- Uncertainty is important
- The model (data-generating process) is hierarchical

Thank you for reading.

References:

[1] https://www.youtube.com/watch?v=5f-9xCuyZh4
[2] https://discourse.mc-stan.org/t/hierarchical-linear-models-bayes-vs-frequentist/12012/1
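Worked example: priors, sensitivity analysis, and Bayesian updating

The interest-rate example from the Prior Information and Limited Data sections can be sketched in a few lines. This is only an illustration under strong assumptions: the observed effects and the observation variance `NOISE_VAR` are made up, and the noise is assumed normal with known variance so that the posterior has a closed form. The function names are hypothetical, not taken from any library.

```python
from statistics import NormalDist


def normal_posterior(prior_mean, prior_var, observations, noise_var):
    """Conjugate update for a normal mean with known noise variance.

    Returns the posterior (mean, variance) of the effect.
    """
    n = len(observations)
    precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var


def credible_interval(mean, var, level=0.95):
    """Central credible interval of a normal posterior."""
    dist = NormalDist(mean, var ** 0.5)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical effects (% price change per interest point); invented data
# with one outlier-like value, to mimic a very small dataset.
observed = [-12.0, -3.0, -30.0]
NOISE_VAR = 100.0  # assumed known observation variance

# Sensitivity analysis: the same data under the two priors from the article.
informative = normal_posterior(-5.0, 40.0, observed, NOISE_VAR)
weak = normal_posterior(0.0, 1600.0, observed, NOISE_VAR)

# Bayesian updating: each posterior becomes the prior for the next point.
step = (-5.0, 40.0)
for x in observed:
    step = normal_posterior(step[0], step[1], [x], NOISE_VAR)

print("informative prior:", informative, credible_interval(*informative))
print("weak prior:       ", weak, credible_interval(*weak))
print("sequential update:", step)
```

Under these assumptions the informative prior pulls the estimate away from the raw sample mean and toward -5, which is the regularizing effect described above, while the weak prior leaves the estimate close to the sample mean. Updating one observation at a time ends at the same posterior as using all of the data at once.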

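Worked example: a credible interval for classifier accuracy

The Bayesian evaluation idea from the Limited Test Data section can be sketched with a Beta prior on accuracy. This is a minimal sketch, not the method from reference [1]: the counts (37 correct out of 50) are invented, the uniform Beta(1, 1) prior is one possible choice of "relatively weak prior", and the interval is approximated by sampling with the standard library rather than computed exactly.

```python
import random


def accuracy_posterior(correct, total, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for accuracy after `total` test trials."""
    return prior_a + correct, prior_b + (total - correct)


def beta_credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Approximate central credible interval by sampling from Beta(a, b)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical evaluation: 37 of 50 rare-cancer test scans handled correctly.
a, b = accuracy_posterior(correct=37, total=50)
lower, upper = beta_credible_interval(a, b)
posterior_mean = a / (a + b)

print(f"posterior mean accuracy: {posterior_mean:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

With so few test cases the interval is wide, which is the point: it reports how uncertain the accuracy estimate is instead of a single number. When more test cases arrive, `accuracy_posterior` can be called again with the current `a` and `b` passed in as `prior_a` and `prior_b`.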

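Worked example: shrinkage in a hierarchical model

The claim in the Hierarchical Models section, that a group with few data points ends up wider and closer to the global distribution, can be illustrated with a simplified two-level normal model. Everything numeric here is an assumption for the sketch: the global mean, the between-district variance, and the within-district variance are treated as known, whereas a full Bayesian model would place priors on them and infer them from all districts together. The scores are invented.

```python
from statistics import mean

# Assumed known for this sketch; a full hierarchical model would infer these.
GLOBAL_MEAN = 500.0       # phi: national average score
BETWEEN_VAR = 40.0 ** 2   # phi: variance of district means around GLOBAL_MEAN
WITHIN_VAR = 80.0 ** 2    # variance of student scores inside one district


def district_estimate(scores):
    """Posterior mean, variance, and data weight for one district mean (theta)."""
    n = len(scores)
    weight = n / (n + WITHIN_VAR / BETWEEN_VAR)  # how much the data is trusted
    post_mean = weight * mean(scores) + (1.0 - weight) * GLOBAL_MEAN
    post_var = 1.0 / (1.0 / BETWEEN_VAR + n / WITHIN_VAR)
    return post_mean, post_var, weight


# Two hypothetical districts with the same raw average (630) but different sizes.
small_district = [610.0, 650.0]        # 2 students
large_district = [610.0, 650.0] * 50   # 100 students

small = district_estimate(small_district)
large = district_estimate(large_district)

print("small district (mean, var, weight):", small)
print("large district (mean, var, weight):", large)
```

Both districts have the same raw average, but the two-student district is pulled much further toward the national mean and keeps a larger posterior variance, while the 100-student district stays close to its own data.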
