Learning Curves in Machine Learning - Baeldung

2025-01-25

文章推薦指數： 80 %

投票人數：10人

A learning curve is just a plot showing the progress over the experience of a specific metric related to learning during the training of a ... StartHereAbout ▼▲ FullArchive Thehighleveloverviewofallthearticlesonthesite. WriteforBaeldung Becomeawriteronthesite. AboutBaeldung AboutBaeldung. AuthorsTopIfyouhaveafewyearsofexperienceinComputerScienceorresearch,andyou’reinterestedinsharingthatexperiencewiththecommunity,havealookatourContributionGuidelines. 1.Overview Inthistutorial,we’llstudywhatarelearningcurvesandwhytheyarenecessaryduringthetrainingprocessofamachinelearningmodel. We’llalsodiscoverdifferenttypesofcurves,whattheyareusedfor,andhowtheyshouldbeinterpretedtomakethemostoutofthelearningprocess. Bytheendofthearticle,we’llhavethetheoreticalandpracticalknowledgerequiredtoavoidcommonproblemsinreal-lifemachinelearningtraining.Ready?Let’sbegin! 2.LearningCurves 2.1.Introduction Contrarytowhatpeopleoftenthink,machinelearningisfarfrombeingfullyautomated.Itrequireslotsof“babysitting”;monitoring,datapreparation,andexperimentation,especiallyifit’sanewproject.Inallthatprocess,learningcurvesplayafundamentalrole. Alearningcurveisjustaplotshowingtheprogressovertheexperienceofaspecificmetricrelatedtolearningduringthetrainingofamachinelearningmodel.Theyarejustamathematicalrepresentationofthelearningprocess. Accordingtothis,we’llhaveameasureoftimeorprogressinthex-axisandameasureoferrororperformanceinthey-axis. Weusethesechartstomonitortheevolutionofourmodelduringlearningsowecandiagnoseproblemsandoptimizethepredictionperformance. 2.2.SingleCurves Themostpopularexampleofalearningcurveislossovertime.Loss(orcost)measuresourmodelerror,or“howbadourmodelisdoing”.So,fornow,thelowerourlossbecomes,thebetterourmodelperformancewillbe. Inthepicturebelow,wecanseetheexpectedbehaviorofthelearningprocess: Despitethefactithasslightupsanddowns,inthelongterm,thelossdecreasesovertime,sothemodelislearning. Otherexamplesofverypopularlearningcurvesareaccuracy,precision,andrecall.Allofthesecapturemodelperformance,sothehighertheyare,thebetterourmodelbecomes. Seebelowanexampleofatypicalaccuracycurveovertime: Themodelperformanceisgrowingovertime,whichmeansthemodelisimprovingwithexperience(it’slearning). Wealsoseeitgrowsatthebeginning,butovertimeitreachesaplateau,meaningit’snotabletolearnanymore. 2.3.MultipleCurves Oneofthemostwidelyusedmetricscombinationsistrainingloss+validationlossovertime. Thetraininglossindicateshowwellthemodelisfittingthetrainingdata,whilethevalidationlossindicateshowwellthemodelfitsnewdata. Wewillseethiscombinationlateron,butfornow,seebelowatypicalplotshowingbothmetrics: Anothercommonpracticeistohavemultiplemetricsinthesamechartaswellasthosemetricsfordifferentmodels. 2.4.TwoMainTypes Weoftenseethesetwotypesoflearningcurvesappearingincharts: OptimizationLearningCurves:Learningcurvescalculatedonthemetricbywhichtheparametersofthemodelarebeingoptimized,suchaslossorMeanSquaredError PerformanceLearningCurves:Learningcurvescalculatedonthemetricbywhichthemodelwillbeevaluatedandselected,suchasaccuracy,precision,recall,orF1score BelowyoucanseeanexampleinMachineTranslationshowingBLEU(aperformancescore)togetherwiththeloss(optimizationscore)fortwodifferentmodels(orangeandgreen): 3.HowtoDetectModelBehavior Wecandetectissuesinthebehaviorofamodelbywatchingtheevolutionofalearningcurve. Next,we’llseeeachofthedifferentscenarioswecanfindformodelbehaviordetection: 3.1.HighBias/Underfitting Let’squicklyreviewwhattheseconceptsare: Bias:Highbiasoccurswhenthelearningalgorithmisnottakingintoaccountalltherelevantinformation,becomingunabletocapturethemodel’srichnessandcomplexity Underfitting:Whenthealgorithmisnotabletomodeleithertrainingdataornewdata,consistentlyobtaininghigherrorvaluesthatdon’tdecreaseovertime Wecanseetheyarecloselytied,asthemorebiasedamodelis,themoreitunderfitsthedata. Let’simagineourdataarethebluedotsbelow,andwewanttocomeupwithalinearmodelforregressionpurposes: Supposewe’reverylazymachinelearningpractitionersandweproposethislineasamodel: Clearly,astraightlinelikethatdoesn’trepresentthepatternofourdots.Itlackssomecomplexitytodescribethenatureofthegivendata.Wecanseehowthebiasedmodeldoesn’ttakeintoaccountrelevantinformation,whichleadstounderfitting. It’sdoingaterriblejobwiththetrainingdataalready,sowhatwouldbetheperformanceforanewexample? It’sprettyobviousitperformsaspoorlywiththenewexampleasitdoeswiththetrainingdata: Now,howcanweuselearningcurvestodetectourmodelisunderfitting?Seeanexampleshowingvalidationandtrainingcost(loss)curves: Thecost(loss)functionishighanddoesn’tdecreasewiththenumberofiterations,bothforthevalidationandtrainingcurves Wecouldactuallyusejustthetrainingcurveandcheckthatthelossishighandthatitdoesn’tdecrease,toseethatit’sunderfitting 3.2.HighVariance/Overfitting Let’sbrieflyreviewthesetwoconcepts: Variance:Highvariancehappenswhenthemodelistoocomplexanddoesn’trepresentthesimplerrealpatternsexistinginthedata Overfitting:Thealgorithmcaptureswellthetrainingdata,butitperformspoorlyonnewdata,soit’snotabletogeneralize Thesearealsodirectlyrelatedconcepts:Thehigherthevarianceofamodel,themoreitoverfitsthetrainingdata. Let’stakethesameexampleasbefore,wherewewantedalinearmodeltoapproximatethesebluedots: Now,ontheotherextreme,let’simaginewe’reveryperfectionistmachinelearningpractitionersandweproposealinearmodelthatperfectlyexplainsourdatalikethis: Well,weunderstandintuitivelythatthislineisnotwhatwewanted,either.Indeed,itfitsthedata,butitdoesn’trepresenttherealpatterninit. Whenanewexampleappears,itwillstruggletomodelit.Seeanewexample(inorange): Usingtheoverfittedmodel,itwon’tpredictwellenoughthenewexample: Howcouldweuselearningcurvestodetectamodelisoverfitting?We’llneedboththevalidationandtraininglosscurves: Thetraininglossgoesdownovertime,achievinglowerrorvalues Thevalidationlossgoesdownuntilaturningpointisfound,andthereitstartsgoingupagain.Thatpointrepresentsthebeginningofoverfitting 3.3.FindingtheRightBias/VarianceTradeoff Thesolutiontothebias/varianceproblemistofindasweetspotbetweenthem. Intheexamplegivenabove: agoodlinearmodelforthedatawouldbealinelikethis: So,whenanewexampleappears: Wewillmakeabetterprediction: Wecanusethevalidationandtraininglosscurvestofindtherightbias/variancetradeoff: Thetrainingprocessshouldbestoppedwhenthevalidationerrortrendchangesfromdescendingtoascending Ifwestoptheprocessbeforethatpoint,themodelwillunderfit Ifwestoptheprocessafterthatpoint,themodelwilloverfit 4.HowtoDetectRepresentativeness 4.1.WhatRepresentativenessMeans Arepresentativedatasetreflectsproportionallystatisticalcharacteristicsinanotherdatasetfromthesamedomain. Wecouldfindthetrainingdatasetisnotrepresentativeinrelationtothevalidationdatasetandvice-versa. 4.2.UnrepresentativeTrainingDataset Thishappenswhenthedataavailableduringtrainingisnotenoughtocapturethemodel,relativetothevalidationdataset. Wecanspotthisissuebyshowingonelosscurvefortrainingandanotherforvalidation: Thetrainandvalidationcurvesareimproving,butthere’sabiggapbetweenthem,whichmeanstheyoperatelikedatasetsfromdifferentdistributions. 4.3.UnrepresentativeValidationDataset Thishappenswhenthevalidationdatasetdoesnotprovideenoughinformationtoevaluatetheabilityofthemodeltogeneralize. Thefirstscenariois: Aswecansee,thetrainingcurvelooksok,butthevalidationfunctionmovesnoisilyaroundthetrainingcurve. Itcouldbethecasethatvalidationdataisscarceandnotveryrepresentativeofthetrainingdata,sothemodelstrugglestomodeltheseexamples. Thesecondscenariois: Herewefindthevalidationlossismuchbetterthanthetrainingone,whichreflectsthevalidationdatasetiseasiertopredictthanthetrainingdataset. Anexplanationcouldbethevalidationdataisscarcebutwidelyrepresentedbythetrainingdataset,sothemodelperformsextremelywellonthesefewexamples. Anyway,thismeansthevalidationdatasetdoesnotrepresentthetrainingdataset,sothereisaproblemwithrepresentativeness. 5.Conclusion Inthistutorial,wereviewedsomebasicconceptsrequiredtounderstandtheconceptsbehindlearningcurvesandhowtousethem. Next,welearnedhowtointerpretlearningcurvesandthewaytheycanbeusedtoavoidcommonlearningproblemssuchasunderfitting,overfitting,orunrepresentativeness. AuthorsBottomIfyouhaveafewyearsofexperienceinComputerScienceorresearch,andyou’reinterestedinsharingthatexperiencewiththecommunity,havealookatourContributionGuidelines. 1Comment Oldest Newest InlineFeedbacks Viewallcomments ViewComments LoadMoreComments Commentsareclosedonthisarticle! wpDiscuzInsert

請為這篇文章評分？

延伸文章資訊

Why you should be plotting learning curves in your next ...

Learning curves show the relationship between training set size and your chosen evaluation metric...

Learning Curves in Machine Learning - SpringerLink

A learning curve shows a measure of predictive performance on a given domain as a function of som...

What is a Learning Curve in machine learning? - Stack Overflow

An ROC curve is a graphical depiction of classifier performance that shows the trade-off between ...

Machine Learning學習日記— Coursera篇(Week 6.2 ... - Medium

大綱. Diagnosing Bias vs. Variance; Regularization and Bias/Variance; Learning Curves; Deciding Wha...

Learning Curve to identify Overfitting and Underfitting in ...

Learning curves plot the training and validation loss of a sample of training examples by increme...

Learning Curves in Machine Learning - Baeldung

文章推薦指數： 80 %

請為這篇文章評分？

延伸文章資訊

最新文章

相關網站資訊

中日口譯課程

中國生產力中心口譯評價

紙的應用