How to Use Learning Curves to Diagnose Machine Learning Model Performance
By Jason Brownlee on February 27, 2019 in Deep Learning Performance
Last Updated on August 6, 2019

A learning curve is a plot of model learning performance over experience or time.

Learning curves are a widely used diagnostic tool in machine learning for algorithms that learn from a training dataset incrementally. The model can be evaluated on the training dataset and on a hold-out validation dataset after each update during training, and plots of the measured performance can be created to show learning curves.

Reviewing learning curves of models during training can be used to diagnose problems with learning, such as an underfit or overfit model, as well as whether the training and validation datasets are suitably representative.

In this post, you will discover learning curves and how they can be used to diagnose the learning and generalization behavior of machine learning models, with example plots showing common learning problems.

After reading this post, you will know:

- Learning curves are plots that show changes in learning performance over time in terms of experience.
- Learning curves of model performance on the train and validation datasets can be used to diagnose an underfit, overfit, or well-fit model.
- Learning curves of model performance can be used to diagnose whether the train or validation datasets are not relatively representative of the problem domain.

Kick-start your project with my new book Better Deep Learning, including step-by-step tutorials and the Python source code files for all examples.

Let's get started.

A Gentle Introduction to Learning Curves for Diagnosing Deep Learning Model Performance. Photo by Mike Sutherland, some rights reserved.

Overview

This tutorial is divided into three parts; they are:

1. Learning Curves
2. Diagnosing Model Behavior
3. Diagnosing Unrepresentative Datasets

Learning Curves in Machine Learning

Generally, a learning curve is a plot that shows time or experience on the x-axis and learning or improvement on the y-axis.

Learning curves (LCs) are deemed effective tools for monitoring the performance of workers exposed to a new task. LCs provide a mathematical representation of the learning process that takes place as task repetition occurs.

— Learning curve models and applications: Literature review and research directions, 2011.

For example, if you were learning a musical instrument, your skill on the instrument could be evaluated and assigned a numerical score each week for one year. A plot of the scores over the 52 weeks is a learning curve and would show how your learning of the instrument has changed over time.

Learning Curve: Line plot of learning (y-axis) over experience (x-axis).

Learning curves are widely used in machine learning for algorithms that learn (optimize their internal parameters) incrementally over time, such as deep learning neural networks.

The metric used to evaluate learning could be maximizing, meaning that better scores (larger numbers) indicate more learning. An example would be classification accuracy.

It is more common to use a score that is minimizing, such as loss or error, whereby better scores (smaller numbers) indicate more learning and a value of 0.0 indicates that the training dataset was learned perfectly and no mistakes were made.
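Before looking at specific curve shapes, it may help to see how such a plot is produced in code. The sketch below is a minimal, hedged example rather than the code behind this post's figures: it assumes TensorFlow's Keras API, a synthetic classification problem from scikit-learn's make_classification, and matplotlib, and it plots the per-epoch loss values recorded in the History object returned by fit().

```python
# Minimal sketch: record and plot loss learning curves with Keras.
# The dataset and model configuration are assumptions for illustration only.
from sklearn.datasets import make_classification
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot

# Synthetic binary classification problem.
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# Small MLP optimized on cross-entropy loss (a minimizing metric).
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(20,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')

# Evaluate on a hold-out validation split after each epoch of training.
history = model.fit(X, y, validation_split=0.3, epochs=100, verbose=0)

# Line plot of loss (y-axis) over epochs/experience (x-axis).
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='validation')
pyplot.xlabel('epoch')
pyplot.ylabel('loss')
pyplot.legend()
pyplot.show()
```

Because loss is a minimizing metric, lower values on the y-axis of the resulting plot indicate more learning, as described above.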
During the training of a machine learning model, the current state of the model at each step of the training algorithm can be evaluated. It can be evaluated on the training dataset to give an idea of how well the model is "learning." It can also be evaluated on a hold-out validation dataset that is not part of the training dataset. Evaluation on the validation dataset gives an idea of how well the model is "generalizing."

- Train Learning Curve: Learning curve calculated from the training dataset that gives an idea of how well the model is learning.
- Validation Learning Curve: Learning curve calculated from a hold-out validation dataset that gives an idea of how well the model is generalizing.

It is common to create dual learning curves for a machine learning model during training on both the training and validation datasets.

In some cases, it is also common to create learning curves for multiple metrics, such as in the case of classification predictive modeling problems, where the model may be optimized according to cross-entropy loss and model performance is evaluated using classification accuracy. In this case, two plots are created, one for the learning curves of each metric, and each plot can show two learning curves, one for each of the train and validation datasets.

- Optimization Learning Curves: Learning curves calculated on the metric by which the parameters of the model are being optimized, e.g. loss.
- Performance Learning Curves: Learning curves calculated on the metric by which the model will be evaluated and selected, e.g. accuracy.

Now that we are familiar with the use of learning curves in machine learning, let's look at some common shapes observed in learning curve plots.

Diagnosing Model Behavior

The shape and dynamics of a learning curve can be used to diagnose the behavior of a machine learning model, and in turn perhaps suggest the type of configuration changes that may be made to improve learning and/or performance.

There are three common dynamics that you are likely to observe in learning curves; they are:

- Underfit.
- Overfit.
- Good Fit.

We will take a closer look at each with examples. The examples will assume that we are looking at a minimizing metric, meaning that smaller relative scores on the y-axis indicate more or better learning.

Underfit Learning Curves

Underfitting refers to a model that cannot learn the training dataset.

Underfitting occurs when the model is not able to obtain a sufficiently low error value on the training set.

— Page 111, Deep Learning, 2016.

An underfit model can be identified from the learning curve of the training loss only.

It may show a flat line or noisy values of relatively high loss, indicating that the model was unable to learn the training dataset at all.

An example of this is provided below and is common when the model does not have a suitable capacity for the complexity of the dataset.

Example of Training Learning Curve Showing an Underfit Model That Does Not Have Sufficient Capacity

An underfit model may also be identified by a training loss that is decreasing and continues to decrease at the end of the plot.

This indicates that the model is capable of further learning and possible further improvements and that the training process was halted prematurely.

Example of Training Learning Curve Showing an Underfit Model That Requires Further Training

A plot of learning curves shows underfitting if:

- The training loss remains flat regardless of training (a sketch reproducing this case follows below).
- The training loss continues to decrease until the end of training.
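As a rough illustration of the first case, the sketch below deliberately gives a model too little capacity for a non-linear problem: a single sigmoid unit on scikit-learn's make_moons dataset. The dataset, architecture, and exact loss values are assumptions for illustration; the point is only that the training loss tends to plateau at a relatively high value rather than approach zero.

```python
# Minimal sketch of an underfit model: too little capacity for the data.
# Configuration is an illustrative assumption, not the code behind this post's figures.
from sklearn.datasets import make_moons
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot

# Non-linear two-class problem with a curved class boundary.
X, y = make_moons(n_samples=500, noise=0.2, random_state=1)

# A single sigmoid unit is effectively a linear model and cannot
# capture the curved boundary, so training loss tends to stay high.
model = Sequential()
model.add(Dense(1, activation='sigmoid', input_shape=(2,)))
model.compile(optimizer='sgd', loss='binary_crossentropy')
history = model.fit(X, y, epochs=100, verbose=0)

# Training loss alone is enough to spot underfitting here.
pyplot.plot(history.history['loss'], label='train')
pyplot.xlabel('epoch')
pyplot.ylabel('loss')
pyplot.legend()
pyplot.show()
```

For the second case, where the training loss is still falling at the end of the plot, simply training for more epochs is the usual first change to try, since the process was halted prematurely.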
Overfit Learning Curves

Overfitting refers to a model that has learned the training dataset too well, including the statistical noise or random fluctuations in the training dataset.

… fitting a more flexible model requires estimating a greater number of parameters. These more complex models can lead to a phenomenon known as overfitting the data, which essentially means they follow the errors, or noise, too closely.

— Page 22, An Introduction to Statistical Learning: with Applications in R, 2013.

The problem with overfitting is that the more specialized the model becomes to the training data, the less well it is able to generalize to new data, resulting in an increase in generalization error. This increase in generalization error can be measured by the performance of the model on the validation dataset.

This is an example of overfitting the data, […]. It is an undesirable situation because the fit obtained will not yield accurate estimates of the response on new observations that were not part of the original training dataset.

— Page 24, An Introduction to Statistical Learning: with Applications in R, 2013.

This often occurs if the model has more capacity than is required for the problem and, in turn, too much flexibility. It can also occur if the model is trained for too long.

A plot of learning curves shows overfitting if:

- The plot of training loss continues to decrease with experience.
- The plot of validation loss decreases to a point and begins increasing again.

The inflection point in validation loss may be the point at which training could be halted, as experience after that point shows the dynamics of overfitting.

The example plot below demonstrates a case of overfitting.

Example of Train and Validation Learning Curves Showing an Overfit Model

Good Fit Learning Curves

A good fit is the goal of the learning algorithm and exists between an overfit and underfit model.

A good fit is identified by a training and validation loss that decreases to a point of stability with a minimal gap between the two final loss values.

The loss of the model will almost always be lower on the training dataset than the validation dataset. This means that we should expect some gap between the train and validation loss learning curves. This gap is referred to as the "generalization gap."

A plot of learning curves shows a good fit if:

- The plot of training loss decreases to a point of stability.
- The plot of validation loss decreases to a point of stability and has a small gap with the training loss.

Continued training of a good fit will likely lead to an overfit.

The example plot below demonstrates a case of a good fit.

Example of Train and Validation Learning Curves Showing a Good Fit
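As noted above, the inflection point in validation loss may be the point at which training could be halted, and continued training of a good fit will likely lead to an overfit. One common mechanism for acting on this diagnosis (not the only one) is early stopping. The sketch below is a minimal example under the same assumed Keras setup as the earlier sketches; it uses Keras's built-in EarlyStopping callback to watch validation loss and roll back to the best weights.

```python
# Minimal sketch: halt training near the validation-loss inflection point.
# Model size and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_classification
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# Deliberately high-capacity model so that overfitting is plausible.
model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(20,)))
model.add(Dense(256, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')

# Stop once validation loss has not improved for `patience` epochs,
# then restore the weights from the best epoch seen.
stopper = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history = model.fit(X, y, validation_split=0.3, epochs=500, verbose=0,
                    callbacks=[stopper])
print('stopped after %d epochs' % len(history.history['loss']))
```

The patience value is a trade-off: too small and a noisy validation curve stops training early (the underfit case above), too large and training runs further into the overfitting regime.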
Diagnosing Unrepresentative Datasets

Learning curves can also be used to diagnose properties of a dataset and whether it is relatively representative.

An unrepresentative dataset means a dataset that may not capture the statistical characteristics relative to another dataset drawn from the same domain, such as between a train and a validation dataset. This can commonly occur if the number of samples in a dataset is too small, relative to another dataset.

There are two common cases that could be observed; they are:

- Training dataset is relatively unrepresentative.
- Validation dataset is relatively unrepresentative.

Unrepresentative Train Dataset

An unrepresentative training dataset means that the training dataset does not provide sufficient information to learn the problem, relative to the validation dataset used to evaluate it.

This may occur if the training dataset has too few examples as compared to the validation dataset.

This situation can be identified by a learning curve for training loss that shows improvement and similarly a learning curve for validation loss that shows improvement, but a large gap remains between both curves.

Example of Train and Validation Learning Curves Showing a Training Dataset That May Be Too Small Relative to the Validation Dataset

Unrepresentative Validation Dataset

An unrepresentative validation dataset means that the validation dataset does not provide sufficient information to evaluate the ability of the model to generalize.

This may occur if the validation dataset has too few examples as compared to the training dataset.

This case can be identified by a learning curve for training loss that looks like a good fit (or other fits) and a learning curve for validation loss that shows noisy movements around the training loss.

Example of Train and Validation Learning Curves Showing a Validation Dataset That May Be Too Small Relative to the Training Dataset

It may also be identified by a validation loss that is lower than the training loss. In this case, it indicates that the validation dataset may be easier for the model to predict than the training dataset.

Example of Train and Validation Learning Curves Showing a Validation Dataset That Is Easier to Predict Than the Training Dataset

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Books

- Deep Learning, 2016.
- An Introduction to Statistical Learning: with Applications in R, 2013.

Papers

- Learning curve models and applications: Literature review and research directions, 2011.

Posts

- How to Diagnose Overfitting and Underfitting of LSTM Models
- Overfitting and Underfitting With Machine Learning Algorithms

Articles

- Learning curve, Wikipedia.
- Overfitting, Wikipedia.

Summary

In this post, you discovered learning curves and how they can be used to diagnose the learning and generalization behavior of machine learning models.

Specifically, you learned:

- Learning curves are plots that show changes in learning performance over time in terms of experience.
- Learning curves of model performance on the train and validation datasets can be used to diagnose an underfit, overfit, or well-fit model.
- Learning curves of model performance can be used to diagnose whether the train or validation datasets are not relatively representative of the problem domain.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Reply AdrienKinart March21,2019at8:06pm # Iwouldhavesaidthattheerrorfromthetrainingsetshouldincreasetoconvergetotheerrorfromthevalidationsettoindicategoodfit.Whatdoyouthinkaboutthat?(https://www.dataquest.io/blog/learning-curves-machine-learning) Reply JasonBrownlee March22,2019at8:24am # Doesnothappeninpracticeinmyexperiencebecauseoftenthetest/valaresmallerandlessrepresentativethanthetrainandhavedifferenterrorprofile. Reply George April3,2019at6:22pm # HiJasonandthanksforthepost. IhaveonequestionnotrelatedwiththispostthoughandIwantedyouropinion. Lets’ssayIhaveIamtrainingsomedataandduringthepreprocessingIamcleaningthatdata.Iremovesomeweird/wrongvaluesfromit. Now,whenIamgoingtousethepredicttotheunseennewdata,doIneedtoapplythesamecleaningtothatdatabeforemakingtheprediction? Arethereanycaveatsfordoingornotdoingthis? IguessIshouldthesamecleaningbutitconfusesmethatwehaveunseendataanditcanbeanything.. (IamnottalkingaboutscalingorthatkindofpreprocessingwhichIalreadyapplytothetrainandunseendata) Thankyouverymuch! George Reply JasonBrownlee April4,2019at7:41am # Greatquestion. Yes,ifyoucanusegenericbutdomain-specificknowledgetoprepare/filterdata,thenitisagoodideatousethisprocessconsistentlywhenfittingandevaluatingamodel,aswellaswhenmakingpredictionsinthefuture. Theriskisdataleakage,e.g.usingknowledgeabout“unseen”/testdatatohelpbetterfitthemodel.Thismighthelp(andbeabittoostrict): https://machinelearningmastery.com/data-leakage-machine-learning/ Reply JG April3,2019at9:35pm # GreatpostJason.Tahnks. –Mysummary,thatIappreciateifyoucanevaluateifamIrightaboutallthisstuffis: overfittingappearswhenwelearnsomuchdetailsthatareirrelevanttothemainstreamideastobelearned(generalconcepts).Thiscanbethesituationwhenyouhave,ononesideaverybigcomplexmodel(withmanylayersandmanyweighttobeadjusted.i.e.withavery“hightentropicinformationcapacity”)andontheothersideafewamountofdatatobetrained…sothesolutioncouldbethesimplifythemodelorincreasedetraindataset. Ontheothersideunderfittingappearswhenweneedmoreexperience(moreepochs)totrainthemodel,solearningcurvestrendarecontinuallydown..untilyougettherightstabilizationwiththeappropriatesetofepochs… –Mysecondquestionitis,howdoyouinterpretthecasewhenvalidationdatagetbetterperformance(highlevel)thantrainingdata…isitagoodindicationofgoodgeneralization?. thankyouJasontoallowustoshareyourknowledge!! Reply JasonBrownlee April4,2019at7:56am # Yes,butyoucanunderfitifthemodeldoesnothavesufficientcapacitytolearnfromthedata.Thiscanbefromepochsorfrommodelcomplexity/size. Itisasignthatthevalidationdatasetistoosmallandnotrepresentativeoftheproblem–verycommon. Reply Jakub May21,2019at8:27pm # Greatpost! Thankyouverymuch. Reply JasonBrownlee May22,2019at8:04am # You’rewelcome,I’mhappyithelped. Reply TanujaShrestha January27,2020at4:53am # HiJason, SorryIaskedthisquestionoverLinkedIntoo.Postinghereagainsothateverybodycanhaveafoodforthought. IranaVGG16modelwithaverylessamountofdata-gotthevalidationaccuracyofaround83%. However,whenIpredictedforthetestdatasetIgotaroundonly53%accuracy.Ihadmydatadividedintotrain,valid,andtest.. Whatcouldgowronghere?Anyexplanationwouldbesohelpful.And,thankyouforthelearningcurvesblog.Wasindeedhelpful… Also,canyoumakepredictionsusingvalidationdata?Whatcouldgowrong/righthere? Reply JasonBrownlee January27,2020at7:09am # Perhapsthetestdatasetistoosmallornotreprensetativeofthebroaderdataset. Perhapstrya50/50split?orgetmoredata? Reply TanujaShrestha January27,2020at3:37pm # Thanks! 
Reply Pritam June29,2019at10:15pm # Sir,thoughissomethingofthetrackquestion,stillfeltlikeasking.HowcanI“mathematically”explainthebenefitofcenteredandscaleddataformachinelearningmodelsinsteadofrawdata.Accuracyandconvergencenodoubtimprovesforthenormalizeddata,butcanIshowitmathematically? Reply JasonBrownlee June30,2019at9:41am # Sorry,don’thaveagoodanswer. Reply Frank July4,2019at3:32am # Itiscorrecttocreatealearningcurvegraphusingthreesetsofdata(training,validation,andtesting).Usingthe“training”settotrainthemodelandusethe“validation”and“test”setstogeneratethelearningcurves? Reply JasonBrownlee July4,2019at7:52am # Typicallyjusttrainandvalidationsets. Reply Chen July5,2019at12:25pm # Thankyouforyourpost!!Ithelpsalot!!CouldyoupleasehelpmetocheckthelearningcurveIgot(http://zhuchen.org.cn/wp-content/uploads/2019/07/lc.png),isitunderfitted?It’samulti-classificationproblemusingrandomforest. Reply JasonBrownlee July6,2019at8:19am # Looksunderfit. Reply zeinab July22,2019at9:11am # Averygreatandusefultutorial,thankyou Reply JasonBrownlee July22,2019at2:02pm # Thanks. Reply zeinab July22,2019at10:54am # CanIaskaboutthemeaningof“flatline”incaseofunder-fitting? Reply JasonBrownlee July22,2019at2:05pm # Itsuggeststhemodeldoesnothavesufficientcapacityfortheproblem. Reply zeinab July23,2019at12:58am # Ifthelossincreasesthendecreasesthenincreasesthendecreasesandsoon.. Whatdoesthismeans? Doesitmeansthatthedataisunrepresentativeinthatmodel?or Doesitmeansthatanoverfittinghappens? Reply JasonBrownlee July23,2019at8:04am # Greatquestion! Itcouldmeanthatthedataisnoisy/unrepresentativeorthatthemodelisunstable(e.g.thebatchsizeorscalingofinputdata). Reply TanujaShrestha January27,2020at5:11am # HeyJason,Ihadthisproblemexactly.Whatdoyoumeanbythemodelbeingunstable–thebatchsizeandscaling?Canyouelaboratemore?Also,doesthisexplanationapplytoboth–trainingandvalidationdataset?Orjustone?Whichdatasetareyoureferringtobysayingthefluctuationinloss–trainingorvalidation? Thanks,andgreatpost Reply JasonBrownlee January27,2020at7:10am # Moreonbatchsize: https://machinelearningmastery.com/how-to-control-the-speed-and-stability-of-training-neural-networks-with-gradient-descent-batch-size/ Moreonscaling: https://machinelearningmastery.com/how-to-improve-neural-network-stability-and-modeling-performance-with-data-scaling/ Reply TanujaShrestha January27,2020at3:45pm # ThanksJason! Also– Iamtryingtotrain,anddevelopamodelwhichclassifiesimagesfromcameratraps. Fromyourexperience–whatwouldbethebestmodeltosolveacameratrapimageclassificationtoclassifywildanimals.Theanimalsasseenintheimagesareboar,deer,fox,andmonkey. Also,ifourmainobjectiveistodetectboarandnotboar–canImakedatasetlike–1000imageswithboar,andrest1000withalltheotheranimalscombinedwithmonkey,deer,andfox–ratherthangetting1000imagesforeachanimal Anysuggestionwouldbesonice,andthanksalways JasonBrownlee January28,2020at7:50am # Iwouldrecommendtransferlearning: https://machinelearningmastery.com/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/ Yesexactly.A“boar‘classandan“other”class. zeinab July23,2019at1:43pm # IusePearsoncorrelationcoefficientastheaccuracymetricforaregressionproblem. CanIusethecorrelationcoefficientastheOptimizationlearningcurve? Reply JasonBrownlee July23,2019at2:41pm # Considerusingr^2asyourmetricinstead? Reply zeinab July30,2019at4:07am # sorry,butwhatdomeanbyr^2? Reply JasonBrownlee July30,2019at6:23am # r-squaredorR^2: https://en.wikipedia.org/wiki/Coefficient_of_determination Reply jake July27,2019at3:28am # HiJason. 
Iposttwopicturesofmytrainingmodelhere https://stackoverflow.com/questions/57224353/is-my-training-data-set-too-complex-for-my-neural-network wouldyoubeabletotellmeifmymodelisoverfittingorunderfitting.Ibelieveitisunderfitting. howcanifixthisproblems? ThanksonceagainJaso,Youdontknowhowmuchyouhavehelpedme Reply JasonBrownlee July27,2019at6:12am # Thepostabovewillhelpyoudeterminewhetheryouareoverfittingorunderfitting. Iteachhowtodiagnoseperformanceandthenimproveperformancerighthere: https://machinelearningmastery.com/start-here/#better Reply zeinab August4,2019at11:40pm # canIaskyouabouttheneedfortheperformancelearningcurve? Iunderstandfromthistutorialthattheoptimizationlearningcurvesareusedforcheckingthemodelfitness? Butwhatistheimportanceoftheperformancelearningcurves? Reply JasonBrownlee August5,2019at6:53am # Whatdoyoumeanbyperformancelearningcurve? Reply zeinab August5,2019at12:23pm # performancelearningcurvethatrepresenttheaccuracyoverepochs Reply JasonBrownlee August5,2019at2:04pm # Isee,goodquestion. Theperformancecurvecangiveyouanideaofwhetherchangesinlossconnectwithrealtangiblegainsinskillontheproblem. Reply zeinab August4,2019at11:41pm # shouldIstoptrainingthemodelwhentheitreachestheminimumloss? Reply JasonBrownlee August5,2019at6:53am # Yes,onthevalidationset. Reply Zeinab August5,2019at8:22pm # IfIreachestheminimumvalidationlossvalue, However,thevalidationaccuracyvalueisnothigh. Inthiscase,HaveIstoplearning? Reply JasonBrownlee August6,2019at6:35am # Minimumlossis0,ifyouhitzerolossitsuggeststheproblemistrivial(MLisnotneeded)orthemodelhasoverfit. Reply zeinab August6,2019at11:16pm # Sorry,Iwanttosay,ifIreachaminimumvalidationlossvalue(not0)butatthisepochthevalidationaccuracyisnotthehighestvalue(afterthisepoch,thevalidationaccuracyishigher). Atthissituation,shouldIstoptraining? JasonBrownlee August7,2019at7:57am # Perhapstryitandsee. zeinab August5,2019at12:26pm # CanImeasurethemodelfitnessfromtheaccuracylearningcurvesinsteadofthelosslearningcurves? Reply JasonBrownlee August5,2019at2:04pm # Sure.Itjustmaynotbeashelpfulindiagnosinglearningdynamics. Reply zeinab August5,2019at10:50pm # whatdoyoumeanbylearningdynamics? Reply JasonBrownlee August6,2019at6:38am # Howthemodellearnsovertime,reflectedinthelearningcurve. Reply zeinab August5,2019at12:37pm # Isthereisaproblem,ifthelosscurveisastraightlinethatdecreasesovertheepochs? Reply JasonBrownlee August5,2019at2:04pm # Lossshoulddecrease. Reply zeinab August5,2019at12:38pm # Ifyouplease,Canyousuggestformeagoodreferencetoreadmoreaboutlearningcurves? Reply JasonBrownlee August5,2019at2:04pm # Yes,seethereferencesattheendofthepost. Reply Zeinab August5,2019at8:01pm # Doesthevalidationlossvaluemustbelowerthanthetraininglossvalue? Reply JasonBrownlee August6,2019at6:34am # Forawellfitmodel,validationandtraininglossshouldbeverysimilar. Reply zeinab August6,2019at4:22am # whichispreferredusing: –theearlystoppingor –analyzingtheoutputtofindtheminimumvalidationloss Reply JasonBrownlee August6,2019at6:41am # Itdependsonthemodelandonthedataset. Perhapsexperimentandseewhatisreliableforyourspecificscenario. Reply Zeinab August6,2019at11:19am # Whichispreferredusingearlystopwithlowpatencievalueorhighvalue Reply JasonBrownlee August6,2019at2:05pm # Itdependsonyourchoiceofmodelandthedataset.Perhapsexperiment? Reply Zeinab August6,2019at11:22am # IfIreachestheminimumvalidationlossvalue,whileatthisepochthereisagapbetweenthetrainingaccuracyandthevalidationaccuracy. Shouldistoplearningornot? 
Reply JasonBrownlee August6,2019at2:05pm # Maybe.Perhapstestthisstrategy. Reply zeinab August6,2019at11:19pm # WhyshouldIstopwhenIreachesaminimumvalidationlossandnotwhenIreachestheminimumgapbetweenthevalidationandtrainingloss? Reply JasonBrownlee August7,2019at7:58am # Tryarangeofapproachesandseewhatresultsinarobustandskillfulmodelforyourdataset. Ingeneral,youwanttostoptrainingwhenthetrainandvalidationlossislowestandbeforevalidationlossstartstorise. Reply JimPeyton August17,2019at12:12am # Greattutorial! Onthesecondgraphshowinganundertrainedmodel,itseemslikethevalidationdatalossshouldtrackhigherthanthetrainingdataloss,whichisdifferentthenwhatthegraphshows.Perhapsaneditingerror? Again,greatworkhere.Thanksforsharing. Reply JasonBrownlee August17,2019at5:48am # Noerror,thevalsetinthatcasewasperhapsunder-representative.Theimportantpointwastheshapeofthetrain/valcurvesshowingthatmoremeaningfultrainingisverypossible. Reply ChetanPatil September6,2019at5:23pm # HiJason,thisisaveryinformativepost.However,onequestionregardingthesectionUnrepresentativeValidationDataset:- Anunrepresentativevalidationdatasetmeansthatthevalidationdatasetdoesnotprovidesufficientinformationtoevaluatetheabilityofthemodeltogeneralize. Thismayoccurifthevalidationdatasethastoofewexamplesascomparedtothetrainingdataset. Myquestionis,ifyouhavemorevalidationexamples,say30%oftheentiredataset,thenwillthecurvesmooth-out? Or,thefaultisinthedistributionofthevalidationsetitself?(theval_datamightnotcontainthesamedistributionasthetrain_datacontained). IftheabovesentenceisnotacaseofUnrepresentedvalidationdataset,thenhowwouldthecurveslooklikewhenthevalidationdatadistributioniscompleteydifferentfromthetraining_dataset.Andwhataretheremediestocounter-actthisissue? Reply JasonBrownlee September7,2019at5:21am # Itdependsonthespecificsofthedataandthesizeofthedatasetyou’resampling. Agoodsolutionistogetmoredataandusea50/50split. Reply Hamed September7,2019at8:44am # VeryNice!Wouldappreciateifyouletmeknowwhichofthesemodelsisbetterwhenappliedtothesametraining/validationsets:theonethatproduceslowervalidationlossandalsolowertraininglossbutitsgeneralizationgapishigherthantheonewithhighervalidationandtrainingset.Igiveyouanexample: Model1:tr_loss=0.5val_loss=1.5gap=1 Model2:tr_loss=0.8val_loss=1.6gap=0.8 Thankyou! Reply JasonBrownlee September8,2019at5:09am # Generally,modelselectionisspecifictoaproject,myadvicewon’thelp. Itisagoodideatochooseamodelthatmeetstherequirementsofprojectstakeholders,typicallythisisgoodskillonaholdoutdatasetandlowcomplexity. Reply Hamed September8,2019at7:12am # Ihearyou! Thanks! Reply Felipe September21,2019at1:41pm # Howbadisthisnoise? https://imgur.com/sSL3DRJ Reply JasonBrownlee September22,2019at9:24am # Notsobad! Reply RadhouaneBaba September30,2019at12:59am # HiJason, CanthetrainingcurvebeusedtoassessamodelthatpredictsTimeSeries? Asiknow,wecannotuseCross-Validationfortimeseries,(Walk-forwardvalidation) sohowmeaningfulisittousethelearningcurve? Isexperience,trainingsize?orepochs? Reply JasonBrownlee September30,2019at6:12am # Yes,eachtimethemodelisfit,thelearningcurvecanbeainvaluablediagnosticintolearningbehavior. Reply RadhouaneBaba September30,2019at7:15am # Sotheexperienceistrainingsize? Howcanihaveamoretrainingsizeintimeseries?(Bygoingbackward(forexampleadd1dayeachtimeandappendingthelasttrainandtestloss?? Reply JasonBrownlee September30,2019at2:24pm # NotsureIfollow,sorry. Youcanhavemoredatatotrainatimeseriesmodelbyaddingmorehistory,ormoreinputvariablesmeasuredateachtimestep. Notsurehowthatisrelatedtolearningcurves? 
Reply RadhouaneBaba October1,2019at12:43am # asiunderstood,thex-axisinthelearningcurveisnottheepochnumbers, itisthesizeofourtrainingset,right? JasonBrownlee October1,2019at6:54am # No. Thex-axisofalearningcurveplotisepochs. AlexTagbo October10,2019at8:02pm # HiJason, Ihavebeenfollowingyourtutorialsforawhileandhasbeenveryhelpful!Thankyouverymuch! Myquestionisdirectedtotheunrepresentativevalidationdataset(thesecondgraph),whatremedialmeasurewouldyourecommendinthiscase,apartfromgettingmoredataetc? Canonealsoapplythedropouttechniqueoritisrestrictedonlyforoverfitting? Thanks! Alex.. Reply JasonBrownlee October11,2019at6:17am # Youcanusealargervalidationdataset,suchashalfthetrainingdataset. Moreonhowtoreduceoverfittinghere: https://machinelearningmastery.com/introduction-to-regularization-to-reduce-overfitting-and-improve-generalization-error/ Reply AlexTagbo October11,2019at6:12pm # Thanksforyourreply! Onemorequestionpleasebutnotrelatedtothissection,IobservedthatwhenIuseaparticularrandomseedgeneratoraslowasmaybe7-10toobtainreproducibleresultsinKerasandIchangetheseedagainfromavalueletsayabove30,Igetdifferentresultsincludingthegraphshape. Isthisnormal?OrdoIhavetoalwayssticktotheoriginalseedgiven? Thanksagain! Alex Reply JasonBrownlee October12,2019at6:50am # Yes,seethis: https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code Reply GerhardLemmer October11,2019at10:36pm # Thisisnormal.Deeplearners(andsomeclassicalMLalgorithms)arehighlystochastic.Thisiswhyyoushouldalwaysdomultipleexperimentsandgetstatisticsofyourresults.Forexampleaveragetraining-/validationlossandstdonthelossesovermultipleexperimentswiththesamehyperparameters(experimentsdifferingonlybyRNGseed).Here’resomeotherarticlesonthisblogaboutrandomnessinresults:https://machinelearningmastery.com/reproducible-results-neural-networks-keras/,https://machinelearningmastery.com/evaluate-skill-deep-learning-models/andhttps://machinelearningmastery.com/randomness-in-machine-learning/. Also:Somepseudorandomnumbergeneratorsdon’tworkwellwithsmallseeds.Soifyougetcertainresultswithmultipledifferentsmallseedsanddifferentresultswithsignificantlylargerseeds,thatmaybeanindicationthattheRNGusedbyyourlibrariesdoesn’tworkwellwithsmallseeds.Uselargerseedsinstead. Reply JasonBrownlee October12,2019at6:59am # Spoton! Exceptthestuffonseeds.Ithinkalllibsusegoodrandomnumbergeneratorsthesedays,theresultswithdifferentseedsareverylikely“lucky”andnotrepresentative. Reply AlexTagbo October14,2019at4:13pm # Ok,thathasansweredmyquestion! Thankyouverymuch! Alex Reply JasonBrownlee October15,2019at6:06am # Happytohearthat. Reply Mohammad October19,2019at1:09am # Thereissomethingverystrangegoingonwiththeseplots.Thetraininglossseemstobealwaysmuchhigherthanthevalidation’s.Buthowisthatpossible?Exceptforthecaseofunrepresentativedata,whenyoutrainamodelyouexpecttoseeamuchlowerlossonthetrainingset(wherethemodelparameterareoptimizedfortheset)versusthevalidationsetwherethetrainingmodelneedstogeneralize(theparametersarenotoptimizedforthisset). CheckoutAndrewNg’snoteshere:http://www.holehouse.org/mlclass/10_Advice_for_applying_machine_learning.html Thetraininglossisalways(exceptcornercases)lowerthanthevalidationset. Reply JasonBrownlee October19,2019at6:46am # Typicallypeoplewilluse30%orsmallerofthetrainingsetasavalset,whichmakesthelossonthatsetnoisy/unreliable. It’ssupercommon,sadly. A50%splitmightbemoreappropriateifthereissufficientdata. 
Reply Mohammad October21,2019at1:37pm # Ithinkthelossfunctionneedstobenormalizedbythesizeofthedataset.Thatis,have1/m_{trainingsize}whencalculatingthetraininglossfunctionand1/m_{cvsize}fortheotherset. Reply JasonBrownlee October21,2019at1:43pm # NotsureIagree. Reply OXPHOS October21,2019at8:26pm # HiJason, Thanksforthedetailedexplanation.Ithelpedalot.IamwonderingifIcouldtranslateitintoChinese,andrepostitonmyblog,withtheaddresstoyourpostannotated? Thanks! Reply JasonBrownlee October22,2019at5:45am # Pleasedonottranslatetheposts: https://machinelearningmastery.com/faq/single-faq/can-i-translate-your-posts-books-into-another-language Reply Shabnam October22,2019at5:29am # Iwaswonderingifyoucanclarifyonlossvaluesandboundaries.Inotherwords,whatdoeslossvalueofgreaterthan1mean? (withaccuracyoverepoch,allofthevaluesarebetween0and1–or0%and100%) Reply Shabnam October22,2019at5:32am # Ihaveoneanotherquestion.Basedonthispostloss-over-epochisinformativeintermsoffit.Howaboutaccuracy-over-epoch(accuracyoftrainandvalidationsets)? Reply JasonBrownlee October22,2019at6:00am # Typicallynotasuseful.Toocoarsegrained. Reply JasonBrownlee October22,2019at6:00am # Lossisrelativetoamodel/dataset. Irecommendinterpretingbroaddynamicsonly,notspecificvalues. Reply Shabnam October22,2019at9:28am # Thanksalotforyourexplanationandclarification. Reply JasonBrownlee October22,2019at1:45pm # You’rewelcome. Reply Shabnam October23,2019at4:15pm # Ihavesomecasesthatthelossplothasincreasingbehavioroverepoch.Ididnotseethisexampleinyourpost.Iwaswonderingwhichcategoryitbelongsto. Reply JasonBrownlee October24,2019at5:35am # Iftraininglossisincreasing,itisprobablyasignofoverfitting. Thereareexamplesofthisintheabovetutorial. Reply Adam November28,2019at2:51pm # HelloJason,IhaveimpementedaRNNandmyvalidationlossstartsincreasingafter2epochsindicatingthatthemodelprobablyisoverfitting.However,IcomparedtheevaluationresultsofPrecisionandRecallandarunon2epochsandon10epochsjustgivesmealmostsimilarresults. HowcanIinterpretthat?Doesitmeanthatthemodelconvergesin2epochsanddoesnotneedmoretraining?AndcanIarguethatitwouldbethebestpointtostopafter2epochseventhoughthevalidationlossincreasesafter2epochsandindicatesoverfitting? Thanks! Reply JasonBrownlee November29,2019at6:42am # Yes,yourreasoningseemsgood.Perhapstrysmallerlearningratestoslowdownthelearning? Reply Abeer December20,2019at10:29am # Howmuchofagapbetweenvalidationandtraininglossisacceptable? Reply JasonBrownlee December20,2019at1:07pm # Goodquestion. Assmallaspossible.Atsomepointitbecomesajudgementcall. Reply Abeer December21,2019at7:06am # ThanxJason. Reply ItaruKishikawa January24,2020at12:23pm # Howdoyougeneratethesegraphs?Also,foreachcase,whatparameterdoweneedtotune? Reply JasonBrownlee January24,2020at1:33pm # Youcangeneratelinegraphsinpythonusingmatplotlibandcallingtheplot()function. Seethisonreducingoverfitting: https://machinelearningmastery.com/introduction-to-regularization-to-reduce-overfitting-and-improve-generalization-error/ Reply GabrieleValvo January25,2020at7:09pm # Goodmorning,Ibuildaneuralnetworkinordertopredictaphysicalquantity(regressiontask),Iplottedthechart“trainingloss/validationlossvsepochs”,Icanseethat,atfirst,botharedecresingandthantheybecomeconstantbutthevalidationlossisalwaysjustbelowthetrainingloss(thisdifferenceisverysmall).Isitoverfitting?Ifboth(trainandvalloss)becameconstant(afterdecresing)isitimportanifonestayovertheotherorviceversa? IwouldliketosendyousomeplotbutIdon’tnowhowcanIdo. 
Reply JasonBrownlee January26,2020at5:16am # Asmalldifferencebetweenthelossvaluesmightmeanagoodfit. Reply GabrieleValvo January27,2020at2:25am # Thanksfortheanswer,Iuploadinagoogledrivefolder5lossfunctionchartwherethevallossisunderthetrainloss.Canyoucheckifitisacaseofoverfitting?BecauseI’mabitconfused.Thankyou!! Thisisthelink:https://drive.google.com/open?id=1sv1Qn9RhLRL7UXBgLOzFNWga5JHzJHCF Reply JasonBrownlee January27,2020at7:06am # Sorry,Icannot. Reply Joglas February25,2020at10:11am # HiJason, Thankyouforthepost.Incaseofanunbalancedclassificationprobleminwhichthetrainingdatasetwasresampled,weareverylikelytohaveachartliketheoneyouexplainedin“UnrepresentativeValidationDataset”asthevalidationdatasetisstillunbalancedrepresentingtherealworld.Inthisscenario,dowehavetoanalyzetheperformanceofthemodelinadifferentway? Thanks. Reply JasonBrownlee February25,2020at11:19am # Perhaps.Youcouldtryplottingametricyou’reusingforevaluationratherthantheloss. Reply Xuebo March3,2020at4:47pm # Thanks,thearticleisveryhelpful. ButIstillhavequestionabouthowyoudefineagoodfit.Yousaytherecouldbeasmallgeneralizationgap.ButhowIshoulddefinethe“small”? Igotacurveandthevalidationlossdecreasestoapointofstabilityaround0.06,whilethetraininglossisstablearound0.03.HowshouldIevaluateit? Reply JasonBrownlee March4,2020at5:50am # Goodquestion.Itisrelative,e.g.isthegaprelativelysmall,shrinking,stable. Reply David March10,2020at3:34pm # HeyJason,greatjobasalways. RegardingRolandFernandezreply,thefirstreplytothisarticle.Ihavebuiltsomemodelsandcompiledthemwith‘mse’lossandI’mgettingatthefirstepochavalueof0.0090,andatsecondavalueof0.0077,anditkeepslearningbutjustalittlebitperepoch,drawingattheendanalmostflatlineliketheoneontheFirstLearningCurve“ExampleofTrainingLearningCurveShowingAnUnderfitModelThatDoesNotHaveSufficientCapacity”.SoIwantyouropiniononthis. DoesthesemodelasRolandsayaren’trepresentativeofunderfittingduetothelowvalues,orareinfactunderfittingasyouestablishedinthearticle? Imostaddthattheobtainedpredictionswiththesemodelsareintheexpectedrange. Reply JasonBrownlee March11,2020at5:19am # Iflossstaysflatduringlearning,thatisodd.Itmightbethecasethattheproblemiseithertrivialorunlearnable–perhapstheformerinthiscasewhereanysetofsmallweightsproducesgoodpredictions.Justaguess,perhapsmoreresearchisrequired. Reply David March11,2020at7:51am # WhatdoyousuggestthatIshoulddothentodeterminethereliabilityofthismodels,oriftheyareanapplicablesolutiontotheproblem. Reply JasonBrownlee March11,2020at8:07am # Startbyselectingametricthatbestcapturestheobjectivesoftheprojectforyouandstakeholders. Thendesignatestharnessthatevaluatesmodelsusingavailabledata.E.g.formodestamountsofdataforregression/classification,userepeatedstratifiedk-foldcross-validation. Compareresultsusingthemeanofeachsampleofscores.Supportdecisionsusingstatisticalhypothesistestingthatdifferencesarereal. Usevariancetocommentonstabilityofthemodel.Useensemblestoreducethevarianceinfinalpredictions. Eachofthesetopicsiscoveredontheblog,usethesearchfeatureorcontactme. Learningcurvescanprovideausefuldiagnosticforasinglerunofasinglemodeltoaidintuningmodelhyperparameters. Reply David March11,2020at8:45am # Thanksverymuch Reply JasonBrownlee March11,2020at8:47am # You’rewelcome. Reply David March11,2020at12:40pm # Heyagain,theresultsofthelossIexplainbeforeareatfitallthesamplesineachepochforalmost100epoch. Thedatadimensionsareasfollows: inputs5395,23,1. outputs5395,23. 
AndeachsampleasIexplainedinotheroccasionscorrespondtothisformat: Inputs:________Outputs: 1,2,3___________4,5,6 2,3,4___________5,6,7 3,4,5___________6,7,8 Couldthisbecausingthatthelearningcurveisalmostflat?ShouldIbetrainingatbatch_size? Reply JasonBrownlee March11,2020at1:58pm # Perhaps,itishardtoknow. Maybeexploreothermodelarchitectures?otherlearningrates?otheroptimizers?etc. Reply Fatih March25,2020at12:46am # HiJason, ItriedtofinetuneCNNsfor14classimageclassification.Datasethas2000image.Eachmodelsproducedsimiliarlossvaluesrange0.1to0.4.Forexample: Bestepoch:20/50 train_acc:0.9268600344657898train_loss:0.27140530943870544 val_acc:0.9145728349685669val_loss:0.358508825302124 Doyouthinkmodelsaregoodforpublication,oragoodmodelhastolossvalueunder0.1? Reply JasonBrownlee March25,2020at6:34am # Icannotknowiftheresultsaregoodobjectively. Goodresultsarerelativetoanaivemodelandtoothermodelsonthesamedataset. Reply Fatih March26,2020at12:23am # 1-)CanitbesaidthatmymodelsarenotsufficientjustbylookingatthelossvaluesandshouldIdecreasemylossvaluesbelow0.1andincreasetheaccuraciesabove0.95? 2-)Orareval_acc(0.89~0.94)andval_loss(0.1~0.4)valuessufficientfor14classeswithhighsimilarity? Reply JasonBrownlee March26,2020at7:57am # Notreally,youcaninterpretthecrossentropyobjectivelyseethis: https://machinelearningmastery.com/cross-entropy-for-machine-learning/ Itismuchbettertoselectametricandcomparemodelsthatway: https://machinelearningmastery.com/faq/single-faq/how-to-know-if-a-model-has-good-performance Reply Bel April5,2020at5:35pm # HelloJason, IsthereanyrangewhichisconsiderdgoodfortheLossvalues(y-axis),say,thehighestlossvaluemustbeabovesomespecificvalue? Orthateachproblemhasit’sownrangeofvalues,whereonlytheshapeofthecurvesmatter? Thankyou Reply JasonBrownlee April6,2020at6:03am # Yes,youcaninterpetcross-entropy: https://machinelearningmastery.com/cross-entropy-for-machine-learning/ Generally,itisbettertocomparetheresultstoanaivemodel. Reply ENGİNSEVEN April14,2020at10:33am # Hello,Jason.ImetYourWebsitetwoweeksago.Youinspiredme.I’dwanttomeetyouandshakeyourhandandthankyou.Pleasedon’tstopwriting. İstanbul.. Reply JasonBrownlee April14,2020at10:39am # Thanks! Reply shivanAB April14,2020at10:48pm # HelloSir whatifiobtainahighvalidationaccuracy,butthecurveisnotsmooth? whatisthereasonsofthat? thanks Reply JasonBrownlee April15,2020at7:59am # Perhapsthedatasetissmallorthemodelhashighvariance. Reply ShivanAB April15,2020at9:36am # Soisitbadornot?Ifyes,howcanIfixthisissue? Formycase:iusealexnetmodelwith1GBof.dicomfile(1000.dicom)dataset,dividedinto2classes. Thankssir. Reply JasonBrownlee April15,2020at1:21pm # Itisonlygoodorbadrelativetootherresultsthatyoucanachieveonyourdataset,e.g.relativetoanaivemodel. Reply Arkesha June16,2020at2:59am # whatisgeneralizationerror?isitagapbetweentrainingandvalidationloss? Reply JasonBrownlee June16,2020at5:43am # Generalizationerroristheerrorthemodelmakesondatanotusedtotrainthemodel.Erroronnewdata. Reply Sarthika June22,2020at3:12am # Hi,ImnotclearaboutwhetherlearningcurvecanbeusedasaccuracymetricforLSTM?Canweuselearningcurveonanypredictivemodelirrespectiveofthepredictionalgorithmused?Whataccuracymetricisbestfordeeplearningalgorithms? 
Reply JasonBrownlee June22,2020at6:17am # Yes,seethis: https://machinelearningmastery.com/diagnose-overfitting-underfitting-lstm-models/ Thiscanhelpwithchoosingametricforclassification: https://machinelearningmastery.com/tour-of-evaluation-metrics-for-imbalanced-classification/ Reply Jay June24,2020at5:53am # Thisarticleisveryhelpfulalongwithchart.Itwillbeniceifthishaspythoncodefordetailunderstandingsothatcode&chartcangoside-by-side. Isitpossibleyouprovideexamplewithcode??? Reply JasonBrownlee June24,2020at6:40am # Howwouldthecodehelpininterpretingtheplots? Reply Abs June28,2020at9:16am # HiJason, Ihaveaquestionforyou.Thisisnotrelatedtothispost. ImdoingasmallresearchprojectbasedonDeepLearning.i’mtryingtopredicttheratingsthatauserwillgivetoanunseenmovie,basedontheratingshegavetoothermovies.I’musingthemovielensdataset.TheMainfolder,whichisml-100kcontainsinformationsabout100000movies.Tocreatetherecommendationsystems,themodel‘StackedAutoencoder’isbeingused.I’musingPytorchforcodingimplementation. Isplitthedatasetintotraining(80%)setandtestingset(20%).MylossfunctionisMSE.WhenIplotTrainingLosscurveandValidationcurve,thelosscurves,lookfine.Itsshowsminimalgapbetweenthem. ButwhenIchangedmylossfunctiontoRMSEandplottedthelosscurves.Thereisahugegapbetweentraininglosscurveandvalidationlosscurve.(epoch:200trainingloss:0.0757.Testloss:0.1079) Inmycode,Ionlychangedthelossfunctionpart(MSEtoRMSE).IappliedtheRegularizationtechniquessuchasBatchNormalizationandDropoutbutstillthereisabiggapbetweenthecurves. I’mnewtodeeplearning,butdoyouknowwhatsthereasonwhythereishugegapbetweenthecurveswhenapplyingRMSE? IsitsomethingtodowiththeEvalautionmetricorsomethingwronginthecodingpart? Thanks. Reply JasonBrownlee June29,2020at6:26am # Irecommendusingmseloss,butperhapscalculatemetricsforrmse,e.g.don’tusermsetotrainthemodelbutonlytoevaluatethepredictions. Reply Abs June29,2020at9:28am # HiJason. Thanksforyourfeedback. SoIonlyuse‘RMSE’(LossFunction)fortestingtheModel? Andfortrainingthemodel,Ileaveoutthelossfunctionpartoruse‘MSE’aslossfunctionfortrainingthemodel? Reply Abs June29,2020at10:27am # https://towardsdatascience.com/stacked-auto-encoder-as-a-recommendation-system-for-movie-rating-prediction-33842386338 Myprojectisbasedonthis.(Clickthelink). Reply JasonBrownlee June29,2020at1:24pm # Sorry,Igetsent100soflinks/code/dataeachweek. Idon’thavethecapacitytoreviewthirdpartystuffforyou: https://machinelearningmastery.com/faq/single-faq/can-you-explain-this-research-paper-to-me JasonBrownlee June29,2020at1:20pm # UseRMSEasametric.DonotuseRMSEasalossfunction(e.g.donotminimizermsewhenfittingthemodel),useMSE. Reply Abs July1,2020at9:51am # ThanksJason. Iwilltrythat. Bytheway,Ihavealistofquestionsforyou. I’mstillnewtoDeepLearningandI’mconfusedwiththeterminologiesofValidationLossandTestLoss.Aretheythesameorcompletelydifferent? Andalsoyoucan’ttrainthemodelonthetestdata? Isitonlyreservedfortesting(evaluatethepredictions)? Iknowyoucan’treviewmydata,butwhenIaddedthevalidationlosstomycode,Ireusedthetrainingloopandremovedthebackwardandoptimizer.step()calls.MymetricforthatisMSE.IassumedthatvalidationlossisthesameasTestloss.ButImaybewrong. Iliketohearyourfeedbackonthis. 
JasonBrownlee July1,2020at11:22am # Yes,wecancalculatelossondifferentdatasetsduringtraining,suchasatestsetandvalidationset,seethemdefinedhere: https://machinelearningmastery.com/difference-test-validation-datasets/ Afterwechooseamodelandconfig,wecanfitthefinalmodelonallavailabledata.Wecannotfitthemodelontestdatainordertoevaluateitasthemodelmustbeevaluatedondatanotusedtotrainittogiveafairestimateofperformance. Abs July4,2020at10:48am # ThanksJason. NowIunderstandtheconceptofValidationandTrainingsets. Inmyminiproject,i’mpredictingtheratingsthatauserwillgivetoanunseenmovie,basedontheratingshegavetoothermovies.Themodel,i’musingisStackedAutoencoder. Formyanothertask,IwanttocomparewithotherDeepLearningmodels.ForinstanceIwanttouseMLP(Multilayerperceptron)orLogisticRegression(MachineLearningModel).Isitpossibletoemploythosemodelsformovieratingpredictionfrom0to5? Thanks. Reply JasonBrownlee July5,2020at6:52am # Yes. Reply Aaron July14,2020at12:04am # I’mbuildingaLSTMmodelforprediction.Thevalidationerrorcurveisflat,validationmseislessthantrainingmseintheend.val_loss=0.00002,training_loss=0.013533. IreadyourarticlecarefullybutI’mnotsurewhethermyvalidationsetisunrepresentative.ShouldIexpandmyvalidationset? Hereisthechartandproblem: https://stackoverflow.com/questions/62877425/validation-loss-curve-is-flat-and-training-loss-curve-is-higher-than-validation Thanks. Reply JasonBrownlee July14,2020at6:28am # Itmaybethecasethatyourvalidationsetisnotrepresentativeoftrainingortoosmall. Reply QUANGHUYCHU July21,2020at9:46pm # HiJason. ThanktoyourpostIknowwhatisUnder,OverandGoodfit. IamalsocurrentlyasmallANNmodel(95input,3classesoutput,2hiddenlayerswith200nodesand30nodesrespectively). Mydatasetissmalldataset(105sampleswith95featuresaseachsamples)withshape(105,95).IsplitmydataintoTraindata(80samples),Validationdata(10samples)andTestdata(15samples). MyquestionisItriedtotrain,validateandpredictmymodelfor10times.forabout7or8timesIobservedaGoodfit(Train-ValidationAccuracyandLossGraph)andother3or2timesIgotOverfitting.Isthisphenomenonisalright?andalthoughitsOverfillingthepredictiononTestdataquitegood(over85%). Thankyouverymuchforyourhelp. Reply JasonBrownlee July22,2020at5:31am # Perhapsyoucanchangetheconfigurationsothemodelismorestableonaverage. Reply QUANGHUYCHU July22,2020at10:13am # HiJason.Thankyouforyourreply. Theconfigurationhereyoumeanisthehyperparameters(likenumberoslayer,nodesortraintestsplit,etc,..)right? Reply JasonBrownlee July22,2020at1:40pm # Correct. Reply Jay July23,2020at5:01am # DOwehaverealworldexampleonlearningcurves???? Thatwillbemuchbettertounderstand&howtoplotit. Reply JasonBrownlee July23,2020at6:26am # Yesmany–searchtheblog,perhapsthiswillhelp: https://machinelearningmastery.com/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/ Reply nkm July23,2020at4:47pm # HiJason, thanksforyourgreatsupport. Iwouldliketoaskpossiblereasonsforthezigzag/crowdyvalidationcurveovertrainingepochsandalso,howcanIminimise/mitigateit.Generally,trainingcurvechangessmoothlybutvalidationcurvenot.Guidanceplease. Reply JasonBrownlee July24,2020at6:23am # Itmightbethecasethatthevalidationsetistoosmalland/ornotrepresentative. Reply Julia August5,2020at3:57am # HiJason, Isthereanywaytoattributethesebehaviorstomodelarchitecture/hyperparametersettingsratherthanthetraining/validationdatadistributions?ThereasonIaskisthatIhaverunahyperparametersearchwiththeexactsametraining/validationdataandachievedmodelsthathavetraining/validationcurvesthatlooklike3oftheaboveexamplesthatyougive(ifIcouldembedimageshereIwould). 
Model1:Curvesappearliketheexampleyougivefor“UnrepresentativeTrainDataset”,Model2:appearsliketheexampleyougivefor“UnrepresentativeValidDataset”,andModel3:appearslikethe“validationdatasetmaybeeasierforthemodeltopredictthanthetrainingdataset”examplethatyougive. Haveyougotanyintuitionaboutthis?Itwouldbeappreciated. Thanksforyourblog,I’vereferenceditnumeroustimes! Reply JasonBrownlee August5,2020at6:19am # Thelearningcurvesareimpactedbythestructureofthemodelandconfigurationofthelearningalgorithm,thedatahasmuchlesseffect–ifpreparedcorrectly. Here“unrepresentative”meansyoursampleistoosmall. Reply chouchou August11,2020at8:27pm # Hello! Thispostisveryinteresting,thankyouforthat.However,IhaveaquestionconcerningatrainingIdid.I’mnewindeep-learning,andIusedacodethatwasalreadywritten.Ididn’tsucceedinplotingthecurveforvalidation(IthinkyoumeanwhatIwouldcall“test”).Ionlyhavethelosscurveoftraining+validation,butnottheonefortest.Itrainedmyneuralnetworkon50epochs,andIonlyknow: -theintermediateaccuracyvaluesforvalidation(nottest)(aftersavingweightsaftereach5epochs) -thevalueofaccuracyaftertraining+validationattheendofalltheepochs -theaccuracyforthetestset. Ihaveanaccuracyof94%aftertraining+validationand89,5%aftertest.Concerninglossfunctionfortraining+validation,itstagnesatavaluebelow0.1after35trainingepochs.Thereisatotalof50trainingepochs. Istheonlylittledifferencebetweenaccuracyoftraining+validationandtestsufficienttosaythatmynetworkdoes’ntoverfitt? Reply chouchou August11,2020at8:30pm # Iwantedtosay“itstagnatesatavalue…” Reply JasonBrownlee August12,2020at6:10am # Thanks! Noproblem,usevalinsteadoftest. Iftheholdoutdatasetistoosmalltheresultswillbeunstable. Reply chouchou August12,2020at6:30am # Thankyouforyouranswer.Idon’tunderstandwhyyousay“Usevalinsteadoftest.”Infact,theonlythinkIcandowithmycodeis: -drawaccuracycurveforvalidation(theaccuracyisknownevery5epochs) -knowingthevalueofaccuracyafter50epochsforvalidation -knowingthevalueofaccuracyfortest Reply Michelle August15,2020at12:13am # HiJason, thanksagainforthearticle. Duringthewholedeepnetworktraining,bothofvalidationdatalossandtrainingdatalossreducesalongwiththeincreaseoftheepochs.Butthereductionofvalidationdatalossismuchsmallerthanthereductionoftrainingdataloss,isitnormalandrepresentative? whenepochissmallfrom0,thecurveoftrainingdatalossstartshighandreducesalongtheepochs,butthevalidationdatalosscurvestartsalreadysmallandthenreducesalongtheepochsslightly. Thankyou. Reply JasonBrownlee August15,2020at6:30am # Maybeyourvalidationdatasetistoosmall? Reply Michelle August17,2020at2:35am # Thankyou,Jason,Ihavetriedtogetmoresamplesfromtrainingdatatovalidationdatatoincreasethevalidationdatasamplesize,stillthelearningcurveshowsthatalthoughbothvalidationdatalossandtrainingdatalossreducesalongwithepochs,butthereductionofvalidationdatalossismuchsmallerthantrainingdata,finally,theloss(mse,standardiseddatawithmean0andstd1)oftrainingdatais0.25whilethelossofvalidationdatais0.41,isitstilloverfitting? Differentliteraturealwayssaysgoodfitisthatthevalidationlossisslighterhigherthantrainingloss,buthowhighisslightlyhigher,couldyoupleasegivesomehint? Thankyouasalways. Reply JasonBrownlee August17,2020at5:49am # Nicework. Perhapstryslowingdownthelearningwithanalternatelearningrateoraddingregularization. Ifthebehaviorremainsstubbornlythesame,perhapsyouarereachingthelimitsofyourchosenmodelonyourdataset. 
Reply Chouchou August26,2020at6:54pm # Thankyouforyouranswerofthe12thofAugust.ButI’mstillnotsuretounderstand.Inthisarticle(ofthispage),whatisforyou“training”and“validation”?Hasitthesamemeaninglikeinthisarticle?:https://machinelearningmastery.com/difference-test-validation-datasets/ Ihaveresults(F1score,precision,recall,…)afterthelastvalidationformyneuralnetwork.Ihavealsoresultsafterusingtestsettoevaluateperformancesofneuralnetworksonnewimages.Theresultsonthevalidationsetandthetestsetareslightlydifferent(4,5%differenceonaccuracy),theaccuracyonthetestsetarealittleworser(of4,5%).Isthiswhatwecall“generalizationgap”?Whytheresultsontestsetarelittleworser(of4,5%)? Thankyouforyourhelp Reply JasonBrownlee August27,2020at6:13am # Yes,youcanexpectsmalldifferencesinperformancefromdifferentsamplesofdata,perhapsthiswillhelp: https://machinelearningmastery.com/different-results-each-time-in-machine-learning/ Reply Chouchou August28,2020at1:35am # Thankyouverymuchforyouranswer.Thisotherarticleisveryinteresting.Inmycase,Iuseaneuralnetworkforsemanticsegmentation(SegNet).Afterthelastvalidation(=resultsonthevalidationdatasetforfinalmodel),Igot91,1%accuracy.Usingthisfinalmodelonthetestdataset,Igot85,9%.Fromyourarticle“Differentresultseachtime…”,IsupposeIcanexplainthisdifferencebyahighvarianceofmymodel(thevalidationdatasetandthetestdatasethavedifferentimages,3imagesof5000*5000pixelsforeach).Isitright?Inyourarticleyouseemtospeakaboutvarianceonlyfortrainingdata,soI’mnotsureofmyassumption. Thankyouforyourhelp Reply JasonBrownlee August28,2020at6:50am # Nicework! Yes,varianceinthefinalmodeliscommon,whichcanbeovercomebyusinganensembleoffinalmodels: https://machinelearningmastery.com/ensemble-methods-for-deep-learning-neural-networks/ Reply sezar September1,2020at11:32pm # Hi,Jason,thanksforthispostandyourblog!I’verecentlystartedmyML/DLjourneyandIfoundyourblogextremelyhelpful. Ihaveaquestionabouttrain/valloss.Whatifamodellearnsonlyduringfirstniterationsandthenthelossandaccuracyreachaplateauduringtheveryfirstepoch,andthevallossafterthatfirstepochishuge?I’musingAdamwithdefaultparameters. Reply JasonBrownlee September2,2020at6:29am # Stoptrainingwhenthemodelstopslearning.Perhapstryalternateconfigurationsofthemodelorlearningalgorithm. Reply Tethys September15,2020at9:16am # Forthe3rdfigure,itisclearlyanoverfittingphenomenon.Butisitharmfultocontinuetraining?Causethecontinuingtrainingdidn’tincreasethevalidationlossanyway,atleastfornow.ThereasonIaskedisIhaveseenonespecificbehaviorthatthemodelisovetfittedlikethe3rdfigure,butboththeAccuracyandIoUforvalidationsetstillincreaseifthetrainingprocesscontinues.Whatdoyouthink? Reply JasonBrownlee September15,2020at2:50pm # Thethirdfiguretitled“ExampleofTrainandValidationLearningCurvesShowinganOverfitModel”showsoverfitting. Continuedtraininginthiscasewillresultinbetterperformanceonthetrainingsetandworsegeneralizationerrorontheholdoutsetandanyothernewdata. Thebehaviouroflosstypicallycorrespondstoothermetrics.Butgoodpoint,perhapsplotthemetricyouintendtousetochooseyourmodel. Reply ayesh November3,2020at4:47pm # Whatcouldbepossiblydoneasimprovementsinthecaseofanunrepresentativetraindataset?(ifIdonothavetheoptiontoincreasethedataset) Reply JasonBrownlee November4,2020at6:35am # Yourmodelwillonlybeaseffectiveasyourtrainingdataset. Perhapstryoversampling,suchassmote. Perhapstrydatacleaningtomakethedecisionboundarymoreclear. Perhapstrytransformstofindamoreappropriaterepresentation. Reply Kodjovi November13,2020at12:19am # Hi,Nicearticle.Ihaveaquestionthough. 
Whatisthedifferencebetween: –aMLLearningcurve(asdescribedhere)and –alearningcurvetheoryasagraphicalrepresentationoftherelationshipbetweenhowproficientsomeoneisatataskandtheamountofexperiencetheyhave)https://en.wikipedia.org/wiki/Learning_curve Thanksforyourtime Reply JasonBrownlee November13,2020at6:33am # Thanks! Norelationship. Reply TanujaShrestha November19,2020at9:07pm # HiJason, Whatisyoursuggestiononthemodellearningcurveshaving–loss0andaccuracy1onthefirstepochitself? Also,whataretheprobablereasonsforthis? Anylinkwherethisquestionisaddressed? Thanksalways. Reply JasonBrownlee November20,2020at6:45am # Itsuggestsatrivialproblemthatprobablydoesnotneedmachinelearning: https://machinelearningmastery.com/faq/single-faq/why-cant-i-get-100-accuracy-or-zero-error-with-my-model Reply MarlonLohrbach December22,2020at3:58am # HelloJason, Ihaveaquestionregardingmylearningcurves.Iwantedtopostmyquestiononstat.stackexchange,butIhaveafeelingthatIcantrustyoumore…. 1.)Ihaveadatasetwith23.000entriesandihaveabinaryclassificationtask.Thetargetvariableisdistributedlike87%vs13%.XGBClassiferperformsbestonmydataandresultsinanalloveraccuracywith97.88%. Mycurvelookslikethis: https://ibb.co/NsnY1qH AsyoucanseeIamusingLoglossforevaluation.Myinterpretationisthatitdoesn’tover-orunderfitthedataandthatIamgoodtogo. 2.)Ihavearegressiontaskforthelast13%ofthedata(positivesamples)andIhavetopredictthedifferentcontractvalues. Mylearningcurvelookslikethis: https://ibb.co/MnZbB15 MyinterpretationhereisthatIneedmoredatatomakeagoodprediction.Thecontractvaluesrangefrom0to200.000$anddistributionissuperskewed… Thanksasalwaysforallyoursupport! Marlon Reply JasonBrownlee December22,2020at6:50am # Itrytoavoidinterpretingresultsforreaders,sorry. Perhapsexploreadditionalmodels,configs,dataprepstoseeifyoucanachievebetterresults,otherwiseperhapsyuhavehitthelimitforyourdataset. Reply Marlon December22,2020at5:11pm # SorryIdidn‘tknowthatandthankyou Reply FelipeAraya January27,2021at11:04am # Excellentpost,veryinformative.Justhaveacoupleofquestionsifyoudon’tmindplease: 1.Whenyourefertovalidationset,youactuallymeanvalidationsetfroma3splitdataset(Train/validation/test)?itisjusttomakesuresinceinsomeplacestheycallthetestset,thevalidationtest. 2.Isthereanycodeavailablethatwecanusetoreplicatethechartsthatyoushowed?(itwouldbemuchappreciated) 3.PleasecorrectifmeIamwrong,ifIwastodoanestedcrossvalidation,IthinkthatIwouldn’tneedtodolearningcurvessinceIamalreadyarrivingtothebestpossiblemodelperformanceandgeneralization,theoretically(givenasufficientamountofdata,therightnumberofiterationsandfeatures,andtherighthyperparametervalues).So,inmymindbyusingnestedcrossvalidation,thereisn’tanythingelsethatIcouldhavedonetoreduceoverfitting,hencemakinglearningcurvesunnecessary,right? Reply JasonBrownlee January27,2021at1:22pm # Thanks! Correct,validationsetasasubsetofthetrainingset: https://machinelearningmastery.com/difference-test-validation-datasets/ Yes,Ihavetonsofexamplesontheblog,usethesearchbox.Perhapsstarthere: https://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/ Correct,learningcurveisadiagnosticforpoormodelperformance,nothelpfulformodelselection/generaltestharnesslikenestedcv. Reply Vaishnavi February12,2021at12:07am # HiJason, Ifthedatasethas3ormorefeaturesX1,X2,…andIwanttoplotagraphofoutputvariationvsallthefeaturesX1,X2,…,howshouldIdothat?WhatwouldbeitssignificanceinML? Thankyou Reply JasonBrownlee February12,2021at5:46am # Perhapspair-wisescatterplots,oneforeachpairofvariables. 
Vaishnavi, February 12, 2021 at 7:26 am:
Thank you so much for the help. I looked through this article: https://machinelearningmastery.com/visualize-machine-learning-data-python-pandas/

Tanuja Shrestha, February 12, 2021 at 9:17 pm:
Hi Jason, my model's loss curves, for both train and test, look okay; however, both the training and testing accuracy are at 100% from the first epoch. What should I do? Any suggestions? Always thank you!

Jason Brownlee, February 13, 2021 at 6:06 am:
This is a common question that I answer here: https://machinelearningmastery.com/faq/single-faq/what-does-it-mean-if-i-have-0-error-or-100-accuracy

Beny, March 31, 2021 at 11:36 pm:
Hello, I would be grateful if you could diagnose my learning curves here: https://imgur.com/a/z5ge9QI The accuracy I got is 97%, but I don't know whether the model is overfitting or underfitting based on the learning curves. Thank you.

Jason Brownlee, April 1, 2021 at 8:19 am:
Sorry, I avoid trying to interpret results for readers. Instead, I provide general advice so you can interpret results yourself.

Gaken, April 28, 2021 at 1:36 pm:
Hi, thanks for the informative post! What if, after each run of my Python app, the learning curves it generates look too different from each other? Sometimes the validation curve is very noisy and sometimes it converges with the training curve. Does this say something about the dataset, or is it just that my code for generating the curves is wrong? Thank you!

Jason Brownlee, April 29, 2021 at 6:23 am:
You're welcome. Good question; this may help: https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code

David Espinosa, June 2, 2021 at 2:38 pm:
Hello Jason, thanks for the tutorial; nothing like refreshing the basics. I have often seen behaviour similar to the graph labeled "Example of Train and Validation Learning Curves Showing a Validation Dataset That Is Easier to Predict Than the Training Dataset". We should increase the size of the validation set and reduce that of the training set, right? Putting some figures on that example: if I obtained that figure with a split of 80% train and 20% validation, would a good approach for a better fit be to try 70%-30%? I'd love a "based-on-experience" reply here, because "trial and error" can sometimes take hours… I have always tackled that issue by using callbacks, but maybe I'm limiting the learning capability of my model, so this could be the right moment to realize something I have (probably) been doing wrong the whole time… Thank you and best regards.

Jason Brownlee, June 3, 2021 at 5:29 am:
Thanks. I go for 50-50 quite often… then repeat the experiment a few times to average the results.

Ibtissam, June 10, 2021 at 11:16 pm:
Hello sir, my problem is regression and I have two models. When I plot the first model, it gives a good fit but the RMSE value is not good. When I plot the second model, the test loss curve sits below the train loss curve, with a gap between them much like the "Unrepresentative Validation Dataset" case (the train loss decreases and is stable), but with a better RMSE value than the first model. I have 191,981 samples for training and 47,996 samples for testing. Is the second model correct?

Jason Brownlee, June 11, 2021 at 5:15 am:
Perhaps test a suite of different models and use the one that gives the best performance on your specific chosen metric.

Sylvia, June 16, 2021 at 5:35 pm:
Thanks for the informative article, Jason. May I please know any possible solutions to the unrepresentative validation dataset problem? I am applying it to an ECG problem where different patients have different cardiac cycle patterns. So even though there are about 4,000 normal training patterns to learn from, they all look different because of the inherent nature of the problem itself (i.e. some difference in ECG pattern for each patient). Thanks.

Jason Brownlee, June 17, 2021 at 6:14 am:
You could try using a larger dataset for validation, e.g. a 50/50 split of the training data. I'm not sure how validation sets work for time series; it might not be a valid concept there.
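For the "repeat the experiment a few times and average the results" advice above, a minimal sketch along these lines, assuming scikit-learn, synthetic stand-in data, and logistic regression as a placeholder model:

```python
# Repeat a 50/50 train/validation split several times and average the scores,
# so that a single lucky or unlucky split does not drive the conclusion.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=1)  # stand-in data

scores = []
for seed in range(10):  # ten repeats, each with a different random split
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.5, random_state=seed)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores.append(accuracy_score(y_val, model.predict(X_val)))

print('accuracy: mean=%.3f std=%.3f' % (np.mean(scores), np.std(scores)))
```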
Sylvia, June 24, 2021 at 3:52 am:
Okay, thank you.

Sylvia, June 25, 2021 at 10:58 am:
Hello Jason, I always get loss: 0.0000e+00 – val_loss: 0.0000e+00 starting from epoch 1 of model training, and hence a learning curve that is a straight line at 0. Do you have any advice on possible reasons for this model behaviour that could be tuned? Thanks.

Jason Brownlee, June 26, 2021 at 4:51 am:
It may suggest that your problem is easily solved or trivial, e.g.: https://machinelearningmastery.com/faq/single-faq/what-does-it-mean-if-i-have-0-error-or-100-accuracy

Bill, June 23, 2021 at 11:26 pm:
Hello, is this overfit? https://ibb.co/Z6nrXM4 Thank you very much.

Jason Brownlee, June 24, 2021 at 6:02 am:
Sorry, I try to avoid interpreting results for readers.

Sylvia, June 30, 2021 at 2:00 am:
Okay, thank you very much for the reference.

Jason Brownlee, June 30, 2021 at 5:21 am:
You're welcome.

puneet sai, August 12, 2021 at 3:55 am:
https://docs.google.com/document/d/1Va__vfW7JaXSLOsRuC5mXX4T1333AUPI/edit?usp=sharing&ouid=107190645093315861813&rtpof=true&sd=true
I wanted to ask what the best practices are for finding the inflection point. In the above learning curve, we can see that the loss continues to decrease but val_loss has a bump. With val_loss around 0.01 (epochs 0–20), point A as the inflection point gives closer predictions. Do people use the % decrease in loss and % increase in val_loss over the same interval to identify the inflection point? I earlier used inflection point B, between epochs 40–60, when val_loss was around 0.02, but that gives a large prediction error. Then I observed that between epochs 15–50 (these are approximate) there was an 8% decrease in loss versus a 100% increase in val_loss. Would that be sufficient criteria to stop training and choose point A as the inflection point? Thx.

Adrian Tam, August 12, 2021 at 6:09 am:
It is normal to see the loss keep decreasing as you train, while the validation loss may go up after a while. That is where overfitting starts. See the post on early stopping to learn more: https://machinelearningmastery.com/how-to-stop-training-deep-neural-networks-at-the-right-time-using-early-stopping/

puneet, August 13, 2021 at 4:27 am:
Unless I didn't understand early stopping and "best model" correctly, I think the algorithm below will give the best epoch, and I don't think either of them gives it. For an epoch to be the best epoch, its loss should be the minimum across all epochs AND its val_loss should also be the minimum. For example, if the best epoch has a loss of 0.01 and a val_loss of 0.001, there is no other epoch where loss <= 0.01 and val_loss < 0.001. The "best model" only takes val_loss into account in isolation; it should be in coordination with the loss. So we would need to implement the above algorithm to get the best epoch, because not all learning curves are smooth; they have bumps. I'm not sure early stopping helps here to get exactly that best epoch. Thoughts?

Adrian Tam, August 13:
From the Keras documentation on the EarlyStopping module …
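As a concrete footnote to the early stopping discussion, here is a minimal sketch of Keras's EarlyStopping callback with best-weight restoration; the tiny model and synthetic data are placeholders, not anyone's actual setup.

```python
# Early stopping: halt training once val_loss stops improving, then
# restore the weights from the best-scoring epoch.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tensorflow import keras

X, y = make_classification(n_samples=1000, random_state=1)  # stand-in data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=1)

model = keras.Sequential([
    keras.layers.Dense(32, activation='relu', input_shape=(20,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Stop after 10 epochs without improvement in val_loss; keep the best weights.
es = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                   restore_best_weights=True)
history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                    epochs=500, callbacks=[es], verbose=0)
print('stopped after %d epochs' % len(history.history['loss']))
```

Note that restore_best_weights selects the epoch with the best monitored value (val_loss alone), which is exactly the behaviour the comment above questions; combining train and validation loss would require a custom callback.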
onur:
Hello, I am trying to build a 3D CNN regression network. My input data is … The validation loss increases slightly, for example from 0.016 to 0.018, but it starts at a very small number even in the first epoch. What should I do? Thanks for the reply.

August 14:
The validation loss value depends on the scale of the data. The value 0.016 may be OK.

hackercop, September 5:
Sir, these are the results from my model: https:…

Jason Brownlee, September 6:
Looks like the validation set is small.

sam, October 11:
Hello Dr. Jason, my training dataset is 30,000 images, with 5,000 for testing. I got a plot like https:… How can I solve this problem?

October 13:
The loss difference between training and validation is not very big. Is that a problem?

bruce, December 24:
Hi …

James Carmichael, January 10:
Hi Bruce …

Aggelos Papoutsis, January 13:
Hi all, I am trying to understand this learning curve of a classification problem, but I am not sure what to infer. I believe that I have overfitting, but I cannot be sure. On the other hand, I am confused. Can you please provide me with some advice?

January 14:
Hello Aggelos …

Jurrian, January 26:
How does the second example curve of underfitting work?

February 2:
Hi Jurrian …

Belal, March 8:
What are the types of learning curves in health, and what is the difference between the past and the present?

March 9:
Hi Belal …

Dion, March 13:
…

Reply: Hi Dion …

Wyatt, March 29:
Hi James, I have actually encountered a situation that confuses me: the training loss keeps decreasing while the validation loss turns stable. Would you think this is a kind of underfitting? Many thanks.

March 30:
Hi Wyatt …

RJ, June 10:
Thank you.

Reply: Hi RJ …

Talal Ahmed, July 8:
I was working on a classification problem where I faced a strange behavior in the learning curves. I plotted a loss curve and an accuracy curve. The accuracy of my model on the train set was 84…

Reply: Hi Talal …
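Many of the questions in this thread come down to plotting the two curves from a training run. As a closing illustration, here is a minimal sketch that fits a small Keras model on synthetic stand-in data and plots the optimization (loss) and performance (accuracy) learning curves; all names and settings are illustrative.

```python
# Plot dual learning curves (loss and accuracy) from a Keras training run.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tensorflow import keras

X, y = make_classification(n_samples=1000, random_state=1)  # stand-in data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=1)

model = keras.Sequential([
    keras.layers.Dense(32, activation='relu', input_shape=(20,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                    epochs=100, verbose=0)

# One subplot per metric: the optimization curve (loss) on top and the
# performance curve (accuracy) below, each with train and validation lines.
for i, metric in enumerate(['loss', 'accuracy']):
    plt.subplot(2, 1, i + 1)
    plt.plot(history.history[metric], label='train')
    plt.plot(history.history['val_' + metric], label='validation')
    plt.title(metric)
    plt.legend()
plt.tight_layout()
plt.show()
```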