Learning Curves Tutorial: What Are Learning Curves?
文章推薦指數: 80 %
Learning curves are plots used to show a model's performance as the training set size increases. Another way it can be used is to show the ...
SkiptomaincontentSignInGetStartedBlogBlogArticlesPodcastTutorialsCheatSheetsCategoryCategoryAboutDataCampLatestnewsaboutourproductsandteamForBusinessCategoryTechnologiesDiscovercontentbytoolsandtechnologyGitPowerBIPythonRProgrammingScalaSpreadsheetsSQLTableauCategoryTopicsDiscovercontentbydatasciencetopicsAIBigDataDataAnalysisDataEngineeringDataLiteracyDataScienceDataVisualizationDeepLearningMachineLearningWorkspaceWriteforusCategorySearch
Alookatthebias-variancetradeoff
Theanatomyofalearningcurve
Usecase:Predictingrealestatevaluations
Diagnosinglearningcurves
Model1:Decisiontreeregressor
Model2:SupportVectorMachine
Model3:RandomForestRegressor
Machinelearningmodelsareemployedtolearnpatternsindata.Thebestmodelscangeneralizewellwhenfacedwithinstancesthatwerenotpartoftheinitialtrainingdata.Duringtheresearchphase,severalexperimentsareconductedtofindthesolutionthatbestsolvesthebusiness'sproblem,andreducestheerrorbeingmadebythemodel.Anerrormaybedefinedasthedifferencebetweenthepredictionofobservationandthetruevalueoftheobservation.
Therearetwomajorcausesforerrorsinmachinelearningmodels:
Biasdescribesamodelwhichmakessimplifiedassumptionssothetargetfunctioniseasiertoapproximate;amodelmaylearnthatevery5'9maleintheworldwearsasizemediumtop-thisisclearlybiased.
Variancedescribesthevariabilityinthemodelprediction;howmuchthepredictionofthemodelchangeswhenwechangethedatausedtotrainit.
Toattainamoreaccuratesolution,weseektoreducetheamountofbiasandvariancepresentinourmodel.Thisisnotastraightforwardtask.Biasandvarianceareatoddswitheachother-reducingonewillincreasetheotherbecauseofaconceptknownasthebias-variancetradeoff.
Inthisarticleyou'lllearn:
Howtodetectwhetheramodelsuffersfromhighbiasorhighvariance
Howtodiagnoseamodelsufferingfromeithersymptom
Howtobuildagood-fitmodel
Beforewegetintodetectingtheerrorsymptoms,let'sfirstgointomoredepthwiththebias-variancetradeoff.
Alookatthebias-variancetradeoff
Allsupervisedlearningalgorithmsstrivetoachievethesameobjective:estimatingthemappingfunction(f_hat)foratargetvariable(y)givensomeinputdata(X).Werefertothefunctionthatamachinelearningmodelaimstoapproximateasthetargetfunction.
Changingtheinputdatausedtoapproximatethetargetvariablewilllikelyresultinadifferenttargetfunction,whichmayimpacttheoutputspredictedbythemodel.Howmuchourtargetfunctionvariesasthetrainingdataischangedisknownasthevariance.Wedon'twantourmodeltohavehighvariancebecausewhileouralgorithmmayperformflawlesslyduringtraining,itfailstogeneralizetounseeninstances.
[Source:Wikipedia]
Intheaboveimage,theapproximatedtargetfunctionisthegreenlineandthelineofbestfitisinblack.Noticehowwellthemodellearnsthetrainingdatawiththegreenline.Itdoesitsbesttoensureallredandblueobservationsareseparated.Ifwetrainedthismodelonnewobservations,itwouldlearnanentirelynewtargetfunctionandattempttoenactthesamebehavior.
Considerascenarioinwhichweusealinearmethodlikelinearregressiontoapproximatethetargetfunction.Thefirstthingtonoteaboutlinearregressionisthatitassumesalinearrelationshipbetweentheinputdataandthetargetwearetryingtopredict.Eventsintherealworldarealotmorecomplex.Atthecostofsomeflexibility,thissimpleassumptionmakesthetargetfunctionmuchquickertolearnandeasiertounderstand.WerefertothisparadigmasBias.
[Source:Wikipedia]
Intheimageabove,theredlinerepresentsthelearnedtargetfunction.Manyoftheobservationsfallfarawayfromthevaluespredictedbythemodel.
Wecanreducethebiasinamodelbymakingitmoreflexible,butthisintroducesvariance.Ontheflipside,wecanreducethevarianceofamodelbysimplifyingit,butthisisintroducingbias.There'snowaytoescapethisrelationship.Thebestalternativeistochooseamodelandconfigureitsuchthatitstrikesabalanceinthetradeoffbetweenbiasandvariance.
[Source:Wikipedia]
Duetounknownfactorsinfluencingthetargetfunction,therewillalwaysbesomeerrorpresentinthemodel,knownastheirreducibleerror.ThismaybeobservedintheimageabovebynotingtheamountoferrorthatoccursunderthelowestpointoftheTotalErrorplot.Tobuildtheidealmodel,wemustfindabalancebetweenbiasandvariancesuchthatthetotalerrorisminimized.ThisisillustratedwiththedottedlinecalledOptimumModelComplexity.
Let'sexpandonbiasandvarianceusinglearningcurves.
Theanatomyofalearningcurve
Learningcurvesareplotsusedtoshowamodel'sperformanceasthetrainingsetsizeincreases.Anotherwayitcanbeusedistoshowthemodel'sperformanceoveradefinedperiodoftime.Wetypicallyusedthemtodiagnosealgorithmsthatlearnincrementallyfromdata.Itworksbyevaluatingamodelonthetrainingandvalidationdatasets,thenplottingthemeasuredperformance.
Forexample,imaginewe'vemodeledtherelationshipbetweensomeinputsandoutputsusingamachinelearningalgorithm.Westartoffbytrainingthemodelononeinstanceandvalidatingagainstone-hundredinstances.Whatdoyouthinkwillhappen?Ifyousaidthemodelwilllearnthetrainingdataperfectlythenyou'recorrect-therewouldbenoerrors.
It'snothardtomodeltherelationshipofoneinputtooutput;allyouhavetodoisrememberthatrelationship.Thedifficultpartwouldbetryingtomakeaccuratepredictionswhenpresentedwithnew,unseeninstances.Sinceourmodellearnedthetrainingdatasowell,itwouldhaveaterribletimetryingtogeneralizetodatait'snotseenbefore.Themodelwillperformpoorlyonourvalidationdataasaresult.Thiswouldmeantherewouldbealargedifferencebetweentheperformanceofourmodelonthetrainingdataandvalidationdata.Wecallthisdifferencethegeneralizationerror.
Ifouralgorithmisgoingtostandachanceofmakingbetterpredictionsonthevalidationdataset,weneedtoaddmoredata.Introducingnewinstancestothetrainingdatawillinevitablychangethetargetfunctionofourmodel.Howthemodelperformsaswegrowthetrainingdatasetcouldbemonitoredandplottedtorevealtheevolutionofthetrainingandvalidationerrorscores.
Thismeansthegraphwilldisplaytwodifferentresults:
Trainingcurve:Thecurvecalculatedfromthetrainingdata;usedtoinformhowwellamodelislearning.
Validationcurve:Thecurvecalculatedfromthevalidationdata;usedtoinformofhowwellthemodelisgeneralizingtounseeninstances.
Thesecurvesshowushowwellthemodelisperformingasthedatagrows,hencethenamelearningcurves.
Note:Thesameprocessmaybeusedtoinformusofhowourmodellearnsovertime.Insteadofmonitoringhowthemodelisdoingasthedatagetslarger,wemonitorhowwellthemodellearnsovertime.Forexample,youmaydecidetolearnanewlanguage.Yourgraspofthatlanguagecouldbeevaluatedandassignedanumericalscoretoshowhowyou'vefairedoverthecourseof52weeks.
You'venowlearnedtheanatomyofalearningcurve;let'sputitintopracticewithareal-worlddatasettogiveyouavisualunderstanding.
Usecase:Predictingrealestatevaluations
Wewillbeusingthedataset:themarkethistoricaldatasetofrealestatevaluation.ThisdatawascollectedfromSindianDist.,NewTaipei,Taiwanandconsistsofmarkethistoricaldata.
Ourtaskistopredicttherealestatevaluationgiventhefollowingfeatures:
X1=thetransactiondate(forexample,2013.250=2013March,2013.500=2013June,etc.)
X2=thehouseage(unit:year)
X3=thedistancetothenearestMRTstation(unit:meter)
X4=thenumberofconveniencestoresinthelivingcircleonfoot(integer)
X5=thegeographiccoordinate,latitude.(unit:degree)
X6=thegeographiccoordinate,longitude.(unit:degree)
Thetargetvariableisdefinedas:
Y=housepriceofunitarea(10000NewTaiwanDollar/Ping,wherePingisalocalunit,1Ping=3.3meterssquared)
Thetargetwearepredictingiscontinuous,thustheproblemisgoingtorequireregressiontechniques.
Let'sstartbypeekingatthedata:
importpandasaspd
data=pd.read_excel("/content/gdrive/MyDrive/real_estate_valuation_data.xlsx")
print(data.info())
data.head()
>>>>
延伸文章資訊
- 1How to use Learning Curves to Diagnose Machine Learning ...
A learning curve is a plot of model learning performance over experience or time. Learning curves...
- 2Why you should be plotting learning curves in your next ...
Learning curves show the relationship between training set size and your chosen evaluation metric...
- 3Learning curve (machine learning) - Wikipedia
In machine learning, a learning curve (or training curve) plots the optimal value of a model's lo...
- 4Learning Curves in Machine Learning - SpringerLink
A learning curve shows a measure of predictive performance on a given domain as a function of som...
- 5What is a Learning Curve in machine learning? - Stack Overflow
An ROC curve is a graphical depiction of classifier performance that shows the trade-off between ...