Learning curve (machine learning) - Wikipedia


In machine learning, a learning curve (or training curve) plots the optimal value of a model's loss function on a training set against the same loss function evaluated on a validation data set, using the parameters that produced the optimal value on the training set. It is a tool for finding out how much a machine learning model benefits from additional training data and whether the estimator suffers more from variance error or from bias error. If both the validation score and the training score converge to a value that is too low as the size of the training set increases, the model will not benefit much from more training data.[1]

Figure: Learning curve showing training score and cross validation score.

Learning curves are useful for many purposes, including comparing different algorithms,[2] choosing model parameters during design,[3] adjusting optimization to improve convergence, and determining the amount of data used for training.[4]

In the machine learning domain, the term covers two kinds of curve that differ in their x-axis: the experience of the model is graphed either as the number of training examples used for learning or as the number of iterations used in training the model.[5]

Formal definition

One model of machine learning is producing a function, $f(x)$, which, given some information $x$, predicts some variable $y$, from training data $X_{\text{train}}$ and $Y_{\text{train}}$. It is distinct from mathematical optimization because $f$ should predict well for $x$ outside of $X_{\text{train}}$.
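The gap between fitting the training data and predicting outside it is what a learning curve over training-set size makes visible. The following is a minimal sketch rather than a reference implementation: it assumes a one-dimensional linear model fitted by least squares, mean squared error as the loss, and synthetic data. The helper names (`fit_line`, `mse`, `make_data`, `learning_curve_by_size`) are hypothetical and chosen only for this illustration.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns the parameters (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        # Degenerate subset (all x equal): fall back to a constant predictor.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def mse(params, xs, ys):
    """Mean squared error of the line (a, b) on the given data."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def learning_curve_by_size(x_train, y_train, x_val, y_val, sizes):
    """For each size i, fit on the first i training examples and record
    (i, loss on those i examples, loss on the full validation set)."""
    curve = []
    for i in sizes:
        params = fit_line(x_train[:i], y_train[:i])
        train_loss = mse(params, x_train[:i], y_train[:i])
        val_loss = mse(params, x_val, y_val)
        curve.append((i, train_loss, val_loss))
    return curve


rng = random.Random(0)  # fixed seed so the run is repeatable
x_train, y_train = make_data(200, rng)
x_val, y_val = make_data(100, rng)
sizes = [2, 5, 10, 20, 50, 100, 200]
curve = learning_curve_by_size(x_train, y_train, x_val, y_val, sizes)

for i, train_loss, val_loss in curve:
    print(f"n={i:4d}  train={train_loss:.4f}  validation={val_loss:.4f}")
```

Plotting the second and third columns against the first gives the two curves. With only two training points the line passes through both, so the training loss is essentially zero while the validation loss is not; as more data is used, both losses should settle near the noise level of the synthetic data.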
We often constrain the possible functions to a parameterized family of functions, $\{f_\theta(x) : \theta \in \Theta\}$, so that our function is more generalizable[6] or so that the function has certain properties, such as those that make finding a good $f$ easier, or because we have some a priori reason to think that these properties are true.[6, p. 172]

Given that it is not possible to produce a function that perfectly fits our data, it is then necessary to define a loss function $L(f_\theta(X), Y)$ to measure how good our prediction is. We then define an optimization process which finds a $\theta$ that minimizes $L(f_\theta(X), Y)$, referred to as $\theta^*(X, Y)$.

Training curve for amount of data

If our training data is $\{x_1, x_2, \dots, x_n\}, \{y_1, y_2, \dots, y_n\}$ and our validation data is $\{x'_1, x'_2, \dots, x'_m\}, \{y'_1, y'_2, \dots, y'_m\}$, a learning curve is the plot of the two curves

$$i \mapsto L(f_{\theta^*(X_i, Y_i)}(X_i), Y_i)$$

$$i \mapsto L(f_{\theta^*(X_i, Y_i)}(X'), Y')$$

where $X_i = \{x_1, x_2, \dots, x_i\}$, $Y_i = \{y_1, y_2, \dots, y_i\}$, and $X'$, $Y'$ denote the full validation set.

Training curve for number of iterations

Many optimization processes are iterative, repeating the same step until the process converges to an optimal value. Gradient descent is one such algorithm. If $\theta_i^*$ is defined as the approximation of the optimal $\theta$ after $i$ steps, a learning curve is the plot of

$$i \mapsto L(f_{\theta_i^*(X, Y)}(X), Y)$$

$$i \mapsto L(f_{\theta_i^*(X, Y)}(X'), Y')$$
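The iteration-based curve can be sketched in the same spirit. The example below is an illustration under stated assumptions, not a definitive implementation: it assumes a one-dimensional linear model with parameters $\theta = (a, b)$, mean squared error as the loss $L$, full-batch gradient descent with a fixed step size, and synthetic data. All function names and the step size of 0.1 are hypothetical choices made for this sketch.

```python
import random


def mse(theta, xs, ys):
    """Loss L: mean squared error of the line theta = (a, b)."""
    a, b = theta
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(theta, xs, ys, lr):
    """One full-batch gradient descent step on the mean squared error."""
    a, b = theta
    n = len(xs)
    grad_a = sum(2.0 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (a * x + b - y) for x, y in zip(xs, ys)) / n
    return a - lr * grad_a, b - lr * grad_b


def make_data(n, rng):
    """Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def learning_curve_by_iteration(xs, ys, xs_val, ys_val, steps, lr=0.1):
    """Record (i, training loss, validation loss) after each of `steps`
    gradient descent steps, starting from theta = (0, 0)."""
    theta = (0.0, 0.0)
    curve = []
    for i in range(1, steps + 1):
        theta = gradient_step(theta, xs, ys, lr)  # theta_i after i steps
        curve.append((i, mse(theta, xs, ys), mse(theta, xs_val, ys_val)))
    return curve


rng = random.Random(0)  # fixed seed so the run is repeatable
xs, ys = make_data(200, rng)
xs_val, ys_val = make_data(100, rng)
curve = learning_curve_by_iteration(xs, ys, xs_val, ys_val, steps=200)

for i, train_loss, val_loss in curve[::40]:
    print(f"step={i:4d}  train={train_loss:.4f}  validation={val_loss:.4f}")
```

Here the whole training set is used at every step and only the iteration count $i$ varies, which is the difference from the data-size curve. For this quadratic loss and small step size the training loss cannot increase from one step to the next; the validation loss carries no such guarantee, which is why the two curves are plotted together.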
See also

Overfitting
Bias–variance tradeoff
Model selection
Cross-validation (statistics)
Validity (statistics)
Verification and validation

References

[1] scikit-learn developers. "Validation curves: plotting scores to evaluate models — scikit-learn 0.20.2 documentation". Retrieved February 15, 2019.
[2] Madhavan, P.G. (1997). "A New Recurrent Neural Network Learning Algorithm for Time Series Prediction" (PDF). Journal of Intelligent Systems. p. 113, Fig. 3.
[3] "Machine Learning 102: Practical Advice". Tutorial: Machine Learning for Astronomy with Scikit-learn.
[4] Meek, Christopher; Thiesson, Bo; Heckerman, David (Summer 2002). "The Learning-Curve Sampling Method Applied to Model-Based Clustering". Journal of Machine Learning Research. 2 (3): 397. Archived from the original on 2013-07-15.
[5] Sammut, Claude; Webb, Geoffrey I. (Eds.) (28 March 2011). Encyclopedia of Machine Learning (1st ed.). Springer. p. 578. ISBN 978-0-387-30768-8.
[6] Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016-11-18). Deep Learning. MIT Press. p. 108. ISBN 978-0-262-03561-3.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Learning_curve_(machine_learning)&oldid=1057677925"


