How to read csv data from local system and replace and ...


This recipe helps you read CSV data from the local system and replace and rename the columns in Python.

Last Updated: 06 Jul 2022

Recipe Objective

In most big data scenarios, data cleaning is an integral part of a data pipeline: the raw file is taken and most of the cleaning operations below are applied to the data according to the use case. Without data cleaning, it is hard to standardize the structure of the data and store it in data lakes, which has a drastic impact on the business decisions made from that data.

System requirements:

Install the Python module as follows if the module below is not found:

    pip install pandas

The code below can be run in a Jupyter notebook or any Python console.

Table of Contents

Recipe Objective
System requirements:
Step 1: Import the module
Step 2: Read the csv file
Step 3: Change the date format
Step 4: Convert column names to lowercase
Step 5: Replacing Empty spaces with underscore
Step 6: Rename the column names
Step 7: Check for Missing Values
Step 8: Filling Missing Data
Step 9: Dropping missing data
Step 10: Create csv file

Step 1: Import the module

To import the modules:

    import pandas as pd
    import datetime

Step 2: Read the csv file

Read the CSV file from the local system, create a dataframe using pandas, and print the first 5 rows to check the data.

    df = pd.read_csv('stockdata.csv')
    df.head()

Step 3: Change the date format

The code below converts the Date column from strings such as "28-February-2015" to pandas datetime values. To render them as text in the 28-02-2015 form, format the parsed column with df['Date'].dt.strftime('%d-%m-%Y').
    df['Date'] = pd.to_datetime(df['Date'])
    print(df['Date'])

Step 4: Convert column names to lowercase

    # To convert the column names to lowercase
    df.columns.str.lower()

This returns the lowercased names without modifying df. Assign the result back to df.columns if you want to keep it. It is left unassigned here because the rename in Step 6 refers to the original column names.

Step 5: Replacing Empty spaces with underscore

In the code below we replace the empty spaces in the column names with underscores. As in Step 4, the result is only displayed, not stored.

    # To put an underscore in place of every space in the column names
    df.columns.str.replace(' ', '_')

Step 6: Rename the column names

Below we read the column names from the dataframe and rename them to create a new dataframe. Print the columns to check:

    df.columns
    df.head()

Create a new dataframe with renamed columns by passing the rename method a dictionary of old-name: new-name pairs, and print the first 5 rows. The keys must match the headers in your file exactly, including spaces, so compare them with the df.columns output.

    df2 = df.rename(columns={
        'Date': 'stock_date',
        'Open Price': 'open_price',
        'High Price': 'high_price',
        'Low Price': 'low_price',
        'Close Price': 'close_price',
        'WAP': 'weighted_avg_price',
        'No.of Shares': 'num_of_shares',
        'No. of Trades': 'num_of_trades',
        'Total Turnover (Rs.)': 'tot_turnover_in_rupees',
        'Deliverable Quantity': 'delvry_quantity',
        '% Deli. Qty to Traded Qty': 'delvry_qty_to_traded_qty',
        'Spread High-Low': 'spread_hi-low',
        'Spread Close-Open': 'spread_close-open',
    }, inplace=False)
    df2.head()

Step 7: Check for Missing Values

To make detecting missing values easier, pandas provides the isnull() and notnull() functions, which are also methods on Series and DataFrame objects. isnull() returns True where the original value is null and False where it is not.

    df2.isnull()

Another easy way is to pair isnull() with sum(), which gives the count of null values in each column.

    df2.isnull().sum()

Step 8: Filling Missing Data

Pandas provides different methods for cleaning missing values. The fillna function can "fill in" NA values with non-null data in a couple of ways.
    df2.close_price = df2.close_price.fillna(0)
    print(df2.close_price.isnull().sum())

The code above fills the null values in close_price with 0, so the printed count of nulls is 0.

Step 9: Dropping missing data

The final thing you can do is delete the rows with missing data. We can delete all rows with any missing data as shown below.

    df2 = df2.dropna()
    df2.isnull().sum()

Another way: use drop to remove rows or columns by specifying label names and the corresponding axis, as shown below. drop() must be given at least one label; the column chosen here is only an example.

    df2 = df2.drop(labels=['delvry_quantity'], axis=1)
    df2.isnull().sum()

Step 10: Create csv file

After renaming the columns, write the formatted data to a CSV file on your local system or HDFS.

    # write formatted data to csv
    df2.to_csv("stock_data.csv")
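Steps 4 and 5 only display the transformed column names. A minimal sketch of applying both transformations so that they stick might look like the following. The tiny dataframe and its two headers are stand-ins invented for illustration, not the recipe's stock data file.

```python
import pandas as pd

# Hypothetical stand-in for the stock data; these headers are illustrative only.
df = pd.DataFrame({"Open Price": [10.0, 11.5], "No.of Shares": [100, 250]})

# str.lower() and str.replace() each return a new Index,
# so the result has to be assigned back to df.columns to take effect.
df.columns = df.columns.str.lower().str.replace(" ", "_", regex=False)

print(list(df.columns))
```

Passing regex=False treats the space as a literal character, which is all this case needs.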
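For the date conversion in Step 3, pd.to_datetime on its own yields datetime values rather than text in the 28-02-2015 form. The sketch below shows one way to get that textual form. It assumes the dates are written with full English month names, and the two sample dates are made up for illustration.

```python
import pandas as pd

# Illustrative sample values in the "28-February-2015" style.
dates = pd.Series(["28-February-2015", "01-March-2015"])

# Parse the strings into datetime64 values; %B matches the full month name.
parsed = pd.to_datetime(dates, format="%d-%B-%Y")

# Render the parsed dates back to text as day-month-year.
formatted = parsed.dt.strftime("%d-%m-%Y")

print(parsed.dtype)
print(formatted.tolist())
```

Keeping the parsed column as datetime values is usually more convenient for sorting and filtering. The formatted strings are mainly useful for display or export.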
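The missing-value handling in Steps 7 to 9 can be tried on a small dataframe without the stock data file. This is a sketch under the assumption of two numeric columns with one gap each. The column names follow the renamed columns from Step 6, and the values are invented.

```python
import numpy as np
import pandas as pd

# Small illustrative dataframe with one missing value per column.
df2 = pd.DataFrame({
    "close_price": [101.5, np.nan, 99.0],
    "num_of_shares": [1000.0, 1200.0, np.nan],
})

# Step 7: count the nulls in each column.
null_counts = df2.isnull().sum()

# Step 8: fill the nulls of one column with 0 (on a copy, to keep df2 intact).
filled = df2.copy()
filled["close_price"] = filled["close_price"].fillna(0)

# Step 9: drop every row that still contains at least one null.
dropped = df2.dropna()

print(null_counts)
print(filled)
print(dropped)
```

Filling keeps all rows but replaces the gaps with a placeholder, while dropna keeps only the complete rows. Which one is appropriate depends on the use case.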
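When writing the result in Step 10, note that to_csv also writes the dataframe index as an extra first column by default. The sketch below compares the default output with index=False. It writes to in-memory strings instead of a file so that it can be run anywhere, and the one-row dataframe is invented for illustration.

```python
import io

import pandas as pd

# Illustrative one-row dataframe using the renamed column style from Step 6.
df2 = pd.DataFrame({"stock_date": ["28-02-2015"], "close_price": [101.5]})

# With no path given, to_csv returns the CSV text instead of writing a file.
default_csv = df2.to_csv()             # header starts with an empty index column
no_index_csv = df2.to_csv(index=False)

# Reading the index-free text back gives the same columns as the original.
restored = pd.read_csv(io.StringIO(no_index_csv))

print(default_csv.splitlines()[0])
print(no_index_csv.splitlines()[0])
print(list(restored.columns))
```

If the file is read back later with pd.read_csv, writing it with index=False avoids an extra unnamed column appearing in the result.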


