How to read csv data from local system and replace and ...
This recipe helps you read CSV data from the local system and replace and rename the columns in Python.
Last Updated: 06 Jul 2022

Recipe Objective
In most big data scenarios, data cleaning is an integral part of a data pipeline: the raw file is taken and most of the cleaning operations below are applied to it, depending on the use case. Without data cleaning, it is hard to standardize the structure of the data and store it in data lakes, which can have a drastic impact on the business decisions made from that data.

System requirements:
Install the Python module as follows if it is not found:
    pip install pandas
The code below can be run in a Jupyter notebook or any Python console.

Table of Contents
Recipe Objective
System requirements
Step 1: Import the module
Step 2: Read the csv file
Step 3: Change the date format
Step 4: Convert column names to lowercase
Step 5: Replacing Empty spaces with underscore
Step 6: Rename the column names
Step 7: Check for Missing Values
Step 8: Filling Missing Data
Step 9: Dropping missing data
Step 10: Create csv file

Step 1: Import the module
    import pandas as pd
    import datetime

Step 2: Read the csv file
Read the CSV file from the local system, create a DataFrame using pandas, and print the first 5 rows to check the data.
    df = pd.read_csv('stockdata.csv')
    df.head()

Step 3: Change the date format
This step converts the Date column from a format like "28-February-2015" to 28-02-2015.
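A minimal sketch of the conversion described in Step 3, assuming the dates arrive as day-month-year strings with English month names. The sample values and the `date_str` column are hypothetical and stand in for the real `stockdata.csv`. Parsing and display formatting are two separate steps: `pd.to_datetime` produces datetime values, and `dt.strftime` renders them as text.

```python
import pandas as pd

# Hypothetical sample standing in for the Date column of stockdata.csv
df = pd.DataFrame({"Date": ["28-February-2015", "01-March-2015"]})

# Parse the day-month-year strings into datetime64 values
df["Date"] = pd.to_datetime(df["Date"], format="%d-%B-%Y")

# Render the parsed dates as DD-MM-YYYY strings in a separate (assumed) column
df["date_str"] = df["Date"].dt.strftime("%d-%m-%Y")
print(df)
```

Keeping the parsed column as datetime and formatting into a second column leaves date arithmetic and sorting available on `Date`.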
The recipe parses the column with pd.to_datetime. The parsed values print in ISO form, for example 2015-02-28; rendering them as 28-02-2015 is a separate formatting step with dt.strftime('%d-%m-%Y').
    df['Date'] = pd.to_datetime(df['Date'])
    print(df['Date'])

Step 4: Convert column names to lowercase
    # Returns the column names in lowercase; df is unchanged unless the result is assigned back
    df.columns.str.lower()

Step 5: Replacing Empty spaces with underscore
The code below replaces the spaces in the column names with underscores.
    # Returns the column names with spaces replaced by underscores
    df.columns.str.replace(' ', '_')

Step 6: Rename the column names
Read the column names from the DataFrame and rename them to create a new DataFrame. Print the columns to check:
    df.columns
    df.head()
Create a new DataFrame by renaming the columns with the rename method, passing old and new names as key-value pairs, and print the first 5 rows.
    df2 = df.rename(columns={
        'Date': 'stock_date',
        'Open Price': 'open_price',
        'High Price': 'high_price',
        'Low Price': 'low_price',
        'Close Price': 'close_price',
        'WAP': 'weighted_avg_price',
        'No.of Shares': 'num_of_shares',
        'No. of Trades': 'num_of_trades',
        'Total Turnover (Rs.)': 'tot_turnover_in_rupees',
        'Deliverable Quantity': 'delvry_quantity',
        '% Deli. Qty to Traded Qty': 'delvry_qty_to_traded_qty',
        'Spread High-Low': 'spread_hi-low',
        'Spread Close-Open': 'spread_close-open'}, inplace=False)
    df2.head()

Step 7: Check for Missing Values
To make detecting missing values easier, pandas provides the isnull() and notnull() functions, which are also methods on Series and DataFrame objects. isnull() returns True where the original value is null and False where it is not.
    df2.isnull()
Another easy way is to pair isnull with sum, which gives the count of null values in each column.
    df2.isnull().sum()

Step 8: Filling Missing Data
pandas provides different methods for cleaning missing values. The fillna function can "fill in" NA values with non-null data in a couple of ways.
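The "couple of ways" mentioned for filling missing values can be sketched as follows. The `prices` Series is a hypothetical stand-in for a column such as close_price, and the choice among a constant, the column mean, or the previous valid value is an assumption that depends on the use case.

```python
import numpy as np
import pandas as pd

# Hypothetical prices with gaps, standing in for a column such as close_price
prices = pd.Series([10.0, np.nan, 12.0, np.nan], name="close_price")

# Replace every missing value with a constant
filled_zero = prices.fillna(0)

# Replace missing values with the mean of the non-missing values
filled_mean = prices.fillna(prices.mean())

# Carry the last valid observation forward into each gap
filled_prev = prices.ffill()

print(filled_zero.tolist(), filled_mean.tolist(), filled_prev.tolist())
```

For price series, a constant 0 can distort later averages, so a forward fill or a mean fill is often considered instead.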
In the recipe, the null values in close_price are filled with 0:
    df2['close_price'] = df2['close_price'].fillna(0)
    print(df2['close_price'].isnull().sum())

Step 9: Dropping missing data
The final thing you can do is delete the rows with missing data. We can delete all rows with any missing data as shown below.
    df2 = df2.dropna()
    df2.isnull().sum()
Another way: use drop to remove rows or columns by specifying label names and the corresponding axis. drop needs at least one label, so calling it with no arguments raises an error; the column chosen here is only an example.
    df2 = df2.drop(columns=['delvry_quantity'])
    df2.isnull().sum()

Step 10: Create csv file
After renaming the columns, write the formatted data to a CSV file on your local system or HDFS.
    # write formatted data to csv
    df2.to_csv("stock_data.csv")
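Steps 4 and 5 of the recipe return new column names without changing the DataFrame, so the result has to be assigned back for it to persist. A minimal sketch of that, followed by dropping incomplete rows and writing CSV output, is below. The two-column frame is hypothetical, and passing `index=False` is an assumption about the desired output (it leaves the row index out of the file).

```python
import pandas as pd

# Hypothetical two-row frame; the column names mimic headers that contain spaces
df = pd.DataFrame({"Open Price": [10.0, None], "Close Price": [11.0, 12.5]})

# Assign the cleaned names back so the change persists on df
df.columns = df.columns.str.lower().str.replace(" ", "_")

# Drop rows that still contain missing values
clean = df.dropna()

# With no path given, to_csv returns the CSV text; index=False omits the row index
csv_text = clean.to_csv(index=False)
print(csv_text)
```

Passing a file name such as "stock_data.csv" as the first argument of `to_csv` writes the same text to disk instead of returning it.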
Further reading
- 1. How to read csv data from local system and replace and ...
This recipe helps you read csv data from local system and replace and rename the columns in python.
- 2. Python - Read CSV Columns Into List - GeeksforGeeks
In this method we will import the csv library and open the file in reading mode, then we will use...
- 3. 15 ways to read CSV file with pandas - ListenData
This tutorial explains how to read a CSV file in python using read_csv ... Example 6 : Set Index ...
- 4. python - Read specific columns from a csv file with csv module?
and I'm expecting that this will print out only the specific columns I want for each row except i...
- 5. How to read specific column from CSV file in Python