How to Read Specific Columns from CSV File in Python - Finxter


Problem: Given a CSV file, how do you read only specific column(s) from it? (Reading a specific column from a CSV file yields all the row values in that column.)

Example: Consider the following CSV file (countries.csv):

    Country,Capital,Population,Area
    Germany,Berlin,"84,267,549","348,560"
    France,Paris,"65,534,239","547,557"
    Spain,Madrid,"46,787,468","498,800"
    Italy,Rome,"60,301,346","294,140"
    India,Delhi,"1,404,495,187","2,973,190"
    USA,Washington,"334,506,463","9,147,420"
    China,Beijing,"1,449,357,022","9,388,211"
    Poland,Warsaw,"37,771,789","306,230"
    Russia,Moscow,"146,047,418","16,376,870"
    England,London,"68,529,747","241,930"

Question: How will you read the above CSV file and display the following columns?

- The Country column along with the Capital column
- All values in the Population column

Method 1: Using Pandas

Using the Pandas library is probably the best option if you are dealing with CSV files. You can easily read a CSV file and store an entire column within a variable.
Code:

    import pandas as pd

    df = pd.read_csv("countries.csv")
    country = df['Country']
    # or
    # country = df.Country
    capital = df['Capital']
    # or
    # capital = df.Capital

    # displaying selected columns (Country and Capital)
    for x, y in zip(country, capital):
        print(f"{x} {y}")

    # displaying a single column (Population)
    print()
    print(df['Population'])

Output:

    Germany Berlin
    France Paris
    Spain Madrid
    Italy Rome
    India Delhi
    USA Washington
    China Beijing
    Poland Warsaw
    Russia Moscow
    England London

    0       84,267,549
    1       65,534,239
    2       46,787,468
    3       60,301,346
    4    1,404,495,187
    5      334,506,463
    6    1,449,357,022
    7       37,771,789
    8      146,047,418
    9       68,529,747
    Name: Population, dtype: object

Explanation:

- Read the CSV file using the pd.read_csv() Pandas function.
- Save the columns Country and Capital in independent variables using country = df['Country'] and capital = df['Capital']. Alternatively, you can use attribute access: country = df.Country and capital = df.Capital.
- To display the country names and their capitals together, bind the two columns, country and capital, using the zip() function, then display each country along with its capital using a for loop over the zipped object.
- To display all the values in the Population column, simply use df['Population']. The values are strings (dtype: object) because the quoted numbers contain thousands separators.

TRIVIA: zip() is a built-in function in Python that takes an arbitrary number of iterables and binds them into a single iterable, a zip object. It combines the n-th value of each iterable argument into a tuple.

➤ List-Based Indexing of a DataFrame

In case you are not comfortable with using zip() to display multiple columns at once, you have another option: list-based indexing.

List-based indexing is a technique that allows you to pass multiple column names as a list within the square-bracket selector.
Example:

    import pandas as pd

    df = pd.read_csv("countries.csv")
    print()
    print(df[['Country', 'Capital']])

Output:

       Country     Capital
    0  Germany      Berlin
    1   France       Paris
    2    Spain      Madrid
    3    Italy        Rome
    4    India       Delhi
    5      USA  Washington
    6    China     Beijing
    7   Poland      Warsaw
    8   Russia      Moscow
    9  England      London

Method 2: Integer-Based Indexing with iloc

Approach: The idea here is to use df.iloc[rows, columns].values to access individual columns of the DataFrame by position. Note that the first column always has the index 0, the second column has index 1, and so on.

- rows selects individual rows. Use the slicing colon : to select all rows.
- columns selects individual columns.
  - Use country = data.iloc[:, 0].values to save the values of the Country column.
  - Use capital = data.iloc[:, 1].values to save the values of the Capital column.
  - Use population = data.iloc[:, 2].values to save the values of the Population column.

Code:

    import pandas as pd

    data = pd.read_csv('countries.csv')
    country = data.iloc[:, 0].values
    capital = data.iloc[:, 1].values
    population = data.iloc[:, 2].values

    # displaying selected columns
    print(data[['Country', 'Capital']])
    print()
    # displaying a single column (Population)
    print(population)

Output:

       Country     Capital
    0  Germany      Berlin
    1   France       Paris
    2    Spain      Madrid
    3    Italy        Rome
    4    India       Delhi
    5      USA  Washington
    6    China     Beijing
    7   Poland      Warsaw
    8   Russia      Moscow
    9  England      London

    ['84,267,549' '65,534,239' '46,787,468' '60,301,346' '1,404,495,187'
     '334,506,463' '1,449,357,022' '37,771,789' '146,047,418' '68,529,747']

Method 3: Name-Based Indexing with loc()

Instead of selecting the columns by their index, you can also select them by name using the df.loc[] selector.

The following example shows how to select the columns Country and Capital from the given DataFrame.
    import pandas as pd

    data = pd.read_csv('countries.csv')
    val = data.loc[:, ['Country', 'Capital']]
    print(val)

Output:

       Country     Capital
    0  Germany      Berlin
    1   France       Paris
    2    Spain      Madrid
    3    Italy        Rome
    4    India       Delhi
    5      USA  Washington
    6    China     Beijing
    7   Poland      Warsaw
    8   Russia      Moscow
    9  England      London

Related Tutorial: Slicing Data from a Pandas DataFrame using .loc and .iloc

Method 4: Using csv Module

The csv module, part of Python's standard library, is another option for working with CSV files. Let us have a look at code that reads the given CSV file and then reads specific columns from it:

    import csv

    population = []
    with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile:
        data = csv.DictReader(csvfile)
        # print the header once, before iterating over the rows
        print("Country", ":", "Capital")
        for r in data:
            # append values from the Population column to the population list
            population.append(r['Population'])
            # displaying specific columns (Country and Capital)
            print(r['Country'], ":", r['Capital'])
        # display the population list
        print(population)

Output:

    Country : Capital
    Germany : Berlin
    France : Paris
    Spain : Madrid
    Italy : Rome
    India : Delhi
    USA : Washington
    China : Beijing
    Poland : Warsaw
    Russia : Moscow
    England : London
    ['84,267,549', '65,534,239', '46,787,468', '60,301,346', '1,404,495,187', '334,506,463', '1,449,357,022', '37,771,789', '146,047,418', '68,529,747']

Explanation:

- Import the csv module and open the CSV file with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile. Make sure you pass the encoding argument: 'utf-8-sig' strips a leading byte-order mark (BOM) if the file has one, which would otherwise show up as unreadable characters in the first column name.
- Let Python read the CSV file as dictionaries using a csv.DictReader object.
- Once the file is being read in the form of dictionaries, you can fetch the values of the respective columns by using the keys in square-bracket notation. Each column header is a key in the row's dictionary.
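The steps above can be wrapped in a small reusable function. The following is only a minimal sketch: select_columns is a hypothetical helper name (it is not part of the csv module or of this tutorial's code), and a two-row excerpt of countries.csv is held in an io.StringIO object so that the example runs without the file on disk.

```python
import csv
import io


def select_columns(csvfile, columns):
    """Hypothetical helper: return {column name: list of values} for the
    requested columns, read from an open file-like object."""
    reader = csv.DictReader(csvfile)
    selected = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            # each row is a dictionary keyed by the column headers
            selected[name].append(row[name])
    return selected


# Assumption: a two-row excerpt of countries.csv kept in memory for the demo.
sample = io.StringIO(
    'Country,Capital,Population,Area\n'
    'Germany,Berlin,"84,267,549","348,560"\n'
    'France,Paris,"65,534,239","547,557"\n'
)

result = select_columns(sample, ["Country", "Population"])
print(result["Country"])     # ['Germany', 'France']
print(result["Population"])  # ['84,267,549', '65,534,239']
```

With the real file, the same function could be called inside the with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile: block, passing csvfile as the first argument.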
Bonus: Here is a quick look at the rows that a DictReader object yields:

    import csv

    with open('countries.csv', newline='', encoding='utf-8-sig') as csvfile:
        data = csv.DictReader(csvfile)
        for row in data:
            print(row)

Output:

    {'Country': 'Germany', 'Capital': 'Berlin', 'Population': '84,267,549', 'Area': '348,560'}
    {'Country': 'France', 'Capital': 'Paris', 'Population': '65,534,239', 'Area': '547,557'}
    {'Country': 'Spain', 'Capital': 'Madrid', 'Population': '46,787,468', 'Area': '498,800'}
    {'Country': 'Italy', 'Capital': 'Rome', 'Population': '60,301,346', 'Area': '294,140'}
    {'Country': 'India', 'Capital': 'Delhi', 'Population': '1,404,495,187', 'Area': '2,973,190'}
    {'Country': 'USA', 'Capital': 'Washington', 'Population': '334,506,463', 'Area': '9,147,420'}
    {'Country': 'China', 'Capital': 'Beijing', 'Population': '1,449,357,022', 'Area': '9,388,211'}
    {'Country': 'Poland', 'Capital': 'Warsaw', 'Population': '37,771,789', 'Area': '306,230'}
    {'Country': 'Russia', 'Capital': 'Moscow', 'Population': '146,047,418', 'Area': '16,376,870'}
    {'Country': 'England', 'Capital': 'London', 'Population': '68,529,747', 'Area': '241,930'}

It is evident from the output that csv.DictReader() returns a dictionary for each row, in which each column header is a key and the row's entry in that column is the associated value.

Conclusion

To sum things up, there are four main ways of accessing specific columns from a given CSV file:

- List-based indexing.
- Integer-based indexing.
- Name-based indexing.
- Using the csv module's DictReader class.

Feel free to use the one that suits you best. I hope this tutorial helped you. Please subscribe and stay tuned for more interesting tutorials. Happy learning!

Learn Pandas the Fun Way by Solving Code Puzzles

If you want to boost your Pandas skills, consider checking out my puzzle-based learning book Coffee Break Pandas (Amazon Link).

It contains 74 hand-crafted Pandas puzzles including explanations. By solving each puzzle, you'll get a score representing your skill level in Pandas. Can you become a Pandas Grandmaster?

Coffee Break Pandas offers a fun-based approach to data science mastery and a truly gamified learning experience.
Shubham Sayon

I am a professional Python Blogger and Content creator. I have published numerous articles and created courses over a period of time. Presently I am working as a full-time freelancer and I have experience in domains like Python, AWS, DevOps, and Networking.