Reading CSVs With Python's "csv" Module
Reading and Writing CSV Files
Joe Tatusko
04:13
In this video, you’ll learn how to read standard CSV files using Python’s built-in csv module. There are two ways to read data from a CSV file using csv. The first method uses csv.reader() and the second uses csv.DictReader().
csv.reader() allows you to access CSV data using indexes and is ideal for simple CSV files. csv.DictReader(), on the other hand, is friendlier and easier to use, especially when a file has many columns or its column layout changes.
We’ll be using the following sample CSV file called employee_birthday.csv:
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
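The following code samples show how to read CSV files using the two methods:
Using csv.reader():
import csv

with open('employee_birthday.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names.
            print(f'Column names are {", ".join(row)}')
        else:
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
        line_count += 1
    print(f'Processed {line_count} lines.')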
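Using csv.DictReader():
import csv

with open('employee_birthday.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        # DictReader consumes the header row itself, so every row here is data.
        print(f'\t{row["name"]} works in the {row["department"]} department, and was born in {row["birthday month"]}.')
        line_count += 1
    print(f'Processed {line_count} lines.')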
00:00
Now it’s time to start using Python to read CSV files. Here, I’ve got a simple CSV file that contains some employee data for two employees and has their name, department, and birthday month. Open up a new Python script and import csv.
00:17
The first thing you need to do is actually open the CSV file.
00:27
Go ahead and call this csv_file. Next, create a csv_reader, which will just be csv.reader(). Pass in the csv_file and also a delimiter, which, in this case, is just a simple comma (',').
00:45
Now you can create a line_count, which we’ll just set equal to 0, and since the csv_reader yields the rows one at a time, you can go ahead and loop through for row in csv_reader:, and the first row contains all that header information, so when line_count == 0: you can print f'Column names are' ...
01:16
and since that’s also a list of each item, just go ahead and .join() that row together. Increase the line_count, and handle everything else.
01:28
So, print(), and let’s start with a tab ('\t') here, and you can go ahead and just index off of that row, so the name will be the first item,
01:42
the department will be the second,
01:51
and the third item is the birthday month. Cool! Like before, increase the line_count and go ahead and print a summary, which will just be something like f'Processed {line_count} lines.' And that’s it! Save this, open up the terminal, and see what we get. All right, Column names are name, department, birthday month, and both employees’ information prints out here. And actually, haha, it looks like I printed the name twice.
02:30
This should be a 1 so that we access the second indexed item in there. Let’s try that again. There we go. Now we can see their department. Okay! So that’s pretty cool. You can open CSV files and parse them using Python.
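The steps in this walkthrough can be collected into one small function. What follows is a minimal sketch rather than the code from the video: the name describe_employees is made up for illustration, the header is read with next() instead of a line_count == 0 check, and the data comes from an in-memory io.StringIO copy of the sample file so that the snippet runs without anything on disk.

```python
import csv
import io

# In-memory stand-in for employee_birthday.csv (an assumption made so the
# sketch runs anywhere; with a real file, pass the object returned by open()).
SAMPLE = (
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)


def describe_employees(csv_file):
    """Return the header list and one sentence per data row."""
    csv_reader = csv.reader(csv_file, delimiter=",")
    header = next(csv_reader)  # the first row holds the column names
    sentences = [
        f"{row[0]} works in the {row[1]} department, and was born in {row[2]}."
        for row in csv_reader
    ]
    return header, sentences


header, sentences = describe_employees(io.StringIO(SAMPLE))
print(f'Column names are {", ".join(header)}')
for sentence in sentences:
    print(f"\t{sentence}")
# The header counts as a processed line too, hence the + 1.
print(f"Processed {len(sentences) + 1} lines.")
```

Reading the header up front keeps the loop body free of the if/else branch, at the cost of having to add the header back in when counting lines.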
02:47
Now, if your CSV files start getting larger, or if you have columns that you need to add or remove, the indexing can get very difficult, and as you can see, I made a typo here and printed the wrong thing.
02:58
Fortunately, csv has a different type of reader that allows you to use the header names. So go up here to the csv.reader, and you’re going to replace this with a DictReader, for dictionary.
03:10
Everything else will be the same. Go ahead and get rid of this else statement here and remove the extra indent from the lines that were under it. And now, instead of picking apart the rows by their index, you can actually pass in the header name.
03:26
So what’s going on here is the DictReader (dictionary reader) looks at the first row and assumes that it holds the column names, so you can then refer to each value by that name.
03:40
So if I go ahead, save this, we should be able to run it and get the same result. Yeah! So if you have that first row that contains the header information, you can use a different reader, and this is a little friendlier to use because you can actually see what you’re referring to. All right!
03:59
So now you can read standard CSV files. In the next video, we’re going to talk about changing some of the reader parameters to deal with those nonstandard cases. Thanks for watching, and I’ll see you there!
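The point about added or removed columns can be made concrete. The sketch below is an illustration, not part of the video: it feeds csv.DictReader two in-memory versions of the same record, one in the original column order and one hypothetical variant with an extra "employee id" column and a different order. The helper name birthday_sentences is invented for the example.

```python
import csv
import io

ORIGINAL = (
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
)
# Hypothetical variant of the same data with an extra column and a new order.
REORDERED = (
    "employee id,birthday month,name,department\n"
    "17,November,John Smith,Accounting\n"
)


def birthday_sentences(csv_file):
    """Build one sentence per row, looking fields up by header name."""
    return [
        f'{row["name"]} works in the {row["department"]} department, '
        f'and was born in {row["birthday month"]}.'
        for row in csv.DictReader(csv_file)
    ]


first = birthday_sentences(io.StringIO(ORIGINAL))
second = birthday_sentences(io.StringIO(REORDERED))
print(first == second)
```

Because the fields are looked up by name, both layouts produce the same sentence. An index-based version using row[0], row[1], and row[2] would have to be rewritten for the second layout.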
newoptionz on June 1, 2019
The code did not recognize the file in Windows. I modified the path as follows:
import csv
from pathlib import Path

data_folder1 = Path("C:/Users/Stephen/OneDrive/Documents/PythonCSV/")
data_folder1 = Path("C:\\Users\\Stephen\\OneDrive\\Documents\\PythonCSV\\")
file_to_open = data_folder1 / "employee_birthday.txt"
with open(file_to_open) as csv_file:
newoptionz on June 1, 2019
The magical powers of the .
In the code above, the second data_folder1 had double \, not single ones as shown above.
josephjaewookim on June 15, 2019
It would be nice if you provided the actual CSV file…
Dan Bader RP Team on June 16, 2019
@josephjaewookim: You can find the CSV file by clicking on the “Supporting Material” button. But I’m also adding it to the description above, thanks for the heads up.
EricP on Aug. 16, 2019
Humm
Python 3.7.2
import csv
print(csv.__file__)
with open('employee_birthday.txt') as csv_file:
    csv_reader = csv.Reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[3]}')
gives me
/Users/epalmer/.virtualenvs/data_science/bin/python /Users/epalmer/projects_sorted/real_python/data_science/csv/read1.py
Traceback (most recent call last):
  File "/Users/epalmer/projects_sorted/real_python/data_science/csv/read1.py", line 4, in
[…]
from pathlib import Path

project_folder = Path(r'C:\Users\user_name\project_file_folder_name')
file_to_open = project_folder / 'employee_birthday.txt'
fh = open(file_to_open, mode='w')  # fh stands for file handler object
# mode is the second positional parameter of open(), so you can also just pass 'w'
fh.write('name,department,birthday month\n'
         'John Smith,Accounting,November\n'
         'Erica Meyers,IT,March')
fh.close()
shallahrichardson on May 25, 2020
Hi, are there files for this lesson? If so, where are they? The Supporting Material button only takes me to another tutorial. Thanks
Jon Fincher RP Team on May 26, 2020
shallahrichardson, check out the linked article. The files referenced can be found if you scroll through that article.
shallahrichardson on May 26, 2020
@Jon Fincher I’ve scrolled through all the comments and articles and I still don’t see the files.
Dan Bader RP Team on May 26, 2020
Here’s the direct link for the sample CSVs: realpython.com/courses/reading-and-writing-csv-files/downloads/reading-and-writing-csv-files-samples/
(They can also be found at the start of the course and under the Supporting Materials dropdown on each video lesson.)
theramstoss on June 18, 2020
Could you please explain why the Python documentation requires newline=''? I kind of understand it but not 100%. Thanks!
theramstoss on June 18, 2020
Also, the Python docs do not seem to specify 'r'. Is the default 'r'?
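Both questions can be checked with a short experiment. This is only a sketch, and it assumes that the case the csv documentation has in mind is a line break inside a quoted field: it writes one row whose second field contains \r\n, then reads the file back with and without newline=''. The file name and the sample values are invented for the example. As for the mode, open() defaults to 'r', so it can be left out when reading.

```python
import csv
import os
import tempfile

# Hypothetical row: the second field contains a Windows-style line break.
row = ["John Smith", "line one\r\nline two"]

path = os.path.join(tempfile.mkdtemp(), "newline_demo.csv")

# newline='' on write lets the csv module control the line endings itself.
with open(path, "w", newline="") as csv_file:
    csv.writer(csv_file).writerow(row)

# newline='' on read hands the raw characters to the csv module unchanged.
with open(path, newline="") as csv_file:  # mode defaults to 'r'
    with_newline_arg = next(csv.reader(csv_file))

# Without it, Python's universal-newlines mode rewrites "\r\n" as "\n"
# before the csv module sees the text, which alters the quoted field.
with open(path) as csv_file:
    without_newline_arg = next(csv.reader(csv_file))

print(with_newline_arg == row)
print(without_newline_arg == row)
```

The first comparison holds and the second does not: the field read without newline='' comes back with a bare \n where the original had \r\n.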
Uwe Steiner on June 27, 2020
good overview of what’s possible with ZIP
Walt Sorensen on March 31, 2021
Why use this CSV reader from import csv over Pandas read CSV?
import pandas as pd
file = "csvfile.csv"
df = pd.read_csv(file)
(especially when considering working with data in pandas?)
Bartosz Zaczyński RP Team on April 1, 2021
@Walt Sorensen Well, not everyone needs or knows about Pandas in the first place. It depends on your use case, but one advantage of the built-in csv module over Pandas is that it ships with Python’s standard library, while Pandas has to be installed separately. That might be a problem if you don’t have Internet access or admin privileges, which is common in corporate environments.
naomifos on April 2, 2021
I have a probably weird question. Now I know that normally CSV files are read through row by row, horizontally. But in the problem I have, it wants me to read through the file vertically, so that each header is a key in a new dictionary and the values are the things in that column. How would I go about doing that?
Bartosz Zaczyński RP Team on April 2, 2021
@naomifos With the pandas library, that’s what happens implicitly because it stores data in columnar format:
>>> import pandas as pd
>>> df = pd.read_csv("data.csv")
>>> df["first_name"]
0     Joe
1    Anna
2     Bob
Name: first_name, dtype: object
>>> list(df["first_name"])
['Joe', 'Anna', 'Bob']
If you wanted to simulate the same behavior with pure Python, this is how you could go about it:
import csv
from collections import defaultdict, namedtuple

columns = defaultdict(list)

with open("data.csv", encoding="UTF-8") as file:
    reader = csv.reader(file)
    field_names = next(reader)
    Record = namedtuple("Record", field_names)
    for row in reader:
        record = Record(*row)
        for key, value in record._asdict().items():
            columns[key].append(value)

print(columns["first_name"])
Eduardo on May 11, 2021
I’m trying it with a CSV uploaded on GitHub, running my code in Google Colab.
url = 'https://raw.githubusercontent.com/UpInside/play_import_csv/master/municipios.csv'
with open(url) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            print(f'\t{row[0]} is the UF of {row[2]} city.')
            line_count += 1
    print(f'Processed {line_count} lines.')
But the return is the error:
FileNotFoundError: [Errno 2] No such file or directory: 'https://raw.githubusercontent.com/UpInside/play_import_csv/master/municipios.csv'
Leodanis Pozo Ramos RP Team on May 11, 2021
@Eduardo The problem here seems to be that you’re using open() to open a URL instead of a local file. I think you should try downloading the file or using:
docs.python.org/3/library/urllib.request.html#urllib.request.urlopen
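One way to follow that suggestion is sketched below. It is an illustration under stated assumptions, not a tested fix for the file in question: the helper names parse_csv_text and read_csv_from_url are invented, and the file is assumed to be UTF-8 encoded and small enough to read into memory in one go. urlopen() returns bytes, so the text is decoded before it reaches csv.reader().

```python
import csv
import io
from urllib.request import urlopen


def parse_csv_text(text, delimiter=","):
    """Parse CSV text that is already in memory into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


def read_csv_from_url(url, encoding="utf-8"):
    """Download the document at `url` and parse it as CSV."""
    with urlopen(url) as response:
        # urlopen() returns bytes, while csv.reader() expects text.
        text = response.read().decode(encoding)
    return parse_csv_text(text)


# Usage (needs network access, so it is left commented out here):
# rows = read_csv_from_url(
#     "https://raw.githubusercontent.com/UpInside/play_import_csv/master/municipios.csv"
# )
# print(f'Column names are {", ".join(rows[0])}')
```

Keeping the parsing in its own function means the CSV handling can be tried on a plain string without touching the network.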
Kim on Oct. 31, 2021
Hi, all.
I just went through this tutorial and with DictReader, it appears to have iterated through all of the rows, but only printed rows 0 and 2 and bypassed 1. Here is the snippet, which I believe is identical to the script in the tutorial.
with open(r'C:\Users\xxxx\xxxx\csv_example.txt') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            print(f'\t{row["name"]} works in the {row["department"]} department, and was born in {row["birthday month"]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')
This code is similar to what I actually ran… I de-identified some details in my file location.
Any hints on the fix?
Thanks!
Bartosz Zaczyński RP Team on Nov. 2, 2021
@Kim If you don’t explicitly provide a list of field names for your records, then DictReader will assume the first line in the CSV file is the header. It will try to get those field names from there, always skipping the first line during iteration:
name,department,birthday month
Joe Doe,IT,1999-01-01
Anna Smith,HR,2001-12-31
Bob Brown,Sales,2002-15-31
>>> import csv
>>> with open(r"example.csv") as csv_file:
...     csv_reader = csv.DictReader(csv_file, delimiter=",")
...     for row in csv_reader:
...         print(row)
...
{'name': 'Joe Doe', 'department': 'IT', 'birthday month': '1999-01-01'}
{'name': 'Anna Smith', 'department': 'HR', 'birthday month': '2001-12-31'}
{'name': 'Bob Brown', 'department': 'Sales', 'birthday month': '2002-15-31'}
Notice the loop starts at the second line in the file, which contains the actual data.
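Following on from that explanation, the line_count == 0 branch in the question spends the first data row on printing the column names, which would explain the missing employee. A possible rework, offered as a sketch rather than as the tutorial's code, takes the names from the reader's fieldnames attribute and treats every row in the loop as data. The in-memory sample and the data_rows counter are assumptions made for the example.

```python
import csv
import io

# In-memory stand-in for a small employee file (made up for this sketch).
SAMPLE = (
    "name,department,birthday month\n"
    "Joe Doe,IT,1999-01-01\n"
    "Anna Smith,HR,2001-12-31\n"
)

csv_reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=",")

# The column names come from the fieldnames attribute, not from a row.
print(f'Column names are {", ".join(csv_reader.fieldnames)}')

data_rows = 0
for row in csv_reader:
    # Every row in this loop is data, so none is spent on the header.
    print(f'\t{row["name"]} works in the {row["department"]} department.')
    data_rows += 1

print(f"Processed {data_rows} data rows.")
```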
SirVeyor on Nov. 19, 2021
I learn just as much if not more from the comments. Thanks to all!
acamposla on Dec. 16, 2021
I am trying this approach:
counter = 0
with open("csv-sample-files/co-curricular-activities-ccas.csv") as singapore:
    singapore = csv.reader(singapore, delimiter=",")
    line_count = 0
    for row in singapore:
        if line_count == 0:
            print(f"The columns are {','.join(row)}")
            print(f"There are a total of {len(row)} columns")
            line_count += 1
        else:
            for i in range(5):
                print(f"\t{row[i]}")
                counter += 1
            if counter == 5:
                print("--- next line ---")
I get this:
The columns are school_name,school_section,cca_grouping_desc,cca_generic_name,cca_customized_name
There are a total of 5 columns
    ADMIRALTY PRIMARY SCHOOL
    PRIMARY
    PHYSICAL SPORTS
    MODULAR CCA (SPORTS)
    SPORTS CLUB
--- next line ---
    ADMIRALTY PRIMARY SCHOOL
    PRIMARY
    VISUAL AND PERFORMING ARTS
    ART AND CRAFTS
    VISUAL ARTS CLUB
    ADMIRALTY PRIMARY SCHOOL
    PRIMARY
    CLUBS AND SOCIETIES
    ENGLISH LANGUAGE, DRAMA AND DEBATING
    ENGLISH LANGUAGE AND DRAMA
    ADMIRALTY PRIMARY SCHOOL
However, I would like to print “Next line” every 5 lines (columns).
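In the output above the separator shows up only once, which suggests that counter is never reset and so equals 5 only after the first row. Since each row already holds one record, a counter may not be needed at all. The sketch below is one possible approach, with an invented helper name, lines_with_separator, and two made-up rows in the same five-column shape: it emits the separator once per row, right after that row's fields.

```python
import csv
import io

# Two made-up rows in the same five-column shape as the output above.
SAMPLE = (
    "school_name,school_section,cca_grouping_desc,cca_generic_name,cca_customized_name\n"
    "SCHOOL A,PRIMARY,PHYSICAL SPORTS,MODULAR CCA (SPORTS),SPORTS CLUB\n"
    'SCHOOL A,PRIMARY,CLUBS AND SOCIETIES,"ENGLISH LANGUAGE, DRAMA AND DEBATING",DRAMA CLUB\n'
)


def lines_with_separator(csv_file, separator="--- next line ---"):
    """Return each field on its own line, with a separator after every row."""
    reader = csv.reader(csv_file, delimiter=",")
    next(reader)  # skip the header row
    lines = []
    for row in reader:
        lines.extend(f"\t{field}" for field in row)
        lines.append(separator)  # runs once per row, so no counter is needed
    return lines


for line in lines_with_separator(io.StringIO(SAMPLE)):
    print(line)
```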
Course Contents
Overview
What Are CSV Files? 01:37
Reading CSVs With Python's "csv" Module 04:13
Advanced CSV Reader Parameters 03:51
Writing CSVs With Python's "csv" Module 05:33
Reading CSVs With Pandas 04:30
Writing CSVs With Pandas 02:12