Batch convert encoding in files - Super User
Asked 13 years, 1 month ago
Modified 2 months ago
Viewed 103k times
60
How can I batch-convert files in a directory for their encoding (e.g. ANSI → UTF-8) with a command or tool?
For single files, an editor helps, but how can I do the mass files job?
linux windows macos batch encoding
edited Aug 29, 2020 at 11:45
Peter Mortensen
asked Aug 21, 2009 at 9:12
desolat
2
1
related: stackoverflow.com/questions/724083/…
– user4358
Aug 21, 2009 at 9:18
stackoverflow.com/a/24713621/242933
– ma11hew28
Jul 12, 2014 at 13:56
14 Answers
44
Cygwin or GnuWin32 provide Unix tools like iconv and dos2unix (and unix2dos). Under Unix/Linux/Cygwin, you'll want to use "windows-1252" as the encoding instead of ANSI (see below). (Unless you know your system is using a codepage other than 1252 as its default codepage, in which case you'll need to tell iconv the right codepage to translate from.)
Convert from one (-f) to the other (-t) with:
$ iconv -f windows-1252 -t utf-8 infile > outfile
Or in a find-all-and-conquer form:
## this will clobber the original files!
$ find . -name '*.txt' -exec iconv --verbose -f windows-1252 -t utf-8 {} \> {} \;
Alternatively:
## this will clobber the original files!
$ find . -name '*.txt' -exec iconv --verbose -f windows-1252 -t utf-8 -o {} {} \;
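The in-place commands above write to the same file they are reading. As a rough alternative, here is a minimal Python sketch of the same batch conversion that writes each result to a temporary file first and only then swaps it in. The helper name `convert_tree`, the `*.txt` pattern and the windows-1252 default are assumptions for illustration, not part of the original answer.

```python
import os
import tempfile
from pathlib import Path


def convert_tree(root, pattern="*.txt", src="cp1252", dst="utf-8"):
    """Re-encode every file under root that matches pattern from src to dst.

    Hypothetical helper: each file is read and decoded completely, written
    to a temporary file in the same directory, and only then moved over the
    original with os.replace, so the original is never half-written.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # Raises UnicodeDecodeError if a byte is not defined in src.
        text = path.read_bytes().decode(src)
        fd, tmp_name = tempfile.mkstemp(dir=str(path.parent))
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(text.encode(dst))
            os.replace(tmp_name, path)
        except BaseException:
            # Clean up the temporary file and leave the original untouched.
            os.unlink(tmp_name)
            raise
        converted.append(path)
    return converted
```

Calling `convert_tree(".")` would convert every `.txt` file below the current directory. Note that the replacement file gets the permissions of the temporary file, so this sketch does not preserve the original file mode.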
This question has been asked many times on this site, so here's some additional information about "ANSI". In an answer to a related question, CesarB mentions:
There are several encodings which are called "ANSI" in Windows. In fact, ANSI is a misnomer. iconv has no way of guessing which you want.
The ANSI encoding is the encoding used by the "A" functions in the Windows API (the "W" functions use UTF-16). Which encoding it corresponds to usually depends on your Windows system language. The most common is CP1252 (also known as Windows-1252). So, when your editor says ANSI, it is meaning "whatever the API functions use as the default ANSI encoding", which is the default non-Unicode encoding used in your system (and thus usually the one which is used for text files).
The page he links to gives this historical tidbit (quoted from a Microsoft PDF) on the origins of CP1252 and ISO-8859-1, another oft-used encoding:
[...] this comes from the fact that the Windows code page 1252 was originally based on an ANSI draft, which became ISO Standard 8859-1. However, in adding code points to the range reserved for control codes in the ISO standard, the Windows code page 1252 and subsequent Windows code pages originally based on the ISO 8859-x series deviated from ISO. To this day, it is not uncommon to have the development community, both within and outside of Microsoft, confuse the 8859-1 code page with Windows 1252, as well as see "ANSI" or "A" used to signify Windows code page support.
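The practical difference between the two encodings is confined to bytes 0x80 through 0x9F. A short Python check can make this visible; the sample bytes are made up for illustration.

```python
# Bytes 0x80-0x9F are printable characters in Windows-1252
# but C1 control codes in ISO-8859-1 (Latin-1).
sample = b"\x80 100 \x93quoted\x94"

as_cp1252 = sample.decode("cp1252")   # euro sign and curly quotes
as_latin1 = sample.decode("latin-1")  # control characters instead

print(repr(as_cp1252))
print(repr(as_latin1))

# Outside that range the two encodings agree byte for byte.
assert b"caf\xe9".decode("cp1252") == b"caf\xe9".decode("latin-1")
```

This is why picking ISO-8859-1 for a file that was really saved as Windows-1252 tends to go unnoticed until the text contains a euro sign or typographic quotes.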
edited Aug 25, 2020 at 15:59
phuclv
answered Sep 30, 2009 at 18:17
quack quixote
2
7
Don't use the same filename as input and output! iconv seems to truncate files to 32,768 bytes if they exceed this size. As he writes in the file he's trying to read from, he manages to do the job if the file is small enough, else he truncates the file without any warning...
– sylbru
Sep 11, 2014 at 7:32
3
FYI This question is tagged with osx and it doesn't look like either of the convert-all commands work on Yosemite or El Cap. The iconv version Apple ships doesn't support --verbose or -o, and the other syntax redirecting stdout doesn't work for some reason and just sends it to regular stdout.
– Scott McIntyre
May 9, 2016 at 13:02
40
With PowerShell you can do something like this:
Get-Content IN.txt | Out-File -encoding ENC -filepath OUT.txt
While ENC is something like unicode, ascii, utf8, and utf32. Check out 'help out-file'.
To convert all the *.txt files in a directory to UTF-8, do something like this:
foreach($i in ls -name DIR/*.txt) { \
    Get-Content DIR/$i | \
    Out-File -encoding utf8 -filepath DIR2/$i \
}
which creates a converted version of each .txt file in DIR2.
To replace the files in all subdirectories, use:
foreach($i in ls -recurse -filter "*.java") {
    $temp = Get-Content $i.fullname
    Out-File -filepath $i.fullname -inputobject $temp -encoding utf8 -force
}
edited Aug 29, 2020 at 12:22
Peter Mortensen
answered Feb 26, 2010 at 6:31
akira
5
Converting from ANSI to UTF via your first proposal does erase the whole content of my text file...
– Orsinus
May 9, 2015 at 7:24
@Acroneos: then you made a mistake: the in-file is IN.txt, the outfile is OUT.txt ... this way it is impossible to overwrite the original. if you used the same filename for IN.txt and OUT.txt then you overwrite the file you are reading from, obviously.
– akira
May 10, 2015 at 6:06
1
Powershell will convert to UTF with BOM. find and iconv might be much easier.
– pparas
Aug 23, 2017 at 14:25
@pparas that's wrong. Commands related to text files like Out-File, Get-Content, Set-Content... all have an -Encoding parameter which allows utf8BOM or utf8NoBOM. iconv is much worse in this regard because it never supports UTF-8 with BOM
– phuclv
Aug 21, 2020 at 23:17
1
\ is not an escape character in powershell so putting it at the end of each line won't work
– phuclv
Aug 25, 2020 at 16:00
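For readers unsure what the BOM discussion in these comments amounts to in bytes, here is a small Python illustration. It is only a sketch of the concept and says nothing about what a given PowerShell or iconv version emits; the sample string is arbitrary.

```python
import codecs

text = "h\u00e9llo"

with_bom = text.encode("utf-8-sig")   # prepends the 3-byte UTF-8 BOM
without_bom = text.encode("utf-8")    # no BOM

# The only difference is the leading byte order mark.
assert with_bom.startswith(codecs.BOM_UTF8)
assert with_bom[len(codecs.BOM_UTF8):] == without_bom

# Decoding with "utf-8-sig" strips a BOM if present
# and also accepts input that has none.
assert with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig") == text
```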
7
One liner using find, with automatic detection
The character encoding of all matching text files gets detected automatically and all matching text files are converted to UTF-8 encoding:
$ find . -type f -iname *.txt -exec sh -c 'iconv -f $(file -bi "$1" |sed -e "s/.*[ ]charset=//") -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;
To perform these steps, a subshell sh is used with -exec, running a one-liner with the -c flag, and passing the filename as the positional argument "$1" with -- {}. In between, the UTF-8 output file is temporarily named converted.
The find command is very useful for such file management automation.
Click here for more find galore.
edited Aug 30, 2020 at 19:48
answered Aug 28, 2016 at 19:53
Serge Stroobandt
2
2
This works on Mac: find . -type f -iname "*.txt" -exec sh -c 'iconv -f windows-1252 -t utf-8 "$1" > converted && mv converted "$1"' -- "{}" \;, to convert from ANSI
– djjeck
Feb 17, 2021 at 20:05
2
My iconv command from git bash has no -o option so I use file redirection >: find . -type f -name '*.txt' -exec sh -c 'iconv -f $(file -bi "$1" |sed -e "s/.*[ ]charset=//") -t utf-8 > /tmp/converted "$1" && mv /tmp/converted "$1"' -- {} \;. Any advantage of using this syntax -- as opposed to passing {} directly? find . -type f -name '*.txt' -exec sh -c 'iconv -f $(file -bi {} |sed -e "s/.*[ ]charset=//") -t utf-8 > /tmp/converted {} && mv /tmp/converted {}' \;
– Sybuser
Dec 30, 2021 at 14:31
5
The Wikipedia page on newlines has a section on conversion utilities.
This seems your best bet for a conversion using only tools Windows ships with:
TYPE unix_file | FIND "" /V > dos_file
answered Aug 21, 2009 at 9:21
user4358
1
this is newline conversion and has nothing to do with encoding conversion
– phuclv
Aug 21, 2020 at 23:18
3
UTFCast is a Unicode converter for Windows which supports batch mode. I'm using the paid version and am quite comfortable with it.
UTFCast is a Unicode converter that lets you batch convert all text files to UTF encodings with just a click of your mouse. You can use it to convert a directory full of text files to UTF encodings including UTF-8, UTF-16 and UTF-32 to an output directory, while maintaining the directory structure of the original files. It doesn't even matter if your text file has a different extension, UTFCast can automatically detect text files and convert them.
edited Dec 7, 2011 at 2:16
Gaff
answered Dec 6, 2011 at 18:48
Tiler
3
Seems they cannot convert into the same folder, only into another destination folder.
– Uwe Keim
Aug 9, 2016 at 19:49
The pro version allows in-place conversion. $20/3 months. rotatingscrew.com/utfcast-version-comparison.aspx
– SherylHohman
Jan 30, 2019 at 19:19
Oh, express (free) version is useless - it only "Detects" utf-8 WITH BOM!! (everyone can do that). Only Pro version that Auto-Renews every 3 months at $20 a pop, will auto-detect. Price is steep for a non-enterprise user. AND Beware if you try the basic version, and your file is already utf-8 (without BOM), then this converter will detect it as ASCII, then (re-)"convert" it to utf-8, which could result in gibberish. Be aware of this before trying the express version! They have a demo version for the pro that produces no output - pointless IMHO cuz can't verify results before buying!
– SherylHohman
Jan 30, 2019 at 19:38
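The "gibberish" from converting a file that is already UTF-8 is easy to reproduce. Below is a small Python sketch, with one possible guard against it: try a strict UTF-8 decode first and fall back to a legacy encoding only when that fails. The helper name `to_utf8` and the windows-1252 fallback are assumptions for illustration and do not describe how UTFCast or any other tool works.

```python
def to_utf8(data, fallback="cp1252"):
    """Return data as UTF-8 bytes, leaving valid UTF-8 input unchanged.

    Hypothetical helper: strict UTF-8 decoding is tried first; only input
    that is not valid UTF-8 is treated as the fallback encoding.
    """
    try:
        data.decode("utf-8")  # strict error handling is the default
        return data
    except UnicodeDecodeError:
        return data.decode(fallback).encode("utf-8")


already_utf8 = "caf\u00e9".encode("utf-8")

# Blindly treating UTF-8 bytes as Windows-1252 produces mojibake.
garbled = already_utf8.decode("cp1252")
print(garbled)  # cafÃ©

# The guard leaves UTF-8 input alone and converts real Windows-1252 input.
assert to_utf8(already_utf8) == already_utf8
assert to_utf8("caf\u00e9".encode("cp1252")) == already_utf8
```

This is a heuristic rather than real detection: it works because most non-ASCII Windows-1252 text is not valid UTF-8, but it cannot tell apart different legacy code pages.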
3
There is a free and open-source batch encoding converter named CPConverter.
edited Aug 29, 2020 at 12:32
Peter Mortensen
answered Mar 28, 2020 at 19:15
MSS
1
This looked like exactly what I was hoping for (a GUI) though no drag-and-drop (you have to use the File menu) and no ANSI option (that I could find). Nothing I did showed up later as UTF-8 in Notepad++. If this had had just a little more development this tool would have been nearly perfect.
– John
Feb 24, 2021 at 11:23
2
In my use case, I needed automatic input encoding detection and there were a lot of files with Windows-1250 encoding, for which command file -bi