Illegal Character Compilation Error | Baeldung

1. Overview

The illegal character compilation error is a file encoding error. It's produced when a source file is saved with an encoding, or an encoding marker, that the compiler doesn't expect. As a result, in languages like Java, we can get this type of error when we try to compile our project. In this tutorial, we'll describe the problem in detail along with some scenarios where we may encounter it, and then we'll present some examples of how to resolve it.

2. Illegal Character Compilation Error

2.1. Byte Order Mark (BOM)

Before we go into the byte order mark, we need to take a quick look at the UCS (Unicode) Transformation Format (UTF). UTF is a character encoding format that can encode all of the possible character code points in Unicode. There are several kinds of UTF encodings. Among these, UTF-8 is the most widely used.

UTF-8 uses an 8-bit variable-width encoding to maximize compatibility with ASCII. A file saved in this encoding may begin with a few extra bytes that encode the Unicode code point U+FEFF, the byte order mark (BOM). This mark, correctly handled, is invisible. However, in some cases, it could lead to data errors.
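To make the mark concrete, the following minimal sketch encodes U+FEFF as UTF-8 and prints the resulting bytes. The class and method names (BomBytes, utf8BomHex) are made up for this illustration and are not part of the article's code base.

```java
import java.nio.charset.StandardCharsets;

public class BomBytes {

    /** Encodes the code point U+FEFF as UTF-8 and returns its bytes as a hex string. */
    public static String utf8BomHex() {
        byte[] bytes = "\uFEFF".getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            // Mask to an unsigned value so that negative bytes print as two hex digits.
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(utf8BomHex()); // prints "EF BB BF"
    }
}
```

These three bytes are what a BOM-writing editor places at the very start of a UTF-8 file, ahead of the first visible character.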
In the UTF-8 encoding, the BOM is not required. Even so, it may still appear in UTF-8 encoded text. A BOM can be added either by an encoding conversion or by a text editor that uses it to flag the content as UTF-8.

Text editors like Notepad on Windows could produce this kind of addition. As a consequence, when we use a Notepad-like text editor to create a code example and try to compile it, we could get a compilation error. In contrast, modern IDEs encode created files as UTF-8 without the BOM. The next sections will show some examples of this problem.

2.2. Class with Illegal Character Compilation Error

Typically, we work with advanced IDEs, but sometimes we use a text editor instead. Unfortunately, as we've learned, some text editors could create more problems than solutions because saving a file with a BOM could lead to a compilation error in Java. The "illegal character" error occurs in the compilation phase, so it's quite easy to detect. The next example shows us how it works.

First, let's write a simple class in our text editor, such as Notepad. This class is just a representation; we could write any code to test. Next, we save our file with the BOM to test:

    public class TestBOM {
        public static void main(String... args) {
            System.out.println("BOM Test");
        }
    }

Now, we try to compile this file using the javac command:

    $ javac ./TestBOM.java

Consequently, we get the error message:

    ∩╗┐public class TestBOM {
     ^
    .\TestBOM.java:1: error: illegal character: '\u00bf'
    ∩╗┐public class TestBOM {
      ^
    2 errors

The ∩╗┐ prefix is the three BOM bytes (EF BB BF) as rendered by the console, and '\u00bf' corresponds to the last of them, 0xBF, read as a character in a single-byte encoding.

To fix this problem, we save the file as UTF-8 without a BOM and compile again. It's worth checking that our source files are always saved without a BOM.

Another way to fix this issue is with a tool like dos2unix. This tool will remove the BOM and also take care of other idiosyncrasies of Windows text files.

3. Reading Files

Additionally, let's analyze some examples of reading files encoded with a BOM.

Initially, we need to create a file with a BOM to use for our test. This file contains our sample text, "Hello world with BOM.", which will be our expected string. Next, let's start testing.
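One way such a test file could be produced is sketched below, using only the JDK. The class name BomFileFixture, the method createFileWithBom, and the temp-file prefix are assumptions made for this illustration; the article doesn't show how its own fixture is created.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BomFileFixture {

    public static final String EXPECTED = "Hello world with BOM.";

    /** Writes the sample text to a temp file, prefixed with the UTF-8 BOM bytes EF BB BF. */
    public static Path createFileWithBom() throws IOException {
        Path path = Files.createTempFile("bom-sample", ".txt");
        // Encoding U+FEFF as UTF-8 yields the three BOM bytes at the start of the file.
        byte[] content = ("\uFEFF" + EXPECTED).getBytes(StandardCharsets.UTF_8);
        Files.write(path, content);
        return path;
    }

    public static void main(String[] args) throws IOException {
        Path path = createFileWithBom();
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        // Java's UTF-8 decoder does not strip the BOM; it stays as a leading U+FEFF.
        System.out.println(text.equals(EXPECTED));      // prints "false"
        System.out.println(text.charAt(0) == '\uFEFF'); // prints "true"
        Files.delete(path);
    }
}
```

The two printed values illustrate why a plain comparison against the expected string fails: the decoded text carries one extra, invisible character at index 0.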
3.1. Reading Files Using BufferedReader

First, we'll test the file using the BufferedReader class:

    @Test
    public void whenInputFileHasBOM_thenUseInputStream() throws IOException {
        String line;
        String actual = "";
        // file is an InputStream opened on the BOM-prefixed test file
        try (BufferedReader br = new BufferedReader(new InputStreamReader(file))) {
            while ((line = br.readLine()) != null) {
                actual += line;
            }
        }
        assertEquals(expected, actual);
    }

In this case, when we try to assert that the strings are equal, we get an error:

    org.opentest4j.AssertionFailedError: expected: <Hello world with BOM.> but was: <Hello world with BOM.>
    Expected :Hello world with BOM.
    Actual   :Hello world with BOM.

If we skim the test output, both strings look equal. Even so, the actual value of the string starts with the BOM, which is invisible when printed. As a result, the strings aren't equal.

A quick fix is to replace the BOM characters:

    @Test
    public void whenInputFileHasBOM_thenUseInputStreamWithReplace() throws IOException {
        String line;
        String actual = "";
        try (BufferedReader br = new BufferedReader(new InputStreamReader(file))) {
            while ((line = br.readLine()) != null) {
                // strip any U+FEFF character from the line
                actual += line.replace("\uFEFF", "");
            }
        }
        assertEquals(expected, actual);
    }

The replace method clears the BOM from our string, so our test passes. We need to work carefully with the replace method, though. In this code it runs on every line, although a BOM only occurs at the very start of the file, so processing a huge number of files this way can lead to performance issues.

3.2. Reading Files Using Apache Commons IO

In addition, the Apache Commons IO library provides the BOMInputStream class. This class wraps an input stream, detects an encoded ByteOrderMark in its first bytes, and can exclude it from the data we read. Let's see how it works:

    @Test
    public void whenInputFileHasBOM_thenUseBOMInputStream() throws IOException {
        String line;
        String actual = "";
        ByteOrderMark[] byteOrderMarks = new ByteOrderMark[] {
            ByteOrderMark.UTF_8, ByteOrderMark.UTF_16BE, ByteOrderMark.UTF_16LE,
            ByteOrderMark.UTF_32BE, ByteOrderMark.UTF_32LE
        };
        // include = false: a detected BOM is skipped instead of being returned
        InputStream inputStream = new BOMInputStream(ioStream, false, byteOrderMarks);
        Reader reader = new InputStreamReader(inputStream);
        BufferedReader br = new BufferedReader(reader);
        while ((line = br.readLine()) != null) {
            actual += line;
        }
        assertEquals(expected, actual);
    }

The code is similar to the previous examples, but we pass the BOMInputStream as a parameter into the InputStreamReader.
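When adding a library isn't an option, the same idea can be approximated with the JDK alone. The sketch below is not how BOMInputStream is implemented; it's a simplified stand-in that handles only the UTF-8 BOM, and the names Utf8BomSkipper, skipUtf8Bom and readText are made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

public class Utf8BomSkipper {

    /** Returns a stream positioned after a leading UTF-8 BOM, if one is present. */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Read up to three bytes; a single read() call may return fewer than requested.
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pushback;
    }

    /** Reads all lines as UTF-8 and concatenates them, as the tests in this article do. */
    public static String readText(InputStream in) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(skipUtf8Bom(in), StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "\uFEFFHello world with BOM.".getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(new ByteArrayInputStream(data))); // prints "Hello world with BOM."
    }
}
```

Checking only the first three bytes means the rest of the stream is read untouched, which avoids calling replace on every line.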
3.3. Reading Files Using Google Data (GData)

Another helpful library to handle the BOM is Google Data (GData). This is an older library, but it helps manage the BOM inside the files. It uses XML as its underlying format. Let's see it in action:

    @Test
    public void whenInputFileHasBOM_thenUseGoogleGdata() throws IOException {
        // 21 is the length of the expected string
        char[] actual = new char[21];
        try (Reader r = new UnicodeReader(ioStream, null)) {
            r.read(actual);
        }
        assertEquals(expected, String.valueOf(actual));
    }

Finally, as we observed in the previous examples, removing the BOM from the files is important. If we don't handle it properly, we get unexpected results when the data is read. That's why we need to be aware of the existence of this mark in our files.

4. Conclusion

In this article, we covered several topics regarding the illegal character compilation error in Java. First, we learned what UTF is and how the BOM is integrated into it. Second, we showed a sample class created using a text editor, Windows Notepad in this case. Compiling that class produced the illegal character error. Finally, we presented some code examples on how to read files with a BOM.

As usual, all the code used for this example can be found over on GitHub.


