This is another simple thing I keep on looking up: how to set the encoding when importing CSV data in R under Windows. I need this when my data file is in UTF-8 (pretty standard these days), but I’m using R under Windows; or when I have a Windows-encoded file when using R elsewhere. The default encoding in Windows is not UTF-8, and R uses the default encoding — well, by default. Typically this is not an issue unless my data file contains accented characters in strings, which can lead to garbled text when the wrong encoding is set/assumed.
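The solution is quite simple: tell the read.csv() command which encoding to use. For a Windows-encoded file, add fileEncoding="" so that R converts the text while reading it, like this:
x <- read.csv("datafile.csv", fileEncoding="Windows-1252")
For a UTF-8 file, add encoding="", like this:
x <- read.csv("datafile.csv", encoding="UTF-8")
Note that encoding only marks the imported strings as Latin-1 or UTF-8 without converting them, so any other encoding, such as Windows-1252, needs fileEncoding instead.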
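To check that an import worked, a quick round trip can help. The following is only a minimal sketch: the file contents and the names in it are made up for illustration, and the file is written to a temporary location rather than being a real data file. It writes a small UTF-8 CSV byte by byte, reads it back with encoding="UTF-8", and then inspects how R has marked the strings.

```r
# Create a small UTF-8 encoded CSV in a temporary file (hypothetical data).
# The \u escapes produce UTF-8 strings, so charToRaw() yields UTF-8 bytes.
path <- tempfile(fileext = ".csv")
writeBin(charToRaw("name\nZo\u00eb\nRen\u00e9\n"), path)

# Read the file back, marking the imported strings as UTF-8.
x <- read.csv(path, encoding = "UTF-8", stringsAsFactors = FALSE)

# Inspect how R has marked each string.
print(Encoding(x$name))

# Compare the imported values with the expected accented names.
print(identical(x$name, c("Zo\u00eb", "Ren\u00e9")))
```

If the accented characters come out garbled, Encoding() is a reasonable first place to look, since it shows whether R treats the strings as "UTF-8", "latin1", or "unknown".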
One Reply to “Set Encoding When Importing CSV Data in R Under Windows”
Working across operating systems (Windows, Mac, GNU/Linux), I found it good practice to always specify encodings.