Set Encoding When Importing CSV Data in R Under Windows

This is another simple thing I keep on looking up: how to set the encoding when importing CSV data in R under Windows. I need this when my data file is in UTF-8 (pretty standard these days), but I’m using R under Windows; or when I have a Windows-encoded file when using R elsewhere. The default encoding in Windows is not UTF-8, and R uses the default encoding — well, by default. Typically this is not an issue unless my data file contains accented characters in strings, which can lead to garbled text when the wrong encoding is set/assumed.

The solution is quite simple: add encoding="" to the read.csv() command, like this:

x <- read.csv("datafile.csv", encoding="Windows-1252")

or like this:

x <- read.csv("datafile.csv", encoding="UTF-8")

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s