Wordscores/JFreq with Long Manifestos

Today I ran into an unexpected error when using Wordscores in R. I used JFreq 0.5.4 to calculate the word frequencies for 35 parties with rather long party manifestos. This resulted in a 3.4M CSV file with 42462 columns. R threw a read.table error when I called the wfm function from Austin (0.2) to import the word frequencies: “Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names”. Well, the file seems too wide to open.

The solution I found was to use the old JFreq 0.2.5, which produces the output the other way around (rows/columns switched). Even if it is a bit slower than the newer JFreq, having a rather long (as opposed to wide) CSV with the word frequencies does not seem to pose problems.
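For readers who want to check their own file before switching JFreq versions, here is a minimal sketch in base R. It does not use Austin at all, and the tiny CSV it writes is a made-up stand-in for JFreq output (the file contents and party names are assumptions for illustration). It counts the fields per line, which is where a “more columns than column names” error originates when the header line has fewer fields than the data lines, and then transposes a wide table into the long orientation.

```r
# Hypothetical stand-in for a "wide" word frequency file:
# documents in rows, words in columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c(",tax,welfare,border",
             "partyA.txt,3,0,1",
             "partyB.txt,1,4,0"), tmp)

# count.fields() reports the number of fields on each line.
# If the header line has fewer fields than the data lines,
# read.table() complains about "more columns than column names".
n_fields <- count.fields(tmp, sep = ",", quote = "\"", comment.char = "")
print(table(n_fields))

# Read with the first column as row names, keeping words as typed.
wide <- as.matrix(read.csv(tmp, row.names = 1, check.names = FALSE))

# Transpose: words in rows, documents in columns (the "long" layout).
long <- t(wide)
print(dim(long))
```

If the field counts differ between lines, stray quote or comment characters inside the words are a common culprit, which is why the sketch sets quote and comment.char explicitly. Whether that is what happened with the file described above is not something this sketch can confirm.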

15 thoughts on “Wordscores/JFreq with Long Manifestos”

  1. Thanks for posting the link to the old JFreq program. I did run into the problem that my file was “too wide” and couldn’t be transposed using R (which has probably less to do with R than with an internal memory problem).

  2. Hi Didier,

    I am trying the older version of JFreq to avoid the read.table error, but it has switched my rows and columns. How can I now access the reference txt files and the virgin txt files?
    The getdocs command now returns a word rather than a document. Could you please help me understand how the code needs to change for the older version of JFreq?

    For example: ref <- getdocs(a, c(4,3)) // How can I change this line so that it selects docs rather than words?

  3. Hi Didier,

    Is there any way to extract the scorable words from the total words? The code is:

    predict(ws, newdata=vir)
    output: 5551 of 101159 words (5.49%) are scorable

    How can I extract these 5551 words?

  4. If your Wordscores model is called “ws”, I guess what you’re looking for is in ws$pi [use str(ws) to explore the structure of the model, or of any R object for that matter]. The total words are in ws$data.
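The two questions in the comments can be illustrated with a small base R sketch. It deliberately avoids Austin, so nothing here is a statement about how getdocs or predict work internally. The count matrix, the document names and the rule "a word is scorable if it occurs in both the reference and the virgin texts" are assumptions made for illustration. With words in rows and documents in columns, which is the layout of the older JFreq output, documents are picked by column, and the scorable words are the overlap between the two vocabularies.

```r
# Hypothetical counts: words in rows, documents in columns.
counts <- matrix(c(3, 1, 0, 2,
                   0, 4, 2, 0,
                   1, 0, 0, 5,
                   0, 0, 3, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(words = c("tax", "welfare", "border", "climate"),
                                 docs  = c("ref1.txt", "ref2.txt",
                                           "vir1.txt", "vir2.txt")))

# Documents are columns in this layout, so select them by column.
ref <- counts[, c(1, 2), drop = FALSE]   # reference texts
vir <- counts[, c(3, 4), drop = FALSE]   # virgin texts

# Words that actually occur in each set of documents.
scored_words <- rownames(ref)[rowSums(ref) > 0]
virgin_words <- rownames(vir)[rowSums(vir) > 0]

# Assumed definition: scorable words occur in both sets.
scorable_words <- intersect(virgin_words, scored_words)
print(scorable_words)
```

With a fitted model, the same idea would be to compare the word names stored in the model (see the hint about ws$pi above) with the word names of the virgin texts. Check with str(ws) where those names live in your version of Austin.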
