How to do cross-references in SciFlow

SciFlow is one of several options when it comes to collaborative writing. I like the intuitive interface, but some useful built-in features are easy to miss, such as cross-references to other sections and to figures and tables. Adding them is super easy in SciFlow.

Let’s start with a new document. In this example, I have two sections on the left, with a table in Section 2. I have also pressed the “outline” button to see the document outline on the right.

In this example, I want to add a cross-reference to section 2 at the end of the “Main text” section. I simply select the section I want to refer to on the right

and drop it in the text where I want the cross-reference to appear.

Here we go, the placeholder for the cross-reference is included.

We can cross-reference figures and tables in the same way. Select the figure or table on the right (note the “Figures, Tables & Equations” below the list of sections),

and drag it

into the main text:

At the time of writing, the placeholder makes no distinction between figures and tables, but it’s just a placeholder…

A bit like LaTeX, SciFlow follows a “What You See Is What You Mean” approach, so the output will probably look different from what you have on the screen. Indeed, this is a strength of SciFlow, both in that it allows you to export in many formats, and in that it prevents you from spending hours tinkering with the formatting. Unlike some other online editors, SciFlow is good at producing Word documents, which are commonplace in the social sciences (many journals insist on a Word document during submission), or PDF, as you like. You choose the style and can readily change it because SciFlow separates content from form.

Here’s that little section in one style:

and here in another style:

There you go, placeholders replaced with the relevant text depending on the template used.

How to add text labels to a scatter plot in R?

Adding text labels to a scatter plot in R is easy. The basic function is text(), and here’s a reproducible example of how you can use it to create these plots:

[Figure: Adding text to a scatter plot in R]

For the example, I’m creating random data. Since the data are random, your plots will look different. In this fictitious example, I look at the relationship between a policy indicator and performance. It is conventional to put the outcome variable on the Y axis and the predictor on the X axis, but in this example there’s no relationship to reality anyway… I chose these min and max values for the random variables because I jotted down this code as an explanation for a replication. We have 25 observations, for 25 units I call “cantons”. The third line creates a character vector with the letters “A” to “Y” (ASCII codes 65 to 89): these are the labels!

policy = runif(25, min=0.4, max=0.7)
perfor = runif(25, min=500, max=570)
canton = sapply(65:89, function(x) rawToChar(as.raw(x)))
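As a minimal sketch of two optional tweaks that are not part of the code above: set.seed() makes the random data identical on every run (the seed value 42 is an arbitrary choice), and the built-in constant LETTERS should give the same labels as the sapply() line.

```r
# Optional: fix the seed so the random data are the same on every run
# (42 is arbitrary and not part of the original example)
set.seed(42)
policy <- runif(25, min = 0.4, max = 0.7)
perfor <- runif(25, min = 500, max = 570)

# LETTERS is a built-in constant holding "A" to "Z";
# the first 25 elements are "A" to "Y"
canton <- LETTERS[1:25]

# The rawToChar() approach from the text, for comparison
canton_raw <- sapply(65:89, function(x) rawToChar(as.raw(x)))
identical(canton, canton_raw)
```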

For the scatter plot on the left, we use plot(). Then we add the trend line with abline() and lm(). To add the labels, we use text(): the first argument gives the X value of each point, the second argument the Y value (so R knows where to place the text), and the third argument is the corresponding label. The argument pos=1 tells R to draw the label underneath the point; the other values change that position (pos=2 to the left, pos=3 above, pos=4 to the right).

plot(policy ~ perfor, bty="n", ylab="Policy Indicator", xlab="Performance", main="Policy and Performance")
abline(lm(policy ~ perfor), col="red")
text(perfor, policy, canton, pos=1)
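To see what the different values of pos do, here is a small sketch that is separate from the example data: four made-up points, each labelled with a different pos. The coordinates, the label strings, and the offset and cex values are only there for illustration.

```r
# Four illustrative points, one for each value of pos
x <- 1:4
y <- rep(1, 4)
pos_labels <- c("pos=1 (below)", "pos=2 (left)",
                "pos=3 (above)", "pos=4 (right)")

plot(x, y, pch = 19, bty = "n", xlim = c(0, 5), ylim = c(0.5, 1.5),
     xlab = "", ylab = "", main = "Values of pos in text()")

# Label each point with its own pos; offset sets the distance from the
# point (in character widths) and cex shrinks the text a little
for (i in 1:4) {
  text(x[i], y[i], pos_labels[i], pos = i, offset = 0.7, cex = 0.8)
}
```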

The scatter plot on the right is similar, but here we actually plot the labels instead of the dots. There are two differences in the code: First, we add type="n" to create the scatter plot without actually drawing any circles (an empty plot if you will). Second, when we add the text in the third line of the code, we do not have pos=1, because we want to place the labels exactly where the points are.

plot(policy ~ perfor, bty="n", type="n", ylab="Policy Indicator", xlab="Performance", main="Policy and Performance")
abline(lm(policy ~ perfor), col="red")
text(perfor, policy, canton)
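To get both plots next to each other, as in the left and right panels described here, one option is par(mfrow=c(1, 2)). This is a sketch of how that could look, not necessarily how the original figure was produced; the seed is arbitrary and only there to make the sketch reproducible.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
policy <- runif(25, min = 0.4, max = 0.7)
perfor <- runif(25, min = 500, max = 570)
canton <- sapply(65:89, function(x) rawToChar(as.raw(x)))

# One row, two columns; par() returns the previous settings
old_par <- par(mfrow = c(1, 2))

# Left panel: points with labels underneath
plot(policy ~ perfor, bty = "n", ylab = "Policy Indicator",
     xlab = "Performance", main = "Policy and Performance")
abline(lm(policy ~ perfor), col = "red")
text(perfor, policy, canton, pos = 1)

# Right panel: labels instead of points
plot(policy ~ perfor, bty = "n", type = "n", ylab = "Policy Indicator",
     xlab = "Performance", main = "Policy and Performance")
abline(lm(policy ~ perfor), col = "red")
text(perfor, policy, canton)

# Restore the previous layout
par(old_par)
```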

Calculating VIF by hand

A widespread measure of multicollinearity is the VIF (short for variance inflation factor). Multicollinearity describes the situation when the predictor variables in a multiple regression model are highly correlated, which is usually not desirable (assuming you haven’t gone Bayesian yet).

In R, the VIF can easily be calculated with the vif() function in the car package. It’s actually not difficult to do it by hand, which incidentally helps to understand what the VIF measures, why there is no different VIF for logistic regression models, and why the VIF is better than looking at bivariate correlations between predictors.

We start with some random data to run the multiple regression model. Here we create one outcome (y) and three predictor variables (x, z, a), full of random numbers. That’ll do for a demonstration.

x = runif(50) 
y = runif(50)
z = runif(50)
a = runif(50)

Here’s a simple OLS model:


m = lm(y ~ x + z + a)

If you have the car package installed, you can easily calculate the VIF:

library(car)
vif(m)

To do it by hand, though, we run a linear regression model (OLS) for each of the predictors. Here’s the code for predictor x. One of the predictors becomes the outcome variable (here x), and the other predictors remain predictors. The variable used as the outcome previously (y) does not appear here.

mx = lm(x ~ z + a)

The VIF for x is simply 1/(1-R²), where R² is taken from this auxiliary model. In R, we can run the following:

1/(1-summary(mx)$r.squared)
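The same step can be repeated for every predictor. The following is a minimal sketch; the names predictors, vif_by_hand and vif_check are mine, and the seed is arbitrary. As a cross-check that does not need the car package, the VIFs should equal the diagonal of the inverse of the predictors’ correlation matrix.

```r
set.seed(123)  # arbitrary seed, for reproducibility only
x <- runif(50)
y <- runif(50)
z <- runif(50)
a <- runif(50)

# Only the predictors matter for the VIF; the outcome y is not used
predictors <- data.frame(x = x, z = z, a = a)

vif_by_hand <- sapply(names(predictors), function(p) {
  # Regress predictor p on all the other predictors
  others <- setdiff(names(predictors), p)
  aux <- lm(reformulate(others, response = p), data = predictors)
  1 / (1 - summary(aux)$r.squared)
})
vif_by_hand

# Cross-check: diagonal of the inverse correlation matrix
vif_check <- diag(solve(cor(predictors)))
all.equal(unname(vif_by_hand), unname(vif_check))
```

Because the outcome never enters these auxiliary regressions, the result is the same whatever model (linear or logistic) the predictors are later used in.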

R: empty cells in weighted cross-tabs across multiple variables

I’m not even sure how to succinctly describe the problem, but here’s what worked for me. I have two sets of variables and want to run a cross-tabulation. I also want to weight the frequencies and then sum them, and there are some empty (blank) cells to add to the mix. Three small problems in one; R to the rescue.

The two series of variables are as follows. First, there are attributes, each with 6 categories that indicate frequencies, alas grouped: for attribute 1, variable Q5_1 indicates 0 occurrences, 1 to 5 occurrences, 6 to 10, etc. Second, there are sectors to identify subgroups, using a series of dummy variables (Q6_1, Q6_2, … Q6_10). So basically I want to run table(Q5_{1:6}, Q6_{1:10}), turning the categorical variables into approximate frequency counts.

First, I attach() my data; the get(paste(...)) code below looks up variables by name, and that is much easier with the data attached.

Second, I create an empty matrix that I will subsequently fill with the (approximated) frequencies: tbl <- matrix(data=NA, nrow=6,ncol=10).

Third, I cycle through each pair of variables: 1:6 attributes (attrvar), and 1:10 sectors (sectvar).
for(attrvar in 1:6) {
for(sectvar in 1:10) {

Here I create a simple cross-tabulation for the current pair of variables. get(paste(...)) does all the work.
raw <- table(get(paste("Q5_",attrvar,sep="")), get(paste("Q6_",sectvar,sep="")))

Since I want to weight the counts so as to approximate the actual frequencies from the categorical counts, I run into problems if there are empty cells in the previous step. table() simply leaves them out of the result. That’s usually fine, but problematic here because the weights are matched by position. So I have to add the zeros back in. Here’s one way to do this: create a vector with as many zeros as the attribute variable has categories: 6. (The package agrmt has a helper function for similar cases.)
raw2 <- c(0,0,0,0,0,0)

Next I replace the zeros with the actual values from raw if they exist. If they do not exist, we keep the zero. (Note that raw[i] reads from the first column of raw, so this works as intended when each sector dummy has only one non-missing value.)
# the row names of raw are the category codes that actually occur
for(i in seq_len(nrow(raw))){raw2[as.numeric(dimnames(raw)[[1]])[i]] <- raw[i]}

Now we have a complete frequency vector and I can apply my weights.
wei <- raw2 * c(0, 2.5, 5.5, 10.5, 15.5, 0)

and then sum up to approximate the actual count:
# rows are attributes (6), columns are sectors (10), matching tbl
tbl[attrvar, sectvar] <- sum(wei)
}
}

print(tbl) now gives me the cross-tabulation with approximate counts.
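For comparison, here is a self-contained sketch of the same idea on simulated data, without attach(). Everything about the data is an assumption for illustration: the data frame d is made up, the Q5 variables are coded 1 to 6, and the Q6 dummies are coded 1 or NA. Wrapping the answers in factor(..., levels=1:6) makes table() keep empty categories as zeros, so the zeros do not have to be added back in a separate step.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only
n <- 200

# Hypothetical data: 6 attribute variables (categories 1 to 6) and
# 10 sector dummies (1 = in sector, NA = not in sector)
d <- data.frame(id = seq_len(n))
for (k in 1:6)  d[[paste0("Q5_", k)]] <- sample(1:6, n, replace = TRUE)
for (s in 1:10) d[[paste0("Q6_", s)]] <- sample(c(1, NA), n, replace = TRUE)

weights <- c(0, 2.5, 5.5, 10.5, 15.5, 0)  # the weights used in the text

tbl <- matrix(NA, nrow = 6, ncol = 10,
              dimnames = list(paste0("Q5_", 1:6), paste0("Q6_", 1:10)))

for (attrvar in 1:6) {
  for (sectvar in 1:10) {
    # Respondents who belong to the current sector
    in_sector <- !is.na(d[[paste0("Q6_", sectvar)]])
    # Fixing the levels keeps categories with zero observations
    answers <- factor(d[[paste0("Q5_", attrvar)]][in_sector], levels = 1:6)
    counts <- as.vector(table(answers))  # always length 6
    tbl[attrvar, sectvar] <- sum(counts * weights)
  }
}

print(tbl)
```

Indexing the data frame with d[[...]] plays the role of get(paste(...)) here, so the data do not need to be attached.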