Skip to Navigation Skip to Content Skip to Search Skip to Site Map
Search

Merging data from Excel in R – 15 June 2017

This demonstration covers merge two Excel files in R and a couple of steps to check the data along the way.

You can watch the recording (4 minutes 46 seconds):

or download  (right click and choose the save/download link option) the recording to your computer via the link:

mergeExcelInR (.m4v format, 42 MB)

and read the parts of the script below (code available when viewing the full article)

The script:

library(dplyr)
library(readxl)

# read in data files
students <- read_excel("Student Info.xlsx", skip=2)
papersTaken <- read_excel("PapersStudents.xls", skip=2)


names(students)
names(papersTaken)

names(students) <- make.names(names(students))
names(students)

# Check if there are a reasonable number of multiple entries

papersTaken %>% group_by(studentID) %>%
 summarise(subtotal = n()) %>% 
 filter(subtotal > 1)

students %>% group_by(student.ID) %>%
 summarise(subtotal = n()) %>% 
 filter(subtotal > 1)

#check if there are unmatched entries
papersTaken[!(papersTaken$studentID %in% students$student.ID),]
students[!(students$student.ID %in% papersTaken$studentID),]

#combine data sets
combined <- merge(papersTaken, students, by.x="studentID", by.y="student.ID")

# save as csv
write.csv(combined, file="combined.csv", row.names=FALSE)

 

Leave a comment