In my job as a data scientist, I regularly have to import several files that contain the same type of information, because of export constraints in different software. If you are in a similar situation, below is a simple way to automatically import your files as individual data frames or combine them into a single data frame.
Before we get started with our code, we first must prepare our files. We need to have a way to programmatically choose the files that we want to import into R. While you could choose any way to distinguish these files, here are two of the easiest ways:
- Create a unique prefix on all of the files that you want to import at once.
- Create a separate folder in your working directory and only include those files in that folder.
For example, suppose I have a set of Excel files named “SA#.xlsx”. If no other files in my folder start with SA, then I already have my prefix. If other files in my folder do start with SA, such as “SAT.xlsx”, I can create a folder named “SA” and put only the files I want to import into that folder.
Once we have a programmatic way to identify our files, we need to create a list of all of the file names. We can use the R function list.files() to achieve this.
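As a minimal sketch of what list.files() gives back, the snippet below builds a throwaway folder with a few empty placeholder files and lists its contents. The folder name and file names are made up for illustration, and the path argument is used here only so the example does not depend on your working directory.

```r
# Create a scratch folder so the example does not touch any real files.
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Hypothetical file names; these are empty placeholders, not real workbooks.
demo_names <- c("SA1.xlsx", "SA2.xlsx", "notes.txt")
invisible(file.create(file.path(demo_dir, demo_names)))

# list.files() returns a character vector with one element per file name.
all_files <- list.files(path = demo_dir)
print(all_files)
```

The result is an ordinary character vector, so it can be looped over or passed to a function such as lapply() in a later step.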
File list with prefix
If you chose to add a prefix to your file names, use the pattern parameter of list.files() to select the specific files that you want.
# Formula
filelist <- list.files(pattern = "^<prefix>")

# Example
filelist <- list.files(pattern = "^SA")
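To check which names a prefix pattern really picks up, here is a small sketch run against a scratch folder of empty placeholder files. The file names are hypothetical, and the stricter second pattern assumes a naming scheme of "SA" followed by digits and an .xlsx extension, which may not match your own files.

```r
# Scratch folder with hypothetical, empty placeholder files.
demo_dir <- file.path(tempdir(), "prefix_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_names <- c("SA1.xlsx", "SA2.xlsx", "SAT.xlsx", "USA1.xlsx")
invisible(file.create(file.path(demo_dir, demo_names)))

# "^SA" keeps every name that starts with "SA", so "SAT.xlsx" is still matched,
# while "USA1.xlsx" is dropped because its "SA" is not at the start.
prefix_only <- list.files(path = demo_dir, pattern = "^SA")

# Stricter pattern (an assumption about the naming scheme): "SA", then one or
# more digits, then the literal ".xlsx" extension at the end of the name.
numbered_only <- list.files(path = demo_dir, pattern = "^SA[0-9]+\\.xlsx$")

print(prefix_only)
print(numbered_only)
```

If a stricter pattern like this fits your file names, it can stand in for the separate-folder approach when a few look-alike files share the prefix.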
The pattern parameter takes a regular expression, so we can use the “^” symbol to match the beginning of the file name. This ensures that file names containing “SA” somewhere other than the beginning are not included in this set of names. Note: This will only pull files from your working directory. You can change the…