Testing CSV and Excel uploads with tempfiles

Lately, I've been working on a couple of features that allow bulk data to be uploaded through .csv or .xlsx files.

When working with files, passing file fixtures to tests is a common practice. File fixtures are sample files that you can store under spec/fixtures/files (or in your own customized path). This way, you can read an actual file in your tests or just pass the file path to other classes that might need to access the file data.

The problem is that in most cases the user can upload files with different structures or data, depending on the use case. This means that, for test purposes, if we want to cover different scenarios, we will eventually end up stuffing our fixtures folder with enough file variations that would cover all the scenarios. It's just not flexible enough and it looks hard to maintain. Plus, it has the downside of hiding data from your tests since you have to open the files to know what's inside.

Ideally, we should be able to create the sample data explicitly in each test scenario, run our expectations, and clean up the data afterward. Sounds like a good case for Tempfiles - a file that will be available to test but that will be deleted after the test runs.

# students_uploader_spec.rb
describe 'Services::StudentsUploader' do 
  it 'creates students' do 
    Tempfile.open(['students.csv', '.csv']) do |temp|
       CSV.open(temp, 'wb', col_sep: ';', headers: true) do |csv|
         csv << ['email', 'name']
         csv << ['[email protected]', 'Student Name']
       end

       result = described_class.call(temp)

       expect(result[:count]).to eq(1)
       expect(Student.count).to eq(1)
     end
  end
end

Looking at the example above I can cleary define three necessary steps:

  • Open a Tempfile and pass it a filename (extension is optional).
  • Inside the Tempfile block, open a CSV, pass it the tempfile instance and build the rows with the data you need
  • Run the expectations inside the tempfile block

A fourth automatic step will delete the tempfile so you don't end up with a bunch of tempfiles lying around in your project.

It's important to reinforce that everything that depends on the tempfile will have to be run inside the tempfile block. That's because the tempfile is deleted immediately after its block is closed. If you try to run your expectation after the tempfile block is closed, you'll get an error similar to this one:

Failures:
  Failure/Error: result = described_class.call(temp)

  NameError:
    undefined local variable or method `temp' for #<RSpec::ExampleGroups::Services::SudentsUploader:0x00007fb74000c0b8>
  # ./spec/services/students_uploader_spec.rb:1:in `block (3 levels) in <top (required)>'

But going back to our happy path, I can also write a similar context for xlsx files (I'm using the axlsx gem to write excel files):

describe 'Services::StudentsUploader' do
  context 'when uploading a xlsx' do  
    it 'creates students' do 
      Tempfile.open(['students.xlsx', '.xlsx']) do |temp|
        Axlsx::Package.new do |p|
          p.workbook.add_worksheet(name: "test") do |sheet|
            sheet.add_row ['name', 'email', 'code']
            sheet.add_row ['Jane Doe', nil, 123]
            sheet.add_row ['John Doe', '[email protected]', nil]
          end
          p.serialize(temp.path)
        end

        result = described_class.call(temp.path)

        expect(result[:count]).to eq(1)
        expect(Student.count).to eq(1)
      end
    end
  end
end

Now I can play around with different file structures, knowing that I will have my data explicitly available in each scenario and that I will be able to tweak it easily without the hassle of having to create and maintain external files.

Happy testing! 🧪

19