Laravel-Excel: Why do queued exports take so long?

Created on 2 Jul 2020  ·  7 Comments  ·  Source: Maatwebsite/Laravel-Excel

Prerequisites

Versions

  • PHP version: 7.1.3
  • Laravel version: 5.8
  • Package version: 3.1

Description

I have been using this package for over a year now. Recently I started having memory and performance problems: users are trying to download a lot of data, sometimes historical data. So I read the docs and found the section on queued exports. At first it seemed like the perfect solution: with queued exports, multiple jobs are dispatched to generate the file in chunks. But I noticed something. Each dispatched job (AppendQueryToSheet) took longer to finish than the one before it, and the final job (CloseSheet) took even longer.
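For context, a queued export in 3.1 looks roughly like this (the class and model names below are placeholders, not from the original report):

```php
<?php

namespace App\Exports;

use App\Invoice;
use Illuminate\Contracts\Queue\ShouldQueue;
use Maatwebsite\Excel\Concerns\Exportable;
use Maatwebsite\Excel\Concerns\FromQuery;

// Implementing ShouldQueue makes ->queue() chunk the query and dispatch
// one AppendQueryToSheet job per chunk, followed by a final CloseSheet job.
class InvoicesExport implements FromQuery, ShouldQueue
{
    use Exportable;

    public function query()
    {
        return Invoice::query();
    }
}

// Somewhere in a controller:
// (new InvoicesExport)->queue('invoices.xlsx');
```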

My question is: is this how queued exports are supposed to work, or am I missing something?

Additional Information

I am using Redis and Horizon. I even tried multiple workers, but it seems each job is only dispatched after the previous one finishes, so the jobs run sequentially and extra workers did not solve my problem.

I attached a screenshot (Selection_007) showing the recorded job times.

question


All 7 comments

Every query has to reopen the spreadsheet file, which takes PhpSpreadsheet some time, so each append indeed takes longer than the last.

I've got some changes planned for 3.2 that will improve this.
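A toy, pure-PHP illustration of why this happens (this models the append-by-rewrite pattern; it is not Laravel Excel's actual code):

```php
<?php
// Toy model (not Laravel Excel internals): each "append job" reopens the
// whole file, adds a chunk, and writes everything back. The amount of data
// re-read grows with every job, so total work is quadratic in chunk count.

function appendByRewrite(string $file, array $rows): int
{
    $existing = is_file($file) ? file($file, FILE_IGNORE_NEW_LINES) : [];
    file_put_contents($file, implode("\n", array_merge($existing, $rows)) . "\n");
    return count($existing); // rows this job had to reload before appending
}

$file = sys_get_temp_dir() . '/toy_export_' . getmypid() . '.txt';
@unlink($file);

$reloaded = [];
for ($chunk = 0; $chunk < 5; $chunk++) {
    $reloaded[] = appendByRewrite($file, array_fill(0, 100, 'row'));
}
unlink($file);

echo implode(',', $reloaded), "\n"; // 0,100,200,300,400
```

Each job reloads 100 more rows than the previous one, which matches the steadily growing durations in the screenshot above.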

I'm experiencing the same issue here. The file sizes aren't massive either, only 5–10 MB, and I'm not seeing any spikes in CPU or memory usage.

I decided to build a service with Python and pandas, and performance improved a lot, and I mean A LOT. Laravel Excel is an excellent library, but maybe it's not the right tool for this job.

@okmkey45 To be fair, it's not really Laravel Excel's fault; it relies on PhpSpreadsheet, which is inefficient at reading and writing files. It's a shame, because the abstractions around chunking queries and queueing jobs are really handy here and save me a lot of time compared with writing a service in Go/Python and hooking it up to my queue.

There are quite a few alternatives to PhpSpreadsheet that are much faster on large files; maybe they could be swapped in when ->queue() or ShouldQueue is used?

@aakarim That's actually something I've considered, but I currently don't have the time or use case to implement it. I would be open to a PR as long as it is kept simple and opt-in.

I'll have a bash at it after this deadline. Are there any packages you prefer/recommend, @patrickbrouwers? https://github.com/box/spout seems like a good place to start.

@aakarim Yes, box/spout. I've even considered making it possible to use league/csv if you only need CSV; that's by far the fastest option for large bulk imports/exports.
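Plain PHP already shows why streaming CSV writers are so much faster (a sketch of the general approach, not league/csv's actual API): the handle stays open, each chunk is appended in constant time, and nothing is ever re-read.

```php
<?php
// Sketch of the streaming approach a CSV writer takes: open the handle once,
// append each chunk, never re-read the file. Cost per chunk stays constant
// regardless of how much was written before.

$file = sys_get_temp_dir() . '/toy_export_' . getmypid() . '.csv';
$handle = fopen($file, 'w');

for ($chunk = 0; $chunk < 3; $chunk++) {
    // In a real export each chunk would come from a paginated query.
    foreach ([[1, 'Alice'], [2, 'Bob']] as $row) {
        fputcsv($handle, $row);
    }
}
fclose($handle);

$lines = count(file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
unlink($file);
echo $lines, "\n"; // 6
```

This is why CSV output sidesteps the reopen cost entirely, while XLSX (a zipped XML archive) generally forces a full rewrite on every append.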
