Optimizing Memory Usage while Working with Big Files having Large Datasets
When building a workbook with large data sets, or reading a big Microsoft Excel file, the total amount of RAM the process takes is always a concern. There are measures that can be adopted to cope with the challenge. Aspose.Cells for Python via .NET provides options and API calls to reduce memory use. They can also help the process work more efficiently and run faster.
Use the MemorySetting.MEMORY_PREFERENCE option to optimize memory use for cells data and decrease the overall memory cost. When building a large data set for cells, it can save a certain amount of memory compared to using the default setting (MemorySetting.NORMAL).
Optimizing Memory
Reading Large Excel Files
The following example shows how to read a large Microsoft Excel file in optimized mode.
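A minimal sketch of such a read, assuming the `aspose-cells-python` package is installed and that `LoadOptions` exposes the option as a `memory_setting` attribute. The helper name `load_large_workbook` and the file name are placeholders, not part of the library.

```python
def load_large_workbook(path):
    """Open an Excel file with memory-preferring cell storage."""
    # Imported inside the function so the dependency is only needed when called.
    from aspose.cells import LoadOptions, MemorySetting, Workbook

    options = LoadOptions()
    # Trade some speed for a smaller in-memory cell model.
    options.memory_setting = MemorySetting.MEMORY_PREFERENCE
    return Workbook(path, options)


# Usage (the file name is a placeholder):
# workbook = load_large_workbook("Book1.xlsx")
# first_sheet = workbook.worksheets[0]
```

The option is passed at load time so that it is in effect while the cell data is being read, rather than applied to a workbook that is already in memory.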
Writing Large Excel Files
The following example shows how to write a large dataset to a worksheet in an optimized mode.
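A sketch of such a write, under two assumptions that should be checked against the API reference: the workbook-level option is set through `workbook.settings.memory_setting`, and it applies only to worksheets created after it is set (so the automatically created first sheet is not used here). The helper name `build_large_workbook`, the sheet name, and the generated values are illustrative.

```python
def build_large_workbook(n_rows, n_cols, output_path=None):
    """Fill a new sheet row by row under MEMORY_PREFERENCE."""
    # Imported inside the function so the dependency is only needed when called.
    from aspose.cells import MemorySetting, Workbook

    workbook = Workbook()
    # Set the option before adding sheets; assumed here to apply only to
    # worksheets created after this line, not to the default first sheet.
    workbook.settings.memory_setting = MemorySetting.MEMORY_PREFERENCE
    cells = workbook.worksheets.add("Data").cells  # "Data" is a placeholder name

    # Write cell by cell within a row, then row by row.
    for row in range(n_rows):
        for col in range(n_cols):
            cells.get(row, col).put_value(row * n_cols + col)

    if output_path is not None:
        workbook.save(output_path)
    return workbook


# Usage (sizes and file name are placeholders):
# build_large_workbook(100_000, 20, "output.xlsx")
```

The loops fill one row completely before moving to the next, which matches the sequential access order that suits this mode.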
Caution
The default option, MemorySetting.NORMAL, is applied in all versions. In some situations, such as building a workbook with a large data set for cells, the MemorySetting.MEMORY_PREFERENCE option may optimize memory use and decrease the memory cost for the application. However, this option may degrade performance in some special cases, such as the following.
- Accessing Cells Randomly and Repeatedly: The most efficient sequence for accessing the cells collection is cell by cell within one row, and then row by row. In particular, if you access rows and cells through the Enumerator acquired from Cells, RowCollection and Row, performance is maximized with MemorySetting.MEMORY_PREFERENCE.
- Inserting & Deleting Cells & Rows: If there are many insert or delete operations on cells or rows, the performance degradation with MemorySetting.MEMORY_PREFERENCE is notable compared to MemorySetting.NORMAL.
- Operating on Different Cell Types: If most of the cells contain string values or formulas, the memory cost is the same as with MemorySetting.NORMAL. If there are many empty cells, or the cell values are numeric, Boolean and so on, the MemorySetting.MEMORY_PREFERENCE option performs better.
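The sequential access order described in the first item can be sketched as a row-major scan. This is an illustration, not the library's recommended enumerator-based traversal: it assumes the `Cells` object exposes `max_data_row`, `max_data_column` and `get(row, column)`, and that `get` may instantiate cells that do not yet exist. The helper name `sum_numeric_cells` is hypothetical.

```python
def sum_numeric_cells(cells):
    """Sum numeric values by scanning the used range in row-major order."""
    total = 0.0
    # Finish every cell in one row before moving on to the next row.
    for row in range(cells.max_data_row + 1):
        for col in range(cells.max_data_column + 1):
            value = cells.get(row, col).value
            # Skip strings, empty cells, and booleans (bool is a subclass of int).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                total += value
    return total


# Usage, given a workbook opened with MemorySetting.MEMORY_PREFERENCE:
# total = sum_numeric_cells(workbook.worksheets[0].cells)
```

Swapping the two loops would visit the same cells column by column, which is the kind of non-sequential pattern the first item warns against.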