Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue with rich text #664

Open
StevenQuickGS1 opened this issue Jun 6, 2022 · 3 comments
Open

Performance issue with rich text #664

StevenQuickGS1 opened this issue Jun 6, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@StevenQuickGS1
Copy link

StevenQuickGS1 commented Jun 6, 2022

I'm inserting text with superscript footnote ids into cells and have found and usage of rich text is ~15x slower than plain text

Attached is a benchmark comparing epplus to npoi.

I am inserting 50k rows with two columns, the first column just an int id, the second column:

  • for plain text: [id] This is the test for id (ie: [1] This is the text for 1)
  • for rich text: instead of [id] above, just the int id as superscript.

rich text

epplus

Is there a better way to apply rich text than this that I'm missing?

Code for the benchmark is just a .net 6 console app using benchmark .net

    [SimpleJob(RuntimeMoniker.Net60)]
    [MemoryDiagnoser]
    //[RPlotExporter]
    public class ExcelLibraryCreateTests
    {
        private readonly Dictionary<int, string> items = new();

        [GlobalSetup]
        public void Setup()
        {
            ExcelPackage.LicenseContext = LicenseContext.NonCommercial;

            for (int i = 0; i < 50000; i++)
                items.Add(i, "This is text for " + i);
        }

        [Benchmark]
        public byte[] EpplusPlainText()
        {
            using (var file = new ExcelPackage())
            {
                var rowIndex = 1;
                var worksheet = file.Workbook.Worksheets.Add("Data");
                worksheet.Cells[rowIndex, 1].Value = "Key";
                worksheet.Cells[rowIndex, 2].Value = "Value";
                foreach (var item in items)
                {
                    rowIndex++;
                    worksheet.Cells[rowIndex, 1].Value = item.Key;
                    worksheet.Cells[rowIndex, 2].Value = $"[{item.Key}] {item.Value}";
                }
                return file.GetAsByteArray();
            }
        }

        [Benchmark]
        public byte[] EpplusRichText()
        {
            using (var file = new ExcelPackage())
            {
                var rowIndex = 1;
                var worksheet = file.Workbook.Worksheets.Add("Data");
                worksheet.Cells[rowIndex, 1].Value = "Key";
                worksheet.Cells[rowIndex, 2].Value = "Value";
                foreach (var item in items)
                {
                    rowIndex++;
                    worksheet.Cells[rowIndex, 1].Value = item.Key;

                    var richTextCollection = worksheet.Cells[rowIndex, 2].RichText;
                    var richText = richTextCollection.Add(item.Key.ToString());
                    richText.VerticalAlign = ExcelVerticalAlignmentFont.Superscript;
                    richTextCollection.Add(item.Value);
                }
                return file.GetAsByteArray();
            }
        }

        [Benchmark]
        public byte[] NpoiPlainText()
        {
            var workbook = new XSSFWorkbook();
            var sheet = workbook.CreateSheet();
            var rowIndex = 0;
            var header = sheet.CreateRow(0);
            header.CreateCell(0).SetCellValue("Key");
            header.CreateCell(1).SetCellValue("Value");

            foreach (var item in items)
            {
                rowIndex++;

                var row = sheet.CreateRow(rowIndex);
                row.CreateCell(0).SetCellValue(item.Key);
                row.CreateCell(1).SetCellValue($"[{item.Key}]{item.Value}");
            }

            using var ms = new MemoryStream();
            workbook.Write(ms);
            return ms.ToArray();
        }

        [Benchmark]
        public byte[] NpoiRichText()
        {
            var workbook = new XSSFWorkbook();
            var sheet = workbook.CreateSheet();
            var rowIndex = 0;
            var header = sheet.CreateRow(0);
            header.CreateCell(0).SetCellValue("Key");
            header.CreateCell(1).SetCellValue("Value");

            var superscript = workbook.CreateFont();
            superscript.TypeOffset = NPOI.SS.UserModel.FontSuperScript.Super;

            foreach (var item in items)
            {
                rowIndex++;

                var row = sheet.CreateRow(rowIndex);
                row.CreateCell(0).SetCellValue(item.Key);

                var cell = row.CreateCell(1);
                cell.SetCellValue($"{item.Key}{item.Value}");
                cell.RichStringCellValue.ApplyFont(0, 1, superscript);
            }

            using var ms = new MemoryStream();
            workbook.Write(ms);
            return ms.ToArray();
        }
@JanKallman
Copy link
Contributor

RichText creates a xml document for each cell, so the performance might not be that good, if you have a lot of data. I'll have a look to see if we can improve it somehow.

@JanKallman
Copy link
Contributor

Yes, this has bad performance if you have many cells with rich text. We'll need to rewrite this and remove the dependence on the XML DOM. I'll add this as an enhancement to a future version.

@JanKallman JanKallman added the enhancement New feature or request label Jun 8, 2022
@StevenQuickGS1
Copy link
Author

Great thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants