Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LoadFromCollection enhancements #1058

Open
swmal opened this issue Sep 14, 2023 · 26 comments
Open

LoadFromCollection enhancements #1058

swmal opened this issue Sep 14, 2023 · 26 comments
Assignees
Labels
enhancement New feature or request

Comments

@swmal
Copy link
Contributor

swmal commented Sep 14, 2023

Working on fixes for the following issues:

Issue Description Status Available in version
#1042 Support for DisplayAttribute done 6.2.10, 7.0
#1052 EPPlusNestedTableColumn not expanding when filtering members done 6.2.10, 7.0
#969 Support for Dictionaries with attributes in LoadFromCollection done 7.0.2
#946 HyperlinkAttribute for LoadFromCollection done 6.2.10, 7.0
#876 Add support for the MetadataTypeAttribute in LoadFromCollection
#946, #1167 Support for sort order attribute on nested classes done 6.2.13, 7.0.2
Support for ColumnFormatter method on ExcelTableColumn attribute (see below) will start in a few days
@swmal swmal added the enhancement New feature or request label Sep 14, 2023
@swmal swmal self-assigned this Sep 14, 2023
swmal added a commit that referenced this issue Sep 14, 2023
@grosch-intl
Copy link

@swmal The only item not listed that I can remember is that I'd like a runtime way of formatting. For example, when a column is a DateTime, I usually want to format it according to the user's locale with something like a short date string. I can't do that with an attribute. It would be awesome to have a more generic FormatMethod type property on the EPPlus column attribute where I could give the name of a method that, at runtime, would be called to format the value before writing to the cell.

JanKallman pushed a commit that referenced this issue Sep 15, 2023
@swmal
Copy link
Contributor Author

swmal commented Sep 26, 2023

@grosch-intl Can you extend this class and show how you would like this to work? I assume the FormatMethod would sit on the class that contains the attribute?

public class Actor
{
    [EpplusIgnore]
    public int Id { get; set; }

    [EpplusTableColumn(Order = 3)]
    public string LastName { get; set; }
    [EpplusTableColumn(Order = 1, Header = "First name")]
    public string FirstName { get; set; }
    [EpplusTableColumn(Order = 2)]
    public string MiddleName { get; set; }

    [EpplusTableColumn(Order = 0, NumberFormat = "yyyy-MM-dd", TotalsRowLabel = "Total")]
    public DateTime Birthdate { get; set; }

    [EpplusTableColumn(Order = 4, NumberFormat = "€#,##0.00", TotalsRowFunction = RowFunctions.Sum, TotalsRowNumberFormat = "€#,##0.00")]
    public double Salary { get; set; }

    [EpplusTableColumn(Order = 5, NumberFormat = "0%", TotalsRowFormula = "Table1[[#Totals],[Tax amount]]/Table1[[#Totals],[Salary]]", TotalsRowNumberFormat ="0 %")]
    public double Tax { get; set; }
}

swmal added a commit that referenced this issue Sep 26, 2023
…s.DisplayAttribute in LoadFromCollection
@grosch-intl
Copy link

@swmal Something like this:

public class Example : Actor {
    [EpplusTableColumn(Order = 6, ColumnFormatter = nameof(Formatter))]
    public DateTime SomeDate { get; set; }

    public void Formatter(ExcelTableColumn column) {
        column.DataStyle.NumberFormat.Format = DateTimeFormatInfo.CurrentInfo.ShortDatePattern;
        column.DataCellStyleName = _dateColumnStyle;
    }
}

JanKallman pushed a commit that referenced this issue Sep 27, 2023
swmal pushed a commit that referenced this issue Sep 29, 2023
JanKallman added a commit that referenced this issue Oct 2, 2023
* WIP on LoadFromCollection hyperlinks

* Fixed an issue with auto filters

* Fixed failing test

* #1058, #946 added hyperlink style to hyperlinks created by the LoadFromCollection method

---------

Co-authored-by: JanKallman <[email protected]>
Co-authored-by: swmal <{ID}+username}@users.noreply.github.com>
@swmal
Copy link
Contributor Author

swmal commented Nov 8, 2023

@grosch-intl - does this look ok for #969?

    [TestClass]
    public class LoadFromCollectionDictionaryPropertyTests
    {
        [EpplusTable]
        public class TestClass
        {
            [EpplusTableColumn(Order = 2)] 
            public string Name { get; set; }

            [EPPlusDictionaryColumn(Order = 1, ColumnHeaders = new string[] { "A", "B", "C"})]
            public Dictionary<string, object> Dict { get; set; } 
        }

        [TestMethod]
        public void ShouldReadColumnsAndValuesFromDictionaryProperty()
        {
            var item1 = new TestClass
            {
                Name = "test 1",
                Dict = new Dictionary<string, object> { { "A", 1 }, { "B", 2 } }
            };
            var items = new List<TestClass> { item1 };
            using(var package = new ExcelPackage())
            {
                var sheet = package.Workbook.Worksheets.Add("test");
                sheet.Cells["A1"].LoadFromCollection(items, c => c.PrintHeaders = true);
                
                // Column headers
                Assert.AreEqual("A", sheet.Cells["A1"].Value);
                Assert.AreEqual("B", sheet.Cells["B1"].Value);
                Assert.AreEqual("C", sheet.Cells["C1"].Value);
                Assert.AreEqual("Name", sheet.Cells["D1"].Value);
                
                // values
                Assert.AreEqual(1, sheet.Cells["A2"].Value);
                Assert.AreEqual(2, sheet.Cells["B2"].Value);
                Assert.IsNull(sheet.Cells["C2"].Value);
                Assert.AreEqual("test 1", sheet.Cells["D2"].Value);
            }

        }
    }

swmal added a commit that referenced this issue Nov 8, 2023
@grosch-intl
Copy link

@swmal Unfortunately, no :( The headers aren't defined at compile time as you're showing there. The headers are going to be determined at runtime, so an attribute would tell which property contains the list of headers.

        [EpplusTable]
        public class TestClass
        {
            [EpplusTableColumn(Order = 2)] 
            public string Name { get; set; }

            [EPPlusIgnore]
            public List<string> Headers { get; set; }

            [EPPlusDictionaryColumn(Order = 1, ColumnHeaders = nameof(Headers))]
            public Dictionary<string, object> Dict { get; set; } 
        }

@swmal
Copy link
Contributor Author

swmal commented Nov 9, 2023

@grosch-intl Ok, I understand. I would, however, like to avoid to have the keys provided by another property on the class. We could read the keys/column headers from the first item in the list, but it doesn't feel intuitive.

Not sure if I have understood your requirements fully, but see new suggestion below. I will keep the ColumnHeaders from the previous example (might be useful if you have very static data like quarters of a year or similar), but if there is a HeadersKey and an IDictionaryKeysProvider set it will have higher priority.

Let me know what you think.

New interface IDictionaryKeysProvider in EPPlus

namespace OfficeOpenXml.LoadFunctions
{
    /// <summary>
    /// Provides keys of a property decorated with the <see cref="EPPlusDictionaryColumnAttribute"/>
    /// </summary>
    public interface IDictionaryKeysProvider
    {
        /// <summary>
        /// This function will return keys that will be used as column headers
        /// based on the <paramref name="key"/> that will be read from the <see cref="EPPlusDictionaryColumnAttribute"/>
        /// </summary>
        /// <param name="key"></param>
        /// <returns></returns>
        public IEnumerable<string> GetKeys(string key);
    }
}

New version of the above

[EpplusTable]
public class TestClass
{
    [EpplusTableColumn(Order = 2)] 
    public string Name { get; set; }

    [EPPlusDictionaryColumn(Order = 1, HeadersKey = "1")]
    public Dictionary<string, object> Columns { get; set; } 
}

public class MyKeysProvider : IDictionaryKeysProvider
{
    public IEnumerable<string> GetKeys(string key)
    {
        switch(key)
        {
            case "1":
                return new string[] { "A", "B", "C" };
            case "2":
                return new string[] { "C", "D", "E" };
            default:
                return Enumerable.Empty<string>();
        }
    }
}

[TestMethod]
public void ShouldReadColumnsAndValuesFromDictionaryProperty()
{
    var item1 = new TestClass
    {
        Name = "test 1",
        Columns = new Dictionary<string, object> { { "A", 1 }, { "B", 2 } }
    };
    var items = new List<TestClass> { item1 };
    using(var package = new ExcelPackage())
    {
        var sheet = package.Workbook.Worksheets.Add("test");
        sheet.Cells["A1"].LoadFromCollection(items, c =>
        {
            c.PrintHeaders = true;
            c.KeysProvider = new MyKeysProvider();
        });
        Assert.AreEqual("A", sheet.Cells["A1"].Value);
        Assert.AreEqual("B", sheet.Cells["B1"].Value);
        Assert.AreEqual("C", sheet.Cells["C1"].Value);
        Assert.AreEqual("Name", sheet.Cells["D1"].Value);
        Assert.AreEqual(1, sheet.Cells["A2"].Value);
        Assert.AreEqual(2, sheet.Cells["B2"].Value);
        Assert.IsNull(sheet.Cells["C2"].Value);
        Assert.AreEqual("test 1", sheet.Cells["D2"].Value);
    }

}

@grosch-intl
Copy link

@swmal That doesn't work because again, I'd have to know at compile time what my keys are. To make it work at runtime, I'd have to add a method to the keys provider to pass in the values for a given key, which breaks the single responsibility principle. I'm also not seeing where you're specifying that the table should use the MyKeysProvider anywhere.

Why don't you like the attribute? Seems clean and self-contained.

Another potential option is we could use an OrderedDictionary

@swmal
Copy link
Contributor Author

swmal commented Nov 10, 2023

@grosch-intl - I have looked at the code you added to #969 but since I don't know the structure of the data and data.Entries variables I'm afraid that I might be missing something.

Is this correct?

  • First we need to set an initial list of strings that both will be used as keys in the Dictionary and to generate columns. We should be able to provide this list in runtime and not via the attribute.

  • Then - for each instance of T in LoadFromCollection<T> we will try to get the values of each key from the Dictionary property and put it in the corresponding row/column.

Anything else?

I'm also not seeing where you're specifying that the table should use the MyKeysProvider anywhere.

MyKeysProvider only delivers what your headers list in #969 contains. It is set here:

sheet.Cells["A1"].LoadFromCollection(items, c =>
{
     c.PrintHeaders = true;
     c.KeysProvider = new MyKeysProvider();
 });

Then EPPlus will call the GetKeys(string key) to get the keys/column names once for each time it finds a property with the EPPlusDictionaryColumn with the key argument from the attribute so you know which column you should provide the keys for.

When EPPlus has retrieved this "initial delivery" of the keys (which also will be used as columns) it will loop through the list of objects and for each member that is decorated with the EPPlusDictionaryColumn attribute it will try to get the value from the dictionary exposed by the member and put it in the corresponding column (if the key is missing the cell will be empty).

Why don't you like the attribute? Seems clean and self-contained.

Again, there might be a misunderstanding from my side here. But here are my objections to it based on my current understanding:

  1. The basic principle for LoadFromCollection<T> is that a member on an instance of T should correspond to a cell value. The attributes we have added so far are used to describe various settings on column, row or table/range level. The intention is that you can use the attributes on your existing domain objects, DTO's or what ever and EPPlus will not interfere with the design of the class. We don't want to enforce adding a property that makes no sense outside EPPlus LoadFromCollection method.
  2. We have seen examples where LoadFromCollection<T> is called with 100.000 instances or T or even more. To me this doesn't feel right from a design perspective - all of the instances would have the capability to deliver the keys but only one of them would do so...? Have I understood your suggestion correctly or is this where I'm going wrong?
  3. We want to avoid using strings to reference to members that we read via reflection if possible. Even if you use nameof "on the outside" it is just a string in our internal implementation and this has caused complexity so far when it comes to sorting, recursion when handling nested classes, etc. Ideally we would like to have the attribute point to a method that EPPlus can call, which is what I tried to achieve.

@grosch-intl
Copy link

@swmal I'm so sorry, I thought I had replied to this!

  • First we need to set an initial list of strings that both will be used as keys in the Dictionary and to generate columns. We should be able to provide this list in runtime and not via the attribute.

No, there's no initial requirement. It's just a list of keys that will be determined at runtime:

void Main() {
	var keys = new HashSet<string>();
	
        var rows = GetMyData();

	foreach (var row in rows)
		foreach (var date in row.CountByQuarter.Keys)
			keys.Add(date);
			
	var orderedColumnNames = keys.OrderBy(k => ...).ToArray();
	
	sheet.Cells["A1"].LoadFromCollection(rows, c => {
           c.PrintHeaders = true;
           c.OrderedKeys = orderedColumnNames
        });
}

public class Foo {
        [EPPlusTableColumn(...)]
        public string Something { get; set; }

	public Dictionary<string, object> CountByQuarter { get; set; }
}

public List<Foo> GetMyData() { .... }
}

So I got my data, then I looped through the rows and looked at the dictionary that was there, grabbing each unique key. Some rows might have some, some rows have others, some might have all.

Then I sort that distinct list of keys via some business logic into the order that I want them to appear in the table. Now, when you write each row to the table, and it's time to hit the CountByQuarter property, you would create columns in the order specified via the orderedColumnNames from above.

I hope that's clearer?

MyKeysProvider only delivers what your headers list in #969 contains. It is set here:

sheet.Cells["A1"].LoadFromCollection(items, c =>
{
     c.PrintHeaders = true;
     c.KeysProvider = new MyKeysProvider();
 });

At the point I call this, I'm always going to have that orderedColumnNames array of sorted columns. Can I just pass this directly, instead of needing to create a provider class for it? I'd hate to have to create a providers class where I just have an initializer that stores the values and then the GetKeys returns that list back.

@swmal
Copy link
Contributor Author

swmal commented Nov 17, 2023

Ok, just so we are totally on the same page: you mean that there has to be no "master data" or "initial delivery" of the keys, but rather that EPPlus should loop through all the instances in the collection and create the dictionary with column names itself?

Even if we have an ordered dictionary, autodetecting sorting will in some cases be impossible. T-shirt size is an obvious example (XS, S, M, L, XL), other domains could be various geographic structures, code systems, etc. So I think that we need to have the possibility for the user of our API to define the sort order themselves.

Do you think the following would work?

  1. We keep the HeaderKey of the new EPPlusDictionaryColumn attribute. We should probably rename it to ColumnId
  2. EPPlus will autodetect keys from the collection and sort those keys in ascending order per default. We can add a property to the attribute where you can specify asc or desc.
  3. Instead of the KeysProvider we could add a Lambda expression to the LoadFromCollectionOptions where you can override the sorting. This would most likely be a Func<string, IEnumerable<string>, IEnumerable<string>>. Would look something like this:
sheet.Cells["A1"].LoadFromCollection(items, c =>
{
     c.PrintHeaders = true;
     c.DictionaryKeysSort = (headerKey, columns) => {
             // re-sort columns...
             return columns;
        };
 });

@grosch-intl
Copy link

Ok, just so we are totally on the same page: you mean that there has to be no "master data" or "initial delivery" of the keys, but rather that EPPlus should loop through all the instances in the collection and create the dictionary with column names itself?

No. That's our job. When EPPlus is called I already have the sorted column names.

Even if we have an ordered dictionary, autodetecting sorting will in some cases be impossible.

I'm realizing ordered dictionary wouldn't work because that would require ever row to have every key, which isn't the case.

Do you think the following would work?

I'm not exactly sure what that lambda is taking but I already have the sort order when I call LoadFromCollection,

See my example where I set c.OrderedKeys = orderedColumnNames

@grosch-intl
Copy link

Trying to write this like an agile story:

As a developer, I want to load my collection into a sheet using LoadFromCollection. One of my properties to load might be a Dictionary<string, object>.

I will provide an array of header names to the load call, and when you go to generate cells for the dictionary property, regardless of the contents of the dictionary, you will create a column for each header provided in the array, in the same order as the array of header names.

For each entry in the header array, check to see if that specific string exists as a key in the dictionary. If it does, use the provided value. If it does not, then leave the cell blank.

Any key/value pairs that are in the dictionary, but are not present in the array of header names, should be ignored.

@swmal
Copy link
Contributor Author

swmal commented Nov 20, 2023

Good, let's agree on the wording of this story first.

To clarify what I did above I'm changing these two statements in the text:

One of my properties to load might be a Dictionary<string, object>

I will provide an array of header names to the load call

As a developer, I want to load my collection into a sheet using LoadFromCollection. One or many of my properties to load might be a Dictionary<string, object>.

I will provide an array of header names for each such property to the load call, and when you go to generate cells for the dictionary property, regardless of the contents of the dictionary, you will create a column for each header provided in the array, in the same order as the array of header names.

For each entry in the header array, check to see if that specific string exists as a key in the dictionary. If it does, use the provided value. If it does not, then leave the cell blank.

Any key/value pairs that are in the dictionary, but are not present in the array of header names, should be ignored.

swmal added a commit that referenced this issue Nov 20, 2023
@grosch
Copy link

grosch commented Nov 20, 2023

Sounds good.

@swmal
Copy link
Contributor Author

swmal commented Nov 21, 2023

If we use your code above, could this work?

void Main() {
	var keys = new HashSet<string>();
	
        var rows = GetMyData();

	foreach (var row in rows)
		foreach (var date in row.CountByQuarter.Keys)
			keys.Add(date);
			
	var countByQuarterKeys = keys.OrderBy(k => ...).ToArray();
	
	sheet.Cells["A1"].LoadFromCollection(rows, c => {
           c.PrintHeaders = true;
           c.RegisterDictionaryKeys(keysStore => {
               // keysStore is something EPPlus will provide where you can map
              // ordered keys (for example a string[]) to Dictionary properties
               keysStore.RegisterKeys("cbq", countByQuarterKeys );
               // you can continue to register more keys for other properties here...
           }
        });
}

public class Foo {
        [EPPlusTableColumn(...)]
        public string Something { get; set; }

        [EPPlusDictionaryColumn(Order = 2, KeysId = "cbq")]
	public Dictionary<string, object> CountByQuarter { get; set; }
}

public List<Foo> GetMyData() { .... }
}

@grosch-intl
Copy link

grosch-intl commented Nov 21, 2023

Yes, that would work, though it feels a bit verbose. On c could we just directly have a method that we give a key and the ordered keys to? Maybe even an overload of RegisterDictionaryKeys that takes those two things as params?

The normal case is going to just be a single dictionary like this. I've yet to want more than one, honestly. Even if somebody did want more than one, they could just call it again.

But, if for whatever reason that doesn't work, yes, what you showed seems like what I need!

@swmal
Copy link
Contributor Author

swmal commented Nov 21, 2023

We need to design our interfaces as futureproof as possible to avoid breaking changes later on. So we don't want to limit this to only support one Dictionary property even if that would be enough in your case.

Your feedback is appreciated and I agree that we could make this less verbose. The reason for having the keyStore parameter would be if we believe that this will be a more complex domain in the future so we can extend it with more methods/properties. But that seems unlikely when I think about it.

So when registring multiple sets of keys we can do as you suggested:

	sheet.Cells["A1"].LoadFromCollection(rows, c => {
           c.PrintHeaders = true;
           c.RegisterDictionaryKeys("cbq", countByQuarterKeys);

We could also allow registering the keys with a default (internal) key that we would use if you skip setting the KeysId on the attribute. In that case you would get away with:

void Main() {
	var keys = new HashSet<string>();
	
        var rows = GetMyData();

	foreach (var row in rows)
		foreach (var date in row.CountByQuarter.Keys)
			keys.Add(date);
			
	var countByQuarterKeys = keys.OrderBy(k => ...).ToArray();
	
	sheet.Cells["A1"].LoadFromCollection(rows, c => {
           c.PrintHeaders = true;
           c.RegisterDictionaryKeys(countByQuarterKeys);
     });
}
public class Foo {
        [EPPlusTableColumn(...)]
        public string Something { get; set; }

        [EPPlusDictionaryColumn(Order = 2)]
	public Dictionary<string, object> CountByQuarter { get; set; }
}

public List<Foo> GetMyData() { .... }
}

swmal added a commit that referenced this issue Nov 21, 2023
swmal added a commit that referenced this issue Nov 21, 2023
@grosch-intl
Copy link

Looks good.

JanKallman pushed a commit that referenced this issue Nov 22, 2023
JanKallman pushed a commit that referenced this issue Nov 22, 2023
swmal added a commit that referenced this issue Nov 22, 2023
JanKallman pushed a commit that referenced this issue Nov 22, 2023
* #969, #1058 - implemented EPPlusDictionaryColumnAttribute for LoadFromCollection with attributes

* #969, #1058 - added support for Dictionary property in LoadFromCollection

* #1058, #969 - LoadFromCollection, support for Dictionary property

* #1058, #969 - added testclass

* #1058, #969 - added unit test for reading column headers/keys from attribute
@swmal
Copy link
Contributor Author

swmal commented Nov 22, 2023

@grosch-intl The Dictionary property and SortOrder on nested classes are done now and will be included in the next version (see table in the description of this issue).

Here are unit tests that demonstrates how it works:

@grosch-intl
Copy link

Awesome, you rock! When do you think the next version will be released?

@swmal
Copy link
Contributor Author

swmal commented Nov 23, 2023

Thanks - most likely within a week. If you want to test it you can use our dev nuget feed

@swmal
Copy link
Contributor Author

swmal commented Nov 23, 2023

@grosch-intl we released new versions earlier today.

@grosch-intl
Copy link

@swmal Thank you so much, Mats! That really cleans up my code. Love how invested you guys are in continuing to make this product so useful for your customers.

@swmal
Copy link
Contributor Author

swmal commented Dec 6, 2023

@grosch-intl Sorry for late response to this and thanks for your kind words. Just the formatting piece left, I am currently working on cleaning up some pieces related to #1196.

If you have upgraded to EPPlus 7 the new Take and Skip functions can be useful for formatting columns. You could do something like this:

var range = sheet.Cells["A1"].LoadFromCollection(items);
if(whatever)
{
   range.TakeSingleColumn(3).Style.Numberformat = "#,###.00";
   // or...
   range.SkipRows(1).TakeColumnsBetween(3,2).Style.Numberformat = "#,###.00";
}

@grosch-intl
Copy link

@swmal Just checking in on this one.

@swmal
Copy link
Contributor Author

swmal commented Jan 10, 2024

Hi Scott,
We have made many changes to LoadFromCollection based in the issues we had with the combination of attributes and filters (when you add an array of System.Reflection.MemberInfo in the call to LoadFromCollection). This was reported by another user and I don't know if you use these, but the fixes will cause some minor behaviour changes when column sort order is defined in multiple places. For this reason we will wait with further changes until the next minor version (7.1) which we hope to release in a few weeks.

I will pick up the remaining formatting functionality in the near future and hope to have it included in 7.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants