The call to Enum.IsDefined in EvaluatableTreeFindingExpressionVisitor.IsCurrentExpressionEvaluatable has been identified as expensive during routine profiling of EF Core. EF Core invokes EvaluatableTreeFindingExpressionVisitor on each query execution (as part of query parameterization) and so this code path is highly perf sensitive.
I think the solution is to stop calling Enum.IsDefined, and instead switch on NodeType, just like the net35 version of the code already does.
If this seems reasonable I will send a PR.
Sure, a PR would be greatly appreciated
Regarding the design: We moved away from the switch statement because of forward comaptibility/maintainability (in .NET 3.5, there were ~50 enum values, now theres 80+). I would like to try loading the defined values into a dictionary during type initialization and then just do a dictionary lookup. Whould that be fast enough?
The dictionary approach would likely be fast enough, certainly much faster than what we have now.
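To make the idea concrete, here is a minimal sketch of the set-based lookup discussed above. The class name `SupportedExpressionTypes` is hypothetical; the point is just that the defined values are materialized once during type initialization, so each subsequent check is a hash lookup rather than a reflective `Enum.IsDefined` call:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical helper; not actual re-linq code.
public static class SupportedExpressionTypes
{
  // Populated once during type initialization; afterwards every check is
  // an O(1) hash lookup with no reflection involved.
  private static readonly HashSet<ExpressionType> s_definedValues =
      new HashSet<ExpressionType> ((ExpressionType[]) Enum.GetValues (typeof (ExpressionType)));

  public static bool IsDefined (ExpressionType nodeType)
  {
    return s_definedValues.Contains (nodeType);
  }
}
```

A `HashSet<ExpressionType>` is used here instead of a dictionary since only membership matters, but a `Dictionary<ExpressionType, bool>` would behave the same way.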
An alternative that would use less memory would be to discover the max enum value once and check against that.
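A sketch of that numeric variant, assuming (as discussed below) that the enum values are contiguous between their minimum and maximum; the class name `ExpressionTypeRange` is hypothetical:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper; only valid while ExpressionType has no holes
// between its minimum and maximum values.
public static class ExpressionTypeRange
{
  private static readonly int s_min;
  private static readonly int s_max;

  static ExpressionTypeRange ()
  {
    // Computed once during type initialization.
    var values = ((ExpressionType[]) Enum.GetValues (typeof (ExpressionType))).Cast<int>().ToArray();
    s_min = values.Min();
    s_max = values.Max();
  }

  public static bool IsDefined (ExpressionType nodeType)
  {
    // Two integer comparisons per call; no allocation, no reflection.
    var value = (int) nodeType;
    return value >= s_min && value <= s_max;
  }
}
```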
Yes, I've thought of the numeric approach too, but I'm a bit ambivalent about it since it isn't guaranteed to always work in the future, i.e. if a new enum value were to be introduced with a hole in the enum value numbers. I guess that's probably not likely with System.* types, but is this convention safe enough to take a dependency on? (Honest question)
Personally, unless there's an actual impact that needs to be further optimized, I'd prefer to go with the safe and guaranteed-to-be-always-true design because it means less mental overhead when reasoning over the code.
In theory, we could also go with the complicated solution and factor this out into a factory that returns a numeric strategy for 'typical' enums and only falls back to the dictionary-based strategy when it's instantiated for an enum type that actually does have holes in the enum values between 0 and max. But as I said, that's more work.
Agree the factory approach would be over-complicated
Yes, I believe this particular optimization would be safe. I cannot envisage a scenario where we would introduce gaps in this enum - there would just be no reason to do so.
We can go with the dictionary; it will likely be fast enough. But this could be one of those times where it makes sense to sacrifice some clarity to eke out just a bit more perf. WRT EF, this is our most perf-critical re-linq code path.
Okay then. The check should of course test for upper and lower bounds in case someone casts a negative integer to the ExpressionType enum. Also, please add a test to the EvaluatableTreeFindingExpressionVisitorTest test fixture that asserts that the enum doesn't have holes. I usually do those to ensure the build is red if such an invariant is ever broken and add an explanatory fail message to the assertion so I know what needs to be done.
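Such an invariant test could look roughly like the following. This is only an NUnit-style sketch with a hypothetical test name and message; the actual fixture and wording would live in EvaluatableTreeFindingExpressionVisitorTest:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using NUnit.Framework;

[TestFixture]
public class ExpressionTypeInvariantTest
{
  [Test]
  public void ExpressionType_HasNoHoles ()
  {
    // Sort the distinct numeric values and verify they form a contiguous run.
    var values = ((ExpressionType[]) Enum.GetValues (typeof (ExpressionType)))
        .Cast<int>()
        .Distinct()
        .OrderBy (v => v)
        .ToArray();

    for (int i = 0; i < values.Length; i++)
    {
      Assert.That (
          values[i],
          Is.EqualTo (values[0] + i),
          "The ExpressionType enum now contains a hole in its numeric values. "
          + "The range-based IsDefined check in EvaluatableTreeFindingExpressionVisitor "
          + "relies on contiguous values and must be adjusted.");
    }
  }
}
```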
Just for fun's sake, I got another idea: we could populate a boolean array, using the enum value as an index, and set the value to true if the value exists. That way, we still have the two bounds checks plus one array lookup, but I admit I haven't measured it. It would be future-proof and should be faster than the dictionary. But as I said, just an idea.
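The boolean-array idea could be sketched like this (again with a hypothetical class name). Unlike the pure range check, it stays correct even if the enum ever gains holes, because undefined slots simply remain false:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper; not actual re-linq code.
public static class ExpressionTypeLookup
{
  // Index i is true iff i is a defined ExpressionType value.
  // Built once during type initialization.
  private static readonly bool[] s_isDefined = CreateLookup();

  private static bool[] CreateLookup ()
  {
    var values = (ExpressionType[]) Enum.GetValues (typeof (ExpressionType));
    var lookup = new bool[values.Cast<int>().Max() + 1];
    foreach (var value in values)
      lookup[(int) value] = true;
    return lookup;
  }

  public static bool IsDefined (ExpressionType nodeType)
  {
    // Two bounds checks plus one array read; also guards against
    // negative integers cast to ExpressionType.
    var value = (int) nodeType;
    return value >= 0 && value < s_isDefined.Length && s_isDefined[value];
  }
}
```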
Looking forward to the PR