Introduction
My story starts, as many good stories do, with the awesomeness of Entity Framework. Anyone who has used it knows that you can replace those ugly SQL WHERE clauses with beautiful lambda predicates:
var adultCustomers = db.Customers
    .Where(c => c.DateOfBirth < DateTime.Now.AddYears(-18));
Once you get the hang of that, it usually doesn’t take long to start wondering “Can I somehow pass in those predicates at runtime?” That would enable a generic search method, something like:
IEnumerable<Customer> SearchCustomers(string filterLambda)
{
    return db.Customers.Where( <somehow turn filterLambda into a lambda> );
}
First of all, what exactly would we be converting filterLambda to? Pop quiz: what’s the type of Where’s argument? If, like me, you guessed Func<Customer, bool> then you’re close but not quite there. It’s actually Expression<Func<Customer, bool>>, where Expression comes from the System.Linq.Expressions namespace.
It makes sense when you think about it. Entity Framework isn’t going to run your lambda. It wants to convert it to a SQL WHERE clause. That’s no easy task (did I mention EF is awesome?) but it’s at least contemplatable when the lambda logic is spelled out in an expression tree.
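To make the difference concrete, here is a small sketch that uses a plain int predicate instead of an EF entity. The class and helper names (ExpressionVsDelegate, RootNodeType) are made up for illustration. The point is that a Func can only be invoked, while an Expression can be inspected, which is what a query provider needs to do before it can produce SQL.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsDelegate
{
    // Hypothetical helper: reports the kind of the root node of a predicate's body.
    public static ExpressionType RootNodeType<T>(Expression<Func<T, bool>> predicate)
        => predicate.Body.NodeType;

    public static void Main()
    {
        // A delegate is compiled code: all you can do is call it.
        Func<int, bool> asDelegate = age => age < 18;

        // The same lambda as an expression tree: a data structure you can walk.
        Expression<Func<int, bool>> asTree = age => age < 18;

        Console.WriteLine(asDelegate(17));        // True
        Console.WriteLine(RootNodeType(asTree));  // LessThan
        Console.WriteLine(asTree.Compile()(17));  // True (a tree can still be compiled and run)
    }
}
```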
Compiling with Roslyn – first attempt
If you’re like me, at this point you think “Wait a minute, can’t Roslyn do that?” After all, filterLambda is a snippet of C# code. And now your online searches, with keywords like “Roslyn lambda expression tree”, will lead you straight to Roslyn’s CSharpScript.EvaluateAsync method. You need a bit of padding for the equivalent of C#’s using directives and assembly references, but basically the magic is just one line of code:
var filterExpression = await CSharpScript.EvaluateAsync<Expression<Func<Customer, bool>>>(filterLambda, options);
Wow – problem solved, right? Well, yes – with a but. And potentially a big one.
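For reference, the “padding” might look something like the sketch below. It assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package and a stand-in Customer class in the global namespace; ScriptDemo and ParseFilterAsync are hypothetical names, and the references and imports your own model needs will differ.

```csharp
using System;
using System.Linq.Expressions;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting; // NuGet: Microsoft.CodeAnalysis.CSharp.Scripting
using Microsoft.CodeAnalysis.Scripting;

// Stand-in for the EF entity used in the article.
public class Customer { public DateTime DateOfBirth { get; set; } }

public static class ScriptDemo
{
    // Hypothetical wrapper around the one-liner, with the options spelled out.
    public static Task<Expression<Func<Customer, bool>>> ParseFilterAsync(string filterLambda)
    {
        var options = ScriptOptions.Default
            // Assembly references: the script must be able to see Customer and Expression.
            .AddReferences(typeof(Customer).Assembly, typeof(Expression).Assembly)
            // The script-world equivalent of using directives.
            .AddImports("System", "System.Linq.Expressions");

        return CSharpScript.EvaluateAsync<Expression<Func<Customer, bool>>>(filterLambda, options);
    }

    public static async Task Main()
    {
        var filter = await ParseFilterAsync("c => c.DateOfBirth < DateTime.Now.AddYears(-18)");
        Console.WriteLine(filter); // prints the expression tree in its textual form
    }
}
```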
The memory leak problem
The problem is that this Roslyn magic comes with an unfortunate side effect. During compilation, EvaluateAsync generates a new assembly in memory, and it stays there forever. At this point in time there’s no way to unload it. The assembly is not huge but you get another one every time you call EvaluateAsync. I built a RESTful service with a search function based on EvaluateAsync and one day I came in to find it had crashed with an OutOfMemoryException.
Unloading assemblies
I went through the five stages (Denial, Anger, Depression, etc.) and eventually reached Acceptance: EvaluateAsync is incredibly convenient but sadly it isn’t suitable for what I’m doing.
In researching the problem I saw hints of a potential solution in a new feature of .NET Core 3 called “collectible AssemblyLoadContexts”. AssemblyLoadContext has been around for a long time, but collectible ALCs, with an Unload method, are new.
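As a quick illustration of the feature itself, the sketch below creates an empty collectible context and unloads it again. AlcDemo and CreateAndUnload are hypothetical names. Keep in mind that Unload only starts the process: the memory is reclaimed once nothing references the context’s assemblies any more and the garbage collector has run.

```csharp
using System;
using System.Runtime.Loader;

public static class AlcDemo
{
    // Hypothetical helper: creates a collectible context, then asks for it to be unloaded.
    public static bool CreateAndUnload()
    {
        // The second argument is what makes the context collectible (.NET Core 3.0 and later).
        var context = new AssemblyLoadContext("scratch", isCollectible: true);
        bool collectible = context.IsCollectible;

        // Only valid on a collectible context; a non-collectible one throws here.
        context.Unload();
        return collectible;
    }

    public static void Main()
    {
        Console.WriteLine(CreateAndUnload()); // True
    }
}
```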
If I could somehow generate the Roslyn assembly inside one of these babies then I could unload it. But EvaluateAsync doesn’t give me access to its generated assembly; I would need to use Roslyn at a lower level. The solution that follows is based on a great article by Filip W.
Last things first: instantiating an Expression with reflection
When using full Roslyn rather than CSharpScript, we’ll have to compile a full class library instead of a snippet. Let’s first contemplate how we would use a class library to get the Expression we want, assuming that we can somehow build such a library.
Fortunately we wouldn’t need much code. Imagine we had this class in our project:
using System; // for Func and DateTime
using System.Linq.Expressions;
using My.EF.Model; // for Customer
static class Wrapper {
    public static Expression<Func<Customer, bool>> expr = c => c.DateOfBirth < DateTime.Now.AddYears(-18);
}
We would be able to write db.Customers.Where(Wrapper.expr)! Still, if Wrapper will be built on the fly then it would be cleaner to put some distance between it and the rest of our code. Reflection can do that:
var wrapper = typeof(Wrapper);
var exprField = wrapper.GetField("expr");
var expr = (Expression<Func<Customer, bool>>)exprField.GetValue(null); // pass null for a static field.
… and we would be able to write db.Customers.Where(expr).
So the plan becomes clear. We want a new Evaluate method that does the following:
- Take in the user’s filter lambda as a string.
- Instantiate a collectible AssemblyLoadContext.
- Prepare the Wrapper class source code in a string.
- Somehow turn that string into an assembly (this is the key step).
- Load that assembly in the collectible ALC.
- Use reflection to extract the value of Wrapper.expr. This is the expression tree we need.
- Unload the ALC.
The signature of this method will be T Evaluate<T>(string lambdaOfTypeT). It’s not async because the Roslyn compiler we will use is not async.
Notice something interesting about T. Not only will we use it directly in the method (when we cast the result of GetValue), but we also need to “inject” it into our Wrapper source code. In other words, if T is Expression<Func<Customer, bool>> then we need to inject “Expression<Func<Customer, bool>>” into our dynamic source code. To no-one’s surprise, someone has already posted the function we need on Stack Overflow. Thanks to Adam Sills.
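To give a feel for what such a function has to do, here is a deliberately minimal sketch. It is not the Stack Overflow function mentioned above; TypeNames and ToCSharpName are made-up names. It only handles plain and generic types (no arrays of generics, no generics nested inside other generic types, no open generic parameters) and always emits fully qualified names, which the C# compiler accepts without any using directives.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class TypeNames
{
    // Hypothetical, simplified stand-in for a "type to C# source" function.
    public static string ToCSharpName(Type type)
    {
        if (!type.IsGenericType)
        {
            // Reflection writes nested types as Outer+Inner; C# source uses Outer.Inner.
            return type.FullName.Replace('+', '.');
        }

        // The generic definition is named like "System.Func`2": cut at the backtick.
        string definition = type.GetGenericTypeDefinition().FullName;
        string name = definition.Substring(0, definition.IndexOf('`'));

        // Recurse so that nested generic arguments are rendered too.
        string args = string.Join(", ", type.GetGenericArguments().Select(ToCSharpName));
        return $"{name}<{args}>";
    }

    public static void Main()
    {
        Console.WriteLine(ToCSharpName(typeof(Expression<Func<int, bool>>)));
        // System.Linq.Expressions.Expression<System.Func<System.Int32, System.Boolean>>
    }
}
```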
Roslyn compiling – second attempt
The first chore is to define a collectible AssemblyLoadContext; there’s not much to it:
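private class CollectibleAssemblyLoadContext : AssemblyLoadContext, IDisposable
{
    // Passing true is what makes this context collectible (unloadable).
    public CollectibleAssemblyLoadContext() : base(isCollectible: true)
    { }

    protected override Assembly Load(AssemblyName assemblyName)
    {
        // Returning null hands dependency resolution back to the default context.
        return null;
    }

    public void Dispose()
    {
        Unload();
    }
}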
And here, at long last, is the magic memory-leak-free Roslyn function (note that I’m using C# 8.0):
public T Evaluate<T>(string lambda)
{
    var returnTypeAsString = GetCSharpRepresentation(typeof(T), true);
    var outerClass = StandardHeader + $"public static class Wrapper {{ public static {returnTypeAsString} expr = {lambda}; }}";
    var compilation = CSharpCompilation.Create("FilterCompiler_" + Guid.NewGuid(),
        new[] { CSharpSyntaxTree.ParseText(outerClass) },
        References,
        new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
    using var assemblyLoadContext = new CollectibleAssemblyLoadContext();
    using var ms = new MemoryStream();
    var cr = compilation.Emit(ms);
    if (!cr.Success)
    {
        throw new InvalidOperationException("Error in expression: " + cr.Diagnostics.First(e =>
            e.Severity == DiagnosticSeverity.Error).GetMessage());
    }
    ms.Seek(0, SeekOrigin.Begin);
    var assembly = assemblyLoadContext.LoadFromStream(ms);
    var outerClassType = assembly.GetType("Wrapper");
    var exprField = outerClassType.GetField("expr", BindingFlags.Public | BindingFlags.Static);
    // ReSharper disable once PossibleNullReferenceException
    return (T)exprField.GetValue(null);
}
There are three things in that code that I still haven’t defined for you:
- StandardHeader is simply a string full of using directives.
- References is a PortableExecutableReference[], essentially a list of the assemblies that our dynamic assembly references.
- GetCSharpRepresentation is the function I described earlier that turns a type into a C# type string.
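If you want to build the first two yourself, a rough sketch follows. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package. CompilerInputs, BuildHeader and BuildReferences are hypothetical names, not the ones in my file, and the exact set of assemblies your filters need is an assumption that depends on your target framework and model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using Microsoft.CodeAnalysis; // NuGet: Microsoft.CodeAnalysis.CSharp

public static class CompilerInputs
{
    // One "using X;" line per distinct namespace.
    public static string BuildHeader(IEnumerable<string> namespaces) =>
        string.Concat(namespaces.Distinct().Select(ns => $"using {ns};\n"));

    // One metadata reference per distinct assembly the generated code may touch.
    public static PortableExecutableReference[] BuildReferences(IEnumerable<Type> referencedTypes)
    {
        var assemblies = referencedTypes
            .Select(t => t.Assembly)
            .Append(typeof(object).Assembly)          // core library
            .Append(typeof(Expression).Assembly)      // System.Linq.Expressions
            .Append(Assembly.Load("System.Runtime"))  // assumption: needed on .NET Core
            .Distinct();

        // CreateFromFile needs an on-disk location, so in-memory assemblies will not work here.
        return assemblies.Select(a => MetadataReference.CreateFromFile(a.Location)).ToArray();
    }

    public static void Main()
    {
        Console.Write(BuildHeader(new[] { "System", "System.Linq.Expressions" }));
        Console.WriteLine(BuildReferences(new[] { typeof(Uri) }).Length);
    }
}
```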
I bundled it all up into a class that you can use like so:
var referencedTypes = new[] { typeof(Customer) };
var filterCompiler = new SearchFilterCompiler(referencedTypes, referencedTypes.Select(t => t.Namespace));
var expr = filterCompiler.CSharpScriptEvaluate<Expression<Func<Customer, bool>>>("c => c.DateOfBirth < DateTime.Now.AddYears(-18)");
The complete file is here (yes, I put this link at the bottom to make you read the article first).