Compiling expression trees with Roslyn… without memory leaks

Introduction

My story starts, as many good stories do, with the awesomeness of Entity Framework.  Anyone who has used it knows that you can replace those ugly SQL WHERE clauses with beautiful lambda predicates:

var adultCustomers = db.Customers
                          .Where(c => c.DateOfBirth < DateTime.Now.AddYears(-18));

Once you get the hang of that, it usually doesn’t take long to start wondering “Can I somehow pass in those predicates at runtime?”  That would enable a generic search method, something like:

IEnumerable<Customer> SearchCustomers(string filterLambda)
{
   return db.Customers.Where( <somehow turn filterLambda into a lambda> );
}

First of all, what exactly would we be converting filterLambda to?  Pop Quiz: What’s the type of Where’s argument?  If, like me, you guessed Func<Customer, bool> then you’re close but not quite there.  It’s actually Expression<Func<Customer, bool>>, where Expression comes from the System.Linq.Expressions namespace.

It makes sense when you think about it.  Entity Framework isn’t going to run your lambda.  It wants to convert it to a SQL WHERE clause.  That’s no easy task (did I mention EF is awesome?) but it’s at least contemplatable when the lambda logic is spelled out in an expression tree.
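To make the difference concrete, here is a small sketch that assigns the same lambda to both types and then inspects the tree.  The Customer class is a hypothetical stand-in for the EF entity, and ExpressionDemo and BuildFilter are names invented for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the article's EF entity.
public class Customer
{
    public DateTime DateOfBirth { get; set; }
}

public static class ExpressionDemo
{
    // Assigning a lambda to Expression<...> makes the compiler build a tree
    // that describes the code, rather than only emitting executable IL.
    public static Expression<Func<Customer, bool>> BuildFilter() =>
        c => c.DateOfBirth < DateTime.Now.AddYears(-18);

    public static void Main()
    {
        // The same lambda as a plain delegate: opaque, it can only be invoked.
        Func<Customer, bool> asDelegate = c => c.DateOfBirth < DateTime.Now.AddYears(-18);

        var asTree = BuildFilter();
        var comparison = (BinaryExpression)asTree.Body;
        Console.WriteLine(comparison.NodeType); // LessThan
        Console.WriteLine(comparison.Left);     // the member access on c
        Console.WriteLine(comparison.Right);    // the AddYears method call

        // A tree can still be compiled into a delegate when it has to run in memory.
        var compiled = asTree.Compile();
        var child = new Customer { DateOfBirth = DateTime.Now.AddYears(-5) };
        Console.WriteLine(asDelegate(child) == compiled(child)); // True
    }
}
```

A query provider such as EF walks nodes like comparison.Left and comparison.Right to produce SQL, which is something it could never do with the opaque delegate.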

Compiling with Roslyn – first attempt

If you’re like me, at this point you think “Wait a minute, can’t Roslyn do that?”  After all, filterLambda is a snippet of C# code.  And now your online searches, with keywords like “Roslyn lambda expression tree”, will lead you straight to Roslyn’s CSharpScript.EvaluateAsync method.  You need a bit of padding for the equivalent of C#’s using directives and assembly references, but basically the magic is just one line of code:

var filterExpression = await CSharpScript.EvaluateAsync<Expression<Func<Customer, bool>>>(filterLambda, options);
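The “padding” could look roughly like the sketch below.  It assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package; the Customer class, the ScriptingDemo class and the CompileFilterAsync method are hypothetical names for illustration, and the imports you need depend on the namespaces your own lambdas use.

```csharp
using System;
using System.Linq.Expressions;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical stand-in for the article's EF entity.
public class Customer
{
    public DateTime DateOfBirth { get; set; }
}

public static class ScriptingDemo
{
    public static Task<Expression<Func<Customer, bool>>> CompileFilterAsync(string filterLambda)
    {
        // AddReferences plays the role of assembly references,
        // AddImports the role of using directives.
        var options = ScriptOptions.Default
            .AddReferences(typeof(Customer).Assembly, typeof(Expression).Assembly)
            .AddImports("System", "System.Linq.Expressions");

        // Asking for Expression<Func<...>> (not Func<...>) yields a tree
        // that a query provider can translate.
        return CSharpScript.EvaluateAsync<Expression<Func<Customer, bool>>>(filterLambda, options);
    }

    public static async Task Main()
    {
        var filter = await CompileFilterAsync("c => c.DateOfBirth < DateTime.Now.AddYears(-18)");
        Console.WriteLine(filter.Body.NodeType);
    }
}
```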

Wow – problem solved, right? Well, yes – with a but.  And potentially a big one.

The memory leak problem

The problem is that this Roslyn magic comes with an unfortunate side effect.  During compilation, EvaluateAsync generates a new assembly in memory, and it stays there for the life of the process.  At the time of writing there’s no way to unload it.  Each assembly is not huge, but you get another one every time you call EvaluateAsync.  I built a RESTful service with a search function based on EvaluateAsync, and one day I came in to find it had crashed with an OutOfMemoryException.

Unloading assemblies

I went through the five stages – Denial, Anger, Depression, etc, and eventually reached Acceptance: EvaluateAsync is incredibly convenient but sadly it isn’t suitable for what I’m doing.

In researching the problem I saw hints of a potential solution in a new feature of .NET Core 3 called “collectible AssemblyLoadContexts”.  AssemblyLoadContext has been around for a long time, but collectible ALCs, with an Unload method, are new.
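For a feel of how unloading behaves, here is a minimal sketch, assuming .NET Core 3.0 or later.  It loads a second copy of the running program’s own assembly into a collectible context, asks for an unload, and then watches a WeakReference to see whether the context really went away.  UnloadDemo, LoadAndUnload and WaitForUnload are names invented for this illustration.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.Loader;

public static class UnloadDemo
{
    // NoInlining keeps the context and assembly references confined to this
    // method's frame, so nothing on the caller's stack keeps them alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference LoadAndUnload(string assemblyPath)
    {
        var context = new AssemblyLoadContext("demo", isCollectible: true);
        Assembly copy = context.LoadFromAssemblyPath(assemblyPath);
        Console.WriteLine($"Loaded {copy.GetName().Name} into a collectible context");

        // Unload only requests unloading; it completes once the GC finds
        // no remaining references into the context.
        context.Unload();
        return new WeakReference(context, trackResurrection: true);
    }

    public static bool WaitForUnload(WeakReference contextRef)
    {
        // Unloading is cooperative, so give the GC a few chances.
        for (var i = 0; contextRef.IsAlive && i < 10; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        return !contextRef.IsAlive;
    }

    public static void Main()
    {
        // Any managed assembly file will do; this program's own is convenient.
        var path = typeof(UnloadDemo).Assembly.Location;
        var contextRef = LoadAndUnload(path);
        Console.WriteLine($"Unloaded: {WaitForUnload(contextRef)}");
    }
}
```

The catch is that Unload is only a request.  If anything still holds a reference to a type, instance or delegate from the loaded assembly, the context stays alive, so it matters what escapes from it.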

If I could somehow generate the Roslyn assembly inside one of these babies then I could unload it.  But EvaluateAsync doesn’t give me access to its generated assembly; I would need to use Roslyn at a lower level.  The solution that follows is based on a great article by Filip W.

Last things first: instantiating an Expression with reflection

When using the full Roslyn compiler API rather than CSharpScript, we’ll have to compile a complete class library instead of a snippet.  Let’s first contemplate how we would use a class library to get the Expression we want, assuming that we can somehow build such a library.

Fortunately we wouldn’t need much code.  Imagine we had this class in our project:

using System; // for Func and DateTime
using System.Linq.Expressions;
using My.EF.Model; // for Customer

static class Wrapper {
  public static Expression<Func<Customer, bool>> expr = c => c.DateOfBirth < DateTime.Now.AddYears(-18);
}

We would be able to write db.Customers.Where(Wrapper.expr)! Still, if Wrapper will be built on the fly then it would be cleaner to put some distance between it and the rest of our code.  Reflection can do that:

    var wrapper = typeof(Wrapper);
    var exprField = wrapper.GetField("expr");
    var expr = (Expression<Func<Customer, bool>>)exprField.GetValue(null); // pass null for a static field.

… and we would be able to write db.Customers.Where(expr).

So the plan becomes clear.  We want a new Evaluate method that does the following:

  1. Take in the user’s filter lambda as a string.
  2. Instantiate a collectible AssemblyLoadContext.
  3. Prepare the Wrapper class source code in a string.
  4. Somehow turn that string into an assembly (this is the key step).
  5. Load that assembly in the collectible ALC.
  6. Use reflection to extract the value of Wrapper.expr.  This is the expression tree we need.
  7. Unload the ALC.

The signature of this method will be T Evaluate<T>(string lambdaOfTypeT).  It’s not async because the Roslyn compiler we will use is not async.

Notice something interesting about T.  Not only will we use it directly in the method (when we cast the result of GetValue), but we also need to “inject” it into our Wrapper source code.  In other words, if T is Expression<Func<Customer, bool>> then we need to inject “Expression<Func<Customer, bool>>” into our dynamic source code.  To no-one’s surprise, someone has already posted the function we need on Stack Overflow.  Thanks to Adam Sills.
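To give an idea of what such a function has to do, here is a deliberately simplified sketch.  It is not the Stack Overflow function mentioned above: TypeNames and ToCSharpString are hypothetical names, and this version ignores cases a complete implementation has to handle, such as generic types nested inside other generic types, pointers, and jagged arrays of mixed rank.

```csharp
using System;
using System.Linq;

public static class TypeNames
{
    // Hypothetical, simplified helper: renders a Type as C# source text.
    public static string ToCSharpString(Type type)
    {
        if (type.IsArray)
        {
            // int[] -> "global::System.Int32[]", int[,] -> "global::System.Int32[,]"
            var commas = new string(',', type.GetArrayRank() - 1);
            return ToCSharpString(type.GetElementType()) + "[" + commas + "]";
        }

        if (!type.IsGenericType)
        {
            // Reflection separates nested types with '+'; C# source uses '.'.
            return "global::" + (type.FullName ?? type.Name).Replace('+', '.');
        }

        // "System.Func`2" -> "System.Func", then render each type argument recursively.
        var definitionName = type.GetGenericTypeDefinition().FullName ?? type.Name;
        var baseName = definitionName.Substring(0, definitionName.IndexOf('`'));
        var arguments = string.Join(", ", type.GetGenericArguments().Select(ToCSharpString));
        return $"global::{baseName}<{arguments}>";
    }

    public static void Main()
    {
        Console.WriteLine(ToCSharpString(typeof(Func<int, bool>)));
        // global::System.Func<global::System.Int32, global::System.Boolean>
    }
}
```

Fully qualifying every name with global:: is a design choice in this sketch: the generated source then compiles regardless of which using directives the wrapper happens to have.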

Roslyn compiling – second attempt

The first chore is to define a collectible AssemblyLoadContext; there’s not much to it:

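        private class CollectibleAssemblyLoadContext : AssemblyLoadContext, IDisposable
        {
            // isCollectible: true is what makes Unload() available.
            public CollectibleAssemblyLoadContext() : base(isCollectible: true)
            { }

            // Returning null defers dependency resolution to the default context,
            // so only the generated assembly lives in this one.
            protected override Assembly Load(AssemblyName assemblyName)
            {
                return null;
            }

            // Lets callers trigger the unload with a using declaration.
            public void Dispose()
            {
                Unload();
            }
        }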

And here, at long last, is the magic memory-leak-free Roslyn function (note that I’m using C# 8.0):

        public T Evaluate<T>(string lambda)
        {
            var returnTypeAsString = GetCSharpRepresentation(typeof(T), true);
            var outerClass = StandardHeader + $"public static class Wrapper {{ public static {returnTypeAsString} expr = {lambda}; }}";

            var compilation = CSharpCompilation.Create("FilterCompiler_" + Guid.NewGuid(),
                                                        new[] { CSharpSyntaxTree.ParseText(outerClass) },
                                                        References,
                                                        new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

            using var assemblyLoadContext = new CollectibleAssemblyLoadContext();
            using var ms = new MemoryStream();

            var cr = compilation.Emit(ms);
            if (!cr.Success)
            {
                throw new InvalidOperationException("Error in expression: " + cr.Diagnostics.First(e =>
                    e.Severity == DiagnosticSeverity.Error).GetMessage());
            }

            ms.Seek(0, SeekOrigin.Begin);
            var assembly = assemblyLoadContext.LoadFromStream(ms);

            var outerClassType = assembly.GetType("Wrapper");

            var exprField = outerClassType.GetField("expr", BindingFlags.Public | BindingFlags.Static);
            // ReSharper disable once PossibleNullReferenceException
            return (T)exprField.GetValue(null);
        }

There are three things in that code that I still haven’t defined for you:

  1. StandardHeader is simply a string full of using directives.
  2. References is a PortableExecutableReference[] – essentially a list of the assemblies that our dynamic assembly references.
  3. GetCSharpRepresentation is the function I described earlier that turns a type into a C# type string.
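The first two could be built along the following lines.  This is a sketch under assumptions rather than the code from the complete file: CompilerInputs, BuildStandardHeader and BuildReferences are hypothetical names, and the reference list leans on the TRUSTED_PLATFORM_ASSEMBLIES value that the .NET Core host normally supplies (it may be missing in some hosting models, in which case you would list the framework assemblies yourself).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;

public static class CompilerInputs
{
    // Hypothetical helper: one using directive per namespace, plus a few defaults.
    public static string BuildStandardHeader(IEnumerable<string> namespaces)
    {
        var all = new[] { "System", "System.Linq", "System.Linq.Expressions" }
            .Concat(namespaces)
            .Where(ns => !string.IsNullOrEmpty(ns))
            .Distinct();
        return string.Concat(all.Select(ns => $"using {ns};{Environment.NewLine}"));
    }

    // Hypothetical helper: framework assemblies plus the assemblies
    // that define the caller's own types.
    public static PortableExecutableReference[] BuildReferences(IEnumerable<Type> referencedTypes)
    {
        // Assumption: the host exposes the framework assembly paths here.
        var platformPaths = ((AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string) ?? "")
            .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries);
        var ownPaths = referencedTypes.Select(t => t.Assembly.Location);

        return platformPaths
            .Concat(ownPaths)
            .Where(path => !string.IsNullOrEmpty(path))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Select(path => MetadataReference.CreateFromFile(path))
            .ToArray();
    }

    public static void Main()
    {
        Console.Write(BuildStandardHeader(new[] { "My.EF.Model" }));
        Console.WriteLine(BuildReferences(new[] { typeof(CompilerInputs) }).Length);
    }
}
```

Both values depend only on the referenced types, not on the lambda, so it makes sense to compute them once and reuse them for every compilation.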

I bundled it all up into a class that you can use like so:

    var referencedTypes = new[] { typeof(Customer) };
    var filterCompiler = new SearchFilterCompiler(referencedTypes, referencedTypes.Select(t => t.Namespace));
    var expr = filterCompiler.Evaluate<Expression<Func<Customer, bool>>>("c => c.DateOfBirth < DateTime.Now.AddYears(-18)");

The complete file is here (yes, I put this link at the bottom to make you read the article first).
