Friday, October 16, 2015

Constructing and compiling C# using Roslyn

The optimal path for my current project requires the ability to create and use new classes at runtime. The alternatives all look rather clumsy while this approach looked elegant.

.NET has had some capacity to compile arbitrary code since the beginning, but a lot of effort has recently gone into a new compiler, known as Roslyn. On the face of it there are two ways towards my goal: use the CodeDom, which is language agnostic, or try out the newer (Microsoft) open source Roslyn technology that is designed to compile C# and VB.NET.

CodeDom and Roslyn are only partly related, and after some research it looked like Roslyn was the way ahead as it provides a feature-complete compiler for my language of choice, C#. The other major plus is that Roslyn runs “in process” while CodeDom runs its compiles through another executable with (perhaps) a rather clunky API.

If you fancy going the well tried CodeDom way then MSDN has a good code example towards the bottom of this article.

To add the Roslyn tooling to a project, use the NuGet Package Manager option on the Tools menu and choose “Manage NuGet Packages for Solution…”. Select the package source and then search for “Microsoft.CodeAnalysis”. Click on Microsoft.CodeAnalysis.CSharp, click the “Install” button and, when offered, accept the license.

The actual development task turned out to be pretty steady.

I started with a class to define the properties of the class to be constructed:

internal class ClassProperty
{
    public string PropertyName { get; set; } = "";
    public string DataType { get; set; } = "";

    public ClassProperty() { }

    public ClassProperty(string propertyName, string dataType)
    {
        PropertyName = propertyName;
        DataType = dataType;
    }
}

And supplied a populated List<ClassProperty> to a new class that I was going to use to construct and compile the code. In fact I created a little method to loop through the fields of a database table and populate the list (as that was the direction I was going in and I am lazy) but a List can be quickly populated with just a few lines of code.
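
As a minimal sketch of populating the list by hand, the snippet below repeats the ClassProperty class so that it compiles on its own. The property names and types are made up for illustration and simply stand in for the fields of a table.

```csharp
using System;
using System.Collections.Generic;

internal class ClassProperty
{
    public string PropertyName { get; set; } = "";
    public string DataType { get; set; } = "";

    public ClassProperty() { }

    public ClassProperty(string propertyName, string dataType)
    {
        PropertyName = propertyName;
        DataType = dataType;
    }
}

internal static class Demo
{
    private static void Main()
    {
        // Hypothetical property names and types; replace with your own.
        var classProperties = new List<ClassProperty>
        {
            new ClassProperty("CustomerId", "int"),
            new ClassProperty("CustomerName", "string"),
            new ClassProperty("CreatedOn", "DateTime")
        };

        // List each entry to confirm the list was filled as intended.
        foreach (ClassProperty p in classProperties)
        {
            Console.WriteLine($"{p.DataType} {p.PropertyName}");
        }
    }
}
```

The DataType strings here were chosen to match keys in the default values dictionary described next, since the code builder looks each one up by name.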

The new class also had properties exposed for a namespace, class name, using directives, inheritance and interfaces. (The latter two are part implemented in the following code in that they are added to the class declaration but any related methods remain unimplemented at this stage.)

I backed up the property creation by creating a Dictionary of default values and you can see how it is applied after this.

private Dictionary<string, string> defaultValues = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["string"] = " = String.Empty;",
    ["Int64"] = " = 0;",
    ["Int32"] = " = 0;",
    ["Int16"] = " = 0;",
    ["DateTime"] = " = new DateTime(1970,1,1);",
    ["double"] = " = 0;",
    ["single"] = " = 0;",
    ["Decimal"] = " = 0;",
    ["bool"] = " = false;",
    ["Boolean"] = " = false;",
    ["int"] = " = 0;",
    ["long"] = " = 0;"
};

The method to create the code is pretty short.

private string buildClassCode()
{
    StringBuilder sb = new StringBuilder();
    foreach (string uses in usings)
    {
        sb.AppendLine("using " + uses + ";");
    }
    sb.AppendLine("namespace " + nameSpace);
    sb.AppendLine("{"); // start namespace
    string classInherits = (inherits.Length > 0 || interfaces.Count > 0) ? " : " + inherits : "";
    foreach (string inface in interfaces)
    {
        classInherits += (classInherits.Length > 3) ? ", " + inface : inface;
    }
    sb.AppendLine($"public class {tableName}{classInherits}");
    sb.AppendLine("{"); // start class
    sb.AppendLine($"public {tableName}()"); // default constructor
    sb.AppendLine("{}");
    foreach (ClassProperty newProperty in classProperties)
    {
        sb.AppendLine($"public {newProperty.DataType} {newProperty.PropertyName} {{ get; set;}}{defaultValues[newProperty.DataType]}");
    }
    sb.AppendLine("}"); // end class
    sb.AppendLine("}"); // end namespace
    return sb.ToString();
}

I tested this by popping the output into a multi-line textbox. While the code lacks indentation, it is perfectly clear, as the image below shows.

Now to get that code compiled and run an initial test using VS Debug mode to step through the code and watch the results.

private void buildinstance()
{
    string instance = buildClassCode();
    SyntaxTree syntaxTree = CSharpSyntaxTree.ParseText(instance);
    // inspect the tree
    //SyntaxNode root = syntaxTree.GetRoot();
    //foreach (var node in root.DescendantNodes())
    //{
    //    var x = node.GetText();
    //}
    string assemblyName = tableName; //Path.GetRandomFileName();
    MetadataReference[] references = new MetadataReference[]
    {
        MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
        MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location)
    };
    CSharpCompilation compilation = CSharpCompilation.Create(
        assemblyName,
        syntaxTrees: new[] { syntaxTree },
        references: references,
        options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary)); // you can also build as a console app, windows.exe etc
    Assembly assembly = null;
    var ms = new MemoryStream();
    EmitResult eResult = compilation.Emit(ms);
    if (eResult.Success)
    {
        ms.Seek(0, SeekOrigin.Begin);
        assembly = Assembly.Load(ms.ToArray());
        Type type = assembly.GetType(nameSpace + "." + tableName);
        object newObject = Activator.CreateInstance(type);
        // now we can prove that worked by stepping through the code while iterating over the properties
        foreach (PropertyInfo propertyInfo in newObject.GetType().GetProperties())
        {
            string pName = propertyInfo.Name;
            string dataType = propertyInfo.PropertyType.Name;
            // and test we can assign a value
            if (dataType.Equals("String", StringComparison.OrdinalIgnoreCase))
            {
                propertyInfo.SetValue(newObject, "foo", null);
            }
        }
    }
    else
    {
        IEnumerable<Diagnostic> failures = eResult.Diagnostics.Where(diagnostic =>
            diagnostic.IsWarningAsError || diagnostic.Severity == DiagnosticSeverity.Error);
        string msg = "";
        foreach (Diagnostic diagnostic in failures)
        {
            msg += $"{diagnostic.Id}: {diagnostic.GetMessage()}" + "\r\n";
        }
        // do something useful with the message
    }
    ms.Close();
    ms.Dispose();
}

Ran first time, although I went back and added the diagnostics you can see just to step through things as they happened.

The SyntaxTree built a structure for the code elements from the string. The CSharpCompilation object took that syntax tree and compiled the code as C#. The compiled assembly was saved into a memory stream and the compilation results checked for errors. Assuming no errors, the code was loaded into an Assembly object and a new instance of the compiled object (class in this instance) created and inspected.

Next I checked that I could execute a method on the compiled class. I added the following line to the class code creation method:

sb.AppendLine("public string getBar(string foo) {return foo + \" bar\";}");

and a code line to call the method after the object is created in the buildinstance() method tested above

string res = (string)type.InvokeMember("getBar", BindingFlags.Default | BindingFlags.InvokeMethod,
    null, newObject, new object[] { "foo" });

which returned the string “foo bar” to the variable res.
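
An alternative to InvokeMember, sketched here on the assumption that the same method may be called more than once, is to look up a MethodInfo and call Invoke on it. The Sample class is a hypothetical stand-in for the runtime-compiled type so that the snippet compiles without Roslyn; with the real thing, the Type would come from assembly.GetType() as above.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the class compiled at runtime.
public class Sample
{
    public string getBar(string foo) { return foo + " bar"; }
}

internal static class Demo
{
    private static void Main()
    {
        Type type = typeof(Sample);
        object newObject = Activator.CreateInstance(type);

        // Find the overload that takes a single string argument.
        MethodInfo method = type.GetMethod("getBar", new[] { typeof(string) });
        if (method == null)
        {
            throw new MissingMethodException(type.FullName, "getBar");
        }

        // The MethodInfo can be kept and reused for later calls.
        string res = (string)method.Invoke(newObject, new object[] { "foo" });
        Console.WriteLine(res);
    }
}
```

Holding on to the MethodInfo avoids repeating the name lookup on every call, which may matter if the method is invoked in a loop.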

Job done.


Thought I would check that the default class constructor was executed when the compiled object instance was created. Thought it must be but…

Adjusted the class constructor to set one of the properties and then read that back when looping over the properties. Worked as expected.

The Activator.CreateInstance() method can be called with a parameter list to actuate alternate constructors with their parameters thus:

public static object CreateInstance(Type type, params object[] args)

which probably covers that issue as an Interface can’t contain a constructor but can contain any other methods I want to expose. Clearly, there is a wee bit more code to write to get a practical implementation but it is also clear that the overall approach is very practical.
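
A small sketch of that overload in use follows. Widget is a hypothetical class defined in the snippet itself rather than one compiled by Roslyn, but the call works the same way against a Type obtained from a loaded assembly.

```csharp
using System;

// Hypothetical class with a default and a parameterised constructor.
public class Widget
{
    public string Name { get; }
    public int Size { get; }

    public Widget() : this("unnamed", 0) { }

    public Widget(string name, int size)
    {
        Name = name;
        Size = size;
    }
}

internal static class Demo
{
    private static void Main()
    {
        Type type = typeof(Widget);

        // No arguments: the default constructor runs.
        var viaDefault = (Widget)Activator.CreateInstance(type);

        // Arguments are matched against the available constructors,
        // here Widget(string, int).
        var viaArgs = (Widget)Activator.CreateInstance(type, "foo", 3);

        Console.WriteLine($"{viaDefault.Name} {viaDefault.Size}");
        Console.WriteLine($"{viaArgs.Name} {viaArgs.Size}");
    }
}
```

If no constructor matches the supplied arguments, CreateInstance throws a MissingMethodException, so generated classes and their callers need to agree on the constructor signature.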

And Then The Real World

In the real world your compiled class will probably reference one or more System/.NET dlls or a custom library, or your new class may inherit from another class already defined in the main program assembly.

You can add such references but you need to supply the path to the relevant dll or exe file. Apparently this changed from early Roslyn releases so (like me) you might find many obsolete methods if you Google around the subject. You also need to include the relevant "using" statements in the code itself of course (see above).

In my first stab at this, the CSharpCompilation.Create() call was supplied with a default array of MetadataReference objects. This array can be extended to include whatever is required.

The default location for the main System dlls can be found with this line of code:

var assemblyPath = Path.GetDirectoryName(typeof(object).Assembly.Location);

You can then construct a path to (say) System.dll and System.Data.dll like so:

string sysTm = Path.Combine(assemblyPath, "System.dll");
string sysData = Path.Combine(assemblyPath, "System.Data.dll");

You can similarly construct the path to the classes in your own program as well (assuming they are public):

string mePath = Path.Combine(Application.StartupPath, "MyProg.exe");

The paths can then be used to add the references like so:

MetadataReference[] references = new MetadataReference[]
{
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location),
    MetadataReference.CreateFromFile(sysTm),
    MetadataReference.CreateFromFile(sysData),
    MetadataReference.CreateFromFile(mePath)
};

And that is how to add those vital references to your Roslyn compiled assembly.
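
If the list of references grows, one option is to build the array from a list of paths and fail early when a file is missing. The sketch below assumes the Microsoft.CodeAnalysis.CSharp NuGet package is installed; ReferenceHelper and FromPaths are names invented for this example and are not part of Roslyn.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;

// Hypothetical helper, not part of Roslyn.
internal static class ReferenceHelper
{
    public static MetadataReference[] FromPaths(IEnumerable<string> paths)
    {
        var references = new List<MetadataReference>();

        // Skip duplicate paths, ignoring case as Windows file names do.
        foreach (string path in paths.Distinct(StringComparer.OrdinalIgnoreCase))
        {
            // Fail here with a clear message rather than inside the compiler.
            if (!File.Exists(path))
            {
                throw new FileNotFoundException("Reference assembly not found.", path);
            }
            references.Add(MetadataReference.CreateFromFile(path));
        }
        return references.ToArray();
    }
}
```

With the variables defined above, the call would look something like ReferenceHelper.FromPaths(new[] { typeof(object).Assembly.Location, sysTm, sysData, mePath }), and the result can be passed as the references argument of CSharpCompilation.Create().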
