Getting the field backing a property using Reflection

Posted by Jb Evain Fri, 01 May 2009 08:46:00 GMT

Pluie, Pluie

Let’s say you’re writing a LINQ provider, and that you need to optimize the following LINQ query:

from Person p in db where p.Age > 18 select p;

Let’s add a constraint: the underlying storage engine stores data according to field names. That means that when generating the query for the underlying storage system, you’ll have to map p.Age into something that the storage system will understand: in this case, a field. And all you have is a MemberExpression, giving you a PropertyInfo.
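To make the starting point concrete, here is a minimal sketch of how a provider might pull the PropertyInfo out of a lambda such as p => p.Age. The Person class and the GetProperty helper are hypothetical names used for illustration only; they are not part of any library mentioned in this post.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity type, mirroring the Person of the query above.
public class Person
{
    public int Age { get; set; }
}

public static class MemberExpressionSample
{
    // Returns the property accessed by a lambda such as p => p.Age,
    // or null if the lambda body is not a plain property access.
    public static PropertyInfo GetProperty<T, TResult> (Expression<Func<T, TResult>> lambda)
    {
        var member = lambda.Body as MemberExpression;
        if (member == null)
            return null;

        // Member is a MemberInfo: it can be a field or a property.
        return member.Member as PropertyInfo;
    }
}
```

Calling MemberExpressionSample.GetProperty<Person, int> (p => p.Age) yields the PropertyInfo for Age, and nothing more: Reflection offers no direct link from there to a field.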

The issue here is that you have no way to get the FieldInfo backing the property. If you think about it, it’s normal. The setter and the getter of a property being traditional methods, they can contain any kind of code. Meaning that you can’t always find a field backing the property.

But in that case, it’s ok, we’re only interested in those forms of properties:

public int Age { get; set; }

private string name;

public string Name {
    get { return name; }
    set { name = value; }
}

Of course, what’s interesting here is how to actually get the field. I’ve used the Reflection based CIL reader I wrote about yesterday. I disassemble the body of either the getter or the setter of the property, and if it matches a simple IL pattern, that is, if it looks like a property backed by a field, I simply return the field.
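To illustrate the idea without the CIL reader, here is a rough sketch that works directly on the raw bytes of a getter. It is not the implementation discussed in this post: it assumes the getter compiles to exactly ldarg.0, ldfld, ret (which is what compilers typically emit for auto-properties and for trivial getters in optimized builds), and it ignores setters and static properties. The BackingFieldSketch and SketchPerson names are hypothetical.

```csharp
using System;
using System.Reflection;

// Hypothetical type used to exercise the sketch.
public class SketchPerson
{
    public int Age { get; set; }
    public int Twice { get { return Age * 2; } }
}

public static class BackingFieldSketch
{
    const byte Ldarg_0 = 0x02;
    const byte Ldfld = 0x7B;
    const byte Ret = 0x2A;

    // Returns the field read by a trivial instance getter, or null
    // if the getter does not match the ldarg.0, ldfld, ret pattern.
    public static FieldInfo GetBackingFieldFromGetter (PropertyInfo property)
    {
        MethodInfo getter = property.GetGetMethod (true);
        if (getter == null)
            return null;

        MethodBody body = getter.GetMethodBody ();
        if (body == null)
            return null;

        byte [] il = body.GetILAsByteArray ();
        // 1 byte ldarg.0 + 1 byte ldfld + 4 bytes token + 1 byte ret
        if (il.Length != 7 || il [0] != Ldarg_0 || il [1] != Ldfld || il [6] != Ret)
            return null;

        // The metadata token is stored in little-endian order.
        int token = il [2] | (il [3] << 8) | (il [4] << 16) | (il [5] << 24);

        Type declaring = getter.DeclaringType;
        Type [] type_arguments = declaring.IsGenericType
            ? declaring.GetGenericArguments ()
            : null;

        return getter.Module.ResolveField (token, type_arguments, null);
    }
}
```

A getter compiled without optimizations often contains extra nop, stloc and ldloc instructions, so this byte comparison would miss it. That is one reason to match on decoded instructions rather than on raw bytes.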

To do the actual IL matching, I re-implemented something Rodrigo and I wrote when we were working on instrumenting assemblies at db4o. The code itself is pretty neat.

Anyway, that’s another opportunity to write a simple extension method:

public static FieldInfo GetBackingField (this PropertyInfo self)

Again, you’re more than welcome to have a look at the implementation. Don’t forget that it depends on the Reflection based CIL reader.

Reflection based CIL reader

Posted by Jb Evain Thu, 30 Apr 2009 20:25:00 GMT

Le diamant

As I wrote earlier this month, back when I worked on a static aspect weaver, the first library we used to programmatically retrieve the CIL bytecode was ILReader, a library published by Lutz Roeder (the original author of the famous Reflector tool).

It suffered from a number of limitations, and you were tied to the whole System.Reflection infrastructure, which, in the .net 1.0 days, was somewhat limited and lacked a few features required to get access to every single detail in an assembly, including the CIL bytecode. It has evolved since: for instance, starting with .net 2.0, there’s a GetILAsByteArray method on MethodBody that returns the raw CIL code.
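As a quick illustration of that raw API, here is a minimal sketch that fetches the CIL bytes of a method using only System.Reflection. The RawILSample class and its Answer method are hypothetical names made up for the example.

```csharp
using System;
using System.Reflection;

public static class RawILSample
{
    // Hypothetical method whose body we want to look at.
    public static int Answer ()
    {
        return 42;
    }

    // Returns the raw CIL bytes of a method, or an empty array
    // for methods without a body (abstract, extern, and so on).
    public static byte [] GetRawIL (MethodBase method)
    {
        MethodBody body = method.GetMethodBody ();
        return body == null ? new byte [0] : body.GetILAsByteArray ();
    }
}
```

For RawILSample.Answer the array ends with 0x2A, the encoding of ret. Everything else (opcodes, operands, tokens) is left for you to decode, which is exactly the gap a higher level reader fills.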

Anyway, most of those concerns were addressed by Cecil, but still, for some use-cases, it could be nice to be able to have access to the CIL bytecode at a higher level of abstraction than a plain raw byte array.

On .net, you can use a library also named ILReader, but it has a few checks that are specific to .net, there’s no information about a license of the code, and also, I’m not especially fond of the way instructions are represented.

So last time, for a hack I’ll soon write about, I extracted Mono.Cecil’s Instruction type, and wrote a cute extension method, or rock, as I like to call them. Its signature:

public static IList<Instruction> GetInstructions (this MethodBase self)

I would have loved to declare the extension method on the System.Reflection.MethodBody type, to make things more consistent with the methods it already has, but there’s no cross platform way to get a System.Reflection.MethodBase from a System.Reflection.MethodBody.

Anyway, it’s terribly easy to use if you’ve already used Cecil. The only difference is that for branches, the operand is the offset as an integer, not the target instruction. As a sample usage, here’s a (very) incomplete CIL reflection based disassembler:

static void PrintByteCode (MethodInfo method)
{
    foreach (Instruction instruction in method.GetInstructions ())
        PrintInstruction (instruction);
}

static void PrintInstruction (Instruction instruction)
{
    Console.Write ("{0}: {1} ",
        Labelize (instruction.Offset),
        instruction.OpCode.Name);

    switch (instruction.OpCode.OperandType) {
    case OperandType.InlineNone :
        break;
    case OperandType.InlineSwitch :
        var branches = instruction.Operand as int [];
        for (int i = 0; i < branches.Length; i++) {
            if (i > 0)
                Console.Write (", ");
            Console.Write (Labelize (branches [i]));
        }
        break;
    case OperandType.ShortInlineBrTarget :
    case OperandType.InlineBrTarget :
        Console.Write (Labelize ((int) instruction.Operand));
        break;
    case OperandType.InlineString :
        Console.Write ("\"{0}\"", instruction.Operand);
        break;
    default :
        Console.Write (instruction.Operand);
        break;
    }

    Console.WriteLine ();
}
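The sample above calls a Labelize helper that isn’t shown. A plausible definition, assuming the usual IL_xxxx label style used by disassemblers, could look like the following sketch; the exact format and the DisassemblerHelpers class name are assumptions, not part of the original code.

```csharp
using System;

public static class DisassemblerHelpers
{
    // Formats an instruction offset as a label, e.g. 10 becomes "IL_000a".
    // The "IL_" prefix and the four hex digits are an assumed convention.
    public static string Labelize (int offset)
    {
        return "IL_" + offset.ToString ("x4");
    }
}
```

To use it from the sample as written, the method would have to live in the same class as PrintInstruction, or be called as DisassemblerHelpers.Labelize.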

And of course, you’re welcome to have a look at the implementation, under the MIT/X11 license.

pdb2mdb and Mono.Cecil.Pdb

Posted by Jb Evain Mon, 27 Apr 2009 18:07:00 GMT

Sainte Anne

I always complained about the fact that debug symbols were not portable between different CLR implementations. The .net CLR consumes pdb files, which is an undocumented format. Another file format was added to the ECMA-335 in a late revision. I wrote about this file format a while ago.

To sum up, it was added very late while Mono already started to use its own format (mdb) and the .net CLR doesn’t understand it anyway. So even if it’s not a bad format (it could use some improvements, like a GUID heap similar to the one in a .net assembly), basically no one uses it in the real world.

As mentioned in a recent post, the CCI contains an interesting piece of code, a managed pdb reader, licensed under the Ms-PL. I extracted it, and used it to be able to better share debug symbols between the .net CLR and Mono.

pdb2mdb

Robert Jordan, a long time Mono contributor, first wrote a tool named pdb2mdb, to convert a pdb to an mdb. The issue is that it was based on a combination of COM and the mixed mode assembly ISymWrapper which comes with the .net framework. All in all, it means that this version of pdb2mdb could only run on the .net framework on Windows.

With the managed pdb reader, it was very easy to write a fully managed pdb2mdb tool. It’s now available in svn, and it will come with every other developer tool, such as ilasm or the linker. It’s very easy to use. Say you’re deploying a .net application on Linux, you have an assembly Foo.dll, and a Foo.pdb file, just use:

pdb2mdb Foo.dll

And the tool will generate a file Foo.dll.mdb, that Mono can use to display line information in stack traces.

Mono.Cecil.Pdb

Mono.Cecil.Pdb is an assembly that you use together with Cecil, to have line information at the IL level. It’s used by tools such as Gendarme, or MoMA, to help diagnose and locate issues.

I’ve integrated the managed reader, and the folks from NDepend were kind enough to beta test it. After a few fixes, the managed reader passed all NDepend tests, and was performing a lot better than its unmanaged counterpart. It’s now the default, and only the pdb writer uses the ISymWrapper approach.

It would be an interesting challenge for someone to try to write a managed writer from the information gathered in the reader. It may not be easy though.

Converting Delegates to Expression Trees

Posted by Jb Evain Wed, 22 Apr 2009 19:03:00 GMT

Plage du diamant

Back when I was working at db4o, we had fun implementing a mechanism somewhat similar to LINQ, to have strongly typed queries expressed using code itself. The implementation uses Mono.Cecil and Cecil.FlowAnalysis to decompile a delegate into an AST that db4o’s query optimizer can process.

Since .net 3.5, an API, System.Linq.Expressions, can be used to get a representation of a C# lambda expression as an object graph: an expression tree. .net 4.0 will add support for statements to this API, but as far as I know, the language itself hasn’t been updated to produce those new nodes.

Anyway, a few days ago, someone on Stack Overflow asked how to turn a delegate into a LINQ expression tree. There’s no built-in feature to do that, and it’s not a straightforward process: you basically have to decompile the compiled method. I guess it’s a good thing that I’m working on a decompiler, if I need to decompile something.

Tonight I wrote a short spike to verify the feasibility of my idea, and it turns out to be pretty simple. Sample:

static void Main ()
{
    Func<int, int> magic = i => i * 42;

    Expression<Func<int, int>> expression =
        DelegateConverter.ToExpression (magic);

    Console.WriteLine (expression.ToString ());
    // prints: i => i * 42
    Console.WriteLine (expression.Compile ().Invoke (1));
    // prints: 42
}

DelegateConverter is implemented as a simple visitor which walks over a Cecil.Decompiler AST and generates, if possible, the corresponding LINQ expression tree. Pretty cool, isn’t it?
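For reference, this is the kind of tree such a visitor has to produce for i => i * 42, built by hand with the System.Linq.Expressions factory methods. It’s only a sketch of the target shape, not code taken from the spike; the HandBuiltExpression class name is made up.

```csharp
using System;
using System.Linq.Expressions;

public static class HandBuiltExpression
{
    // Builds the expression tree for i => i * 42 node by node.
    public static Expression<Func<int, int>> Build ()
    {
        ParameterExpression i = Expression.Parameter (typeof (int), "i");
        BinaryExpression body = Expression.Multiply (i, Expression.Constant (42));

        return Expression.Lambda<Func<int, int>> (body, i);
    }
}
```

HandBuiltExpression.Build ().Compile ().Invoke (1) returns 42, just like the converted delegate in the sample above.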

You can browse the code of the spike. Keep in mind that it’s nowhere near complete, and that it’s just a proof of concept. Still, I think it’s a pretty cool usage of the Cecil.Decompiler library.
