pdb2mdb and Mono.Cecil.Pdb
I have always complained about the fact that debug symbols are not portable between different CLR implementations. The .net CLR consumes pdb files, an undocumented format. Another file format was added to ECMA-335 in a late revision. I wrote about this file format a while ago.
To sum up, it was added very late, when Mono had already started to use its own format (mdb), and the .net CLR doesn’t understand it anyway. So even if it’s not a bad format (it could use some improvements, like a GUID heap similar to the one in a .net assembly), basically no one uses it in the real world.
As mentioned in a recent post, the CCI contains an interesting piece of code, a managed pdb reader, licensed under the Ms-PL. I extracted it and used it to better share debug symbols between the .net CLR and Mono.
pdb2mdb
Robert Jordan, a long-time Mono contributor, first wrote a tool named pdb2mdb to convert a pdb to an mdb. The issue is that it was based on a combination of COM and the mixed-mode assembly ISymWrapper, which comes with the .net framework. All in all, it means that this version of pdb2mdb could only run on the .net framework on Windows.
With the managed pdb reader, it was very easy to write a fully managed pdb2mdb tool. It’s now available in svn, and it will ship with the other developer tools, such as ilasm or the linker. It’s very easy to use. Say you’re deploying a .net application on Linux, and you have an assembly Foo.dll and a Foo.pdb file. Just run:
    pdb2mdb Foo.dll
The tool will then generate a file named Foo.dll.mdb, which Mono can use to display line information in stack traces.
Mono.Cecil.Pdb
Mono.Cecil.Pdb is an assembly that you use together with Cecil, to have line information at the IL level. It’s used by tools such as Gendarme, or MoMA, to help diagnose and locate issues.
I’ve integrated the managed reader, and the folks from NDepend were kind enough to beta test it. After a few fixes, the managed reader passed all NDepend tests, and was performing a lot better than its unmanaged counterpart. It’s now the default, and only the pdb writer uses the ISymWrapper approach.
It would be an interesting challenge for someone to try to write a managed writer from the information gathered in the reader. It may not be easy though.
Converting Delegates to Expression Trees
Back when I was working at db4o, we had fun implementing a mechanism somewhat similar to LINQ, to have strongly typed queries expressed using code itself. The implementation uses Mono.Cecil and Cecil.FlowAnalysis to decompile a delegate into an AST that db4o’s query optimizer can process.
Since .net 3.5, an API, System.Linq.Expressions, can be used to represent a C# lambda expression as an object graph: an expression tree. .net 4.0 will add support for statements to this API, but as far as I know, the language itself hasn’t been updated to produce those new nodes.
Anyway, a few days ago, someone on Stack Overflow asked how to turn a delegate into a LINQ expression tree. There’s no built-in feature to do that, and it’s not a straightforward process. You basically have to decompile the compiled method. I guess it’s a good thing that I’m working on a decompiler, if I need to decompile something.
Tonight I wrote a short spike to verify the feasibility of my idea, and it turns out to be pretty simple. Sample:
    static void Main ()
    {
        Func<int, int> magic = i => i * 42;
        Expression<Func<int, int>> expression =
            DelegateConverter.ToExpression (magic);
        Console.WriteLine (expression.ToString ());
        // prints: i => (i * 42)
        Console.WriteLine (expression.Compile ().Invoke (1));
        // prints: 42
    }
DelegateConverter is implemented as a simple visitor which walks over a Cecil.Decompiler AST and generates, if possible, the corresponding LINQ expression tree. Pretty cool, isn’t it?
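To give an idea of what such a visitor has to emit, here is a minimal sketch that builds the tree for the sample lambda by hand, using only the System.Linq.Expressions factory methods. This is not the code of the spike: the class name HandBuiltExpression is made up for the illustration, and a real converter would derive each node from the decompiled AST instead of hard-coding it.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only; not part of the spike.
static class HandBuiltExpression
{
    // Builds the expression tree for `i => i * 42` one node at a time.
    public static Expression<Func<int, int>> Build ()
    {
        // The lambda parameter `i`.
        ParameterExpression i = Expression.Parameter (typeof (int), "i");

        // The body `i * 42`: a Multiply node over the parameter and a constant.
        BinaryExpression body = Expression.Multiply (i, Expression.Constant (42));

        // Wrap the body and its parameter into a strongly typed lambda.
        return Expression.Lambda<Func<int, int>> (body, i);
    }
}
```

Calling HandBuiltExpression.Build ().Compile ().Invoke (1) returns 42, just like the converted delegate in the sample above.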
You can browse the code of the spike. Keep in mind that it’s nowhere near complete, and that it’s just a proof of concept. Still, I think it’s a pretty cool use of the Cecil.Decompiler library.
Cecil and the CCI
Quite a number of friends pinged me about the recent release of the CCI, under the Ms-PL, and were curious about my take on it, and its effect on Cecil and its ecosystem.
First of all, there’s a bit of a story here, and I’ll write it down for those who, like me, love software history. Back in 2003 and 2004, I was working with Thomas Gil, one of my mentors and programming heroes, on one of the first static aspect weavers on .net, AspectDNG, now abandoned. I was actively researching better ways to do CIL injection.
We went from raw IL text manipulation, to Reflection and Reflection.Emit using Lutz’s ILReader library, to RAIL, until I decided to work on Cecil.
In the meantime, I stumbled upon ILMerge, a tool from Mike Barnett, and mailed him to ask what powered the tool. He put me in contact with Herman Venter, the man behind the CCI effort. I wrote Herman a couple of mails, in terrible English, and begged him to push for a release of the CCI under a license we could use in AspectDNG. That was in March 2004.
As you can guess, it didn’t quite work out at that time, so I started working on Cecil. A few weeks later, Miguel blogged about the need for such a library. He already had the Mono Linker in mind. I mailed him, got SVN access, checked in the beginning of Cecil, got Sébastien interested, etc.
I had the opportunity to be invited by Microsoft to attend an informal AOP workshop the following year, and to meet Herman, whom I remember as a very nice person. I am not sure he remembers the terribly shy kid who gave a terrible presentation in terrible English. But all in all, I’m happy that five years later, my request went through.
Now, the CCI release on its own CodePlex page is not really a big event, since the CCI had already been released under the Ms-PL as part of Sandcastle.
Anyway, Cecil is quite mature in its current form and is used by a fair number of (known) applications (please help to improve the list). I’m currently working on two things.
The first one is a refactoring of Cecil, which vastly reduces memory consumption as well as reading and writing time. Hopefully I’ll have a beta in a month or so. We have great plans for this version of Cecil, and it’s consuming a lot of my time. More on this later.
The second one is an extensible decompiler, Cecil.Decompiler, that will greatly benefit from the Cecil refactoring. The time I dedicate to it is being eaten up by the Cecil refactoring right now, but it’s certainly one of my favorite projects.
The CCI amounts to a combination of Cecil, the decompiler, and something to write a decompiled AST back, the last of which will be the natural evolution of the Cecil decompiler. Note that the CCI decompilation/compilation process is not extensible. Now that it’s open source, you can hack on it yourself. Sadly, the CCI code is, well, a bit messy, to put it politely; it’s not exactly a joy to read. Also, you probably won’t be able to contribute back to the CCI.
Anyway, it does its job all right, and so does Cecil. Choice is always good, so let’s welcome the CCI into the small family of such tools. I, for one, will surprisingly keep hacking on and with Cecil :)
To conclude on a very positive note, the fantastic thing about this release is that the CCI contains a fully managed PDB reader and writer. That’s great news, as so far we had failed to get any details about this file format. This means that we can now implement fully managed Mono.Cecil.Pdb support, and that’s just great.
UPDATE: it appears that only the PDB reader is fully managed. The PDB writer is just a wrapper over the COM stuff, just like the current implementation of Mono.Cecil.Pdb. Well, at least it’s a start.
Mono embedder managed tool-chain
Miguel blogged a list of iPhone applications that were made using Unity3D, and scripted with Mono.
I’m delighted to have cute applications using Mono to show, and one involves raptors!
One of the concerns for Unity3D is to reduce the size of the download as much as possible, and Mono certainly weighs a little bit in the download. We have a page on our wiki which describes how to reduce the size of the runtime, but I’m writing today about what Unity3D uses to reduce the size of the different managed parts, be it the core libraries or the managed part of the game itself.
They are using two different tools to reduce the size of the assemblies. Once the application is compiled, the first tool they use is the Mono Linker. I already had the occasion to write about the linker, as it is a tool that I started writing during the second Google Summer of Code I spent as a student, and that I worked on when I joined Novell. The linker is today a mature piece of code that is exercised during every single Mono build, as it is used to produce the Moonlight 2.0 version of our class library.
So they are using the linker, and this makes sure that everything that is not needed by the game or the engine is removed from the assemblies.
Once the linking stage is complete, they pre-compile the assemblies to native code using our AOT (Ahead of Time) compiler. This is necessary, as the iPhone prevents any JIT from running.
After AOT, you end up with a native binary, and the original managed binary, which still contains the intermediate code. If the assembly has been completely AOTed, this intermediate code is no longer necessary.
Here comes the second tool they’re using. It’s pretty new, and I wrote it for this specific usage. It’s called `mono-cil-strip`, and is now built alongside the traditional tools that we ship, such as ilasm or the linker. It uses a special mode of Cecil I hacked on, which preserves the original metadata structure of the assembly, but empties every single method body. Preserving the metadata structure is necessary to keep the native binary in sync with the managed binary, while still removing parts of it.
If you’re compiling your assemblies ahead of time, and looking for some bytes to save, here’s a neat way to do so.
Sadly I don’t have numbers handy, so I’ll encourage you to give it a try, but here we are, every single iPhone application produced using Unity3D went through Cecil (twice!).
And that’s pretty cool :)



