Rewriting the metagen tool with LLVM Clang

The current metagen tool is written in Java and uses Doxygen as its C++ parser. Doxygen is a great documentation tool with a decent C++ parser, but it has its limitations; for example, it's difficult to get all the information out of the Doxygen XML. I still remember having a hard time resolving a type's fully qualified name, and I had to write a simple code parser to work out the type relationships. Though the current metagen was used successfully to generate meta data for Irrlicht, which is a very complicated 3D graphics engine, it's very hard to turn metagen into an all-purpose tool that works with any library. When I tested it, I found it very hard to make metagen work with wxWidgets.

I believe LLVM Clang is what I should switch to. There are two primary reasons for this decision:

  • Clang is a real and very powerful C++ compiler, so it can produce every bit of information that we need from any C++ code.
  • Clang is designed with tooling in mind. It's easy to develop tools based on Clang.

I started rewriting metagen during the Dragon Boat Festival here in China. After about two days of work, I have almost finished the front end, which converts the Clang AST (abstract syntax tree) into metagen's internal data structures. What I should do next is reorganize the internal data structures and output the meta data code, which has little to do with Clang.

So far I'm very happy working with Clang. My only concerns are that Clang lacks documentation and that there are not many open source tools using it.

My goal for metagen is still the same: make metagen work with very complicated code bases, such as wxWidgets, without too much human intervention. I believe Clang can meet that goal. :)


Sarunas Valaskevicius, 2014/02/08 11:32
Hi Qi,

Just out of interest, have you tried doxygen clang support?

Qi, 2014/02/08 12:03
The ultimate goal is to eliminate the dependence on third party tools, such as Doxygen.
Rob, 2013/12/28 15:29
Very cool project! Will you be releasing the source?

I've been meaning to try this for over a year but never find the time to get around to it. Would be so nice as a pre-build step to give us reflection without the bloat of reflection definitions! Some sort of middle step might be eventually necessary to insert some RTTI information (like descriptions of classes / funcs), but I'm really interested to see what kind of results you get!

Nice work, cheers from the USA! :)
Qi, 2013/12/29 09:06

All sub-projects in cpgf are open source too, so yes, I will release the source code.
However, besides waiting for the Clang version of metagen, did you try the current Java version? It works decently.
Szymon Gatner, 2013/07/18 08:53
That is why I am not very happy with clReflect also. I didn't read the code but I strongly suspect that generated metadata is platform (compiler even) dependent. In the end it would probably have to be somehow embedded in executable (possibly as a resource on Win) to make reflection / serialization work.

My game engine (of course;)) already uses similar (to cpgf) exporting code because of luabind. I suspect you know it too. I actually always wanted to make more use of it than just for Lua bindings. I was wondering how well cpgf could work with luabind. Could cpgf metadata be used to automatically generate luabind bindings? I can see that cpgf can generate direct Lua bindings but I would really like to keep luabind as it provides additional features like function calls, classes and inheritance.

So how is the Clang-based metagen going? Will it allow C++11 in the code? C++11 support is disabled in Clang by default and clReflect suffers from that too, making it unusable for me.

Qi, 2013/07/18 09:58
I think you, like some others, have misunderstood metagen. Let me explain again.

Metagen is only a productivity tool, not a must-have. Without metagen, reflection and script binding still work because you can write the meta code manually, with syntax quite similar to Luabind. Also, metagen only generates code recognized by cpgf, so it can't be used with Luabind.

If your requirement is only to use C++ from Lua, then the Lua binding in cpgf supports more features than Luabind, such as default function arguments. The cpgf Lua binding also supports function calls, classes and inheritance.

I didn't test with C++11. I believe cpgf can work with most C++11 features, though metagen can't for now (the new Clang-based metagen should work).
Szymon, 2013/07/18 10:12
I do understand the purpose of metagen ;) My question was whether I can use this metacode (whether it is generated or manually created) to run luabind exporting at run time. Pseudo code (run before main):

foreach (CpgfClass c in allCpgfExportedClasses())
    foreach (CpgfMethod m in c)
        exportLuabindMethod(c, m.pointerToMethodOfAClass)
// same for fields etc.

I hope this "snippet" gives an idea of what I have in mind. The idea is just to reuse cpgf metadata/code to automate luabind bindings. I am actually about to try it myself ;)

I don't want to just switch to the cpgf Lua binding because I am fairly sure this would break existing scripts and there are a -lot- of them.

I understand that cpgf can work OK with C++11; that is why I asked about the Clang-based metagen. As I mentioned, Clang has C++11 support disabled by default, and that is why clReflect, which also uses Clang, can't parse C++11 code.
Qi, 2013/07/18 10:43
You can't use cpgf meta data for Luabind. Luabind has its own internal meta data which is not compatible with cpgf.

For C++11, don't worry, I can enable it in Clang. :-)
Szymon, 2013/07/18 10:48
"metadata" is a runtime thing in the case of cpgf and luabind. I still see no reason why cpgf metadata can't be used to generate luabind metadata at run-time but maybe I am missing something important here. Still, I will experiment a bit and probably get back with more questions ;)

Anyway, very cool project :)
Szymon Gatner, 2013/07/17 22:56
Have you seen ? It has very similar goals and also uses clang to generate metadata. What are your thoughts on clReflect?
Qi, 2013/07/18 01:45
Yeah, I had heard of clReflect and I read its code. It gave me ideas on how to get Clang to work. :-)
clReflect has its advantage: the runtime should be quite small since it uses an external meta data database. The disadvantage is that it stores function addresses and field offsets in the database, which is not very flexible or standard.
The meta data in cpgf is generated by the compiler using template techniques, which is quite powerful and standard, though the executable is much larger.