This program generates code for many languages from an UML Dia Diagram.
This program is a small utility that makes code from a Dia diagram. Supported languages are: ActionScript3, Ada, C, C++, C#, IDL, Java, PHP(4,5), Python, Ruby, shapefiles, and SQL create statement files.
Its intended purpose is to ease the programmer's work by generating the structure of the classes in an Object Oriented language (like C++, Java and C#) from a graphical representation of them (a Dia Diagram).
Dia2Code generates ActionScript, Ada, C, C++, C#, IDL, Java, PHP, Python, Ruby, and shape files. It can also generate a file with SQL's CREATE TABLE statements and .bat files for creating shapes. Templates and packages are supported but need testing.
The basic functionality can now be considered complete. Minor releases should fix bugs and add small improvements.
The generation of #include and import directives is done considering the types used in each class but, generally, only if those types are declared themselves in the diagram. Classes are searched in the types of: attributes, parents, method's return types, method's parameters, dependencies and associations.
Dia2Code looks for Generalizations, Realizations, Implements, Dependencies and Associations and adds that information to the class list.
Packages are implemented considering the geometry information of the objects (packages and classes).
Feel free to have a try and send me your comments.
This program is distributed under the GNU GPL. Read the COPYING file for details.
I've only tested it under Linux/i386 and Win32. I have notices of successful compilation on *BSD and GNU Hurd. It should work with other platforms, but your mileage may vary.
- Dia (I work with 0.88 but older versions may be OK). Not for the compilation part but to make some nice diagrams with it.
- libxml2 headers and development libraries (I use 2.4.1 but it may be OK if you use any version greater than 2.0.0).
- A C compiler.
- automake and autoconf (actually, optional)
Usually:
$ ./configure
$ make
# make install
Read the INSTALL
file for details.
If that doesn't work for you, try (from the innermost dia2code directory):
$ cc -I/usr/include/libxml *.c -O2 -o dia2code -lxml2 -ldl
$ strip dia2code
I've modified Makefile.am and configure.in so "configure" will hopefully find everything it needs.
As with any open-source software, compilation requires a copy of the
Xcode command-line tools.
To run ./configure
the user may be required to run ./reconfig.sh
, if
so you must obtain a copy of libtool from Homebrew or
MacPorts.
$ dia2code <parameters>
Where <parameters>
can be any combination of:
-h|--help
prints help on parameters and exits inmediately
-t (ada|as3|c|cpp|csharp|idl|java|php|php5|python|shp|sql)
tells dia2code to either create Ada (ada), ActionScript (as3), C (c), C++
(cpp), C# (csharp), Java (java), PHP4 (php), PHP5 (php5), Python (python),
shape files (shp), or SQL "create table" files (sql).
The default is C++.
-d <outdir>
tells it to put the files in the specified directory.
The default is to output files in the current directory.
-nc
asks dia2code to not overwrite existing files.
- Default return value for non-void methods.
-cl <classlist>
generates code only for the classes specified in the
comma-separated <classlist>.
-v
inverts the class list selection. When used without -cl,
prevents any file from being created.
-l <licensefile>
prepends the contents of the specified file to each source file
generated.
<diagramfile>
Name of the dia file (compressed or not) that contains the
UML diagram to be parsed.
The only mandatory parameter is the diagram file name.
Note: Parameters can be specified in any order.
$ dia2code -t java test.dia
Will generate the *.java files for the classes in the test.dia diagram and put them in the current directory.
$ dia2code -nc -d ~/C++ hw.dia
Will generate the *.cpp and *.h files for the classes in the test.dia diagram and put them in the directory ~/C++. It won't overwrite any existant file.
$ dia2code -t java -cl Base,Derived test.dia
Will create the Java files only for classes "Base" and "Derived".
$ dia2code -cl B*,*Foo test.dia
Will create the *.cpp
and *.h
files for classes that begin with "B" or
end with "Foo".
NOTE: You can only specify one asterisk and either at the beginning or end
of the pattern.
NOTE2: You may have to quote the asterisks when typing on the shell
(e.g. "\" instead of "")
$ dia2code -cl Foo -v foobar.dia
Will create C++ code for all classes but "Foo".
$ dia2code -v test.dia
Will not create any files. Don't know if it may be useful, but it surely is syntactically correct.
- Parse the diagram file with xmlParseFile().
- Parse the tree generated in 1 for UML classes to build an umlclasslist (type defined here).
- Parse the tree again for UML Generalizations, Realizations, Implements, Dependencies, Associations and packages to add information to the class list.
- Generate the structure of the classes (write it into files) from the class list.
Steps 1-3 are done in parse_diagram()
.
Step 4 is done in generate_code_*()
.
Both functions are called from main()
.
What you should put into your diagram
These are mandatory:
- A non-empty class name.
- For each attribute, the name and type.
- For each method, the name.
- For each method's parameter, the name and type.
These are optional:
- The stereotype of the class.
- The default value of attributes.
- The default value of parameters.
- The return type of a method. If no type is declared, dia2code will output no type at all for it.
In IDL, C++, and Ada, some stereotypes are supported that all begin with "CORBA". These are: CORBAConstant, CORBAEnum, CORBATypedef, CORBAStruct, CORBAUnion, CORBAValue. When generating IDL, the corresponding IDL syntax is generated for the UML class. When generating other languages, a rough mapping from the IDL to the other language is generated.
The following conventions are followed by the code generation for CORBA stereotypes:
Constant First attribute's type contains the const type; first attribute's value contains the const's value. Everything else unused. Enum Attribute names contain the enum literals, everything else unused. Typedef First attribute's type contains the original type; first attribute's value optionally contains the dimensions in case of an array. Everything else unused. Struct Attribute names and types contain the struct members. Everything else unused. Union First attribute's type contains the switch type. Only enum types are supported as the switch type. Following attributes contain the union cases as follows: Attribute value contains the switch value for the case (only exactly one value possible per case); attribute name and type contains the union member name and type for that case. Everything else unused. Value Use much like a normal class. Staticness not supported.
In C++, an abstract class is not explicitly declared as it is in Java. To have a class declared as "abstract" you have to set its "abstract" flag.
For generating Java, dia2code recognizes two stereotypes: "Interface" and "JavaBean". The former is a standard stereotype name, the latter a convention we use here. Both interfaces are only meaningful when generating Java code. If the stereotype is "JavaBean" then dia2code will output, for each attribute a of the class:
- A "T getA();" method, with return type conformant to the type of the attribute. If the type of the attribute is "boolean" then the name will be: "boolean isA()".
- A "void setA(T value);" method, with a paramter, "value", with type conformant to the type of the attribute.
Note: actually, dia2code will not output these things, but it will create the methods and paste into each one the suggested implementation. These methods will be treated as the rest. The generate_code_java outputs an implementation if one's avaliable. The generate_code_cpp just happens to do the same, but I feel it is most useful when generating Java code.
Dia2Code does not handle the "implementation" visibility for methods (yet). So when you have a class that implements a method that was declared abstract in a base class, you have to give it the same visibility of the parent class' method. The visibility of the method is not printed if it is "implementation"; this may be a source of bugs.
If you leave the "type" entry in the method declaration empty, then no type will be declared for it. This is useful with constructors, when the return type is implicit (both in C++ and Java). If the return type is void, you should explicitly declare it in the UML diagram. I don't know if this is a good practice, I just thought it was reasonable. Everyone is welcome to discuss it.
The UML standard states that there are (mostly) two ways of representing packages: a large box with all the elements inside (Large Package in Dia) or a small box (Small Package in Dia) to which the package's elements connect with a symbol that Dia does not (yet?) include: a line with a crossed out circle in the end of the container. You can put a normal line with a "Coordiate origin" arrow at one end, of course, and will be aestetically correct, but of no use to Dia2Code.
Dia2code, currently, checks only for intersections of packages (Large or Small) with other objects like classes and other packages, so it will be mostly useful with diagrams that use Large Packages instead of Small ones.
As of version 0.8.2, the Java, IDL, C++, and Ada code generators use package information. We should upgrade the other generators so they will generate better import/include directives. The Java code generator currently outputs all the files to the specified directory, does not create one for each package. The IDL, C++, and Ada generators generate files only for the outermost packages/classes. UML elements nested inside packages are generated as items nested with the IDL module, C++ namespace, or Ada package.
Code Generators:
A code generator function is a function that takes a pointer to a batch structure. The structure looks like this:
struct batch {
umlclasslist classlist;
char * outdir;
int clobber;
namelist classes;
int mask;
};
The classes to generate are in classlist.
The output directory is outdir. The files that the generate_code opens for writing MUST have this output directory as a prefix, UNLESS it is NULL, in which case, it will default to ".". The function SHOULD check the length of the directory for possible buffer overflows inside the function.
If the clobber flag is set, any already existing file in the output directory MUST NOT be overwritten.
Classes is a list of the classes to create code for. Mask is a flag that inverts this selection.
Some example code:
#include "dia2code.h"
#include "decls.h"
#include "includes.h"
generate_code_foo (batch *b)
{
umlclasslist tmplist;
declaration *d;
tmplist = b->classlist;
while (tmplist != NULL) {
if (! (is_present (b->classes, tmplist->key->name) ^ b->mask)) {
push (tmplist, b);
/*
The class is pushed onto the global `decls'.
We do this in order to support packages:
A single package may contain multiple classes
and these classes must be generated in the correct
order, i.e. declare classes that depend on other
classes towards the end.
*/
}
tmplist=tmplist->next;
}
/* Generate a file for each outer declaration. */
d = decls;
while (d != NULL) {
char *name;
char outfilename[256];
if (d->decl_kind == dk_module) {
name = d->u.this_module->pkg->name;
} else { /* dk_class */
name = d->u.this_class->key->name;
}
sprintf (outfilename, "%s.h", name);
spec = open_outfile (outfilename, b);
if (spec == NULL) {
d = d->next;
continue;
}
/* Determine include files needed. */
includes = NULL;
determine_includes (d, b);
if (includes) {
namelist incfile = includes;
while (incfile != NULL) {
if (!eq (incfile->name, name)) {
emit ("#include \"%s.h\"\n", incfile->name);
}
incfile = incfile->next;
}
emit ("\n");
}
/*
We generate all the code here
*/
fclose (spec);
d = d->next;
}
}
Generating import/include clauses:
includes.h contains the function determine_includes() and the global variable `includes'. Handling is as shown in the example above.
determine_includes() assumes that a file is generated for each outer package and for each class not nested inside a package.
If this does not make sense for the programming language you are targeting, then there are two lower-level functions that can help the code generator know which classes are used inside the current class: find_classes() and list_classes(). The former is the eldest of the two. It returns a namelist (a list of char*) with the names of the classes used that are also declared in the same diagram (are on the classlist). The latter is a new addition in version 0.7. It does basically the same search but instead of a list of char* it returns a classlist with pointers to the diagram classes themselves. Which to use is up to you. For generators of C and SQL code the find_classes() will suffice. For generators of Java code, that use heavily the package information, the list_classes() is more suitable.
Note: some bugs may not be listed here.
- No checking for full filesystem while writing files.
- Does not follow the DTD of a Dia Diagram, rather parses as it sees fit.
- Allocates memory proportionally to the size of the diagrams. It should work OK with small diagrams but may slow down with BIG ones.
Javier O'Hara [email protected]
Richard Torkar [email protected]
(in alphabetical order, by last name)
- Cyrille Chepelov [email protected] Pyhton code generation, Debian package management, Hurd conformance.
- Harald Fielker [email protected] PHP code generation.
- Oliver Kellogg [email protected], CORBA stereotype and package support
- bansan [email protected] Improvements to C++ code generation
- Ophir Lojkine (@lovasoa), bug fixes and support for database diagrams
- Ruben Lopez [email protected], C code generation.
- Steffen Macke [email protected] batch shapefile generation, win32 installer.
- Chris McGee [email protected] Dependencies, Associations, C++ Templates, SQL, IDL code generation.
- Takashi Okamoto [email protected] License inclusion mechanism.
- Thomas Preymesser [email protected] Ada code generation.
- Dmitry V. Sabanin [email protected] Ruby code generation.
- Jérôme Slangen [email protected] Wildcard class list mechanism.
- Takaaki Tateishi <> Dynamic Shared Objects for dynamic code generator modules.
- Martin Vidner [email protected] Porting to libxml2.
- Thomas Hansen [email protected] C# code generation.
Thanks to Collin Starkweather and Slush Gore for the extra help. Also, thanks to all the people that have contacted me with suggestions and bug reports.