Frequently Asked Questions
While there are many commercial data generators (some of which are quite good), there seems to be a lack of data generators in the open source domain. Most of the free data generators are simply inadequate, and they are not software I would like to use. In this respect, dbMonster nicely fills a gap. Nevertheless, I had many ideas about developing a data generator that did not seem to fit into the dbMonster concept.
Firstly, the data generator I had in mind should be able to produce data in various formats. If the application is appropriately layered and modular, generating data for XML, databases or CSV files should not be a problem. The only thing that differs in these cases is the presentation layer. This is not to say that handling generated data for a database is the same as handling data for a text file, but once the data generators are there, what you choose to do with their output should not be much of a problem. Whether you want your generated data to end up in a database, in an XML or text file, or to be transmitted over a network socket, it doesn’t matter.
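To illustrate the layering idea, here is a minimal sketch in which the same generated row is handed to two different presentation layers. The names RowSink, CsvSink and XmlSink are assumptions made for this example and are not part of dgMaster; the CSV writer is deliberately naive and does no quoting.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical output abstraction: generators produce rows, sinks decide the format.
interface RowSink {
    void write(List<String> row);
}

// Writes rows as comma-separated lines (naive: no quoting or escaping of commas).
final class CsvSink implements RowSink {
    private final Writer out;

    CsvSink(Writer out) {
        this.out = out;
    }

    @Override
    public void write(List<String> row) {
        try {
            out.write(String.join(",", row));
            out.write("\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Writes rows as simple XML elements, escaping characters that would break the markup.
final class XmlSink implements RowSink {
    private final Writer out;

    XmlSink(Writer out) {
        this.out = out;
    }

    @Override
    public void write(List<String> row) {
        try {
            out.write("<row>");
            for (String value : row) {
                out.write("<field>" + escape(value) + "</field>");
            }
            out.write("</row>\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

final class SinkDemo {
    public static void main(String[] args) {
        List<String> row = Arrays.asList("John", "Smith");
        StringWriter csv = new StringWriter();
        StringWriter xml = new StringWriter();
        // The same generated row goes to two different presentation layers.
        new CsvSink(csv).write(row);
        new XmlSink(xml).write(row);
        System.out.print(csv);
        System.out.print(xml);
    }
}
```

Because both sinks accept any Writer, the same code could in principle target a file or a socket stream; a database sink would need a different implementation behind the same interface.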
Secondly, an appropriate GUI would be nice, since it would give the user a better view of what is going on, without the need for extensive typing or digging into XML configuration files. The GUI, however, is merely a front end to an engine, and as such it can be bypassed at will. Data generator related information is saved in XML files, and therefore an appropriately modular data generator could also run from the command line.
Thirdly, there is always a need to generate a wide range of different types of data. Surely, one can generate the basic SQL data types, but one would also like to generate meaningful data such as names, streets, emails, etc. Since it is impossible to know in advance what each user needs, the data generator should be extensible. This is a major requirement: developers should be able to develop new data generators and plug them into the main application easily.
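One way the extensibility requirement could look in code is sketched below. This is only an illustration under assumed names: the DataGenerator interface, the GeneratorRegistry class and the StreetGenerator plug-in are hypothetical and do not describe dgMaster's actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical contract every pluggable generator would implement.
interface DataGenerator<T> {
    T next();
}

// Hypothetical registry: new generators are plugged in under a name.
final class GeneratorRegistry {
    private final Map<String, Supplier<DataGenerator<?>>> factories = new LinkedHashMap<>();

    void register(String name, Supplier<DataGenerator<?>> factory) {
        if (factories.putIfAbsent(name, factory) != null) {
            throw new IllegalArgumentException("Generator already registered: " + name);
        }
    }

    DataGenerator<?> create(String name) {
        Supplier<DataGenerator<?>> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown generator: " + name);
        }
        return factory.get();
    }
}

// Example plug-in: picks a street name from a small sample list.
final class StreetGenerator implements DataGenerator<String> {
    private static final String[] STREETS = {"High Street", "Station Road", "Church Lane"};
    private final Random random;

    StreetGenerator(long seed) {
        this.random = new Random(seed);
    }

    @Override
    public String next() {
        return STREETS[random.nextInt(STREETS.length)];
    }
}

final class PluginDemo {
    public static void main(String[] args) {
        GeneratorRegistry registry = new GeneratorRegistry();
        // A developer plugs in a new generator without touching the registry code.
        registry.register("street", () -> new StreetGenerator(42L));
        System.out.println(registry.create("street").next());
    }
}
```

The point of the sketch is that the main application only knows the interface, so a new generator is a new class plus one registration call.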
My idea/vision for this application goes beyond the user requirements mentioned above. For example, one could also develop composite generators (generators that return more than one value), hook up Java code to certain events (listeners that are triggered before or after data is inserted into the database, etc.), or even develop some kind of scripting that combines different data generators.
Of course, a data generator is about generating data, so there is always a need for gathering realistic data. This goal will be a continuous requirement, one that is likely to remain even after most of the development is completed.
To sum up, I believe dgMaster will fill a gap in the area of data generation in a way that will be helpful to developers.
Currently, dgMaster supports most of the “primitive” Java data types plus some “higher-level” data:
Boolean, date, string, numerical types (Integer, Long, Float, Double), SQL date, SQL time, timestamp, English first names and last names, emails, English text, string values from user-defined lists, unique strings (for db keys), numerical increment values.
Each generator can be tweaked, but of course, one may want to use these generators as a basis for writing new ones. More data generators are on the upgrades list…
Currently, only the text output format is supported. The text module does, however, support enclosing fields in user-defined characters, predefined field widths and alignment, separation of fields by comma, tab or another character, etc.
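To make these text options concrete, here is a minimal sketch of a row formatter with an enclosing character, a fixed width, an alignment and a separator. FieldFormat and TextRowWriter are hypothetical names chosen for this example; the real text module may be organised quite differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical per-field layout: width, alignment and enclosing character.
final class FieldFormat {
    private final int width;          // 0 or less means "no padding"
    private final boolean rightAlign; // false means left-aligned
    private final String enclosure;   // e.g. "\"" or "" for none

    FieldFormat(int width, boolean rightAlign, String enclosure) {
        this.width = width;
        this.rightAlign = rightAlign;
        this.enclosure = enclosure;
    }

    String apply(String value) {
        String padded = value;
        if (width > 0) {
            // "%-5s" pads on the right (left-aligned), "%5s" pads on the left.
            String pattern = "%" + (rightAlign ? "" : "-") + width + "s";
            padded = String.format(pattern, value);
        }
        return enclosure + padded + enclosure;
    }
}

// Hypothetical writer that joins formatted fields with a separator.
final class TextRowWriter {
    private final String separator;
    private final List<FieldFormat> formats;

    TextRowWriter(String separator, List<FieldFormat> formats) {
        this.separator = separator;
        this.formats = formats;
    }

    String formatRow(List<String> values) {
        if (values.size() != formats.size()) {
            throw new IllegalArgumentException("Expected " + formats.size() + " values");
        }
        StringJoiner joiner = new StringJoiner(separator);
        for (int i = 0; i < values.size(); i++) {
            joiner.add(formats.get(i).apply(values.get(i)));
        }
        return joiner.toString();
    }
}

final class TextFormatDemo {
    public static void main(String[] args) {
        TextRowWriter writer = new TextRowWriter(",", Arrays.asList(
                new FieldFormat(5, false, "\""),  // quoted, left-aligned, width 5
                new FieldFormat(4, true, "")));   // unquoted, right-aligned, width 4
        System.out.println(writer.formatRow(Arrays.asList("Ann", "42")));
    }
}
```

Note that this sketch pads short values but does not truncate values longer than the width; how the real module handles that case is not stated here.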
Of course, db support is high on the upgrades list… (see next point).
The next feature to make it into dgMaster will be support for database operations. By and large, any database for which JDBC drivers exist will be supported. dgMaster will be able to generate data that respects referential integrity. This means that the user will have a way to tell dgMaster how entities relate to each other (1:1, 1:n, m:n), so that the application can generate appropriate primary and foreign keys.
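Since database support is not implemented yet, the following is only a rough sketch of the key-generation idea for a 1:n relationship, without any JDBC code. ParentRow, ChildRow and OneToManyKeys are hypothetical names: every child row receives a foreign key that is drawn from the primary keys already generated for the parent rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical parent entity with a generated primary key.
final class ParentRow {
    final long id;

    ParentRow(long id) {
        this.id = id;
    }
}

// Hypothetical child entity: its foreign key must match an existing parent key.
final class ChildRow {
    final long id;
    final long parentId;

    ChildRow(long id, long parentId) {
        this.id = id;
        this.parentId = parentId;
    }
}

final class OneToManyKeys {
    private final Random random;

    OneToManyKeys(long seed) {
        this.random = new Random(seed);
    }

    // Primary keys are simple increments starting at 1.
    List<ParentRow> parents(int count) {
        List<ParentRow> rows = new ArrayList<>(count);
        for (long id = 1; id <= count; id++) {
            rows.add(new ParentRow(id));
        }
        return rows;
    }

    // Each child picks its foreign key from the parents that already exist (1:n).
    List<ChildRow> children(List<ParentRow> parents, int count) {
        if (parents.isEmpty()) {
            throw new IllegalArgumentException("At least one parent row is required");
        }
        List<ChildRow> rows = new ArrayList<>(count);
        for (long id = 1; id <= count; id++) {
            ParentRow parent = parents.get(random.nextInt(parents.size()));
            rows.add(new ChildRow(id, parent.id));
        }
        return rows;
    }
}

final class KeysDemo {
    public static void main(String[] args) {
        OneToManyKeys keys = new OneToManyKeys(7L);
        List<ParentRow> parents = keys.parents(3);
        for (ChildRow child : keys.children(parents, 5)) {
            System.out.println("child " + child.id + " -> parent " + child.parentId);
        }
    }
}
```

Generating the parents first and the children second also reflects the order in which the rows would have to be inserted so that foreign key constraints are never violated.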
Single data generators are probably not versatile enough in a number of cases. For example, when generating a first name, a last name and an email, one would ideally want the email to be somehow related to the generated full name. Similarly, English post codes are always related to a town and a street. Single generators have no way of knowing what data other generators have produced. A compound/composite data generator would be able to generate a number of values rather than a single one. This means that the first name, the last name and the email would all be generated by the same generator. Similarly, the town, the post code and the address would be generated by the same generator.
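The composite idea described above could be sketched as follows. This is an assumption-laden illustration, not a planned design: Person and PersonGenerator are hypothetical names, the name lists are tiny samples, and the email pattern (first.last@domain) is just one possible way to relate the email to the name.

```java
import java.util.Locale;
import java.util.Random;

// Hypothetical value object holding the related values produced together.
final class Person {
    final String firstName;
    final String lastName;
    final String email;

    Person(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }
}

// Hypothetical composite generator: one call yields three consistent values.
final class PersonGenerator {
    private static final String[] FIRST_NAMES = {"Alice", "James", "Emma"};
    private static final String[] LAST_NAMES = {"Smith", "Jones", "Taylor"};

    private final Random random;
    private final String domain;

    PersonGenerator(long seed, String domain) {
        this.random = new Random(seed);
        this.domain = domain;
    }

    Person next() {
        String first = FIRST_NAMES[random.nextInt(FIRST_NAMES.length)];
        String last = LAST_NAMES[random.nextInt(LAST_NAMES.length)];
        // The email is derived from the names picked above, so the values stay related.
        String email = (first + "." + last).toLowerCase(Locale.ROOT) + "@" + domain;
        return new Person(first, last, email);
    }
}

final class CompositeDemo {
    public static void main(String[] args) {
        PersonGenerator generator = new PersonGenerator(1L, "example.com");
        Person person = generator.next();
        System.out.println(person.firstName + " " + person.lastName + " <" + person.email + ">");
    }
}
```

A town, post code and address generator could follow the same shape, with the post code looked up from the town that was picked.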
Once many generators have been defined, it is likely that the generator tree on the left of the main form will have to be reorganised. Currently, there are 5 main nodes: built-in generators, user-defined generators, text files, XML files and database files. Ideally, users should be allowed to create their own structure for displaying the generators. A user may want to organise the generators by returned data type, or by the semantics of the generator.
Further features will include online help, operation from the command line, events, etc.
For now, I am offering this first milestone version to signal the birth of a new open source data generator project. I am perfectly aware that the current release may fall below your expectations. I have set a goal of building a robust, versatile and helpful application to be used by fellow developers out there. It will be a long task, since I want to do it "appropriately" (building a web site, offering technical documentation, user guides, etc.), so for the time being, you need to be a little bit patient :)