Writing Code Is Stupid
Summary: Code generation is the next software development paradigm
Note: This topic is based on a paper I wrote and published after I left Microsoft in 2003. The original article can be found on my weblog. I've broken it up and altered the text to fit the Wiki style.

My goal in this paper is to point out that software development today is a labour-intensive cottage industry and that our approach to developing software needs to be industrialised. Indeed, I propose that it can be mechanised. I believe that code generation from software requirements is possible for a large class of applications. Rather than model the implementation in UML, a better approach is to model the requirements and have an extensible code generation platform consume those models to produce the final application.

Yes, it’s a hard problem and I don’t have the solution worked out. Rather than solve this for the general case of any application, I believe it can be solved readily for the class of business information systems that mostly read and write data to a database, invoking a thin layer of business logic and process along the way. To be sensational, writing code is stupid for this type of application. From my experience as a Microsoft Consultant, there are a great many of these kinds of systems in corporations.

Rationale

  • SoftwareDevelopmentIsAMess
  • TheCodeBaseProblem
  • SoftwareProductionIsACottageIndustry
  • MostProgrammingIsNotArt
  • TraditionalMethodologiesAreNotTheAnswer
  • ReduceTheCodeBase
  • DumpTheCodeBase
  • CodeGenerationIsAnOptimisationProblem
  • ForgetAboutObjects

Solve the Real Problem

How can you model software requirements so that you can derive and generate the code from them? Another way to think of it: how do you create a set of extensible requirement models and then extend a code generation platform to create your application? I don’t know, but I do think it’s solvable.

There are two barriers I would like to acknowledge. First, today’s application platforms are geared for humans to work against. A programming model geared for use by code-generating tools would probably look quite different. Second, debugging will be difficult because of the huge semantic gap between the generated code and the requirements model it was derived from.
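One way to narrow the debugging gap is for the generator to record which requirement produced each generated line, much like a source map. The following is only a minimal sketch under stated assumptions: the `REQ-1`/`REQ-2` identifiers, the tuple-based requirements model and the `generate` function are all hypothetical and not part of any existing platform.

```python
import traceback
from typing import Dict, List, Tuple

# Hypothetical requirements model: (requirement id, field, condition that must hold).
REQUIREMENTS: List[Tuple[str, str, str]] = [
    ("REQ-1", "quantity", "record['quantity'] > 0"),
    ("REQ-2", "price", "record['price'] >= 0"),
]


def generate(requirements: List[Tuple[str, str, str]]) -> Tuple[str, Dict[int, str]]:
    """Emit Python source plus a map from generated line number to requirement id."""
    lines = ["def validate(record):", "    errors = []"]
    trace: Dict[int, str] = {}
    for req_id, field_name, condition in requirements:
        for text in (
            f"    if not ({condition}):",
            f"        errors.append('{field_name} violates {req_id}')",
        ):
            lines.append(text)
            trace[len(lines)] = req_id  # 1-based number of the line just added
    lines.append("    return errors")
    return "\n".join(lines) + "\n", trace


source, trace = generate(REQUIREMENTS)
namespace: dict = {}
exec(compile(source, "<generated>", "exec"), namespace)
validate = namespace["validate"]

print(validate({"quantity": 0, "price": 3}))

# A failure inside generated code can be mapped back to the requirement behind it.
failed_requirement = None
try:
    validate({"quantity": 1})  # 'price' is missing, so the REQ-2 check raises KeyError
except KeyError as exc:
    failing_line = traceback.extract_tb(exc.__traceback__)[-1].lineno
    failed_requirement = trace.get(failing_line)
print(failed_requirement)
```

The trace map only covers lines that came from a requirement; boilerplate lines such as the function header have no entry, which is why the lookup uses `trace.get`.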

A Paradigm Shift?

So why hasn’t this been done already? I don’t know; I’m still in the process of researching the problem. I believe we might be rapidly approaching an inflection point in software development where the paradigm shifts over to code generation. Some of the contributing factors include:

  • Broad adoption of implementation standards around HTML, XML, SQL and a consolidation of application platforms: .NET, Java, J2EE.
  • A greater emphasis on modelling applications and the rise of UML.
  • Massive computing power available at the desktop and better programming tools.
  • Programming models are getting more declarative, more abstract and easier to code generate against.

Conclusion

So let’s restate how we got here. The current real-world state of writing enterprise applications is grossly inadequate for numerous reasons. However, it is fundamentally flawed due to the creation of large code bases in which the domain logic gets entangled and obscured by the platform logic. Current technology trends try to address this by implementing the same functionality in less code against a higher level application model.

In this paper, I propose that we need to industrialise and mechanise the software development process to a much greater degree. The process should follow this pattern:

  • Iteratively gather requirements and evolve a way to capture them in models.
  • Extend a code generation platform to derive and generate the application code for a specific application platform.
  • Where it isn’t cost effective to model and generate, write components that are woven into the generated code and called at the appropriate points.
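The three steps above can be sketched in a few lines. This is an illustration only, assuming a toy requirements model: the `Entity` and `Field` classes, the two generator functions and the `check_order_total` hook are hypothetical names, and a real platform would derive far more than a table definition and a validator.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Field:
    name: str
    sql_type: str
    required: bool = True


@dataclass
class Entity:
    # Hypothetical requirements model: one business entity and its fields.
    name: str
    fields: List[Field] = field(default_factory=list)


Hook = Callable[[dict], List[str]]


def generate_ddl(entity: Entity) -> str:
    """Derive a CREATE TABLE statement from the requirements model."""
    columns = [
        f"    {f.name} {f.sql_type}{' NOT NULL' if f.required else ''}"
        for f in entity.fields
    ]
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"


def generate_validator(entity: Entity, hooks: Dict[str, Hook]) -> Hook:
    """Derive a validation function and weave in a hand-written hook if one exists."""
    required = [f.name for f in entity.fields if f.required]
    custom = hooks.get(entity.name)

    def validate(record: dict) -> List[str]:
        errors = [f"{name} is required" for name in required if record.get(name) is None]
        if custom is not None:
            errors.extend(custom(record))  # hand-written logic called at a fixed point
        return errors

    return validate


def check_order_total(record: dict) -> List[str]:
    """Hand-written component for a rule that was not worth modelling."""
    total = record.get("total")
    if total is not None and total < 0:
        return ["total must not be negative"]
    return []


order = Entity("orders", [
    Field("id", "INTEGER"),
    Field("customer", "TEXT"),
    Field("total", "REAL"),
    Field("note", "TEXT", required=False),
])

ddl = generate_ddl(order)
validate_order = generate_validator(order, {"orders": check_order_total})
print(ddl)
print(validate_order({"id": 1, "customer": None, "total": -5.0}))
```

Note that only `generate_ddl` emits code as text; the validator is derived at run time as a closure, which keeps the sketch short but is a simplification of what a code generator would do.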

I believe some of the benefits of this approach are:

  • Requirements are explicitly captured and kept separate from implementation, making it easier to inspect them and ascertain correctness.
  • The application could be rapidly changed and regenerated.
  • Application development gets more rapid as requirements models and code generators get developed.
  • The application could be retargeted at a new application platform or to take advantage of new platform features more quickly than evolving a code-base.
  • The application could be developed with an Extreme Programming-like process, enabling rapid customer feedback.
  • Quality of code should be increased due to mechanisation and consistency.

This approach would finally bring software development into the industrial age by creating a true software factory.


[Additional Text by Joe]

The approach described above sounds a lot like Domain-Specific Modeling: for each narrow problem domain, come up with a modeling language that captures what you want an application to do, and a generator that turns those models into code. (A domain can be as narrow as a given range of products for a given company, but could also be as broad as CRUD apps with a db backend and web or GUI frontend.)

DSM has been used to good effect over the last 10-15 years, e.g. by Nokia to build their cell phone software, Lucent for telecom switches, USAF, NASA, chemical companies, insurance companies etc., increasing productivity each time by 500% to 1000%. The figures are big, but not really surprising: after all, if your application is finished once you have your requirements, that's going to be pretty fast!

There are good tools out there to help you come up with your own modeling language, build it, and get modeling tool support for it. The last part is what had earlier slowed the adoption of DSM: building a modeling tool from scratch takes at least several man-years, but with the best of today's DSM environments, it takes just a couple of man-weeks. You just specify the modeling language's concepts, rules and symbols to the DSM environment, and you automatically get a fully working modeling tool. The DSM environments also include their own generator language / framework, which makes building the generators a lot faster.

If you want to see some industrial cases of DSM, take a look at http://www.dsmforum.org - the industry organisation for DSM. If you want to try out building your own DSM tool in an hour, go to http://www.metacase.com, where you can download an evaluation version of our MetaEdit+ (plugged simply because it's easiest to install and get started with - MS DSL Tools, Eclipse EMF and GEF, XMF Mosaic, GME and DOME are all also options).


[Ian's Reply]

Yes, my approach has much in common with DSM. The important difference is that I believe the computer should be used to infer and derive as much of the design as possible. The current modelling approach is still geared to explicitly modelling everything in the design. The computer winds up being just a CAD tool. I don't see any intelligence being applied. The modelling tool should be actively figuring stuff out.

The biggest weakness I have found with DSM languages is that you can't query the domain model! It's just a graph and you have to write code to traverse it. You can't ask a question (query) about the domain model. At least, as far as I know. Thanks for the links. I'll check them out.
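To make the idea of querying a domain model concrete, here is a minimal sketch that stores a model as relations in an in-memory SQLite database and asks a question of it with a recursive query. It is not the API of any DSM tool; the `depends_on` table and the entity names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical model store: each row says that one entity refers to another.
conn.execute("CREATE TABLE depends_on (source TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO depends_on VALUES (?, ?)",
    [("Order", "Customer"), ("Invoice", "Order"), ("Product", "Supplier")],
)

# Question: which entities depend on Customer, directly or indirectly?
query = """
WITH RECURSIVE dependent(name) AS (
    SELECT source FROM depends_on WHERE target = ?
    UNION
    SELECT d.source FROM depends_on d JOIN dependent p ON d.target = p.name
)
SELECT name FROM dependent ORDER BY name
"""
dependents = [row[0] for row in conn.execute(query, ("Customer",))]
print(dependents)
```

The same question asked of a model held only as an object graph would need hand-written traversal code, which is the weakness described above.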