Monday, December 10, 2007

Introducing EMF

Simply put, the Eclipse Modeling Framework (EMF) is a modeling framework for Eclipse. By now, you probably know what Eclipse is, given that you either just read Chapter 1, or you skipped it, presumably because you already knew what it was. You also probably know what a framework is, since you know what Eclipse is, and Eclipse is itself a framework. So, to understand what EMF really is, all you need to know is one more thing: What is a model? Or better yet, what do we mean by a model?

If you're familiar with things like class diagrams, collaboration diagrams, state diagrams, and so on, you're probably thinking that a model is a set of those things, probably defined using UML (Unified Modeling Language), a (the) standard notation for them. You might be imagining a higher-level description of an application from which some, or all, of the implementation can be generated. Well, you're right about what a model is, but not exactly about EMF's spin on it.

Although the idea is the same, a model in EMF is less general and not quite as high-level as the commonly accepted interpretation. EMF doesn't require a completely different methodology or any sophisticated modeling tools. All you need to get started with EMF are the Eclipse Java Development Tools. As you'll see in the following sections, EMF relates modeling concepts directly to their implementations, thereby bringing to Eclipse (and Java developers in general) the benefits of modeling with a low cost of entry.

Unifying Java, XML, and UML

To help understand what EMF is about, let's start with a simple Java programming example. Say that you've been given the job of writing a program to manage purchase orders for some store or supplier.[1] You've been told that a purchase order includes a "bill to" and "ship to" address, and a collection of (purchase) items. An item includes a product name, a quantity, and a price. "No problem," you say, and you proceed to create the following Java interfaces:

[1] If you've read much about XML Schema, you'll probably find this example quite familiar, since it's based on the well-known example from XML Schema Part 0: Primer [2]. We've simplified it here, but in Chapter 4 we'll step up to the real thing.

import java.util.List;

// A purchase order: two addresses and a collection of items.
public interface PurchaseOrder
{
  String getShipTo();
  void setShipTo(String value);

  String getBillTo();
  void setBillTo(String value);

  List<Item> getItems(); // List of Item
}

// A single line item on a purchase order.
public interface Item
{
  String getProductName();
  void setProductName(String value);

  int getQuantity();
  void setQuantity(int value);

  float getPrice();
  void setPrice(float value);
}

Starting with these interfaces, you've got what you need to begin writing the application UI, persistence, and so on.

Before you start to write the implementation code, your boss asks you, "Shouldn't you create a 'model' first?" If you're like other Java programmers we've talked to, who didn't think that modeling was relevant to them, then you'd probably claim that the Java code is the model. "Describing the model using some formal notation would add no value," you say. Maybe a class diagram or two to fill out the documentation a bit, but other than that it simply doesn't help. So, to appease the boss, you produce this UML diagram (see Figure 2.1):[2]

[2] If you're unfamiliar with UML and are wondering what things like the little black diamond mean, Appendix A provides a brief overview of the notation.

Figure 2.1. UML diagram of interfaces.


Then you tell the boss to go away so you can get down to business. (As you'll see below, if you had been using EMF, you would already have avoided this unpleasant little incident with the boss.)

Next, you start to think about how to persist this "model." You decide that storing the model in an XML file would be a good solution. Priding yourself on being a bit of an XML expert, you decide to write an XML Schema to define the structure of your XML document:


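<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.com/SimplePO"
            xmlns:PO="http://www.example.com/SimplePO">
  <xsd:complexType name="PurchaseOrder">
    <xsd:sequence>
      <xsd:element name="shipTo" type="xsd:string"/>
      <xsd:element name="billTo" type="xsd:string"/>
      <xsd:element name="items" type="PO:Item"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="Item">
    <xsd:sequence>
      <xsd:element name="productName" type="xsd:string"/>
      <xsd:element name="quantity" type="xsd:int"/>
      <xsd:element name="price" type="xsd:float"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>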

Before going any further, you notice that you now have three different representations of what appears to be pretty much (actually, exactly) the same thing: the "data model" of your application. Looking at it, you start to wonder if you could have written only one of the three (that is, Java interfaces, UML diagram, or XML Schema), and generated the others from it. Even better, you start to wonder if maybe there's even enough information in this "model" to generate the Java implementation of the interfaces.
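To see why that last thought is plausible, consider what a hand-written implementation of the two interfaces might look like. The sketch below is only an illustration: the class names ItemImpl and PurchaseOrderImpl are hypothetical, the interfaces are repeated (with a typed List<Item>) so the block compiles on its own, and none of this is meant to show what EMF actually generates. The point is simply that every line follows mechanically from the attribute names and types.

```java
import java.util.ArrayList;
import java.util.List;

// Interfaces repeated from the text so this sketch is self-contained.
interface Item {
    String getProductName();
    void setProductName(String value);
    int getQuantity();
    void setQuantity(int value);
    float getPrice();
    void setPrice(float value);
}

interface PurchaseOrder {
    String getShipTo();
    void setShipTo(String value);
    String getBillTo();
    void setBillTo(String value);
    List<Item> getItems();
}

// Hypothetical hand-written implementation: one field per attribute,
// one trivial getter/setter pair per field.
class ItemImpl implements Item {
    private String productName;
    private int quantity;
    private float price;

    public String getProductName() { return productName; }
    public void setProductName(String value) { productName = value; }
    public int getQuantity() { return quantity; }
    public void setQuantity(int value) { quantity = value; }
    public float getPrice() { return price; }
    public void setPrice(float value) { price = value; }
}

// Hypothetical hand-written implementation of the order itself.
class PurchaseOrderImpl implements PurchaseOrder {
    private String shipTo;
    private String billTo;
    // The items list is created once and mutated in place, so no setter is needed.
    private final List<Item> items = new ArrayList<>();

    public String getShipTo() { return shipTo; }
    public void setShipTo(String value) { shipTo = value; }
    public String getBillTo() { return billTo; }
    public void setBillTo(String value) { billTo = value; }
    public List<Item> getItems() { return items; }
}
```

Nothing in these classes required a design decision beyond what the interfaces already state, which is exactly the kind of code a generator can write for you.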

This is where EMF comes in. EMF is a framework and code generation facility that lets you define a model in any of these forms, from which you can then generate the others and also the corresponding implementation classes. Figure 2.2 shows how EMF unifies the three important technologies: Java, XML, and UML. Regardless of which one is used to define it, an EMF model is the common high-level representation that "glues" them all together.
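The idea of one model feeding several representations can be sketched in a few lines of plain Java. Everything in the block below is a toy built for illustration: Feature, ModelClass, and ToyGenerator are hypothetical names, not EMF classes, and the output format is only an approximation of the interfaces and schema shown earlier. It assumes a model is nothing more than named classes with typed features, and that the PO prefix is bound to the schema's target namespace.

```java
import java.util.Arrays;
import java.util.List;

// Toy model element: one attribute or reference of a class.
// Hypothetical stand-in, not part of EMF.
final class Feature {
    final String name;      // e.g. "shipTo"
    final String javaType;  // "String", "int", "float", or a model class name
    final boolean many;     // true for a collection such as "items"

    Feature(String name, String javaType, boolean many) {
        this.name = name;
        this.javaType = javaType;
        this.many = many;
    }
}

// Toy model element: a named class with an ordered list of features.
final class ModelClass {
    final String name;
    final List<Feature> features;

    ModelClass(String name, Feature... features) {
        this.name = name;
        this.features = Arrays.asList(features);
    }
}

// Produces two textual representations from the same toy model.
final class ToyGenerator {
    private static String capitalize(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Maps a Java type name to a schema type; any other name is
    // assumed to be a model class in the PO namespace.
    private static String xsdType(String javaType) {
        switch (javaType) {
            case "String": return "xsd:string";
            case "int":    return "xsd:int";
            case "float":  return "xsd:float";
            default:       return "PO:" + javaType;
        }
    }

    // Emits Java interface source: a getter/setter pair per single-valued
    // feature, and a getter only for a multi-valued one.
    static String toJavaInterface(ModelClass c) {
        StringBuilder sb = new StringBuilder("public interface " + c.name + "\n{\n");
        for (Feature f : c.features) {
            String suffix = capitalize(f.name);
            if (f.many) {
                sb.append("  java.util.List<" + f.javaType + "> get" + suffix + "();\n");
            } else {
                sb.append("  " + f.javaType + " get" + suffix + "();\n");
                sb.append("  void set" + suffix + "(" + f.javaType + " value);\n");
            }
        }
        return sb.append("}\n").toString();
    }

    // Emits an XML Schema complexType with one element per feature.
    static String toXsdComplexType(ModelClass c) {
        StringBuilder sb = new StringBuilder(
            "<xsd:complexType name=\"" + c.name + "\">\n  <xsd:sequence>\n");
        for (Feature f : c.features) {
            sb.append("    <xsd:element name=\"" + f.name
                + "\" type=\"" + xsdType(f.javaType) + "\"");
            if (f.many) {
                sb.append(" minOccurs=\"0\" maxOccurs=\"unbounded\"");
            }
            sb.append("/>\n");
        }
        return sb.append("  </xsd:sequence>\n</xsd:complexType>\n").toString();
    }
}
```

Describing PurchaseOrder once, as a ModelClass with the features shipTo, billTo, and a multi-valued items of type Item, is then enough to call both toJavaInterface and toXsdComplexType on it. The real framework goes much further than this sketch, but the principle is the same: the information lives in one place, and the other forms are derived from it.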

Figure 2.2. EMF unifies Java, XML, and UML.


Imagine that you want to build an application to manipulate some specific XML message structure. You would probably be starting with a message schema, wouldn't you? Wouldn't it be nice to be able to take the schema, press a button or two, and get a UML class diagram for it? Press another button, and you have a set of Java implementation classes for manipulating the XML. Finally, press one more button, and you can even generate a working editor for your messages. All this is possible with EMF, as you'll see when we walk through an example similar to this in Chapter 4.

If, on the other hand, you're not an XML Schema expert, you may choose to start with a UML diagram, or simply a set of Java interfaces representing the message structure. The EMF model can just as easily be defined using either of them. If you want, you can then have an XML Schema generated for you, in addition to the implementation code. Regardless of how the EMF model is provided, the power of the framework and generator will be the same.

Modeling vs. Programming

So is EMF simply a framework for describing a model and then generating other things from it? Well, basically yes, but there's an important difference. Unlike most tools of this type, EMF is truly integrated with and tuned for efficient programming. It answers the often-asked question, "Should I model or should I program?" with a resounding, "both."

"To model or to program, that is not the question."

How's that for a quote? With EMF, modeling and programming can be considered the same thing. Instead of forcing a separation of the high-level engineering/modeling work from the low-level implementation programming, it brings them together as two well-integrated parts of the same job. Often, especially with large applications, this kind of separation is still desirable, but with EMF the degree to which it is done is entirely up to you.

Why is modeling interesting in the first place? Well, for starters it gives you the ability to describe what your application is supposed to do (presumably) more easily than with code. This in turn can give you a solid, high-level way both to communicate the design and to generate part, if not all, of the implementation code. If you're a hard-core programmer without a lot of faith in the idea of high-level modeling, you should think of EMF as a gentle introduction to modeling, and the benefits it implies. You don't need to step up to a whole new methodology, but you can enjoy some of the benefits of modeling. Once you see the power of EMF and its generator, who knows, we might even make a modeler out of you yet!

If, on the other hand, you have already bought into the idea of modeling, and even the "MDA (Model Driven Architecture) Big Picture,"[3] you should think of EMF as a technology that is moving in that direction, but at a more cautious pace than immediate widespread adoption would require. You can think of EMF as MDA on training wheels. We're definitely riding the bike, but we don't want to fall down and hurt ourselves by moving too fast. The problem is that high-level modeling languages need to be learned, and since we're going to need to work with (for example, debug) generated Java code anyway, we now need to understand the mapping between them. Except for specific applications where something like a state diagram can be the most effective way to convey the behavior, in the general case, good old-fashioned Java programming is the simplest and most direct way to do the job.

[3] MDA is described in Section 2.6.4.

From the last two paragraphs, you've probably surmised that EMF stands in the middle between two extreme views of modeling: the "I don't need modeling" crowd, and the "modeling rules!" crowd. You might be thinking that being in the middle implies that EMF is a compromise and is reduced to the lowest common denominator. You're right about EMF being in the middle and requiring a bit of compromise from those with extreme views. However, the designers of EMF truly feel that its exact position in the middle represents the right level of modeling, at this point in the evolution of software development technology. We believe that it mixes just the right amount of modeling with programming to maximize the effectiveness of both. We must admit, though, that standing in the middle and arguing out of both sides of our mouths can get tiring!

What is this right balance between modeling and programming? An EMF model is essentially the Class Diagram subset of UML, that is, a simple model of the classes, or data, of the application. From that, a surprisingly large percentage of the benefits of modeling can be had within a standard Java development environment. With EMF, there's no need for the user, or other development tools (like a debugger, for example), to understand the mapping between a high-level modeling language and the generated Java code. The mapping between an EMF model and Java is natural and simple for Java programmers to understand. At the same time, it's enough to support fine-grained data integration between applications; next to the productivity gain resulting from code generation, this is one of the most important benefits of modeling.
