UML Best Practice: Attribute or Association

Short

Use Associations for Classes and Attributes for DataTypes

Purpose

Make an informed choice between Attributes and Associations when modeling a relation between two Classifiers.

Details

When modeling the structure of your system, there are basically two ways to express a structural relationship between two Classifiers. You could use an Association between the two Classifiers, or you could create an Attribute owned by one Classifier with its type set to the other Classifier.

Both ways, Association or Attribute, are pretty much equivalent. There is not really a big difference between the two apart from personal preference.

The problem for modeling teams working on the same model is, of course, that you can’t leave this to personal preference. You have to make a clear choice about what to use in which circumstances.

To explore the details of the two approaches, it is best to have a look at the UML metamodel.

In this meta diagram we see that both the Attribute and the Association use the same Property object to link to a type.

The Association has two or more Properties as memberEnd. Each of these Properties has a type, and that is how the Association links two or more Classes. The endType of an Association is derived from the types of the Properties in its memberEnd.

The Attribute of a Class is in fact a Property in the ownedAttribute of that Class. Again, because a Property is a TypedElement and thus has a Type as its “type”, we get the relation to another Classifier.
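The part of the metamodel described above can be sketched in a few lines of Python. This is only an illustration, not the full UML metamodel: the snake_case names (`owned_attribute`, `member_end`, `end_type`) are assumed stand-ins for ownedAttribute, memberEnd and endType, and the Order/Product usage and the association name `OrderProduct` are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Classifier:
    name: str


@dataclass
class Property:
    # A Property is a TypedElement: it refers to a Classifier through "type".
    name: str
    type: Classifier


@dataclass
class Class(Classifier):
    # The attributes of a class are the Properties it owns.
    owned_attribute: List[Property] = field(default_factory=list)


@dataclass
class Association(Classifier):
    # An association has two or more Properties as member ends.
    member_end: List[Property] = field(default_factory=list)

    @property
    def end_type(self) -> List[Classifier]:
        # Derived: computed from the types of the member ends.
        return [end.type for end in self.member_end]


order = Class("Order")
product = Class("Product")

# Option 1: an attribute owned by Order, typed by Product.
order.owned_attribute.append(Property("product", product))

# Option 2: an association whose member ends are typed by Order and Product.
order_product = Association(
    "OrderProduct",
    member_end=[Property("order", order), Property("product", product)],
)
```

In both options the link to the other Classifier runs through a Property and its type; the only difference is who holds the Property.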

In the years I’ve been working with different modeling teams, I’ve found that the rule that works best is to use Associations for Classes and Attributes for DataTypes.

Now what’s the difference between a DataType and a Class? Well, they are actually pretty similar. The UML specification states:

A data type is a special kind of classifier, similar to a class. It differs from a class in that instances of a data type are identified only by their value.

So that means that DataTypes are much like the primitive types and enumerations we know from the programming world. In programming, such value types are usually implemented as immutable. So you can think of things like Integer, Date and MoneyAmount, but also enumerations such as Color, DayOfTheWeek, etc.
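As a rough illustration of “identified only by their value”, here is a minimal Python sketch. It assumes a MoneyAmount made up of an `amount` and a `currency` (the field names and the enumeration literals are invented for the example), and it uses a hypothetical `Customer` class to show the contrast with a regular class whose instances have their own identity.

```python
from dataclasses import dataclass, FrozenInstanceError
from enum import Enum


class Currency(Enum):
    EUR = "EUR"
    USD = "USD"


@dataclass(frozen=True)
class MoneyAmount:
    # DataType-like: two instances with the same values are interchangeable.
    amount: int
    currency: Currency


class Customer:
    # Class-like: every instance has its own identity, whatever its values.
    def __init__(self, name: str) -> None:
        self.name = name


a = MoneyAmount(100, Currency.EUR)
b = MoneyAmount(100, Currency.EUR)
same_amount = (a == b)       # compared by value

c1 = Customer("Smith")
c2 = Customer("Smith")
same_customer = (c1 == c2)   # compared by identity (Python's default)

# A frozen dataclass rejects changes to an existing value.
try:
    a.amount = 200
    changed = True
except FrozenInstanceError:
    changed = False
```

Freezing the value type is an implementation choice rather than something the UML definition requires, but it is the usual way to keep two owners of the “same” value from affecting each other.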

If we add DataType and Enumeration to the meta diagram we get the following:

You can see that DataType is a subtype of Classifier, and that Enumeration is a subtype of DataType.

The following example shows how to use Classes and DataTypes when following this best practice.

In this diagram we see two Enumerations: Currency and ProductCategory. ProductCategory is used as the type of the attribute Product.Category, while Currency is used by the DataType MoneyAmount.

I’ve added dependencies to visually express which DataType is being used by which Classifier, but those dependencies are usually not there in a production model.
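For readers who think in code, the example could look roughly like this in Python. The classifier names come from the example; the attribute names (`amount`, `total`, `products`) and the enumeration literals are assumptions made for the sketch. Note that code flattens the distinction: both an attribute and an association end show up as a field, so the rule is about how the model is drawn, not about how it is implemented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Currency(Enum):
    EUR = "EUR"
    USD = "USD"


class ProductCategory(Enum):
    BOOK = "book"
    FOOD = "food"


@dataclass(frozen=True)
class MoneyAmount:
    # DataType: modeled as an attribute type, compared by value.
    amount: int
    currency: Currency


@dataclass(eq=False)
class Product:
    # Class: compared by identity, reached through an association.
    name: str
    category: ProductCategory  # attribute typed by an Enumeration


@dataclass(eq=False)
class Order:
    total: MoneyAmount  # attribute typed by a DataType
    products: List[Product] = field(default_factory=list)  # association end


cookbook = Product("Cookbook", ProductCategory.BOOK)
order = Order(total=MoneyAmount(25, Currency.EUR), products=[cookbook])
```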


31 thoughts on “UML Best Practice: Attribute or Association”

  1. I have found that many people have problems using associations in class diagrams. It is very important which type of model we are creating: semantic or implementation. In the DDD method (domain model) the two are “similar” (but not the same) … For this reason I always label my class diagrams as “conceptual model” or “domain model” (meaning, e.g., a business logic model for systems).

    Very good article 🙂
    (P.S. sorry for my English ;))

    1. I guess you are right, and it indeed depends on the audience, although I haven’t met many people that didn’t understand the concept of an association.
      Most of my UML experience is in the “functional” and “business” area, where class diagrams are used to model either business concepts, domain models or logical data models. In all of these instances I found that this best practice works out great. In other areas (more technical) that might be less appropriate.

  2. Hi Gerd, thanks very much for sharing.

    Could you tell us why this is the best practice? Did you notice some improvements in the model readability?
    Anyway, I generally agree with your approach…

    I’d like tools like EA to be able to automatically show the associated classes as attributes of the class. This would be very helpful when dragging a class to another diagram.

    Cheers,
    Davide

    1. For me this is a best practice because
      a) nobody has to doubt whether to use an attribute or association
      b) feels most natural
      c) improves the consistency of the model

      I guess any rule is better than no rule, but this way seems to work the best.

  3. “I guess you are right, and it indeed depends on the audience, although I haven’t met many people that didn’t understand the concept of an association.”

    100% agree 🙂

  4. Nice post, Geert. Your recommendation is consistent with H.S. Lahman’s advice in the recently published “Model-Based Development Applications.” Lahman describes knowledge and behavior responsibilities. Knowledge responsibilities are expressed as attributes which are instances of ADTs. ADTs are scalars in the context of the subsystem’s level of abstraction. In a different subsystem, at a different level of abstraction, the element that was an instance of an ADT may become a full-fledged object. Based on Lahman’s advice, I’d modify the short answer to add “in the context of the subsystem’s level of abstraction.” Sound right?

    Associations represent a path for collaboration. Collaboration is expressed in the context of the subsystem and is imposed on the objects. That is, the fact that class A accesses knowledge or behavior from class B is not up to A or B but the context of the usage of A and B together (the subsystem).

    I don’t know if I actually just said anything coherent but it was fun to say it. 8).

  5. The concepts of “Association or Relation between Objects” and “attribute or characteristic of an Object” are distinctly different and CANNOT be used interchangeably. The UML metamodel may have interpreted it this way, but these are general concepts, more fundamental and applicable in many branches of science and mathematics.

    Associations or relations exist between objects of the same or different sets. See any standard book on mathematics (set theory). There are many more details which can be looked up.

    Attributes or Characteristics or Properties are “descriptions or measurements of features of the objects”. They cannot be physically removed from the objects and separated as physical entities. In fact this is a test to check if one object is mistaken as the attribute of another object.

    This piece needs a careful revision. Sorry if this appears offensive.

    putchavn@yahoo.com

    1. Putcha,

      First of all, I’m not offended, don’t worry.

      I don’t really get your criticism. I’m clearly talking about the concepts of attributes and associations as defined in UML, regardless of how they are defined in other fields.
      And even then I don’t really see your point.

      My “Order” class in the example has an attribute of type MoneyAmount. Would you consider it wrong when I modelled it using an association between MoneyAmount and Order? And if so why?

      Or the other way around, my “Order” class has an association with the “Product” class. Would it be wrong in your eyes to model this by adding an attribute to my Order class with its type as Product?

      For me both ways are not in violation of the UML specification, and I don’t really see why either of them is “wrong”.

      1. Thanks Geert, nice of you.

        For the “Order” class, OrderValue is a valid & useful attribute.
        MoneyAmount is just a value in some currency but not an object in any sense of object or entity. Such modeling is not consistent with the concept of an object.

        However, a 100 Dollar Bill is an object with a MonetaryValue of 100 US Dollars. So, one may model a “CurrencyBills” class with many attributes that define / describe the characteristics of bills / notes.

        This came up in the LinkedIn Group discussion started by you and I posted my response there. You may see that also.

        Regards,

    1. Indeed, in theory these are different concepts, but the one usually goes with the other.
      Most datatypes (such as string, integer, date, etc.) in most OO programming languages are in fact immutable.

      I think the reasoning behind this is as follows:
      – If an instance of my Datatype is only identified by its value, that means that there can only exist one instance with a specific value; another instance with the same value would imply that it is the same instance. (There can only be one Date object with value 01/01/2011)
      – Suppose I have two objects that both use an instance of my datatype with the same value (both use Date 01/01/2011)
      – If Date is not immutable and I would change the date of my first object to 01/01/2050, then the date of my second object would be changed as well, since they both use the same instance. Making the Date datatype immutable avoids this kind of problem since you cannot ever change the value of an existing Date instance.

      Phewww… I hope that sort of explains my train of thought on that one.
      I specifically tried to avoid putting too much detail about the Datatype in the post because I find the concept of DataType very hard to explain, and I didn’t want to shift the focus too much.

  6. Some languages provide a unary increment operator for integers. Integers use by-value equivalence, but you can have two integer attributes with the same value, use the operator on one, and the second won’t change.

    What you describe works in Java e.g. with interned Strings (String#intern()). But once again, that is not a result of strings being equivalent by value, but rather a choice made because of the immutability and practical space-saving benefits.

    I don’t want to go too much off-topic, but it seems wrong to me to make those concepts interchangeable. If I am mistaken, please reply.

    1. By unary increment operator you mean something like i++?
      In that case, isn’t that just syntactic sugar for i = i+1 ? (which would be assigning a new integer to the variable i rather than incrementing the actual integer)

      You are probably right in some way however; I never meant to say both concepts are interchangeable. I just wanted to illustrate how the concept of a UML DataType is generally implemented in a programming language.

      1. Geert, you should check out H.S. Lahman’s Model-Based Development Applications book. I know you are a guru already so I don’t mean to suggest you need to educate yourself. I think you will find Lahman’s insights interesting. Your comment about how UML DataType is typically implemented shows an interest in understanding the low level code generated from the model. I, myself, have been distracted by that concern and Lahman (and others in the executable model camp) has helped me. I need to focus on what the language is saying at the model level of abstraction and not worry about how that will be translated. The purpose of the model is to express the “truth” of the problem at hand so that the implementation will also be true. Concerns over the translation step increase the risk that the resulting model will be inaccurate with respect to the problem. As good as it sounds, it is very unlikely two birds will be killed with one stone. I’m convinced that I’ll do better to focus on modeling the problem in order to capture the truth and worry about translation to the Turing machine as a separate step.

        Respectfully, Dan

  7. Yes, I meant something like i++. However, I don’t think it is just syntactic sugar, because you can use it as an integer expression as opposed to a statement. The operator can even be overloaded in C++; similarly, the augmented assignment operator (+=) can be overloaded in C++ and Python.

    Generally I agree with the post and I thank you for writing it, this is just a small detail.

  8. Actually, including an attribute in a class definition is merely a shortcut for a Composition relationship. Attribute is Composition, not Association. Geert’s piece is clearly misleading, see “UML notation guide”, http://bit.ly/psuCgb fig. 25, p. 70.
    Attributes should be implemented using composition semantics.

    1. Rémi, I think you meant fig 25, p. 64. Geert’s recommendation was that attributes should be DataTypes and Associations should be used to represent relationships to classes. DataTypes are scalars at the level of abstraction being modeled. They are not things the object will interact with but things that represent object knowledge responsibility (a stronger coupling than even composition). Associations make interactions clear and allow objects to access each other’s knowledge and behaviors. Geert’s suggestion is very good guidance.
      Your final sentence needs more context to be meaningful. How attributes are implemented is completely irrelevant to a model expressed in the UML.

    2. Rémi,

      Isn’t a composition just a “kind of” association?
      I’m not sure if I fully agree with the statements in that document, but I don’t think my article contradicts them.

      Also, the notation guide you are referring to is a 14 year old document on UML 1.1 back from before UML was adopted by the OMG…

    3. Rémi,
      I agree with you – the Class-Attribute ‘relationship’ is much more similar – in regard to semantics – to “composition” than to an ordinary Association. However, we should keep in mind that – as geertbelle had noticed – “composition” is only a special case of Association – with one end being of the ‘composite’ kind.
      But a much more important difference between an Attribute and (any kind of) Association is that an Association _is a_ Classifier, which means that when we connect Classes C1 and C2 by Association A, we assume that in the next metamodel layer (M0) not only Objects will exist (instances of type C1 and C2), but also _Links_ (instances of type A) will exist as separate ‘beings’. By ‘separate being’ I mean there should be some kind of storage slots – in the objects or outside of them (in the links themselves) – where the ‘id’-s of all linked objects are kept (id-s can take the form of references, handles, memory addresses, DB primary keys and so on – it’s implementation and target platform specific).
      The assumed existence (or not) of such Links could IMHO be a clue for the choice of attribute vs. composite association.

  9. Dan George :

    Geert, you should check out H.S. Lahman’s Model-Based Development Applications book. I know you are a guru already so I don’t mean to suggest you need to educate yourself. I think you will find Lahman’s insights interesting. Your comment about how UML DataType is typically implemented shows an interest in understanding the low level code generated from the model. I, myself, have been distracted by that concern and Lahman (and others in the executable model camp) has helped me. I need to focus on what the language is saying at the model level of abstraction and not worry about how that will be translated. The purpose of the model is to express the “truth” of the problem at hand so that the implementation will also be true. Concerns over the translation step increase the risk that the resulting model will be inaccurate with respect to the problem. As good as it sounds, it is very unlikely two birds will be killed with one stone. I’m convinced that I’ll do better to focus on modeling the problem in order to capture the truth and worry about translation to the Turing machine as a separate step.

    Respectfully, Dan

    Thanks Dan,

    I’ll check out that book.
    I think you are absolutely right that we should focus on the semantics of the abstraction level we are working in, and not worry about semantics on another abstraction level too much.
    I was just trying to explain the concept of Datatype in an easy way.

    PS. Thanks for the compliment, but I don’t consider myself a UML guru (yet) 😉

  10. Entities vs Values: Great article you nailed this down.

    As EA Belgian experts, we may do something together one day, having a beer or two for example! 🙂

  11. Geert,
    “…the rule that works best is to use Associations for Classes and Attributes for DataTypes”
    That’s not a best practice but a truism because if …
    “A data type is a special kind of classifier, similar to a class. It differs from a class in that instances of a data type are identified only by their value.”
    Then there is no key to be used for an association.
    Remy

    1. caminao,
      IMHO it’s not a truism. You can make an Association between a Class and a Datatype. The key (id) for a Datatype instance (namely: Value) is the value itself (and it does not have to be a simple scalar value like an integer – it can be e.g. a complex number with real and imaginary components).

  12. SysML advocates the use of associations also for ValueTypes (which is the SysML equivalent of DataType). Does anybody know why the concept of attributes is repressed in SysML?

  13. Bit late, as this discussion stems from 2011/2012. But I read it just now.

    We (my customers) have the need to separate classes based on:
    * Identifier: the class has one or more identifying attributes and (therefore) may be the target of an association.
    * Grouping of attributes that “belong together” and may be available 0..n times within any class. It cannot be the target of an association, as it has no identity. (This aligns with the composition statements made in this thread.)
    * Self-identity: Scalar datatype that identifies itself and is fundamentally a string.
    * Self-identity with structure: Complex datatype, which consists of two or more data aspects (attributes) that _en groupe_ identify it.

    Especially the Grouping concept is bothersome. It is really needed and applied but cannot be well defined in terms of Classes or datatypes. It’s somewhere in between. Therefore we have given it a place of its own.

    We have also decided that Groupings can be assigned to attributes, “as attributes”, which differs from Geert’s proposition.

    Any serious reason not to allow this?

    1. My article just expresses a best practice, not a hard rule.
      If you have found a way to express what you need, and you have documented the rules to follow then there is no reason why that should be a bad thing.
      The most important thing is that everyone uses the same rules, and everyone in your audience understands what the model means.
