Wednesday, October 19, 2011

Taming the wildcards

Generics support in Java added a fairly complex twist to your everyday collection habits. At their most basic, generics are sensible, but certain features of the generics implementation might get you cast away from your comfort zone (pun intended).

Wildcards, erasure, and the potential to generate hard-to-read code all sparked arguments about complexity versus usefulness. Ken Arnold, a former Sun senior engineer, makes that case in his "Generics Considered Harmful" article, written back in the days when generics first made it into Java with the Tiger (1.5) release.

Anyway, since we're using generics, I'm going to give you a few tips to help you get through some issues that confuse many newcomers, in particular when dealing with wildcards.

Note that this is not a generics/wildcards tutorial. If you need a quick tutorial to get you started, check out the Sun/Oracle generics topic in the Java Tutorials. If you're after an inside-out, comprehensive reference, I recommend Angelika Langer's "Java Generics FAQs".

These points are based on Java SE 5/6. Java SE 7 made a few enhancements to Java's generics implementation; I haven't had the chance to review them yet.

1- Using bounded wildcards with bounded types

The thing about wildcards is that they are merely placeholders for the concrete types that could take their place, so the compiler is a bit loose about them and will allow you to use any wildcard that has the potential to be replaced with a compatible type.

To rephrase that: you can use a wildcard on a bounded type IF the hierarchy denoted by the bound and the hierarchy denoted by the wildcard intersect.

The compiler allows that because it's a safe bet: you won't be able to obtain an instance of a bounded generic type that is compatible with the wildcard but not with the bound.

Let's take a simple example. Say you are building a robot that picks and boxes fruit on a farm (don't worry, we're not going to discuss how robots work). Consider the following hierarchy:

Plant (class) --> Fruit (class) --> Apple (class) | Orange (class).

Since a box of apples should only contain apples, a box of oranges should only contain oranges, and our boxes are only suitable for fruit, the class Box should be generic, with Fruit as its upper bound.

class Box<T extends Fruit> {
    T getNextPiece() {..}
    void addPiece(T piece) {..}
}

Now, when handling boxes, the following wildcards are legal and can be used as the types of references or method parameters:

 Box<? extends Apple> (classes that intersect the bound: Apple)
 Box<? super Orange>  (classes that intersect the bound: Orange, Fruit)
 Box<? extends Plant> (classes that intersect the bound: Fruit, Orange, Apple)

Box<? super Plant>, on the other hand, is not a legal wildcard for our Box class: Plant and above (the wildcard) does not intersect Fruit and below (Box's bound), so the compiler reports an error.

As a side effect of using the bounded Box, you can get elements from a lower-bounded wildcard box as Fruit. Because T is guaranteed to be a Fruit (and gets erased to Fruit), the following is legal:

    Box<? super Orange> obox = new Box<Orange>();
    Fruit myFruit = obox.getNextPiece();
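
Here is a minimal sketch that puts the declarations above into one compilable file. The empty Plant/Fruit/Apple/Orange classes, the queue-backed body of Box, and the WildcardBounds class are assumptions made only for illustration; the illegal declaration is left commented out so the rest compiles.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Plant {}
class Fruit extends Plant {}
class Apple extends Fruit {}
class Orange extends Fruit {}

// Hypothetical Box body: a simple FIFO queue of pieces.
class Box<T extends Fruit> {
    private final Deque<T> pieces = new ArrayDeque<T>();
    T getNextPiece() { return pieces.poll(); }      // null when the box is empty
    void addPiece(T piece) { pieces.add(piece); }
}

class WildcardBounds {
    static Fruit firstPiece() {
        Box<Orange> oranges = new Box<Orange>();
        oranges.addPiece(new Orange());

        Box<? extends Apple> apples = new Box<Apple>(); // intersection: Apple
        Box<? extends Plant> anyFruit = oranges;        // intersection: Fruit, Apple, Orange
        Box<? super Orange> obox = oranges;             // intersection: Orange, Fruit
        // Box<? super Plant> bad;                      // does not compile: nothing at or above Plant is a Fruit

        // Box's own bound still guarantees that every piece is a Fruit.
        return obox.getNextPiece();
    }

    public static void main(String[] args) {
        System.out.println(firstPiece().getClass().getSimpleName());
    }
}
```

Uncommenting the Box<? super Plant> line should reproduce the compiler error described above.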


Wildcards bounded by interfaces

When dealing with class-based wildcards and bounds, the compiler is certain about whether a type and its derivatives extend another type or not. Since Orange extends Fruit and not Apple, any type that extends Orange will always be a Fruit and is guaranteed never to be an Apple. No matter how many Orange derivatives you add, they will never be Apples, so an assignment such as Box<? extends Orange> box = new Box<Apple>(); is illegal.

Interfaces are different. Since interfaces effectively give Java multiple inheritance, types that implement an interface can appear at any level of a class hierarchy, even if their supertypes do not implement that interface.
To illustrate, let's extend our example with the following:

interface Seasonal{}

Is the following declaration valid?

Box<? extends Seasonal> myBox;

Yes, it is. Although at compile time neither Fruit nor any of its known derivatives is Seasonal, there is no reason why a Fruit can't be Seasonal: a derivative CAN be both a Fruit and Seasonal at the same time. So the compiler will allow you to use a Box<? extends Seasonal>. Consider the following addition to our example:

class Watermelon extends Fruit implements Seasonal{}

A Watermelon is a nice fit here: since it is both a Fruit and Seasonal, the assignment
Box<? extends Seasonal> myBox = new Box<Watermelon>(); is legal.

Note that if the bound were a final class (which is allowed when defining bounds, since 'extends' in a bound means the bound class or its derivatives), then using an interface-based wildcard would not compile. For example, if Fruit were final, then Box<? extends Seasonal> myBox; would be a compile-time error, because Fruit itself is not Seasonal (neither directly nor through any of its ancestors), and there can't be any derivatives to justify the use of this wildcard.

Of course, using a lower-bounded wildcard here (e.g. Box<? super Seasonal>) is a compile-time error, because every supertype of Seasonal is either an interface or Object, and none of those can possibly conform to our bound of Box<T extends Fruit>, which requires Fruit or one of its subclasses.
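
The interface case can be sketched as follows. This is only an illustration under assumptions: the queue-backed Box body and the SeasonalBox class are made up for the example, and the rejected declaration is commented out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Fruit {}
interface Seasonal {}
class Watermelon extends Fruit implements Seasonal {}

// Hypothetical Box body: a simple FIFO queue of pieces.
class Box<T extends Fruit> {
    private final Deque<T> pieces = new ArrayDeque<T>();
    T getNextPiece() { return pieces.poll(); }
    void addPiece(T piece) { pieces.add(piece); }
}

class SeasonalBox {
    static boolean fruitAndSeasonal() {
        Box<Watermelon> melons = new Box<Watermelon>();
        melons.addPiece(new Watermelon());
        melons.addPiece(new Watermelon());

        // Legal: a Fruit subclass may implement Seasonal, and Watermelon does.
        Box<? extends Seasonal> myBox = melons;
        Seasonal asSeasonal = myBox.getNextPiece(); // guaranteed by the wildcard
        Fruit asFruit = myBox.getNextPiece();       // guaranteed by Box's own bound
        // Box<? super Seasonal> bad;               // does not compile: no supertype of Seasonal is a Fruit

        return asSeasonal != null && asFruit != null;
    }

    public static void main(String[] args) {
        System.out.println(fruitAndSeasonal());
    }
}
```

Notice that a piece taken from myBox can be read both as Seasonal and as Fruit, since any type the wildcard stands for has to satisfy the wildcard and the bound at once.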

What if the type bound was an interface, and the wildcard had a concrete bound?

Pretty much the same applies. The general rule is that if there could ever exist an object that satisfies both the bound and the wildcard, the compiler will be happy.

For example, let's have the following:

class Box2<T extends Seasonal>{}

The declaration Box2<? extends Fruit> will pass compilation with no errors, because there could exist a Fruit that is Seasonal. Again, if Fruit were a final class, the declaration would generate a compile-time error (unless Fruit or one of its ancestors implemented Seasonal).
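
A small sketch of this mirrored case, assuming a single-slot Box2 body with hypothetical put/take methods and a made-up InterfaceBoundDemo class (none of which come from the example above):

```java
class Fruit {}
interface Seasonal {}
class Watermelon extends Fruit implements Seasonal {}

// Hypothetical single-slot body for Box2; the bound is now the interface.
class Box2<T extends Seasonal> {
    private T piece;
    void put(T p) { piece = p; }
    T take() { return piece; }
}

class InterfaceBoundDemo {
    static boolean check() {
        Box2<Watermelon> melons = new Box2<Watermelon>();
        melons.put(new Watermelon());

        // Legal: some Fruit (here Watermelon) can also be Seasonal.
        Box2<? extends Fruit> fruitBox = melons;
        Fruit asFruit = fruitBox.take();       // guaranteed by the wildcard
        Seasonal asSeasonal = fruitBox.take(); // guaranteed by Box2's bound
        return asFruit instanceof Watermelon && asSeasonal instanceof Watermelon;
    }

    public static void main(String[] args) {
        System.out.println(check());
    }
}
```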

What if I use type bounds that mix a class with one or more interfaces?

The easiest thing to do is to break the elements of your type bound into several single-element bounds and see if the wildcard is compatible with every one of them.
If your type bound includes a class and a single interface, the wildcard must be compatible with the type as if it only had the class as a bound, and at the same time compatible with the type as if it only had the interface as a bound.

For example:

class Box3<T extends Fruit & Seasonal>{}

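Now a Box3<? extends Orange> is legal (it is compatible with both Box3<T extends Fruit> and Box3<T extends Seasonal>), while a Box3<? super Orange> is illegal: although it is compatible with Box3<T extends Fruit>, it is not compatible with the Box3<T extends Seasonal> part, because Orange itself is not Seasonal. For the same reason a plain Box3<Orange> is rejected, since a concrete type argument must satisfy every element of the bound.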

The same applies if you have several interfaces in your bound: check each one individually against the wildcard, and if the wildcard is compatible with them all, the wildcard is legal.
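
To see a mixed bound at work, here is a hedged sketch. BloodOrange is a hypothetical subclass invented for this example (an Orange that is also Seasonal), and the single-slot Box3 body with put/take is likewise an assumption.

```java
class Fruit {}
class Orange extends Fruit {}
interface Seasonal {}
class Watermelon extends Fruit implements Seasonal {}
// Hypothetical: not part of the original hierarchy.
class BloodOrange extends Orange implements Seasonal {}

// Hypothetical single-slot body for Box3.
class Box3<T extends Fruit & Seasonal> {
    private T piece;
    void put(T p) { piece = p; }
    T take() { return piece; }
}

class MixedBoundDemo {
    static boolean check() {
        Box3<Watermelon> melons = new Box3<Watermelon>(); // Watermelon satisfies both parts
        melons.put(new Watermelon());

        Box3<BloodOrange> bloodOranges = new Box3<BloodOrange>();
        bloodOranges.put(new BloodOrange());

        // Legal: an Orange derivative can be a Fruit and Seasonal at once.
        Box3<? extends Orange> oranges = bloodOranges;
        // Box3<? super Orange> bad;   // does not compile: Orange is not Seasonal

        Orange asOrange = oranges.take();     // guaranteed by the wildcard
        Seasonal asSeasonal = oranges.take(); // guaranteed by the Seasonal part of the bound
        return asOrange instanceof BloodOrange
                && asSeasonal instanceof BloodOrange
                && melons.take() != null;
    }

    public static void main(String[] args) {
        System.out.println(check());
    }
}
```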

What if I'm using all-interface bounds and wildcards?
If your type bound has multiple interfaces, break them up as noted above and test the wildcard against each one individually, using hierarchical intersection (as you would do with all-class bounds and wildcards).



2- Bounded wildcard-wildcard assignment


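Can you assign a reference of a bounded wildcard type to a reference or method parameter of another bounded wildcard type?

Sure. For a wildcard type B to be assignable to a wildcard type A, every type that B's wildcard can stand for must also be acceptable to A's wildcard. For upper bounds this means B's bound must have an 'IS A' relation with A's bound; for lower bounds the relation is reversed.

Let's continue with our example: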

A- Upper bounds:

    Box<? extends Fruit> box1 = null;
    Box<? extends Apple> box2 = null;

    box1=box2; // Legal
    box2=box1; // Illegal

The first assignment is legal because anything that extends Apple also extends Fruit (an Apple derivative IS A Fruit derivative).

Remember that this is an upper-bounded wildcard, so you can get elements from box1 as Fruit, and you can still do that after the assignment box1=box2;

Of course, the other way around is invalid and produces a compile-time error: you can't make the assignment box2=box1, because it breaks the 'IS A' relation.
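
The upper-bound case can be tried out with real boxes instead of null references. This is a sketch only: the queue-backed Box body and the UpperBoundAssignment class are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Fruit {}
class Apple extends Fruit {}

// Hypothetical Box body: a simple FIFO queue of pieces.
class Box<T extends Fruit> {
    private final Deque<T> pieces = new ArrayDeque<T>();
    T getNextPiece() { return pieces.poll(); }
    void addPiece(T piece) { pieces.add(piece); }
}

class UpperBoundAssignment {
    static Fruit read() {
        Box<Apple> apples = new Box<Apple>();
        apples.addPiece(new Apple());

        Box<? extends Apple> box2 = apples;
        Box<? extends Fruit> box1 = box2;   // legal: every Apple derivative is a Fruit derivative
        // box2 = box1;                     // does not compile: box1 might hold oranges

        return box1.getNextPiece();         // still readable as Fruit
    }

    public static void main(String[] args) {
        System.out.println(read().getClass().getSimpleName());
    }
}
```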


B- Lower bounds:

    Box<? super Fruit> box1 = null;
    Box<? super Apple> box2 = null;

    box1=box2; // Illegal
    box2=box1; // Legal

The second assignment, box2=box1, is legal because anything that is an ancestor of Fruit 'IS A' (an) ancestor of Apple. Since you could add apples into box2 before the assignment with no compiler errors, the assignment does not break that: you can still pass an Apple as a Fruit.


Another approach, which you might find simpler, is to substitute B with a compatible concrete type and see if it can be assigned to A. In the example above:

    Box<? super Fruit> box1 = new Box<Fruit>();

    Now, is Box<? super Apple> box2 = new Box<Fruit>() valid? Yes, so the assignment is valid.
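
A minimal sketch of the lower-bound case, again assuming a queue-backed Box body and a made-up LowerBoundAssignment class:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class Fruit {}
class Apple extends Fruit {}

// Hypothetical Box body: a simple FIFO queue of pieces.
class Box<T extends Fruit> {
    private final Deque<T> pieces = new ArrayDeque<T>();
    T getNextPiece() { return pieces.poll(); }
    void addPiece(T piece) { pieces.add(piece); }
}

class LowerBoundAssignment {
    static Fruit roundTrip() {
        Box<? super Fruit> box1 = new Box<Fruit>();
        Box<? super Apple> box2 = box1;   // legal: every ancestor of Fruit is an ancestor of Apple
        // box1 = box2;                   // does not compile: box2 might be a Box<Apple>

        box2.addPiece(new Apple());       // apples can still go in through box2
        return box1.getNextPiece();       // same box, so the apple comes back out as a Fruit
    }

    public static void main(String[] args) {
        System.out.println(roundTrip().getClass().getSimpleName());
    }
}
```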



3- Generics with generic type parameters, with wildcards in between

Check out the following:

    List<Box<? extends Fruit>> boxList=new ArrayList<Box<Orange>>();

At first sight you might think that the assignment is legal. It is not; the compiler will whine with an error.

The problem here is the type parameter of the List, which is "Box<? extends Fruit>". We know that unless you use a wildcard, you have to use the exact same type parameter on the other end of the assignment. That means List<Fruit> = List<Orange> is an invalid assignment: although an Orange is a Fruit, that does NOT hold when they are treated as type parameters. To make it legal you use a wildcard: List<? extends Fruit> = List<Orange> is legal.

The same applies here. You have to forget for a second that Box<? extends Fruit> is a wildcarded type itself and think about it as the type parameter of the List. The List is NOT using a wildcard on this type, so you have to make an exact match: only a Box<? extends Fruit> matches a Box<? extends Fruit> as a type parameter. You can use a wildcard with the List as follows:

    List<? extends Box<? extends Fruit>> boxList=new ArrayList<Box<Orange>>();

This assignment is now legal.
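
As a quick sketch of the difference, assuming an empty Box body and a hypothetical NestedWildcards class:

```java
import java.util.ArrayList;
import java.util.List;

class Fruit {}
class Orange extends Fruit {}
class Box<T extends Fruit> {}   // body omitted; only the types matter here

class NestedWildcards {
    static int countBoxes() {
        List<Box<Orange>> orangeBoxes = new ArrayList<Box<Orange>>();
        orangeBoxes.add(new Box<Orange>());

        // List<Box<? extends Fruit>> exact = orangeBoxes;   // does not compile: not an exact match
        List<? extends Box<? extends Fruit>> boxList = orangeBoxes; // legal: wildcard on the List too

        int count = 0;
        // Each element can be read as a Box<? extends Fruit>.
        for (Box<? extends Fruit> box : boxList) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBoxes());
    }
}
```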

So how can you resolve such wildcards?

1- When using upper-bounded wildcards (extends): remember that when B extends A, B is assignable to A, which means that someA = someB is legal, but not vice versa. That's why the statement above compiles with no error: you are asking for something that extends Box<? extends Fruit>, and indeed a Box<Orange> is assignable to that (Box<? extends Fruit> box1 = box2, where box2 is a Box<Orange>).

2- When using lower-bounded wildcards (super): The expression:

List<? super Box<? extends Fruit>> boxList=new ArrayList<Box<Orange>>();

will generate a compilation error, because this is the reverse of the case above: you're asking for something that is super to Box<? extends Fruit>, which means something that a Box<? extends Fruit> can be assigned to, and not the other way around. The above requires the assignment Box<Orange> = Box<? extends Fruit>, which is not legal.

It would be legal if it were the following :

List<? super Box<Orange>> boxList=new ArrayList<Box<? extends Fruit>>();

Because, as you can see, Box<? extends Fruit> = Box<Orange> is a legal assignment.
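
The lower-bounded nesting can be sketched like this. The queue-backed Box body and the NestedSuper class are assumptions for illustration, and the rejected declaration stays commented out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class Fruit {}
class Orange extends Fruit {}

// Hypothetical Box body: a simple FIFO queue of pieces.
class Box<T extends Fruit> {
    private final Deque<T> pieces = new ArrayDeque<T>();
    T getNextPiece() { return pieces.poll(); }
    void addPiece(T piece) { pieces.add(piece); }
}

class NestedSuper {
    static Fruit roundTrip() {
        List<Box<? extends Fruit>> mixed = new ArrayList<Box<? extends Fruit>>();

        // Legal: a Box<Orange> can be assigned to a Box<? extends Fruit>.
        List<? super Box<Orange>> boxList = mixed;
        // List<? super Box<? extends Fruit>> bad = new ArrayList<Box<Orange>>(); // does not compile

        Box<Orange> oranges = new Box<Orange>();
        oranges.addPiece(new Orange());
        boxList.add(oranges);                  // the lower bound lets a Box<Orange> in

        return mixed.get(0).getNextPiece();    // read back through the exact-typed list
    }

    public static void main(String[] args) {
        System.out.println(roundTrip().getClass().getSimpleName());
    }
}
```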

Of course, the same rules apply to more than two levels of nested generics, but the code gets really hard to read and trace.

I hope that resolves some of the confusion surrounding wildcards. Good luck!