API's that Suck

May 26, 2009

Immutable Collections Are Easy, What We Need are Immutable Data Structures

Filed under: Uncategorized — Grauenwolf @ 10:40 pm

Time and time again I see examples of immutable collections. These are invariable a waste of time because immutable collections are a solved problem. People often say things such as:

A pure function over immutable stacks can be safely called from multiple threads with no need for locking.

API Design

Ok, that’s true. But why use a so-called “immutable stack”? Consider these two declarations:

int Foo( ReadOnlyStack<Record> records)
int Foo( IEnumerable<Record> records)

As a developer using someone else’s API, what would you prefer? Would you like to explicitly convert your collection into a ReadOnlyStack? Or would you want the API just to make it’s own stack based on what you pass?

I know some people are thinking “but I need it to be immutable so I don’t have to take a lock”. Fine. If you really can’t trust the developer then go ahead and demand the standard read-only collection. In .NET this would be:

int Foo( ReadOnlyCollection<Record> records)

You still don’t have to make your users muck about with some sort of “ReadOnlyStack”, which is just an implementation detail anyways.

Performance

The next argument I see is the claim that this will somehow make for better performance because you don’t have to make a copy. That is clearly a violation of the premature optimization rule.

Advantages of using a shared linked-list

  1. Initial copies are essentially free.
  2. Stacks can grow forever without expensive copy operations

Disadvantages of using a shared linked-list

  1. Initial size is larger than array-based stacks
  2. As they are used, they quickly diverge and stop sharing nodes
  3. They are less likely to be sequential in memory, especially as they age. (Can you say cache miss?)

Clearly there are going to be times where pre-allocating an array based stack is going to give you better performance than a stack that is scattered across memory. I can’t say array-based stacks are always going to be better, but why take the risk? If you just request an IEnumerable at the beginning, you can use whatever type of stack your performance testing dictates. You can even special-case the ReadOnlyStack if you feel it is necessary.

So what is the problem?

Complex, immutable data types are the interesting problem. Currently in platforms like Java and .NET, there is no easy way to make a class that has both mutable and immutable variants. 

More often than not developers want mutable objects when they are constructing it. Data may be coming from multiple sources or even be aggregated as it comes in. Only once the class is finalized do we want

Currently we can do this by using interfaces, which are hard to change, and tons of boilerplate code.

Const in .NET

One option would be to add the concept of a Const modifier to C# and Visual Basic. The rules would be simple, if you specify const than you get a compiler error when accessing non-pure methods and properties.

I don’t really like this one for two reasons.

  1. Cheating is too easy. Anyone can simply cast away the const either directly or by
  2. There is no guarantee that someone won’t hold onto the mutable reference.

Still, there is something nice about the syntax.

Boxing for Purity?

In .NET every value type actually has two classes. There is the class named for the value type such as Integer or Point. Then there is the unnamed class that is used when the value is “boxed” and placed on the stack. Developers only have to deal with the first class, the runtime takes care of the second.

What if we had the same concept for purity in classes? Say your Customer class had an implied method called MakeImmutable. This would copy it into another class called “ImmutableCustomer” that was identical except:

  1. The class definition is created by the compiler or runtime.
  2. All methods/properties not marked as “Pure” are removed.
  3. All child objects are mapped to their Immutable variants.
  4. All Customer classes can be implicitly cast to a ImmutableCustomer.
  5. A mutable copy can be created by calling MakeMutable. This won’t return the original or allow modifying the read-only variant.

This cannot be done in the language itself, the runtime needs to support it in order to work out the inheritance/casting rules.

Performance Lessons from StringBuilder

The StringBuilder class has an interesting property. It’s underlying string buffer is mutable, but only until you call ToString. Once you do that, any further changes will cause a new string buffer is created from the now immutable String.

The same thing should be done for these auto-generated classes. When you call MakeImmutable, it sets a flag. The copy of the underlying data isn’t actually made until a future call to mutating method/property is made.

This will be tricky when dealing with child classes. There will be times when some classes have been cloned but other are still simply marked.

What happens when you implicitly cast objects to their ImmutableVersion?

I don’t know.

Calling MakeImmutable is out of the question. In a mutable context, that call would be way too expensive.

Perhaps we just do nothing. If the caller is passing a mutable object to a method, the caller has to take responsibility for any rules.

What happens when you assign a mutable object to an Immutable reference without calling MakeImmutable?

Again, I don’t know.

One could argue the caller should be responsible for discarding the mutable references or suffer the consequences.

Alternately the receiver could call MakeImplicit itself and thus be safe from future changes. But this may break the caller’s expectations that the objects are the same.

I’m gong with compiler warnings for now. The caller must either explicitly cast the object, keeping it mutable, or call MakeImmutable which creates a copy.

Advertisements

8 Comments »

  1. So, how would you use an immutable stack?
    You can’t push or pop.

    What good is a stack that you can’t push things on to and pop things off of?

    Comment by Paul — May 27, 2009 @ 12:53 am

    • When you call Push or Pop, you get a new stack. I guess that you just assign it to the variable that was holding the original (thus making it mutable after all).
      It almost makes more sense in “functional languages” where you would pass the new stack to a recursive function. But for the most part, I think it is just silly.

      Comment by grauenwolf — May 27, 2009 @ 2:17 am

      • almost all threading issues are caused by shared state. immutable objects are safe from this. don’t share state almost no threading issues (at least the hard to understand ones).

        once you learn how to use them (and it does take a shift in view point) they are amazing. the functional languages are built off this so the language and syntax is designed around it. this doesn’t mean non functional languages can’t be assisted by there use.

        Remember the first time you where exposed to lambda expressions? the head scratch they caused….then the big eyes when you finally got how you could use them in interesting ways, never mind the way a smart compiler could speed up the operations.

        Comment by arthur — May 27, 2009 @ 6:35 pm

      • You are not paying attention. This isn’t mutable vs immutable objects, it is about not wasting time on immutable stacks when we should be focusing on ways to make the hard ones immutable.

        Comment by grauenwolf — May 28, 2009 @ 8:20 am

  2. The .NET team has released immutable collections now, FWIW. Check them out here: http://blogs.msdn.com/b/bclteam/archive/2012/12/18/preview-of-immutable-collections-released-on-nuget.aspx

    Comment by Andrew Arnott — January 4, 2013 @ 10:59 pm

    • But only for .Net 4.5 😦

      Comment by William — January 11, 2013 @ 11:37 am

      • William, would targeting .NET 4.0 do it for you, or are you looking for something further downlevel like v3.5?

        Comment by Andrew Arnott — January 11, 2013 @ 8:38 pm

  3. Since this blog post’s main point is we need more convenient dual mutable/immutable objects in .NET, you may find my latest post interesting since it provides a tool to solve this problem: http://blogs.msdn.com/b/andrewarnottms/archive/2013/01/08/simple-immutable-objects.aspx

    Comment by Andrew Arnott — January 11, 2013 @ 3:04 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: