Sunday, September 30, 2012

Primitive Obsession, Custom String Types, and Self Referencing Generic Constraints

Primitive Obsession, Custom String Types, and Self Referencing Generic Constraints:
I was once accused of primitive obsession. Especially when it comes to strings. Guilty as charged!
There’s a lot of reasons to be obsessed with string primitives. Many times, the data really is a just a string and encapsulating it in some custom type is just software “designerbation.” Also, strings are special and the .NET Framework heavily optimizes strings through techniques like string interning and providing classes like the StringBuilder.
But in many cases, a strongly typed class that better represents the domain is the right thing to do. I think System.Uri and its corresponding UriBuilder is a prime example. When you work with URLs, there are security implications that very few people get right if you treat them just as strings.
But there’s a third scenario that I often run into now that I build client applications with the Model View ViewModel (MVVM) pattern. Many properties on a view model correspond to user input. Often these properties are populated via data binding with input controls on the view. As such, these properties often need be able to hold invalid values that represent the input the user entered.
For example, suppose I want a user to type in a URL. I might have a property like:


public string YourBlogUrl {get; set;}




I can’t change the type of that property to a Uri because as the user types, the intermediate values bound to the property are not valid instances of Uri. For example, as the user types the first “h” in “http”, trying to bind the input value to the Uri property would fail.

But this sucks because suppose I want to display the host name on the screen as soon as one becomes can (more or less) be determined based on the input. I’d love to be able to just bind another control to YourBlogUrl.Host. But alas, the string type does not have a Host property.

Ideally I would have some middle ground where the type of that property both has structure, but allows me to have invalid values. Perhaps it has methods to convert it to a more strict type once we validate that the value is valid. In this case, a ToUri method would makes sense.

But string is sealed, so can’t derive from it. What’s a hapless coder to do?


Custom string types through implicit conversions



Well you could use the StringOr<T> class written as an April Fool’s joke. It was a joke, but it might be useful in cases like this! But that’s not the approach I’ll take.

Or you can follow the advice of Jimmy Bogard in his post on primitive obsession that I linked to at the beginning (I’m sure he’ll love that I dragged out a post he wrote five years ago) and write a custom class that’s implicitly convertible to string.

In his post, he shows a ZipCode example which I will include below, but with one change. The very last method is a conversion overload and I changed it from explicit to implicit.



public class ZipCode
{
private readonly string _value;

public ZipCode(string value)
{
// perform regex matching to verify XXXXX or XXXXX-XXXX format
_value = value;
}

public string Value
{
get { return _value; }
}

public override string ToString()
{
return _value;
}

public static implicit operator string(ZipCode zipCode)
{
return zipCode.Value;
}

public static implicit operator ZipCode(string value)
{
return new ZipCode(value);
}
}




This allows you to write code like:



ZipCode zip = "98008";




This provides the ease of a string to initialize a ZipCode type, while at the same time it provides access to the structure of a zip code.

In the interest of full disclosure, many people have a strong feeling against implicit conversions. I asked Jon Skeet, Number one dude on StackOverflow and perhaps as well versed in C# as just about anybody in the world, to review a draft of this post as I didn’t want to propagate bad practices without due warning. Here’s what he said:


Personally I really dislike implicit conversions. I don't even like explicit conversions much - I prefer methods. So if I'm writing XML serialization code, I'll usually have a FromXElement static method, and a ToXElement instance method. It definitely makes the code longer, but I reckon it's ultimately clearer. (It also allows for several conversions from the same type with different meanings - e.g. Duration.FromHours, Duration.FromMinutes etc.)


I don’t think I’d ever expose an implicit conversion like this in a public API that’s meant to share with others. But within my own code, I like it so far. If I get bitten by it, maybe I’ll change my tune and Skeet can tell me, “I told you so!”


Taking it further



I like Jimmy’s approach, but it doesn’t go far enough for my needs. For example, this works great when you employ this approach from the start. But what if you already shipped version 1 of a property as a string? And now you want to change that property to a ZipCode. But you have existing values serialized to disk. Or maybe you need pass this ZipCode to a JSON endpoint. Is that going to serialize ok?

In my case, I often want these types to act as much like strings as possible. That way, if I change a property from string to one of these types, it’ll break as little of my code as possible (if any).

What this means is we need to write a lot more boilerplate code. For example, override the Equals method and operator. In other cases, you may want to override the addition operator. I did this with a PathString class that represents file paths so I could write code like this:



// The code I recommend writing.
PathString somePath = @"c:\fake\path";
somePath = somePath.Combine("subfolder");
// somePath == @"c:\fake\path\somePath";

// But if you did this by accident:
PathString somePath = @"c:\fake\path";
somePath += "subfolder";
// somePath == @"c:\fake\path\somePath";




PathString has a proper Combine method, but I see code where people attempt to concatenate paths all the time. PathString overrides the addition operator creating an idiom where concatenation is equivalent to path combination.  This may end up being a bad idea, we’ll see. My feeling is that if you’re already concatenating paths, this can only make it better.

I also implemented ISerializable and IXmlSerializable to make sure that, for example, the serialized representation of PathString looks exactly like a string.

Since I have multiple types like this, I tried to push as much of the boilerplate into a base class. But it takes some tricky tricky tricks that might be a little bit evil.

Here’s the signature of the base class I wrote:



[Serializable]
public abstract class StringEquivalent<T>
: ISerializable, IXmlSerializable where T : StringEquivalent<T>
{
protected StringEquivalent(string value);
protected StringEquivalent();
public abstract T Combine(string addition);
public static T operator +(StringEquivalent<T> a, string b);
public static bool operator ==(StringEquivalent<T> a, StringEquivalent<T> b);
public static bool operator !=(StringEquivalent<T> a, StringEquivalent<T> b);
public override bool Equals(Object obj);
public bool Equals(T stringEquivalent);
public virtual bool Equals(string other)
public override int GetHashCode();
public override string ToString();
// Implementations of ISerializable and IXmlSerializable
}




The full implementation is available in my CodeHaacks repo on GitHub with full unit tests and examples.


Self Referencing Generic Constraints



There’s some stuff in here that just seemed crazy to me at first. For example, taking out the interfaces, did you notice the generic type declaration?



public abstract class StringEquivalent<T> : where T : StringEquivalent<T> 




Notice that the generic constraint is self-referencing. This is a pattern that Eric Lippert discourages:


Yes it is legal, and it does have some legitimate uses. I see this pattern rather a lot(**). However, I personally don't like it and I discourage its use.

This is a C# variation on what's called the Curiously Recurring Template Pattern in C++, and I will leave it to my betters to explain its uses in that language. Essentially the pattern in C# is an attempt to enforce the usage of the CRTP.

…snip…

So that's one good reason to avoid this pattern: because it doesn't actually enforce the constraint you think it does.

…snip…

The second reason to avoid this is simply because it bakes the noodle of anyone who reads the code.


Again, Jon Skeet provided an example of the warning that Lippert states in regards to the inability to actually enforce the constraint I might wish to enforce.


While you're not fulIy enforcing a constraint, the constraint which you have got doesn't prevent some odd stuff. For example, it would be entirely legitimate to write:

public class ZipCode : StringEquivalent<ZipCode>

public class WeirdAndWacky : StringEquivalent<ZipCode>

That's legal, and we don't really want it to be. That's the kind of thing Eric was trying to avoid, I believe.


The reason I chose to against the recommendation of someone much smarter than me in this case is because my goal isn’t to enforce these constraints at all. It’s to enable a scenario. This is the only way to implement these various operator overloads in a base class. Without these constraints, I’d have to reimplement them for every class class. If you know a better approach, I’m all ears.


WPF Value Converter and Markup Extension Examples



As a bonus divergence, I thought I’d throw in one more example of a self-referencing generic constraint. In WPF, there’s a concept of a value converter, IValueConverter, used to convert values from XAML to your view model and vice versa. However, the mechanism to declare and use value converters is really clunky.

Josh Twist provides a nice example that cleans up the syntax with value converters that are also MarkupExtension. I decided to take it further and write a base class that does it.



public abstract class ValueConverterMarkupExtension<T> 
: MarkupExtension, IValueConverter where T : class, IValueConverter, new()
{
static T converter;

public override object ProvideValue(IServiceProvider serviceProvider)
{
return converter ?? (converter = new T());
}

public abstract object Convert(object value, Type targetType
, object parameter, CultureInfo culture);

// Only override this if this converter might be used with 2-way data binding.
public virtual object ConvertBack(object value
, Type targetType, object parameter, CultureInfo culture)
{
return DependencyProperty.UnsetValue;
}
}




I’m sure I’m not the first to do something like this.

Now all my value converters inherit from this base class.


Back to Primitives



Back to the original topic, I used to supplement primitives with loads of extension methods. I have a set of extension methods of string I use quite a bit. But more and more, I’m starting to prefer dialing that back a bit in cases like this where I need something to be a string with structure.







DIGITAL JUICE

Strange outdoor sculptures in Google Earth

Strange outdoor sculptures in Google Earth:
The folks at Curious World have put together a list of some of the strangest outdoor sculptures they could find, and there are certainly some interesting items in here. For example, here is the "Spoonbridge and Cherry" near Minneapolis, MN:


spoonbridge-and-cherry.jpg


Another good example is the "Angel of the North", a $1.4 million sculpture in the United Kingdom:


angel-of-the-north.jpg


The full list is quite interesting, and most of the items are in 3D in Google Earth thanks to talented SketchUp modelers around the world. Check out the full list at on their site and scroll to the bottom to find the locations of all of the items on a map.




DIGITAL JUICE

AntiVir Personal 13.0.0.2681

AntiVir Personal 13.0.0.2681: Avira AntiVir Personal - FREE Antivirus is a reliable free antivirus solution, that constantly and rapidly scans your computer for malicious programs such as viruses, Trojans, backdoor programs, hoaxes, worms, dialers etc. Monitors every action executed by the user or the operating system and reacts promptly when a malicious program is detected.




DIGITAL JUICE

Flash Player 11.5.500.80 Beta (Non-IE)

Flash Player 11.5.500.80 Beta (Non-IE): Adobe Flash Player is the high performance, lightweight, highly expressive client runtime that delivers powerful and consistent user experiences across major operating systems, browsers, mobile phones and devices.




DIGITAL JUICE

Flash Player 11.5.500.80 Beta (IE)

Flash Player 11.5.500.80 Beta (IE): Adobe Flash Player is the high performance, lightweight, highly expressive client runtime that delivers powerful and consistent user experiences across major operating systems, browsers, mobile phones and devices.




DIGITAL JUICE

Security Essentials 4.1.522 Vista

Security Essentials 4.1.522 Vista: Microsoft Security Essentials provides real-time protection for your home PC that guards against viruses, spyware, and other malicious software.




DIGITAL JUICE

ACDSee 15.0.169

ACDSee 15.0.169: ACDSee - the most powerful photo manager around is now even faster. Way faster. No other photo software saves you so much time. Enjoy the freedom to find, organize and edit your photos faster, easier and with better results than ever before. Instantly share your pictures online or on your cell phone. Create quality prints or Flash and PDF slideshow...




DIGITAL JUICE

CCleaner 3.23.1823

CCleaner 3.23.1823: CCleaner is a freeware system optimization, privacy and cleaning tool. It removes unused files from your system - allowing Windows to run faster and freeing up valuable hard disk space. It also cleans traces of your online activities such as your Internet history. Additionally it contains a fully featured registry cleaner. But the best part is that...




DIGITAL JUICE

PostgreSQL 9.2.1

PostgreSQL 9.2.1: PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness.




DIGITAL JUICE