would probably be too strong of a title for this post. Something more appropriate would be “Extension Methods Should Be Used Carefully.” But that just doesn’t excite the masses as much, does it? I bet your pulse is already spiked. I wonder if I’ll get any “No they don’t!” comments. Anyways, let us continue on. In one of my recent posts I wrote a Map method that looked like this:
public static IEnumerable<TResult> Map<TArg, TResult>(
this IEnumerable<TArg> list,
Func<TArg, TResult> func)
{
if (list == null || list.Count() == 0)
{
return new List<TResult>();
}
else
{
List<TResult> result = new List<TResult>();
result.Add(func(list.First()));
result.AddRange(Map(list.Skip(1)
.Take(list.Count() - 1).ToList(), func));
return result;
}
}
As one of my readers, James Curran, pointed out the implementation of the method was horribly inefficient. And yes, I was aware of that. I was trying to write it in a truly functional manner, but I agreed and updated the method with a more efficient one. Case closed…but not really. James’ comments said that Count() was linear and so using it twice was really making an already inefficient algorithm the code equivalent of a beached blue whale (this is not a direct quote, he said it much more eloquently).
My first reaction was “Psssssha, whatever. Count() isn’t linear, everyone knows that most every collection tracks the count as items are put in and taken out, and so when I call Count() I’ll just get that integer returned to me. It doesn’t count the total every time! Duuuuh. That would be a fairly naive implementation of Count! If someone implemented a collection with a Count() method that didn’t operate in constant time I would slap them in their stupid face!” But then I stopped for a second. And I realized that this extension method was being made for IEnumerable and not ICollection. So, do you see the problem here? IEnumerable doesn’t have a Count() method, so how was I calling it on the variable then? I swear I compiled it!
Well, the short answer is that Count() is an extension method in the System.Linq namespace and it in fact can be linear time. It does try to cast to ICollection first, just in case you are using a collection it doesn’t want to actually loop through the whole thing. But the point is that this method could have been linear time.
Check this bad boy out (this is the source from Reflector, come on Microsoft, get that .net framework sourcecode out!):
public static int Count<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
ICollection<TSource> is2 = source as ICollection<TSource>;
if (is2 != null)
{
return is2.Count;
}
int num = 0;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
num++;
}
}
return num;
}
Doh!
So, now we are back to the title of this article. Be careful when using and implementing extension methods. Especially when you are implementing extension methods that are named the same as other commonly used methods. In this instance I just used Count() because I am so used to using it on collections and when the code compiled and the tests succeeded, I went with it (my bad). Little did I know that I was turning my already slow algorithm into a lumbering aquatic mammal. So in conclusion, please oh please use them carefully, extension methods are awesome and powerful but they really do reduce discoverability of code. So, next time you want to implement an extension method with a name like Count(), Length(), Remove(), etc… just think WWJD? (What would Justin Do?)
Loved the article? Hated it? Didn’t even read it?
We’d love to hear from you.
I don’t get it – what’s wrong with the Count() method? If it’s an ICollection, it uses Count property and it’s super fast. If it’s not, it loops over the collection and counts the elements. Slow? Come on…
Nope, nothing wrong with the Count() method at all. The issue is that I used the Count() method without knowledge of it. The extension method appeared to be part of the object I was working with, and I used it based on that knowledge never realizing that I was in fact using a completely different method. Sure, I should have been more careful, but I am just pointing out how easy it is to make a mistake like that and how you should take special considerations when creating extension methods that use common names. What if the Count() method on IEnumerable had not accounted for the ICollection interface?
Well, then you’d have a less-than-ideal-but-still-working Count method.
Is that a big deal to you?
Likewise, I could create a collection called Monkeys and the Eeek() method could do some nasty processing, and unless you’ve got the source code, you’re none the wiser. (Barring reflector et al) In short, extension methods don’t really change this.
What your real problem is here, is knowledge. If you don’t know that extension methods aren’t part of the type (keep in mind they look different in Intellisense), you can be easily fooled into thinking they exist on the actual type. I’m not sure how much of a problem this is, however. I certainly wouldn’t say extension methods suck because of this, in fact, I’d say they rock heartily. But I realize you used that headline to generate emotion and get people like me reading and commenting on your blog. 🙂
I agree with you entirely. I think that extension methods are very powerful and like anything that is powerful, it should be used responsibly. And yes, my carelessness at not spotting that I was dealing with an IEnumerable instead of a Collection object was the root of my problem, but that isn’t to say that other people won’t make these kinds of mistakes. If I inspire one person to pause and think for just one second the next time that they go to implement an extension method, then I have done my job. 🙂
Heheh, ok, fair enough.
Lesson:
"Be careful when using and implementing extension methods"
or
"I think that extension methods are very powerful and like anything that is powerful, it should be used responsibly."
Wow…u can say that for about 100 other C# features anywhere from delegates to String.IsNullOrEmpty
Yep, I can say it for bulldozers and hot sauce as well. 🙂 These are universal maxims.
what’s wrong with this simple implementation?
public static IEnumerable<TResult> Map<TArg, TResult>(this IEnumerable<TArg> list, Func<TArg, TResult> func)
{
foreach(var item in this)
yield return func(item);
}
no count. no problem.
Well, I did already post a faster implementation in the other post which I referenced, which is similar to your function. It doesn’t use yield return though since that would return a single item and the Map method returns an IEnumerable. Map should run the function on the entire IEnumerable for each call, not just one item. And you referenced "this" in a static method.
Extension Methods are rather obviously not polymorphic or carried at runtime (if you want this behavior use mixins, there are libraries that support them in .NET).
The real key is viewing extension methods as "Context Sensitive" sugar: http://codebetter.com/blogs/gregyoung/archive/2007/12/05/a-use-for-extension-methods.aspx
On the Visual Studio 2008 intellisense there’s a little down-arrow by the extension-methods name (which differentiates them from class methods). I think it might be helpful if they’ll allow changing the extension-method’s font in the editor as well, to prevent the problem described in this post.
Hi there,
I think that the Title you have choosen is really not good… Extension methods are beutiful they add a lot of value to the language…
I don’t see how you can implement a better implementation of count for and IEnumarable….
Regards
You didn’t use Extension Methods responsibly and say it sucks? You suck… at choosing titles.
Just to point out that .Any() extension method would be a better choice in this case, for the reasons you already expressed.
Instead of :
if( anyenumerable.Count() > 0 )
// you don’t know if anyenumerable is a collection or not
Use:
if( anyenumerable.Any() )
since you just care if the collection is empty or not, you don’t really need the count and in case it’s an enumerator you would be enumerating just the first one.
by the way, yield return does not return "only one item". yield return will effectively return one item for every item in the original enumerator. Check Jakub Sturc’s implementation again, it does work.
Regards.
(Check Jakub Sturc’s implementation, it does work) -> apart from the "this" keyword in the static method of course.
@Regev: Thanks, helpful tip.
@Marlon,Fatih: I’ll keep your comments in mind next time I pick a title!
@Potiguar: Yep, my bad. It does return an iterator that will work with IEnumerable. I looked at it completely wrong as if "yield return" was going to try to return an int each time that method was called. Well, that is totally incorrect. Thanks for pointing this out, and sorry to Jakub Sturc!
Ok, I don’t really have anything to add, but I figured I’d milk my 15 seconds of fame. Since posting my previous message which you cite, I’ve gotten my blog website working again, so I figured I take this oppertunity to post it’s URL.
Extension Methods are the Devil
I agree, stay away from extension methods
the only reason to implement them was to support new features in C# 3.0 without breaking backwards compatibility.
This is (one reason) why you should *always* clear your using namespaces on a new file (or change your new class template) and just add the ones you actually need.
Yeah, that is a really good point, with extension methods we have to be careful which namespaces we include.
A little late, but this article helped me understand something going on with my current project that uses quite a bit of linq. I had an array of about 10,000 elements that I was feeding into a linq query to filter. Then I was looping through the result set and, for each result, creating an XElement object for an xml file reference in the result. As each one was loaded, I was setting a "percentage done" indicator by taking current number of items processed divided by the total number in the result set (results.Count()).
Each iteration through the loop was taking over 500ms, and I had assumed that this was mostly time devoted to loading and parsing the xml file. Turns out almost all of it was coming from the results.Count() call, presumably because a straight array can’t be cast to ICollection. Avoiding that linear Count() call brought each iteration from over 500ms to less than 1ms.
It’s an extreme example to be sure, and something I probably should have been aware of (in my defense, I’ve been doing C# for a couple of months after programming almost exclusively in perl for about 10 years), but it’s an example of what can happen with something reasonably innocuous. Thanks for the article!
Well, that was just a gold star on my day. 🙂 Glad to help out!
really?well,sound interesting.thanks for sharing it