Archive for the ‘C#’ Category

Working with JSON Values that are C# Reserved Words

February 17, 2011

February 17th 2010 | Jimmy Bosse

While playing around with GitHub’s API for Gists, I discovered I couldn’t simply deserialize into an object because the API used the C# reserved word “public” as a property name. Luckily, the DataMember attribute has a “Name” property that allows you to explicitly map a JSON property to a property of a different name on your object:

using System.Runtime.Serialization;

namespace MyGistClient
{
    [DataContract]
    public class GistsResult
    {
        ...

        // Map the JSON property "public" (a C# keyword) to a legal C# name.
        [DataMember(Name = "public")]
        public bool IsPublic { get; set; }

        ...
    }
}

While C# will let me make a property named “Public”, I decided to use “IsPublic” instead so I don’t have any issues if my object gets consumed by another .NET language that isn’t case sensitive.
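To see the mapping in action, here is a minimal sketch that deserializes a small JSON string with DataContractJsonSerializer. The class name GistSketch, the helper GistSketchDemo.Parse, and the sample JSON are assumptions made up for illustration; the real Gist payload has many more fields.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical, cut-down stand-in for the real result class.
[DataContract]
public class GistSketch
{
    // The JSON name is "public"; the C# name is IsPublic.
    [DataMember(Name = "public")]
    public bool IsPublic { get; set; }
}

public static class GistSketchDemo
{
    // Deserializes a JSON string into a GistSketch instance.
    public static GistSketch Parse(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(GistSketch));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (GistSketch)serializer.ReadObject(stream);
        }
    }
}
```

Calling GistSketchDemo.Parse("{\"public\":true}") should yield an object whose IsPublic is true.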

Jimmy Bosse is a Senior .NET developer and Team Lead at Thycotic Software, an agile software services and product development company based in Washington DC. Secret Server is our flagship password management software product.

Categories: C#, JSON

Thinking in Regex – A C# Regex Tutorial with Examples

March 4, 2010

March 4th | 2009

Regular expressions (regex) can be a difficult language to learn. The terse syntax is one factor (regexes are notoriously difficult to read), but another factor is the problem space. Pattern-matching problems require a different mindset than software development in general. However, if you can phrase the question in precise terms, translation to regex becomes easier, even trivial in many cases.

Given this, there are a few things to keep in mind when tackling a problem with regex.

  1. Be specific in your requirements. “I want to prevent bad characters” is not useful. “I want to exclude asterisks” is useful.
  2. Know the text your regex will run on. A regex to match a URL can be very simple or monstrously complex. The goal is the simplest regex that will always work with your input text.
  3. Be sure you’re not matching too much. Once your regex is matching what you want it to match, test it against various not-quite-right text samples to avoid embarrassing mistakes.

Let’s say that you have some HTML that looks roughly like this:

<body>
    <form id="form1" runat="server">
    <div>
        He &amp; I are best buddies.
        <a href="http://www.mywebsite.com/page.aspx?param1=1&param2=2&param3=3">
http://www.mywebsite.com/page.aspx?param1=1&param2=2&param3=3</a>
        <a href="http://www.mywebsite.com/page2.aspx?param1=1&param2=2&param3=3">
http://www.mywebsite.com/page.aspx?param1=1&param2=2&param3=3</a>
        <a href="http://www.mywebsite.com/page3.aspx?param1=1&param2=2&param3=3">
http://www.mywebsite.com/page.aspx?param1=1&param2=2&param3=3</a>
        <a href="http://www.mywebsite.com/page4.aspx?param1=1&param2=2&param3=3">
http://www.mywebsite.com/page.aspx?param1=1&param2=2&param3=3</a>
    </div>
    </form>
</body>

There are many pages like this, some with many more links. You need to fix the ampersands displayed as the text of the link to be &amp; but you do not want to replace ampersands within the href attribute or in the rest of the body of the page. This sounds a little tricky, so let’s refine our requirements.

We want to replace & with &amp; within the text part of an anchor tag only. We don’t want to replace & when it is followed by amp;

Better. But, what exactly does ‘text part’ or ‘anchor tag’ mean in terms of characters?

Specifically, we want to replace & with &amp; between the <a> and </a> character blocks, unless immediately followed by amp;

Now that we have phrased our task this way, it is much easier to form a solution. Let’s build a .NET regex to solve this problem. .NET supports variable-length lookbehind (more on this following the example), so we can build our regex in neat groups that apply conditions surrounding the text we want to replace.

Regex.Replace(inputText, "&", "&amp;");

This will replace every & with &amp;. Now we want to apply our conditions.

1) Must not be followed by amp;

2) Must come after (be preceded by) <a>

3) Must be followed by </a>

So the resulting regex looks like this:

Regex.Replace(inputText, @"(?<=\<a[^<>]*>[^<>]*)&(?!amp;)(?=[^<>]*</a>)", "&amp;");

Condition                        Regex
Must not be followed by amp;     (?!amp;)
Must be preceded by <a> tag      (?<=\<a[^<>]*>[^<>]*)
Must be followed by </a>         (?=[^<>]*</a>)

(?!, (?=, (?<=, and (?<! are lookarounds. &(?!amp;) can be read as “Match &, then check the next four characters. If they are amp; fail the match.” The other two conditions follow a similar pattern, requiring a preceding <a ….. > tag and a following </a> tag before any < or > is encountered. Restricting the text within < …. > blocks from containing additional <> characters prevents our regex from spanning multiple anchor tags.
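As a quick check, the pattern can be wrapped in a small helper and run against a shortened version of the sample markup. This is only a sketch: the class name AnchorAmpersandSketch, the method EscapeLinkText, and the sample strings are assumptions for illustration, and the pattern is the one built above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorAmpersandSketch
{
    // Lookbehind: inside the text of an <a ...> tag.
    // Negative lookahead: not already followed by "amp;".
    // Lookahead: the closing </a> comes before any other tag.
    private const string Pattern =
        @"(?<=\<a[^<>]*>[^<>]*)&(?!amp;)(?=[^<>]*</a>)";

    public static string EscapeLinkText(string html)
    {
        return Regex.Replace(html, Pattern, "&amp;");
    }
}
```

For the input He & I <a href="p.aspx?a=1&b=2">p.aspx?a=1&b=2</a>, only the ampersand in the link text changes, giving He & I <a href="p.aspx?a=1&b=2">p.aspx?a=1&amp;b=2</a>. The ampersand in the body text and the one inside the href are left alone.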

Two invaluable aids to learning and using regex in .NET are Expresso (a regex tool written in .NET), and the helpful forum community at RegexAdvice. If regex interests you I highly recommend playing with the tool and browsing the forum. And since you’re still reading—this probably means you!

David Cooksey is a Senior .NET Developer at Thycotic Software, an agile software services and product development company based in Washington DC. Secret Server is our flagship password management software product.

Categories: C#

Managed code isn’t always the best solution

January 15, 2010

January 15th | 2009

Managed code is cool. In fact, most of the code I write at work is managed code in C#. In one case our product team needed to write some unmanaged code as a Log On credential provider. Microsoft’s documentation on this repeatedly warns that managed code is a bad idea. Well, why? It’s not just for performance.

.NET was inherently designed to be version-dependent, meaning an application that targets the .NET 1.1 Framework CLR cannot run on 2.0, and 2.0 CLR projects will not be able to run on the 4.0 CLR when it’s released. This is why extensions for Windows can rarely be written in managed code. It is possible to write a Log On credential provider in managed code, since .NET can be exposed to COM and vice versa. It will even work. But suppose you write a credential provider built on .NET 2.0, and another vendor provides one built on .NET 1.1, or any version other than 2.0. The logon process will then attempt to load both versions of the .NET Framework. This will ultimately fail, and the user will not be able to log on without going into safe mode.
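One way to make the versioning point concrete is to ask the runtime which version is executing a given piece of managed code. The sketch below uses the framework’s Environment.Version property; the class and method names are made up for illustration, and the version reported depends entirely on the host process that loaded the code.

```csharp
using System;

public static class ClrVersionSketch
{
    // Reports the version of the runtime that is executing this code.
    // An in-process extension does not get to pick this on its own:
    // it runs on whatever runtime the host process has loaded for it.
    public static string Describe()
    {
        return "Runtime version executing this code: " + Environment.Version;
    }
}
```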

The same applies to context menus for Explorer and add-ons for Internet Explorer. They can both be extended using managed code, but all it takes is a version conflict to kill them off.

So, as much as we love our managed code, there are scenarios where it isn’t a good idea to use it. Ultimately, if you are trying to extend any application that uses COM (or any other unmanaged code solution) as its extension mechanism, rather than managed code, then stick with unmanaged code.

Kevin Jones is a Team Lead at Thycotic Software, an agile software services and product development company based in Washington DC. Secret Server is our flagship password management software product.

Categories: C#

Are Extension Methods a Code Smell?

December 10, 2009

December 10th | 2009

The extension method is a handy feature that came in C# 3.0/VB.NET 9.0 and .NET Framework 3.5. Quite simply, it allows the appearance of extending a class and giving it additional functionality without actually having to modify that class. Here’s an example:

class OtherClass
{
    public void Foo()
    {
        User user = new User();
        user.DisplayUserName();
    }
}

public class User
{
    public string UserName { get; set; }
}

public static class Extensions
{
    public static void DisplayUserName(this User user)
    {
        string userName = user.UserName;
        //Display userName
    }
}

In the “Foo” method we are calling a method “DisplayUserName” which appears to belong to the User class, but is actually a static method in a completely different location.

Extension methods are already used within the .NET Framework itself. Their most notable use is in LINQ: all of LINQ’s operators are implemented as extension methods on the IEnumerable<T> interface.
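A small sketch makes the LINQ point concrete: the familiar instance-style call and the explicit static call on System.Linq.Enumerable are the same thing. The class and method names here are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqExtensionSketch
{
    // The usual extension-method syntax.
    public static List<int> EvensViaExtensionSyntax(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0).ToList();
    }

    // The same calls written as ordinary static method calls,
    // which is what the compiler turns the version above into.
    public static List<int> EvensViaStaticCall(IEnumerable<int> numbers)
    {
        return Enumerable.ToList(Enumerable.Where(numbers, n => n % 2 == 0));
    }
}
```

Both methods return the same result for the same input; only the call syntax differs.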

But can they be a code smell? In most cases I would say yes, for several reasons.

Firstly, if it is something simple, why not just implement it in the object? In the case of the User, it may not be its responsibility to display itself in the interface. Therefore, to avoid violating the Single Responsibility Principle, it would be better to keep the functionality out of the User class. However, this turns something that might have been better as a service into a concrete static implementation, and this could be difficult to test by mocking. By creating a service and adding it to a container, we have a more appropriate solution.
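As a rough sketch of the service alternative, the display logic can sit behind an interface so that tests can substitute a fake. Everything here apart from User and UserName is a hypothetical name chosen for illustration (IUserDisplayService, RecordingDisplayService, Greeter), and container registration is left out.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string UserName { get; set; }
}

// The display responsibility lives behind an interface instead of a static method.
public interface IUserDisplayService
{
    void DisplayUserName(User user);
}

// A hand-rolled fake for tests: it records what it was asked to display.
public class RecordingDisplayService : IUserDisplayService
{
    public readonly List<string> Displayed = new List<string>();

    public void DisplayUserName(User user)
    {
        Displayed.Add(user.UserName);
    }
}

// A consumer receives the service through its constructor,
// so a container (or a test) decides which implementation it gets.
public class Greeter
{
    private readonly IUserDisplayService _display;

    public Greeter(IUserDisplayService display)
    {
        _display = display;
    }

    public void Greet(User user)
    {
        _display.DisplayUserName(user);
    }
}
```

A test can pass a RecordingDisplayService to Greeter and then inspect Displayed, which is not possible when the call is hard-wired to a static extension method.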

Secondly, new developers—or even experienced ones—might be confused. It’s impossible to tell just by looking at the code if it’s an extension method or an actual method on the object. If you have multiple extension methods the code can be extremely difficult to read. It may even introduce bugs or make code reviews labor intensive.

A third reason I see extension methods as problematic is that they appear to be instance-based. That is, they look like a method call on an instance of an object. In actual fact, the compiler just translates the call into a normal static method call. A developer might be tempted to do something like this:

public static class Extensions
{
    private static string _userName;
    public static void DisplayUserName(this User user)
    {
        _userName = user.UserName;
        DoDisplay();
    }

    public static void DoDisplay()
    {
        //Do something with _userName
    }
}

This is perfectly legal and working code, but by introducing something outside of the scope that is static, and not tied to an instance of User, we’ve introduced thread safety issues. I believe this is one of the reasons Extension Properties were not introduced—it’s an invitation for this sort of code.
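For contrast, here is a minimal sketch of the same kind of helper written without shared static state. The value is passed along as an argument rather than parked in a static field, so concurrent callers cannot overwrite each other’s data. The names SaferUserExtensions, DescribeUserName, and Format are assumptions for illustration.

```csharp
using System;

public class User
{
    public string UserName { get; set; }
}

public static class SaferUserExtensions
{
    // No static fields: everything the helper needs arrives as an argument.
    public static string DescribeUserName(this User user)
    {
        return Format(user.UserName);
    }

    private static string Format(string userName)
    {
        return "User: " + userName;
    }
}
```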

Extension methods can be a slippery slope. While they may have been introduced for a perfectly valid and specific reason, there’s a broken window effect: if there’s one broken window in the neighborhood, it’s an invitation for other windows to be broken too. Once one extension method exists in a codebase, it invites other people to write more of them, possibly in careless ways.

Are they useful in any context at all? Yes, I believe they are, usually for extending sealed classes for which you don’t have the source. A common pattern is to make string helpers as extension methods. Here’s an example:

class Program
{
    static void Main(string[] args)
    {
        string sample = "the quick brown fox jumped over the lazy dog";
        string titled = sample.ToTitleCase();
        //titled will be "The Quick Brown Fox Jumped Over The Lazy Dog"
    }
}

public static class StringExtensions
{
    public static string ToTitleCase(this string s)
    {
        return System.Globalization.CultureInfo.CurrentUICulture.TextInfo.ToTitleCase(s);
    }
}

 

This might be an acceptable use of an extension method. We are extending a sealed and well-known class “string” and giving it additional functionality.

I could also see them working in code you own, as helper methods in unit test projects.

As far as organization goes, try to keep extension methods down to a minimum, or none if possible. If you must use them, organize them so that all the extensions for a class live in a <ClassBeingExtended>Extensions static class. Or, to take it a step further, suffix the extension method’s name with “Extension” so it is easily identified as one.

I think they are cool – but ultimately the possible negatives outweigh the positives.

What say you readers? Are extension methods a quick way to shoot yourself in the foot?

Kevin Jones is a Team Lead at Thycotic Software, an agile software services and product development company based in Washington DC. Secret Server is our flagship password management software product.

Categories: .NET, C#

An Overview of C# 4.0

December 1, 2009

December 1st | 2009

In “Some Love for VB.NET 10 Too” I focused on the new features in VB.NET 10. Now let’s take a look at C# 4.0. C# already has a strong, rich feature set which will, no doubt, be developed even further. I will discuss the actual C# 4.0 specification itself, not the .NET Framework, so the ever-popular “dynamic” keyword will be left for later. It deserves a post of its own.

COM Cleanup

This feature is pretty cool and it’s definitely going to get a big cheer from those who do a lot of COM automation. C# 4.0 has a few extra features that make working with COM a little easier. Word automation is a popular reason to use COM automation in the .NET Framework. Take this example which runs Word for you and adds a paragraph with bold text:

object oMissing = System.Reflection.Missing.Value;
var word = new Application();
word.Visible = true;
var doc = word.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);
var paragraph = doc.Content.Paragraphs.Add(ref oMissing);
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();
Console.ReadKey(true);

Something that immediately stands out is all those refs to oMissing. What is that? Simply put, C# 3.0 doesn’t support optional parameters, but COM (and VB.NET) do. To indicate to the Runtime Callable Wrapper that you want the argument treated as if “nothing” was passed in (as opposed to null), Missing.Value is required. In C# 4.0, that is no longer the case. Here it is in C# 4.0:

var word = new Application();
word.Visible = true;
var doc = word.Documents.Add();
var paragraph = doc.Content.Paragraphs.Add();
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();

Much better! What’s actually happening here is just syntactic sugar. The C# compiler automatically adds a ref to Missing.Value for you, which is easy to see with a tool like Reflector. If you want Missing.Value passed automatically for some parameters and actual objects for others, read the section on named parameters farther down in this post. In addition to adding Missing.Value for you, the compiler now lets you omit ref on an argument in a COM call, even if the signature declares the parameter as ref.

A new addition to the C# 4.0 language is support for indexed properties. For example, let’s look at some code that adds a bookmark to the document:

object oMissing = System.Reflection.Missing.Value;
object endOfDoc = "\\endofdoc";
var word = new Application();
word.Visible = true;
var doc = word.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);
var paragraph = doc.Content.Paragraphs.Add(ref oMissing);
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();
var bookmark = doc.Bookmarks.get_Item(ref endOfDoc);
bookmark.Range.Text = "Goodbye World!";

bookmark is assigned from doc.Bookmarks.get_Item(). Bookmarks is actually a property that supports indexing, which C# 3.0 doesn’t support. In C# 4.0 we can now index into it:

object oMissing = System.Reflection.Missing.Value;
object endOfDoc = "\\endofdoc";
var word = new Application();
word.Visible = true;
var doc = word.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);
var paragraph = doc.Content.Paragraphs.Add(ref oMissing);
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();
var bookmark = doc.Bookmarks[endOfDoc];
bookmark.Range.Text = "Goodbye World!";

Neat. Does this mean C# 4.0 now supports indexable properties? Unfortunately not. As of now it can only consume them, not declare them—but it’s a step in the right direction. Also notice that we didn’t pass the endOfDoc by reference. With COM, you no longer have to pass an item by reference if you prefer not to. This only works on COM wrappers and not regular .NET managed code.
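Since C# still cannot declare an indexed property, the usual way to get the same call-site syntax in your own code is a parameterless property that returns an object with an ordinary indexer. The sketch below shows that workaround; DocumentSketch and BookmarkCollectionSketch are hypothetical types invented for illustration and have nothing to do with the Word object model.

```csharp
using System;
using System.Collections.Generic;

public class BookmarkCollectionSketch
{
    private readonly Dictionary<string, string> _textByName =
        new Dictionary<string, string>();

    // An ordinary C# indexer: declared on a type, not on a named property.
    public string this[string name]
    {
        get { return _textByName[name]; }
        set { _textByName[name] = value; }
    }
}

public class DocumentSketch
{
    private readonly BookmarkCollectionSketch _bookmarks =
        new BookmarkCollectionSketch();

    // The property takes no arguments; indexing happens on the object it returns.
    public BookmarkCollectionSketch Bookmarks
    {
        get { return _bookmarks; }
    }
}
```

With these types, doc.Bookmarks["end"] = "Goodbye World!" reads much like an indexed property at the call site, even though the language feature itself is absent.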

This brings us to the next COM feature: embedded interop assemblies. A problem that .NET developers have to deal with when working with COM is versioning, or even making sure that the required interop wrapper is installed. In .NET 4.0, the compiler will embed your COM interop types into your application if you right-click the reference, open its properties, and set Embed Interop Types to True. Your compiled application then no longer references the interop assembly. Instead, it contains its own copies of the types it uses. This ensures that your application is always using the same interop wrapper.

So our C# 3.0 code went from this:

object oMissing = System.Reflection.Missing.Value;
object endOfDoc = "\\endofdoc";
var word = new Application();
word.Visible = true;
var doc = word.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);
var paragraph = doc.Content.Paragraphs.Add(ref oMissing);
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();
var bookmark = doc.Bookmarks.get_Item(ref endOfDoc);
bookmark.Range.Text = "Goodbye World!";

To this:

var word = new Application();
word.Visible = true;
var doc = word.Documents.Add();
var paragraph = doc.Content.Paragraphs.Add();
paragraph.Range.Text = "Hello World!";
paragraph.Range.Bold = 1;
paragraph.Range.InsertParagraphAfter();
var bookmark = doc.Bookmarks["\\endofdoc"];
bookmark.Range.Text = "Goodbye World!";

That’s cleaner now, huh?

Covariance and Contravariance

This new feature was added to C# 4.0 to help with generics. In today’s C# code, all generics are invariant. For example, given the type SomeType<T> and SomeType<K> let’s assume that K is a superclass of T. SomeType<T> and SomeType<K> will not have any inheritance relationship at all even though T and K have a relationship. Why can’t we do this?

To ensure type safety. Consider what would go wrong if this were allowed:

List<string> stringList = new List<string>();
List<object> objectList = stringList;
objectList.Add(new object());

Even though string inherits from object, we can’t assign a list of strings to a list of objects. Line three is a good example of why not: our objectList would really be a list of strings at run time, then blam! we add an object to it. Delegates have the same problem.

private delegate T Callback<T>();
public void DoCallback()
{
    Callback<object> callback = new Callback<string>(CallbackHandler);
}

private string CallbackHandler()
{
    //Implementation Omitted
}

You would assume that this is OK. After all, why shouldn’t a callback of string be assignable to a callback of object if string always derives from object? Well… generic delegates are invariant, just like the List.

Variance to the rescue! Where have we seen variance before? Arrays. This has always compiled:

object[] actuallyStrings = new string[] { "Hello", "World" };
actuallyStrings[0] = new object();

This will throw an ArrayTypeMismatchException, but the compiler allows it. With generics, we can do this in a way that is always compile time safe. If a generic interface or generic delegate has a reference type T as its type parameter and does not have any method or member that takes in a parameter of type T, we can declare it to be covariant on T. On the other hand, if that interface or delegate does not have any method or member that returns T, we can declare it to be contravariant on T.

NOTE: That description was taken from Buu Nguyen’s post on Code Project. It’s an elegant description and kudos to him for thinking of it.

Thus, in C# 4.0 we can declare on our delegate that T is out as we never accept a type of T.

private delegate T Callback<out T>();
public void DoCallback()
{
    Callback<object> callback = new Callback<string>(CallbackHandler);
    object result = callback();
}
private string CallbackHandler()
{
    return "Hello";
}

result, in this case, is an object, and rightfully so. Since T can only be an output type, and we know string always inherits from object, the compiler can safely allow the implicit conversion.
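The mirror image, contravariance, works the same way with the in modifier: a delegate that only accepts T can be assigned to a delegate of a more derived T. The following is a minimal sketch; the delegate Measure<T> and the class ContravarianceSketch are hypothetical names chosen for illustration.

```csharp
using System;

// "in" marks T as input-only, which makes the delegate contravariant on T.
public delegate int Measure<in T>(T item);

public static class ContravarianceSketch
{
    public static int MeasureString(string text)
    {
        // A delegate that can measure any object...
        Measure<object> measureAnyObject = o => o.ToString().Length;

        // ...can safely stand in where only strings will ever be passed.
        Measure<string> measureString = measureAnyObject;

        return measureString(text);
    }
}
```

Calling ContravarianceSketch.MeasureString("Hello") returns 5, the length of the string.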

So why is this a feature in C# 4.0? As Buu points out, this feature has actually been available to the CLR since generics were introduced in .NET 2.0, though none of the languages supported it.
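The framework itself takes advantage of this: in .NET 4.0, IEnumerable<T> is declared as IEnumerable<out T>, so a sequence of strings can be used where a sequence of objects is expected. A short sketch follows, with class and method names made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FrameworkVarianceSketch
{
    public static int CountObjects(IEnumerable<object> items)
    {
        int count = 0;
        foreach (object item in items)
        {
            count++;
        }
        return count;
    }

    public static int Demo()
    {
        List<string> strings = new List<string> { "Hello", "World" };

        // Allowed because IEnumerable<out T> only ever hands T out.
        // The equivalent assignment to List<object> would not compile.
        IEnumerable<object> objects = strings;

        return CountObjects(objects);
    }
}
```

This is safe where the List<object> assignment earlier was not, because nothing can be added to the list through IEnumerable<object>.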

Named and Optional Parameters

Named and optional parameters partially complete the COM interop enhancements, something that many consider long overdue. With named parameters, you can pass arguments by name, in any order, regardless of the order in which the parameters are declared. This also comes into play with optional parameters. Here is an example:

static void Main(string[] args)
{
    PrintTwoStrings(stringB: "World", stringA: "Hello");
}

static void PrintTwoStrings(string stringA, string stringB)
{
    Console.WriteLine(stringA);
    Console.WriteLine(stringB);
}

Notice that I explicitly named the arguments, even out of order. Likewise, I can make stringB optional with a default. This code will print “Hello” and then “Japan”, each on its own line:

static void Main()
{
    PrintTwoStrings("Hello", stringB: "Japan");
}

static void PrintTwoStrings(string stringA, string stringB = "World")
{
    Console.WriteLine(stringA);
    Console.WriteLine(stringB);
}

Even though stringB has a default, we explicitly set it to Japan. If we were to omit the stringB in the call to PrintTwoStrings:

static void Main()
{
    PrintTwoStrings("Hello");
}

This will print “Hello” and then “World”. You can have multiple optional parameters, but they must all be placed after the required ones. This is to avoid confusion with overloads.

This also works with COM. If you have a COM method that takes an exceptionally large number of arguments, and you want to pass in Missing.Value for all but a few, you can use named parameters and it will default all of the others to Missing.Value.
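The same idea can be tried without COM. In this sketch a plain C# method has several optional parameters, and the caller names only the one it wants to override; the rest fall back to their defaults. The method BuildGreeting and its parameters are hypothetical and exist only for illustration.

```csharp
using System;

public static class NamedOptionalSketch
{
    // One required parameter followed by three optional ones.
    public static string BuildGreeting(
        string name,
        string greeting = "Hello",
        string separator = ", ",
        string punctuation = "!")
    {
        return greeting + separator + name + punctuation;
    }
}
```

NamedOptionalSketch.BuildGreeting("World") returns "Hello, World!", while NamedOptionalSketch.BuildGreeting("World", punctuation: "?") skips over greeting and separator and returns "Hello, World?".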

It’s worth noting that C# and VB.NET are compatible with each other on named and optional parameters. VB.NET has actually supported both these features for years.

Kevin Jones is a Team Lead at Thycotic Software, an agile software services and product development company based in Washington DC. Secret Server is our flagship password management software product.