Thursday, 9 December 2010

Decimal vs Double in .Net

Due to its increased accuracy, the decimal type is often preferred (and recommended by MSDN) over the double type when dealing with financial calculations.

Put in layman's terms, the reason the decimal type is more accurate is that it is encoded in base 10 (the number system humans use), as opposed to base 2 (the number system computers use). Some people like to explain the accuracy in terms of bits and bytes, e.g. decimal is 16 bytes vs. double's 8 bytes, but I find such explanations unintuitive.

The base-2 number system cannot exactly represent every fraction that terminates in base 10. Therefore, you sometimes get weird results when you perform arithmetic operations on double values. For example,

8.954-7.612 will return 1.3420000000000005

Whereas if you use the decimal type, you get the correct result:

8.954m-7.612m returns 1.342

This applies to comparison as well:

1.34200000000000005 > 1.342 returns false

1.34200000000000005m > 1.342m returns true
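
Both quirks can be reproduced with a minimal console sketch (the "R" round-trip format is used to reveal the full stored double value):

using System;

class DecimalVsDouble
{
    static void Main()
    {
        // ToString("R") reveals the full stored value of the double result
        Console.WriteLine((8.954 - 7.612).ToString("R")); // 1.3420000000000005
        Console.WriteLine(8.954m - 7.612m);               // 1.342

        // both double literals round to the same 64-bit value
        Console.WriteLine(1.34200000000000005 > 1.342);   // False
        // decimal keeps all 18 digits, so the comparison holds
        Console.WriteLine(1.34200000000000005m > 1.342m); // True
    }
}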

However, the extra accuracy comes at an extra cost:

1). Decimal takes twice as much memory as double (16 bytes vs. 8 bytes).

2). Double is the default type of fractional number literals in .Net, and decimal calculations are considerably slower (roughly an order of magnitude) than their double counterparts. My test of one million calculations confirms this point.

The above two points are probably not a huge problem, since they can be alleviated by increasingly accessible high-end hardware. However, when using decimals, there are a number of things you need to be aware of:

1). Double is the default in .Net, and many built-in functions that return numeric values return them as double, e.g. Math.Pow(). The implication is that if you declare your variable as decimal and want to combine it with a fractional numeric literal or with a function returning double, you have to explicitly convert the non-decimal party to decimal, because there is no implicit conversion between double and decimal in C#.

For example:

decimal pow = Math.Pow(2, 4);

The above line will give you a compile error if you don’t explicitly convert.

You have to do an explicit conversion every time your decimal variable meets a fractional literal (by appending an 'm' suffix to the literal) or a .Net function returning double.
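
For example, both lines below compile (a sketch; note that casting the double result to decimal may silently lose precision):

using System;

class ConversionDemo
{
    static void Main()
    {
        // decimal pow = Math.Pow(2, 4);       // compile error: no implicit
                                               // conversion from double
        decimal pow = (decimal)Math.Pow(2, 4); // explicit cast required

        decimal rate = pow * 1.05m;            // fractional literals need the m suffix
        Console.WriteLine(rate);               // 16.80
    }
}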


2). Since double is favoured by .Net, Microsoft has given the double type a cool feature - double calculations do not throw exceptions. Instead, a calculation returns one of three special values when it goes wrong:

- Double.NaN
- Double.NegativeInfinity
- Double.PositiveInfinity

This feature is not available for the decimal type. You may argue: what's the big deal? Can't we just catch and deal with these exceptions? Yes you can, but not without a lot more code, and not without breaking the logic flow (imagine you are in a big loop calculating the returns of 1000 portfolios).
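
A sketch of the contrast - the double expressions run to completion while the decimal one throws:

using System;

class SpecialValues
{
    static void Main()
    {
        double zeroD = 0.0;
        decimal zeroM = 0m;

        Console.WriteLine(1.0 / zeroD);   // Infinity - no exception thrown
        Console.WriteLine(0.0 / zeroD);   // NaN
        Console.WriteLine(Math.Sqrt(-1)); // NaN

        try
        {
            Console.WriteLine(1m / zeroM);
        }
        catch (DivideByZeroException)
        {
            // decimal has no special values, so it throws instead
            Console.WriteLine("decimal division by zero threw");
        }
    }
}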

3). Despite being treated conceptually as a primitive type, decimal is technically not a primitive type in .Net. For example:

3.4m.GetType().IsPrimitive returns false

3.4.GetType().IsPrimitive returns true


This is an important point to keep in mind when you use reflection, e.g. to auto-map value objects or data transfer objects, etc.
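
For example, if a hypothetical reflection-based mapper treats primitive types specially, decimal needs its own case:

using System;

class MappingHelper
{
    // decimal must be special-cased: Type.IsPrimitive reports false for it
    static bool IsSimpleType(Type type)
    {
        return type.IsPrimitive || type == typeof(decimal);
    }

    static void Main()
    {
        Console.WriteLine(IsSimpleType(typeof(double)));  // True
        Console.WriteLine(IsSimpleType(typeof(decimal))); // True
    }
}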

4). Decimal is more accurate than double, but double has a much bigger range (another reason there is no implicit conversion between them). Not only in max/min values:

decimal.MaxValue => 79228162514264337593543950335

double.MaxValue => 1.7976931348623157E+308


but also in how small a fraction can be represented:

decimal can represent at most 28 decimal places, while double can represent far smaller fractions (down to around 5E-324, albeit with fewer significant digits). This is especially important when you want to parse some unusual numbers returned from a database or a 3rd-party application. For example:

string str = "0.00000000000000000000000000005";

decimal.Parse(str) returns 0

double.Parse(str) returns 0.00000000000000000000000000005
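
A quick sketch printing both limits (the exact text of the output may vary slightly between runtimes):

using System;

class RangeDemo
{
    static void Main()
    {
        Console.WriteLine(decimal.MaxValue);   // 79228162514264337593543950335
        Console.WriteLine(double.MaxValue);    // 1.7976931348623157E+308

        string str = "0.00000000000000000000000000005";
        Console.WriteLine(decimal.Parse(str)); // 0.0000000000000000000000000000
        Console.WriteLine(double.Parse(str));  // 5E-29
    }
}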


As a financial software provider, our company has a "decimal only" policy in our code. Even though we use decimals every day, tricky bugs still occasionally creep into our code because of a lack of awareness of points 3 and 4 above. Knowing the subtle differences between double and decimal will help you choose which one to use and help you pin down otherwise hard-to-find bugs.


Thursday, 18 November 2010

.Net Rounding - ToEven v.s AwayFromZero

Since the early days of Office and VBA, there have been subtle discrepancies in how rounding is handled within the same platform. (Yes, I count Office and VBA as one platform.) If this is new to you, try this:

In Excel, enter =ROUND(22.225, 2) in a cell: it returns 22.23.

Or in Word, Insert -> Field -> = (Formula), type ROUND(22.225, 2) in the shaded area, then hit Shift + F9: you also get 22.23.

However, if you round the same number in VBA with Round(22.225, 2), you get a different result: 22.22.

To confuse the matter even more, if you call VBA's Format(22.225, "0.00") function, you get 22.23.



Most people are only accustomed to one way of rounding: less than 5 rounds down, 5 or more rounds up. Sometimes, however, this method creates bias in calculations. I don't think I can explain this better than Wikipedia, so please refer to http://en.wikipedia.org/wiki/Rounding

Clearly, VBA's Round() function applies the Round To Even method (alternatively called Banker's rounding): 22.225 becomes 22.22 when rounded to 2 decimal places, because the third decimal digit is 5 and the digit before it (2) is already even.

Unfortunately, VBA's Round() doesn't give you a way to specify a different rounding method. Moving to .Net, Math.Round() has an overloaded version that takes a MidpointRounding enum, allowing you to specify either ToEven or AwayFromZero (the familiar less-than-5-down, 5-or-more-up method).

Math.Round(22.225, 2, MidpointRounding.ToEven) = 22.22
Math.Round(22.225, 2, MidpointRounding.AwayFromZero) = 22.23

If you don't specify the MidpointRounding enum, the rounding defaults to ToEven:

Math.Round(22.225, 2) = 22.22

If you are unaware of this subtlety when programming a number-intensive application, or an application that involves Office, you will run into hard-to-detect bugs sooner or later.

.Net provides some convenience methods (for instance Format() in VB.Net and ToString() in C#) for you to format a number for presentation:

22.225.ToString("##.##") returns a string "22.23"

but the rounding used is Away From Zero - again, different from the default of Math.Round().
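
A sketch showing the behaviours side by side (decimal literals are used so that 22.225 is held exactly and the midpoint is not muddied by base-2 representation):

using System;

class RoundingDemo
{
    static void Main()
    {
        Console.WriteLine(Math.Round(22.225m, 2));                                // 22.22
        Console.WriteLine(Math.Round(22.225m, 2, MidpointRounding.ToEven));       // 22.22
        Console.WriteLine(Math.Round(22.225m, 2, MidpointRounding.AwayFromZero)); // 22.23

        Console.WriteLine(22.225m.ToString("##.##")); // 22.23 - away from zero
    }
}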

If you are programming with .Net, it is likely that you also use MS SQL Server, which rounds away from zero as well.

I summarise the rounding differences in the following table:

Excel/Word formula ROUND()    ->  Away From Zero
VBA Round()                   ->  To Even (Banker's)
VBA Format()                  ->  Away From Zero
.Net Math.Round() (default)   ->  To Even (Banker's)
.Net ToString()/Format()      ->  Away From Zero
SQL Server ROUND()            ->  Away From Zero

So if you program in .Net against Excel/Word or SQL Server, make sure you know when to use:

Math.Round(number, decimal_places, MidpointRounding.AwayFromZero)

And if you use the .Net formatting methods, make sure the numbers appear consistently everywhere across your application.


Tuesday, 25 May 2010

No Implicit Conversion When Passing Parameters By Reference

We all know that when passing a parameter to a method by reference using the ref keyword, two things are important:

1). the variable must be definitely assigned before it is passed.
2). passing it does not create a new storage location (for reference types) or a new copy (for value types).

However, one easily overlooked rule of passing a parameter by ref (you will struggle to find it called out on MSDN) is that the declared type of the variable passed in must exactly match the declared type of the parameter in the method. For example, if you have a method like this:

public void RefMe(ref ValueType pop)
{
}

then the following code will give you a compile error:

int number = 3;
RefMe(ref number);

Even though int is a subtype of ValueType, implicit conversion is not possible when passing a parameter by reference. The variable's declared type must match the method's parameter type exactly.

Don't attempt the following either; it will emit a compile error, because a ref argument must be a variable, not a cast expression:

RefMe(ref (ValueType)number); // compile error

You have to do the casting before the method call:

int number = 3;
ValueType val = number;
RefMe(ref val);
or
ValueType number = 3;
RefMe(ref number);

but be aware that for value types, declaring another variable creates a new copy of the data (here, a boxed copy of the int), which may defeat the purpose of passing the variable by reference. For reference types, though, this is fine.
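
A short sketch of the value-type pitfall, reusing the RefMe() method from above:

using System;

class RefDemo
{
    static void RefMe(ref ValueType pop)
    {
        pop = 42; // re-points the caller's variable at a new boxed int
    }

    static void Main()
    {
        int number = 3;
        ValueType val = number; // boxing copies number's current value
        RefMe(ref val);

        Console.WriteLine(val);    // 42
        Console.WriteLine(number); // 3 - the original variable is unaffected
    }
}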


Wednesday, 12 May 2010

C# Array Literal?

C# allows you to declare an array in the following way:

string[] strArray = { "a","b" };

which gives you the illusion that { "a","b" } is an array literal, just like "good string" is a string literal, because you can declare a string like this:

string myString = "good string";

However, if you pass { "a","b" } directly to a method that takes a string array as a parameter, e.g. String.Split(), you will get a compile error; the array has to be wrapped in new[] (see the example at the end of this post).

{ "a","b" } is just a compiler shortcut for declaring an array, in the same way that var is. It is not an array literal. In fact, if you use var in combination with { "a","b" } as below, you will get a compile error, because the compiler cannot infer the type you want to declare.

var myString = {"a", "b"};

You have to specify the type explicitly on either side of the assignment:

string[] myString = {"a", "b"};

or

var myString = new[] {"a", "b"};


The easiest way to remember this is that when declaring an array, a pair of square brackets ([]) is mandatory! So, if you want to pass an array as a parameter to a method without giving the array a variable name, use this form:

new[]{ , , ,.....}
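
For example, a sketch of passing an inline array to String.Split():

using System;

class SplitDemo
{
    static void Main()
    {
        // the new[] form lets us pass the array without naming it
        string[] parts = "1a2b3".Split(new[] { "a", "b" },
                                       StringSplitOptions.RemoveEmptyEntries);

        foreach (string part in parts)
            Console.WriteLine(part); // 1, 2, 3
    }
}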


Monday, 12 April 2010

When to use var in C#

The keyword "var" was introduced with C# 3.0. It allows you to declare a variable without having to specify the type within a method boy, as long as the variable is initialised in the same statement.

Its usage has been promoted by several parties, including Microsoft (implicitly) and ReSharper (explicitly). For example, some code snippets in Visual Studio, e.g. foreach, use var; and ReSharper by default suggests replacing explicit variable types with var where possible.

The use of var certainly provides convenience (and is a necessity in the case of anonymous types), especially when the type name is long and the type is apparent, e.g.

ExecutionEngineException exception = new ExecutionEngineException();

You will be delighted to use var knowing that the compiler will infer the type for you in the above instance:

var exception = new ExecutionEngineException();

However, in cases where the type is not apparent, e.g. when calling a method, it is better to specify the type explicitly, e.g.

List<string> files = myClass.GetOpenFiles();

rather than using var:

var files = myClass.GetOpenFiles();


You may argue that this is not a big deal, since in Visual Studio you can hover the mouse cursor over the method name and the return type will pop up. But that is extra effort you may want to spare when reading the source code, and if you have several variables declared using var, the effort builds up.

More importantly, some methods may return a type that is counterintuitive given their name. For example, Enum.GetValues() returns a non-generic Array whose elements come out as object, rather than a typed collection of the enum's values as the method name suggests.

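Consider the following sketch (the enum and method names are illustrative assumptions):

using System;

enum Day { Mon = 1, Tue = 2, Wed = 3 }

class VarDemo
{
    static void PrintValueRight()
    {
        // int makes the foreach unbox each value to its underlying number
        foreach (int value in Enum.GetValues(typeof(Day)))
            Console.WriteLine(value);
    }

    static void PrintValueWrong()
    {
        // var infers object, so WriteLine falls back to the enum name
        foreach (var value in Enum.GetValues(typeof(Day)))
            Console.WriteLine(value);
    }

    static void Main()
    {
        PrintValueRight();
        PrintValueWrong();
    }
}
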
In this code snippet, PrintValueRight(), which uses the type (int) explicitly, will print

1
2
3

PrintValueWrong(), which uses var, will print

Mon
Tue
Wed

So to avoid confusion and bugs, I suggest using var only with anonymous types and with variables initialised through constructors, where the type is unambiguous.