AvocadoSoftware.com

Derick Bailey's old blog archives - go to derickbailey.com for new content

Federated IDs and checking for equality (Overriding the Equals method)

My current project integrates directly into one of my company's main products through a messaging system. Each installation of the new project has its own local database, code base, etc., and runs independently of the main product. This project also works with a subset of the data from the main product - some of the information is pulled straight out of the main product, and some of it can be created and manipulated from the new project. The end result is that the main product is the system of record and this new project is considered to be a volatile snapshot that provides some additional functionality (including disconnected operations).

Given that short overview and the need to operate in a disconnected mode, we chose to implement our own local database IDs for our objects, as well as make use of the master system's IDs for synchronization between the two systems. The end result is a basic federated ID setup - the code in the new project can uniquely identify an object based on its local database instance ID, the master system's ID (if the item has been sync'd to / pulled from the master system), or a combination of both. This federated ID created some very interesting challenges in determining whether or not two objects are equal, and it has caused some strange data access bugs and user interface bugs.

For example, we regularly got errors from NHibernate stating that an object with the same ID already exists in the session. We also recently noticed that a lot of the Telerik DataGridViews that we use were showing the correct number of items, but duplicating the data from one item to the next - in other words, there would be 3 unique items in the database and the Telerik grid would show three items, but the three items would appear the same in the grid. Making it even more fun, when we viewed the details of an item in the grid, the details view showed the correct data - not the duplicated data from the grid view.

After some debugging and R&D work on the issues, I found that all of the problems were caused by our override of the "Equals" method in our C# objects. It turns out that the simple checks we were making on the objects did not cover every possible scenario of the federated IDs, or even the simple scenarios of checking object references.

Here is a simple example of one of the issues that we had:

Code Listing 1: FederatedObject with "Equals" bugs

public class FederatedObject
{
    private int _localId = 0;
    private int? _masterId = null;

    public int LocalId
    {
        get { return _localId; }
        set { _localId = value; }
    }

    public int? MasterId
    {
        get { return _masterId; }
        set { _masterId = value; }
    }

    public override bool Equals(object obj)
    {
        bool areEqual;
        FederatedObject fed = obj as FederatedObject;

        if (fed != null)
        {
            areEqual = (fed.LocalId == LocalId || fed.MasterId == MasterId);
        }
        else
            areEqual = ReferenceEquals(this, obj);

        return areEqual;
    }

    public override int GetHashCode()
    {
        if (LocalId > 0)
            return LocalId;
        else
            return base.GetHashCode();
    }

}

This is fairly simple code and covered a majority of the use cases that we had. It did not cover all scenarios, though. When the original code was written, the number of scenarios to test was misunderstood. Upon further reflection, though, it became apparent that the number of scenarios to test can be represented by a mathematical equation:

Where n is the number of federated IDs, the number of scenarios to test is equal to: n^(n+1)

In this simple scenario, we have two fields that can represent the uniqueness of an object: LocalId and MasterId. Thus, our equation becomes 2^3, giving us a result of 8 - we have 8 scenarios that need to be tested. Given that the "Equals" operation returns a boolean result, each scenario would normally need two tests - one expecting true and one expecting false. However, a few of the scenarios can only ever produce an unequal result, so their equality test is invalid.

I've been trying to work out the actual equation to represent the number of invalid scenarios, but I have not had much luck yet. For the 2 federated IDs in our scenario, though, the actual number of invalid tests ends up being 4. With the 8 scenarios multiplied by the 2 tests (equal and unequal), subtracting the 4 invalid tests, we are left with a total of 12 tests that need to be performed.

Additionally, a C# reference can be null, and there is an off chance that a non-"FederatedObject" may be passed into the Equals method. Thus, we need to add two additional unit tests, bringing the grand total of tests up to 14.
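As a worked tally for the two-ID case, using only the figures quoted above (the scenario formula is the one given in this post, not a general result):

```latex
\begin{align*}
\text{scenarios} &= n^{\,n+1} = 2^{3} = 8 \\
\text{tests on FederatedObject pairs} &= 8 \times 2 - 4 = 12 \\
\text{total tests} &= 12 + 2 = 14
\end{align*}
```

The final 2 are the null reference test and the non-FederatedObject test.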

Here is the list of possible scenarios that need to be unit tested, specifying which ones have an equality and/or inequality test:

  1. Comparing two objects that are saved only in the local database
    1. Equality
    2. Inequality
  2. Comparing two objects that are saved in the local and master database
    1. Equality
    2. Inequality
  3. Comparing one object that is saved in only the local database with one transient object (not saved anywhere, yet)
    1. Inequality only
  4. Comparing one object that is saved in the local and master database with one transient object
    1. Inequality only
  5. Comparing one object that is saved in only the local database with one saved in the local and master database
    1. Equality
    2. Inequality
  6. Comparing one object that is saved in only the local database with one saved in only the master database
    1. Inequality only
  7. Comparing two objects that are only saved in the master database
    1. Equality
    2. Inequality
  8. Comparing two transient objects
    1. Inequality only
  9. Comparing one object in any state to a null reference
    1. Inequality only
  10. Comparing one object in any state to a non-null non-FederatedObject reference
    1. Inequality only

The existing code in the FederatedObject falls quite short of handling all of these scenarios. Table 1 shows a matrix of the test scenarios with the check for equality and inequality, run against the code listed above.

Table 1: Unit Test Results from Code Listing 1

Scenario       Equality    Inequality
Scenario 1     Pass        Fail
Scenario 2     Pass        Pass
Scenario 3     n/a         Fail
Scenario 4     n/a         Pass
Scenario 5     Pass        Pass
Scenario 6     n/a         Pass
Scenario 7     Pass        Fail
Scenario 8     n/a         Fail
Scenario 9     n/a         Pass
Scenario 10    n/a         Pass

The results show that 4 of the 14 tests failed (roughly 29%). Looking back at Code Listing 1, the reason for these failures can be seen in the two assignments to areEqual in the Equals method, repeated in Code Listing 2:

Code Listing 2: The issues in the Equals method

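    public override bool Equals(object obj)
    {
        bool areEqual;
        FederatedObject fed = obj as FederatedObject;

        if (fed != null)
        {
            // Issue: compares the IDs without checking for unsaved (zero, negative, or null) values
            areEqual = (fed.LocalId == LocalId || fed.MasterId == MasterId);
        }
        else
            // Issue: the reference check only runs when the cast fails
            areEqual = ReferenceEquals(this, obj);

        return areEqual;
    }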

On their own, there is nothing syntactically or technically wrong with these lines. It is only in the complex scenarios identified by our unit tests that we see the real issues. There are also a few assumptions that we need to make about the data access systems that we are using. Namely, any ID that is null or less than or equal to zero belongs to an object that has not been saved in the corresponding data storage system. With this in mind, the issues in these two lines become more apparent.

  • fed.LocalId and/or LocalId may be zero or less than zero. This is not accounted for.
  • fed.MasterId and/or MasterId may be zero or less than zero. This is not accounted for.
  • fed.MasterId and/or MasterId may be null (note the use of "int?" as the datatype - C#'s shorthand for a nullable value type). This is not accounted for.
  • ReferenceEquals should always be checked, for safety, not just when the "fed" reference is null.
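The first three points can be reproduced in isolation. The following is a minimal sketch, not code from the project: NaiveIdsMatch and LiftedComparisonDemo are hypothetical names, and the method simply applies the same comparison as Code Listing 1 to raw ID values. It relies on the fact that C#'s == operator on two "int?" values returns true when both are null.

```csharp
using System;

public static class LiftedComparisonDemo
{
    // Hypothetical helper: the same comparison as Code Listing 1,
    // applied to raw ID values instead of FederatedObject instances.
    public static bool NaiveIdsMatch(int localA, int? masterA, int localB, int? masterB)
    {
        return localA == localB || masterA == masterB;
    }

    public static void Main()
    {
        // Scenario 1 (inequality): two different local-only objects.
        // The local IDs differ, but null == null is true for int?, so they "match".
        Console.WriteLine(NaiveIdsMatch(1, null, 2, null)); // True

        // Scenario 7 (inequality): two different master-only objects.
        // Both local IDs are still 0, so 0 == 0 makes them "match".
        Console.WriteLine(NaiveIdsMatch(0, 10, 0, 20)); // True

        // Scenario 2 (inequality): every ID is populated and different.
        Console.WriteLine(NaiveIdsMatch(1, 10, 2, 20)); // False
    }
}
```

The first two calls correspond to failing rows in Table 1; the third corresponds to a passing row.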

To account for these issues in the code, some refactoring needs to be done. Considering the complex nature of the checks, it is also recommended that the checks be abstracted out into smaller method calls, keeping the Equals method as clean as possible. Code Listing 3 shows the corrected Equals method and the additional method calls for clarity.

Code Listing 3: The corrected Equals method and additional methods

public override bool Equals(object obj)
{
    bool areEqual;

    // Always check the reference, regardless of the type of obj
    bool referenceIsEqual = ReferenceEquals(this, obj);
    FederatedObject fed = obj as FederatedObject;

    if (fed != null)
    {
        // A match on either ID (after the saved-value checks) is enough for equality
        bool localIdsAreEqual = CheckLocalIds(this, fed);
        bool masterIdsAreEqual = CheckMasterIds(this, fed);
        areEqual = (localIdsAreEqual || masterIdsAreEqual || referenceIsEqual);
    }
    else
        // obj is null or is not a FederatedObject
        areEqual = referenceIsEqual;

    return areEqual;
}

private static bool CheckLocalIds(FederatedObject fed1, FederatedObject fed2)
{
    bool areEqual = false;
    if (fed1.LocalId > 0 && fed2.LocalId > 0)
    {
        if (fed1.LocalId == fed2.LocalId)
        {
            areEqual = true;
        }
    }
    return areEqual;
}

private static bool CheckMasterIds(FederatedObject fed1, FederatedObject fed2)
{
    bool areEqual = false;

    if (fed1.MasterId.HasValue && fed2.MasterId.HasValue)
    {
        if (fed1.MasterId.Value > 0 && fed2.MasterId.Value > 0)
        {
            if (fed1.MasterId.Value == fed2.MasterId.Value)
            {
                areEqual = true;
            }
        }
    }

    return areEqual;
}

The code in the additional methods has been purposely expanded into multiple if-then statements to highlight all of the checks that must be done. Note that the checks for MasterId not only check the value, but also check for nulls, since the data type allows null.

After re-running the 14 unit tests that we identified earlier, we now have 100% passed tests. Table 2 shows the new results.

Table 2: Unit Test Results with corrected Equals method

Scenario       Equality    Inequality
Scenario 1     Pass        Pass
Scenario 2     Pass        Pass
Scenario 3     n/a         Pass
Scenario 4     n/a         Pass
Scenario 5     Pass        Pass
Scenario 6     n/a         Pass
Scenario 7     Pass        Pass
Scenario 8     n/a         Pass
Scenario 9     n/a         Pass
Scenario 10    n/a         Pass
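A few of these rows can be spot-checked without a test framework. The sketch below is an illustration rather than the project's actual unit tests: the class is a condensed copy of the corrected logic from Code Listing 3 (auto-properties and early returns are my shorthand), GetHashCode is carried over unchanged from Code Listing 1, and SpotCheck is a hypothetical name.

```csharp
using System;

public class FederatedObject
{
    public int LocalId { get; set; }
    public int? MasterId { get; set; }

    // Condensed form of the corrected Equals from Code Listing 3.
    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj)) return true;
        FederatedObject fed = obj as FederatedObject;
        if (fed == null) return false;
        return CheckLocalIds(this, fed) || CheckMasterIds(this, fed);
    }

    // Taken as-is from Code Listing 1.
    public override int GetHashCode()
    {
        return LocalId > 0 ? LocalId : base.GetHashCode();
    }

    // Local IDs only count when both objects are saved locally (ID > 0).
    private static bool CheckLocalIds(FederatedObject fed1, FederatedObject fed2)
    {
        return fed1.LocalId > 0 && fed2.LocalId > 0 && fed1.LocalId == fed2.LocalId;
    }

    // Master IDs only count when both are non-null and greater than zero.
    private static bool CheckMasterIds(FederatedObject fed1, FederatedObject fed2)
    {
        return fed1.MasterId.HasValue && fed2.MasterId.HasValue
            && fed1.MasterId.Value > 0 && fed2.MasterId.Value > 0
            && fed1.MasterId.Value == fed2.MasterId.Value;
    }
}

public static class SpotCheck
{
    public static void Main()
    {
        var localOnlyA = new FederatedObject { LocalId = 1 };
        var localOnlyB = new FederatedObject { LocalId = 2 };
        var masterOnlyA = new FederatedObject { MasterId = 10 };
        var masterOnlyB = new FederatedObject { MasterId = 10 };

        Console.WriteLine(localOnlyA.Equals(localOnlyB));   // Scenario 1, inequality: False
        Console.WriteLine(masterOnlyA.Equals(masterOnlyB)); // Scenario 7, equality: True
        Console.WriteLine(new FederatedObject().Equals(new FederatedObject())); // Scenario 8: False
        Console.WriteLine(localOnlyA.Equals(null));         // Scenario 9: False
        Console.WriteLine(localOnlyA.Equals("not a FederatedObject")); // Scenario 10: False
    }
}
```

The checks for scenarios 1 and 8 are two of the rows that failed in Table 1.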

The end result of these changes is that the new system my team is working on now has far fewer data access and user interface issues. We no longer have data grid rows showing duplicate data, and we are no longer seeing the dreaded "object with the same id already exists" error from NHibernate. The path to get to this point was rough, and it really opened my eyes to the complexity of equality checking in federated ID systems. Hopefully I'll remember this for the future and not go through the same headaches and issues again.

Published Wednesday, January 09, 2008 3:44 PM by dredge