Recursive Hierarchical Joins in C# and LINQ

source: Recursive Hierarchical Joins in C# and LINQ

this guy made a really good Hierarchical JOINS LINQ method. Have a look!

using System;
using System.Collections.Generic;
using System.Linq;

public static class RecursiveJoinExtensions
{
   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, IEnumerable<TResult>, TResult> resultSelector)
   {
      return RecursiveJoin(source, parentKeySelector, childKeySelector,
         resultSelector, Comparer<TKey>.Default);
   }

   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, int, IEnumerable<TResult>, TResult> resultSelector)
   {
      return RecursiveJoin(source, parentKeySelector, childKeySelector,
         (TSource element, int depth, int index, IEnumerable<TResult> children)
            => resultSelector(element, index, children));
   }

   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, IEnumerable<TResult>, TResult> resultSelector,
      IComparer<TKey> comparer)
   {
      return RecursiveJoin(source, parentKeySelector, childKeySelector,
         (TSource element, int depth, int index, IEnumerable<TResult> children)
            => resultSelector(element, children), comparer);
   }

   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, int, IEnumerable<TResult>, TResult> resultSelector,
      IComparer<TKey> comparer)
   {
      return RecursiveJoin(source, parentKeySelector, childKeySelector,
         (TSource element, int depth, int index, IEnumerable<TResult> children)
            => resultSelector(element, index, children), comparer);
   }

   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, int, int, IEnumerable<TResult>, TResult> resultSelector)
   {
      return RecursiveJoin(source, parentKeySelector, childKeySelector,
         resultSelector, Comparer<TKey>.Default);
   }

   public static IEnumerable<TResult> RecursiveJoin<TSource, TKey, TResult>(this IEnumerable<TSource> source,
      Func<TSource, TKey> parentKeySelector,
      Func<TSource, TKey> childKeySelector,
      Func<TSource, int, int, IEnumerable<TResult>, TResult> resultSelector,
      IComparer<TKey> comparer)
   {
      // prevent source being enumerated more than once per RecursiveJoin call
      source = new LinkedList<TSource>(source);

      // fast binary search lookup
      SortedDictionary<TKey, TSource> parents = new SortedDictionary<TKey, TSource>(comparer);
      SortedDictionary<TKey, LinkedList<TSource>> children
         = new SortedDictionary<TKey, LinkedList<TSource>>(comparer);

      foreach (TSource element in source)
      {
         parents[parentKeySelector(element)] = element;

         LinkedList<TSource> list;

         TKey childKey = childKeySelector(element);

         if (!children.TryGetValue(childKey, out list))
         {
            children[childKey] = list = new LinkedList<TSource>();
         }

         list.AddLast(element);
      }

      // initialize to null otherwise compiler complains at single line assignment
      Func<TSource, int, IEnumerable<TResult>> childSelector = null;

      childSelector = (TSource parent, int depth) =>
      {
         LinkedList<TSource> innerChildren = null;

         if (children.TryGetValue(parentKeySelector(parent), out innerChildren))
         {
            return innerChildren.Select((child, index)
               => resultSelector(child, index, depth, childSelector(child, depth + 1)));
         }

         return Enumerable.Empty<TResult>();
      };

      return source.Where(element => !parents.ContainsKey(childKeySelector(element)))
         .Select((element, index) => resultSelector(element, index, 0, childSelector(element, 1)));
   }
}
Advertisements

Query Syntax, Method Syntax, Standard Query Operators, and Extension Methods in LINQ

Query Syntax and Method Syntax in LINQ (C#)

Standard Query Operators Overview

Extension Methods (C# Programming Guide)

* Pop Quiz for developers: what’s an alternative method syntax (using Lambda expression) of this query syntax? the answer is in one of the linked pages above. =).

var query = from word in words
            group word.ToUpper() by word.Length into gr
            orderby gr.Key
            select new { Length = gr.Key, Words = gr };

Type Relationships in LINQ Query Operations (C#)

source: Type Relationships in LINQ Query Operations (C#) from MSDN LINQ (C#) pages

As a developer, you will need to understand what occurs behind the scenes when variables are implicitly typed by using var. LINQ query operations are strongly typed in the data source, in the query itself, and in the query execution. The type of the variables in the query must be compatible with the type of the elements in the data source and with the type of the iteration variable in the foreach statement. This strong typing guarantees that type errors are caught at compile time when they can be corrected before users encounter them.

Queries that do not Transform the source data: The following illustration shows a LINQ to Objects query operation that performs no transformations on the data. The source contains a sequence of strings and the query output is also a sequence of strings.Relation of data types in a LINQ query

1. The type argument of the data source determines the type of the range variable.
2. The type of the object that is selected determines the type of the query variable. Here name is a string. Therefore, the query variable is an IEnumerable<string>.
3. The query variable is iterated over in the foreach statement. Because the query variable is a sequence of strings, the iteration variable is also a string.

Queries that Transform the source data: The following illustration shows a LINQ to SQL query operation that performs a simple transformation on the data. The query takes a sequence of Customer objects as input, and selects only the Name property in the result. Because Name is a string, the query produces a sequence of strings as output.

A query that transforms the data type

1. The type argument of the data source determines the type of the range variable.
2. The select statement returns the Name property instead of the complete Customer object. Because Name is a string, the type argument of custNameQuery is string, not Customer.
3. Because custNameQuery is a sequence of strings, the foreach loop’s iteration variable must also be a string.

The following illustration shows a slightly more complex transformation. The select statement returns an anonymous type that captures just two members of the original Customer object.

A query that transforms the data type

1. The type argument of the data source is always the type of the range variable in the query.
2. Because the select statement produces an anonymous type, the query variable must be implicitly typed by using var.
3. Because the type of the query variable is implicit, the iteration variable in the foreach loop must also be implicit.

Let the compiler infer the type! : Although you should understand the type relationships in a query operation, you do have the option to let the compiler do all the work for you. The keyword var can be used for any local variable in a query operation. The following illustration is exactly equivalent to example number 2 that was discussed earlier. The only difference is that the compiler will supply the strong type for each variable in the query operation:

Type flow with implicit typing