Saturday, May 26, 2012

Code Query and Rule over LINQ

Yesterday, after two years of relentless development effort, we finally released NDepend v4. Personally, I consider this version the biggest milestone we’ve ever achieved. The three flagship features are:
  • Code query and rule over LINQ (CQLinq)
  • NDepend.API to let users develop their own static analyzers (+14 OSS Power Tools proposed on top of NDepend.API)
  • VS11 addin support
Developing these features while meeting the following demanding requirements was a real challenge:
  • Feature richness: The code query feature set was already rich in prior versions, but we wanted it to be even richer, so that all the code queries and rules existing users asked us for can be written easily.
  • Syntax: Queries must be easy to write and to read. Fortunately, most NDepend users are familiar with the C# LINQ syntax, but the proposed code model API still needed to be LINQ-friendly. We started with a three-month research effort to define the cleanest syntax for expressing a few dozen essential code queries. A few natural syntax enhancements were also developed (like the prefix warnif count > 0 to transform a code query into a code rule).
  • Performance: We wanted on the order of a hundred queries compiled and executed per second against a large real-world code base, because we want them to run often in VS without slowing down the IDE.
  • Usability: We wanted query editing to be seamless thanks to code completion, live tooltip documentation and detailed error reporting. All this without the power of Roslyn, which is not yet RTM. Also, for v3 users, we’ve developed an automatic CQL to CQLinq converter (and CQL is still supported).
One simple CQLinq code rule that illustrates all of this well is the following one.

// <Name>Base class should not use derivatives</Name>
warnif count > 0
from baseClass in JustMyCode.Types
where baseClass.IsClass && baseClass.NbChildren > 0 // <-- for optimization!
let derivedClassesUsed = baseClass.DerivedTypes.UsedBy(baseClass)
where derivedClassesUsed.Count() > 0
select new { baseClass, derivedClassesUsed }


Hopefully the syntax is simple enough to convey the underlying meaning to any .NET developer. Only the JustMyCode keyword might need an explanation. It is a facility that excludes generated code elements, which are often pesky false positives for code rules.
After an instantaneous compilation and execution phase (1 millisecond), the result is displayed with facilities for browsing and exporting it.

Today, I’d like to focus a bit more on the syntax aspect. Developing for LINQ and with LINQ during the last two years has been a great joy. Everybody agrees that LINQ is very elegant, but it is also a super-extensible technology.
We extended LINQ in several different ways. One way has been to develop a LINQ-friendly fluent API. We often found it convenient to write default code rules with a mix of both Query Expression and Query Operator syntaxes. Take the following rule for example.

// <Name>UI layer shouldn't use directly DB types</Name>
warnif count > 0
// UI layer is made of types in namespaces using a UI framework
let uiTypes = Application.Namespaces.UsingAny(
   Assemblies.WithNameIn("PresentationFramework", "System.Windows",
                         "System.Windows.Forms", "System.Web")
).ChildTypes()
// You can easily customize this line to define what are DB types.
let dbTypes = ThirdParty.Assemblies.WithNameIn("System.Data", "EntityFramework",
                                               "NHibernate").ChildTypes()
              // Ideally, even usage of DataSet and associated types should be forbidden from the UI layer:
              // http://stackoverflow.com/questions/1708690/is-list-better-than-dataset-for-ui-layer-in-asp-net
              .Except(Types.WithNameIn("DataSet", "DataTable", "DataRow"))
from uiType in uiTypes.UsingAny(dbTypes)
let dbTypesUsed = dbTypes.Intersect(uiType.TypesUsed)
select new { uiType, dbTypesUsed }


  • First, we define two layers with two fluent sub-queries (expressed with the operator syntax): the UI layer (types in namespaces using any UI framework) and the DB layer (types declared in the DB framework assemblies).
  • Then we use more fluent query operator syntax to check whether the UI layer is using the DB layer.
  • Finally, the whole query is structured with the query expression syntax.
In a few lines of code, we fluently express a fairly complex and popular requirement, in a generic way that can be adapted to any situation.
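
To make the mix of the two syntaxes concrete outside of NDepend, here is a minimal sketch in plain LINQ to Objects. It is not CQLinq and does not use NDepend.API: the CodeType record, the LayeringDemo class and the App.UI / App.Data namespace prefixes are hypothetical stand-ins chosen for illustration, and the record syntax assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a code model element; not an NDepend.API type.
public sealed record CodeType(string Name, string Namespace, string[] TypesUsed);

public static class LayeringDemo
{
    // Returns each UI type together with the names of the DB types it uses.
    public static List<(string UiType, string[] DbTypesUsed)> FindViolations(
        IEnumerable<CodeType> types)
    {
        var all = types.ToList();

        // Operator (fluent) syntax: define the two layers by namespace prefix.
        var uiTypes = all.Where(t => t.Namespace.StartsWith("App.UI")).ToList();
        var dbTypeNames = all.Where(t => t.Namespace.StartsWith("App.Data"))
                             .Select(t => t.Name)
                             .ToHashSet();

        // Query expression syntax structures the whole query, with an
        // operator call (Intersect) embedded in the let clause.
        var query =
            from uiType in uiTypes
            let dbTypesUsed = uiType.TypesUsed.Intersect(dbTypeNames).ToArray()
            where dbTypesUsed.Length > 0
            select (UiType: uiType.Name, DbTypesUsed: dbTypesUsed);

        return query.ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var types = new[]
        {
            new CodeType("MainForm", "App.UI", new[] { "Customer", "SqlRepository" }),
            new CodeType("AboutBox", "App.UI", new[] { "Customer" }),
            new CodeType("SqlRepository", "App.Data", new[] { "Customer" }),
            new CodeType("Customer", "App.Domain", Array.Empty<string>()),
        };

        foreach (var (uiType, dbTypesUsed) in LayeringDemo.FindViolations(types))
            Console.WriteLine($"{uiType} uses {string.Join(", ", dbTypesUsed)}");
    }
}
```

With this sample data only MainForm is reported, because it is the only UI type that references a type from the data layer. The shape is the same as in the rule above: sub-queries in operator syntax feed a query expression that does the final projection.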
For a few years now, the code metric C.R.A.P (Change Risk Analyzer and Predictor) has become increasingly popular in the Java community thanks to the crap4J plugin. The C.R.A.P metric was introduced by Alberto Savoia in this Artima article dated October 2007. It is a mathematical formula that helps determine which pieces of code are both complex and poorly covered by tests. Since both the code coverage and cyclomatic complexity metrics are exposed by the NDepend.API code model, it is fairly easy to write a CQLinq code rule to match crappy code:

// <Name>C.R.A.P method code metric</Name>
// Change Risk Analyzer and Predictor (i.e. CRAP) code metric
// This code metric helps in pinpointing overly complex and untested code.
// Reference: http://www.artima.com/weblogs/viewpost.jsp?thread=215899
// Formula: CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
warnif count > 0
from m in JustMyCode.Methods
// Don't match too short methods
where m.NbLinesOfCode > 10
let CC = m.CyclomaticComplexity
let uncov = (100 - m.PercentageCoverage) / 100f
let CRAP = (CC * CC * uncov * uncov * uncov) + CC
where CRAP != null && CRAP > 30
orderby CRAP descending, m.NbLinesOfCode descending
select new { m, CRAP, CC, uncoveredPercentage = uncov*100, m.NbLinesOfCode }
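
To get a feel for the numbers the formula produces, here is a small standalone C# sketch that evaluates it for a few inputs. The CrapMetric class and its Score method are hypothetical helpers written for this illustration; they are not part of NDepend.API.

```csharp
using System;

public static class CrapMetric
{
    // CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
    // complexity: cyclomatic complexity of the method
    // coveragePercent: test coverage of the method, from 0 to 100
    public static double Score(int complexity, double coveragePercent)
    {
        double uncovered = 1.0 - coveragePercent / 100.0;
        return complexity * complexity * Math.Pow(uncovered, 3) + complexity;
    }

    public static void Main()
    {
        Console.WriteLine(Score(10, 0));    // 10^2 * 1     + 10 = 110
        Console.WriteLine(Score(10, 50));   // 10^2 * 0.125 + 10 = 22.5
        Console.WriteLine(Score(10, 100));  // 10^2 * 0     + 10 = 10
    }
}
```

With the threshold of 30 used in the rule above, a fully covered method is matched only when its cyclomatic complexity exceeds 30, while a method with no coverage at all is matched from a complexity of 6 onward (6^2 + 6 = 42, whereas 5^2 + 5 = 30 is not above the threshold).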


A popular feature of NDepend is the ability to diff two snapshots of a code base to explore what has changed. Since this feature is completely integrated with CQLinq, it is now possible to write simple code queries that match complex evolution requirements. One requirement that immediately comes to mind is that a class 100% covered by tests should remain 100% covered by tests, whether or not it has been touched. The following CQLinq code rule detects classes that are no longer 100% covered by tests (since the predefined baseline), and lists the culprit methods, i.e. the methods that are not 100% covered:

// <Name>Types that used to be 100% covered but not anymore</Name>
warnif count > 0
from t in JustMyCode.Types where
   t.IsPresentInBothBuilds() &&
   t.OlderVersion().PercentageCoverage == 100 &&
   t.PercentageCoverage < 100
let culpritMethods = t.Methods.Where(m => m.PercentageCoverage < 100)
select new {t, t.PercentageCoverage, culpritMethods }
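
The comparison at the heart of this rule can be sketched in plain C# without NDepend. In the sketch below each snapshot is simply a dictionary from type name to coverage percentage; this representation, the CoverageRegression class and the NoLongerFullyCovered method are assumptions made for illustration, not how NDepend models its snapshots.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CoverageRegression
{
    // Both dictionaries map a type name to its coverage percentage (0 to 100).
    // Returns the names of types that were 100% covered in the baseline
    // and are below 100% in the current snapshot, sorted by name.
    public static List<string> NoLongerFullyCovered(
        IReadOnlyDictionary<string, double> baseline,
        IReadOnlyDictionary<string, double> current)
    {
        return current
            // TryGetValue keeps only types present in both snapshots.
            .Where(e => baseline.TryGetValue(e.Key, out var oldCoverage)
                        && oldCoverage == 100
                        && e.Value < 100)
            .Select(e => e.Key)
            .OrderBy(name => name)
            .ToList();
    }

    public static void Main()
    {
        var baseline = new Dictionary<string, double>
        {
            ["Parser"] = 100, ["Lexer"] = 100, ["Logger"] = 80,
        };
        var current = new Dictionary<string, double>
        {
            ["Parser"] = 92.5, ["Lexer"] = 100, ["Logger"] = 70, ["Cache"] = 40,
        };

        foreach (var name in NoLongerFullyCovered(baseline, current))
            Console.WriteLine(name);
    }
}
```

In this sample only Parser is reported: Lexer is still fully covered, Logger was never at 100%, and Cache does not exist in the baseline.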


Hopefully CQLinq is a simple answer to many requirements that formerly demanded significant effort (imagine the effort of developing the crap4J tool compared to the effort of writing a single CQLinq query). You can freely download NDepend and try CQLinq live on your code base, and here you can browse all 200 default code rules.
In future posts I’ll dig into the low-level implementation tricks needed to implement all this.

