The science of MSBuild ⚗️ 6 June 2018

How to express build dependencies using MSBuild Targets, Inputs and Outputs

Add this to the list of things I did not expect to be writing…

Dependency graph

I prefer Tundra over MSBuild because it does not hide the fact that the build system is using a directed acyclic graph (DAG) to figure out when to build what.

With MSBuild, each target is a node with inputs and outputs. When you define a target you define a DAG node; if you want MSBuild to invalidate that target when the inputs it depends on change, you have to list them in the target's Inputs attribute. As a quirk, MSBuild requires that you specify Outputs as well, since it compares the timestamps of the two to support incremental builds.

Targets without input and output information are wonky; MSBuild won’t invoke a code generation target correctly without them.
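
As a minimal sketch of such a node (the target name and both file names are made up for illustration), the target below declares one input and one output. MSBuild should skip it when the output is already newer than the input, and run it again once template.txt changes:

```xml
<Project>
  <!-- Hypothetical example: CopyTemplate, template.txt and the output path are placeholders. -->
  <Target Name="CopyTemplate"
          Inputs="$(MSBuildProjectDirectory)\template.txt"
          Outputs="$(MSBuildProjectDirectory)\obj\template.copy.txt">
    <!-- Runs only when the input is newer than the output, or the output is missing. -->
    <Copy SourceFiles="$(MSBuildProjectDirectory)\template.txt"
          DestinationFiles="$(MSBuildProjectDirectory)\obj\template.copy.txt" />
  </Target>
</Project>
```

Saved as a standalone project file, this can be run with the /t:CopyTemplate switch; running it twice in a row should report the target as skipped the second time.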

Code generation tool + code generation + code

Let’s say we have two projects in a solution. Project A (PA) is a code generation tool and Project B (PB) depends on code generated by this tool.

The build should progress like this:

  • build PA
  • run PA
  • build PB

The way we accomplish this with MSBuild is to hook a target in ahead of the BeforeBuild target, like this [CodeGeneration.targets]:

<Project>
  <PropertyGroup>
    <DataCompiler>$(SolutionDir)sql-data-model-compiler\bin\$(Configuration)\net461\sql-data-model-compiler.exe</DataCompiler>
    <DataCompilerOutput>$(SolutionDir)generated\db.generated.cs</DataCompilerOutput>
  </PropertyGroup>

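  <!-- Fail early with a clear message if the data compiler has not been built yet. -->
  <Target Name="BeforeGenerateCode"
          BeforeTargets="GenerateCode"
          Condition="!Exists('$(DataCompiler)')">
    <Error Text="data compiler '$(DataCompiler)' not found" />
  </Target>

  <!-- Regenerate the code only when the data compiler is newer than its output. -->
  <!-- The executable path is quoted so that a SolutionDir containing spaces still works. -->
  <Target Name="GenerateCode"
          BeforeTargets="BeforeBuild"
          Inputs="$(DataCompiler)"
          Outputs="$(DataCompilerOutput)">
    <Exec Command="&quot;$(DataCompiler)&quot; -conn &quot;$(SolutionDir)connection-string.txt&quot; -output &quot;$(DataCompilerOutput)&quot;" />
  </Target>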
</Project>

☠️ The careful reader will note that I did not put the file $(SolutionDir)connection-string.txt in the target Inputs. It should be there. This is just me not caring, and the build will not pick up changes to connection-string.txt. If I do modify connection-string.txt I will have to do a full rebuild.
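
For completeness, a sketch of what the more careful version might look like. Inputs accepts a semicolon-separated list, so the connection string file can simply be appended; this variant of the GenerateCode target would replace the one in CodeGeneration.targets and is an illustration, not something that was part of the original setup:

```xml
<Target Name="GenerateCode"
        BeforeTargets="BeforeBuild"
        Inputs="$(DataCompiler);$(SolutionDir)connection-string.txt"
        Outputs="$(DataCompilerOutput)">
  <!-- Same command as before; only the Inputs list changed. -->
  <Exec Command="&quot;$(DataCompiler)&quot; -conn &quot;$(SolutionDir)connection-string.txt&quot; -output &quot;$(DataCompilerOutput)&quot;" />
</Target>
```

With both files listed, touching either the tool or connection-string.txt should make the output out of date and trigger the target on the next build.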

We can then put an Import directive in our original project file. Keeping the targets in a separate file makes them editable without having to unload and reload the project file from within Visual Studio.

<Project Sdk="Microsoft.NET.Sdk">
  <Import Project="CodeGeneration.targets" />
  <PropertyGroup>
    <TargetFramework>net461</TargetFramework>
  </PropertyGroup>
</Project>

⚠️ NOTE: DO NOT FORGET TO FIX THE PROJECT DEPENDENCIES… RIGHT-CLICK THE SOLUTION IN VISUAL STUDIO AND OPEN THE ‘PROJECT DEPENDENCIES…’ DIALOG. MAKE SURE THE CODE GENERATION TOOL IS A DEPENDENCY OF THE PROJECT REFERENCING THE OUTPUT.

If you forget to do this, MSBuild will run the code generation tool build in parallel with the project consuming the output and you will be invoking the code generation tool from the previous build (not fresh). You will again find yourself, needlessly, spamming that rebuild button.
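
The same ordering can also be recorded in the consuming project file instead of only in the solution. A minimal sketch, assuming the tool's project file lives at ..\sql-data-model-compiler\sql-data-model-compiler.csproj (a hypothetical path, adjust it to the real layout): a ProjectReference with ReferenceOutputAssembly set to false asks MSBuild to build the tool first without adding its assembly as a compile reference. Whether this is enough to cure the stale-tool problem in this particular setup is untested.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <Import Project="CodeGeneration.targets" />
  <PropertyGroup>
    <TargetFramework>net461</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <!-- Hypothetical path to the code generation tool's project file. -->
    <!-- ReferenceOutputAssembly="false": build it first, but do not reference its assembly. -->
    <ProjectReference Include="..\sql-data-model-compiler\sql-data-model-compiler.csproj"
                      ReferenceOutputAssembly="false" />
  </ItemGroup>
</Project>
```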

With all of this configured, MSBuild will build the code generation tool, the generated code and the project that depends on the generated code as needed. A lot of this has to do with the fact that we are building the code generation tool in the same solution that also uses it to generate code. This is something MSBuild doesn’t do very well.

📝 Edit: Oh, and none of this works from within Visual Studio. You have to run msbuild on the command line without the /m option. If anyone knows of a way to fix this please contact me.

The mindfuck that is MSBuild

  • If a Target has no Inputs, i.e. the inputs fail to resolve to an existing file, you get nothing. The target is not run and MSBuild does not generate a warning message. The error information is hidden in a 3K log file that you only get if you run msbuild with the additional command-line options /v:d /fl1 /noconlog.
  • MSBuild looks like XML but it is definitely not XML. It has its own special characters that require additional escaping. Arbitrary XML element names are used to introduce MSBuild concepts like items and properties, making schema validation and statement completion difficult and/or impossible.
  • The order of evaluation matters and it is a complete mystery to me how MSBuild successfully arrives at any order of evaluation at all. For example, when adding build extension targets using BeforeTargets="BeforeBuild", project metadata such as ProjectDir is blank until the Target is run. How, then, do I refer to files relative to ProjectDir when I have to define Inputs and Outputs for code generation targets to actually work? The only answer I managed to find was to use MSBuildProjectDirectory, because apparently, it has a value… 😞
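
Two of these points can be illustrated in one small fragment. This is a sketch under stated assumptions: the property name GeneratedFile and the target name ShowPaths are invented for the example. MSBuildProjectDirectory is a reserved property that holds the directory of the project file (without a trailing slash) from the start of evaluation, and MSBuild's own special characters are escaped as a percent sign followed by the character's hexadecimal code.

```xml
<Project>
  <PropertyGroup>
    <!-- GeneratedFile is a hypothetical property name. -->
    <!-- MSBuildProjectDirectory has no trailing slash, so the separator is added here. -->
    <GeneratedFile>$(MSBuildProjectDirectory)\generated\db.generated.cs</GeneratedFile>
  </PropertyGroup>

  <!-- ShowPaths is a hypothetical target that only prints the values. -->
  <Target Name="ShowPaths">
    <Message Importance="high" Text="Generated file: $(GeneratedFile)" />
    <!-- %24 is an escaped '$' and %3B an escaped ';', so MSBuild does not expand them. -->
    <Message Importance="high" Text="Literal text: %24(NotAProperty)%3B done" />
  </Target>
</Project>
```

A property defined this way can then be used in Inputs and Outputs without waiting for ProjectDir to be populated.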

All this nonsense can be traced back to XML. MSBuild is a mindfuck because a declarative format alone is insufficient to represent a complex build. Declarative is the right idea but XML is/was the wrong approach. Here Tundra stands apart because it is both declarative and imperative, with well-defined semantics, because it is built in part on the Lua scripting language.