Test-driven development in the Azure cloud

In part one of this series I gave an overview of my current project to recreate the elmcity.info calendar aggregator on the Azure platform. In this installment I’ll focus on test-driven development in Azure.

Because I’m doing the core aggregator in C#, I’m using the popular NUnit software to automate the running of my test suite. It’s standard stuff if you’re familiar with the XUnit approach. But if you’re not a programmer, I’ll briefly explain. I think it’s worthwhile because the ideas that inform test-driven programming are an aspect of computational thinking that everyone could generalize from and apply in a variety of useful ways.

A primer on test-driven development

Let’s focus on one small piece of code, a method called AddTrustedEventfulContributor, which implements part of the trusted-feed mechanism I outlined in Databasing trusted feeds with del.icio.us.

As I explained there, when the aggregator’s scan of Eventful events within 15 miles of Keene finds an unknown contributor, as was true recently for Beau Bristow, it creates a del.icio.us record with the tags new, eventful, and contributor. If I decide to trust Beau, I can just change the new tag to trusted by hand. But eventually I’ll want to automate that, so an administrator needn’t remember the tagging convention or worry about making an error.

So AddTrustedEventfulContributor creates (or updates) a del.icio.us bookmark for the URL eventful.com/users/beaubristow/created/events, and ensures that it’s tagged with trusted, eventful, and contributor.

Once the method is written, and seems to work, how can we be sure that it continues to work? The environment is dynamic. The code supporting the method is evolving. And so is the code supporting the del.icio.us and Eventful services it orchestrates. We want to be able to test the method continuously, and verify that it keeps on doing what we expect.

The code to be tested is defined in a file called Delicious.cs, like so:

public static Utils.http_response_struct 
    AddTrustedEventfulContributor(string contrib)
  {
  return AddTrustedContributor(contrib, "eventful");
  }

private static Utils.http_response_struct 
    AddTrustedContributor(string contrib, string service)
  {
  contrib = contrib.Replace(' ', '+');
  var bookmark_url = build_bookmark_url(contrib, service);
  string tags = "trusted+contributor+" + service;
  string args = string.Format("&url={0}&tags={1}&description={2}", 
    bookmark_url,   tags, contrib);
  var url = string.Format("{0}/posts/add?{1}", apibase, args);
  return do_request_with_url(url);
  }

Tests are defined in a parallel file, DeliciousTest.cs, like so:

[TestFixture]
public class DeliciousTest
  {
  private const string contrib = "xyzas 'dfbyas234";

  [Test]
  public void t1_addTrustedEventfulContributor()
    {
    Utils.http_response_struct response = 
      Delicious.AddTrustedEventfulContributor(contrib);
    Assert.AreEqual(HttpStatusCode.OK, response.normal_status);
    Assert.That(isSuccessfulDeliciousOperation(response));
    Assert.That(Delicious.isTrustedEventfulContributor(contrib));
    }

The test calls Delicious.AddTrustedEventfulContributor with the fictitious contributor xyzas 'dfbyas234, and makes three assertions about the outcome. First, we should get the expected OK status code from del.icio.us. Second, we should get the expected XML response. And third, the expected tags should actually have been applied to the bookmark for xyzas 'dfbyas234.
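The second assertion relies on a helper, isSuccessfulDeliciousOperation, whose body isn’t shown here. As a rough sketch only, such a check might parse the response body and look for a success marker. The class and method names below are hypothetical, the helper takes the raw XML string rather than the http_response_struct, and the assumption that a successful call answers with a result element whose code attribute is "done" should be checked against the del.icio.us API documentation.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not the author's isSuccessfulDeliciousOperation.
public static class DeliciousResponseSketch
{
    // Assumption: success looks like <result code="done" />.
    public static bool IsDone(string xml)
    {
        if (string.IsNullOrEmpty(xml))
            return false;
        try
        {
            XElement root = XDocument.Parse(xml).Root;
            if (root == null || root.Name.LocalName != "result")
                return false;
            XAttribute code = root.Attribute("code");
            return code != null && code.Value == "done";
        }
        catch (XmlException)
        {
            // Body was not well-formed XML, so treat it as a failure.
            return false;
        }
    }
}
```

Parsing the XML instead of searching the string for "done" means an error page that happens to contain that word doesn’t pass the test by accident.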

Like other XUnit software, NUnit provides a few different ways to run tests. Everyone’s favorite is the GUI testrunner, which displays a tree of test sets (fixtures) and tests, with green and red indicators for pass and fail. The indicators produce a Pavlovian response: You want to see them stay green, and will work obsessively to keep them that way.

The Azure twist

So far this is all standard stuff, but here’s the Azure twist. For a while I was using the GUI testrunner, and then deploying — first to the local Azure development “fabric” and then to the cloud. But the GUI testrunner’s environment isn’t quite the same as Azure. I was reminded of that fact when I added a serialization method to the aggregator.

The original Python-based service uses a binary serialization technique that Pythonistas call pickling. It’s a convenient way to freeze-dry and rehydrate data structures that don’t need to be stored in a queryable or transactional database. You can do the same thing in other programming environments, including Perl, Java, and .NET.

So I implemented .NET-style binary serialization for some intermediate data, and pushed these binary files into the Azure blob store. My NUnit test of this method ran green, but when I deployed into the local fabric it failed. Oh, right. The fabric’s security rules, as I mentioned last time, are different, and stricter than the defaults on your local machine.

Here’s the original serializer, which works outside Azure but not inside:

public void serialize(string container, string file,
  List<evt> events)
  {
  var serializer = new BinaryFormatter();
  var ms = new MemoryStream();
  serializer.Serialize(ms, events);
  var chars = Encoding.UTF8.GetChars(ms.ToArray());
  ms.Close();
  write_to_azure_blob(container, file, new string(chars));
  }

The culprit is the BinaryFormatter: its call to Serialize is where Azure throws a security exception. Thanks to a clue provided by Brendan Enrick I found this alternate, XML-oriented approach which doesn’t trigger a security exception:

public void serialize(string container, string file,
  List<evt> events)
  {
  var serializer = new XmlSerializer(typeof(List<evt>));
  var stringBuilder = new StringBuilder();
  var writer = XmlWriter.Create(stringBuilder);
  serializer.Serialize(writer, events);
  byte[] buffer = Encoding.UTF8.GetBytes(stringBuilder.ToString());
  write_to_azure_blob(container, file, buffer);
  }

And that’s how these intermediate files are now being written.
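A serializer is only useful if the data can be read back, so a round-trip check is a natural companion test. Here is a minimal sketch of one, under stated assumptions: EvtSketch is a hypothetical stand-in for the aggregator’s evt type (XmlSerializer needs a public type with a parameterless constructor), and the methods work on byte arrays in memory instead of calling write_to_azure_blob.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the aggregator's evt type.
public class EvtSketch
{
    public string title;
    public DateTime start;
}

public static class XmlRoundTripSketch
{
    // Serialize the list to XML bytes, ready to hand to a blob writer.
    public static byte[] ToBytes(List<EvtSketch> events)
    {
        var serializer = new XmlSerializer(typeof(List<EvtSketch>));
        using (var ms = new MemoryStream())
        {
            serializer.Serialize(ms, events);
            return ms.ToArray();
        }
    }

    // Rebuild the list from bytes previously produced by ToBytes.
    public static List<EvtSketch> FromBytes(byte[] buffer)
    {
        var serializer = new XmlSerializer(typeof(List<EvtSketch>));
        using (var ms = new MemoryStream(buffer))
        {
            return (List<EvtSketch>)serializer.Deserialize(ms);
        }
    }
}
```

An NUnit test could then assert that FromBytes(ToBytes(events)) yields a list with the same count and the same titles as the original.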

At this point I realized that, in order to test things properly, NUnit would have to migrate into the Azure fabric. It’s designed to be embedded in a variety of hosts, but I’ve never tried doing that. Here’s what I learned.

Running NUnit in Azure

The first step, as expected, was to make sure that the NUnit code could even load in Azure’s partial-trust environment. As shipped, it doesn’t. The DLLs won’t load in Azure’s local fabric, or in the cloud. If you’re wondering whether a DLL will or won’t load, Keith Brown’s FindAPTC tool will tell you. It checks DLLs to see if the Allow Partially Trusted Callers attribute is turned on. As I collect components for use in Azure, I find that they often don’t flip that switch.
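To make the idea concrete, here is a rough sketch of the kind of check involved. It is not how FindAPTC is implemented, and the class and method names are hypothetical. It simply asks, via reflection, whether an assembly carries the AllowPartiallyTrustedCallers attribute, and it is meant to be run on a fully trusted development machine, not inside the fabric.

```csharp
using System;
using System.Reflection;
using System.Security;

// Hypothetical helper for auditing DLLs before deploying them.
public static class AptcCheckSketch
{
    // True if the assembly is marked [assembly: AllowPartiallyTrustedCallers].
    public static bool AllowsPartiallyTrustedCallers(Assembly assembly)
    {
        return assembly.IsDefined(
            typeof(AllowPartiallyTrustedCallersAttribute), false);
    }

    // Convenience overload that loads the DLL from a path first.
    public static bool AllowsPartiallyTrustedCallers(string path)
    {
        return AllowsPartiallyTrustedCallers(Assembly.LoadFrom(path));
    }
}
```

Calling AptcCheckSketch.AllowsPartiallyTrustedCallers on each candidate DLL before deployment would flag the ones that need to be rebuilt with the attribute turned on.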

The solution is to visit files like this one and change them from this:

using System;
using System.Reflection;

[assembly: CLSCompliant(true)]

[assembly: AssemblyDelaySign(false)]
[assembly: AssemblyKeyFile("../../../../nunit.snk")]
[assembly: AssemblyKeyName("")]

To this:

using System;
using System.Reflection;
using System.Security;

[assembly: CLSCompliant(true)]

[assembly: AssemblyDelaySign(false)]
[assembly: AssemblyKeyFile("../../../../nunit.snk")]
[assembly: AssemblyKeyName("")]
[assembly: AllowPartiallyTrustedCallers()]

The needed assemblies turned out to be nunit.core.dll, nunit.core.interfaces.dll, nunit.framework.dll, and nunit.testutilities.dll. After I rebuilt them with the APTC attribute turned on, they loaded.

But I wasn’t home free. I found a couple of things that triggered runtime security exceptions. Here’s one, in this file:

public class DirectorySwapper : IDisposable
  {
  private string savedDirectoryName;
  public DirectorySwapper() : this( null ) { }
  public DirectorySwapper( string directoryName )
    {
    savedDirectoryName = Environment.CurrentDirectory;
    if ( directoryName != null && directoryName != string.Empty )
      Environment.CurrentDirectory = directoryName;
    }
  public void Dispose()
    {
    Environment.CurrentDirectory = savedDirectoryName;
    }
  }

The assignments to Environment.CurrentDirectory fail because the Azure trust policy, a “variation on the standard ASP.NET medium trust policy,” prevents changes to environment variables.
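One way a class like this could be made to tolerate partial trust, sketched here under assumptions and not something NUnit itself does, is to catch the SecurityException and carry on without switching directories. TolerantDirectorySwapper is a hypothetical name, and this has not been tried in the Azure fabric. It also assumes the tests don’t depend on the directory change actually taking effect.

```csharp
using System;
using System.Security;

// Hypothetical variant of DirectorySwapper that degrades gracefully.
public class TolerantDirectorySwapper : IDisposable
{
    private readonly string savedDirectoryName;
    private readonly bool swapped;

    public TolerantDirectorySwapper(string directoryName)
    {
        if (string.IsNullOrEmpty(directoryName))
            return;
        try
        {
            savedDirectoryName = Environment.CurrentDirectory;
            Environment.CurrentDirectory = directoryName;
            swapped = true;
        }
        catch (SecurityException)
        {
            // The trust policy refused; stay in the current directory.
        }
    }

    public void Dispose()
    {
        if (!swapped)
            return;
        try
        {
            Environment.CurrentDirectory = savedDirectoryName;
        }
        catch (SecurityException)
        {
            // Nothing sensible to do if restoring is refused as well.
        }
    }
}
```

On a fully trusted machine this behaves like the original; under a stricter policy it becomes a no-op where the original would throw.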

The other offender appears here:

private static Assembly FrameworkAssembly
  {
  get
    {
    if (frameworkAssembly == null)
    foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
      if (assembly.GetName().Name == "nunit.framework" ||
        assembly.GetName().Name == "NUnitLite")
          {
          frameworkAssembly = assembly;
          break;
          }
    return frameworkAssembly;
    }
  }

Because the Azure trust policy places restrictions on reflection, whereby code inspects (and perhaps modifies) itself, these calls to GetName trigger security exceptions. In this case, I believe NUnit is using reflection to segregate its own DLLs from the DLLs under test, in order to keep its internal bookkeeping straight.

My solution to both of these problems was naive and heavy-handed. I just commented out the handful of cases where NUnit tries to change the current directory, or find out if a DLL is one of its own or not. With those changes in place, here’s my Azure-embedded testrunner:

private static void doTests()
  {
  var suites = new Type[] {
    typeof(BlobStorageTest),
    typeof(DeliciousTest),
    typeof(EventCollectorTest),
    typeof(EventStoreTest),
    typeof(FeedRegistryTest),
    typeof(UtilsTest),
    };
 	
  var fixtures = new List<TestFixture>();
 	
  foreach (var suite in suites)
    fixtures.Add(TestBuilder.MakeFixture(suite));
 	
  string report = string.Format("NUnit Tests at {0}\n\n", 
    DateTime.Now.ToString());

  foreach (var fixture in fixtures)
    {
    TestSuiteResult results = (TestSuiteResult)fixture.Run(
      new NullListener());
    foreach (TestResult result in results.Results)
      {
      report += string.Format("{0}\n",result.Name);
      if ( ! result.IsSuccess )
        report += string.Format("{0}\n",result.Message);
      report += "\n";
      }
    }

  var bs = new BlobStorage();
  bs.put_blob("events", "nunit.txt", Encoding.UTF8.GetBytes(report));
  }

The aggregator is currently running on a 12-hour cycle. Every time it wakes up, it runs tests and writes this report before it collects events. (It’s a no-news-is-good-news-style report, so if all is well you’ll just see a list of tests.)

Conclusions

It’s nice to know that the aggregator will now test itself continuously, in its production environment. When you park a service in the cloud, you want all the feedback you can get. Constant flows of log data and test reports are essential in order to know that things are working correctly, or to find out why they’re not.

Although these methods are always advisable, I’ll admit I was lazy about them in the current version of the service. It’s running on a Linux box that I can ssh into and poke around on whenever I want. The same would be true if it were running on Amazon EC2. With Azure, as with Google’s App Engine, things are different. The execution environment is more of a black box. You can’t just jump in there and poke around. I miss that.

On the other hand, the black box architecture forces me to rethink some basic assumptions. Should my service expect to be able to modify environment variables? Should it even expect to communicate directly with a file system? We’ve always done things that way, but cloud computing invites us to move to a new level of abstraction. As always, that shift brings challenges along with opportunities.

I’m really of two minds about this. It is frustrating not to be able to use NUnit, unmodified, in Azure. I’m not sure what the effects of my surgery really are, or in what other ways NUnit may yet be incompatible with Azure. A mode of Azure that runs fully trusted code, and even allows EC2-style use of raw virtual machines, would be a wonderful option.

And yet … I haven’t been stymied so far. And part of me wants to embrace constraints in order to gain flexibility at another level of the stack.

From the comments on part one of this series:

“Either give me a machine in the cloud to work on or don’t (anything less is censorship)”

I’d rather have the opportunity to self-censor. And on Amazon EC2 I have that opportunity. That said, when I’ve used EC2 VMs I have been running as root. Why? No good reason, just path of least resistance.

Do you routinely run as root on your personal box, and on hosted boxes? If so, you can do that on EC2, and I suspect you’ll be able to on raw Azure VMs too. But setting the default to something less potent is, well, think about it. Have you ever condemned Microsoft for not being secure by default? How do you square that with condemning Microsoft for being secure by default?

More broadly, the cloud environment is going to challenge a lot of long-held assumptions in what I think will be useful ways. Less so for raw VM hosting a la Amazon, more so for the kinds of “fabrics” of which App Engine and Azure are examples.

That said, although I think it’s useful to challenge assumptions about access to environment variables and file systems, I chafe at the restrictions on reflection. My original plan was to use IronPython for this service, because I believe that the flexibility of dynamic languages will be a key asset in the dynamic environment of the cloud. Currently I’m using IronPython in auxiliary and complementary ways, outside of Azure, as I’ll explain in another installment. Meanwhile I’m finding that C# is becoming more and more dynamic. But reflection is at the core of that dynamism. I’m no expert on this subject, but I’ll be interested to hear what those who are think about the tradeoffs that Azure’s trust policy entails.

4 Comments

  1. luckily, Azure now seems to support Full trust! Have you perhaps tested if nunit now runs unmodified?
