Alternative logging for Azure services

Some people say that cloud services are just web services rebranded. Since I’ve always defined web services inclusively, I tend to agree. But I do think that the emerging notion of cloud computing leads us toward greater abstraction of resources. And as I build out the Azure service described in this series, one of the resources I’m abstracting more than ever is the file system.

I’ve built online services for many years, and have always been aggressive about logging everything they do. Detailed logs assure me that things are working well, or help me figure out why they’re not.

My services have always done logging in the grand Unix tradition: They append lines of text to log files. I probe those files using tools like tail (to peek at the end) and grep (to search).

You can’t do things that way in the Azure cloud. True, the default logging mechanism writes to blobs, which are the moral equivalent of named files containing arbitrary streams of bytes. But your service can’t just open the log and append to it. Instead it calls a method, RoleManager.WriteToLog, that sends a message to the log. And in order to peek at the most recent entries, or search the log, you have to download one or more quarter-hourly blobs, then parse the XML records inside them.
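To make the contrast concrete, here's a rough Python sketch of the grep-a-log-blob workflow. The blob and element names are illustrative stand-ins, not the actual Azure log schema; the point is that searching means downloading, parsing, and filtering rather than just running grep:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML records of the kind found inside a downloaded
# quarter-hourly log blob; the element names here are illustrative,
# not the actual Azure schema.
blob_xml = """<LogEntries>
  <LogEntry>
    <Timestamp>2009-01-29T12:42:01Z</Timestamp>
    <Message>DoHttpRequest: ProtocolError</Message>
  </LogEntry>
  <LogEntry>
    <Timestamp>2009-01-29T12:47:13Z</Timestamp>
    <Message>retrying request</Message>
  </LogEntry>
</LogEntries>"""

def grep_log(xml_text, needle):
    """The moral equivalent of grep, but only after you've already
    downloaded the blob and parsed its XML records."""
    root = ET.fromstring(xml_text)
    return [entry.findtext('Message')
            for entry in root.iter('LogEntry')
            if needle in entry.findtext('Message')]

print(grep_log(blob_xml, 'ProtocolError'))
```

And to search across time you'd have to repeat that for every quarter-hour blob in the window you care about.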

So instead I’m using a cloud database. Or actually, three of them. One is Amazon’s SimpleDB, the second is the SQL Server-based SQL Data Services, and the third is Azure’s table storage. I figure my service will generate a lot of real data over the long haul, and it’ll be interesting to compare these services as they evolve and grow — and as the types and quantities of data I’m logging grow along with them.

In all three cases my pattern is the same. For each service, I’m wrapping a thin C# library around the HTTP/REST interface. The existing wrappers I’ve found, for Azure and SQL Data Services in particular, tend to hide the HTTP/REST interfaces. But I want to be able to see and touch them.

When I’m developing software that relies on a web-based infrastructure service, I want to be able to access that service in as many ways as I can. When I get stuck I can drop down to the HTTP level, and there I can triangulate on a problem in many complementary ways: from the command line using curl, from Python, from C#, from an HTTP sniffer.
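For example, one way to triangulate is to build, in a few lines of Python, the same GET request the C# wrapper will issue, and compare it to what curl sends. This is a sketch: the URI shape follows the elmcity/events example later in this post, and the helper name is mine, not part of any SDK:

```python
from urllib.parse import quote

def sds_query_uri(authority, container, query):
    # Assemble the query URI by hand, mirroring what the C# wrapper
    # does, so the raw HTTP request is visible and comparable.
    base = 'https://%s.data.database.windows.net/v1/%s' % (authority, container)
    return base + '?q=' + quote(query)

uri = sds_query_uri('elmcity', 'events', 'from e in entities select e')
print(uri)
# The same probe from the command line:
#   curl "https://elmcity.data.database.windows.net/v1/events?q=from%20e%20in%20entities%20select%20e"
```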

Another pattern common to all three logging mechanisms is mixed use of statically- and dynamically-typed languages. Although I’ve written these interface libraries in C# in order to deploy on Azure, I use them from both C# and Python. When my Azure-based service logs its activities, it invokes the structured-storage interfaces from C#. But I invoke the same interfaces from IronPython to view, query, and analyze the logs. And if IronPython becomes a service provider on the Azure platform, as I hope it will, I’ll invoke the same interfaces again to write, as well as read, the logs created by those services.

So, for example, the current version of my SQL Data Services interface library is here. (Corresponding tests here.)

This is the method that writes a log message:

public static void sds_write_log_message(string type, string message,
    string data)
  {
  var dt = Utils.XsdDateTimeFromDateTime(DateTime.Now);
  sds_flex_entity[] entities =
    {
    new sds_flex_entity("type","string",type),
    new sds_flex_entity("message","string",message),
    new sds_flex_entity("datetime","dateTime",dt),
    new sds_flex_entity("data","string", data != null ? data : "")
    };

  string id = "id_" + System.DateTime.Now.Ticks.ToString();
  var sr = create_entity("elmcity","events", "Event", id, entities);
  }

And here’s create_entity which invokes the REST API:

public static sds_response create_entity(string authority, 
    string container, string entityname, string entityid, 
    sds_flex_entity[] entities)
  {
  byte[] payload = make_sds_entity_payload(entityname, entityid, 
    entities);
  var response = DoSdsRequest(authority, container, null, "POST", 
    payload);
  return get_sds_response(response, false, entityname, null, null);
  }

The container for this set of records is called events, and it lives in an authority called elmcity, so the request URI will be https://elmcity.data.database.windows.net/v1/events, and the body of the HTTP POST request will look like this:

<s:Event xmlns:s='http://schemas.microsoft.com/sitka/2008/03/'
    xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xmlns:x='http://www.w3.org/2001/XMLSchema'>
<s:Id>id_012345</s:Id>
<type xsi:type='x:string'>exception</type>
<message xsi:type='x:string'>DoHttpRequest: ProtocolError</message>
<datetime xsi:type='x:dateTime'>2009-01-29T12:42:01</datetime>
<data xsi:type='x:string'>400 Bad Request</data>
</s:Event>
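Here's a simplified Python sketch of how a payload like that gets assembled. It's a stand-in for the C# make_sds_entity_payload, not the real thing; the sitka namespace URI comes from the example above, and namespace declarations are trimmed for brevity:

```python
def make_entity_payload(entityname, entityid, entities):
    # entities is a list of (name, xsd_type, value) triples, mirroring
    # the sds_flex_entity array in the C# code.
    s_ns = 'http://schemas.microsoft.com/sitka/2008/03/'
    lines = ["<s:%s xmlns:s='%s'>" % (entityname, s_ns)]
    lines.append("<s:Id>%s</s:Id>" % entityid)
    for (name, xsd_type, value) in entities:
        lines.append("<%s xsi:type='x:%s'>%s</%s>" % (name, xsd_type, value, name))
    lines.append("</s:%s>" % entityname)
    return '\n'.join(lines)

payload = make_entity_payload('Event', 'id_012345',
    [('type', 'string', 'exception'),
     ('message', 'string', 'DoHttpRequest: ProtocolError')])
print(payload)
```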

Here’s the method that queries a container and returns a package of results:

public static sds_response query_entities(string authority,
    string container, bool in_ns, string entity,
    List<string> entitynames, string query)
  {
  var response = DoSdsRequest(authority, container, query);
  return get_sds_response(response, in_ns, entity, entitynames, null);
  }

In the current version of the SQL Data Services query syntax, here’s how you ask for recent log entries:

from e in entities 
  where e["datetime"] >= "2009-01-29T14:00:00" 
  orderby e["datetime"] ascending 
  select e

The HTTP request again goes to https://elmcity.data.database.windows.net/v1/events, but this time it’s a GET not a POST, and the full URI ends with “?q=” plus the “from e in entities…” query from above.

The HTTP response is a sequence of XML packets like the <Event>...</Event> example shown above, filtered and ordered by the query. The query_entities method transforms those into a list of name/value collections.
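In Python terms, that transformation looks roughly like this. The sample response reuses the Event record shown earlier, with namespaces trimmed for brevity, and the function is a sketch of what get_sds_response does, not its actual code:

```python
import xml.etree.ElementTree as ET

# A sample query response: a sequence of Event packets (namespaces
# omitted here to keep the sketch short).
response_xml = """<Events>
  <Event>
    <Id>id_012345</Id>
    <type>exception</type>
    <message>DoHttpRequest: ProtocolError</message>
    <data>400 Bad Request</data>
  </Event>
</Events>"""

def events_to_dicts(xml_text):
    # Turn each <Event> packet into a generic name/value collection.
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in event}
            for event in root.iter('Event')]

records = events_to_dicts(response_xml)
print(records[0]['message'])
```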

That list of collections is accessible from C# code running inside the Azure service, but equally accessible from IronPython running outside Azure. Here’s an IronPython script that finds events logged within the last 2 hours whose data field starts with ‘400’:

import clr
clr.AddReference("CalendarAggregator")
from CalendarAggregator import *
import System

sds = SdsStorage()

flexentities = ('type','message','datetime','data')
_flexentities = System.Collections.Generic.List[str](flexentities)

dt = System.DateTime.Now
dt_diff = System.TimeSpan.FromHours(2)
dt_str = Utils.XsdDateTimeFromDateTime( dt - dt_diff )

q = 'from e in entities where e["datetime"] >= "%s" orderby \
      e["datetime"] ascending select e' % dt_str

sr = SdsStorage.query_entities("elmcity","events", False, \
  "Event", _flexentities, q )

results = filter ( lambda x: x['data'].startswith('400'), sr.response )

for d in results:
  print d['type'],d['message'],d['datetime'], d['data']

The details of the SQL Data Services query syntax aren’t important here. What matters is the strategy:

  1. Make a thin wrapper around the REST interface to the query service
  2. Use the available query syntax to produce raw results
  3. Capture the results in generic data structures
  4. Refine the raw results using a dynamic language

You can use this strategy with any of the emerging breed of cloud databases.

8 Comments

  1. Just a question – if the logs are XML, why not use XQuery and XSLT to “search” the XML and display what matches?

  2. > why not use XQuery and XSLT to “search”
    > the XML and display what matches?

    You can indeed do that with the log files created by the standard Azure API. But you have to download the files to search them, you’d need to combine them to search across time, and at some point you’d want indexing capability to search over long periods of time.

    One of the reasons I’m particularly interested in SQL Data Services is that it’s SQL Server under the covers, and has an XML query engine that will become more available as the service matures.
