Familiar idioms in Perl, Python, JavaScript, and C#

When I started working on the elmcity project, I planned to use my language of choice in recent years: Python. But early on, IronPython wasn’t fully supported on Azure, so I switched to C#. Later, when IronPython became fully supported, there was really no point in switching my core roles (worker and web) to it, so I’ve proceeded in a hybrid mode. The core roles are written in C#, and a variety of auxiliary pieces are written in IronPython.

Meanwhile, I’ve been creating other auxiliary pieces in JavaScript, as will happen with any web project. The other day, at the request of a calendar curator, I used JavaScript to prototype a tag summarizer. This was so useful that I decided to make it a new feature of the service. The C# version was so strikingly similar to the JavaScript version that I just had to set them side by side for comparison:

JavaScript (counting the tags):

var tagdict = new Object();

for ( var i = 0; i < obj.length; i++ )
  {
  var evt = obj[i];
  if ( evt["categories"] != undefined )
    {
    var tags = evt["categories"].split(',');
    for ( var j = 0; j < tags.length; j++ )
      {
      var tag = tags[j];
      if ( tagdict[tag] != undefined )
        tagdict[tag]++;
      else
        tagdict[tag] = 1;
      }
    }
  }

C# (counting the tags):

var tagdict = new Dictionary<string, int>();

foreach (var evt in es.events)
  {
  if (evt.categories != null)
    {
    var tags = evt.categories.Split(',');
    foreach (var tag in tags)
      {
      if (tagdict.ContainsKey(tag))
        tagdict[tag]++;
      else
        tagdict[tag] = 1;
      }
    }
  }

JavaScript (sorting the keys by count):

var sorted_keys = [];

for ( var tag in tagdict )
  sorted_keys.push(tag);

sorted_keys.sort(function(a,b) { return tagdict[b] - tagdict[a]; });

C# (sorting the keys by count):

var sorted_keys = new List<string>();

foreach (var tag in tagdict.Keys)
  sorted_keys.Add(tag);

sorted_keys.Sort( (a, b) => tagdict[b].CompareTo(tagdict[a]) );

The idioms involved here include:

  • Splitting a string on a delimiter to produce a list

  • Using a dictionary to build a concordance of strings and occurrence counts

  • Sorting an array of keys by their associated occurrence counts

I first used these idioms in Perl. Later they became Python staples. Now here they are again, in both JavaScript and C#.
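For reference, the three idioms listed above can be folded into one small JavaScript function. This is a minimal sketch, not the service's actual code: the name summarizeTags and its return shape are made up for illustration, and the event shape (an object with a comma-separated categories string) is assumed from the examples in this post.

```javascript
// Hypothetical helper: count comma-separated tags and rank them by frequency.
// Assumes each event looks like { categories: "tag1,tag2" } or has no categories.
function summarizeTags(events) {
  // A bare dictionary with no inherited keys.
  var counts = Object.create(null);

  events.forEach(function (evt) {
    if (evt.categories == null) return;            // skips null and undefined
    // Idiom 1: split a string on a delimiter to produce a list.
    evt.categories.split(',').forEach(function (tag) {
      // Idiom 2: build a concordance of strings and occurrence counts.
      counts[tag] = (counts[tag] || 0) + 1;
    });
  });

  // Idiom 3: sort the keys by their associated counts, highest first.
  var sortedKeys = Object.keys(counts);
  sortedKeys.sort(function (a, b) { return counts[b] - counts[a]; });

  return { counts: counts, sortedKeys: sortedKeys };
}

var summary = summarizeTags([
  { categories: 'music,kids' },
  { categories: 'music' },
  { categories: null },
  { categories: 'music,kids,art' }
]);
console.log(summary.sortedKeys);
```

Like the code in the post, the sketch does not trim whitespace around tags, so "music" and " music" would be counted separately.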

22 thoughts on “Familiar idioms in Perl, Python, JavaScript, and C#”

  1. David W.

    Hi Jon,

    As an improvement to the body of the C# foreach, you can avoid one lookup and one branch via:

    {
    int value;
    tagdict.TryGet(tag);
    tagdict[tag] = value + 1;
    }

    Similarly with Javascript:

    {
    tagdict[tag] = (tagdict[tag] || 0) + 1;
    }

  2. Jon Udell Post author

    Thanks!

    One reason I posted, of course, was to invite somebody to show me a better way, as you have.

    Two minor tweaks in the C# version:

    1. TryGetValue, not TryGet
    2. Must pass value as an out param

    So:

    tagdict.TryGetValue(tag, out value);

    This is actually a nice illustration of one of the subtle downsides of carrying idioms forward. When you look for ways that map directly to your prior experience, you tend not to discover newer/better ways that have emerged since you first learned the idiom.
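    The JavaScript side offers a comparable illustration. A plain object used as a dictionary inherits properties such as "constructor", so a tag with that name looks like an existing entry and the count goes wrong. The sketch below assumes an ES2015 runtime where Map is available; the function names countWithObject and countWithMap are made up for illustration.

```javascript
// Counting with a plain object, as in the post: inherited properties such as
// "constructor" pass the "!= undefined" test even though no tag was stored.
function countWithObject(tags) {
  var tagdict = new Object();
  tags.forEach(function (tag) {
    if (tagdict[tag] != undefined)
      tagdict[tag]++;
    else
      tagdict[tag] = 1;
  });
  return tagdict;
}

// Counting with a Map (ES2015): only keys that were explicitly set exist.
function countWithMap(tags) {
  var counts = new Map();
  tags.forEach(function (tag) {
    counts.set(tag, (counts.get(tag) || 0) + 1);
  });
  return counts;
}

var sampleTags = ['music', 'constructor', 'music'];
console.log(countWithObject(sampleTags).constructor);    // NaN
console.log(countWithMap(sampleTags).get('constructor')); // 1
```

    Ordinary tags behave the same in both versions; only names that collide with inherited properties differ.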

  3. David W.

    Sleepy today, thanks for the correction.

    Your note on the downsides is so true. As a newbie (1 year ago) to Javascript, perhaps the first thing I learned was how to define ‘classes’ using prototypes.

    This resulted in missing the benefits of closures in many cases, whereupon later rewriting some code, reduced it from 200 lines to 30 (really). As we mature, it becomes more difficult to approach learning with the efficiency of a child.
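    As a rough illustration of the difference (a hypothetical tag counter, not the code described in this comment), here is the same small piece of state written first as a prototype-based 'class' and then as a closure:

```javascript
// Prototype style: state lives on the instance, and methods depend on `this`.
function TagCounter() {
  this.counts = {};
}
TagCounter.prototype.add = function (tag) {
  this.counts[tag] = (this.counts[tag] || 0) + 1;
  return this.counts[tag];
};

// Closure style: state is captured by the returned function, so there is
// no `this` to lose and nothing outside can reach `counts` directly.
function makeTagCounter() {
  var counts = Object.create(null);
  return function add(tag) {
    counts[tag] = (counts[tag] || 0) + 1;
    return counts[tag];
  };
}

var counter = new TagCounter();
counter.add('music');                        // fine when called as a method

var add = makeTagCounter();
console.log(['music', 'kids', 'music'].map(add));   // running counts per tag
```

    The closure version can be handed straight to map or forEach as a callback. Passing counter.add the same way fails, because the method is called without its instance and `this.counts` is no longer reachable; it would need bind or a wrapper function.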

  4. kazoolist

    Also, I couldn’t help noticing just how JS-like the C# above was, so I did a little cute hackery that allows all but the last line of the C# to be run as JavaScript, with the only change being the substitution of “for each” for “foreach”:

    /* create some data to run through the code */
    var es = {
    events: [
    { categories: null },
    { categories: 'cat1,cat2,cat3' },
    { categories: 'cat1,cat2,cat3' }
    ]
    };

    /* hackery */
    String.prototype.Split = String.prototype.split;
    var Dictionary = function() { this.INKEYSGETTER = false; };
    Dictionary.prototype.__defineGetter__("Keys", function(){ if (this.INKEYSGETTER) { return null; } this.INKEYSGETTER=true; var keys = []; for (key in this) { if (typeof this[key] == 'number') keys.push(key); } this.INKEYSGETTER=false; return keys; });
    Dictionary.prototype.ContainsKey = function(key) { return (typeof key == 'string' && typeof this[key] != 'undefined') }
    var List = function() { this.list = []; };
    List.prototype.Add = function(obj) { var idx = this.list.push(obj); this[idx - 1] = obj; };

    /* orig C# with only change being s/foreach/for each/g */
    var tagdict = new Dictionary();

    for each (var evt in es.events)
      {
      if (evt.categories != null)
        {
        var tags = evt.categories.Split(',');
        for each (var tag in tags)
          {
          if (tagdict.ContainsKey(tag))
            tagdict[tag]++;
          else
            tagdict[tag] = 1;
          }
        }
      }

    var sorted_keys = new List();

    for each (var tag in tagdict.Keys)
      sorted_keys.Add(tag);

    /*

    At this point, the following:

    for each (var key in sorted_keys.list) {
    print('#' + key);
    }

    will print out:

    #cat1
    #cat2
    #cat3

    */

  5. Jon Udell Post author

    > As we mature, it becomes more difficult
    > to approach learning with the efficiency
    > of a child.

    Interesting use of the word efficiency. Is that really it? Maybe. Or maybe it’s just that a newbie, at any age, absorbs the prevailing gestalt. Then seeks to recreate it. Meanwhile the gestalt evolves.

  6. Jon Udell Post author

    > allows all but the last line of the C# to
    > be run as JavaScript

    Fascinating. Thank you for this concise and thought-provoking demo of the power of JS prototypes. And of the isomorphism that originally motivated the post.

  7. Jon Udell Post author

    > This resulted in missing the benefits of
    > closures in many cases, whereupon later
    > rewriting some code, reduced it from 200
    > lines to 30 (really).

    Curious: Do you think the 30 lines is as (or more) readable/maintainable by others? By your future self even?

  8. David W

    Hi Jon,

    I’ve thought about it for a few days now, and the answer doesn’t seem simple. On the one hand, there is the well known correlation between bug count and LOC, whereas on the other, the newer, more compact version of the code in question requires a greater understanding of Javascript on the part of the maintainer in order to modify the code.

    I’m leaning towards “more maintainable”, due to the potentially reduced bug count and therefore the lower maintenance likely required, but it’s a very good and slightly troubling question. :)

  9. chris hollander

    decided not to wait. ;)

    var sortedTagDict =
      from evt in es.events
      from tag in evt.categories.Split(',')
      group tag by tag into occurrences
      orderby occurrences.Count() descending
      select new {
        tag = occurrences.Key,
        count = occurrences.Count() };

  10. Jon Udell Post author

    Chris: Thanks! For the query itself, I only had to add a null check on categories.

    In order to actually return a List<Dictionary<string,int>>, however, I couldn’t just ToList() the query, but instead had to enumerate it and build the return value.

    var tagquery =
      from evt in es.events
        where evt.categories != null
        from tag in evt.categories.Split(',')
        group tag by tag into occurrences
        orderby occurrences.Count() descending
        select new
          {
          tag = occurrences.Key,
          count = occurrences.Count()
          };
    
    var tagcloud = new List<Dictionary<string, int>>();
    foreach (var result in tagquery)
      {
      var dict = new Dictionary<string, int>()
         { { result.tag, result.count } };
      tagcloud.Add(dict);
      }
    return tagcloud;
    

    I have used this approach sparingly because I’m not good at visualizing what goes on inside a LINQ query. With my more old-fashioned style I can step through everything in the debugger and understand things better.
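    For comparison, the same pipeline can be spelled out step by step in JavaScript, where each stage is an ordinary function call that can take a breakpoint. This is a hypothetical analogue of the query, assuming an ES2015 runtime; the name tagCloud is made up, and the result mirrors the list of one-entry dictionaries built above.

```javascript
// Hypothetical JavaScript analogue of the LINQ query; comments name the
// clause each step corresponds to.
function tagCloud(events) {
  var groups = new Map();                      // group tag by tag

  events
    .filter(function (evt) {                   // where evt.categories != null
      return evt.categories != null;
    })
    .forEach(function (evt) {
      evt.categories.split(',').forEach(function (tag) {   // from tag in ...
        groups.set(tag, (groups.get(tag) || 0) + 1);
      });
    });

  return Array.from(groups.entries())
    .sort(function (a, b) { return b[1] - a[1]; })   // orderby count descending
    .map(function (entry) {                          // select: one-entry dictionary
      var dict = {};
      dict[entry[0]] = entry[1];
      return dict;
    });
}

console.log(tagCloud([
  { categories: 'a,b,c' },
  { categories: 'b,c' },
  { categories: 'c' },
  { categories: null }
]));
```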

    1. chris hollander

      if you wanted to be able to just .ToList() the query, you could have the query select out the dictionary items that you want- just change the end of it to:

      select new Dictionary<string, int>() {{ occurrences.Key, occurrences.Count() }};

      1. Jon Udell Post author

        I was curious what the underlying extension method syntax looks like. Tried and of course failed to write something that would work. Then cheated and used Reflector to look:

        List<Dictionary> l = es.events.Where(delegate (EventStore.evt evt) { return (evt.categories != null); }).SelectMany(delegate (EventStore.evt evt) { return evt.categories.Split(new char[] { ',' }); }, delegate (EventStore.evt evt, string tag) { return new { evt = evt, tag = tag }; }).GroupBy(delegate(h__TransparentIdentifier0 h__TransparentIdentifier0){
        return h__TransparentIdentifier0.tag;}, delegate (f__AnonymousType0 h__TransparentIdentifier0)
        {return h__TransparentIdentifier0.tag;}). OrderByDescending<IGrouping, int>(delegate (IGrouping occurrences) { return occurrences.Count(); }).Select<IGrouping, Dictionary>(delegate (IGrouping occurrences) { Dictionary g__initLocal1 = new Dictionary(); g__initLocal1.Add(occurrences.Key, occurrences.Count());
        return g__initLocal1; }).ToList<Dictionary>();

        Whoa.

  11. Jon Udell Post author

    > http://bannister.us/examples/

    > I am a bit more fond of isolation through
    > use of namespaces, and prefer to add a bit
    > more room for reuse. Make what you want of
    > the difference.

    Your JS example and Chris Hollander’s C# example further sharpen my point in comment 2: mapping old habits onto new languages tends to obscure what makes those new languages special.

    1. pbannister

      Old habits also explain why almost all examples written in Javascript – up until very recently – are crap. Most folk wrote VisualBasic-in-Javascript or Java-in-Javascript. Really horrible stuff.

      The mutable nature of JavaScript reminds me a bit of Lisp and Prototype. I used Lisp a fair amount in school. Prototype was a more dynamic sort of object-oriented language, built around instance prototypes rather than static classes, and presented in a paper around 1990. (Good luck finding the paper, as “Prototype” is not a good search term.)

  12. Pingback: Querying mobile data objects with LINQ « Jon Udell

  13. Pingback: More Python and C# idioms: Finding the difference between two list « Jon Udell

  14. Pingback: asp.net, c#,javascript
