Tagging mechanisms and strategies part 3: Taxonomy and folksonomy

Should a tag namespace be a top-down taxonomy or a bottom-up folksonomy? My answer is: both. In recent months, while curating calendar hubs for selected cities, I’ve been working toward an approach that harmonizes the two styles.

Principle: Top-down and bottom-up

In the elmcity context, the most important taggable object is the calendar feed. That’s because when you can characterize a whole feed with a tag, all the events in that feed inherit the tag. The primary sources of taggable feeds are Eventful, Upcoming, Facebook, Meetup, and EventBrite. I call them taggable because, while some of these services tag individual events, none tag feeds based on venues (Eventful, Upcoming) or Pages (Facebook) or groups (Meetup) or organizers (EventBrite). Assigning feed-level tags is an editorial exercise for the curator.
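
A minimal sketch of that inheritance, assuming a simple in-memory model. The `Feed` and `Event` classes, the `inherit_feed_tags` function, and the example URL are hypothetical names for illustration, not the elmcity service's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    title: str
    tags: set[str] = field(default_factory=set)


@dataclass
class Feed:
    url: str
    feed_tags: set[str]  # assigned by the curator, not by the source service
    events: list[Event] = field(default_factory=list)


def inherit_feed_tags(feed: Feed) -> None:
    """Copy the curator's feed-level tags onto every event in the feed."""
    for event in feed.events:
        event.tags |= feed.feed_tags


feed = Feed(
    url="https://example.org/venue.ics",  # hypothetical feed URL
    feed_tags={"music", "jazz"},
    events=[Event("Quartet night"), Event("Open jam", {"community"})],
)
inherit_feed_tags(feed)
```

Tags an event already carries are kept; the feed-level tags are simply added to them.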

To these sources I add as many standalone iCalendar feeds as I can find. For Boston and Seattle, the results add up to lists of over 600 tagged iCalendar feeds. Here’s a table of the current list of tags for Boston, the current list for Seattle, and the intersection of the two lists.

Boston
adoption 1
african 1
animals 28
arabic 1
architecture 38
art 222
asian 30
astronomy 13
baseball 94
basketball 78
berklee 285
boating 7
books 160
boston 7
boston.com 1020
boston.gov 395
boston-latin-hs 109
bpl 153
bu 101
business 191
cathedral-hs 15
children 73
church 469
climbing 23
comedy 174
comics 1
community 883
conferences 96
confernces 1
conflict-resolution 7
cycling 21
dance 156
dining 4
diving 18
dorchester 4
east-boston-hs 17
education 365
english 7
environment 12
european 4
eventbrite 452
eventful 2692
facebook 436
family 79
fashion 4
film 194
finance 2
fitness 2
food 115
football 1
french 3
games 33
german 3
government 42
green-technology 2
harvard 11
health 206
highschool 142
hiking 44
hispanic 1
history 156
hockey 35
indian 4
irish 1
islamic 15
italian 13
japanese 50
jazz 70
language 83
law 5
lectures 304
lgbt 1
library 378
martial-arts 47
massart 10
meditation 10
meetup 975
mensa 46
museum 94
music 2023
nature 91
networking 105
northeastern 167
performing-arts 222
philanthropy 1
philosophy 6
photography 53
poetry 10
politics 153
polyamory 10
portuguese 8
pub-crawl 4
recreation 201
running 59
sailing 7
science 151
seminars 10
simmons 15
social-justice 89
softball 1
south-boston-hs 1
spanish 1
spirituality 106
sports 590
statistics 2
suffolk 2
support 46
surfing 2
swimming 29
synagogue 5
technology 320
theater 90
tourism 43
tours 90
travel 3
umass 9
university 599
upcoming 212
visual-arts 72
volunteer 46
women 25
writing 17
ymca 4
yoga 27

Seattle
africa 4
animals 26
aquarium 13
art 563
arts-and-crafts 14
ballet 4
basketball 31
beer 1
boating 9
books 201
business 62
business-and-technology 31
charity-and-volunteer 10
children 302
chinese 28
church 399
circus 23
cleveland-high 9
climbing 16
coffee 4
comedy 47
comics 1
community 574
conferences 65
cooking 3
dance 151
diving 8
dogs 3
education 103
environment 123
eventbrite 139
eventful 1996
facebook 216
fairs-and-festivals 13
film 136
finance 22
fitness 243
food 40
food-and-dining 26
games 114
garfield-high 12
german 13
government 145
gradeschool 17
green-technology 1
health 192
highschool 35
hiking 74
history 4
ingraham-high 1
insurance 1
italian 3
japanese 17
jazz 46
knitting 35
language 153
latin-american 30
lectures 101
lgbt 21
library 190
martial-arts 1
meetup 1107
museum 77
music 1223
native-american 20
nature 32
networking 54
nonprofit 4
nscc 1
opera 2
pacific-science-center 609
performing-arts 337
philosophy 2
photography 12
police 11
politics 31
real-estate 1
recreation 195
roosevelt-high 4
running 108
science 174
sculpture 1
seattle.gov 449
seattlepi 347
seattleu 12
seminars 23
skiing 2
spanish 19
spirituality 29
sports 151
storytelling 1
sustainability 5
swedish 73
synagogue 4
technology 98
teens 94
theater 166
tourism 11
town-hall-seattle 54
transportation 87
travel 9
trumba 296
university 390
upcoming 66
uw 366
vegan 4
visual-arts 89
volunteer 48
walk-bike-ride 3
walking 41
wallingford 159
wine 9
witches 13
women 26
writing 42
yoga 21
youth 105

Common Tags
animals
art
basketball
boating
books
business
children
church
climbing
comedy
comics
community
conferences
dance
diving
education
environment
eventbrite
eventful
facebook
film
finance
fitness
food
games
german
government
green-technology
health
highschool
hiking
history
italian
japanese
jazz
language
lectures
lgbt
library
martial-arts
meetup
museum
music
nature
networking
performing-arts
philosophy
photography
politics
recreation
running
science
seminars
spanish
spirituality
sports
synagogue
technology
theater
tourism
travel
university
upcoming
visual-arts
volunteer
women
writing
yoga
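
The common list is just a set intersection of the two hubs' tag names. Here is a small sketch using a handful of the tags and counts from the lists above; the variable names are illustrative only.

```python
# A few tag -> event-count pairs taken from the Boston and Seattle lists.
boston = {"animals": 28, "art": 222, "baseball": 94, "yoga": 27}
seattle = {"animals": 26, "art": 563, "aquarium": 13, "yoga": 21}

# Dictionary key views support set operations directly.
common = sorted(boston.keys() & seattle.keys())
boston_only = sorted(boston.keys() - seattle.keys())
seattle_only = sorted(seattle.keys() - boston.keys())
```

The tags that fall outside the intersection are the interesting ones to review: some are genuinely city-specific, others are candidates for refactoring into the common core.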

Among the dynamics in play here, we can see the general and specific principle at work. For a general tag like university there are city-specific instantiations: bu and northeastern for Boston, uw and seattleu and nscc for Seattle. Likewise for the general tag highschool there are specific tags like boston-latin-hs and cathedral-hs for Boston, garfield-high and ingraham-high for Seattle.

These city-specific tags are top-down in the sense that I, as curator of the hub, have assigned them and made them part of the hub’s core tag vocabulary. But they are also bottom-up in the sense that they represent discoverable sources that are providing enough event flow to warrant such treatment.

These core hub vocabularies are fluid. As I move from hub to hub I’ve been keeping an eye on the common core and refactoring all the hub vocabularies as I go along. I also use these evolving hub vocabularies as templates against which to match vocabularies from other sources.

Mechanism: Tag matching

Some of the source services, notably Eventful and EventBrite, include per-event tags. When one of these tags matches a tag in the (evolving) core vocabulary for that hub, the elmcity service adds that tag to the event’s list of tags which it inherited from its feed.

There are also tables for each foreign service that map tags used there to tags in the hub’s core vocabulary. So, for example, the Eventful tag movies_film and the EventBrite tag movies both map to the core tag film.
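
Put together, the matching step might look roughly like this. The two mappings are the ones mentioned above; the table layout, the function name, and the lowercase normalization are assumptions made for this sketch rather than a description of how elmcity does it.

```python
# Hypothetical core vocabulary for one hub.
CORE_VOCABULARY = {"film", "music", "art"}

# Per-service tables mapping foreign tags to core tags.
SERVICE_TAG_MAPS = {
    "eventful": {"movies_film": "film"},
    "eventbrite": {"movies": "film"},
}


def match_event_tags(service, source_tags, inherited_tags):
    """Return the inherited feed tags plus any per-event tags that land in the core."""
    tags = set(inherited_tags)
    mapping = SERVICE_TAG_MAPS.get(service, {})
    for raw in source_tags:
        tag = raw.strip().lower()   # assumed normalization
        tag = mapping.get(tag, tag)  # translate foreign tag if a mapping exists
        if tag in CORE_VOCABULARY:
            tags.add(tag)
    return tags


eventful_tags = match_event_tags("eventful", ["movies_film", "popcorn"], {"eventful"})
eventbrite_tags = match_event_tags("eventbrite", ["movies", "Music"], set())
```

Per-event tags that match nothing in the core, like `popcorn` here, are simply dropped in this sketch.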

As we saw in Portable tags, some iCalendar feeds use the CATEGORIES property of the iCalendar format to express per-event tags. Managing these tags is trickier because, well, they’re unmanaged. Until recently I was suppressing them. Now I’m experimentally allowing them to appear, but segregating them from the core vocabulary. If you check the tags for Boston or Seattle or another city you’ll see that the list divides into two sections. The first presents managed tags: the core vocabulary. The second presents unmanaged tags from iCalendar feeds, enclosed in squiggly brackets to differentiate them from the core vocabulary.
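
For reference, pulling per-event tags out of CATEGORIES can be sketched without any library. This is a simplified reading of the iCalendar format: it undoes line folding and splits on commas, but it ignores backslash-escaped commas and quoted parameter values, and the function names are made up for the example.

```python
def unfold(ics_text: str) -> list[str]:
    """Undo iCalendar line folding: a line starting with a space or tab continues the previous one."""
    lines: list[str] = []
    for raw in ics_text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def event_categories(vevent_text: str) -> list[str]:
    """Collect lowercase tags from every CATEGORIES line of one event."""
    tags: list[str] = []
    for line in unfold(vevent_text):
        name, sep, value = line.partition(":")
        # Drop any parameters such as ;LANGUAGE=en before comparing the property name.
        if sep and name.split(";")[0].upper() == "CATEGORIES":
            tags += [part.strip().lower() for part in value.split(",") if part.strip()]
    return tags


sample = (
    "BEGIN:VEVENT\n"
    "SUMMARY:Story hour\n"
    "CATEGORIES:Central Library,Chil\n"
    " dren\n"
    "END:VEVENT"
)
sample_tags = event_categories(sample)
```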

Here’s the current set of unmanaged tags for Boston and Seattle:

Boston
{academics} 10
{adams street} 4
{air pollution control commission hearings} 1
{alumni relations} 6
{athletics} 6
{bikes} 1
{blc} 3
{boston home center} 2
{boston main streets} 1
{boston public library} 3
{brighton} 2
{central library} 84
{charlestown} 2
{city clerk} 15
{city council} 8
{college of arts & sciences} 10
{college of business administration} 1
{college of computer & information science} 13
{college of engineering} 13
{commercial} 1
{connolly} 12
{dnd} 1
{dudley literacy center} 11
{dudley} 19
{east boston} 9
{egleston square} 3
{elderly commission} 1
{election} 1
{faneuil} 2
{fields corner} 11
{group exercise} 4
{grove hall} 11
{honan- allston} 11
{hyde park} 10
{jamaica plain} 7
{licensing} 1
{lower mills} 5
{massart events} 1
{mattapan} 15
{north end} 11
{ongoing} 33
{orient heights} 5
{other} 90
{parker hill} 10
{performing/visual arts} 60
{president} 1
{public event} 139
{public health commission} 1
{roslindale} 8
{social} 6
{south boston} 18
{south end} 6
{student affairs} 2
{student development} 9
{uphams corner} 2
{washington village} 4
{west end} 13
{west roxbury} 4

Seattle
{animal shelter} 23
{athletics/varsity sports/men} 3
{athletics/varsity sports/women} 4
{athletics} 6
{boards & commissions} 32
{bothell} 29
{built environments} 3
{career management} 1
{city council} 88
{community centers} 29
{community outreach} 10
{community technology} 18
{concerts} 17
{continuing education} 20
{diversity} 2
{eastside } 58
{emergency} 12
{engineering} 13
{environmental learning} 3
{exhibits} 97
{farther afield} 10
{forums} 8
{global health} 1
{health sciences} 18
{hearing examiner} 12
{hr-benefits} 1
{jackson school of international studies} 6
{libraries} 1
{meetings} 5
{north sound} 19
{office of the mayor} 2
{other} 1
{panel discussions} 2
{parks} 2
{performing/visual arts} 29
{psychology} 4
{ptsa} 6
{public outreach and engagement} 68
{public} 23
{readings} 1
{research} 1
{sales} 1
{school of art} 92
{school of business} 22
{schoolof art} 1
{seattle area} 188
{seattle fire department} 2
{seattle youth commission} 11
{south sound} 21
{special events} 16
{sports/spirit} 1
{student activities} 6
{tacoma} 3
{technical communication} 8
{the center for wooden boats – south lake union} 9
{tours} 27
{training} 13
{urbanization} 1
{vst} 1
{walk bike ride} 3
{workshops} 6

When one of these tags matches a tag in a hub’s core vocabulary I promote it — that is, I treat it as part of the managed core and it no longer shows up in squigglies. That’s a top-down approach. But there’s a complementary bottom-up approach. As I scan the unmanaged tags, both within and across hubs, it can become clear that an unmanaged tag belongs in the managed core. To accomplish that I simply use the unmanaged tag somewhere in the managed core. From then on, occurrences of the unmanaged tag are promoted into the core.
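
Both directions fall out of one rule: a tag is managed if and only if it appears in the core vocabulary. A rough sketch, with a hypothetical `split_tags` function and a made-up core set (the counts for tours and exhibits come from Seattle's unmanaged list above):

```python
def split_tags(tag_counts, core_vocabulary):
    """Divide tags into the managed section and the bracketed unmanaged section."""
    managed = {t: n for t, n in tag_counts.items() if t in core_vocabulary}
    unmanaged = {
        "{" + t + "}": n for t, n in tag_counts.items() if t not in core_vocabulary
    }
    return managed, unmanaged


core = {"library", "music"}                       # hypothetical core vocabulary
feed_tag_counts = {"tours": 27, "exhibits": 97}   # tags arriving via CATEGORIES

before = split_tags(feed_tag_counts, core)

# Bottom-up promotion: the curator uses the tag somewhere in the managed core...
core.add("tours")

# ...and from then on it is listed without squigglies.
after = split_tags(feed_tag_counts, core)
```

Nothing about the feed changes between the two calls; only the core vocabulary does.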

A logical next step is to enable curators to edit per-hub maps so that, for example, Boston’s {central library} and Seattle’s {libraries} will be promoted to simply library. I haven’t built this mapping feature yet but it’s on the todo list.
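
Since the feature doesn't exist yet, the following is only a guess at its shape. It assumes a per-hub dictionary from unmanaged tag to core tag, and it goes by the unmanaged lists above, where {central library} appears under Boston and {libraries} under Seattle. Every name here is hypothetical.

```python
# Hypothetical curator-editable maps, one per hub.
HUB_PROMOTION_MAPS = {
    "boston": {"central library": "library"},
    "seattle": {"libraries": "library"},
}


def promote(hub, tag, core_vocabulary):
    """Return the core tag if the hub's map (or the tag itself) lands in the core, else the bracketed tag."""
    mapped = HUB_PROMOTION_MAPS.get(hub, {}).get(tag, tag)
    return mapped if mapped in core_vocabulary else "{" + tag + "}"


core_vocabulary = {"library"}
boston_result = promote("boston", "central library", core_vocabulary)
seattle_result = promote("seattle", "libraries", core_vocabulary)
unmapped_result = promote("seattle", "exhibits", core_vocabulary)
```

Keeping the maps per hub means the same unmanaged tag could be promoted differently in different cities, which fits the regional variation in these vocabularies.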

I’m still exploring the interplay between the top-down and bottom-up approaches. But it definitely feels like the right way to handle common vocabularies augmented by different (and regionally varying) vocabularies.

(This series: elmcity tagging principles.)
