Scholars' Lab

Works in Progress

2013-2014 Praxis Charter ratified!

Wed, 25/09/2013 - 18:41

Last week the new Praxis cohort ratified its charter.  This important document ended up demanding much more deliberation than we had anticipated.  Nonetheless, after a couple weeks of thinking about what really mattered to us in commencing our program, we established a set of core beliefs and structuring principles which I believe will help guide us through a very exciting year.

We took inspiration from the previous cohorts’ charters in several respects, because in many ways we feel we are continuing in the same tradition.  We, too, will conduct our work in the spirit of open source.  We, too, feel that a key part of this experience will be all of us sharing credit for the project.  We also hope to learn programming skills central to the DH profession, and we plan to launch a digital tool at the end of the year as an outcome of our participation in the program.

A key tenet of primary importance to our particular cohort is flexibility, and this ideal influences many aspects of our charter.  For instance, we want the tool we build to be adaptable to various scholarly needs.  We are still in the early stages of conceptualizing this tool, and the issue of flexibility and utility will no doubt arise as we progress.  (I anticipate many reflective blog posts to come on that topic.)  Perhaps even more importantly, we plan to be flexible–understanding, sensitive–with each other.  We all come from different scholarly and professional backgrounds, and we all have personal lives with various demands and responsibilities.  It will be our goal to be supportive of each other personally while working together to make the Praxis experience an enriching one for all.

Last Wednesday, Eric and Wayne showed us how to use GitHub to publish the new charter on the Praxis website.  This was our first lesson in programming.  Our brilliant SLab computer mentors entered the new charter text, encoded it in HTML, and then let us do the simple–but no-less-important–step of hitting “Enter.”  Upon striking the key, watching a whir of as-yet incomprehensible code flash across the screen, and thus finalizing our newly forged charter, we felt a rush of glee.  In that moment, we had plunged headfirst into the new and intriguing world of Digital Humanities, and we had a charter to guide our voyage.

Neatline 2.1.0

Wed, 25/09/2013 - 15:55

We’re pleased to announce the release of Neatline 2.1.0! This is a fairly large maintenance release that adds new features, patches up some minor bugs, and ships some improvements to the UI in the editing environment. Some of the highlights:

  • A “fullscreen” mode (re-added from the 1.x releases), which makes it possible to link to a page that just displays a Neatline exhibit in isolation, scaled to the size of the screen, without any of the regular Omeka site navigation. Among other things, this makes it much easier to embed a Neatline exhibit as an iframe on other websites (e.g., a WordPress blog) – just set the src attribute on the iframe equal to the URL for the fullscreen exhibit view.
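    A minimal embed might look like the following sketch; the hostname and exhibit path are placeholders (the fullscreen view URL depends on your Omeka installation and exhibit slug):

    ```html
    <!-- Hypothetical example: replace src with your exhibit's fullscreen URL -->
    <iframe src="http://your-omeka-site.org/neatline/fullscreen/your-exhibit"
            width="100%" height="600" frameborder="0"></iframe>
    ```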

    Thanks coryduclos, colonusgroup, and martiniusDE for letting us know that this was a pain point.

  • A series of UI improvements to the editing environment that should make the exhibit-creation workflow a bit smoother. We bumped up the size of the “New Record” button, padded out the list of records, and made the “X” buttons used to close record forms a bit larger and easier to click. Also, in the record edit form, the “Save” and “Delete” buttons are now fixed in place at the bottom of the panel, meaning that you don’t have to scroll to the bottom of the form every time you save. Much easier!

  • Fixes for a handful of small bugs, mostly cosmetic or involving uncommon edge cases. Notably, 2.1.0 fixes a problem that was causing item imports to fail when the Omeka installation was using the Amazon S3 storage adapter, as we do for our faculty-project installations here at UVa.

Check out the release notes on GitHub for the full list of changes, and grab the new code from the Omeka add-ons repository. And, as always, be sure to send comments, concerns, bug reports, and feature requests in our direction.

In other Neatline-related news, be sure to check out Katherine Jentleson’s Neatline-enhanced essay “‘Not as rewarding as the North’: Holger Cahill’s Southern Folk Art Expedition,” which just won the Smithsonian’s Archives of American Art Graduate Research Essay Prize. I met Katherine at a workshop at Duke back in the spring, and it’s been a real pleasure to learn about how she’s using Neatline in her work!

Praxis Time Capsule

Thu, 19/09/2013 - 21:22

Obviously this is a little late, but comprehensive exams have a way of stealing whatever time you thought you had.  I wanted to write a post reflecting on my time in Praxis: what I am taking away, suggestions I have for future Praxis generations, and some general gratitude!


What I’m Taking Away

  • New level of computer/digital literacy: as I’ve mentioned in other blog posts, the way I frequently dealt with computer problems was to either cry, punch my computer, or take a nap.  While I will miss these fantastic coping strategies, Praxis has actually made me someone that other people ask for help with their computers.

  • Clearer sense of what my interest in digital and alt-ac careers might be: I have learned that I hate Ruby and love CSS and the design aspects of website development.

  • Fuller understanding of what “digital humanities” actually is and how I might integrate it in my own work

  • Excitement about alt-ac possibilities

  • New network of friends and colleagues


Suggestions for Future Praxi

  • Don’t put off the hard conversations–take on questions like: What is our goal for this tool? What are the major things we want a user to get out of it? Tackle those early, and force yourself to sit in a room until you hammer them out.

  • Learn to love (or at least tolerate) conflict. As someone who has done a lot of “interdisciplinary” work in the past, I found that it was never as hard (or as rewarding) as it was in Praxis.  It becomes a tricky balancing act: maintaining an awareness of what is useful for your discipline while considering what the tool might be apart from those disciplinary constraints.

  • There really is no such thing as a stupid question

  • The Scholars’ Lab team is beyond lovable, so don’t be scared to ask lots of questions.



Thanks to everyone in the Scholars’ Lab faculty and staff who resisted the temptation to roll their eyes at my one millionth question about HTML and CSS formatting.  Also thanks for helping me work through all of my insecurities about technology and for not judging my borderline obsession with Honey Boo Boo.


Thanks to the Praxis team.  I learned more about the practice of “interdisciplinarity” with you five than I ever have.  I also learned a lot about collegial and productive conflict from you all.  Praxis also affirmed my love of collaborative and team-based work and gave me hope that there are spaces within (or adjacent to) the academy that can be full of laughter and of making mistakes openly.


Mon, 16/09/2013 - 15:09

Hello! My name is Scott Bailey, and I’m one of the new Praxis Fellows. I am also a Ph.D. student in Religious Studies, writing a dissertation on vulnerability as a locus of dogmatic reflection. Taking a cue from Brené Brown’s work on vulnerability, I’m asking what it means to think through the vulnerability of Christ, leading us to think of the vulnerability of God and of humanity. Much of this is done within the context of 20th Century Protestant theology, with a particular focus on Eberhard Jüngel’s theology.

I am also avidly interested in technology, though, both inside and outside the classroom. For the past three years, I was the Teaching + Technology Support Partner for the Department of Religious Studies, and helped faculty and grad students learn to use and incorporate different applications into their teaching practice. Of particular interest were applications like WordPress and NowComment. I applied to Praxis in order to keep pushing further, to learn about what makes some of these applications work, and to learn more broadly about the world of Digital Humanities. From just a bit of exposure to HTML/CSS, I’ve already found that the concrete character of writing code is a welcome balance to the often abstract and speculative questions in theological ontology with which I am concerned. I look forward to working with the other Praxis Fellows and the rest of the Scholars’ Lab in the year to come.

a bit more medieval

Wed, 11/09/2013 - 23:30

Well. I must admit to some surprise and no small degree of trepidation regarding my presence here (both in the Praxis program and here online as a blogger being read by you [blog readers and, I guess, bots]). For example, when I asked in our first meeting what a ‘wiki’ actually was, I found out that they are ‘like Google Docs,’ which of course only begged the question on my part: what is a Google Doc? So. As we can see, this will be a fun year.  In truth, I applied to the Praxis program not because of a strong background in the Digital Humanities (henceforth ‘DH’ [incl. the definite article]) but rather on account of my relative lack of any background in DH.  Lack of background, however, ought not be equated with lack of interest. As a medievalist, specifically a student of medieval books, manuscripts, and textual cultures, I have both pragmatic and theoretical interests in DH.

Pragmatically, DH offers new ways to share previously impossible (or at least highly improbable) amounts of data, specifically visual data. As a palaeographer/book historian, this allows me to avoid certain compromises forced on previous generations by the exigencies of print. For example, when cataloguing the medieval manuscripts of Wadham College, Oxford, I looked to other printed manuscript catalogues for guidance regarding the type and amount of information to include.  In those catalogues, choices about how much description of certain facets of a given book to include required consideration of the volume’s overall publishability, especially given the expected low sales volume. Conversely, online cataloguing would allow the inclusion of all the material the cataloguer finds relevant and useful, regardless of length.  This is an admittedly conservative example, but nevertheless, my experience cataloguing suggested to me the extent to which my scholarly thought, or perhaps more properly ‘imagination’ in its strictest medieval sense, is constrained by the medium in which I am thinking.

So pragmatic interest is pretty easy to grasp, but positing ontological stakes might seem a bit much.  Nor can I provide a neat, concrete example. It’s more of a feeling that the types of textual production and consumption which occur in pre- and post-print environments share a certain resonance, one that itself poses an ontological challenge to the very being-ness of a print-centric intellectual culture. As we begin to think about our charter, we, the new Praxis team, have begun to think about credit, i.e., who gets it. In the print world this is pretty tidy. Publication represents a convenient end stop to the production process, at which juncture credit and the rewards therein may be distributed along traditional lines of authorship, etc. Online, things seem a bit more medieval. By that I mean the lines of authorship and production are blurred to the extent that disentangling them becomes not only impossible but somewhat ludic in principle. Kind of like the manuscripts I spend most of my time buried in.

Anyhow, as usual, I run on. In short, I am excited about the chance Praxis offers both to learn new DH skills and to understand how those skills fit into the long, fluid tradition we conventionally call the ‘Liberal Arts’ or ‘Humanities.’

Mapping Crowd Sourced Bicycle Data

Wed, 11/09/2013 - 15:14

I am a certified instructor from the League of American Bicyclists and a bicycle advocate.  Charlottesville is not the easiest place to ride a bicycle.  There are obstacles beyond the narrowness of the streets.  Let’s take a look at a few of these.

Cville street grid overlaid on elevation surface

The above map shows the elevation around Charlottesville with dark green being the lowest areas and bright red being the highest.  The Charlottesville street system is primarily laid out on top of a series of connected ridges.  This fragmented grid leaves only a small number of routes between different “sections” of the street network.  Not only that, most of the ridges run in a north/south direction with one going east/west.   In the following map, circles (blue for downtown and orange for UVa Central Grounds) show two “employment zones” (those circles will be used as reference in other maps).

Downtown and UVa reference circles

To make matters worse, there are two rail lines running through town.

Cville rail corridors

These two lines cross at the train station on W. Main St.  They essentially cut the city into four quads.

Cville quadrants

The rail lines further limit the ability to traverse the street network.  Below are all the street crossings (over/under rail lines) from one quad to another.  The crossings marked in red are not, in my opinion, suitable for bicycles in current configurations.

Cville quad passes

That leaves us with some serious bottlenecks for bicycle movement that not coincidentally tend to be along busy streets.

Here is a street map to provide context for anyone not familiar with Charlottesville.

Base map uses Open Street Map data

Cville Bike mApp

Back in April of 2012, the Thomas Jefferson Planning District Commission (TJPDC) launched a bicycle route data collection project called Cville Bike mApp.  The TJPDC adapted smartphone apps that allowed users to log all their bicycle trips and share the location data with the TJPDC.  Data collection officially closed on May 18.  The apps are still available, and the TJPDC is still collecting data.  The source code can be downloaded here.

Kelly Johnston and I were asked to consult with the TJPDC intern working with the data.  We discussed various visualizations and spatial analyses.  We asked for and were given permission to have the data to see what we could do.  Let the fun begin!

Data Issues

The data arrived after being “cleaned” of many data points.  This was necessary because there were many obvious cases where people had left their logging app running after they were at home, at work, or driving.  There were also routes with only a few data points over a very large geographic area.  The data was in Excel, with each record containing a single latitude/longitude pair with trip ID, date, and time.  There was a separate table listing user ID, trip purpose, weather, and notes.  There is no personal information in the data we received.  Of the data mapped, there were 1,011 trips made by 118 users.

Someone forgot to turn off their app

Clearly a few points missing along this route

The first idea for visualizing these data was to quantify the trips by route.  Using a spatial join to aggregate the trips seemed a useful technique.  However, there were a few issues with the data.  Relying on GPS coordinates from a variety (and quality) of smartphones is problematic.  The data ended up all over the place.

Main St. Routes

The first technique to try was to expand the road centerlines using buffers to increase the catchment area for the roads, then spatially join the routes to the roads, creating a count of unique trips along any roadway section.  After a little trial and error, I settled on 60-foot buffers around all the road segments.  Of course, this still does not capture all the trips, as shown in the image below.

Red lines indicate edge of 60ft buffer

The Visualizations

I ran the spatial join twice, once to capture the total number of trips and once to capture the unique users.  I then took these data from the buffers and joined them back to the road centerline, to associate the trip and user counts with the centerline files and create something like this.

Unique Trips

Unique Users
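The buffer-and-spatial-join aggregation described above can be sketched in miniature. This is an illustrative Python sketch using shapely on toy geometries, not the original ArcMap workflow; the road and trip coordinates are invented, and the 60-unit buffer stands in for the 60-foot catchment.

```python
# Illustrative sketch (not the original ArcMap workflow): count the GPS
# trip traces falling within a 60-unit buffer around each road centerline.
from shapely.geometry import LineString

# Toy road centerlines and noisy trip traces in a planar, feet-like CRS
roads = {1: LineString([(0, 0), (1000, 0)]),
         2: LineString([(0, 500), (1000, 500)])}
trips = {10: LineString([(0, 20), (1000, 20)]),    # wobbles 20 units off road 1
         11: LineString([(0, -30), (1000, -30)])}  # wobbles 30 units off road 1

# Expand each centerline into a 60-unit catchment polygon, then count
# the trip traces that intersect each catchment (the "spatial join")
trip_counts = {}
for road_id, centerline in roads.items():
    catchment = centerline.buffer(60)
    trip_counts[road_id] = sum(
        1 for trace in trips.values() if trace.intersects(catchment)
    )

print(trip_counts)  # both noisy traces land in road 1's catchment; none in road 2's
```

As the post notes, this approach also shows why side streets near a heavily used corridor pick up spurious counts: any trace that wanders inside a side street's buffer is counted for it too.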

This technique creates some interesting artifacts.  When you spatially join relatively inaccurate data like these, some of the aggregated features end up being associated with features they were not meant to be.  In other words, side streets that were unused or lightly used end up with much higher counts than they should have.  Case in point, W. Main St.: is it possible that riders deviated to all the side streets and returned back to Main?

Side street issue

Of course, the answer is no and the problem becomes acutely clear when the trip data is overlaid.

Lack of accuracy

So the buffer segments of the side streets nearest to W. Main St. were picking up trips along Main St.  I tried several times to deal with this using various techniques and found nothing very helpful.  Next option?  Move to raster analysis.  I used a tool in ArcMap called Line Density to create a surface raster of all the unique trips.

Trip density surface raster

The map above clearly shows the most traveled streets in the study are W. Main St., Rugby Rd./Dairy Rd., Alderman Rd., Water St., Market St., Preston Ave., Rose Hill Dr. and 10th St. NE/Locust Ave.  There is also heavy traffic on the path between Rugby and Emmett adjacent to Lambeth Field.
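ArcMap's Line Density tool computes, for each grid cell, the amount of line falling within a search radius. A rough stand-in for that step can be sketched by sampling points along each trace and binning them into a grid; this is an illustrative Python sketch with invented toy traces, not the actual analysis.

```python
# Illustrative sketch of a simple line-density surface: sample points at
# regular intervals along each trip trace and bin them into a raster grid.
import numpy as np
from shapely.geometry import LineString

# Two toy trip traces: one east-west, one north-south, sharing an origin
trips = [LineString([(0, 0), (100, 0)]),
         LineString([(0, 0), (0, 100)])]

cell = 10  # grid cell size in map units
xs, ys = [], []
for trace in trips:
    # sample a point every `cell` units along the trace
    for d in np.arange(0, trace.length, cell):
        p = trace.interpolate(d)
        xs.append(p.x)
        ys.append(p.y)

# Bin the sampled points into a 10x10 grid covering 0..100 on each axis
edges = np.arange(0, 101, cell)
density, _, _ = np.histogram2d(xs, ys, bins=[edges, edges])
```

Cells where traces overlap (here, the shared origin cell) accumulate higher counts, which is the pattern the density raster above makes visible along W. Main St.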

The phone app also asked people to log the type of trip.  This is mappable!

Trips by type

As you can see in the map, commuting seems to outweigh the other trip types.  What about densities for these types?  I decided to make a series of small-multiple maps to demonstrate the densities of the four categories of trips I identified.

A series of small multiples – term coined by E. Tufte

Clearly there is a lack of data evident in at least one category.  However, I think there is a definite difference between the four types of rides.  This led me to look at the origination and destination of commuting trips.

Originations/destinations raster

Other than the fact that Downtown is both a hot spot in morning and afternoon for originations and destinations, I am not sure what else can be gleaned from that map.

I also wanted to see how steep slopes compared to the most heavily used routes.  I created a slope layer to show the steep slopes and overlaid the route density layer.

Trip density raster overlaid w/ steep slopes

As you can see, there is not a great deal of overlap between the most heavily used routes and high-slope areas.  Where there is overlap, most of it tends to be where the steep slope is just off the road and does not represent the road grade, as along 5th St. Extended or Market St. between 2nd St. NW and McIntire Rd.


I don’t think there are many surprises in this data.  Cyclists tend to take the flattest routes or at least the ones that don’t have steep climbs.  W. Main St. is the highest trafficked area.  Locust Ave., Rose Hill Dr. and Rugby Rd. are north/south collectors for that corridor.  This phenomenon has two causes in my opinion.  First, W. Main is the flat, straight corridor between the two largest population/employment centers in Charlottesville, UVa and Downtown.  Secondly, as mentioned earlier, W. Main is the only east/west oriented ridge in the city.

What I do find interesting is some of the alternative paths cyclists are taking.  The 8th St. NW connector (one of my favorite short cuts) under the railroad to the 10th and Page neighborhood connects to a parking lot underpass that bypasses W. Main St.  The extension of the parking lot leads to 7th St. SW which leads to a road within a condominium development (Walker Square).  This route extends to Grove St. and Roosevelt Brown Blvd. (and the UVa Hospital).  This path traverses all four quads, takes advantage of three underpasses, and uses very little in the way of busy streets without bike lanes.

The four quad alternative

Here is a rather heavily used route along the bike/ped pathway from Rugby Rd. to Emmett St.  This path connects UVa to Barracks Road Shopping Center.

Lambeth bike/ped path

Another large barrier is the Rivanna River.  The past few years have seen a good deal of development on Pantops, east of town.  The only way to get there is over Free Bridge on the US 250 Bypass.  Needless to say, that is not an inviting route for cyclists.  There are paths on both sides of the river; there needs to be a bridge connecting them.

Dream crossing for the Rivanna River


This type of data collection has inherent issues.  Only people with smart phones can participate, which really limits the sample size and variety.  While always improving, smartphone location accuracy remains limited.  The phone apps also require the user to start and stop the app at the appropriate times.  These issues leave a small amount of collected data that has to be further slimmed into “good data.”  There also seems to be a distinct lack of data around UVa: the timing of the study did not allow UVa students to fully participate, since they were either in exams or on break for a good portion of the study period.  However, given a larger sample of riders over a longer period of time, I think some meaningful results would be forthcoming.  I urge the planning commission and local governments to consider more of this type of survey to gather better data.

If anyone has some ideas about better ways to visualize or analyze these data, I would love to hear from you.

Greetings and Salutations

Mon, 09/09/2013 - 13:08

Hello all!

My name is Eliza Fox, and I’m a third-year PhD student in the English Department.  My research focuses on the Victorian novel, along with secondary interests in children’s and young adult literature.  I’m not a total newcomer to the DH world: I spent last year as a NINES Fellow, working on projects that ranged from encoding a manuscript of Prometheus Unbound to updating metadata for NINES’s vast collection of aggregated digital objects.  I’ve studied databases at DHSI, and my undergraduate career included the rudiments of HTML.  I’ve also spent an inordinate amount of time enjoying Prism, otherwise known as the fruits of Praxis Past.  (My masterpiece: Prism on Prism, in which I highlighted articles on NSA surveillance for “Uncertainty about the future” and  “Fear.”  Clearly, irony is my medium.)

I recognize, however, that none of this has prepared me for the whirlwind year that lies ahead.  If my past experiences have allowed me to dip my toe in the DH pool, Praxis will force me to jump into the deep end and to figure out – live, in public, and in the moment – how to swim.

But, strange as it may sound, it’s a jump that I’m immensely excited to take.  Despite my love/hate relationship with technology (I once nicknamed my computer Cher – short for Chernobyl), I’m perpetually intrigued by the opportunities DH offers to represent, reconsider, and reconstitute our studies.  Praxis, in particular, offers the kind of full-throttle, hands-on, learning-by-doing approach that allows for a true appreciation and mastery of the field.  I’ve had many friends go through this program, and I’ve watched them transform, in just a few months, from technological novices into committed DH scholars.  Whatever their introductory levels of experience, they’re now designing archives, teaching coding, and project managing.

I can’t wait to join them.

Greetings from Stephanie, new Praxis Fellow!

Mon, 09/09/2013 - 00:23

I am excited to be a part of the new Praxis cohort and would like to take a few moments to introduce myself before a flurry of–ideally, great and innovative–thoughts populate the blog.  I am a second-year MA student in the English department, specializing in American literature, textual studies, and digital humanities.  My academic interests include Colonial and 19th-century American literature and history, as well as American book history.  My goal is to graduate this Spring and work in publishing or alternate academia.

Alongside my literary studies, a key component of my career at the University of Virginia has been to learn as much as possible about the digital humanities.  I have assisted on Alison Booth’s Collective Biographies of Women, a database of women grouped in communities based on the 19th-century biographies in which they are featured.  I continued my DH education with Documents Compass’s People of the Founding Era, an online archive of individuals mentioned in the Papers of the Founding Fathers.  Additionally, I took David Seaman’s Rare Book School course “XML in Action” this past summer.  These projects have introduced me to digital editing and archiving, while also getting me thinking about other applications of DH.  An area I have yet to break into is crowdsourcing–inviting user participation and contribution–and it is an area into which Praxis will plunge me immediately.  Already, I am blogging… reaching out to people… and excited to be doing it.

We discussed in our first Praxis meeting a somewhat conflicted relationship to technology which we all share.  For my part, I still have no Smart Phone, forget to check my Facebook, and–lacking a GPS–have been known to use a paper map; notwithstanding all this, a hobbyist dream of mine is to create a digital archive of old family letters and photographs, alongside ancestor profiles.  I believe that DH is the key to preserving and disseminating this sort of material throughout the world.  At the same time, I will never cease to love the smell of a brand-new paperback book, or the feel of one of the many treasures housed in UVA Special Collections.  As a textual-studies and DH scholar, I inhabit both worlds and exist in a constant state of perplexity and wonderment… a state in which I now turn my attention to the work at hand.  To the SLab and Praxis cohort, and all our Praxis-blog followers, I am glad to make your acquaintance and thrilled to begin our work this year!

A bit about me

Sun, 08/09/2013 - 21:30

Hello readers! My name is Francesca Tripodi, and I am one of the 2013/2014 Praxis Fellows at UVa. I came to academia by a more circuitous route. Unlike many graduate students I meet, I didn’t realize that I wanted to be in academia until much later in life. As a student at the Annenberg School at the University of Southern California, my immediate interest was working in media. But after an extremely fulfilling internship at Fox Cable Networks Group, I caught the travel bug and took off to Australia, where I spent six months backpacking “down under,” followed by a month exploring New Zealand and a month in Thailand. When I returned to the States, I yearned for a more global metropolis and spent the better part of my twenties working in Washington, DC.

My first job was at the United States Telecommunications Training Institute (USTTI). I worked as a liaison between the private sector (Cisco Systems, Bechtel, Qualcomm, and Microsoft) and the public sector (FCC, NTIA) to help deliver low-cost training programs to citizens of developing countries looking to expand and improve their digital infrastructure.  In addition to organizing course content, I worked with USAID offices and the State Department to coordinate the logistics of participants traveling to the US for the training (including visa processing).  After that job, I moved to Georgetown University and eventually became the Program Director of Pathways to Success – an academic immersion program that brings high school students from rural America to Georgetown University in an effort to improve minority involvement in STEM education. One of my greatest achievements in this position was helping to secure additional funds ($1.3 million) to continue financing the program through 2012.

As an employee at Georgetown, I also took advantage of their tuition remission benefits and earned my MA in Communication, Culture and Technology. It was there that I learned I ask very “sociologically oriented” questions, and with the help of my advisor I decided to continue my education at the University of Virginia. As a fourth-year PhD Candidate in the Sociology Department, I am currently working on data I collected during an ethnographic study of alligator hunting in rural Louisiana. Some of my more immediate findings are the importance of female hunting in the community and the parallels between the Cajun culture I experienced and the media representations of Cajun life on the show “Swamp People.” In the spring I hope to defend my dissertation proposal, in an effort to answer the central research questions that currently occupy my mind: To what extent does media influence a community’s boundary-making process? In what ways do these boundaries shift depending on who controls the mediated narrative?

I am also happily married to a wonderful guy and six weeks ago we welcomed to the world a beautiful baby boy.

And so it begins…

Thu, 05/09/2013 - 22:15

This is my first blog post (ever), so I have spent a good deal of time hemming and hawing over an appropriately novel and pithy title to headline my blogging debut. Needless to say, that hasn’t happened.

Anyway, I’m Veronica Ikeshoji-Orlati, a 4th year PhD Candidate in Classical Art & Archaeology in the McIntire Department of Art here at UVA. I took a BA in Classics, with minors in Art History and Philosophy, from the University at Buffalo (2003), later returned to UB for an MA in Art History (2010), and am now weaving together the disparate threads of my personal and academic interests into my dissertation, entitled Music, Performance, and Identity in 4th century BCE South Italian Vase-Painting. The field of Classical Archaeology, and South Italian Archaeology in particular, offers up many challenges, incredible opportunities, and fascinating methodological questions, so I find my research engaging on many levels.

I am thrilled to be part of the 2013-14 Praxis cohort for a panoply of reasons. The two most important to me are 1. the opportunity to work with, and learn from, people from vastly different backgrounds with diverse personal, professional, and academic interests, and 2. the chance to plunge into the field of Digital Humanities with patient, knowledgeable guides and amicable, resourceful accomplices. Getting to work on the Ivanhoe game is an added benefit, since pedagogy happens to be a topic which occupies a significant corner of my mind and the role of digital spaces and teaching tools in the classroom is a developing interest.

That just about sums up what an introductory post should say. I look forward to sharing what we’re doing, and how it impacts my own research and thinking, here!

Welcome, new SLab grad fellows!

Mon, 26/08/2013 - 14:48

The Scholars’ Lab is pleased and proud to announce our partnership with nine new graduate fellows for the 2013-2014 academic year! They represent seven academic disciplines in the humanities and social sciences at the University of Virginia, and join a distinguished group of past recipients of Scholars’ Lab fellowships. (Since 2007,  UVa Library has offered 44 fellowships to deserving grad students in fields as diverse as History, Archaeology, Computer Music, Anthropology, Economics, English, Ethnomusicology, French, Religious Studies, Art History, Linguistics, and Architecture.)

First, we have our three winners of the UVa Library Graduate Fellowship in Digital Humanities. They are:

Erik DeLuca of the Composition and Computer Technologies Program in UVa’s McIntire Department of Music
“Community Listening in Isle Royale National Park, a sonic ethnography”

Gwen Nally of the Corcoran Department of Philosophy
“When Socrates Misleads: Falsehood and Fallacy in Plato’s Dialogues”


Tamika Richeson of the Corcoran Department of History
“I Know What Liberty Is: Black Motherhood, Labor, and Criminality, 1848-1878”

These fellows will have the opportunity to work closely with Scholars’ Lab staff over the course of the year, applying digital methods to their dissertation research and presenting that work online. Please join us on September 10th at noon in the Scholars’ Lab, when we welcome Erik, Gwen, and Tamika with a casual luncheon and hear a brief summary from each of them about what they hope to accomplish this year!

Next, we’re getting started with a third year of the Praxis Program here at UVa, which is now home base for the new, international Praxis Network!  Last year saw the refinement and reimagining of Prism (not that Prism), a tool created by the first Praxis cohort in 2011-12. Prism is a web application for crowdsourcing interpretation, and for thinking through the relationship of humanities inquiry to the methods and motives of crowdsourcing. This year, the 2013-14 Praxis team will rethink, revive, and (we expect) utterly remake the Ivanhoe Game, another platform for playful, collaborative interpretation of documents and artifacts.

2013-2014 Praxis Fellows include:

Scott Bailey (Religious Studies)
Elizabeth Fox (English)
Veronica Ikeshoji-Orlati (Classical Art & Archaeology)
Stephanie Kingsley (English)
Francesca Tripodi (Sociology)
and Zachary Stone (English)

Keep an eye on the Scholars’ Lab blog for news throughout the year, on the work of all of our wonderful graduate fellows!

Problem Solving with HTML5 Audio

Thu, 15/08/2013 - 14:19

Several years ago I worked on a project built around recordings made of William Faulkner while he was the Writer-in-Residence at the University of Virginia in 1957 and 1958. The project, Faulkner at Virginia, transcribed the audio and then keyed the main components of the audio to the text using TEI. In order to provide playback of an individual clip, we used a streaming server (Darwin Streaming Server) that was being managed by another group. This allowed me to provide “random” access to the components of the audio without needing to split up the files. Using the associated API, I could generate a clip of the data with something like this:

try {
  QT_WriteOBJECT(
    '', '300', '16', '',
    'autoplay', 'false',
    'scale', 'tofit',
    'starttime', '00:03:44.33:00',
    'endtime', '00:04:42.68:00'
  );
} catch (e) {
  // document.write(e);
}

While this is kind of a nasty bit of JavaScript, it (somewhat) abstracts the Object embed code:

<object classid="clsid:02BF25D5-8C17-4B23-BC80-D3488ABDDC6B" width="300" height="16" codebase=",3,0,0">
  <param name="src" value="">
  <param name="autoplay" value="false">
  <param name="scale" value="tofit">
  <param name="starttime" value="00:03:44.33:00">
  <param name="endtime" value="00:04:42.68:00">
  <embed src="" width="300" height="16" pluginspage="" autoplay="false" scale="tofit" starttime="00:03:44.33:00" endtime="00:04:42.68:00">
</object>

At the time (the late 2000s), the WHATWG specifications for audio were still pretty nascent and hadn’t seen much implementation in browsers, so the approach of using a third-party plugin to provide “advanced” interaction with a media element was pretty much the only game in town.

As with any project that relies on web technology, eventually things start to break, or just flat-out not work on devices that can access the Internet (e.g. mobile). Browsers have been in a war with each other for speed, implementation of “standards”, and market share. This has been a real boon for users as it has allowed developers to really push what the web is capable of as a run-time environment. Unfortunately for the Faulkner audio, the code got to the point where the approach stopped functioning consistently across all desktop browsers (interestingly, Chrome seemed to manifest this issue most consistently), and oh yeah, there are those iOS-based mobile devices that can’t play this either.

HTML audio to the rescue

You know that modern browsers (everything but IE < 9) can play audio natively (i.e. without a plugin), right? The only really horrible thing is that not every browser handles the same “native” format. You can check out a good table of codec support for audio, but it basically boils down to needing both an MP3 and an Ogg Vorbis version of each audio file to cover nearly all browsers (IE being the outlier, with this working in IE 9+).

<audio id="T106" controls="true" preload="auto"> <source src="" type="audio/mpeg; codecs='mp3';"> <source src="" type="audio/ogg; codecs='vorbis'"> </audio>

This provides something on your page like this:

The great thing is that this will work on a mobile device as well. Score one for the Internet! Now to figure out the best way to do this.

Split the files

My first instinct was to take the files and split them into “clips” on the page. This would allow the browser to provide its native playback mechanism, and allow individuals to grab the segments for remixing (still waiting for an auto-tuned remix of “Spotted Horses”). In the TEI source are the start and end times for each of the “snippets.” My go-to tool for working with media is ffmpeg, and I knew I could break up the files into components, copying the audio stream (without re-encoding) into a new MP3. I wrote a quick XSLT to generate a shell script with the ffmpeg commands to run.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="" xmlns:tei="" exclude-result-prefixes="tei" version="2.0">
  <xsl:output method="text" />
  <xsl:strip-space elements="*"/>
  <xsl:template match="/">
    <xsl:variable name="basename">
      <xsl:value-of select="//idno[@type='analog tape']" />
    </xsl:variable>
    <xsl:for-each select="//div2">
ffmpeg -i <xsl:value-of select="translate($basename, '-', '')"/>.mp3 -ss <xsl:value-of select="@start"/> -t <xsl:value-of select="@end"/> -acodec copy <xsl:value-of select="@id"/>.mp3</xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

This generated a nice file of the commands to run.

ffmpeg -i T-106.mp3 -ss 00:00:00.00 -t 00:00:23.93 -acodec copy wfaudio02_1.1.mp3
ffmpeg -i T-106.mp3 -ss 00:00:23.94 -t 00:00:44.37 -acodec copy wfaudio02_1.2.mp3
ffmpeg -i T-106.mp3 -ss 00:00:44.59 -t 00:01:55.07 -acodec copy wfaudio02_1.3.mp3
ffmpeg -i T-106.mp3 -ss 00:01:55.06 -t 00:02:38.19 -acodec copy wfaudio02_1.4.mp3

At this point, all the data had been processed, so I needed to see if this was actually going to work. I wrote another XSLT to preview what was going on. Nothing too fancy, just an HTML wrapper, with most of the “work” happening in the div2 template.

<xsl:template match="div2">
  <xsl:variable name="basename">
    <xsl:value-of select="@id"/>
  </xsl:variable>
  <div class="{@type}">
    <div class="row">
      <div class="span10">
        <audio id="{$basename}" controls="true" preload="auto">
          <source src="{$basename}.ogg" type="audio/ogg; codecs='vorbis'"></source>
          <source src="{$basename}.mp3" type="audio/mpeg; codecs='mp3';"></source>
        </audio>
      </div>
      <div class="span2 omega">
        <button class="btn btn-info top" type="button"><i class="icon-circle-arrow-up"></i> top</button>
      </div>
    </div>
    <div class="row">
      <p><xsl:value-of select="u"/></p>
      <hr/>
    </div>
  </div>
</xsl:template>

Since the segment file names were derived from their id attributes, I was able to just point at the file without a lot of poking around. Now for the test!

I started playing with it and it appeared to work just fine. I then asked one of my colleagues who was working remotely to take a look at it, and she ran into a show stopper: when she loaded the page, only the first several clips were loading.

In the audio element, I had added the preload="auto" attribute to allow the file to buffer and play before the entire file had downloaded. When I profiled what was going on, I realized that at somewhere around 20Mb of download, the browser was giving up on preloading the audio for immediate playback. If you remove that attribute, the browser won’t buffer the file at all, and you have to download the entire file before playback can start. Definitely not what I was aiming at. Time to try something else.

Audio Time Range

In reading the MDN docs on HTML5 audio, I came across a section on Specifying a playback range. This looks promising! There is one file reference, and I just need to get the playback times in. It is unclear from the description, however, whether the browser treats this as a single file transfer, or each segment as its own download thread. Fortunately it’s just a small tweak to the XSLT generating the audio elements.

<xsl:template match="div2">
  <xsl:variable name="basename">
    <xsl:value-of select="translate(/TEI.2/teiHeader/fileDesc/publicationStmt/idno[@type='analog tape'], '-', '')" />
  </xsl:variable>
  <div class="{@type}">
    <div class="row">
      <div class="span10">
        <audio id="{$basename}" controls="true" preload="auto">
          <source src="{$basename}.mp3#t={@start},{@end}" type="audio/mpeg; codecs='mp3';"></source>
          <source src="{$basename}.ogg#t={@start},{@end}" type="audio/ogg; codecs='vorbis'"></source>
        </audio>
      </div>
      <div class="span2 omega">
        <button class="btn btn-info top" type="button"><i class="icon-circle-arrow-up"></i> top</button>
      </div>
    </div>
    <div class="row">
      <p><xsl:value-of select="u"/></p>
      <hr/>
    </div>
  </div>
</xsl:template>

After checking the browser again, it looks like the same issue is there; the browser treats each segment as its own download thread and chokes when it gets to around 20Mb. Meh; the Internet. Ok, time to try something different.

Audio Sprites

When I was writing my book on developing HTML5 games, I ran across a great article, Audio Sprites (and fixes for iOS), by Remy Sharp. The idea draws inspiration from CSS sprite sheets, where you put all your image assets into a single file and then display just the portion of the image you want on an HTML page. With audio sprites, instead of shifting the coordinates of an image around, you shift the playhead of an MP3 file and tell it how long to play. This is really great for games, as you can give players a single file to download that contains all the audio. Maybe this technique will work here…

Since I wanted to see how well this would work, and not necessarily write a library to support this, I used the howler.js library, which has support for audio sprites. Back to the XSLT.

The howler.js API defines sprites by name to allow you to refer to them as variables in your code (again, it’s written for developing games). It also wants you to tell it, in milliseconds, where to start playing and how long to play. Ugh, my start and end times are in hh:mm:ss.s format. I wrote a quick function to explode the timestamps and add them together as milliseconds (this is actually a bit off, since I didn’t spend the time to work in the real conversion units, but I wanted to see if this was going to work before putting that time in).

<xsl:function name="tei:timeToMilliseconds">
  <xsl:param name="timestamp" />
  <xsl:variable name="components">
    <xsl:value-of select="sum(for $s in tokenize($timestamp, ':') return number($s)) * 1000" />
  </xsl:variable>
  <xsl:value-of select="$components[1]"/>
</xsl:function>
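For the record, a correct conversion needs to weight each component rather than just summing them; a quick sketch in JavaScript (this helper is mine, not from the original post):

```javascript
// Correct hh:mm:ss.s -> milliseconds conversion: hours and minutes are
// weighted before the total is scaled, unlike the quick XSLT sum above.
function timeToMilliseconds(timestamp) {
  var parts = timestamp.split(':').map(Number); // [hours, minutes, seconds.s]
  return Math.round(((parts[0] * 60 + parts[1]) * 60 + parts[2]) * 1000);
}
// timeToMilliseconds('00:01:55.07') -> 115070
```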

Now I can set up JavaScript for the Howler object literal for use on the page.
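The snippet was embedded in the original post and hasn’t survived here, but a Howler sprite definition would look roughly like this (file names and sprite keys are invented; the offsets and durations echo the clip boundaries above, and howler’s option names have changed across versions, so treat this as a sketch):

```javascript
// Hypothetical sketch: one audio file, with named sprite ranges in milliseconds.
// (howler.js 1.x took `urls`; newer versions use `src`.)
var tape = new Howl({
  urls: ['T106.mp3', 'T106.ogg'],
  sprite: {
    clip1: [0, 23930],      // [start offset in ms, duration in ms]
    clip2: [23940, 20430]
  },
  onload: function () {
    // Hide the "tap to load" button once the file is ready.
  }
});
// Wired up to a custom play button: tape.play('clip1');
```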

A few notes here: iOS devices require user interaction before an audio asset can be loaded. To handle mobile devices, I added a button to the page that is hidden with a media query on desktop browsers. When the user clicked on it, they would see a throbber and then a notice when the file had loaded. I also had to add my own “play” buttons, as this API is really meant for games.

Awesome, it works! But is this really a good idea? This is kind of an exotic (“clever” in programming terms) approach. It also relies on an obscure library that may not be maintained in the future. This probably isn’t the best path forward…

Blended Approach

After some more thought, maybe what’s needed here is a blended approach. I liked the fact that with the timestamps I only have to create one extra file (and not 2 * n clips for both mp3 and ogg formats), but there was that sticky preload issue. This is where JavaScript can also help. What if there was an obvious mechanism for a user to click, and then I could use JavaScript to dynamically construct an audio element in the DOM and only start streaming the “segment” the user requested? This just may work.

With a little JavaScript, I take a look at the DOM and construct an audio element, passing back to the browser the smallest version of the file (ogg then mp3) the browser can play back natively.
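The embedded snippet is gone, but the approach can be sketched like this (function and variable names are my own invention):

```javascript
// Sketch: build an <audio> element on demand for a single clip, preferring
// Ogg when the browser supports it, and using a media-fragment URI (#t=)
// so only the requested time range is played.
function segmentSrc(basename, ext, start, end) {
  return basename + '.' + ext + '#t=' + start + ',' + end;
}

function playSegment(container, basename, start, end) {
  var audio = document.createElement('audio');
  audio.controls = true;
  var canOgg = audio.canPlayType('audio/ogg; codecs="vorbis"');
  audio.src = segmentSrc(basename, canOgg ? 'ogg' : 'mp3', start, end);
  container.appendChild(audio); // replaces the clicked icon with the playbar
  audio.play();
}
```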

So the final product results in this, which has an animation to remove the icon, replacing it with the native audio playbar:

Generating Ogg

Now that I’ve got a basic system for playing the audio that works on just about every browser, it was time to take a look at converting these audio files. There were about 60 MP3s that needed to be transcoded for this project. If it were just a handful, I might have done the transcoding manually in something like Audacity, but there were a lot of files, and I’m a “lazy” developer. Obviously this is another job for ffmpeg. I had to recompile it from homebrew (an OS X package manager) to include the libvorbis bindings.

brew unlink ffmpeg
brew install ffmpeg --with-theora --with-libogg --with-libvorbis

After getting the proper transcoding libraries installed, I wrote a quick bash script to convert all MP3s in a directory to ogg.

#!/usr/bin/env bash
FILES="*.mp3"
for f in $FILES
do
  base=`basename "$f" .mp3`
  echo "Converting $base..."
  command="ffmpeg -i $f -acodec libvorbis $base.ogg"
  `$command`
done

After this ran (it took a few hours), I had a complete set of MP3 and Ogg Vorbis files for the project.


After rethinking how to address the problem of streaming audio to multiple platforms, with various limitations on how the audio specification is implemented, I finally landed on something that is not novel. What it does do, however, is move away from an approach that is no longer widely supported (the use of QTSS), to a single method that leverages the native support of modern browsers to do something reasonably simple…play a bit of sound. I also got rid of a lot of JavaScript (which breaks), the reliance on another server (which breaks), and sped up the delivery of the audio to the client. Additionally, since this isn’t an exotic (or complicated) replacement, the next person who has to do something with this code in five years will have a fighting chance at figuring out what is going on!

Parsing BC dates with JavaScript

Wed, 14/08/2013 - 18:29

[Cross-posted from]

Last semester, while giving a workshop about Neatline at Beloit College in Wisconsin, Matthew Taylor, a professor in the Classics department, noticed a strange bug – Neatline was ignoring negative years, and parsing BC dates as AD dates. So, if you entered “-1000″ for the “Start Date” field on a Neatline record, the timeline would display a dot at 1000 AD. I was surprised by this because Neatline doesn’t actually do any of its own date parsing – the code relies on the built-in Date object in JavaScript, which is implemented natively in the browser. Under the hood, when Neatline needs to work with a date, it just spins up a new Date object, passing in the raw string value entered into the record form:

Sure enough, though, this doesn’t work – Date just ignores the negative sign and spits back an AD date. And things get even funkier when you drift within 100 years of the year 0. For example, the year 80 BC parses to 1980 AD, bizarrely enough:

Obviously, this is a big problem if you need to work with ancient dates. At first, I was worried that this would be rather difficult to fix – if we really were hitting up against bugs in the native implementation of the date parsing, it seemed likely that Neatline would have to get into the tricky business of manually picking apart the strings and putting together the date objects by hand. It always feels icky to redo functionality that’s nominally built into the programming environment. But I didn’t see any other option – the code was unambiguously broken as it stood, and in a really dramatic way for people working with ancient material.

So, grumbling at JavaScript, I started to sketch in the outlines of a bespoke date parser. Soon after starting, though, I was idly fiddling around with the Date object in the Chrome JavaScript terminal when I stumbled across an unexpected (and sort of inexplicable) solution to the problem. In reading through the documentation for the Date object over at MDN, I noticed that the constructor actually takes three different configurations of parameters. If you pass in a single integer, it treats it as a Unix timestamp; if you pass a single string, it treats it as a plain-text date string and tries to parse it into a machine-readable date (this was the process that appeared to be broken). But you can also pass three separate integers – a year, a month, and a day. Out of curiosity, I plugged in a negative integer for the year, and arbitrary values for the month and day:
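The embedded example didn’t survive extraction; the gist is simply:

```javascript
// Passing year/month/day integers directly, instead of a string to be parsed.
var d = new Date(-1000, 0, 1); // year -1000, January (months are 0-indexed), the 1st
d.getFullYear(); // -1000 -- the negative year survives
```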

Magically, this works. A promising start, but not a drop-in solution for the problem – in order to use this, Neatline would still have to manually extract each of the date parts from the plain-text date strings entered in the record forms (or break the dates into three parts at the level of the user interface and data model, which seemed like overkill). Then, though, I tried something else – working with the well-formed, BC date object produced with the year/month/day integer values, I tried casting it back to ISO8601 format with the toISOString method. This produced a date string with a negative date and…

two leading zeros before the four-digit representation of the year. I had never seen this before. I immediately tried reversing the process and plugging the outputted ISO string back into the Date constructor:

And, sure enough, this works. And it turns out that it also fixes the incorrect parsing of two-digit years:
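The embedded snippets are gone; a sketch of the whole round-trip (using Date.UTC here so a timezone offset can’t shift the year):

```javascript
// Build a BC date from integer parts, serialize it, and parse it back.
var bc = new Date(Date.UTC(-1000, 0, 1));
var iso = bc.toISOString(); // "-001000-01-01T00:00:00.000Z" -- the expanded year
var back = new Date(iso);
back.getUTCFullYear(); // -1000

// The padded ISO form also sidesteps the two-digit-year quirk:
new Date('0080-01-01').getUTCFullYear(); // 80, not 1980
```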

I am deeply, profoundly perplexed by this. The ISO8601 specification makes cursory note of an “expanded” representation for the year part of the date, but doesn’t go into specifics about how or why it should be used. Either way, though, it works in all major browsers. Mysterious stuff.

Reprinting Printed Parts

Wed, 14/08/2013 - 15:00

As some of you know, the Scholars’ Lab has a spiffy 3D printer, a Makerbot Replicator 2. We’ve had fun with it, printing all sorts of wonderful things. As time went on and we continued using it, we ran into a problem plenty of other folks have encountered, where the plunger that pushes filament against the drive gear was weakening. The first solution we tried was tightening up the plunger, but we’d have to do this frequently. A better solution was to print a new set of parts—a spring-loaded arm and mount—to replace the plunger. So I ordered up the hardware I’d need (spring, bearing, and bolts), and when those arrived in the mail, I printed the arm and other parts, disassembled the drive block on the printer, and replaced the plunger with the new spring-loaded arm. You can see the fully-assembled drive block in my made stuff on Thingiverse. After I put the drive block back on the printer, I could tell an immediate difference in the prints. The plastic was extruding more smoothly than before. Loading the filament was a tiny bit trickier, but well worth it.

Then last month, I hauled the printer halfway across the country to use in a workshop on 3D modeling and printing at the Digital Humanities conference in Lincoln, Nebraska. When I started a test print to calibrate the printer, I noticed the printer wasn’t extruding plastic as well as it had been. So I took the drive block off again, and noticed the arm was loose—the spring wasn’t actually pushing up on it enough to exert pressure on the filament coming in. I assumed I had just cracked the arm somehow during the drive out to Nebraska, so I stretched the spring out a bit so it would better push the arm up, and went on with the workshop as planned.

Yesterday I finally got around to printing a replacement arm. After taking the old arm and mount off, and comparing the old arm with the new one, it seems my initial thought that I had broken the arm was incorrect:

Comparing printed arms for the drive block. The black arm is the replacement, and the orange arm is the old one.

Upon further inspection, the old one isn’t broken as I had first suspected. Comparing the two (the orange arm is the first one I printed, and the black arm is the replacement I just printed), it looks like the orange one has warped, very likely due to the plastic arm being so close to the heated extruder. It could also be that the spring was strong enough to warp the plastic arm bit by bit over time; I’m sure the proximity to the heated extruder helped with that too.

Assembled drive block with spring-loaded arm.

I decided to go ahead and use the newly printed arm for now. Needless to say, if you’ve tried this solution for your Replicator 2, you’ll be better off in the long term sending for the machined replacement parts, as I did. But for now, the new arm is working great, and should hold us over until the machined parts arrive.

Now available: Report and data from SCI’s survey on career prep and graduate education

Mon, 12/08/2013 - 15:05

[Cross-posted at my personal website]

I am delighted to announce the release of a report, executive summary, data, and slides from the Scholarly Communication Institute’s recent study investigating perceptions of career preparation provided by humanities graduate programs. The study focused on people with advanced degrees in the humanities who have pursued alternative academic careers. Everything is CC-BY, so please read, remix, and share. I’d especially welcome additional analysis on the datasets.

All of the materials are openly accessible through the University of Virginia’s institutional repository:

(Note that the files available for download are listed in the top left-hand corner of each Libra listing.)

Having worked on this for over a year, I’m more convinced than ever about the importance of incorporating public engagement and collaboration into humanities doctoral education—not only to help equip emerging scholars for a variety of career outcomes, but also to maintain a healthy, vibrant, and rigorous field. It has been fascinating to connect with scholars working in such a diverse range of stimulating careers, and to see some of the patterns in their experiences.

Many, many thanks to everyone who has contributed time and energy to this project—from completing the survey, to reading (or listening to) the preliminary reports, to providing feedback and critique.

Why do we trust automated tests?

Mon, 12/08/2013 - 14:08

[Cross-posted from]

I’m fascinated by this question. Really, it’s an academic problem, not so much a practical one – as an engineering practice, testing just works, for lots of simple and well-understood reasons. Tests encourage modularity; the process of describing a problem with tests makes you understand it better; testing forces you to go beyond the “happy case” and consider edge cases; tests provide a kind of functional documentation of the code, making it easier for other developers to get up to speed on what the program is supposed to do; and they inject a sort of refactoring spidey-sense into the codebase, a guard against regressions when features are added.

At the same time, though, there’s a kind of software-philosophical paradox at work. Tests are just code – they’re made of the same stuff as the programs they evaluate. They’re highly specialized meta-programs that operate on other programs, but programs nonetheless, and vulnerable to the same ailments that plague regular code. And yet we trust tests in a way that we don’t trust application code. When a test fails, we tend to believe that the application is broken, not the tests. Why, though? If the tests are fallible, then why don’t they need their own tests, which in turn would need their own, and so on and so forth? Isn’t it just like fighting fire with fire? If code is unreliable by definition, then there’s something strange about trying to conquer unreliability with more unreliability.

At first, I sort of papered over this question by imagining that there was some kind of deep, categorical difference between testing code and “regular” code. The tests/ directory was a magical realm, an alternative plane of engineering subject to different rules. Tests were a boolean thing, present or absent, on or off – the only question I knew to ask was “Does it have tests?”, and, as a correlate of that, “What’s the coverage level?” (i.e., “How many tests does it have?”) The assumption being, of course, that the tests were automatically trustworthy just because they existed. This is false, of course [1]. The process of describing code with tests is just another programming problem, a game at which you constantly make mistakes – everything from simple errors in syntax and logic up to really subtle, hellish-to-track-down problems that grow out of design flaws in the testing harness. Just as it’s impossible to write any kind of non-trivial program that doesn’t have bugs, I’ve never written a test suite that didn’t (doesn’t) have false positives, false negatives, or “air guitar” assertions (which don’t fail, but somehow misunderstand the code, and fail to hook onto meaningful functionality).

So, back to the drawing board – if there’s no secret sauce that makes tests more reliable in principle, where does their authority come from? In place of the category difference, I’ve started to think about it just in terms of a relative falloff in complexity between the application and the tests. Testing works, I think, simply because it’s generally easier to formalize what code should do than how it should do it. All else equal, tests are less likely to contain errors, so it makes more sense to assume that the tests are right and the application is wrong, and not the other way around. By this logic, the value added is proportional to the height of this “complexity cliff” between the application and the tests, the extent to which it’s easier to write the tests than to make them pass. I’ve started using this as a heuristic for evaluating the practical value of a test: The most valuable tests are the ones that are trivially easy to write, and yet assert the functionality of code that is extremely complex; the least valuable are the ones that approach (or even surpass) the complexity of their subjects.

For example, take something like a sorting algorithm. The actual implementation could be rather dense (ignore that a custom quicksort in JavaScript is never useful):
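The embedded snippet didn’t survive; as a stand-in, a dense little quicksort might look like this:

```javascript
// A stand-in for the "dense" implementation described above: an in-place
// quicksort with a Lomuto-style partition.
function quicksort(arr, lo, hi) {
  if (lo === undefined) { lo = 0; hi = arr.length - 1; }
  if (lo >= hi) return arr;
  var pivot = arr[hi], i = lo;
  for (var j = lo; j < hi; j++) {
    if (arr[j] < pivot) {
      var tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
      i++;
    }
  }
  var tmp2 = arr[i]; arr[i] = arr[hi]; arr[hi] = tmp2;
  quicksort(arr, lo, i - 1);
  quicksort(arr, i + 1, hi);
  return arr;
}
```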

The tests, though, can be fantastically simple:
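A hypothetical reconstruction of those simple tests (the `sort` here is just a stand-in for whatever implementation is under test, and `assertEqual` a minimal helper):

```javascript
// `sort` stands in for the implementation under test.
function sort(arr) {
  return arr.slice().sort(function (a, b) { return a - b; });
}

// Minimal assertion helper: compare serialized arrays.
function assertEqual(actual, expected) {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error('expected ' + JSON.stringify(expected) + ', got ' + JSON.stringify(actual));
  }
}

// Obvious inputs, obvious outputs -- a mistake here would be glaring.
assertEqual(sort([3, 1, 2]), [1, 2, 3]);
assertEqual(sort([]), []);
assertEqual(sort([2, 2, -1]), [-1, 2, 2]);
```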

These are ideal tests. They completely describe the functionality of the code, and yet they fall out of your fingers effortlessly. A mistake here would be glaringly obvious, and thus extremely unlikely – a failure in the suite almost certainly means that the code is actually defective, not that it’s being exercised incorrectly by the tests.

Of course, this is a cherry-picked example. Sorting algorithms are inherently easy to test – the complexity gap opens up almost automatically, with little effort on the part of the programmer. Usually, of course, this isn’t the case – testing can be fiendishly difficult, especially when you’re working with stateful programs that don’t have the nice, data-in-data-out symmetry of a single function. For example, think about thick JavaScript applications in the browser. A huge amount of busywork has to happen before you can start writing actual tests – HTML fixtures have to be generated and plugged into the testing environment; AJAX calls have to be intercepted by a mock server; and since the entire test suite runs inside a single, shared global environment (PhantomJS, a browser), the application has to be manually burned down and reset to a default state before each test.

In the real world, tests are never this easy – the “complexity cliff” will almost always be smaller, the tests less authoritative. But I’ve found that this way of thinking about tests – as code that has an imperative to be simpler than the application – provides a kind of second axis along which to apply effort when writing tests. Instead of just writing more tests, I’ve started spending a lot more time working on low-level, infrastructural improvements to the testing harness, the set of abstract building blocks out of which the tests are constructed. So far, this has taken the form of building up semantic abstractions around the test suite, collections of helpers and high-level assertions that can be composed together to tell stories about the code. After a while, you end up with a kind of codebase-specific DSL that lets you assert functionality at a really high, almost conversational level. The chaotic stage-setting work fades away, leaving just the rationale, the narrative, the meaning of the tests.

It becomes an optimization problem – instead of just trying to make the tests wider (higher coverage), I’ve also started trying to make the tests lower, to drive down complexity as far towards the quicksort-like tests as possible. It’s sort of like trying to boost the “profit margin” of the tests – more value is captured as the difficulty of the tests dips further and further below the difficulty of the application:

[1] Dangerously false, perhaps, since it basically gives you free license to write careless, kludgy tests – if a good test is a test that exists, then why bother putting in the extra effort to make it concise, semantic, readable?

Displaying Recent Neatline Exhibits on your Omeka Home Page

Mon, 12/08/2013 - 06:00

The charismatic Alex Gil submitted a feature request to Neatline asking to be able to browse Neatline exhibits on your Omeka home page. It turns out you can already specify which page you want as your home page in Omeka 2.0, so that helped with Alex’s original query. But as we discussed the issue, Alex also wondered about putting a list of recent Neatline exhibits on the home page, much the same way Omeka already does with recent items. While we’re not sure yet about putting this kind of thing in the plugin itself, I mentioned that it’s fairly easy to do in your own theme using one of Omeka’s hooks, and promised him a blog post explaining more. Here’s me making good on that promise.

In case you didn’t know, Omeka has plenty of ways for developers to add new content to an Omeka site or filter existing content using hooks and filters, respectively. To use them, you first need to write a function that adds or changes content to your preference, then you pass that function to the relevant hook or filter in Omeka. Some dummy code to illustrate:

<?php
function my_custom_function()
{
    echo 'Hello world!';
}
add_plugin_hook('hook_name', 'my_custom_function');

You could put this kind of code anywhere that Omeka could run it, particularly a new plugin or your activated theme’s custom.php file. (An Omeka theme’s custom.php file is a great place to put some custom code for your Omeka site, without having to go to the trouble of creating and activating a plugin.)

In our case, we want to append some new content to the home page of an Omeka site, so we’ll need to find a hook to let us do that. Fortunately, we have one available—public_home—so let’s use that to display some recent Neatline exhibits.

(Keep in mind that the following code should work in Omeka 2.0 and Neatline 2.0; you can take a similar approach for earlier versions of each, but some of the functions would be different.)

First, we’ll need to create a custom.php file in your current active theme, if one doesn’t already exist. (If it does exist, we’ll use that one.) Make sure the file is in the root of your theme: omeka/themes/your-theme/custom.php.
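If you’re creating the file from scratch, it can start out as little more than an opening PHP tag; Omeka picks the file up automatically when the theme is active. (A minimal sketch; “your-theme” is a placeholder for your actual theme directory.)

```php
<?php
// omeka/themes/your-theme/custom.php
// Custom hooks and filters for this theme go here. Omeka includes
// this file automatically when the theme is activated.
```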

Next we’ll need to write a function that gets a certain number of Neatline exhibits and lists them out, and put that in our custom.php file. We’ll name our function display_recent_neatline_exhibits, and put all our goodies in there. Let’s create the function:

<?php

function display_recent_neatline_exhibits()
{
}

After we’ve created the function, we’ll go ahead and pass that function to the public_home hook:

<?php

function display_recent_neatline_exhibits()
{
}

add_plugin_hook('public_home', 'display_recent_neatline_exhibits');

We still shouldn’t see any changes on the home page, since our function isn’t actually doing anything. But you shouldn’t get any errors on the page either. If you do, make sure you have every curly brace and semicolon and all the other characters right; PHP is quite dramatic about syntax errors.

Now let’s add some stuff to our function to get some recent Neatline exhibits. First, let’s define a variable $html and set it equal to an empty string. In the end, we’ll echo the value of $html, so we want it equal to at least something, in case you don’t actually have any Neatline exhibits to display.

<?php

function display_recent_neatline_exhibits()
{
    $html = '';
    echo $html;
}

add_plugin_hook('public_home', 'display_recent_neatline_exhibits');

Next we’ll create a variable, $neatlineExhibits, and assign it the results of a query using Omeka’s get_records function. The get_records function takes three arguments: the type of record, an array of query parameters, and a number to limit the results. We’ll query for the ‘NeatlineExhibit’ record type, set the recent parameter to true, and limit our results to five:

<?php

function display_recent_neatline_exhibits()
{
    $html = '';

    // Get our recent Neatline exhibits, limited to five.
    $neatlineExhibits = get_records('NeatlineExhibit', array('recent' => true), 5);

    echo $html;
}

add_plugin_hook('public_home', 'display_recent_neatline_exhibits');

Now we’ll register the results in $neatlineExhibits for a record loop, and use a PHP if statement to check whether we actually have exhibits to display:

<?php

function display_recent_neatline_exhibits()
{
    $html = '';

    // Get our recent Neatline exhibits, limited to five.
    $neatlineExhibits = get_records('NeatlineExhibit', array('recent' => true), 5);

    // Set them for the loop.
    set_loop_records('NeatlineExhibit', $neatlineExhibits);

    // If we have any to loop, we'll append to $html.
    if (has_loop_records('NeatlineExhibit')) {
    }

    echo $html;
}

add_plugin_hook('public_home', 'display_recent_neatline_exhibits');

Inside our if statement, we’ll update the value of $html so that, instead of echoing an empty string, it echoes some HTML that includes links to each of our recent Neatline exhibits. Remember that this HTML only gets printed if we actually have Neatline exhibits in the database; otherwise we just echo the empty string.

<?php

function display_recent_neatline_exhibits()
{
    $html = '';

    // Get our recent Neatline exhibits, limited to five.
    $neatlineExhibits = get_records('NeatlineExhibit', array('recent' => true), 5);

    // Set them for the loop.
    set_loop_records('NeatlineExhibit', $neatlineExhibits);

    // If we have any to loop, we'll append to $html.
    if (has_loop_records('NeatlineExhibit')) {
        $html .= '<ul>';
        foreach (loop('NeatlineExhibit') as $exhibit) {
            $html .= '<li>' . nl_getExhibitLink(
                $exhibit,
                'show',
                metadata($exhibit, 'title'),
                array('class' => 'neatline')
            ) . '</li>';
        }
        $html .= '</ul>';
    }

    echo $html;
}

add_plugin_hook('public_home', 'display_recent_neatline_exhibits');

As you can see, we append an opening unordered list tag, <ul>, to $html. (The .= operator in PHP appends a string onto an existing variable.) Then we use Omeka’s loop function to loop through our set of Neatline exhibits. Inside that loop, we once again add something to the value of $html: a list item wrapping a link to the current Neatline exhibit in the loop. To help us make that link, we use a function provided by the Neatline plugin: nl_getExhibitLink. We pass values for four arguments: the exhibit object (defined in $exhibit in the foreach loop); the action or route you want the link to take; the text of the link (here we’ve used Omeka’s metadata function to give us the title of the exhibit); and an array of attributes for the link (we’ve added a class attribute equal to ‘neatline’). Each item ends with a closing list item tag, and we close the unordered list after the loop.
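The append pattern itself is plain PHP, so you can try it outside Omeka. In this sketch, a hard-coded array of titles stands in for the records that loop('NeatlineExhibit') would yield, and plain text stands in for the links nl_getExhibitLink would build:

```php
<?php
// Minimal sketch of building list markup with .= (no Omeka required).
// $exhibits stands in for the loop('NeatlineExhibit') results.
$exhibits = array('Exhibit One', 'Exhibit Two');

$html = '';
$html .= '<ul>';
foreach ($exhibits as $title) {
    // In the real function, each $title would be wrapped in a link
    // produced by nl_getExhibitLink.
    $html .= '<li>' . $title . '</li>';
}
$html .= '</ul>';

echo $html; // <ul><li>Exhibit One</li><li>Exhibit Two</li></ul>
```

If there are no exhibits, the foreach body never runs and only the empty list wrapper is built, which mirrors why the real function guards the whole block with has_loop_records.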

And that should do it. You can see a version more or less the same as what I demonstrate here in a public gist I published earlier in the week. If you’d like to display a recent list of Neatline exhibits on your Omeka home page, just grab this code, and put it in your theme’s custom.php template.

Speaking in Code

Thu, 08/08/2013 - 16:02

We’re pleased to announce that applications are open for a 2-day, NEH-funded symposium and summit to be held at the Scholars’ Lab this November 4th and 5th.

“Speaking in Code” will bring together a small cohort of accomplished digital humanities software developers. Together, we will give voice to what is almost always tacitly expressed in DH development work: expert knowledge about the intellectual and interpretive dimensions of code-craft, and unspoken understandings about the relation of our labor and its products to ethics, scholarly method, and humanities theory.

Over the course of two days, participants will:

  • reflect on and express, from developers’ own points of view, what is particular to the humanities and of scholarly significance in DH software development products and practices;
  • and collaboratively devise an action-oriented agenda to bridge the gaps in critical vocabulary and discourse norms that can frequently distance creators of humanities platforms or tools from the scholars who use and critique them.

In addition to Scholars’ Lab developers and project managers, facilitators include Steve Ramsay, Bill Turkel, Stéfan Sinclair, Hugh Cayless, and Tim Sherratt.  The SLab particularly encourages and will prioritize participation of developers who are women, people of color, LGBTQ, or from other under-represented groups. (See “You Are Welcome Here” for more info.)

This will be the first focused meeting to address the implications of tacit knowledge exchange in digital humanities software development. Check out the Speaking in Code website to apply! Deadline September 12th.

Announcing Neatline 2.0.2!

Wed, 07/08/2013 - 18:38

Today we’re pleased to announce the release of Neatline 2.0.2! This is a maintenance release that adds a couple of minor features and fixes some bugs we’ve rooted up in the last few weeks:

  • Fixes a bug that was causing item-import queries to fail when certain combinations of other plugins were installed alongside Neatline (thanks Jenifer Bartle and Trip Kirkpatrick for bringing this to our attention).

  • Makes it possible to toggle the real-time spatial querying on and off for each individual exhibit. This can be useful if you have a small exhibit (e.g., 10-20 records) that can be loaded into the browser all at once without causing performance problems, and you want to avoid the added load on the server incurred by the dynamic querying.

  • Fixes some performance issues with the OpenStreetMap layer in Chrome.

And more! Check out the release notes for the full list of changes, and grab the new code from the Omeka add-ons repository. Also, watch this space for a couple of other Neatline-related releases in the coming weeks. Jeremy and I are working on a series of themes for Omeka specifically designed to display Neatline projects, including the NeatLight theme, which is currently used on the Neatline Labs site I’ve started playing around with (still a work in progress). We’re also just about ready to cut a public release of the NeatlineText plugin, which makes it possible to connect records in Neatline exhibits to individual sections, paragraphs, sentences, and words in text documents (check out this example).

Until then, give the new code a spin, and let us know what you think!

Scholars’ Lab Speaker Series: James Smithies

Mon, 29/07/2013 - 16:19

Speaker Series Brown Bag: James Smithies
The UC CEISMIC Digital Archive: Co-ordinating Libraries, Museums, Archives, Individuals and Government Agencies in a Disaster Management Context

On July 22, Dr. James Smithies, Senior Lecturer in Digital Humanities at the University of Canterbury in Christchurch, New Zealand, spoke in the Scholars’ Lab about his work designing and developing the CEISMIC Digital Archive.

The Canterbury region in the South Island of New Zealand has experienced over 11,000 earthquakes since September 2010, including a devastating magnitude 6.3 quake on February 22nd 2011 that resulted in the loss of 185 lives. Only months after the February quake, while university staff were teaching from tents in the approach to winter, a fledgling digital humanities programme was established that had as its first goal the development of a national federated digital archive to preserve the vast quantities of content being produced as a result of the earthquakes. Paul Millar and James Smithies drew together a consortium of 10 local and national agencies representing New Zealand’s libraries, museums, archives and cultural organisations in an effort to ensure a co-ordinated response. They then led technical development of a national federated archive and a bespoke research archive. The CEISMIC archive has recently completed Phase 1 of its technical development, and includes over 20,000 items. Consortium member organisations, local government agencies, and commercial companies provide content alongside community groups and individuals. Projections indicate that the archive will hold 100,000 items by the end of 2013. The intention is to remain operational for the 10-15 years it is expected to take to rebuild the region. This talk will describe the current state of the archive, and explain how Millar and Smithies used methods inspired by the digital humanities community to achieve their goals.

You can find Dr. Smithies online on his website and on Twitter (@jamessmithies).

As always, you can listen to (or subscribe to) our podcasts on the Scholars’ Lab blog, or on iTunesU.