Embedding GeoCrossWalk

Project start date: 2008-10
Project end date: 2009-06
The Embedding GeoCrossWalk project sought to provide a deeper understanding of how references to place in structured texts can be researched and automatically extracted. The project aims were threefold. Firstly, it sought to deploy the Geoparser tool, developed previously by the Language Technology Group of Edinburgh University's School of Informatics, to georeference the Stormont Papers using Natural Language Processing (NLP). The project used the Geoparser in conjunction with GeoNames, an open-source global gazetteer, to identify, tag and (where appropriate) disambiguate all references to location. Secondly, the project refined and developed a better understanding of the Geoparser tool's application to content of this kind, and highlighted. Finally, it laid the foundations for an expanded geospatial browsing capability for the Stormont collections, which will be implemented alongside the existing interface.
Methods used | Category
Cataloguing and indexing | Data structuring and enhancement
Indexing | Data analysis
Geo-referencing and projection | Data structuring and enhancement
Parsing | Data analysis
Text recognition | Data capture
Interface design | Data publishing and dissemination
text mining | Data analysis
text | Content types
Funding sources: 
Joint Information Systems Committee (JISC)
Content types created: 
Still Image/Graphics, Text
Software tools used: 
GeoParser, GeoNames, Google Maps, MMAX2
Source material used:  
The collection used was the Hansards (proceedings) of the Lower House of the devolved Parliament of Northern Ireland, known as the "Stormont Papers".
Digital resource created:  
A geotagged version of the digital Stormont Papers. It is a complete record, comprising all eighty-four volumes, with placenames automatically identified and marked up in XML. Various search facilities are provided, based on the original indices.
Data Formats created: 
Extensible Markup Language (XML), KML, JPEG
Conversion of the geotagged resource using a bespoke XSLT stylesheet.
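The conversion step can be illustrated with a minimal sketch. The project itself used a bespoke XSLT stylesheet, which is not reproduced in this record; the Python below is only a stand-in showing the general shape of turning geotagged XML into KML. The element name `placename`, the attributes `lat` and `long`, and the sample coordinates are all hypothetical, since the actual markup produced by the Geoparser is not described here.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def geotagged_to_kml(source_xml: str) -> str:
    """Turn geotagged place names into KML Placemarks.

    Assumes (hypothetically) that place names are marked up as
    <placename lat="..." long="...">Name</placename>.
    """
    # Serialise KML elements with a default namespace rather than a prefix.
    ET.register_namespace("", KML_NS)
    root = ET.fromstring(source_xml)

    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")

    for place in root.iter("placename"):
        lat, lon = place.get("lat"), place.get("long")
        if lat is None or lon is None:
            # Skip place names that were tagged but not georeferenced.
            continue
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        name = ET.SubElement(placemark, f"{{{KML_NS}}}name")
        name.text = (place.text or "").strip()
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        # KML expects longitude first, then latitude.
        coords.text = f"{lon},{lat}"

    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Illustrative input only; not taken from the Stormont Papers.
    sample = (
        "<debate><p>Roads near "
        '<placename lat="54.5973" long="-5.9301">Belfast</placename> and '
        "<placename>Ulster</placename>.</p></debate>"
    )
    print(geotagged_to_kml(sample))
```

In this sketch, place names without coordinates are simply omitted from the KML output; a real conversion might handle them differently.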
Metadata standards employed: 
Keyhole Markup Language (KML)
Publications:  
Promotional materials were developed and distributed at major conferences including Digital Resources for the Humanities and Arts (September 2009), where a short presentation was given to the JISC projects meeting, and on the CeRch/CLARIN/DARIAH stand at the UK e-Science All Hands Meeting (December 2009).

A seminar paper on the project, and the geoparsing aspects therein, was solicited by HumLab, University of Umea, Sweden (October 2009).

A full paper, "Use of the Edinburgh Geoparser for Georeferencing Digitised Historical Collections", by C. Grover, Richard Tobin, Kate Byrne, Matthew Woollard, James Reid, Stuart Dunn and Julian Ball, was accepted for the e-Science All Hands Meeting 2009, and a full manuscript has been submitted for publication in the proceedings.

Institutions affiliated with this project: 

UK HE institutions involved:
King's College London
University of Edinburgh
Queen's University Belfast

Project staff and expertise: 

Principal staff members: Sheila Anderson, Stuart Dunn, Claire Grover, Paul Ell
Other staff:
External expertise:


Metadata on this arts-humanities.net record
Author(s) of record: Stuart Dunn
Title: Embedding GeoCrossWalk
Record created: 2010-02-22
Record updated: 2010-04-21 16:16
URL of record: http://www.arts-humanities.net/node/3362
Citation of record: Stuart Dunn: Embedding GeoCrossWalk.
<http://www.arts-humanities.net/node/3362>
created: 2010-02-22, last updated: 2010-04-21 16:16