Two UCD post-docs, Dr. Terrence Szymanski (Insight) and Dr. Gerard Lynch (CeADAR), have won an intriguing challenge at the Ninth International Workshop on Semantic Evaluation (SemEval-2015), which will be held in Denver, Colorado (USA) later this year.

The challenge was to develop a system that could identify the time period, between 1700 and 2014, in which a text was first published.

The researchers trained their system on 3000 text samples, and then tested it on 1000 unseen text fragments.

Szymanski & Lynch’s system won this “Diachronic Text Evaluation” challenge against several international teams, correctly placing 46% of the texts within a 6-year range and 55% within a 20-year range.

Intuitively, one might expect this success to have been based on spotting historical differences in the vocabulary used; however, part of Szymanski & Lynch’s winning formula hinged on exploiting changes in syntax and punctuation in the texts over time.

Dr. Terrence Szymanski (terrence.szymanski@insight-centre.org)

Dr. Gerard Lynch (gerard.lynch@ucd.ie)
