DTAL: Computational Linguistics Research Cluster

Cluster themes

The Computational Linguistics Research Cluster focuses on the following themes:




Theme 1: Foundational techniques in computational linguistics

Developing computational models of human language requires work on foundational techniques in computational linguistics.

Members of the cluster develop techniques for processing language at different levels of linguistic description (e.g. morphology, syntax, semantics and discourse) and work on a range of computational linguistic tasks, including:

  • automatic lexical acquisition
  • tagging and parsing
  • computational lexical semantics
  • discourse, dialogue and textual information structure
  • automatic metaphor identification
  • text classification
  • information extraction
  • text mining



Theme 2: Computational linguistics for linguists and cognitive scientists

Computational linguistics can support theoretical research in linguistics and cognitive sciences. For example, it can:

  • provide access to information in large corpora and the means to analyse them automatically
  • provide the means to test the empirical adequacy of linguistic theories
  • provide computational models which help to understand how humans learn, comprehend and produce language

Members of the cluster collaborate with theoretical and applied linguists in the department and with cognitive scientists working in other language-related departments in Cambridge (e.g. Experimental Psychology and the Cambridge Brain Science Unit).




Theme 3: Corpus linguistics

Members of the cluster develop and maintain different types of corpora (e.g. historical, learner and biomedical corpora) and develop techniques and tools for analysing the corpora automatically. The resulting corpora are used for computational linguistic research as well as research in linguistics and cognitive sciences.




Theme 4: Computational linguistic applications

Computational linguistic techniques can be used to develop systems that process natural language for practical applications. Members of the cluster work on a number of applications, including:

  • text mining
  • machine translation
  • dialogue processing
  • summarisation

Many such applications are important in their own right, as they aim to improve communication and information access. They also provide a useful framework for evaluating different computational linguistic techniques.