ILSP PsychoLinguistic Resource

A repository of quantitative word-level information and software processing tools for Greek phonology and orthography.

Welcome to the ILSP PsychoLinguistic Resource (IPLR), an online set of tools and data freely available to researchers working with Greek materials or needing information about Greek words or pseudowords. This web site is the first substantive output of a long-term work in progress at the Institute for Language & Speech Processing (ILSP). We welcome your comments, requests, and contributions to iplr[at]

IPLR is based on lists of word forms, derived from printed text corpora, lacking any phrasal or other context. There is no morphological information or lemma grouping. For corpus analyses on entire sentences, including concordances, lemma-based searches and part-of-speech tagging, see the ILSP Hellenic National Corpus.

2011 Update: Added a fifth online tool (COMBO), and calculation of uniqueness/recognition points and Levenshtein distance neighborhoods.
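
As a rough illustration of the two measures named in the update, the sketch below computes a Levenshtein distance neighborhood and a uniqueness point over a plain list of word forms. It is not IPLR's own code: the function names, the toy word list, the unit edit costs, and the definition of the uniqueness point (the first position at which no other listed form shares the prefix) are all assumptions made for the example, and IPLR may define and compute these measures differently, for instance over phones rather than letters.

```python
from typing import Iterable, List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def levenshtein_neighbors(target: str, lexicon: Iterable[str],
                          max_distance: int = 1) -> List[str]:
    """Other forms in the lexicon within max_distance edits of target."""
    return sorted(w for w in lexicon
                  if w != target and levenshtein(target, w) <= max_distance)


def uniqueness_point(target: str, lexicon: Iterable[str]) -> Optional[int]:
    """1-based position at which no other form shares target's prefix.

    Returns None when target is itself a prefix of another form.
    """
    others = [w for w in lexicon if w != target]
    for k in range(1, len(target) + 1):
        prefix = target[:k]
        if not any(w.startswith(prefix) for w in others):
            return k
    return None


# Hypothetical toy word list, for illustration only (not IPLR data).
toy_lexicon = ["καλός", "κακός", "καλά", "καλώς", "καλάμι", "νερό"]

print(levenshtein_neighbors("καλός", toy_lexicon))
print(uniqueness_point("καλός", toy_lexicon))
print(uniqueness_point("καλά", toy_lexicon))
```

Comparing every pair of forms in this way is quadratic in the size of the list, so a real corpus-derived lexicon would need indexing or pruning (for example, skipping candidates whose length differs from the target by more than `max_distance`).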

Please cite this article when reporting work done using IPLR:
Protopapas, A., Tzakosta, M., Chalamandaris, A., & Tsiakoulis, P. (in press). IPLR: An online resource for Greek word-level and sublexical information. Language Resources & Evaluation. doi:10.1007/s10579-010-9130-z


Lexical and sublexical measures include length, frequency, neighborhoods, cohorts, orthographic and phonological syllables, letter and phone bigrams, and individual letters, graphemes, and phones.
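
To make a few of these measures concrete, here is a minimal sketch of letter bigram counts, a cohort, and a substitution neighborhood computed from a word-frequency mapping. The function names, the toy frequencies, and the specific definitions (a cohort as the forms sharing an initial letter sequence of a given length, a neighbor as a form of equal length differing in exactly one letter) are assumptions chosen for illustration; they are not taken from IPLR and may not match its definitions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def bigram_counts(freqs: Dict[str, int]) -> Tuple[Counter, Counter]:
    """Count letter bigrams by type (once per occurrence in a word form)
    and by token (weighted by the word form's frequency)."""
    type_counts: Counter = Counter()
    token_counts: Counter = Counter()
    for word, freq in freqs.items():
        for i in range(len(word) - 1):
            bigram = word[i:i + 2]
            type_counts[bigram] += 1
            token_counts[bigram] += freq
    return type_counts, token_counts


def cohort(target: str, lexicon: Iterable[str],
           prefix_length: int = 2) -> List[str]:
    """Other forms that share target's first prefix_length letters."""
    prefix = target[:prefix_length]
    return sorted(w for w in lexicon if w != target and w.startswith(prefix))


def substitution_neighbors(target: str, lexicon: Iterable[str]) -> List[str]:
    """Forms of the same length that differ from target in exactly one letter."""
    return sorted(w for w in lexicon
                  if len(w) == len(target)
                  and sum(a != b for a, b in zip(w, target)) == 1)


# Made-up frequencies for illustration only (not IPLR data).
toy_freqs = {"καλός": 10, "κακός": 5, "καλά": 20}

type_counts, token_counts = bigram_counts(toy_freqs)
print(type_counts["κα"], token_counts["κα"])
print(cohort("καλός", toy_freqs, prefix_length=3))
print(substitution_neighbors("καλός", toy_freqs))
```

The same functions would apply to phone bigrams or phonological cohorts if each word form were first replaced by a sequence of phone symbols instead of a string of letters.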


Python code to process word lists and sublexical units; includes syllabification, alignment, grapheme-to-phoneme transcription, pattern matching, cohorts and neighborhoods.


Sources and characteristics of the text corpora, explanation of processing philosophy and algorithms, description of methodological limitations, and links to published results.