Tools
Online processing of your materials.
WARNING: Information and data provided here may contain errors. Please report problems to iplr[at]ilsp.gr.
NOTE: Certain limits and restrictions apply to the results returned by the Online tools. If you need complete lists, or processing of more items, send your materials and request to iplr[at]ilsp.gr along with a brief explanation. Noncommercial research requests will normally be fulfilled free of charge as soon as time permits.
See the Documentation page or the measures table for more information about the measures. Word-type and word-token statistics are also available online.
The NUM tool provides quantitative measures for each letter string you submit. You may submit up to 20 strings at a time. You select the measures you are interested in by ticking the appropriate boxes. The results are provided in tab-separated form that can be displayed in your browser or downloaded for processing by any program that can read text columns (e.g., MS Excel).
Available measures include: Length (letters, phones, syllables), frequency, uniqueness/recognition point, cumulative syllable and bigram frequency, number and frequency of orthographic and phonological neighbors, cohort, stress and Levenshtein distance neighbors, and indices of orthographic transparency.
If you plan to request properties of words, consider downloading the full processed corpus word list from the Downloads page before you use this tool and looking for your items there. This can save you and our server a lot of time.
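If you prefer to process downloaded NUM results in a script rather than a spreadsheet, the following Python sketch shows one way to read tab-separated text. It assumes the first line is a header row; the column names in the sample (item, letters, syllables) are hypothetical and will differ depending on the boxes you tick.

```python
import csv
import io

def read_measures(tsv_text):
    """Parse tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    return list(reader)

# Hypothetical sample; real column names depend on the measures you select.
sample = "item\tletters\tsyllables\nκαλός\t5\t2\nμέρα\t4\t2\n"
rows = read_measures(sample)

for row in rows:
    # All values are read as strings; convert them before doing arithmetic.
    print(row["item"], int(row["letters"]), int(row["syllables"]))
```

Values are kept as strings on purpose, so that empty or non-numeric cells do not cause errors at read time.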
The TXT tool provides a variety of text results for one letter string you submit. You select the kinds of results you are interested in by ticking the appropriate boxes. The results are provided as plain text in Greek encoding (CP1253/ISO-8859-7) and can be displayed in your browser or downloaded to your computer. Available output includes: Pronunciation, syllables, phoneme-grapheme and syllable-level alignment, maximal cohort competitors, phonological and orthographic neighbors, cohort, Levenshtein distance, and stress neighbors.
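Because the results are in a Greek single-byte encoding rather than Unicode, a downloaded file may need to be decoded before further processing. The Python sketch below tries CP1253 first and falls back to ISO-8859-7. The order of the attempts is an assumption, not something the tool guarantees; the two encodings agree on most Greek letters but differ in a few positions, so check the output if accented capitals or punctuation look wrong.

```python
def decode_greek(raw: bytes) -> str:
    """Decode bytes as CP1253, falling back to ISO-8859-7 if that fails."""
    try:
        return raw.decode("cp1253")
    except UnicodeDecodeError:
        return raw.decode("iso8859_7")

# Simulate a downloaded result by encoding a Greek word as CP1253 bytes.
sample_bytes = "καλός".encode("cp1253")
text = decode_greek(sample_bytes)
print(text)

# Re-encode as UTF-8 for programs that expect Unicode input.
utf8_bytes = text.encode("utf-8")
```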
The SEL tool retrieves subsets of the CLEAN corpus that match quantitative criteria you specify on the available measures. You fill in the range of acceptable values for as many measures as you wish. The results are provided as plain text in Greek encoding (CP1253/ISO-8859-7) and can be displayed in your browser or downloaded to your computer, along with a set of properties of your choice for the selected words.
Available measures include: Length (letters, phones, syllables), frequency, uniqueness/recognition point, cumulative syllable and bigram frequency, number of orthographic and phonological neighbors, cohort, stress and Levenshtein distance neighbors, and indices of orthographic transparency.
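The idea of selecting by ranges can be illustrated with a short Python sketch that applies the same kind of filter to a word list you have already downloaded. The measure names (letters, syllables) and the sample rows are hypothetical, and the inclusive treatment of range endpoints is an assumption; this is not the SEL tool's own implementation.

```python
def select_rows(rows, ranges):
    """Keep rows whose value for every listed measure lies within [low, high]."""
    selected = []
    for row in rows:
        if all(low <= row[name] <= high for name, (low, high) in ranges.items()):
            selected.append(row)
    return selected

# Hypothetical rows; real measures come from the corpus word list.
words = [
    {"item": "καλός", "letters": 5, "syllables": 2},
    {"item": "και", "letters": 3, "syllables": 1},
    {"item": "παράθυρο", "letters": 8, "syllables": 4},
]

# Words with 4 to 8 letters and 2 to 4 syllables.
subset = select_rows(words, {"letters": (4, 8), "syllables": (2, 4)})
print([row["item"] for row in subset])
```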
The FIND tool allows you to retrieve subsets of the CLEAN corpus that match a specific spelling or pronunciation pattern. You provide a single orthographic or phonological string to be matched, using the wildcards ? (question mark; for any single character) and * (asterisk; for any number of characters). The results are provided as plain text in Greek encoding (CP1253/ISO-8859-7) and can be displayed in your browser or downloaded to your computer, along with a set of properties of your choice for the selected words.
Available measures include: Length (letters, phones, syllables), frequency, uniqueness/recognition point, cumulative syllable and bigram frequency, number of orthographic and phonological neighbors, cohort, stress and Levenshtein distance neighbors, and indices of orthographic transparency.
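To make the wildcard semantics concrete, here is a minimal Python sketch that translates a pattern with ? and * into a regular expression and applies it to a local word list. It assumes that * may also match zero characters and that the whole word must match the pattern; the FIND tool itself may behave differently in these respects.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (any run of characters) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))  # match every other character literally
    return re.compile("".join(parts))

def find_matches(pattern, words):
    """Return the words that match the wildcard pattern in full."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

words = ["καλός", "και", "μέρα", "πέρα", "ημέρα"]
print(find_matches("κα*", words))   # words beginning with κα
print(find_matches("?έρα", words))  # four-letter words ending in έρα
```

Note that an accented letter counts as a single character here only if it is stored as one precomposed Unicode code point.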
The COMBO tool combines the FIND and SEL tools, allowing you to retrieve subsets of the CLEAN corpus that match both a specific spelling or pronunciation pattern and a set of quantitative criteria specified on the available measures. It can also match syllable structure patterns specified by strings such as CV.CVC (C=consonant, V=vowel, .=syllable boundary). See the above descriptions of the FIND and SEL tools for information on usage and results.
Please use this tool only when you really need to match both patterns and criteria, because it consumes inordinate computing resources and is also very slow.
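As an illustration of syllable structure patterns such as CV.CVC, the Python sketch below derives a consonant/vowel skeleton from a syllabified transcription. The transcription format is hypothetical: it assumes one character per phone, a period between syllables, and the five vowels a, e, i, o, u. The symbols actually used by the tools may differ.

```python
VOWELS = set("aeiou")  # assumed vowel symbols in the hypothetical transcription

def cv_skeleton(syllabified):
    """Map each phone to C or V, keeping '.' as the syllable boundary."""
    return ".".join(
        "".join("V" if phone in VOWELS else "C" for phone in syllable)
        for syllable in syllabified.split(".")
    )

print(cv_skeleton("ka.los"))  # CV.CVC
print(cv_skeleton("me.ra"))   # CV.CV
```

Comparing the skeleton of each word against a target string is then a simple equality test.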