Comments on: Talk. Adam Kilgarriff: How to get to know your corpus. (2012/11/07)
https://www.unibertsitatea.net/blogak/ixa/2012/11/04/hitzaldia-adam-kilgarriff-nola-ezagutu-zure-corpus-hori-20121107/
IXA group. Language processing

By: Talk. Adam Kilgarriff: How to get to know your corpus. (2012/11/07) | Language technologies | language technology and business internationalization | Scoop.it
https://www.unibertsitatea.net/blogak/ixa/2012/11/04/hitzaldia-adam-kilgarriff-nola-ezagutu-zure-corpus-hori-20121107/#comment-91
Sun, 04 Nov 2012 18:17:01 +0000

[…] Corpora are not easy to get a handle on. The usual way of getting to grips with text is to read it, but corpora are mostly too big to read (and not designed to be read). We show, with examples, how keyword lists (of one corpus vs. another) are a direct, practical and fascinating way to explore the characteristics of corpora, and of text types. Our method is to classify the top one hundred keywords of corpus1 vs. corpus2, and of corpus2 vs. corpus1. This promptly reveals a range of contrasts between all the pairs of corpora we apply it to. We also present improved maths for keywords, and quantitative comparisons between corpora. All the methods discussed (and almost all of the corpora) are available in the Sketch Engine, a leading corpus query tool. […]
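
The keyword-list comparison described in the abstract can be illustrated with a small sketch. The abstract does not state the scoring formula, so the add-N ratio of per-million frequencies used here is an assumption, as are the function names `keyword_scores` and `top_keywords` and the default `n=1.0`. It is not the Sketch Engine implementation.

```python
from collections import Counter


def keyword_scores(tokens1, tokens2, n=1.0):
    """Score each word of corpus1 against corpus2.

    Assumed score: (fpm1 + n) / (fpm2 + n), where fpm is the frequency
    per million tokens and n is a smoothing constant. The formula is an
    assumption; the abstract only mentions "improved maths for keywords".
    """
    if not tokens1 or not tokens2:
        raise ValueError("both corpora must contain at least one token")
    counts1, counts2 = Counter(tokens1), Counter(tokens2)
    total1, total2 = len(tokens1), len(tokens2)
    scores = {}
    for word, freq in counts1.items():
        fpm1 = freq * 1_000_000 / total1
        fpm2 = counts2[word] * 1_000_000 / total2  # 0 if word is absent
        scores[word] = (fpm1 + n) / (fpm2 + n)
    return scores


def top_keywords(tokens1, tokens2, k=100, n=1.0):
    """Return the k highest-scoring words of corpus1 vs. corpus2."""
    scores = keyword_scores(tokens1, tokens2, n)
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    corpus1 = "the cat sat on the mat".split()
    corpus2 = "the dog sat on the log".split()
    print(top_keywords(corpus1, corpus2, k=2))  # corpus1 vs. corpus2
    print(top_keywords(corpus2, corpus1, k=2))  # corpus2 vs. corpus1
```

Running the comparison in both directions, as the abstract suggests, gives two lists that can then be classified by hand. A larger `n` would favour more frequent words over rare ones.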
