-->

Kig

A web interface to the 300k-word CIG1 and 570k-word CIG2 corpora

CIG1 documentation

(Replicated from the Child Language Databases website.)

CIG1 is a database of the Welsh of children recorded at various ages between 18 months and 30 months. The database was created by a project funded by the Economic and Social Research Council and located in the Department of Education of the University of Wales Aberystwyth and the Department of Linguistics of the University of Wales Bangor.

Background

This electronic version is the direct result of a project, The Acquisition of Welsh Syntax, which was funded by the Economic and Social Research Council (R000236420) with a grant of £45,590.

The project was based in the Department of Education of the University of Wales Aberystwyth and in the Department of Linguistics of the University of Wales Bangor, and ran from 1st January 1996 until 31st December 1996.

It was jointly directed by Professor Robert D. Borsley and Dr Michelle Aldridge of the Department of Linguistics, UWB, and Mr Bob Morris Jones of the Department of Education, UWA.

Two Research Officers were responsible for recording and transcribing the data: Ms Susan Clack in Bangor, and Ms Gwenan Creunant in Aberystwyth.

Paper based on the corpus: Aldridge, M, RD Borsley, S Clack, G Creunant and BM Jones (1998). 'The acquisition of noun phrases in Welsh', in Language Acquisition: Knowledge Representation and Processing. Proceedings of GALA '97, University of Edinburgh.

The main aim of the project was to produce a database for the study of the acquisition of Welsh syntax. This involved:

Data

Children   Ages              Files   Hours   Filenames     Filesize
Southern:
Bethan     1;7.28 - 2;4.23   24      12      bethan01-24   280Kb
Melisa     1;6.17 - 2;3.23   26      13      melisa01-26   314Kb
Rhian      1;6.20 - 2;3.17   25      12.5    rhian0-25     324Kb
Northern:
Alaw       1;6.08 - 2;3.21   27      13.5    alaw01-27     584Kb
Dewi       1;9.21 - 2;6.09   29      14.5    dewi06-38     531Kb
Elin       1;5.08 - 1;9.20   11      5.5     elin01-11     153Kb
Rhys       1;8.31 - 2;5.21   26      13      rhys01-26     548Kb
Total                        168     84                    2.67Mb
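
The ages in the table follow the CHILDES years;months.days notation, so 1;7.28 means 1 year, 7 months and 28 days. The following is a minimal sketch, not part of the original documentation, of how those ages might be read and how the totals in the last row can be checked. The names ROWS, parse_age and age_in_months are hypothetical helpers introduced for illustration, and the conversion from Kb to Mb assumes 1024 Kb per Mb.

```python
# Rows copied from the table above:
# (child, first age, last age, files, hours, size in Kb).
# ROWS, parse_age and age_in_months are illustrative names, not part of the corpus.
ROWS = [
    ("Bethan", "1;7.28", "2;4.23", 24, 12.0, 280),
    ("Melisa", "1;6.17", "2;3.23", 26, 13.0, 314),
    ("Rhian", "1;6.20", "2;3.17", 25, 12.5, 324),
    ("Alaw", "1;6.08", "2;3.21", 27, 13.5, 584),
    ("Dewi", "1;9.21", "2;6.09", 29, 14.5, 531),
    ("Elin", "1;5.08", "1;9.20", 11, 5.5, 153),
    ("Rhys", "1;8.31", "2;5.21", 26, 13.0, 548),
]


def parse_age(age):
    """Split a 'years;months.days' string into (years, months, days)."""
    years, rest = age.split(";")
    months, days = rest.split(".")
    return int(years), int(months), int(days)


def age_in_months(age):
    """Return the age in whole months, ignoring the days component."""
    years, months, _days = parse_age(age)
    return 12 * years + months


# Totals corresponding to the last row of the table.
total_files = sum(row[3] for row in ROWS)
total_hours = sum(row[4] for row in ROWS)
total_kb = sum(row[5] for row in ROWS)
total_mb = round(total_kb / 1024, 2)  # assumes 1 Mb = 1024 Kb

# Youngest and oldest ages recorded, in whole months.
youngest = min(age_in_months(row[1]) for row in ROWS)
oldest = max(age_in_months(row[2]) for row in ROWS)
```

Under these assumptions the computed totals agree with the figures given in the table (168 files, 84 hours, 2.67Mb), and the ages span 17 to 30 whole months.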

The data were collected in the homes of the children: four in north Wales, and three in mid-Wales. The children were recorded every fortnight at the beginning of the project and then every week. The recordings were transcribed in CHILDES format.

Two locations were used in an attempt to capture dialectal differences in the study. One researcher was familiar with northern dialects, and the other with southern dialects.

The recordings of Elin were abandoned in favour of the more talkative Dewi. The latter had been the subject of a smaller-scale project in the Department of Linguistics at the University of Wales Bangor, and the audio-recordings were already available.

License

The database has been placed in the public domain for use in academic research. Every researcher is welcome to use the data. Please fully acknowledge the roles of the University of Wales Aberystwyth, the University of Wales Bangor, and the Economic and Social Research Council in the creation of the database.