Title: Defining a Matrix Language in Language Mixing
Abstract: Researchers of bilingual code-switching often assume that one of the participating languages
serves as the ‘base’ or ‘matrix’ into which elements of the other language are embedded.
However, the means by which the matrix language of a clause or extended discourse is
determined remains much debated: It has been variously associated with the numerical
frequency of lemmas, with the predominant closed-class or functional morphemes, or with the
first language in a left-to-right parsing, oftentimes with contradictory results. The matrix
language of “Being bilingue is mas sexy” would, by lemma frequency, be either Spanish or
English, depending on the language annotation of sexy; but it would be unambiguously English
as established by the gerund and copula or by the first language in the surface string.
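The three heuristics named above can be compared on the example sentence with a minimal sketch. Everything in it is an assumption made for illustration rather than the BATs group's metrics: the `Token` type, the function names, the hand-assigned language tags, and the `functional` flags (set here only on the gerund and the copula, mirroring the reading given above).

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    form: str         # surface form
    lang: str         # hand-assigned language tag (assumption)
    functional: bool  # hypothetical flag: carries closed-class/functional material


def by_frequency(tokens: List[Token]) -> Optional[str]:
    """Language with the most tokens; None if there are no tokens or a tie."""
    ranked = Counter(t.lang for t in tokens).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def by_functional(tokens: List[Token]) -> Optional[str]:
    """Language contributing most of the tokens flagged as functional."""
    return by_frequency([t for t in tokens if t.functional])


def by_first_token(tokens: List[Token]) -> Optional[str]:
    """Language of the first token in a left-to-right pass."""
    return tokens[0].lang if tokens else None


def example(sexy_lang: str) -> List[Token]:
    """'Being bilingue is mas sexy', with 'sexy' tagged as the caller chooses."""
    return [
        Token("Being", "en", True),      # gerund
        Token("bilingue", "es", False),
        Token("is", "en", True),         # copula
        Token("mas", "es", False),       # treated as non-functional here (assumption)
        Token("sexy", sexy_lang, False),
    ]


if __name__ == "__main__":
    for tag in ("en", "es"):
        toks = example(tag)
        print(tag, by_frequency(toks), by_functional(toks), by_first_token(toks))
```

Under these annotations, the frequency heuristic follows the tag given to sexy, while the functional-morpheme and first-token heuristics both return English either way.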
Accurate identification of the matrix language for bilingual text or speech is important for
linguists because it is proposed to be predictive of the grammatical constraints that are
observed in code-switching. And, in natural language processing, detection of the matrix
language can inform the selection of tools as researchers seek to analyze mixed-language data,
which is ever increasing. This poster presentation demonstrates several metrics for easily
quantifying and visualizing the matrix language, at various levels of analysis, in ways that are
valid and replicable. The metrics were developed by the Bilingual Annotations Tasks (BATs)
research group, an interdisciplinary cohort directed by Professors Bullock and Toribio and MA
candidate Gualberto Guzman.
Publication Year: 2018
Publication Date: 2018-01-01
Language: en
Type: article