A Reassessment of Chomsky’s View on the Use of Corpus Databases in Linguistic Research: Between Theoretical Challenges and Empirical Opportunities

Ahmad Syafiq Amir Abdullah Zawawi; Fazal Mohamed Mohamed Sultan

doi:10.1234/ic.v1i1.62245

Authors

Ahmad Syafiq Amir Abdullah Zawawi, Malay Linguistics Programme, Universiti Malaya
Fazal Mohamed Mohamed Sultan, Center for Research in Language and Linguistics, Universiti Kebangsaan Malaysia

Keywords:

corpus database, linguistics, artificial intelligence, syntax, digital data

Abstract

In the generative linguistics tradition, Noam Chomsky has consistently rejected the use of empirical corpus data for studying language structure, especially in syntax research. He holds that native speaker intuition matters more in language studies and argues that corpus data is unreliable because it is affected by variation and does not reflect true linguistic competence. However, the rapid growth of artificial intelligence and language technologies, the availability of large corpus databases, and the increasing need for broader empirical analysis have reopened this debate in current linguistic research. This paper re-examines Chomsky’s arguments against corpus use by applying a corpus-based method in syntax studies, an approach that can help clarify universal syntactic structures. The challenges of using corpora include their limited ability to show native speaker competence, the lack of negative data, their inability to reflect how the mind works, and the possibility of biased or limited data. Corpus-based research also offers new opportunities, such as access to billions of words from many sources and text types, the use of advanced technology to find morphosyntactic patterns, and the use of big data to test hypotheses and theories. In conclusion, combining corpus-based research with theory is very important today. Corpus data is not an enemy of theory; it is a valuable tool that supports and strengthens modern linguistic analysis.
