Web-Based Lexical Resources
Research conducted in the last several decades suggests that vocabulary is one of the most important aspects of second language learning. The Internet offers numerous resources that are useful for teaching, learning, and researching vocabulary. Useful resources for teachers and learners include online dictionaries (e.g., OneLook.com), word lists (e.g., General Service List, new-General Service List, BNC/COCA lists, Essential Word List, Academic Word List), multi-word item lists (e.g., Phrasal Expressions List, Phrasal Verb Pedagogical List, Academic Formulas List, Academic Collocation List), vocabulary tests (e.g., Vocabulary Levels Test, Updated Vocabulary Levels Test, Vocabulary Size Test, Productive Vocabulary Levels Test, Word Associates Test, Word Part Levels Test, Guessing from Context Test), software to support language-focused learning (e.g., Memrise, Quizlet, Puzzlemaker, YouGlish.com, TED Corpus Search Engine), software to support meaning-focused learning (e.g., Extensive Reading Central, Extensive Reading Foundation, ELLLO, YouTube, Netflix, TED Talks, Language Reactor, italki.com), corpora and corpora-based tools (e.g., English-Corpora.org, SketchEngine, Compleat Lexical Tutor), text analysis programs (e.g., Vocabulary Profilers, New Word Level Checker, IDIOM Search, Multiword Unit Profiler), and guides for teaching and learning vocabulary. Researchers will benefit from online data collection platforms (e.g., Gorilla Experimental Builder, PsychoPy, jsPsych, Prolific), bibliographies on vocabulary (e.g., Vocabulary Acquisition Research Group Archive [VARGA]), online articles, and lexical databases (e.g., MRC Psycholinguistic Database, CELEX, WordNet, English Lexicon Project, MCWord: An Orthographic Wordform Database, ARC Nonword Database). Most resources can be used free of charge, and knowing how to find and use these resources will be valuable for language teachers, students, and researchers.
Keywords: vocabulary learning, online resources, dictionaries, vocabulary lists, vocabulary tests, corpora
The Internet offers numerous lexical resources that are useful for teaching, learning, and researching vocabulary. Useful resources for teachers and learners include online dictionaries, vocabulary lists, vocabulary tests, vocabulary learning software, corpora and corpora-based tools, text analysis programs, and guides for teaching and learning vocabulary. Researchers will benefit from online data collection platforms, bibliographies on vocabulary, online articles, and lexical databases. Most resources can be used free of charge, and knowing how to find and use these resources will be valuable for language teachers, students, and researchers.
[A]Resources for Teachers and Students
Numerous online dictionaries can be found on the Internet. Computer-based dictionaries, including web-based ones, offer several benefits that paper-based ones do not (Rizo-Rodríguez, 2008). First, the former allow users to carry out more complex searches than the latter such as backward match (e.g., search all words that end with –ment) and wild cards search (e.g., search all words that start with f- and end with –ment). Second, most web-based dictionaries can play audio recordings of foreign language words, which is a feature exclusive to computer-based dictionaries. Third, some web-based dictionaries, although freely accessible, offer almost the same content as paid paper-based dictionaries including definitions, parts of speech, example sentences, phrases, collocations, idioms, phonetic symbols, illustrations, and usage notes.
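The two search types mentioned above can be illustrated with a minimal Python sketch. It assumes a tiny, hypothetical word list (WORDS) and uses shell-style patterns in which * stands for any run of characters; it is not the search syntax of any particular dictionary.

```python
import fnmatch

# Hypothetical mini word list; a real dictionary holds many thousands of entries.
WORDS = ["agreement", "fragment", "filament", "firmament", "moment", "format", "figure"]

def wildcard_search(pattern, words):
    """Return the words that match a shell-style pattern (* = any run of characters)."""
    return [w for w in words if fnmatch.fnmatchcase(w, pattern)]

# Backward match: all words that end with -ment.
print(wildcard_search("*ment", WORDS))

# Wild card search: all words that start with f- and end with -ment.
print(wildcard_search("f*ment", WORDS))
```

A paper dictionary, which is ordered alphabetically by the first letters of each entry, cannot support either search without reading every entry.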
Freely accessible online dictionaries include Oxford Advanced Learner’s Dictionary (www.oxfordadvancedlearnersdictionary.com), Cambridge Advanced Learner’s Dictionary (http://dictionary.cambridge.org/), Collins Online Dictionary (https://www.collinsdictionary.com/), Longman Dictionary of Contemporary English (www.ldoceonline.com), Merriam-Webster Dictionary (www.merriam-webster.com), and Thesaurus.com (https://www.thesaurus.com/). OneLook.com (www.onelook.com) is a very useful website because it enables users to search more than 1,000 dictionaries at once, with just a single click of the mouse. The Phrase Finder (https://www.phrases.org.uk/) is useful for researching the meanings and origins of multi-word items, such as idioms, whereas Urban Dictionary (https://www.urbandictionary.com/) has a large collection of slang expressions, such as keyboard warrior, OK boomer, or Netflix and chill.
Word lists compiled based on frequency are very useful because the frequency of a word gives learners a good indication of how useful the word is. In English, for instance, the most frequent 3,000 word families, together with proper nouns and marginal words, account for around 95% of the running words in most texts (Nation, 2022), and acquiring these words is one of the most important goals in the initial stage of language learning. Word frequency lists help teachers and learners to determine which words are high-frequency and need more attention than others. For instance, learners can go over frequency lists to see if there are any high-frequency but unfamiliar words, which need to be learned deliberately through flashcards or dictionaries. In elementary classes, teachers can consult frequency lists to make sure that a considerable amount of time is spent on high-frequency words.
A classic English frequency list is Michael West’s (1953) General Service List (GSL), which consists of 1,986 word families. Because the GSL was compiled based on texts from the 1930s, it does not contain modern words such as television, computer, or online. Yet, research suggests that the list is still useful even today (Nation, 2004). The GSL can be found at Paul Nation’s website (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/vocabulary-lists). If more up-to-date English frequency lists are needed, the British National Corpus / Corpus of Contemporary American English (BNC/COCA) lists (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/vocabulary-lists), new-General Service List (https://academic.oup.com/applij/article/36/1/1/226623), and Essential Word List (https://www.edu.uwo.ca/faculty-profiles/docs/other/webb/essential-word-list.pdf) are useful. Paul Nation’s survival vocabulary lists (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/vocabulary-lists) are useful for those who are planning to travel to another country but do not have time to learn thousands of high-frequency words.
After learning high-frequency words, it is recommended that learners acquire academic vocabulary. The Academic Word List (AWL; Coxhead, 2000) is the most widely used list of English academic vocabulary. The list consists of 570 word families that are outside the GSL and occur frequently in a wide range of academic disciplines such as Arts, Commerce, Science, and Law. The AWL words are very useful as they account for 8.5-10% of academic texts (Coxhead, 2000). The AWL can be found at Averil Coxhead’s website (https://www.wgtn.ac.nz/lals/resources/academicwordlist). The Academic Vocabulary Lists, which were derived from a larger corpus than AWL and consist of over 20,000 lemmas, are available at https://www.academicvocabulary.info/x.asp.
Lists of technical vocabulary have also been developed for fields such as business (https://idus.us.es/handle/11441/34157), science (https://www.wgtn.ac.nz/lals/about/staff/publications/Sci_EAP_sub_lists_Coxhead_and_Hirsh.pdf), medicine (https://www.researchgate.net/publication/248530568_Establishment_of_a_Medical_Academic_Word_List), and nursing (https://www.ohsu.edu/sites/default/files/2019-04/Nursing%20AWL.pdf). Secondary School Vocabulary Lists consist of technical vocabulary from the following eight subject areas: Biology, Chemistry, Economics, English, Geography, History, Mathematics, and Physics (https://www.eapfoundation.com/vocab/other/svl/). Secondary Phrase Lists include two-word combinations, rather than single words, from the same eight areas.
Due to the increasing recognition that the knowledge of multi-word items (also referred to as formulaic sequences) plays an important role in the use, processing, and acquisition of second language (L2), various lists for multi-word items have also been developed. They include Phrasal Expressions List (https://academic.oup.com/applij/article/33/3/299/220807), Phrasal Verb Pedagogical List (PHaVE List; https://afa4be34-0fda-46d9-8e64-5adf13d4216b.filesusr.com/ugd/5f2482_ba18c227594d463aae9438e7b065f592.pdf), Academic Formulas List (https://academic.oup.com/applij/article/31/4/487/191083), and Academic Collocation List (https://www.eapfoundation.com/vocab/academic/acl/). A number of other lists, for both single-word and multi-word items, are available at EAPFoundation.com (https://www.eapfoundation.com/vocab/wordlists/overview/).
Vocabulary tests are very useful tools for determining what kinds of words students already know and should be focusing on next. The most widely used English vocabulary test is perhaps the Vocabulary Levels Test (VLT). VLT measures whether learners have acquired the most frequent 1,000, 2,000, 3,000, 5,000, and 10,000 word families as well as the AWL words. VLT can be found at Compleat Lexical Tutor (https://www.lextutor.ca/tests/vlt2/?mode=test). The Updated Vocabulary Levels Test is available from the website of Stuart Webb (https://www.edu.uwo.ca/faculty-profiles/stuart-webb.html).
Although the levels tests are very useful, they cover only limited word levels (words that are outside the most frequent 5,000 word families, for instance, are not tested by The Updated Vocabulary Levels Test) and are not designed to measure learners’ overall vocabulary size. If teachers want to estimate the number of words that their students know, the Vocabulary Size Test (VST) will be useful. VST measures learners’ vocabulary size from the first 1,000 to the twentieth 1,000 word families, and can be found at Paul Nation’s website (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/vocabulary-tests) and Compleat Lexical Tutor (https://www.lextutor.ca/tests/vst/index.php?mode=test).
In addition to VLT and VST, there are many other online vocabulary tests available. Paul Meara’s _lognostics offers a wide range of lexical tests including V_Quint, X_Lex, Y_Lex, and LEX30, in addition to the LLAMA Tests, a widely used aptitude test battery (http://www.lognostics.co.uk/tools/index.htm). Compleat Lexical Tutor (http://www.lextutor.ca/tests/) also offers web-based versions of various tests, including Productive Vocabulary Levels Test, Word Associates Test, and Eurocentres Vocabulary Size Test. The website of Stuart Webb also has Word Part Levels Test and Guessing from Context Test (https://www.edu.uwo.ca/faculty-profiles/stuart-webb.html). Although most of these tests use a multiple-choice format, VocabLevelTest.Org (https://vocableveltest.org/) is unique in that it helps users to create, administer, and grade meaning-recall tests, which require learners to produce, rather than to choose, meanings of L2 words.
[B]Vocabulary Learning Software
The Internet offers numerous kinds of vocabulary learning software. One example would be flashcard programs, where target items are presented outside meaning-focused activities, and learners associate the second language (L2) word form with its meaning, usually in the form of a first language (L1) translation, L2 synonym, or definition (Nakata, 2011). Examples of web-based flashcard programs include Memrise (https://www.memrise.com/), Quizlet (http://quizlet.com/), and Flashcards Builder (https://www.lextutor.ca/cgi-bin/flash/).
Vocabulary can also be practiced in online exercises. Compleat Lexical Tutor (www.lextutor.ca), for instance, allows users to create various exercises including dictation (https://www.lextutor.ca/spell/dict/), cloze (https://www.lextutor.ca/cloze/), and word recognition quizzes (http://www.lextutor.ca/id/). Puzzlemaker (http://puzzlemaker.discoveryeducation.com/) helps users to create quizzes such as crossword puzzles and letter tiles. General purpose authoring tools including Hot Potatoes (http://hotpot.uvic.ca/), Moodle (http://moodle.org/), and Kahoot! (https://kahoot.it/) can also be used to create vocabulary exercises.
Vocabulary can also be acquired incidentally from reading. Hypertext Builder (https://lextutor.ca/hyp/; Cobb, 2007) is useful for facilitating vocabulary learning through reading. The software allows users to get definitions and hear pronunciation of words used in a particular text with just one or two clicks of the mouse. The program also lets learners study concordance lines of words. Concordance lines are sentences, extracted from a corpus, that contain a given word. Viewing concordances may help learners study collocations, grammatical patterns, or idioms of a word (see Corpora and Corpora-based Tools below for more details). Hypertext Builder also enables learners to set aside words that they have encountered during reading and practice them in various quizzes.
Extensive reading and listening also provide opportunities for vocabulary learning through meaning-focused input. Websites such as Extensive Reading Central (https://www.er-central.com/), The Extensive Reading Foundation (https://erfoundation.org/wordpress/free-reading-material/), Tween Tribune (https://www.tweentribune.com), CommonLit (https://www.commonlit.org), Free Graded Readers (https://freegradedreaders.com/wordpress/), BBC Learning English (www.bbc.co.uk/learningenglish), ELLLO (http://www.elllo.org), Audible (https://www.audible.com/), and Apple Podcasts (https://www.apple.com/apple-podcasts/) provide reading and listening materials at varying levels that can be used for extensive reading and listening. YouTube (https://www.youtube.com/), Netflix (https://www.netflix.com/), and TED Talks (https://www.ted.com/) provide opportunities for vocabulary learning through extensive viewing. Language Reactor (https://www.languagereactor.com/), formerly known as Language Learning with Netflix, is a useful web-based program. It works with YouTube and Netflix and allows users to display subtitles in multiple languages (e.g., L1 and L2) simultaneously, look up words in subtitles in a pop-up dictionary, save words, change the playback speed, or automatically pause at the end of every subtitle, all of which support vocabulary learning through viewing.
Writing or speaking in L2 facilitates vocabulary development through meaning-focused output. Websites such as italki.com (https://www.italki.com/), Busuu (https://www.busuu.com/), Interpals (https://www.interpals.net/), HelloTalk (https://www.hellotalk.com/), and Tandem (https://www.tandem.net/) allow users to find language exchange partners or tutors, providing opportunities for vocabulary learning through meaning-focused speaking or writing.
[B]Corpora and Corpora-based Tools
A corpus (plural: corpora) is a collection of electronic texts that are assembled for a particular purpose in a systematic manner (O’Keeffe et al., 2007). Although corpora have been used mainly by researchers and lexicographers, they can be useful for teachers and students as well. For instance, teachers can create example sentences or cloze exercises based on sentences extracted from corpora. Learners can search for a particular word in corpora to study its collocations, idioms, grammatical patterns, or semantic prosody (whether the word has a positive or negative connotation).
The Internet offers numerous web-based corpora. English-Corpora.org (https://www.english-corpora.org/), for instance, provides a number of English corpora, such as Corpus of Contemporary American English (COCA), British National Corpus (BNC), iWeb, TV Corpus, Movie Corpus, and Coronavirus Corpus. SketchEngine (https://www.sketchengine.eu/) and Compleat Lexical Tutor (https://www.lextutor.ca/conc/) provide corpora in multiple languages such as English, French, German, and Spanish.
Corpora are particularly useful for investigating how a particular word is used in context. For instance, if users search for try in English-Corpora.org, the website will display all the sentences containing the word from a selected corpus. By studying concordance lines, learners can find out that try can be followed by either a to-infinitive or a gerund, although the two patterns have different meanings (for instance, tried to swim does not necessarily imply that the speaker swam, while tried swimming indicates that the speaker did swim).
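The idea of a concordance display can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: SAMPLE is an invented three-sentence text rather than a real corpus, and the concordance function is a hypothetical helper, not the search engine behind English-Corpora.org.

```python
import re

# Invented sample text for illustration; a real concordancer searches a full corpus.
SAMPLE = ("She tried to swim across the lake but turned back. "
          "He tried swimming every morning and liked it. "
          "They will try again tomorrow.")

def concordance(text, forms, width=25):
    """Return keyword-in-context lines for any of the given word forms."""
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, forms)) + r")\b",
                         re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keyword lines up in one column.
        lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines

for line in concordance(SAMPLE, ["try", "tries", "tried", "trying"]):
    print(line)
```

Aligning the keyword in a single column makes it easy to scan what follows it, which is how the contrast between tried to swim and tried swimming becomes visible.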
Corpora are also useful for extracting common collocations. Collocations refer to sets of words that often co-occur such as take a break, make a decision, or a bitter disappointment (Nesselhauf, 2003). Several web-based programs, such as Just The Word (http://www.just-the-word.com/), SKELL (Sketch Engine for Language Learning; https://skell.sketchengine.eu/), and English-Corpora.org (https://www.english-corpora.org/), can give a list of common collocations for a given word. These programs are useful because they may help learners find natural, appropriate expressions. Using these programs, for instance, students can discover that the verb cause often collocates with nouns such as problem, damage, death, pain, disease, harm, trouble, cancer, injury, loss, infection, accident, stress, or illness, and has a negative semantic prosody.
Google Ngram Viewer (https://books.google.com/ngrams), Netspeak (https://netspeak.org/), and Linggle (https://linggle.com/) can also be used to investigate common collocations. YouGlish.com (https://youglish.com/), TED Corpus Search Engine (https://yohasebe.com/tcse), and PlayPhrase.me (https://www.playphrase.me/) allow users to play videos where a certain word or phrase is used. As such, they are useful for learning not only usage but also pronunciation. YouGlish.com is particularly valuable because it works with a number of languages, including Arabic, Chinese, English, French, German, Japanese, Spanish, and Swedish.
Collocaid (http://www.collocaid.uk/) is useful for helping students write academic essays in English since it suggests common academic collocations derived from corpora. Typing research in Collocaid, for instance, suggests common collocations with the word such as the following:
Adjective + research: qualitative research, quantitative research, previous research, future research, further research, empirical research
Verb + research: conduct research, undertake research, carry out research, guide research, review research, support research
Research + verb: research shows, research suggests, research examines, research indicates, research reveals
Research + preposition: research into, research on, research in, research with, research at
Adverb + researched: thoroughly researched, extensively researched, carefully researched, well researched, widely researched
AWSuM (Academic Word Suggestion Machine; https://langtest.jp/awsum/) is also a valuable resource for English academic writing because it suggests a list of common academic multi-word expressions. If one types the results of this study in AWSuM, for instance, it will suggest the following expressions:
the results of this study suggest that
the results of this study have indicated that
the results of this study are consistent with
the results of this study differ from those
the results of this study are limited by
On the Internet, one can also find corpora for languages other than English. The websites of Martin Weisser (http://martinweisser.org/corpora_site/mega_and_national.html) and Stanford Linguistics (https://linguistics.stanford.edu/resourcescorpora/corpus-inventory) provide extensive lists of such corpora. The most comprehensive one is perhaps SketchEngine (https://www.sketchengine.eu/). It contains corpora for over 30 languages, such as French, German, Spanish, Chinese, Korean, and Japanese. The website of Université catholique de Louvain also has a comprehensive list of learner corpora (https://uclouvain.be/en/research-institutes/ilc/cecl/learner-corpora-around-the-world.html), which are collections of written and spoken language produced by L2 learners.
[B]Text Analysis Programs
A number of text analysis programs are also available on the Internet. One example is a vocabulary profiler, which compares a text against word lists specified by the user. Vocabulary profilers allow users to create their own frequency lists, identify words that are or are not shared by various texts, find words that are likely to be unknown to students of certain vocabulary knowledge, evaluate students’ productive vocabulary knowledge, or estimate the vocabulary load of materials (Webb & Nation, 2008).
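The core operation of a vocabulary profiler can be sketched as follows. This is a minimal illustration, not the algorithm of any particular tool: WORD_LISTS is a hypothetical, heavily abridged pair of frequency bands, and the sketch matches surface forms only, whereas real profilers typically group words into families or lemmas.

```python
import re
from collections import Counter

# Hypothetical, heavily abridged word lists; real profilers load full frequency bands.
WORD_LISTS = {
    "1K": {"the", "a", "of", "and", "is", "in", "to", "on", "was", "cat", "sat"},
    "2K": {"mat", "garden"},
}

def profile(text, word_lists):
    """Return the proportion of running words (tokens) that falls in each list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        # Assign the token to the first list that contains it, else to "off-list".
        band = next((name for name, words in word_lists.items() if token in words),
                    "off-list")
        counts[band] += 1
    return {band: n / len(tokens) for band, n in counts.items()}

result = profile("The cat sat on the mat in the garden and purred.", WORD_LISTS)
for band, share in result.items():
    print(f"{band}: {share:.1%}")
```

The off-list share indicates how many running words fall outside the chosen lists, which is the kind of figure used to estimate the vocabulary load of a text.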
Examples of vocabulary profilers include Compleat Lexical Tutor’s Vocabulary Profilers (http://www.lextutor.ca/vp/) and New Word Level Checker (https://nwlc.pythonanywhere.com/). The former allows users to analyze texts based on lists such as BNC-COCA lists, BNC lists, CEFR English lists, and frequency-based French lists, while the latter analyzes texts using New General Service List and SEWK-J (Scale of English Word Knowledge–Japanese) lists, among others. MultiLingProfiler (https://www.multilingprofiler.net/) allows users to profile texts in French, German, and Spanish.
There are also several web-based programs to analyze lexical richness of texts (i.e., density, diversity, and sophistication). Examples include Coh-Metrix (http://cohmetrix.com/), Web-based Lexical Complexity Analyzer (https://aihaiyang.com/software/lca/), and Program for Calculating S (http://www.kojima-vlab.org/lexical_richness/S.html). TAALES and other text analysis programs are also available for download at https://www.linguisticanalysistools.org/.
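Two of the three dimensions of lexical richness can be illustrated with very simple measures. The sketch below computes a type-token ratio as an index of diversity and the proportion of content words as an index of density. It rests on loud assumptions: FUNCTION_WORDS is a tiny hypothetical stop list standing in for proper part-of-speech tagging, and the tools listed above use more robust measures than these. Sophistication is omitted because it requires a frequency list.

```python
import re

# Tiny hypothetical list of function words; real tools rely on part-of-speech tagging.
FUNCTION_WORDS = {"the", "a", "an", "and", "or", "on", "in", "of", "to", "is", "was"}

def tokenize(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Diversity: number of different words divided by number of running words."""
    return len(set(tokens)) / len(tokens)

def lexical_density(tokens):
    """Density: proportion of running words that are content words."""
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

tokens = tokenize("The cat sat on the mat and the dog sat on the rug")
print(f"Type-token ratio: {type_token_ratio(tokens):.2f}")
print(f"Lexical density: {lexical_density(tokens):.2f}")
```

Because the raw type-token ratio falls as texts get longer, it should only be compared across texts of similar length.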
Another type of text analysis program is the online concordancer. Voyant Tools (https://voyant-tools.org/) and Versatext (https://versatext.versatile.pub/) allow users to create concordances or word clouds from their own texts.
Given the increasing interest in multi-word items, several computer programs to identify multi-word items (e.g., idioms, collocations, and phrasal verbs) in a text have also been developed. They include IDIOM Search (https://idiomsearch.lsti.ucl.ac.be/), Phrase Profiler (https://www.lextutor.ca/vp/collocs/), and Multiword Unit Profiler (https://multiwordunitsprofiler.pythonanywhere.com). IDIOM Search can analyze texts in English, Spanish, French, and Chinese, while the other two programs work only with English.
[B]Guides for Teaching and Learning Vocabulary
In recent years, numerous empirical studies have been conducted on vocabulary acquisition (e.g., Webb, 2020). These studies have contributed to our knowledge of how vocabulary should be taught and learned. However, teachers and learners tend to have misconceptions about what constitutes an effective vocabulary learning technique (Folse, 2004). On the Internet, there are many research-based guides for teaching and learning vocabulary that may help teachers and students who might have been misinformed about effective learning techniques.
For instance, works by Paul Nation (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/paul-nations-publications/publications) offer very valuable insight into how vocabulary should be taught and learned. Articles written by Norbert Schmitt, which are freely accessible at his website (https://www.norbertschmitt.co.uk/publications), also offer helpful suggestions for vocabulary teaching and learning. Other useful guides include an online article by Godwin-Jones (2010), which provides a list of web-based resources on memorization techniques such as the keyword method, peg method, and semantic mapping (https://www.lltjournal.org/item/536/).
[A]Resources for Researchers
The Internet offers a wealth of valuable resources for researchers too. Paul Nation’s website (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources), Tom Cobb’s Compleat Lexical Tutor (www.lextutor.ca/), and Paul Meara’s _lognostics (www.lognostics.co.uk/) are perhaps the most comprehensive ones available. Paul Nation’s website provides a bibliography on vocabulary, numerous articles on vocabulary, vocabulary lists, vocabulary tests, and vocabulary profilers. Compleat Lexical Tutor offers a wide range of vocabulary research tools such as online vocabulary tests (http://www.lextutor.ca/tests/), text analysis programs (e.g., Range http://www.lextutor.ca/range/), Vocabulary Profilers (http://www.lextutor.ca/vp/), Multiword Extractors (https://lextutor.ca/multiwords/), KeyWords Extractor (https://www.lextutor.ca/key/), web-based corpora (https://www.lextutor.ca/conc/), a reaction timer (https://www.lextutor.ca/cgi-bin/rt/), statistical test calculators (https://www.lextutor.ca/stats/), and text processing utilities (https://www.lextutor.ca/tools/). Paul Meara’s _lognostics provides a bibliography on vocabulary acquisition (https://www.lognostics.co.uk/varga/index.htm), a collection of articles on vocabulary (http://www.lognostics.co.uk/vlibrary/index.htm), computer-based vocabulary tests, and text analysis programs (http://www.lognostics.co.uk/tools/index.htm).
For lexicographers, the website of Reinhard Hartmann (http://euralex.pbworks.com/f/Reference+Portals+aug+2010.pdf) provides a wealth of valuable resources. For Corpus and Computational Linguistics, the website of Stanford Natural Language Processing Group (http://www-nlp.stanford.edu/links/statnlp.html) offers valuable information about a wide range of corpora and text analysis tools.
[B]Online Data Collection Platforms
Due to the COVID-19 pandemic, many researchers have opted to collect data online, instead of in the laboratory. Google Forms (https://www.google.com/forms/), Qualtrics (https://www.qualtrics.com/), and SurveyMonkey (https://www.surveymonkey.com/) have been used widely to administer questionnaires online. Gorilla Experimental Builder (https://gorilla.sc/), PsychoPy (https://www.psychopy.org/), jsPsych (https://www.jspsych.org/), and Lab.js (https://lab.js.org/) allow researchers to design and conduct online experimental tasks such as a lexical decision task or a semantic priming task. Prolific (https://www.prolific.co/) helps researchers to recruit participants by specifying a number of variables, such as demographic information (e.g., age, nationality, gender), languages (L1, L2), occupation, or education.
[B]Bibliographies and Articles on Vocabulary
There are several web-based bibliographies that are valuable for vocabulary researchers. Vocabulary Acquisition Research Group Archive (VARGA; https://www.lognostics.co.uk/varga/index.htm) and Paul Nation’s Vocabulary Bibliography Database (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/vocabulary-bibliography-database) are among the most comprehensive ones available. VARGA is particularly noteworthy as it is a searchable database and allows users to find vocabulary research according to keywords, publication years, or authors.
In addition to bibliographies, articles on vocabulary are also freely accessible on the web. Reading in a Foreign Language (https://nflrc.hawaii.edu/rfl/) and Language Learning & Technology (https://www.lltjournal.org/), both of which are open-access, international refereed journals, have a large number of freely downloadable articles related to vocabulary. Vocabulary Learning and Instruction (http://vli-journal.org/wp/) is an open-access journal devoted solely to the topic of vocabulary. Furthermore, numerous articles on vocabulary are available at the personal websites of researchers. Examples include the websites of Paul Nation (https://www.wgtn.ac.nz/lals/resources/paul-nations-resources/paul-nations-publications/publications), Norbert Schmitt (https://www.norbertschmitt.co.uk/publications), Paul Meara (http://www.lognostics.co.uk/vlibrary/index.htm), and Tom Cobb (http://www.lextutor.ca/cv/#Publications).
One can find many lexical databases on the Internet. For example, MRC Psycholinguistic Database (https://websites.psychology.uwa.edu.au/school/mrcdatabase/uwa_mrc.htm) and CELEX (https://catalog.ldc.upenn.edu/LDC96L14) have been used mainly by psycholinguists and are useful for controlling vocabulary related variables in experiments. For instance, suppose you are looking for high-frequency, monosyllabic, abstract English nouns or seven-letter, polysyllabic, concrete nouns that have an irregular plural form. By specifying variables such as the frequency, number of syllables, number of letters, concreteness ratings, familiarity ratings, or the parts of speech, you can obtain a list of words that satisfy your criteria.
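The kind of query described above amounts to filtering records on several variables at once. The following Python sketch illustrates the logic only. The Entry fields, the select function, and every value in DATABASE are invented for illustration; none of them is taken from the MRC Psycholinguistic Database or CELEX, whose actual field names and scales differ.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    word: str
    pos: str
    syllables: int
    frequency: float   # invented occurrences per million words
    concreteness: int  # invented rating; higher means more concrete

# Invented records for illustration only; not real database values.
DATABASE = [
    Entry("truth", "noun", 1, 120.0, 250),
    Entry("dog", "noun", 1, 90.0, 620),
    Entry("idea", "noun", 3, 200.0, 240),
    Entry("hope", "noun", 1, 150.0, 260),
    Entry("run", "verb", 1, 300.0, 400),
]

def select(entries, pos, max_syllables, min_frequency, max_concreteness):
    """Return the entries that satisfy all of the specified criteria."""
    return [
        e for e in entries
        if e.pos == pos
        and e.syllables <= max_syllables
        and e.frequency >= min_frequency
        and e.concreteness <= max_concreteness
    ]

# High-frequency, monosyllabic, abstract nouns (thresholds are arbitrary).
for entry in select(DATABASE, "noun", 1, 100.0, 300):
    print(entry.word)
```

Controlling stimuli in this way lets a researcher match experimental items on variables that could otherwise confound the results.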
The Edinburgh Associative Thesaurus and The University of South Florida Free Association Norms, both of which are accessible at http://rali.iro.umontreal.ca/word-associations/query/, are very helpful for researchers investigating word associations. These databases allow users to investigate which words are strongly associated with a given word. According to The Edinburgh Associative Thesaurus, for instance, L1 speakers of English tend to associate rain with umbrella, clouds, weather, shower, and Spain.
Other useful lexical databases include WordNet (http://wordnet.princeton.edu/), English Lexicon Project (https://elexicon.wustl.edu/index.html), MCWord: An Orthographic Wordform Database (http://www.neuro.mcw.edu/mcword/), MorphoLex (https://github.com/hugomailhot/MorphoLex-en), and The Irvine Phonotactic Online Dictionary (www.iphod.com). More resources can be found at the website of Behavior Research Methods (https://www.springer.com/journal/13428).
Lexical databases for languages other than English are also accessible online. They include Global WordNet Association (http://globalwordnet.org/resources/wordnets-in-the-world/), CELEX (English, German, and Dutch: https://catalog.ldc.upenn.edu/LDC96L14), Lexique (French: www.lexique.org), and Database of Japanese Vocabulary (https://tamaoka.org/download/index.html). Difficulty norms for Swahili-English word pairs (Nelson & Dunlosky, 1994) and Lithuanian-English word pairs (Grimaldi et al., 2010) help researchers create sets of L2-L1 word pairs that are matched for difficulty.
To control for participants’ prior knowledge, some researchers choose to use pseudowords in their research. Pseudowords are strings of letters that resemble real words but do not exist in a given language. For instance, bunction, defermication, and sepretennial are all pseudowords in English. The ARC Nonword Database (http://www.cogsci.mq.edu.au/research/resources/nwdb/nwdb.html) and Wuggy: A multilingual pseudoword generator (http://crr.ugent.be/programs-data/wuggy) may help researchers to choose or create pseudowords for their experiments. English Lexicon Project, MCWord: An Orthographic Wordform Database, and The Irvine Phonotactic Online Dictionary, which are mentioned above, also contain some pseudowords.
As the above discussion has shown, the Internet offers a wide range of valuable lexical resources for language teachers, learners, and researchers. Resources such as online dictionaries, vocabulary lists, vocabulary tests, and corpora may facilitate teaching and learning of vocabulary. Researchers will benefit from online data collection platforms, bibliographies on vocabulary, online articles, and lexical databases. With the advance in information technology and vocabulary research, more useful resources will continue to appear in the future.
wbeal0093.pub2; wbeal0223.pub2; wbeal0755.pub2; wbeal1098.pub2; wbeal1109.pub2; wbeal1270.pub2; wbeal1285.pub2; wbeal20245
Cobb, T. (2007). Computing the vocabulary demands of L2 reading. Language Learning & Technology, 11(3), 38–63. https://www.lltjournal.org/item/441/
Coxhead, A. (2000). A New Academic Word List. TESOL Quarterly, 34(2), 213–238. https://doi.org/10.2307/3587951
Folse, K. S. (2004). Vocabulary myths: Applying second language research to classroom teaching. University of Michigan Press.
Godwin-Jones, R. (2010). From memory palaces to spacing algorithms: Approaches to second-language vocabulary learning. Language Learning & Technology, 14(2), 4–11. https://www.lltjournal.org/item/536/
Grimaldi, P. J., Pyc, M. A., & Rawson, K. A. (2010). Normative multitrial recall performance, metacognitive judgments, and retrieval latencies for Lithuanian-English paired associates. Behavior Research Methods, 42(3), 634–642. https://doi.org/10.3758/BRM.42.3.634
Nakata, T. (2011). Computer-assisted second language vocabulary learning in a paired-associate paradigm: A critical investigation of flashcard software. Computer Assisted Language Learning, 24(1), 17–38. https://doi.org/10.1080/09588221.2010.520675
Nation, I. S. P. (2004). A study of the most frequent word families in the British National Corpus. In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second language: Selection, acquisition, and testing (pp. 3–13). John Benjamins.
Nation, I. S. P. (2022). Learning vocabulary in another language (3rd ed.). Cambridge University Press.
Nelson, T. O., & Dunlosky, J. (1994). Norms of paired-associate recall during multitrial learning of Swahili-English translation equivalents. Memory, 2(3), 325–335. https://doi.org/10.1080/09658219408258951
Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for teaching. Applied Linguistics, 24(2), 223–242. https://doi.org/10.1093/applin/24.2.223
O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge University Press.
Rizo-Rodríguez, A. (2008). Review of five English learners’ dictionaries on CD-ROM. Language Learning & Technology, 12(1), 23–42. https://www.lltjournal.org/item/10125-44129/
Webb, S. (2020). The Routledge handbook of vocabulary studies. Routledge.
Webb, S., & Nation, I. S. P. (2008). Evaluating the vocabulary load of written text. TESOLANZ Journal, 16, 1–10. https://www.tesolanz.org.nz/publications/tesolanz-journal/
West, M. (1953). A General Service List of English words. Longman.
Anthony, L. (2020). Resources for researching vocabulary. In S. Webb (Ed.), The Routledge handbook of vocabulary studies (pp. 561–590). Routledge.
Ballance, O. J. (2018). Technology to teach vocabulary. In J. I. Liontas (Ed.), The TESOL encyclopedia of English language teaching (pp. 1–6). Wiley-Blackwell. https://doi.org/10.1002/9781118784235.eelt0920
Ballance, O. J., & Cobb, T. (2020). Resources for learning single-word items. In S. Webb (Ed.), The Routledge handbook of vocabulary studies (pp. 320–335). Routledge.
Elgort, I. (2018). Teaching/developing vocabulary using ICTs and digital resources. In J. I. Liontas (Ed.), The TESOL encyclopedia of English language teaching (pp. 1–15). Wiley-Blackwell. https://doi.org/10.1002/9781118784235.eelt0735
Karpenko-Seccombe, T. (2020). Academic writing with corpora: A resource book for data-driven learning. Routledge.