- howdoyou.do allows students to practice their English writing and speaking skills by chatting with native speakers or with other learners of English
- Basecamp allows groups of people to collaborate on projects remotely…
- …as does Trello
- Tandem is a mobile app which helps students to find native speakers of English who want to do a language exchange
- P2 is a theme for WordPress that transforms a blog into a social forum, with features such as inline comments, and inline editing of posts and comments
- The Ginger Grammar Checker helps students write better English and efficiently corrects texts
- The Cambridge English Corpus is a multi-billion word collection of written and spoken English
- The NGSL-S is a list of high frequency words of everyday spoken English…
- …while the Michigan Corpus of Academic Spoken English focuses on academic spoken English
- Word Learner (iOS|Android|Web) is a science-based efficient word learning system which tracks students’ vocabulary learning
- Lingopolis is a fun, social and fast vocabulary game powered by Cambridge Dictionaries Online
- VoiceTube allows students to learn English by watching TED talks, movies, and music videos…
- …while Speech Yard also offers English learning through videos with interactive subtitles
- PirateBox is a DIY anonymous offline file-sharing and communications system built with free software and inexpensive off-the-shelf hardware
- RACHEL Offline provides free copies of open source websites such as TED and Wikipedia for download and use without an internet connection
- Plickers is a tool that lets teachers collect real-time assessment data without the need for student devices
- The Intel Compute Card is a credit card sized computing device to be launched in August 2017
- SpyFall is an online version of the popular language-based board game of the same name
- Showbie combines all of the essential tools for assignments, feedback and communication into a single app…
- …while Schoology allows teachers and students to connect, communicate, and share with their peers across campus and around the world
- ReadTheory is an online reading practice platform that supplies students with an extensive library of passages targeting individual levels
- English News Weekly is a weekly English news podcast produced by Hiroshima University
- The M3 is a tiny fully functioning computer
- Kapture is an audio-recording wristband that allows you to easily save and share audio recordings of your life
- Not Hotdog is an app which allows you to check whether or not something (or someone?!) is a hotdog
- Mersiv is a concept designed to revolutionize the way in which we learn languages
- Memoto is a tiny, automatic camera and app that gives you a searchable and shareable photographic memory
- Sketch Engine is a corpus tool to create and search text corpora in more than 80 languages…
- …while WordSmith Tools provides a variety of corpus analysis software…
- …as does Laurence Anthony
- Datawrapper is an open source tool which allows you to create charts and graphs…
- …as does Tableau
- Datasift provides access to data from social networks, blogs, news, and more
- AntCorGen is a freeware corpus creation tool
- FireAnt is a freeware social media and data analysis toolkit
- PhraseBot is an awesome puzzle game to actively learn any kinds of words, phrases or sentences
- Apps 4 EFL: Real Time allows you to test your students’ vocabulary knowledge in real time, and has the NGSL, NAWL, and other word lists built in
- Apple TV allows students to wirelessly connect their devices to the classroom projector
- iBooks Author allows anyone to create iBooks Textbooks for iPad and Mac
- Spaceteam ESL is a fun English learning game that students can play with their friends and classmates using phones or tablets
…which reminds me of this Mitchell and Webb sketch:
1. Do it without looking. Tell the students to look down at the line, then look up and say it.
2. Do it with the book closed (students can open it briefly to check if they forget the line).
3. Substitute words and phrases for the students’ own ideas, change names, places, or any other words.
4. Do it with emotion – happy, sad, angry, confused, etc. Get the students to try a variety of combinations.
5. Do it with an accent – American, British, robot, zombie – get the students to use their imaginations!
6. Do it with gesture only but no sound, over-emphasizing the gestures to convey the meaning of the text.
7. Tell the students to stand up and act it out. Get them to use props and costumes if available.
8. Have the students write another five or ten lines for the dialogue, and then repeat steps 1 to 7.
9. Repeat steps 1 to 7 with a different partner.
10. Have the students translate the dialogue into their first language(s), and then back to English again without looking at the original.
The list is available under a Creative Commons license, and can be viewed and downloaded here.
The list of real-sounding “fake” words used for the new Apps 4 EFL activity “Fight the Fakes” is now available for download.
The list was generated by looping through each of the words from the SIL list and splitting them into three-letter chunks. A Markov chain process was then used to determine which of the three-letter chunks were most likely to precede or follow each other. The three-letter chunks were then recombined according to these likelihoods in order to create realistic-sounding neologisms of various lengths.
The words were double-checked against the SIL list to ensure no real words were accidentally generated.
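The chunk-and-recombine process described above can be sketched in Python. This is a minimal illustration, not the code used to build the list: it assumes a first-order chain (each chunk depends only on the one before it), chunks cut at fixed three-letter boundaries, and a two-word toy list standing in for the SIL list. The function names `chunk`, `build_chain` and `generate` are hypothetical.

```python
import random
from collections import defaultdict

def chunk(word, size=3):
    """Split a word into consecutive chunks of `size` letters (the last may be shorter)."""
    return [word[i:i + size] for i in range(0, len(word), size)]

def build_chain(words, size=3):
    """Record which chunks start words and which chunks follow each chunk."""
    starts = []
    followers = defaultdict(list)
    for word in words:
        parts = chunk(word, size)
        if not parts:
            continue  # skip empty strings
        starts.append(parts[0])
        for current, nxt in zip(parts, parts[1:]):
            # Repeated entries make frequent transitions more likely to be picked.
            followers[current].append(nxt)
    return starts, followers

def generate(starts, followers, real_words, max_chunks=4, rng=random):
    """Walk the chain from a random start chunk; return None if the result is a real word."""
    current = rng.choice(starts)
    parts = [current]
    while len(parts) < max_chunks and followers.get(current):
        current = rng.choice(followers[current])
        parts.append(current)
    candidate = "".join(parts)
    return None if candidate in real_words else candidate

# Toy stand-in for the real word list (assumption for illustration only).
real_words = {"hispanic", "panelist"}
starts, followers = build_chain(sorted(real_words))
candidates = {generate(starts, followers, real_words) for _ in range(20)}
print(sorted(c for c in candidates if c))
```

With such a small word list the sketch can also produce strings that are real English words but are missing from the toy list, which is why the final check against a full word list matters.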
Fun ways to teach with the words
- Try the new Apps 4 EFL activity Fight the Fakes, which uses the words as distractors against low frequency items from the BNC
- Ask your students to try and invent “definitions” for the fake words based on what they sound like, e.g. “hispanelist (n.), chat show panelist from Latin America”, “mandibilious (adj.), used to describe an animal with extraordinarily strong jaws”, “rattlesnatcher (n.), a person who goes around stealing toys from small children”
- Use them in Yes/No vocabulary knowledge tests to ensure students don’t cheat by clicking “Yes, I know this word” for every item
Download the data:
- New General Service List: Google Sheet / More info
- New Academic Word List: Google Sheet / More info
- TOEIC Service List: Google Sheet / More info
- Business Service List: Google Sheet / More info
Each spreadsheet contains 23 columns:
1. Word: the word (lemma) as it appears on the original list
2. POS: the most common part-of-speech for the word according to the Moby Part-of-Speech database
3. BNC Rank: the frequency ranking of the word according to the British National Corpus (lower number equals higher frequency)
4. Google Rank: the frequency ranking of the word according to the Google Corpus (lower number equals higher frequency)
5. IPA: the International Phonetic Alphabet transcription of the word, using data derived from the CMU Pronouncing Dictionary
6. Conjugations: variations of the form of the word according to tense, person, etc.*
7. Synonyms: a list of words with similar or related meanings*
8–23. Multilingual definitions: Arabic, Chinese, German, Greek, English, French, Italian, Japanese, Korean, Dutch, Portuguese, Russian, Spanish, Swedish, Thai, and Turkish*
*Data provided by public domain dictionary/thesaurus sources, where available
The final list consists of 3,773 high-frequency TOEFL words, and can be downloaded here.
Step 1: Assemble a corpus of TOEFL materials
For my corpus, I used material from both the older CBT (Computer Based Test) and the current iBT (Internet Based Test). I found most of the materials online for free. Some were already in plain text format, but most were PDFs and required Optical Character Recognition (OCR) to convert to plain text. I used ABBYY’s FineReader Pro for Mac, but there are plenty of other options out there too. Some files were in Microsoft Word format (.doc/.docx), and Mac OS X’s batch conversion utility came in handy for these. I included model answers, listening transcripts, reading passages and multiple choice questions (prompts, distractors and answers). I tried to exclude explanations, advice and instructions from the authors and/or publishers.
Ultimately, I ended up with a corpus just shy of a million words (959,124 to be precise). In general, bigger is better when it comes to corpus research. The TOEIC Service List (TSL) utilizes a corpus of about 1.5 million words, so my TOEFL corpus seems roughly comparable to this.
Step 2: Count the number of occurrences of each word
I used some custom PHP code to process my corpus data (although Python is probably better suited to corpus analysis). I lemmatized each token where possible using Yasumasa Someya’s list of lemmas. I then cross-referenced each lemma occurrence with the NGSL, NAWL and TSL. Finally, I exported the results to a CSV file, and ended up with 13,287 rows of data.
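For readers who prefer Python, the counting step might look roughly like the sketch below. It is not the PHP code used for the list. It assumes the lemma list has already been loaded into a plain dictionary mapping each inflected form to its lemma (`lemma_of`), and that each reference list is a set of words. The function names `count_lemmas` and `write_rows` are hypothetical.

```python
import csv
import re
from collections import Counter

def count_lemmas(text, lemma_of):
    """Tokenize the text, map each token to its lemma where known, and count."""
    normalized = text.lower().replace("’", "'")
    # Letters only, with optional internal apostrophes (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)
    return Counter(lemma_of.get(token, token) for token in tokens)

def write_rows(counts, reference_lists, fileobj):
    """Write one CSV row per lemma: count plus a 1/0 flag per reference list."""
    writer = csv.writer(fileobj)
    writer.writerow(["lemma", "count", *reference_lists])
    for lemma, n in counts.most_common():
        flags = [int(lemma in words) for words in reference_lists.values()]
        writer.writerow([lemma, n, *flags])

# Toy data (assumptions for illustration only).
lemma_of = {"were": "be", "studies": "study", "studying": "study", "students": "student"}
reference_lists = {"NGSL": {"study", "student", "be"}, "NAWL": set(), "TSL": set()}
counts = count_lemmas("The students were studying. She studies; don't stop.", lemma_of)

with open("toefl_counts.csv", "w", newline="", encoding="utf-8") as f:
    write_rows(counts, reference_lists, f)
```

Tokens with no entry in the lemma dictionary are counted under their own surface form, which matches lemmatizing "where possible".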
Step 3: Curate the final list
For my final list I removed any words which also appear on the NGSL, any contractions (e.g. “don’t”, “I’m”, “that’s”), any numbers written in word form (e.g. “two”, “million”), any vocalizations (e.g. “uh”, “oh”), any ordinals (e.g. “first”, “second”, “third”), any proper nouns (e.g. “James”, “Elizabeth”, “America”, “San Francisco”, “New York”), and any words with fewer than 5 occurrences in the corpus. Next, I ran the list through a spell checker, and excluded any unrecognized words. I also excluded any non-lexical words, to leave a list consisting only of nouns, verbs, adjectives and adverbs.
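The filtering rules above can be expressed as a single pass over the counts. The following Python sketch is only an approximation of the curation process, under several assumptions: `excluded` is a hand-built set covering number words, ordinals, vocalizations and proper nouns; `dictionary` is a set of known words standing in for the spell checker; and `pos_of` maps each word to a part-of-speech label. All names here are hypothetical.

```python
CONTENT_POS = {"noun", "verb", "adjective", "adverb"}

def curate(counts, ngsl, excluded, dictionary, pos_of, min_count=5):
    """Return the words that survive every exclusion rule, most frequent first."""
    kept = []
    for word, n in counts.items():
        if n < min_count:
            continue  # fewer than 5 occurrences in the corpus
        if word in ngsl or word in excluded:
            continue  # already on the NGSL, or on a manual exclusion list
        if "'" in word or "’" in word:
            continue  # contractions
        if word not in dictionary:
            continue  # stand-in for the spell-check step
        if pos_of.get(word) not in CONTENT_POS:
            continue  # keep only nouns, verbs, adjectives and adverbs
        kept.append(word)
    return sorted(kept, key=lambda w: (-counts[w], w))

# Toy data (assumptions for illustration only).
counts = {"photosynthesis": 12, "don't": 40, "two": 30, "uh": 9, "first": 22,
          "james": 7, "the": 900, "sediment": 5, "glacier": 4, "teh": 6, "whereas": 8}
ngsl = {"the"}
excluded = {"two", "uh", "first", "james"}
dictionary = {"photosynthesis", "sediment", "glacier", "whereas", "the", "two", "first"}
pos_of = {"photosynthesis": "noun", "sediment": "noun", "glacier": "noun",
          "whereas": "conjunction"}

print(curate(counts, ngsl, excluded, dictionary, pos_of))
```

In practice, some of these steps (especially spotting proper nouns and misspellings) are likely to need manual review rather than a purely automatic filter.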