The final list consists of 3,773 high-frequency TOEFL words and can be downloaded here.
Step 1: Assemble a corpus of TOEFL materials
For my corpus, I used material from both the older CBT (Computer Based Test) and the current iBT (Internet Based Test). I found most of the materials online for free. Some were already in plain text format, but most were PDFs and required Optical Character Recognition (OCR) to convert to plain text. I used ABBYY’s FineReader Pro for Mac, but there are plenty of other options out there too. Some files were in Microsoft Word format (.doc/.docx), and MacOS X’s batch conversion utility came in handy for these. I included model answers, listening transcripts, reading passages and multiple choice questions (prompts, distractors and answers). I tried to exclude explanations, advice and instructions from the authors and/or publishers.
Ultimately, I ended up with a corpus just shy of a million words (959,124 to be precise). In general, bigger is better when it comes to corpus research. The TOEIC Service List (TSL) utilizes a corpus of about 1.5 million words, so my TOEFL corpus seems roughly comparable to this.
Step 2: Count the number of occurrences of each word
For my final list I removed any words which also appear on the NGSL, any contractions (e.g. “Don’t”, “I’m”, “that’s”), any numbers written in word form (e.g. “two”, “million”), any vocalizations (e.g. “uh”, “oh”), any ordinals (e.g. “first”, “second”, “third”), any proper nouns (“James”, “Elizabeth”, “America”, “San Francisco”, “New York”), and any words with fewer than 5 occurrences in the corpus. Next, I ran the list through a spell checker, and excluded any unrecognized words. I also excluded any non-lexical words, to leave a list consisting only of nouns, verbs, adjectives and adverbs.
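For anyone who wants to try something similar, the counting and filtering steps can be roughly sketched in Python. This is only an illustration and not the script I actually used: the tiny word sets (NGSL_WORDS, NUMBER_WORDS, VOCALIZATIONS) are hypothetical stand-ins for the real lists, and the proper noun, spell check and part-of-speech filters are left out because they need external resources.

```python
import re
from collections import Counter

# Hypothetical stand-ins for illustration; the real lists are far larger.
NGSL_WORDS = {"the", "a", "of", "and", "is", "student"}
NUMBER_WORDS = {"two", "million", "first", "second", "third"}  # numbers and ordinals
VOCALIZATIONS = {"uh", "oh", "um"}
MIN_OCCURRENCES = 5  # threshold mentioned in the text

# A word is a run of letters, optionally joined by apostrophes (so "don't" is one token).
TOKEN_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")


def count_words(text):
    """Count case-insensitive occurrences of each word in the text."""
    return Counter(token.lower() for token in TOKEN_RE.findall(text))


def filter_counts(counts, min_occurrences=MIN_OCCURRENCES):
    """Apply the simpler exclusion rules and return (word, count) pairs, most frequent first."""
    kept = {}
    for word, n in counts.items():
        if n < min_occurrences:
            continue  # too rare in the corpus
        if "'" in word or "’" in word:
            continue  # contraction
        if word in NGSL_WORDS or word in NUMBER_WORDS or word in VOCALIZATIONS:
            continue  # already on the NGSL, a number word, or a vocalization
        kept[word] = n
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(kept.items(), key=lambda item: (-item[1], item[0]))
```

Calling filter_counts(count_words(corpus_text)) on the full corpus text would give a first-pass candidate list. Note that lowercasing every token throws away the capitalization that might help spot proper nouns, so that filter would need to run on the original text.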
This might not be a problem for individual users, but it becomes a major issue when leading a group of students in lock-step through a structured learning process. The fact that the “user experience” is inconsistent means that there is no single set of instructions that all students will be able to follow. The fact that developing for every possible OS/handset combination is a challenge means that many apps only run on the latest OS versions of the most popular handsets.
So, although every student may possess a smartphone, not every smartphone will be able to run the cool CALL app you have in mind. Even if they can, you will either have to give individual support to every student in helping them set up the activity, or create multiple iterations of the instructions to cover every OS/device eventuality.
Unlike institutionally owned devices, which can be easily wiped after the user logs out or finishes the class, student owned devices contain a trove of personal data: photos, messages, appointments, contact information, and more.
Most students would probably feel uncomfortable sharing at least some of this information with their teachers. So when we walk around the room monitoring students to make sure they are on-task, or helping them set up the mobile-based CALL activities, we have to be careful not to inadvertently peek into the personal lives behind the tiny glowing screens in their hands.
Ever since Apple overhauled the iOS notification system, it seems that every app and its dog wants to send me updates, offers, news and status reports. While I endeavor to disable notifications for any app that doesn’t absolutely need them, my students tend to be less discerning. There’s nothing worse than setting up a class activity on mobile devices, only to have students navigate away from the app or site the moment a giant emoji-laden message drops down from the top of the screen. Even the students who diligently dismiss annoying messages from friends must find them a distraction from the learning process.
And I haven’t even begun to mention the students who will double click the home button and go back to Candy Crush the minute you’re not hovering over their shoulders and spying on their screens.
The modified version of Maslow’s hierarchy of needs now puts battery life right at the bottom of the pyramid, directly below “Wi-Fi”. Yes, this is a sarcastic dig at millennials’ seeming inability to pull themselves away from their devices and do something healthy like... climb a tree. However, in the CALL-based EFL classroom, it is a very pertinent observation.
This means that students, who are already heavy mobile users, may simply not have enough juice to utilize their devices during study time as well as break time. Where this is the case, you’d better hope that you have enough power outlets and charging cables to get them hooked back up to the mainline.
Capped data plans on mobile are generally the norm these days. There may be actual technological reasons behind this, but the cynical side of me suspects it’s just the carriers trying to milk heavy users for more money.
In any event, if you don’t have an easily accessible Wi-Fi network in your classroom (which isn’t restricted to just teachers) and you’re asking students to use their own data connections to engage with your chosen app or website, you have to be careful not to inadvertently incur additional charges for your students. Usually they will be quick to let you know when this is the case, but it can be yet another barrier to the successful exploitation of BYOD.
If you can overcome the difficulties presented by various models of various handsets running various versions of various operating systems, and all students have a fully juiced up device with plenty of bandwidth, and they are able to pull themselves away from Candy Crush, and ignore messages from their friends in other classes, then BYOD can be a good way to gain access to mobile technology in the classroom.
However, we must be careful not to appropriate students’ personal (and often private) devices as our own teaching tools, no matter how cool that new ELT app may be.
I was honored to receive a Best of JALT award for my presentation on Apps 4 EFL at the Nakasendo English Conference 2015. I’d like to thank the organizers of Saitama JALT for inviting me to give the plenary presentation, and for nominating me for this award. In particular I’d like to thank Matt Shannon, Tyson Rode, and Rob Rowland – you guys are awesome! Thanks!
I’ve just finished reading the excellent Language Learning with Technology by Graham Stanley. It’s packed full of useful ideas for how best to integrate technology with the ESL classroom, and I highly recommend it.
However, having eagerly loaded up almost all the links in the book, I encountered several problems, which are well-known to anyone who’s ever surfed the web. I should acknowledge here that these problems apply equally to all publications relating to web-based technology, including my own.
The trouble with books is that their text is unapologetically static. Once a book is published, once the ink is dry on the dead cellulose wood fibers, it can’t be changed. At least, not until a new edition is released.
Conversely, the web is unpredictably dynamic. URLs which exist on Monday may completely disappear by Friday, or take us somewhere we never intended or expected to go. Link rot is defined by Wikipedia as:
the process by which hyperlinks on individual websites or the Internet in general point to web pages, servers or other resources that have become permanently unavailable
Link rot is a major issue when writing anything about web-based technology. This problem exists not just in relation to dead tree publications, but also with web-based ones, although the latter can be more easily updated.
Link rot is the main reason why it’s inadvisable, to say the very least, to publish anything that relies mainly on the availability and predictability of web resources. It’s one of the reasons why search engines like Google eventually won out against web directories like Yahoo! The web is in constant flux, and trying to write down, describe, or analyze any website, excluding perhaps the web’s most permanent destinations, is an exercise in futility.
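For anyone maintaining a list of recommended sites, one partial safeguard is to re-check the links from time to time. Here is a minimal sketch in Python using only the standard library. The function names (fetch_status, find_rotten_links) are my own for illustration, and treating any status of 400 or above as "rotten" is a simplifying assumption.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error code such as 404 or 410.
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def find_rotten_links(urls, fetch=fetch_status):
    """Return (url, status) pairs for links that look dead.

    The fetch parameter can be swapped out, e.g. for a fake in tests.
    """
    rotten = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            rotten.append((url, status))
    return rotten
```

Running find_rotten_links on the URLs cited in a handout or book chapter would flag the ones that no longer respond. It will not catch everything: some servers reject HEAD requests, and a page that still loads but now shows completely different content will pass the check.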
No, I’m not talking about British people’s infuriating refusal to acknowledge compliments. I’m talking about what TechTarget defines as the following:
In IT, deprecation means that although something is available or allowed, it is not recommended or that, in the case where something must be used, to say it is deprecated means that its failings are recognized
Perhaps the most infamous example of a deprecated web-based technology in recent years has been Adobe Flash. Once the only way for webmasters to easily deploy games, videos, audio, and a whole host of other snazzy features on their sites, Flash is now regarded with disdain by surfers, web browsers, and tech giants alike.
Almost anyone who has ever used a smartphone or a tablet can tell you: Flash just doesn’t work on mobile. Unfortunately, the appeal of Flash-based apps hasn’t faded as quickly as Apple would have liked. This means that teachers who have recently furnished their classes with a set of iDevices have to be extra careful about which web-based resources they prescribe or recommend, and must be prepared for disappointment when they see the old familiar message: This page requires Macromedia Flash Player to run correctly.
Squatters and Hackers
If squatters are the opportunistic freeloaders who jump into your house and claim it for themselves the minute you vacate it, hackers are the guys who sneak in through the hidden back entrance, change the locks, and stick their name on your front door just to prove a point.
Both hackers and squatters are a major issue in relation to web-based resources. The domain of my name, paulraine.com, is a good example of domain squatting. Back in 2006, I owned it, but later let the registration slip. Now it is “parked”, and if I ever want to use it again, I will probably have to pay the current owner an exorbitant sum to do so. This happens a lot with domains that have at one point been registered. If the owner fails to renew, they get scooped up by internet squatters, and used to advertise tenuously related sites and services. This can happen at any time, and we must be careful that sites we recommend to colleagues or students haven’t been surreptitiously converted into money-making portals.
If your domain hasn’t been taken over by squatters, it may still have been invaded by hackers. This seems to be the unfortunate case with the official companion site to Language Learning with Technology (languagelearningtechnology.com), which, as of November 2016, seems to have been “pwned” by a certain “gunz_berry”:
Bear in mind this is a book that was published only three years ago, in 2013. Who knows how long the companion site has been displaying the “Hacked by gunz_berry” message? Unfortunately, many of the ideas and suggestions in the book refer to the companion site, so it’s a real shame that it’s been compromised. I hope that Cambridge University Press can get it back up and running again soon (I’ve already tweeted the author to let him know the site is down).
Ultimately, there’s not a lot ed-tech authors can do about many of these problems. To avoid link rot, it’s best to go with sites that have been around for at least a couple of years, but even then, they can disappear suddenly and without warning.
We should also be careful not to endorse deprecated technologies, but the pace of technological progress is so fast that even newer innovations are becoming deprecated very quickly.
As for squatters and hackers, we can only try to ensure that our security arrangements are up-to-date, and we remember to pay our domain renewal fees.
Perhaps the best solution to these technical challenges is a non-technical one, and Graham Stanley manages it quite well: focus on types of technology rather than specific instances; focus on ESL activities rather than ESL sites; and give alternatives and variations for every suggestion, to ensure that ideas can still be applied even if the technology itself fails in certain instances.