Virtual Worlds for Language Teaching and Learning

Introduction

The COVID-19 pandemic has forever changed the way that we work and learn. The “new normal” in the 21st century is for students to engage with their teachers and peers in both physical and virtual learning environments. In the summer of 2022, Paul Raine and Raquel Ribeiro explored six different virtual worlds and evaluated their viability for language teaching and learning. Only virtual worlds that run in a web browser and have a free trial were selected for this project. In this article, we present the results of our investigation, in the hope that this information will be of interest to other language teachers looking to teach all or some of their classes online. A series of YouTube videos to accompany this article can be found here.

Overview

The following virtual worlds are reviewed in this article:

  1. Spatial.io (video)
  2. Gather Town (video)
  3. Wonder* (video)
  4. Orbital** (video)
  5. Mozilla Hubs (video)
  6. Kumospace (video)

*In April 2023, Wonder announced it would be shutting down. We list the platform here for informational purposes only.
**In December 2022, Orbital announced it would be shutting down. We list the platform here for informational purposes only.

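| Feature                 | Spatial        | Gather     | Wonder          | Orbital        | Mozilla Hubs   | Kumospace   |
|-------------------------|----------------|------------|-----------------|----------------|----------------|-------------|
| Perspective             | 3D             | Top-down   | Top-down        | Top-down       | 3D             | Top-down    |
| Audio & Video Chat      | Yes            | Yes        | Yes             | Yes            | Yes            | Yes         |
| Text Chat               | Yes            | Yes        | Yes             | Yes            | Yes            | Yes         |
| Screen Sharing          | Yes – Embedded | Yes        | Yes             | Yes            | Yes – Embedded | Yes         |
| Customizable Avatar     | Yes            | Yes        | No              | No             | Yes            | No          |
| Sticky Notes            | Yes            | No         | Yes             | Yes            | No             | No          |
| Object Picker           | Yes            | Yes        | No              | Yes            | Yes            | No          |
| Interactive Environment | Yes            | Yes        | No              | No             | Yes            | No          |
| Emoji Reactions         | No             | Yes        | Yes             | Yes            | Yes            | Yes         |
| Mini Games              | No             | Yes        | No              | No             | No             | Yes         |
| Web App                 | Yes            | Yes        | Yes             | Yes            | Yes            | Yes         |
| iOS App                 | Yes            | No         | No              | No             | No             | No          |
| Android App             | Yes            | No         | No              | No             | No             | No          |
| Forever-free Plan       | Yes            | Yes        | No – Free Trial | Yes            | Yes            | Yes         |
| Best Feature            | Dance moves    | Mini Games | Randomise Users | Private Island | Laser Pen      | Piano Music |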
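
For teachers who want to shortlist platforms by the features a particular lesson needs, the comparison above can also be expressed as data and filtered. The following is only a minimal sketch: the names PLATFORMS, FEATURES, and platforms_with are our own illustrative choices, and the values are simply transcribed from the comparison above as observed in summer 2022.

```python
# Feature matrix transcribed from the comparison table (summer 2022).
# The names PLATFORMS, FEATURES and platforms_with are illustrative only.
PLATFORMS = ["Spatial", "Gather", "Wonder", "Orbital", "Mozilla Hubs", "Kumospace"]

# Each list is ordered to match PLATFORMS; True means "Yes" in the table.
FEATURES = {
    "Customizable Avatar":     [True, True, False, False, True, False],
    "Sticky Notes":            [True, False, True, True, False, False],
    "Object Picker":           [True, True, False, True, True, False],
    "Interactive Environment": [True, True, False, False, True, False],
    "Emoji Reactions":         [False, True, True, True, True, True],
    "Mini Games":              [False, True, False, False, False, True],
    "Forever-free Plan":       [True, True, False, True, True, True],
}


def platforms_with(*required):
    """Return the platforms that offer every feature named in `required`."""
    return [
        name
        for i, name in enumerate(PLATFORMS)
        if all(FEATURES[feature][i] for feature in required)
    ]


if __name__ == "__main__":
    # Platforms suited to a prepositions lesson that relies on moving objects.
    print(platforms_with("Interactive Environment", "Object Picker"))
    # Platforms with sticky notes that are also free to use indefinitely.
    print(platforms_with("Sticky Notes", "Forever-free Plan"))
```

A lesson built around moving objects, for example, would call platforms_with("Interactive Environment", "Object Picker"). Any shortlist produced this way should still be checked against the shutdown notices above.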

Common Features

Perspective

The virtual worlds (VWs) we evaluated for this report came in two different perspectives: top-down and 3D. Four of the six worlds had top-down perspectives, and two offered a full 3D experience.

Figure 1: The top-down perspective of Kumospace

Figure 2: The 3D environment of Spatial

Audio and Video Chat

All the VWs we evaluated had the ability to chat via live video and audio with other members. In some VWs, the video stream was rendered as the user’s avatar, and in other VWs, the video was rendered above or to the side of the environment.

Figure 3: In Orbital, the user’s video stream is rendered as their avatar.

Figure 4: In Wonder, the user’s video stream is rendered above or to the side of the environment.

Text Chat

All of the VWs we investigated in this study offered the ability to send text-based messages to other members of the environment. We found that the text chat was a very useful supplement to audio-visual teaching methods, especially when teaching new words with unfamiliar pronunciations.

Screen Sharing [Embedded]

All of the VWs in this study offered the ability to share a screen with other users in the environment. In Mozilla Hubs and Spatial, the shared screen was “embedded” in the environment such that users could choose to either continue interacting with each other, or view the shared screen from a variety of perspectives.

Figure 5: Sharing a screen in Spatial

Custom Avatar

Some of the VWs we investigated allow the user to customise their avatar in various ways. The most advanced and personalised customisation was offered by Spatial, which provided a way to convert a digital photograph into a 3D head for a user’s avatar.

Figure 6: Spatial offers the ability to convert a photo into a 3D head for a user’s avatar

Sticky Notes

Three of the six VWs we investigated offered the ability to add “sticky notes” to the environment. These came in useful when teaching new words or phrases.

Figure 7: The “sticky note” function in Orbital

Interactive Environment

Some of the VWs we investigated offered the ability to interact with one’s environment, for example by picking up and moving objects around. This feature could be used for teaching prepositions, for instance by instructing learners to “put the goldfish on the wall”.

Figure 8: Interacting with a 3D goldfish in Mozilla Hubs

Mini Games

Two of the VWs we investigated featured “mini games” such as chess, which were completely contained within the virtual world. It is possible that these mini games could be used for spoken fluency practice by higher level language learners.

Figure 9: An invitation to play chess within the Kumospace virtual environment.

Object Picker

In addition to being able to interact with one’s environment, some VWs also offer an “object picker”, “designer”, or “build tool” that allows the user to add, remove, or change objects in the environment. This could be used either for introducing new vocabulary items or for making the environment a more conducive space for language learning.

Figure 10: The object picker within Gather allows the user to add a wide range of weird and wonderful items to their virtual environment.

Emoji Reactions

Most of the VWs we investigated allow the user to react with a variety of emojis. These could be used for expressing comprehension, interest, or confusion when learning a language in a virtual environment.

Figure 11: Reacting with a “heart” emoji in Gather

General Suitability for Language Learning

The authors found that, in general, it was possible to learn new foreign words and phrases inside the VWs investigated in this study. This was verified in a rudimentary way by learning words and phrases in Portuguese and Japanese. One author had a native level of Portuguese, and attempted to learn some basic Japanese. The other author had an intermediate level of Japanese, and attempted to learn some basic Portuguese. Both authors were complete beginners in the language being taught to them.

It was found that the fidelity of the audio stream was of paramount importance in the language teaching and learning process. Where the audio quality was poor (as in Spatial), it was sometimes not possible to distinguish between similar consonant sounds, such as “d” and “b”. For instance, when the Portuguese word for “chair” was first introduced, it was initially pronounced by the learner as “cabeira”, whereas the correct pronunciation is “cadeira”. The authors found that the chat function could be used to clarify the pronunciation of unfamiliar words when the audio was insufficiently clear.

Specific Methodologies

Because both authors were complete beginners in the languages they were learning (Portuguese and Japanese), simple “show and tell” and “listen and repeat” methodologies were the main ones adopted in this preliminary investigation. In addition, Total Physical Response (TPR) was briefly trialled, with one author being instructed by the other to “go closer to the tree” in Portuguese.

It is expected that, in reality, learners using VWs would not be complete beginners in the languages they are studying. Therefore, it seems reasonable that methodologies such as Communicative Language Teaching (CLT) or perhaps even Task-Based Language Teaching (TBLT) could be adopted, and that this would result in improvements to communicative and pragmatic competence in much the same way as it would in real, physical classrooms.

Issues and Limitations

There were several issues and limitations with the current study. Firstly, the authors were living on opposite sides of the world, with a 12-hour time difference. This sometimes made it difficult to find a suitable time to meet up. The distance also caused occasional network issues.

In addition, the authors encountered some audio fidelity problems in the Spatial virtual environment, which interfered with the ability to clearly understand the correct pronunciation of unfamiliar foreign words.

Finally, only the two authors in this study were able to participate in the environments investigated. In real-life situations, it is highly likely that there would be more participants in a language learning environment, including the teacher and perhaps 10 to 20 students. The effect that this number of users would have on the quality of the language learning experience is not known, and should be further investigated. Many of the VWs investigated in this study were specifically designed to handle a large number of concurrent users, and it would be interesting to see how the affordances of these virtual environments could be leveraged for larger classes.

Conclusion

Although Zoom has become the de facto application for online synchronous communication, it is not the only way to connect with people in remote locations in real time. The authors found many of the above virtual worlds to be just as reliable as Zoom and in most cases more visually engaging and stimulating. Language teachers might like to consider one or more of the above options in addition to or instead of Zoom for their online language classes.