The Portal Dialogue Corpus
1 UC Berkeley 2 NYU 3 Saarland University * Equal Contribution
We collected a dataset of spoken human dialogues in Portal 2's cooperative mode, comprising 11.5 hours of gameplay from 36 participants. We release transcripts of these dialogues, along with gameplay videos and linguistic annotations, to support research on collaborative language use.
Background
Portal is a first-person puzzle platformer game in which players must place portals in order to navigate through a series of increasingly difficult puzzles. The sequel to this game, Portal 2, introduces a cooperative mode in which players must collaborate in order to solve puzzles. We collected recordings from 18 pairs of players (36 participants in total), with each pair playing for up to one hour.
The Corpus
We share gameplay data, videos, audio, transcripts, and dialogue act annotations at the links below:
| Resource | Access |
| --- | --- |
| Game data (i.e., demo files) | Hugging Face, Download |
| Video (without audio) | YouTube (stitched), Download (split), Download (stitched) |
| Audio | Fill out this form to request access |
| Transcripts | Data Explorer, Hugging Face, Download |
| Annotations | Data Explorer, Hugging Face, Download |
Game data is stored in the Source demo file format, which can be used to recover the underlying game state. Video recordings are available in two formats: stitched (i.e., both perspectives in a single video) and split (i.e., one player's perspective per video). We cannot share audio data directly due to an IRB agreement, but you can request access through the form linked in the table above.
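As a rough illustration of working with the demo files, the sketch below reads the fixed-size header at the start of a Source demo file. The field layout is an assumption based on the commonly documented Source demo header (an 8-byte `HL2DEMO` magic string, two protocol integers, four 260-byte name fields, then playback time, ticks, frames, and sign-on length); it is not taken from the corpus documentation, and recovering full game state would require parsing the messages that follow the header. The names `parse_demo_header` and `DemoHeader` are hypothetical helpers.

```python
import struct
from typing import NamedTuple

# Assumed little-endian header layout (not confirmed by the corpus authors):
# magic[8], demo protocol, network protocol, four 260-byte C strings,
# playback time (float32), ticks, frames, sign-on data length.
HEADER_FORMAT = "<8sii260s260s260s260sfiii"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
DEMO_MAGIC = b"HL2DEMO\x00"


class DemoHeader(NamedTuple):
    demo_protocol: int
    network_protocol: int
    server_name: str
    client_name: str
    map_name: str
    game_directory: str
    playback_time: float
    ticks: int
    frames: int
    signon_length: int


def _cstr(raw: bytes) -> str:
    """Decode a fixed-width, NUL-padded field into a Python string."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


def parse_demo_header(data: bytes) -> DemoHeader:
    """Parse the demo header from the first HEADER_SIZE bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("not enough bytes for a demo header")
    (magic, demo_protocol, network_protocol, server, client,
     map_name, game_dir, playback_time, ticks, frames,
     signon_length) = struct.unpack_from(HEADER_FORMAT, data)
    if magic != DEMO_MAGIC:
        raise ValueError("missing HL2DEMO magic; not a Source demo file")
    return DemoHeader(
        demo_protocol, network_protocol,
        _cstr(server), _cstr(client), _cstr(map_name), _cstr(game_dir),
        playback_time, ticks, frames, signon_length,
    )
```

To try this on a downloaded file, read the first `HEADER_SIZE` bytes in binary mode (for example, `open("example.dem", "rb").read(HEADER_SIZE)`, where `example.dem` is a placeholder path) and pass them to `parse_demo_header`. The map name and tick count are a quick sanity check that a file matches the session you expect.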
Example Clips
Select a clip below to view the corresponding video and transcript. These clips highlight key examples of collaborative dialogue from our paper. You can also explore the full dataset on the Data Explorer.