How data drives video games
Here at Register Dynamics we do a lot of work in understanding data and data structures, designing and developing them for a variety of applications. Often we talk about data we have worked on or made use of, typically in our professional lives. Today we’re going to dive into an area of data that many people, including ourselves, are more likely to encounter in their personal lives: data in video games.
Modern video games are practically built on data, with data underpinning almost everything from gameplay mechanics to player experiences. Video games often use both structured and unstructured data to build worlds, levels, characters, items, and much more. Let’s take a look at some of the aspects of video games where data can play a key role.
Development and design
This is perhaps the most obvious use of data: building and developing the game itself. Assets and levels, comprising things such as 3D models, textures, sound effects and level layouts, are typically stored as data files in JSON, XML and other formats. Physics engines within games also rely on data describing everything from collisions to gravity. Game AI is often data-driven too, commonly consisting of decision trees (made up of interconnected ‘if x then y’ statements, with a bit of randomisation thrown in) used to simulate intelligent behaviour. Game balancing also uses data, and is often done by tweaking and refining the many numerical values within the game, in response to player actions or choices where possible.
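To make the decision tree idea concrete, here is a minimal sketch of an enemy choosing what to do. The action names, the thresholds and the `ENEMY_TUNING` table are all made up for illustration and are not taken from any particular game or engine. The point is that the rules read their numbers from a block of data, so balancing can be a matter of editing values rather than code.

```python
import random

# Hypothetical tuning values; in a real game these would likely live in a data file.
ENEMY_TUNING = {
    "flee_below_health": 25,   # run away when health drops below this
    "attack_range": 2,         # close enough to fight
    "chase_range": 10,         # close enough to notice the player
    "attack_chance": 0.8,      # otherwise the enemy blocks
}

def choose_action(health, distance_to_player, rng, tuning=ENEMY_TUNING):
    """Walk a small 'if x then y' decision tree and return an action name."""
    if health < tuning["flee_below_health"]:
        return "flee"
    if distance_to_player <= tuning["attack_range"]:
        # A bit of randomness so the enemy is not perfectly predictable.
        return "attack" if rng.random() < tuning["attack_chance"] else "block"
    if distance_to_player <= tuning["chase_range"]:
        return "chase"
    return "patrol"

rng = random.Random(42)  # seeded so a test run is reproducible
print(choose_action(health=80, distance_to_player=15, rng=rng))
print(choose_action(health=10, distance_to_player=1, rng=rng))
```

Passing a modified copy of the tuning table (for example, a lower `flee_below_health`) changes the enemy's behaviour without touching the function, which is roughly what balancing by tweaking numbers looks like in practice.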
Player progression and save systems
Another clear way in which data is used is recording the player's progress through the game. Save files in modern games will often contain a vast amount of information on the state of that progress, typically stored as binary or structured data. In RPGs, for example, save files need to record the player's progress and interactions in the world: what areas have been unlocked, which NPCs they have spoken to, the progress of quests, and their character's stats (health, experience points, skill upgrades, equipped items, etc.), along with their position in the world at the time of the save and the entire contents of their inventory.
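As a rough illustration of a structured save file, the sketch below writes a small RPG-style save to JSON and reads it back. The `SaveGame` class and its field names are hypothetical and chosen only to mirror the kinds of information listed above; real games use their own (often binary) formats.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class SaveGame:
    """Hypothetical save record for an RPG-style game."""
    character_name: str
    health: int
    experience: int
    position: list                                       # [x, y, z] in the world
    inventory: list = field(default_factory=list)        # item names
    quests: dict = field(default_factory=dict)           # quest id -> status
    npcs_spoken_to: list = field(default_factory=list)   # NPC ids

def save_to_json(save):
    """Serialise the save as structured, human-readable text."""
    return json.dumps(asdict(save), indent=2)

def load_from_json(text):
    """Rebuild the save record from the stored text."""
    return SaveGame(**json.loads(text))

save = SaveGame(
    character_name="Aria",
    health=42,
    experience=1300,
    position=[10.5, 0.0, -3.2],
    inventory=["rope", "healing potion"],
    quests={"find_the_key": "in_progress"},
    npcs_spoken_to=["innkeeper"],
)
restored = load_from_json(save_to_json(save))
print(restored == save)
```

Loading the text back should give a record equal to the original, which is the property that lets a player pick up where they left off.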
Save systems may be one of the ways, most visible to players, in which the ability of games to store data has improved over time. Early video games often had no ability to save game data, other than high scores. When the ability to save progress was introduced, it often came with limitations. Players were typically limited to a single save file per playthrough, whereas today players can create multiple save files per playthrough (and playthroughs where players are limited to a single save are now often presented as an increased difficulty option, e.g. Baldur’s Gate 3’s ‘Honour Mode’).
How much progress was saved was also limited. In the Legend of Zelda series (some of the earliest games to use such a save system), reloading a save would not restore you to exactly where you were (as is the case in most modern games); rather, you would respawn at the beginning of a dungeon, or at a starting point in the overworld. The game cartridges were also limited to a certain number of save files, normally three. Some games in the series had further limitations. In Majora’s Mask, the player has to save by returning to the start of the game’s three-day cycle, at which point many of the player's collected items would be lost, likely due to limits on how much data could be stored. International versions of the game had a quicksave function, allowing players to make a limited save of their progress without returning to the beginning of the cycle, but at the expense of the game having only two save files, to free up the necessary space. In updated rereleases of the game, it has become possible to remove these limitations.
Contrast the above with modern games, where players can typically save practically all the information about their game, and pick up right where they left off.
Online and multiplayer features
Here we see storage of data not about the game, or the player character's position within it, but about the players themselves. Games with online multiplayer systems will often store data about players' skill ratings and achievements to compare them with each other. This can be used for matchmaking, ensuring that players compete against others of a similar skill level, and also for leaderboards that rank players against one another. In many games this is done through the Elo system or a variant of it, where players are assigned a numerical value reflecting their skill level. After each match these scores are adjusted, with the winning player(s) having their score increased and the loser(s) having it decreased. The size of the adjustment depends on the gap between the initial scores: beating a much higher-rated opponent earns more points than beating a lower-rated one.
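For the curious, the standard Elo update can be sketched in a few lines. The formula is the usual published one; the K-factor of 32 and the function names are assumptions for illustration, and individual games typically tune or replace these details.

```python
def expected_score(rating_a, rating_b):
    """Probability-like score player A is expected to achieve against player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a, rating_b, score_a, k=32):
    """Return both players' new ratings after one match.

    score_a is 1.0 if A won, 0.0 if A lost and 0.5 for a draw.
    k (assumed to be 32 here) caps how far a single match can move a rating.
    """
    change = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + change, rating_b - change

# An underdog win moves the ratings further than a favourite's win.
print(update_elo(1400, 1600, score_a=1.0))
print(update_elo(1600, 1400, score_a=1.0))
```

Two evenly matched players each have an expected score of 0.5, so with K at 32 the winner gains 16 points and the loser drops 16.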
In more modern games, the data that is collected on players can also be used to inform anti-cheat systems, which analyse the data on player behaviour for discrepancies that could indicate cheating.
Analysis of player behaviour
Modern video games can collect a lot of data about how players play the game and what they spend their in-game time doing. This data can then be used for a variety of purposes.
Heatmap analytics provide a visual representation of where players spend their time in the game. This allows developers to identify popular or unpopular areas within the game, determine user flow (i.e. how they navigate through the game), or pain points where users have difficulty. Such information can be used to drive improvements to the game through updates, or to inform the design of future games.
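The data behind a heatmap can be as simple as a count of how often players were seen in each part of the map. The sketch below assumes the game logs player positions as `(x, y)` pairs and buckets them into square grid cells; the function name, the cell size and the sample positions are illustrative assumptions.

```python
from collections import Counter

def build_heatmap(positions, cell_size):
    """Count logged (x, y) positions per square grid cell of side cell_size."""
    counts = Counter()
    for x, y in positions:
        # Floor division maps every point inside a cell to the same cell index.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical position samples logged during play.
samples = [(1.0, 1.0), (2.5, 3.0), (12.0, 1.0), (3.0, 4.0)]
heatmap = build_heatmap(samples, cell_size=10)
print(heatmap.most_common(1))  # the busiest cell and its count
```

A visualisation layer would then colour each cell by its count, but the counts alone are enough to spot popular and neglected areas.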
Retention metrics are used to measure player engagement with the game and player longevity. This can provide valuable information about the player base, and identify where and why players drop off. Again, such information can then be used to make changes that improve the retention rate.
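One common retention measure is 'day N retention': the share of players who come back N days after they first played. Here is a minimal sketch, assuming we already know for each player which days (counted from their first session) they played on. The data layout and names are hypothetical.

```python
def day_n_retention(play_days_by_player, day):
    """Fraction of players who played on the given day after first playing.

    play_days_by_player maps a player id to the set of days on which they
    played, where day 0 is the day of their first session.
    """
    if not play_days_by_player:
        return 0.0
    retained = sum(1 for days in play_days_by_player.values() if day in days)
    return retained / len(play_days_by_player)

# Hypothetical activity log for four players.
activity = {
    "p1": {0, 1, 7},
    "p2": {0, 1},
    "p3": {0},
    "p4": {0, 1, 2},
}
print(day_n_retention(activity, 1))
print(day_n_retention(activity, 7))
```

Comparing the figure for different days shows where the player base thins out, which is the starting point for asking why.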
Analysis of players' actions is also used by game companies to improve monetisation strategies in games with downloadable content (DLC) or microtransactions. In these cases the data collected on player behaviour allows companies to maximise revenue generated from such in-game content.
On a more positive note, game companies can also release player statistics to engage with the community or to celebrate milestones. For example, the company behind Baldur’s Gate 3 (Larian Studios) released a smorgasbord of game statistics for the game's one-year anniversary (and again for the second) to celebrate the milestone, engage with the community, demonstrate the game's impact, and showcase the diverse (and often amusing) ways in which players engage with the game world.
Dynamic and procedural content
Procedural generation is a method for creating large amounts of data algorithmically rather than manually. This is typically done by taking a small amount of human-generated content and feeding it into algorithms that combine it with computer-generated randomness, producing a considerable amount of data from relatively little input. The advantages can include smaller storage requirements, more content, and increased randomness for less predictable gameplay. Procedural generation has long been associated with video games, but has also been used in other applications, such as music composition.
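A small sketch shows the basic shape of this: a handful of hand-written room types plus a seeded random number generator are enough to produce a dungeon of any length. The room list, size ranges and function name are invented for illustration and are far simpler than what a real game would use.

```python
import random

# The small, human-authored input that the algorithm builds on (hypothetical).
ROOM_TEMPLATES = ["corridor", "treasure room", "monster den", "shrine"]

def generate_dungeon(seed, room_count):
    """Generate a list of rooms from a seed; the same seed gives the same dungeon."""
    rng = random.Random(seed)  # private generator, so results depend only on the seed
    return [
        {
            "type": rng.choice(ROOM_TEMPLATES),
            "width": rng.randint(3, 8),
            "height": rng.randint(3, 8),
        }
        for _ in range(room_count)
    ]

dungeon = generate_dungeon(seed=2024, room_count=5)
for room in dungeon:
    print(room)
```

Because the output depends only on the seed, a game can store or share a single number instead of the whole level, which is where the storage saving comes from.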
Dynamic content generation is where AI systems in games can learn from players’ actions and adjust gameplay in real time. Algorithms can use the player data that they have collected to change levels, characters and storylines based on how players interact with the game, giving the ability to generate personalised game experiences.
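A very simple form of this is dynamic difficulty adjustment. The sketch below is a rule-based stand-in rather than a learning system: it nudges a difficulty multiplier up or down depending on how often the player has been winning recently. The target win rate, tolerance, step size and limits are all assumed values chosen for illustration.

```python
def adjust_difficulty(current, recent_wins, target_win_rate=0.5,
                      tolerance=0.15, step=0.1):
    """Return a new difficulty multiplier based on recent encounter results.

    recent_wins is a list of booleans, True where the player won an encounter.
    The result is kept between 0.5 and 2.0 (arbitrary illustrative limits).
    """
    if not recent_wins:
        return current  # no data yet, so leave the difficulty alone
    win_rate = sum(recent_wins) / len(recent_wins)
    if win_rate > target_win_rate + tolerance:
        current += step   # player is cruising: make things harder
    elif win_rate < target_win_rate - tolerance:
        current -= step   # player is struggling: ease off
    return min(max(current, 0.5), 2.0)

print(adjust_difficulty(1.0, [True, True, True, True, False]))
print(adjust_difficulty(1.0, [False, False, True, False, False]))
```

A real system would likely draw on many more signals than wins and losses, but the loop of collecting player data and feeding it back into the game is the same.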
Post-launch support & live systems
In the olden days, once a game was launched, that was it. The game would remain in that state, bugs and all, maybe with an expansion a year or two down the line to fix some things and add new features. Today game developers typically provide a constant flow of patches and updates for a few years (in some cases even a decade) after release. To inform such patches, developers collect data in the form of bug reports and player feedback (or through some of the player data analysis described above) to identify things that need fixing and ways in which the game can be improved, and incorporate these into future patches. In some cases the ‘final’ version of the game can look vastly different to the one released at launch.
Modern games also store a lot of data in the cloud (most commonly save files), which allows the automatic creation of backups of player data, so players can restore their progress if there is an issue on their end, or sync their data across multiple devices. Such cloud data can also be used to enable cross-play, allowing players to play the game across different platforms or, in multiplayer games, to play with others using different platforms (e.g. PC and Xbox).
Emerging trends
Machine learning is increasingly used in video games to create game AI that adapts to player behaviour. An earlier, rule-based example of this kind of adaptive AI is Left 4 Dead’s Director AI, which spawns enemies in varying positions and numbers based on the skills and status of current players, in order to create a unique experience.
Blockchain is also making inroads into game development, with game data stored in a decentralised structure. As such, the game assets would be owned by a community rather than a single entity, in theory making them harder to compromise or discontinue. Blockchain can also allow in-game ownership of digital assets, which can be bought or sold in an online market.
Summary
Modern video games are fundamentally built upon data, which underpins everything from core development to the player's experience. During development, data defines the game's very structure: assets like 3D models, levels, and sounds are stored as data files, while physics and AI are powered by data governing collisions, gravity, and decision trees. This data is also crucial for game balancing, as developers tweak numerical values to refine the gameplay.
Data is equally vital for managing the player's journey. Save systems record a player's precise progress, from character stats and inventory to quest completion and world state, allowing them to resume exactly where they left off. The evolution of these systems highlights the increased capacity for data storage, with modern games offering extensive, detailed saves compared to the limited save files of earlier titles. Beyond individual progress, data drives online multiplayer features, including skill-based matchmaking using systems like Elo, anti-cheat mechanisms, and leaderboards.
Finally, data analytics shape games both during and after their release. Developers analyse player behaviour through heatmaps and retention metrics to identify popular areas, pain points, and reasons for player drop-off, which informs updates and future game design. This data also helps refine monetisation strategies. Techniques like procedural generation use algorithms to create vast amounts of content from minimal data, while machine learning enables AI that adapts to player actions in real time, creating dynamic and personalised experiences.
If you want more insights into the myriad ways in which data is used in a variety of applications, check out our other blog posts.