Introduction
In an endeavor to future-proof this discussion of a highly complex and evolving field, I will mostly discuss the structure of virtual reality and augmented reality applications, with only limited mention of particular contemporary implementations. Any quick browser query will tell you which languages are currently most popular for VR programming, but describing how to structure a VR application at a conceptual level, without referring to the specific hardware used to create it, is a different beast entirely. To that end, the three major steps in structuring VR and AR programs are:
- Input Channels
- Input Parsing
- Output Channels
As you may have guessed, we are focusing on the part of these applications that functions completely differently from conventional computer applications: the means by which the application communicates with the user.
Input Channels
Inputs are the most important and most volatile part of VR programming, from both a hardware and a software perspective, as new technologies and old mesh together to give the user the greatest possible degree of control and freedom in how they interact with your 3D world. That volatility comes from the improvements and redesigns that regularly sweep through the VR space, vastly expanding what developers can do while also potentially invalidating their previous work. As an example, in one implementation of a VR application the user might use joysticks that report positional data relative to a headset, which the developer can then use, through varying levels of abstraction depending on their framework, to determine what the user intends to interact with and how they intend to do so. The same resulting experience could instead be implemented with eye tracking coupled with an outward-facing camera on the headset that infers the user's intent from their hand's position and shape. Both of these examples achieve the same result of letting the user interact with their virtual environment, but in terms of how the user's input reaches the developer, they are completely different. So the first step in developing a VR application is understanding how you will be receiving data from the user, and what level of abstraction you will be dealing with.
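To make this concrete, here is a minimal, framework-agnostic sketch in TypeScript of what such an abstraction layer might look like. Everything here (the PointerInput shape, fromController, fromHandTracking) is hypothetical and invented for illustration, not any particular framework's API; the point is that very different input hardware can be normalized into one common shape that the rest of your application consumes.

```typescript
// Hypothetical input abstraction: both input sources produce the same
// normalized "pointer" sample, so the rest of the application never needs
// to know which hardware generated it.

type Vector3 = { x: number; y: number; z: number };

// A normalized input sample, regardless of the underlying hardware.
interface PointerInput {
  origin: Vector3;        // where the pointing ray starts (controller or wrist)
  direction: Vector3;     // unit vector of where the user is pointing
  selectPressed: boolean; // trigger pulled, or a recognized "pinch" gesture
}

// Source 1: a tracked controller that reports a pose and a trigger value.
function fromController(
  pose: { position: Vector3; forward: Vector3 },
  triggerValue: number
): PointerInput {
  return {
    origin: pose.position,
    direction: pose.forward,
    selectPressed: triggerValue > 0.5,
  };
}

// Source 2: camera-based hand tracking, where a pinch gesture acts as "select".
function fromHandTracking(
  wrist: Vector3,
  indexTipDirection: Vector3,
  isPinching: boolean
): PointerInput {
  return {
    origin: wrist,
    direction: indexTipDirection,
    selectPressed: isPinching,
  };
}
```

Whichever input channel your hardware provides, the goal is the same: get it into a single, stable shape early so the rest of your code is insulated from hardware churn.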
Input Parsing
Once you have a grasp on the types of input you are receiving from the user, the next step is to analyze those inputs and extract the user's intent. In the previous example, we briefly touched on tracking hand and eye position with cameras and other sensors so that those positions can be mapped onto the virtual environment. In these VR environments, context is the basis for all input parsing. If the user is pointing a finger and that finger rests on a button element in your 3D environment, the input can be interpreted as a button press. If they make the same hand gesture in a fantasy video game environment, they might instead cast a spell based on their hand shape and the presence of an enemy in the direction they are pointing. At a lower level of abstraction, depending on what level of VR programming you are doing, you might need to understand the precise outputs of a sensor and how they translate to 3D space. Gyroscopes can relay information about movement and rotation, cameras can capture highly valuable and specific data about the user's physical build, position, and environment, and even more advanced sensors can read muscular signals to glean user intent. Thus, the second step in VR programming is learning how to translate the variety of user inputs into usable data and actual changes in the virtual environment.
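As an illustration of context-driven parsing, here is a hedged sketch: a single pointing input is resolved against whatever the ray hits in the scene, so the same gesture becomes a button press in a UI context or a spell cast in a game context. The SceneObject type, the Raycaster function, and castSpellAt are stand-ins I have invented for this example, not any particular engine's API.

```typescript
// Hypothetical intent resolution: the same pointing input means different
// things depending on what the ray actually hits in the scene.

type Vector3 = { x: number; y: number; z: number };

interface SceneObject {
  id: string;
  kind: "button" | "enemy" | "prop";
  onActivate(): void;
}

// Stand-in for a real raycast provided by your engine or framework.
type Raycaster = (origin: Vector3, direction: Vector3) => SceneObject | null;

function resolveIntent(
  input: { origin: Vector3; direction: Vector3; selectPressed: boolean },
  raycast: Raycaster
): void {
  if (!input.selectPressed) return;   // no gesture, nothing to do

  const target = raycast(input.origin, input.direction);
  if (!target) return;                // pointing at empty space

  switch (target.kind) {
    case "button":
      target.onActivate();            // UI context: treat the gesture as a button press
      break;
    case "enemy":
      castSpellAt(target);            // game context: same gesture, different meaning
      break;
    default:
      break;                          // ignore ordinary props in this example
  }
}

function castSpellAt(enemy: SceneObject): void {
  console.log(`Casting spell at ${enemy.id}`);
}
```

The design choice worth noting is that the interpretation lives in the scene and its context, not in the input layer: the input layer only says "the user pointed and selected," and the parsing step decides what that means right now.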
Output Channels
Finally, once the user's inputs have been translated and mapped to their environment, the various means by which the environment responds to the user are our output channels. The most obvious and most commonly used at the time of writing is the headset, which presents the 3D environment to the user visually and audibly. This feedback side of VR is in constant development as well, though. One technology that has been gaining traction lately is haptic feedback: physical output, usually delivered through vibrations or adjustable resistance on VR controller triggers. To return to the fantasy video game example, picking up a light object and throwing it might produce minor vibrational feedback, while grabbing and holding a large weapon might produce stronger resistance from the trigger when squeezed. While this part of the VR experience is currently not as varied as the input side, it is still important to understand what avenues of communication are available to you when producing feedback for the user in response to their actions in the virtual environment.
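Below is a small illustrative sketch of how an output channel might be driven from an interaction. The HapticsChannel interface and its methods are invented for this example; real controller APIs differ, but most expose something analogous to a vibration pulse, and some hardware additionally supports adjustable trigger resistance.

```typescript
// Hypothetical output channel: map an interaction in the world back to
// physical feedback on the controller.

interface HapticsChannel {
  pulse(amplitude: number, durationMs: number): void; // 0..1 vibration strength
  setTriggerResistance(resistance: number): void;     // 0..1 squeeze resistance
}

interface HeldObject {
  name: string;
  massKg: number;
}

function onObjectGrabbed(obj: HeldObject, haptics: HapticsChannel): void {
  if (obj.massKg < 1) {
    haptics.pulse(0.2, 50);    // light object: brief, gentle buzz
  } else {
    haptics.pulse(0.6, 120);   // heavy object: stronger confirmation
    // Heavier objects push back harder when the trigger is squeezed.
    haptics.setTriggerResistance(Math.min(obj.massKg / 10, 1));
  }
}
```

As with inputs, keeping output behind a small interface like this means you can swap or add feedback hardware later without rewriting the interactions that trigger it.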
Conclusion
To recap, the three most important things to keep in mind when programming a virtual reality application in any language or framework are: the types of inputs you will receive from the user, the way you interpret those inputs and translate them into the virtual environment, and the available means of outputting the results of the user's actions in the virtual world. With this structure in mind, you can effectively plan your virtual or augmented reality implementations across any number of applications, from interactive 3D videos to sprawling virtual sandboxes. If you are looking for how to get started in VR application programming from here, I would suggest researching frameworks for the type of application you are interested in and starting there.