A nice gesture for XR

Using the human attention system to design, measure, and compare hand gestures for XR experiences.

What is a good hand gesture in an XR experience, and how can its success be measured? As hand recognition has become faster, more accurate, and fairly reliable, gestures are becoming the primary input in XR headsets. Still, designing a gesture is tricky when the principles are unclear and evaluation methods are lacking. Could findings from neuroscience help in designing and measuring hand gestures? Let’s see whether those findings make it possible to design a good hand gesture for walking in VR and to measure its success.

Italian Pinched fingers emoji

Current principles for designing a hand gesture

Below are principles collected from papers and from Meta’s and Apple’s guides for designing hand gestures:

  • Reliable and free from error.
  • Easy to learn. Logically and simply explainable. Memorable.
  • Socially acceptable. No gesture conflicts.
  • Make it comfortable. Minimize muscle usage. Low cognitive workload.
  • Accessible to everyone.
  • Respects multimodality and supports good usability.
  • Provide feedback.
  • Natural and unambiguous. It should fit the experience. Use familiar gestures that match people’s expectations.

Can a good hand gesture for walking in VR be made using these principles? While some guidelines, like providing feedback or using socially acceptable gestures, are easy to understand, others are vague. For instance, what indicates the comfort of a gesture, and how can it be measured? Why can people play Beat Saber for an hour nonstop, while performing the air tap on HoloLens causes fatigue in five minutes? Let’s look at some hand gestures for walking in VR.

Walking in a virtual environment by swinging hands up and down to move forward
Moving around in VRChat by holding and moving the hand forward to move accordingly.
Walking in Waltz of the Wizards by moving hands up and down – Walking in VRChat by touching the middle finger to the thumb – captured by the author.

Daniel Beauchamp on Twitter: "Locomotion in VR has just been solved. Pack it in, folks." pic.twitter.com/mu6CTc03ik

All these creative solutions meet the guidelines to different extents. However, each has downsides, such as a lack of multimodality (the user can’t shoot and walk simultaneously). Plus, there is no standard way to compare the gestures other than user feedback, which can be biased and can vary from one experience to another. So, what can be done to change this situation?

If a hand gesture causes fatigue, the user’s attention will shift from their task to the fatigue, which is why a gesture that doesn’t cause fatigue is better. But what if causing fatigue is acceptable, as long as the user’s attention doesn’t shift away from their primary task?

In Beat Saber, users’ attention stays on the game despite all the movement, unlike with air taps on HoloLens. Shifting attention to a VR experience is so powerful that an FDA-approved VR system for chronic pain reduction exists. So, instead of making a gesture that is easy to learn, doesn’t cause fatigue, and so on, let’s make a gesture that doesn’t shift the user’s attention from their task to something else.

What is Attention, and how can it help in making Hand-Gestures?

The human attention system is a broad topic, so this article mentions only some of the findings. Michael Posner’s work on attention systems is the main source. Here is one of the findings:

Humans can only pay attention to one subject at a time. Multitasking is a myth. What our brain does is switch attention between tasks quickly. This is why driving a car and listening to the radio simultaneously is possible. Attention switches can cause errors and reduce efficiency in a task.

To that point, an ideal hand gesture has a minimal attention-switch cost. Some overlearned tasks can be performed without paying attention. Think of riding a bicycle. Once you bike regularly, you no longer pay attention to how to ride the bike. Keeping your balance becomes automatic. Let’s call such automatic actions and processing Habitual Actions. What habitual action can be borrowed from daily life to create a hand gesture for walking in VR? Using a joystick is the one selected for this article.

On the left, a hand holding the Oculus Quest controller. On the right, the hand gesture of holding the controller.
Hand posture when using a joystick — Photo from Meta

Many people have the muscle memory to use a joystick effortlessly, making it a good candidate for a Habitual Gesture. Here is a clip of the gesture:


Test Design

How can the success of the gesture be evaluated? Attention shift can be measured by Reaction Time (RT). A three-step test was designed in which cues are displayed and the tester reacts to them with an input method. RT is the time between the moment a cue shows up and the moment the input is used. The test is done twice: first with a keyboard and then with the hand gesture. Pressing arrow keys on the keyboard is assumed to be a habitual action with a low attention shift.
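As a rough illustration of the RT definition above, here is a minimal Python sketch of a timer that stores the moment a cue appears and returns the elapsed time when the first input arrives. The class and method names (`ReactionTimer`, `cue_shown`, `input_received`) are made up for this example and are not taken from the demo; the clock is injectable only so the logic can be checked without waiting in real time.

```python
import time


class ReactionTimer:
    """Measures the time between a cue appearing and the first input after it."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock      # any function returning seconds as a float
        self.samples = []       # one reaction time (in seconds) per answered cue
        self._cue_time = None   # None means no cue is currently waiting for input

    def cue_shown(self):
        """Call at the moment the cue becomes visible."""
        self._cue_time = self.clock()

    def input_received(self):
        """Call when the keyboard or gesture input fires; returns the RT or None."""
        if self._cue_time is None:
            return None         # input without a pending cue is ignored
        reaction_time = self.clock() - self._cue_time
        self._cue_time = None
        self.samples.append(reaction_time)
        return reaction_time
```

In a real test, `cue_shown()` would be called from the rendering code and `input_received()` from the keyboard or hand-tracking callback, so that both input methods are timed the same way.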

In the first step, a simple arrow shows up. The tester has to match the arrow’s direction by pressing the corresponding arrow key on the keyboard or by moving in the corresponding direction with the hand gesture.

An arrow in a rectangle pointing forward
Step 1 — simple arrow pointing up.

After five cues in different directions, the second step starts by adding an asterisk next to the arrow. The tester should react only if the arrow points at the asterisk.

On the left, an arrow in a rectangle points forward toward an asterisk. On the right, an arrow in a rectangle points forward but not toward an asterisk.
Step 2 — Left is an example of when the user can react. Right is an example of when the user can’t respond.

After five more cues, the third step starts by adding colors. Now the asterisk’s color has to match the arrow’s color for the tester to react.

On the left, an arrow in a yellow rectangle points forward toward an asterisk in a yellow rectangle. On the right, an arrow in a yellow rectangle points forward but not toward an asterisk in a green rectangle.
Step 3 — Left is an example of when the user can react. Right is an example of when the user can’t react.
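The three steps can be read as a small rule for when a cue should be answered. The Python sketch below is one possible encoding, not the demo’s actual code. The `Cue` fields and function names are invented for illustration, and it assumes that step 3 keeps the step 2 condition (the arrow must still point at the asterisk) in addition to the color match, which the description above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cue:
    arrow_direction: str                  # "up", "down", "left" or "right"
    asterisk_side: Optional[str] = None   # side of the arrow where the asterisk sits
    arrow_color: Optional[str] = None     # only used in step 3
    asterisk_color: Optional[str] = None  # only used in step 3


def should_react(step: int, cue: Cue) -> bool:
    """Return True if the tester is expected to respond to this cue."""
    if step == 1:
        return True  # step 1: every arrow is answered
    points_at_asterisk = cue.asterisk_side == cue.arrow_direction
    if step == 2:
        return points_at_asterisk
    if step == 3:
        # Assumption: the step 2 rule still applies, plus the color match.
        colors_match = (cue.arrow_color is not None
                        and cue.arrow_color == cue.asterisk_color)
        return points_at_asterisk and colors_match
    raise ValueError("step must be 1, 2 or 3")


def is_correct(step: int, cue: Cue, response: Optional[str]) -> bool:
    """response is the direction the tester entered, or None if they held still."""
    if should_react(step, cue):
        return response == cue.arrow_direction
    return response is None
```

Only cues where `should_react` is true and the response is correct would contribute a reaction time; the other cues act as distractors that force the tester to keep paying attention.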

A lower RT to the cues indicates a faster attention shift, which is a positive indicator for an input method. There were 12 testers. On average, the RT for the keyboard was 0.077 seconds (77 ms) shorter than for the hand gesture, which means the keyboard performs slightly better than the gesture.

In the bottom-left corner, a human hand makes the same gesture as holding an Oculus Quest controller, with the thumb moved to the right. At the center of the image, the virtual hand makes the same gesture on the virtual controller. In the right corner of the image, a black arrow in a white rectangle points right toward an asterisk.
Correct test scenario when the arrow direction matches the asterisk — Made by the author.
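The keyboard-versus-gesture comparison comes down to simple arithmetic on the recorded reaction times. The sketch below shows one way to compute it in Python, assuming each tester has one mean RT per input method and that the gap is averaged per tester; the article does not say whether the reported 0.077 seconds was computed this way. The function name is hypothetical, and the numbers in the usage note are placeholders, not the study’s data.

```python
from statistics import mean


def mean_rt_gap(keyboard_rts, gesture_rts):
    """Mean per-tester difference (gesture RT minus keyboard RT), in seconds.

    Both lists hold one mean reaction time per tester, in the same tester order.
    A positive result means the keyboard was faster on average.
    """
    if len(keyboard_rts) != len(gesture_rts) or not keyboard_rts:
        raise ValueError("need one keyboard RT and one gesture RT per tester")
    return mean(g - k for k, g in zip(keyboard_rts, gesture_rts))
```

For example, calling `mean_rt_gap([0.50, 0.60], [0.60, 0.70])` with these made-up values gives a gap of about 0.1 seconds in favor of the keyboard. The same function can compare any two gestures, which is what makes RT usable as a common yardstick.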

The Outcome

So far, the article’s main aim has been to answer two questions: what is a good hand gesture in an XR experience, and how can its success be measured? With Reaction Time, gestures can be measured and compared, and a lower reaction time means a better gesture. Though the walking gesture seems promising, it’s imperfect. Its design can be improved by using more findings about attention beyond Habitual Actions. The gesture feels natural and easy to learn. It was designed to solve the attention shift problem, not fatigue, learning time, etc., and its accuracy could be improved using EMG or BCI.

What about Attention in other XR Design aspects?

Let’s look at near buttons. Pressing a button in the air can be challenging because the user has to position their finger precisely where the button is. Usually, a halo effect highlights the button to show the user that their finger is pointing at it.

There are three holographic buttons. One of the buttons is called Hand Joint, and the user’s finger is on top of it. As a result, the button has a circular halo effect that shows where the finger is relative to the button.
This is Vision Pro’s virtual keyboard. It has circular keys, and the keys are highlighted once the user’s finger gets close to them. The key goes down if the user presses it.
Left: MRTK Halo effect on the “Hand Joint” button — Right: Vision Pro Hover Effect.

While the effect confirms that the finger is close to the button and compensates for the lack of tactile feedback, it doesn’t reduce the attention the user has to spend to get their finger to the right place and to avoid pressing other keys by mistake. Can the button’s design help users press a button precisely while spending less attention? Inspiration: make the intended button bigger.

Apple Watch menu, showing the icons and how they get bigger as they get closer to the center of the screen.
Apple Watch menu: the icon size changes based on the distance to the center.

On the Apple Watch, the central icons are bigger, so the user can see them better and touch them with fewer errors. If you own an Apple Watch, you know that pressing an app icon in the menu is easier than pushing a button in XR. In the walking demo, I made a hand menu for users to select different controller sizes and speeds. When the user’s finger gets closer to a button, that button gets bigger and the other buttons get smaller. A bigger target is easier to hit, so less attention is needed to be precise. It feels much easier to press buttons, and it’s possible to push them without looking directly at them. I might write another article about typing in XR using this and other methods.
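The grow-and-shrink behavior of the hand menu can be sketched as a function from finger position to one scale factor per button. This Python version is only an illustration of the idea, not the demo’s implementation: the function name, the `reach`, `grow`, and `shrink` parameters, and their default values are all assumptions picked for the example.

```python
import math


def button_scales(finger, buttons, reach=0.10, grow=1.5, shrink=0.8):
    """Return a scale factor for each button based on the fingertip position.

    finger  -- (x, y, z) fingertip position in meters
    buttons -- dict mapping a button name to its (x, y, z) center
    reach   -- distance at which the nearest button starts reacting (assumed)
    grow    -- scale of the nearest button when the finger touches it (assumed)
    shrink  -- scale of every other button at that moment (assumed)
    """
    if not buttons:
        return {}
    distances = {name: math.dist(finger, center) for name, center in buttons.items()}
    nearest = min(distances, key=distances.get)
    if distances[nearest] >= reach:
        return {name: 1.0 for name in buttons}  # finger is too far: nothing changes
    # closeness goes from 0 at the edge of reach to 1 when the finger is on the button
    closeness = 1.0 - distances[nearest] / reach
    scales = {}
    for name in buttons:
        if name == nearest:
            scales[name] = 1.0 + (grow - 1.0) * closeness
        else:
            scales[name] = 1.0 - (1.0 - shrink) * closeness
    return scales
```

Interpolating the scale with distance, rather than switching sizes abruptly, is a design choice made here so the buttons do not visibly pop, since a sudden change could itself pull the user’s attention.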


What about Attention in handheld AR?

Placing an object in the world is a typical interaction in AR. Our mind doesn’t like unnatural events. Anything that seems unusual will cause an attention shift.

A chair moving around in the environment based on user's touch. The chair goes beyond walls.
IKEA app — Chair Placement in AR, where the chair goes beyond the wall — captured by the Author

In the IKEA app, the chair passes through the wall and faces the wrong direction, which looks unusual: a chair usually faces away from a wall. The attention shift can be reduced if the chair detects walls, can’t pass through them, and orients itself accordingly. This is another example of putting attention at the center of XR design.

The chair moves in the real environment with the user’s touch input. It does not go beyond the wall, and it orients itself to the wall’s forward direction.
AR Placement Prototype: the chair sticks to the wall and orients to it automatically. Made by the author.
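The placement behavior described above can be approximated with a little vector math. The Python sketch below works in a top-down 2D view and assumes the wall is already known as a point on it plus a normal pointing into the room; in a real app that would come from the AR framework’s plane detection, which is not shown. The function name, the `clearance` and `snap_distance` parameters, and the yaw convention are assumptions for this example, not the prototype’s code.

```python
import math


def place_near_wall(position, wall_point, wall_normal,
                    clearance=0.3, snap_distance=0.5, current_yaw=0.0):
    """Keep a chair on the room side of a wall and turn it away from the wall.

    position     -- desired (x, z) floor position from the touch input
    wall_point   -- any (x, z) point on the wall
    wall_normal  -- (x, z) vector pointing from the wall into the room
    clearance    -- minimum distance kept between chair and wall, in meters
    snap_distance -- within this distance the chair takes the wall's direction
    Returns the corrected (x, z) position and the yaw in degrees, where yaw 0
    means facing +z (an assumed convention).
    """
    nx, nz = wall_normal
    length = math.hypot(nx, nz)
    nx, nz = nx / length, nz / length
    px, pz = position
    # Signed distance from the wall: negative means the chair is behind the wall.
    distance = (px - wall_point[0]) * nx + (pz - wall_point[1]) * nz
    if distance < clearance:
        push = clearance - distance
        px, pz = px + nx * push, pz + nz * push
        distance = clearance
    yaw = current_yaw
    if distance <= snap_distance:
        yaw = math.degrees(math.atan2(nx, nz))  # face along the wall's normal
    return (px, pz), yaw
```

With a wall along the x axis and the room on the positive z side, a touch that lands behind the wall is pushed back to the clearance distance and the chair is turned to face the room, while a touch in the middle of the room leaves both position and rotation unchanged.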


Using attention in design is not new. There are many studies and guides that use users’ attention to design interactions and interfaces for 2D displays. However, there are few studies that do the same for XR. I think many gaps in XR design can be filled by putting users’ attention at the center of the design, because, in my opinion, attention directly connects with immersion, presence, and experience. One gap for me was not knowing how to make a hand gesture: where to start the design and how to evaluate it. I wanted to share my findings and my journey researching the topic. I hope this helps toward a better future for design.

Download the Demo for Quest 2

  • Download the APK file and install it using SideQuest.
  • Enable hand tracking in settings.
  • Be in an environment whose colors differ from the color of your hand.
  • When starting the app, keep your hands open, facing forward.
  • Walking speed and hand direction sizes can be changed in the hand menu.
  • It is possible that none of the sizes will work for your hand shape.


Al-Kalbani, Maadh, Ian Williams, and Maite Frutos-Pascual. “Improving Freehand Placement for Grasping Virtual Objects via Dual View Visual Feedback in Mixed Reality.” 22nd ACM Conference on Virtual Reality Software and Technology, 2016, pp. 279–282. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/2993369.2993401.

Albert, Jeremy, and Kelvin, Sung. “User-Centric Classification of Virtual Reality Locomotion.” 24th ACM Symposium on Virtual Reality Software and Technology, Article №127, pp. 1–2. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3281505.3283376.

Alzayat, Ayman, Mark Hancock, and Miguel A. Nacenta. “Quantitative Measurement of Tool Embodiment for Virtual Reality Input Alternatives.” 2019 CHI Conference on Human Factors in Computing Systems, 2019. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3290605.3300673.

Arrington, Catherine M, and Melissa M Yates. “The role of attentional networks in voluntary task switching.” Psychonomic bulletin & review, vol. 16,4 (2009): 660–5. National Library of Medicine, doi:10.3758/PBR.16.4.660.

Bishop, Ian, and Abid, Rizwan. “Survey of Locomotion Systems in Virtual Reality.” The 2nd International Conference on Information System and Data Mining, 2018, pp. 151–154. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3206098.3206108.

Boletsis, Costas, and Jarl Erik Cedergen. “VR Locomotion in the New Era of Virtual Reality: An Empirical Comparison of Prevalent Techniques.” Advances in Human-Computer Interaction, vol. 2019, APR 2019. Hindawi, https://doi.org/10.1155/2019/7420781.

Cabral, Marcio C, Carlos H. Morimoto, and Marcelo K. Zuffo. “On the usability of gesture interfaces in virtual reality environments.” 2005 Latin American conference on Human-computer interaction, 2005, pp. 100–108. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/1111360.1111370.

Fan, Jin et al. “Testing the efficiency and independence of attentional networks.” Journal of cognitive neuroscience, vol. 14,3 (2002): 340–7. National Library of Medicine, doi:10.1162/089892902317361886.

F. Zhang, S. Chu, R. Pan, N. Ji, and L. Xi, “Double hand-gesture interaction for walk-through in VR environment,” IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS), 2017, pp. 539–544. IEEE Xplore, doi: 10.1109/ICIS.2017.7960051.

Frommel, Julian, Sven Sonntag, and Michael Weber. “Effects of Controller-based Locomotion on Player Experience in a Virtual Reality Exploration Game.” 12th International Conference on the Foundations of Digital Games, Article №30, 2017, pp. 1–6. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3102071.3102082.

Guerrero-García, Josefina, Claudia González, and David Pinto. “Studying User-Defined Body Gestures for Navigating Interactive Maps.” XVIII International Conference on Human Computer Interaction, Article №49, 2017. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3123818.3123851.

Hommel, B., Chapman, C.S., Cisek, P. et al. “No one knows what attention is.” Atten Percept Psychophys, Vol. 81, Oct 2019, pp. 2288–2303. Springer link, https://doi.org/10.3758.

Khundam, Chaowana. “First person movement control with palm normal and hand gesture interaction in virtual reality,” 12th International Joint Conference on Computer Science and Software Engineering (JCSSE), 2015, pp. 325–330. IEEE Xplore, doi: 10.1109/JCSSE.2015.7219818.

Krekhov, Andrey, Katharina Emmerich, Philipp Bergmann, Sebastian Cmentowski, and Jens Krüger. “Self-Transforming Controllers for Virtual Reality First Person Shooters.” Annual Symposium on Computer-Human Interaction in Play, 2017, pp. 517–529. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3116595.3116615.

Lages, Wallace, Mahdi Nabiyouni, and Leonardo Arantes. “Krinkle Cube: A Collaborative VR Game Using Natural Interaction.” 2016 Annual Symposium on Computer-Human Interaction in Play Companion Extended Abstracts, 2016, pp. 186–196. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/2968120.2987746.

Leng, Hoo Yong, Noris Mohd Norowi, and Azrul Hazri Jantan. “A User-Defined Gesture Set for Music Interaction in Immersive Virtual Environment.” 3rd International Conference on Human-Computer Interaction and User Experience in Indonesia, 2017, pp. 44–51. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3077343.3077348.

Li, Yang, and Jin HUANG, and Feng TIAN, and Hong-An WANG, and Guo-Zhong DAI. “Gesture interaction in virtual reality.” Virtual Reality & Intelligent Hardware, Feb 2019, pp. 84–112. ScienceDirect, https://doi.org/10.3724/SP.J.2096-5796.2018.0006.

Oh, Seo Young, Boram Yoon, Hyung-il Kim, and Woontack Woo. “Finger Contact in Gesture Interaction Improves Time-domain Input Accuracy in HMD-based Augmented Reality.” 2020 CHI Conference on Human Factors in Computing Systems, 2020. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3334480.3383098.

Payne, John, Paul Keir, Jocelyn Elgoyhen, Mairghread McLundie, Martin Naef, Martyn Horner, and Paul Anderson. “Gameplay Issues in the Design of Spatial 3D Gestures for Video Games.” In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, 2006, pp. 1217–1222. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/1125451.1125679.

Posner, M I et al. “Is word recognition automatic? A cognitive-anatomical approach.” Journal of cognitive neuroscience, vol. 1, Jan 1989, pp. 50–60. doi:10.1162/jocn.1989.1.1.50.

Posner, M I, and S E Petersen. “The attention system of the human brain.” Annual review of neuroscience vol. 13, 1990, pp. 25–42. National Library of Medicine, doi:10.1146/annurev.ne.13.030190.000325.

Rautaray, Siddharth, and Anupam, Agrawal. “Real Time Gesture Recognition System for Interaction in Dynamic Environment.” Procedia Technology, vol. 4, 2012, pp. 595–599. ScienceDirect, https://doi.org/10.1016/j.protcy.2012.05.095.

Sarupuri, Bhuvaneswari, Simon Hoermann, Frank Steinicke, and Robert W. Lindeman. “TriggerWalking A Biomechanically-Inspired Locomotion User Interface for Efficient Realistic VirtualWalking.” 5th

Silkin, Alex. “A Look into Five Years of Locomotion in Virtual Reality.” ACM SIGGRAPH 2019 Talks, Article №8, July 2019, pp. 1–2. ACM Digital Library, https://0-doi-org.library.scad.edu/10.1145/3306307.3328199.

Slater, Mel, and Sylvia, Wilbur. “A Framework for Immersive Virtual Environments (FIVE): Speculations on the Role of Presence in Virtual Environments.” Presence: Teleoper. Virtual Environ, 1997, pp. 603–616. ACM Digital Library, https://doi.org/10.1162/pres.1997.6.6.603.

A nice gesture for XR was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story.





