First Crit

A few days ago, we had our first crits of our initial design ideas for our course project NUIs. As I’ve previously described, I’m designing a NUI for the Perceptive Pixel for exploration of a MongoDB dataset.

The biggest question around my current design is how exactly collaboration is accomplished with the NUI. Really, though, this question is about more than the NUI itself: how do multiple parties collaborate in data exploration at all? To begin to consider this question, it’s important to understand what the data are in my first target use case.

This use case considers data exploration with the Emotion in Motion dataset. Emotion in Motion is a large-scale experiment that measures subjects’ physiology while they listen to different selections of music. ‘Documents’ in the dataset come primarily in three flavors: trials, signals, and media. An abbreviated trial document looks something like the following:
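(Field names and ObjectId values in the sketch below are illustrative stand-ins; only the media and answers properties, and the one ObjectId picked up again below, reflect the actual data.)

```javascript
// Illustrative sketch of a trial document: most field names and values are
// invented placeholders; media and answers follow the description below.
{
  "_id" : ObjectId("53a1f2c4df872bb71e4e0101"),           // invented
  "media" : [                                             // the three songs this subject heard
    ObjectId("537e601bdf872bb71e4df26d"),
    ObjectId("537e601bdf872bb71e4df26e"),                 // invented
    ObjectId("537e601bdf872bb71e4df26f")                  // invented
  ],
  "answers" : {
    "sex" : "female",                                     // demographic answers (invented values)
    "age" : 27,
    "media_537e601bdf872bb71e4df26d" : {                  // per-song answers; keying scheme is a guess
      "engagement" : 3,
      "liking" : 4                                        // 'liking' rated 4 on a 1-5 scale
    }
  }
}
```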

Those entries in the media property correspond to three different songs to which this subject listened. The answers property contains both demographic information and answers to questions that this subject was asked after listening to each song. For instance, after listening to the song with label ObjectId("537e601bdf872bb71e4df26d") (from the media property), the subject rated their ‘liking’ of the song as 4 on a scale of 1-5. The media ObjectIds point to media documents that look something like this (also abbreviated):
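(Apart from the _id, which is the ObjectId from the trial’s media array, every field below is a guess at the sort of metadata a song entry carries.)

```javascript
// Illustrative sketch of a media document; field names and values are guesses.
{
  "_id" : ObjectId("537e601bdf872bb71e4df26d"),
  "title" : "Some Song Title",                 // invented
  "artist" : "Some Artist",                    // invented
  "file" : "media/R018.wav",                   // invented path to the audio file
  "duration" : 214.6                           // length in seconds; selections vary in length
}
```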

And finally, each media in a trial is associated with a signal document. Here’s an abbreviated example:
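(The array names under signals, other than eda_status and hr_status, are illustrative, and every array is truncated; in the real documents each one holds one sample per instant of the recording.)

```javascript
// Illustrative sketch of a signal document; arrays are truncated, and most
// names are invented, with eda_status and hr_status as described below.
{
  "_id" : ObjectId("53a1f2c4df872bb71e4e0202"),           // invented
  "trial" : ObjectId("53a1f2c4df872bb71e4e0101"),         // back-reference to the trial (assumed)
  "media" : ObjectId("537e601bdf872bb71e4df26d"),         // the song this recording accompanies
  "signals" : {
    "eda_raw" :      [ 2.31, 2.32, 2.33 /* , ... */ ],
    "eda_filtered" : [ 2.30, 2.31, 2.31 /* , ... */ ],
    "hr" :           [ 71.2, 71.4, 71.1 /* , ... */ ],
    "eda_status" :   [ 1, 1, 1 /* , ... 1 = acceptable EDA sample at that instant */ ],
    "hr_status" :    [ 1, 0, 1 /* , ... 0 = unacceptable HR sample at that instant */ ]
  }
}
```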

Each of the properties under the signals property is a very long array, with each entry representing the instantaneous value of a given signal as measured at a specific point in time while listening to the associated media file. The entries in eda_status and hr_status are binary indicators of the acceptability of the EDA and HR signals, respectively, at that moment in time. In addition, we work with a far greater number of features that are derived from the raw and filtered physiological signals.

Looking at one of these combined media/signal/trials in any detail takes a considerable amount of screen space. The problem is, we are approaching 40,000 ‘song listens’, and this number continues to grow daily. Within the next two years, we expect to be well beyond 100,000 listens. So, for a given song, an interface to explore signals from the, say, 2,000 subjects who have listened to that song needs to be carefully considered. And how do we go about creating an interface with which multiple people can work together to explore such a dataset?

The most obvious way to visualize data like this is to create individual plots for each type of signal/feature (tonic EDA and heart rate variability, for instance). These plots are naturally aligned vertically, as they all correspond to a common timebase. How, though, do multiple people easily manipulate and view this visualization? I’ve imagined the scenario for this project to be one in which the Perceptive Pixel is used as a tabletop interface. Thus, the most obvious arrangement of users is on either side of the table. Is each user shown their own separate visualization/interface in the orientation that is correct for them? Is the separation of displays used only during the exploration process and later combined for a larger visualization? If the exploration is to be tightly linked (each party works closely together during the exploration), how is the interface oriented? Or, does a less tightly linked interaction better suit this scenario?
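Setting the collaboration questions aside for a moment, here is a minimal sketch of the stacked-plot layout itself. It is written in browser JavaScript with the plain Canvas API purely for illustration (the eventual implementation stack is still open), and every name in it is mine: each signal gets its own horizontal strip, and all strips share a single time-to-pixel mapping so they stay aligned on the common timebase.

```javascript
// Minimal layout sketch: draw each signal in its own horizontal strip, with one
// shared time-to-x mapping so every strip lines up on the same timebase.
function drawStackedSignals(canvas, signals, sampleRateHz) {
  var ctx = canvas.getContext('2d');
  var names = Object.keys(signals);                 // e.g. ['eda_filtered', 'hr'] (illustrative)
  var stripHeight = canvas.height / names.length;   // one strip per signal/feature
  var duration = signals[names[0]].length / sampleRateHz;

  // Shared mapping from time (seconds) to horizontal pixels.
  function xForTime(t) { return (t / duration) * canvas.width; }

  names.forEach(function (name, i) {
    var samples = signals[name];
    var min = samples.reduce(function (a, b) { return Math.min(a, b); }, Infinity);
    var max = samples.reduce(function (a, b) { return Math.max(a, b); }, -Infinity);
    var top = i * stripHeight;

    ctx.beginPath();
    samples.forEach(function (value, s) {
      var x = xForTime(s / sampleRateHz);
      // Each signal is scaled within its own strip; only the x mapping is shared.
      var y = top + stripHeight - ((value - min) / ((max - min) || 1)) * stripHeight;
      if (s === 0) { ctx.moveTo(x, y); } else { ctx.lineTo(x, y); }
    });
    ctx.stroke();
    ctx.fillText(name, 4, top + 12);                // label the strip
  });
}
```

A sketch like this answers none of the interaction questions above, of course; it only fixes the part of the design I am already sure about.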

These are the kinds of questions that came up during my first crit. Many of them would be easily addressed by mounting the Perceptive Pixel vertically, and in the end, this may be the best solution. I’m still enjoying the challenge of exploring ways to create a collaborative NUI using a tabletop interface that deals with content that is highly sensitive to orientation, though.

Perceptive Pixel

I’ve had the opportunity to think a bit more about this NUI-based tool for MongoDB data exploration and visualization. In addition, I’ve been able to discuss the project with Doug Bowman. I now have a bit more clarity about what I’d like to see from this interface, and what first steps I should take.

First, Chris North introduced Virginia Tech’s new Microsoft Perceptive Pixel at the ICAT Community Playdate last Friday.

From Microsoft:

The Perceptive Pixel (PPI) by Microsoft 55″ Touch Device is a touch-sensitive computer monitor capable of detecting and processing a virtually unlimited number of simultaneous on-screen touches. It has 1920 x 1080 resolution, adjustable brightness of up to 400 nits, a contrast ratio of up to 1000:1, and a display area of 47.6 x 26.8 inches. An advanced sensor distinguishes true touch from proximal motions of palms and arms, eliminating mistriggering and false starts. With optical bonding, the PPI by Microsoft 55” Touch Device virtually eliminates parallax issues and exhibits superior brightness and contrast. And it has built-in color temperature settings to accommodate various environments and user preference.


While the unit is quite impressive, I’m most interested in how this interface might enable something truly unique for this project. Other than the physical space around the unit, there is no limiting factor on the number of users who might view and interact with on-screen content. There is plenty of space for multiple users to carve out their own visualizations, as well. So, I’ll be working with the Perceptive Pixel instead of the iPad. The learning curve will be steeper for me, as I’m already a competent iOS developer, but I think it will be worth the additional effort.

Second, I’m concerned about biting off more than I can chew in this project. Both data exploration and visualization (in particular, of the dataset with which I’m always working) are capabilities I want this tool to have. However, given the duration of the project, trying to go very deep into both might be too ambitious. Instead, I’ll be focusing on developing an interface for collaborative visualization of NoSQL data; data exploration can come later. This will likely mean that the first several iterations use only canned data from the dataset.

So, the first step is to jump into C#. I’m not particularly excited to work on a Microsoft stack, but if this is what working with the Perceptive Pixel requires, so be it. The next step is to begin to brainstorm design ideas–more to come on that this week.

NUI Project

One of our ongoing studies is Emotion in Motion, a large-scale experiment that collects physiological data from people while they listen to selections of music. Emotion in Motion began in 2010, while we were working as Ph.D. researchers at Queen’s University Belfast. It first ran for several months in the Science Gallery in Dublin, Ireland. There, we went through several iterations of the experiment: the questions we asked the participants changed, the music selections changed, and so on. Since Dublin, Emotion in Motion has been staged in New York City, Bergen (Norway), Manila, and the Philippines. We are currently preparing to deploy Emotion in Motion in Taiwan for the entirety of 2015.

The data generated by Emotion in Motion were originally written to formatted text files, and we wrote parsers for these files for each of the environments in which we chose to work. As Emotion in Motion’s life has continued, however, we’ve recognized that we need a better method for storing and accessing these data. Across all of these iterations, while we’ve made a number of changes to the content of the experiment, its overall structure has remained relatively stable: participants watch or listen to some form of media while we record their physiology and ask them questions about their experiences. We decided that a NoSQL database would allow us to store huge numbers of data entities that are similar in many respects but may vary wildly in others. For instance, while we record the same physiological signals from all participants during each media session, the media selections are not all the same length. And while we ask all participants for the same demographic information, we may ask different questions in response to each media selection. The difficulty of representing these varying schemas in an RDBMS’s tables made a NoSQL solution the obvious alternative.
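As a purely illustrative example of that variability (all field names, values, and ObjectIds below are invented), two trials with different question sets and different numbers of media can sit side by side in the same MongoDB collection, where a fixed relational schema would need sparse columns or additional join tables:

```javascript
// Illustrative only: two trials share the same demographic fields but carry
// different per-media questions and different numbers of media selections.
db.trials.insert({
  answers: { sex: "male", age: 34,
             media_537e601bdf872bb71e4df26d: { engagement: 3, tension: 2 } },  // early-iteration questions (invented)
  media: [ ObjectId("537e601bdf872bb71e4df26d") ]
});

db.trials.insert({
  answers: { sex: "female", age: 21,
             media_537e601bdf872bb71e4df26e: { liking: 4, familiarity: 1 },    // later-iteration questions (invented)
             media_537e601bdf872bb71e4df26f: { liking: 2, familiarity: 5 } },
  media: [ ObjectId("537e601bdf872bb71e4df26e"),                               // invented ObjectIds
           ObjectId("537e601bdf872bb71e4df26f") ]
});
```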

So, I now find myself doing a great deal of work in MongoDB. The learning curve has been surprisingly gentle, and I’m very comfortable querying through the scripting interface. One thing I have found myself wanting, though, is an easy means of quick-and-dirty visualization for data exploration and high-level analysis. Currently, my workflow is to refine queries using the scripting interface, pull the data I need from MongoDB, and then use an external tool (MATLAB, R, etc.) to visualize the data. It would be very useful to be able to visualize queries on the fly instead of hopping through this piecemeal workflow. In addition, the modularity of MongoDB queries and aggregations would lend itself well to construction and refinement through a graphical interface.
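As a rough sketch of the kind of modularity I mean (the collection name and answer paths are assumptions, not the real schema), each stage of an aggregation pipeline is a self-contained step that a graphical interface could let a user add, remove, reorder, or tweak independently:

```javascript
// Illustrative pipeline: for one song, find every trial that includes it and
// summarize the post-listening 'liking' ratings. Each stage is an independent
// building block of the sort a GUI could expose for direct manipulation.
var song = ObjectId("537e601bdf872bb71e4df26d");

db.trials.aggregate([
  { $match:   { media: song } },                          // trials in which this song was heard
  { $project: { liking: "$answers.media_537e601bdf872bb71e4df26d.liking",
                _id: 0 } },                               // keep only this song's rating (path assumed)
  { $group:   { _id: null,
                meanLiking: { $avg: "$liking" },          // average rating across subjects
                listens:    { $sum: 1 } } }               // number of listens matched
]);
```

Building and refining a query then becomes a matter of editing one stage at a time, which seems to map naturally onto direct manipulation on a touch surface.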

It’s this real, personal need that has led me to choose building such a tool with a tablet interface as my semester-long project in Doug Bowman’s class on natural user interfaces. Some of the other ideas I was toying with were:

  • Tabletop audio editing tool
  • Gestural music improvisation tool
  • Live music performance looping tool
  • Gestural musical score following tool

The musician in me would love to build any of those tools. Certainly, they would make the project more enjoyable and motivating for me. The researcher in me (who just needs to finish this ****ing dissertation) needs what I’ve described in order to do his work. Practicality and necessity beat out fun and excitement in this case. I’ll post more as the project progresses.

NUIs in Everyday Computing

In a recent post on the Leap Motion blog, Alex Colgan discusses the influence that fictional user interfaces (read ‘user interfaces depicted in movies’) have on the motion controls being developed today. He draws on examples from Minority Report, Ender’s Game, and The Avengers to illustrate his three main points. In short, these are:

  • Successful motion controls ‘make us feel powerful and in control of our environment’.
  • Successful motion controls keep the user in a state of flow.
  • Successful motion controls leverage immersion and ‘anti-immersion’ well.

I’d like to focus on the second of those points. In his post, Colgan references a description of flow from Mihaly Csikszentmihalyi, the psychologist who first proposed the notion:

Human beings seek optimal experiences, where we feel a sense of exhilaration–a deep sense of enjoyment. In these intense moments of concentration, our ego disappears, time stands still. Art. Sport. Gaming. Hacking. Every flow activity shares one thing in common: a sense of discovery, a creative feeling of transporting us into a new reality to higher levels of performance.

Many people who speak of flow (Colgan included) only discuss it as occurring in creative activities, sports, gaming, and the like. Need this be the case? Is enabling a flow state really only a goal fit for user interfaces built for entertainment and gaming (as Wigdor and Wixon might have us believe)?

Csikszentmihalyi says no. To support this (drawing from his thousands of interviews with not only creatives and athletes, but also CEOs, shepherds, and the like), he describes seven indicators that one is in a flow state:

  1. Completely involved in what we are doing–focused, concentrated.
  2. A sense of ecstasy–of being outside everyday reality.
  3. Greater inner clarity–knowing what needs to be done, and how well we are doing.
  4. Knowing that the activity is doable–that our skills are adequate to the task.
  5. A sense of serenity–no worries about oneself, and a feeling of growing beyond the boundaries of the ego.
  6. Timelessness–thoroughly focused on the present, hours seem to pass by in minutes.
  7. Intrinsic motivation–whatever produces flow becomes its own reward.

While I don’t disagree that gaming and entertainment interfaces should aim to be conducive to flow, I’m convinced that flow has a place outside of the latest Call of Duty release. In my work, being completely involved in what I am doing, having inner clarity, having confidence in my abilities, finding serenity, and being excited and motivated to do my work are all certainly desirable and achievable. Furthermore, I would hope that the tools I choose to do my work are conducive to these as well. While NUIs may, on the outside, seem most appropriate for gaming and entertainment, no one has yet convinced me that these are the only applications where they are appropriate. And if they are especially capable of enabling flow, we should be considering ways to incorporate them into all manner of UIs.