

Figure 1. Two-way real-time communication between the field and living rooms.

CoStream@Home: Connected Live Event Experiences

Niloofar Dezfuli, Sebastian Günther, Mohammadreza Khalilbeigi, Max Mühlhäuser
Technische Universität Darmstadt
{niloo,guenther,khalilbeigi,max}@tk.informatik.tu-darmstadt.de

Jochen Huber
Singapore University of Technology and Design & MIT Media Lab
[email protected]

ABSTRACT
Live events can be experienced in two main ways: spectators are either present in-situ, i.e. in the stadium, or they witness the event remotely, e.g. at home or in a pub. Although both experiences concern the same event, they are fundamentally different. In addition, communication between these two 'spectating realms' is rather cumbersome, preventing spectators from co-experiencing events over a distance. In this paper, we contribute a system that aims to bridge this gap by establishing a two-way communication channel between the two realms through sharing user-generated mobile live video. We describe implementation details, depict salient interaction techniques and conclude with promising research directions.

Categories and Subject Descriptors
H.5.2 [Information Interfaces and Presentation]: User Interfaces - Interaction styles

General Terms
Human Factors; Design

Keywords
Mobile live video sharing, event, experience, user-generated content, multimedia sharing, experience sharing, experience co-construction

1. INTRODUCTION
In live sporting events such as soccer matches, sport fans actively attend the event to co-experience something extraordinary. On the other hand, a large number of people gather to watch professional broadcasts in the living room. Although both groups follow the same event, their experiences differ: spectators in the field, who witness the event live, immersed in the atmosphere of the stadium and with full peripheral vision, perceive the event differently from people in living rooms, who watch it through the limited perspective of a professional broadcast. People in living rooms may have access to additional information such as audio commentary, scene replays or information from the Internet (e.g. tweets about the event). However, they lack (1) diverse viewing perspectives, (2) social interaction with the spectators in the stadium and (3) means of sharing event experiences with the 'opposite' realm; e.g. people at home cannot contribute expressions such as emotions to the event experience in the stadium, and vice versa. We believe that mobile devices in both realms can serve as means for mutually contributing to event engagement, leading to more immersive and socially connected experiences during live sporting events.

Given the ubiquity of camera-enabled smartphones, video streaming tools (e.g. Skype or UStream.tv) have the potential to address these challenges. However, users need to know each other upfront, and such tools do not embed the experience into the specific event, neglecting crucial information such as the location of potential video sources (i.e. properly equipped spectators). Existing research has used user-generated media (e.g. microblogs) to understand and enrich social spectating experiences around events [7, 8]. Yet few projects have actually investigated the effect of establishing real-time communication between the field and living rooms. In particular, prior work mainly addressed the challenges of creating live media sharing services in-situ [3, 9] but neglected sharing between remote viewers at home and spectators in the field. In [3], Dezfuli et al. described CoStream, a system for mobile sharing of user-generated live video, in-situ during sporting events (cf. Figure 1: blue arrows indicating in-situ live video sharing). The field trials showed that real-time sharing of different perspectives on the same event has the potential to provide fundamentally new experiences of same-place events. In this work, we go beyond the concept of CoStream [3, 9] by extending the communication channel from in-situ sharing towards remote sharing with viewers at home (cf. Figure 1: green arrow).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. SAM’13, October 21, 2013, Barcelona, Spain. Copyright © 2013 ACM 978-1-4503-2394-9/13/10…$15.00.

Figure 2. Application running on the companion device: (a) the location-based overview of users in the field, (b) the list preview of all users who are in the field and use the CoStream application, along with their application status, (c) the heat-map visualization displaying areas of interest.

Figure 3. The TV application. Three user-generated video streams are visualized; the context menu is activated on the first video stream.

In order to stimulate social interactions and, eventually, enhance user experiences between spectators at home and in the field, we also consider the spatial and gestural information of viewers in front of the TV (e.g. emotional and gestural reactions) in addition to the video sharing channel. We believe that such information can open up novel social interaction possibilities. In this paper, our main objective is to address the experiential gap between users at live events and those engaging remotely. We contribute CoStream@Home, a system that connects both types of spectators through bi-directional mobile live video sharing. The system particularly aids in overcoming limited viewing perspectives, encourages social interactions and facilitates the co-construction of shared experiences across realms. The interaction concepts and techniques are described in the next section, followed by a brief explanation of the system architecture. We conclude by outlining promising research directions to guide future work.

2. SYSTEM DESIGN AND INTERACTION CONCEPTS
The system consists of three components: (a) a main application running on a nearby computer connected to a Samsung Smart TV, with a Kinect camera located atop the TV, (b) an application for users at home running on a companion device (i.e. a smartphone) and (c) a mobile application for users in the field. The latter is a slightly modified version of the CoStream application developed in [3, 9], supporting the connection to users at home. Users at home first log into the system on the companion device using their Facebook account. Upon a successful login, the application connects with the TV application and the professional broadcast is shown on the TV in full-screen mode. The users can further interact with the system using the application on the companion device. We conceptually subdivide the interaction design into three modes, described as follows.

2.1 Awareness and Overview across Realms
Live broadcasts are commonly restricted to professional camera perspectives. These cameras mainly cover the primary scene of the story, but other interesting scenes, such as the reactions of bystanders or friends, are not covered. Therefore, we developed a location-based real-time video broadcast between users at home and those in the field. The mobile cameras of users in the stadium can become 'remote eyes' for viewers at home: they serve as on-demand cameras, so that users can watch from different perspectives and be socially connected through video. The system initially provides an overview of the remote users on the companion device through the Google satellite map view (cf. Fig. 2a). The current location and orientation of remote users in the field (who are logged into the CoStream application [3, 9]) are indicated with a custom-designed marker on the map. The marker decoration shows the Facebook profile image of the corresponding user and reveals whether the user is idle, broadcasting or watching a video stream on her mobile device. To get a quick overview of all users, the user can tap the 'Users' button to open a slider containing a list of all users in the field along with their application status (cf. Fig. 2b). To provide richer context, we designed a heat-map visualization (cf. Fig. 2c) displaying areas of interest during live events. It can be activated by tapping the corresponding button on the left side of the interface. The heat map is visualized when two or more users record videos with overlapping viewing angles.
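To make the heat-map trigger concrete, overlapping viewing angles can be estimated from the location and compass orientation that each mobile client already reports. The following is a minimal sketch rather than the system's actual implementation: it models each camera view as a 2D ray in a local metric coordinate frame and tests whether two rays converge on a common point; the 100 m camera range and the east/north coordinate frame are our assumptions.

    /** Sketch: detect whether two spectators' camera views converge. */
    public class ViewOverlap {

        static final double MAX_RANGE_M = 100.0; // assumed maximum useful camera range

        /**
         * Positions are in a local east/north frame in metres; bearings are
         * compass degrees (0 = north, clockwise), as reported by the phones.
         */
        static boolean viewsOverlap(double[] p1, double bearing1,
                                    double[] p2, double bearing2) {
            double[] d1 = direction(bearing1);
            double[] d2 = direction(bearing2);

            // Solve p1 + t1*d1 == p2 + t2*d2 for the ray parameters t1, t2.
            double denom = cross(d1, d2);
            if (Math.abs(denom) < 1e-9) return false; // parallel views never converge
            double[] dp = {p2[0] - p1[0], p2[1] - p1[1]};
            double t1 = cross(dp, d2) / denom;
            double t2 = cross(dp, d1) / denom;

            // Overlap: both cameras face the intersection point and it is in range.
            return t1 > 0 && t1 <= MAX_RANGE_M && t2 > 0 && t2 <= MAX_RANGE_M;
        }

        static double[] direction(double bearingDeg) {
            double rad = Math.toRadians(bearingDeg);
            return new double[]{Math.sin(rad), Math.cos(rad)}; // east, north
        }

        static double cross(double[] a, double[] b) {
            return a[0] * b[1] - a[1] * b[0];
        }

        public static void main(String[] args) {
            // Two spectators 50 m apart, both angled towards a point between them.
            System.out.println(viewsOverlap(new double[]{0, 0}, 45,
                                            new double[]{50, 0}, 315)); // true
        }
    }

With more than two users, running this pairwise test and accumulating the intersection points would yield the density data behind the heat-map visualization.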

2.2 Active Engagement and Social Interaction
Watching a live sporting event at home is currently isolated from the sport fans present in the stadium: remote users cannot contribute to the overall event experience in the stadium, and vice versa. To address this, we developed a push-and-pull mechanism so that users at home and in the field can mutually notify each other about interesting scenes. More specifically, TV viewers can request spectators to start streaming a scene from their perspective (or watch an already broadcast stream) by tapping on their marker. The stream is then shown at the bottom of the TV screen in picture-in-picture mode (cf. Fig. 3). To interact (play/pause, relocate, zoom or close) with the video streams shown on the TV screen, we developed a remote controlling (RC) mode on the companion device.

Figure 4. Gestures currently recognized by the Kinect: (a) cheering, (b) frustration, (c) clapping.

Figure 5. Technical architecture of the system.

The RC mode is activated when users hold the mobile phone as they would grasp a conventional TV remote control (one-handed, in portrait mode and slightly slanted). Once the RC mode is activated, the mobile screen freezes and a blue frame is visualized around the selected video stream on the TV. Users can then navigate to other video streams by performing a simple swipe gesture on the mobile phone. To play or pause a stream, users simply tap the mobile screen, and a pinch gesture enlarges the stream preview on the TV screen. A long press activates a context menu on the selected stream (cf. Fig. 3), offering the following functions (a sketch of how these gestures might be mapped to TV commands follows the list):

- Move: users can arbitrarily move the video stream window across the TV screen. This feature is particularly helpful when the default position of streams (bottom of the TV screen) disrupts the professional TV broadcast.

- Swap: this function switches the inset (selected stream) with the full-screen view (professional broadcast). In this way, users can quickly watch the live user-generated content in full-screen mode; the professional broadcast becomes an inset in the stream list.

- Delete: removes the stream from the TV screen.

Conversely, users in the stadium can ask TV viewers to start a video stream. Upon a stream request from the field, TV viewers receive a notification on their mobile phone and can then activate the Kinect webcam mode to start streaming to the field.
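As an illustration of how the RC mode might map touch input to TV commands, the sketch below uses Android's GestureDetector to recognize taps, long presses and horizontal swipes, and forwards a plain-text command to the TV application. The command vocabulary (PLAY_PAUSE, CONTEXT_MENU, ...) and the CommandChannel wrapper are hypothetical, not the actual CoStream@Home protocol; pinch-to-zoom would additionally use Android's ScaleGestureDetector.

    // Sketch of an Android touch overlay for the RC mode (assumed command protocol).
    import android.content.Context;
    import android.view.GestureDetector;
    import android.view.MotionEvent;
    import android.view.View;

    public class RemoteControlOverlay implements View.OnTouchListener {

        /** Hypothetical wrapper around the TCP connection to the TV application. */
        public interface CommandChannel { void send(String command); }

        private final GestureDetector detector;

        public RemoteControlOverlay(Context context, final CommandChannel channel) {
            detector = new GestureDetector(context,
                    new GestureDetector.SimpleOnGestureListener() {
                @Override
                public boolean onSingleTapUp(MotionEvent e) {
                    channel.send("PLAY_PAUSE");   // tap toggles playback
                    return true;
                }

                @Override
                public void onLongPress(MotionEvent e) {
                    channel.send("CONTEXT_MENU"); // long press opens the context menu
                }

                @Override
                public boolean onFling(MotionEvent e1, MotionEvent e2,
                                       float velocityX, float velocityY) {
                    // Horizontal swipe moves the blue selection frame between streams.
                    channel.send(velocityX > 0 ? "SELECT_NEXT" : "SELECT_PREV");
                    return true;
                }
            });
        }

        @Override
        public boolean onTouch(View v, MotionEvent event) {
            return detector.onTouchEvent(event);
        }
    }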

2.3 Implicit Communication
Typical social TV systems enable viewers to explicitly share experiences through text, voice or video chat [2]. However, these means are time-critical and may distract viewers; texting, for instance, requires a great deal of a user's attention [4]. We believe that, in addition to explicit multimedia sharing, implicit communication means can open up novel experiences, e.g. a viewer's spatial and gestural information in front of the TV, such as emotional and gestural reactions. Postures and emotional expressions have the advantage that they do not distract users from the event. In our system, we developed an interaction concept in which the reactions of TV viewers to key moments are implicitly transferred to sport fans in stadiums. The system continuously processes the skeleton-tracking data coming from the Kinect depth camera and can recognize three common expressions:

- Cheering (cf. Fig. 4a): when users raise both hands as a sign of cheering and appreciation.

- Frustration (cf. Fig. 4b): when users quickly move their hands as a sign of frustration and discouragement.

- Clapping (cf. Fig. 4c): when users clap their hands quickly and repeatedly to express appreciation or approval.

In addition to the visual notification in the mobile application, sport fans in the field are cued through vibro-tactile feedback on their mobile phones. We argue that such real-time, multi-channel communication has great potential to enrich both social and user experiences while watching events in living rooms. We do not claim that the above expressions are an exhaustive set of human expressions; we believe, however, that they are salient and frequently performed. With the advent of more precise and higher-resolution depth-sensing technology, future work should consider recognizing finer and more detailed facial expressions of TV viewers.
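To illustrate how such expressions can be derived from skeleton-tracking data, the sketch below classifies the cheering pose from per-frame joint positions. The joint names, the Vec3 type and the 10 cm threshold are hypothetical stand-ins for whatever the Kinect pipeline delivers; frustration and clapping are temporal patterns and would additionally require tracking hand velocity over a short window of frames.

    import java.util.Map;

    /** Sketch: rule-based detection of the 'cheering' pose from skeleton joints. */
    public class ExpressionRecognizer {

        /** Minimal 3D point; stands in for the Kinect pipeline's joint type. */
        public record Vec3(float x, float y, float z) {}

        // Assumed: a hand must be at least 10 cm above the head to count as raised.
        private static final float RAISE_THRESHOLD_M = 0.10f;

        /** Joints keyed by hypothetical names: "head", "leftHand", "rightHand". */
        public static boolean isCheering(Map<String, Vec3> joints) {
            Vec3 head = joints.get("head");
            Vec3 left = joints.get("leftHand");
            Vec3 right = joints.get("rightHand");
            if (head == null || left == null || right == null) return false;
            // Both hands clearly above the head => cheering pose.
            return left.y() > head.y() + RAISE_THRESHOLD_M
                && right.y() > head.y() + RAISE_THRESHOLD_M;
        }
    }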

3. IMPLEMENTATION
The system implementation is based on three main components: (i) a TV application, (ii) an Android application for the secondary device and (iii) a centralized server (cf. Fig. 5). The TV application runs on a nearby PC connected to a Samsung Smart TV and is implemented in Java using the JavaFX¹ framework. To render the video streams and the professional broadcast, we use the vlcj² framework, which provides direct Java bindings to the VLC media player.
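For illustration, a minimal vlcj player (3.x API) that renders one such HTTP stream in an embedded component could look as follows; the window title and stream URL are placeholders, and the actual TV application composes several players inside its JavaFX interface.

    import javax.swing.JFrame;
    import javax.swing.SwingUtilities;
    import uk.co.caprica.vlcj.component.EmbeddedMediaPlayerComponent;

    /** Sketch: play one HTTP video stream with vlcj (requires a local VLC install). */
    public class StreamViewer {
        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> {
                JFrame frame = new JFrame("CoStream@Home stream");
                EmbeddedMediaPlayerComponent player = new EmbeddedMediaPlayerComponent();
                frame.setContentPane(player);
                frame.setSize(800, 450);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.setVisible(true);
                // Placeholder URL; the server republishes user-generated RTP input over HTTP.
                player.getMediaPlayer().playMedia("http://example-server:8080/stream1");
            });
        }
    }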

The application on the secondary device is implemented in Java for the Android platform. It automatically establishes a TCP connection to the TV application over the local WiFi network. At startup, we use the Facebook API to authenticate users and synchronize their friends. Upon successful authentication, the application sends the login information to both the server and the TV application.
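A minimal sketch of announcing the login over that TCP connection is shown below; the newline-delimited 'LOGIN' message is an assumed format, as the paper does not specify the wire protocol.

    import java.io.PrintWriter;
    import java.net.Socket;

    /** Sketch: announce a successful login to the TV application (assumed format). */
    public class LoginAnnouncer {
        public static void announce(String host, int port,
                                    String userId, String name) throws Exception {
            try (Socket socket = new Socket(host, port);
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                // Newline-delimited message; the real wire protocol is not specified.
                out.println("LOGIN " + userId + " " + name);
            }
        }
    }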


The remote controlling mode is an overlay that tracks touch positions and finger movement on the mobile screen. If a swipe or touch gesture is detected, the application sends the recognized command to the TV. The last component of the system is the centralized server. It is implemented in Java and handles the communication between the home and in-the-field clients using Remote Procedure Calls. The server is pull-based, so that it avoids issues with blocked ports and firewalls on the client side. Both the TV and the Android applications in the living room are connected to the server.
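The pull-based design means that clients initiate every exchange: instead of the server pushing events through connections it opens itself, each client periodically asks the server for pending messages, so only outbound client connections are needed. A sketch of such a client loop, against a hypothetical EventService RPC interface, might look as follows.

    import java.util.List;

    /** Sketch of the pull pattern: clients call the server; the server never dials in. */
    public class PullClient {

        /** Hypothetical RPC interface exposed by the centralized server. */
        public interface EventService {
            List<String> fetchPendingEvents(String clientId); // queued messages, possibly empty
        }

        private final EventService server; // stub obtained from the RPC framework
        private final String clientId;

        public PullClient(EventService server, String clientId) {
            this.server = server;
            this.clientId = clientId;
        }

        /** Poll in a loop; outbound-only traffic passes firewalls and NAT. */
        public void run() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                for (String event : server.fetchPendingEvents(clientId)) {
                    handle(event); // e.g. a stream request or a gesture notification
                }
                Thread.sleep(500); // assumed polling interval
            }
        }

        private void handle(String event) {
            System.out.println("event: " + event);
        }
    }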

¹ http://docs.oracle.com/javafx/
² http://www.capricasoftware.co.uk/projects/vlcj/
³ http://www.videolan.org/projects/vlma/

Furthermore, the server is responsible for decoding incoming and encoding outgoing video streams. We run a VideoLAN Manager (VLM)³ instance, which is responsible for storing the user-generated videos on the server. Currently, we expect RTP streams as input and use VLM to redistribute them as HTTP streams for better compatibility across platforms.

4. CONCLUSION AND FUTURE WORK


In this paper, we advocated the use of user-generated mobile live video sharing, together with viewers' gestural information, to bridge the experiential gap between in-situ and remote event experiences. We presented a system that addresses these issues, along with three interaction concepts, as a first step toward supporting fundamentally new live sporting event experiences. The concepts are particularly designed to foster active engagement between spectators in both realms, such as sharing location-based live user-generated videos between TV viewers and spectators in the field. We plan to conduct field studies using our system. We conclude with several research questions that set the stage for future research: (1) How can we establish two-way, multi-channel communication in real time without distracting TV viewers or spectators from the actual event? (2) How can these techniques be situated within the existing design space of watching a live television broadcast of an event? (3) How can a TV viewer's spatial information, such as orientation, postures and gestures, be leveraged for designing novel interactions? Finally, (4) how will the proposed interaction techniques as a whole affect the overall remote watching experience and social interactions?

5. REFERENCES
[1] Bentley, F. and Groble, M. TuVista: meeting the multimedia needs of mobile sports fans. In Proc. MM '09. ACM, 471-480.

[2] Coppens, T., Trappeniers, L., & Godon, M. AmigoTV: towards a social TV experience. In Proc. EuroITV'04.

[3] Dezfuli, N., Huber, J., Churchill, E. and Mühlhäuser, M. CoStream: co-construction of shared experiences through mobile live video sharing. To appear in Proc. of the 27th BCS Conference on Human-Computer Interaction (BCS-HCI '13).

[4] Geerts, D. Comparing voice chat and text chat in a communication tool for interactive television. In Proc. NordiCHI'06, ACM, 461-464.

[5] Jacucci, G., Oulasvirta, A., Ilmonen, T., Evans, J., Salovaara, A. CoMedia: Mobile Group Media for Active Spectatorship. In Proc. CHI '07. ACM, 1273-1282.

[6] Lux, M., Huber, J., Why did you record this video? An exploratory study on user intentions for video production. In Proc. WIAMIS’12, 1-4.


[7] Marcus, A., Bernstein, M. S., Badar, O., Karger, D. R., Madden, S., Miller, R. C. TwitInfo: aggregating and visualizing microblogs for event exploration. In Proc. CHI '11. ACM, 227-236.

[8] Sahami Shirazi, A., Rohs, M., Schleicher, R. Real-Time Nonverbal Opinion Sharing through Mobile Phones during Sport Events. In Proc. CHI '11. ACM, 307-310.

[9] Dezfuli, N., Huber, J., Olberding, S., and Mühlhäuser, M., CoStream: in-situ co-construction of shared experiences through mobile video sharing during live events. In Proc. CHI EA '12. ACM, 2477-2482.