- APPLICATION NOTE
- Open Access
SocioGlass: social interaction assistance with face recognition on Google Glass
Scientific Phone Apps and Mobile Devices volume 2, Article number: 7 (2016)
We present SocioGlass, a system built on Google Glass paired with a mobile phone that provides the user with in-situ information about an acquaintance during face-to-face communication. The system recognizes faces from the live feed of visual input and retrieves relevant information about the person whose face matches an entry in the database. To provide interaction assistance, multiple aspects of personal information are categorized by their relevance to the interaction scenario or context, so that the assistance can be adapted to the social context. The system can be used to help acquaintances build relationships, or to assist people with memory problems.
Face-to-face communication between acquaintances is occasionally hindered by self-consciousness in public places and by a lack of understanding of the other person’s background and interests. This may lead to social stagnation even when one has a genuine intention to socialize and build relationships. Interaction assistance may help alleviate such communication barriers, thus facilitating interpersonal relationships (Fiske and Taylor 2007). So far, the few solutions that have been proposed are restricted to prosthetic memories, typically implemented on mobile devices (Kane et al. 2012); they barely provide real-time assistance for face-to-face communication.
Wearable solutions, such as Google Glass, make the retrieval of information easier in routine communication (Ha et al. 2014): relevant information can be retrieved and displayed with little effort from the user. In this app, we attempt to provide in-situ social interaction assistance by automatically retrieving personal information on Google Glass. The system, called SocioGlass, is built on two core techniques: (1) dynamic face recognition, and (2) adaptive retrieval of personal information. It attempts to augment human cognition, particularly memory, in the context of social interaction.
There are occasions when a person cannot remember the name or other details of an acquaintance. When this happens, a tool that helps the person retrieve the relevant information may prevent possible distraction or offense. Another use case is socializing in a business context, such as a conference or a business meeting. The app recognizes persons who are of interest to the user and provides relevant information, such as name, affiliation, and research interests. This may make it easier for the user to initiate a conversation and to build rapport.
The application serves as an external memory for the user. To this end, a person’s biographical information is pre-registered together with his or her facial features. The biographical information consists of a number of items organized into six categories (Table 1). Since the relevance of different aspects of personal information is contingent on the social context (e.g., a formal meeting or a casual conversation) (Xu et al. 2015), the biographical information is organized according to how important an item or a category is to the current context. The app uses the environment as the context of a social interaction, identified with a scene recognition method (Li et al. 2014). However, we did not activate automatic scene recognition for context awareness, because the recognizer was trained on specific scenes in a particular building and may therefore not recognize the context in an unconstrained environment. In the current app, the context is selected manually.
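As an illustration, the context-dependent ordering of categories can be sketched as follows. The category names, contexts, and weights below are invented for the example; they are not the actual Table 1 data or the app’s implementation.

```python
# Hypothetical sketch of context-dependent ordering of biographical
# categories. All names and weights here are illustrative only.
CONTEXT_WEIGHTS = {
    "business_meeting": {"Work": 3, "Education": 2, "Hobbies": 1},
    "casual_chat": {"Hobbies": 3, "Family": 2, "Work": 1},
}

def rank_categories(context, categories):
    """Order categories by their assumed relevance to the selected context."""
    weights = CONTEXT_WEIGHTS.get(context, {})
    return sorted(categories, key=lambda c: weights.get(c, 0), reverse=True)

# In a casual conversation, hobbies would surface before work details.
print(rank_categories("casual_chat", ["Work", "Education", "Hobbies", "Family"]))
# ['Hobbies', 'Family', 'Work', 'Education']
```

In the real app the context is selected manually, so such a ranking would be recomputed whenever the user switches context.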
Meanwhile, facial features of a few persons are pre-registered in the system. In particular, for each person, images of the facial area are captured from multiple angles. These facial images are fed to the system to extract key facial features for identification. Details of the face recognition algorithm are given in the subsection Software.
The system consists of a wearable camera (Google Glass) and a smartphone running Android OS, connected via Bluetooth. The camera provides egocentric vision/perception, i.e., it shares a similar viewing angle with the user. The application adopts a client-server-cloud structure. At the front end, the client (Google Glass) acquires images, displays results, and issues voice instructions. The server runs on the mobile phone, which relays the images to the cloud, where they are processed to find matching faces. If a person is recognized, the personal information is sent back to the client and displayed to the user. Figure 1 shows a photo of a user demonstrating the Glass and the smartphone running SocioGlass.
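The client-server-cloud flow can be summarized as a simple pipeline. The sketch below is conceptual only; the function names and the returned fields are invented, and the real system exchanges binary image data over Bluetooth and the Internet rather than Python calls.

```python
# Conceptual sketch of the SocioGlass client-server-cloud flow.
# Function names and data are illustrative, not the actual implementation.

def glass_capture():
    """Client (Google Glass): capture a camera frame."""
    return b"jpeg-bytes"

def phone_relay(frame):
    """Server (mobile phone): relay the frame to the cloud."""
    return cloud_recognize(frame)

def cloud_recognize(frame):
    """Cloud: match the face and return the person's information."""
    return {"name": "Alice", "affiliation": "Example Lab"}

def glass_display(info):
    """Client (Google Glass): render the retrieved information."""
    return f"{info['name']} ({info['affiliation']})"

print(glass_display(phone_relay(glass_capture())))
# Alice (Example Lab)
```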
The core technology of the application is face recognition. When the app starts, the Glass camera captures image sequences continuously at a resolution of 640 × 480 pixels. To reduce the data flow between the devices, only the central region (320 × 240 pixels) of each frame is cropped out and used for processing.
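The central crop amounts to a fixed offset computed from the frame and crop sizes; a minimal sketch (the helper function is ours, not the app’s API):

```python
def center_crop(width, height, crop_w, crop_h):
    """Return (x, y, w, h) of the centered crop region within a frame."""
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h

# For the 640x480 Glass frames cropped to the central 320x240 region:
print(center_crop(640, 480, 320, 240))  # (160, 120, 320, 240)
```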
SocioGlass builds on the face recognition algorithm proposed by Mandal et al. (2014). The algorithm performs face detection using OpenCV, eye localization using OpenCV and ISG (Integration of Sketch and Graph) patterns, and face recognition using an extended eigenfeature regularization and extraction (ERE) approach with a subclass discriminant analysis method (Mandal et al. 2015). This method has been evaluated extensively on numerous large face image databases. For unconstrained face recognition, it achieves an error rate of 11.6 % with just 80 features on the challenging YouTube face image database (Mandal et al. 2015). With a similar number of features on a wearable-device database of 88 people comprising about 7075 face images, captured mostly with Google Glass, it achieves more than 90 % accuracy with just 7 images in the gallery (Mandal et al. 2014). Furthermore, to support accurate and fast face recognition in dynamic environments (e.g., varying illumination, motion blur, and changes in viewing angle), we adopt a multi-threaded asynchronous structure that leverages opportunistic multi-tasking on the smartphone (Chia et al. 2015).
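One way to picture the multi-threaded asynchronous structure is a drop-stale frame handoff: the capture side always replaces any unprocessed frame with the newest one, so the recognition side never wastes time on outdated images. The sketch below shows only this handoff pattern, not the authors’ actual implementation (which runs on Android).

```python
import queue

# A one-slot queue: at most one pending frame between capture and recognition.
latest = queue.Queue(maxsize=1)

def push_frame(frame):
    """Capture side: replace any stale, unprocessed frame with the newest one."""
    try:
        latest.get_nowait()   # discard the stale frame still waiting
    except queue.Empty:
        pass
    latest.put_nowait(frame)

def next_frame():
    """Recognition side: block until the newest frame is available."""
    return latest.get()

for f in range(5):            # simulate five frames arriving back-to-back
    push_frame(f)
print(next_frame())           # only the newest frame (4) gets processed
```

In the real system, `push_frame` would run in the camera thread and `next_frame` in the recognition thread, so a slow recognition pass simply causes intermediate frames to be skipped.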
The APK files can be downloaded for the mobile phone (http://perception.i2r.a-star.edu.sg/socioglass/socioglassserver-debug.apk) and for Google Glass (http://perception.i2r.a-star.edu.sg/socioglass/socioglass-debug.apk). Instructions on the installation and usage of the app are available in the user guide (http://perception.i2r.a-star.edu.sg/socioglass/SocioGlass_user_guide-v1.pdf).
The interaction protocol of SocioGlass is as follows.
First, the mobile phone and Google Glass must be paired and connected via Bluetooth, and the mobile phone must be connected to the cloud server via the Internet. The user starts the app on the phone (Fig. 2b) and can then place the phone aside, as all subsequent interaction is with the Glass.
When the user starts the app on the Glass, the camera of Google Glass begins to capture a live image feed and sends it to the phone via the Bluetooth connection (Fig. 2a). A textual instruction, “Place the target person in the box below”, is shown in the Glass display, prompting the user to position the Glass so that the face of the target person falls within the central region of the display.
The phone receives the images from the Glass and starts to detect faces. If a face is detected, it is matched against the pre-registered faces in the database. If a match is found, the person’s biographical information is retrieved and sent to the Glass, which displays it together with a portrait photo of that person (Fig. 2c). The portrait photo is displayed on the left side and the biographical information on the right. The portrait photo and the text under it (i.e., name, position, and company) are always visible. The user can navigate between categories (the bottom-right tab menus) by swiping forward or back on the Glass touch pad, and browse items within a category (if it contains more than three items) by swiping up or down. To make the information easily accessible, we include an “All” category, which lists all of a person’s information items in alphabetical order.
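The swipe navigation described above can be modeled as a small state machine over categories and items. This is an illustrative sketch of that model only; the class, its method names, and the sample categories are invented, and the real UI is an Android Glass interface.

```python
class InfoBrowser:
    """Sketch of the Glass navigation model (not the actual app code).

    Forward/back swipes move between category tabs; up/down swipes move
    between items when a category holds more than three of them.
    """

    def __init__(self, categories):
        self.items = categories          # category name -> list of items
        self.names = list(categories)    # tab order
        self.cat = 0                     # current category index
        self.item = 0                    # current item offset

    def swipe_forward(self):
        self.cat = (self.cat + 1) % len(self.names)
        self.item = 0                    # new tab starts at its first item

    def swipe_back(self):
        self.cat = (self.cat - 1) % len(self.names)
        self.item = 0

    def swipe_down(self):
        items = self.items[self.names[self.cat]]
        if len(items) > 3 and self.item + 1 < len(items):
            self.item += 1               # scroll only in long categories

    def current(self):
        name = self.names[self.cat]
        return name, self.items[name][self.item]
```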
If the system detects a face that cannot be recognized, i.e., whose identity is not enrolled in the database, it displays “Unknown”.
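The recognized-or-unknown decision is essentially a nearest-neighbor match with a rejection threshold. The sketch below illustrates that decision with cosine similarity on toy feature vectors; the gallery entries, vectors, and threshold are made up, and the actual system uses the ERE features of Mandal et al. rather than this matcher.

```python
import math

# Toy gallery of pre-registered feature vectors (illustrative only).
GALLERY = {
    "Alice": [0.9, 0.1, 0.0],
    "Bob":   [0.1, 0.8, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def identify(features, threshold=0.8):
    """Return the best-matching identity, or 'Unknown' below the threshold."""
    name, score = max(((n, cosine(features, f)) for n, f in GALLERY.items()),
                      key=lambda t: t[1])
    return name if score >= threshold else "Unknown"

print(identify([0.88, 0.12, 0.0]))   # close to Alice's features -> Alice
print(identify([0.0, 0.0, 1.0]))     # no good match -> Unknown
```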
The procedure is shown in the video that can be downloaded at http://perception.i2r.a-star.edu.sg/socioglass/Socioglass.mp4.
Wearable egocentric vision establishes the foundation for in-situ support of social interactions. This app implements an assistive tool on Google Glass to support face-to-face communication. With carefully designed system functionality and user interfaces, the system is expected to facilitate communication among acquaintances. It serves as a memory aid to remember, recall, and log faces with identities, for people with cognitive decline or those who wish to improve their social interaction skills. The system should also interest readers who want to build on it for interaction assistance, as well as those interested in visual computing, face recognition, and interaction design for wearable devices.
Chia, SC, Mandal B, Xu Q, Li L, Lim JH. Enhancing social interaction with seamless face recognition on Google glass - leveraging opportunistic multi-tasking on smart phones. In: MobileHCI’15. ACM: 2015. p. 750–7.
Fiske, ST, Taylor SE. Social cognition: from brains to culture. New York: McGraw-Hill Higher Education; 2007.
Ha, K, Chen Z, Hu W, Richter W, Pillai P, Satyanarayanan M. Towards wearable cognitive assistance. In: MobiSys’14. ACM: 2014. p. 68–81.
Kane, SK, Linam-Church B, Althoff K, McCall D. What we talk about: designing a context-aware communication tool for people with Aphasia. In: ASSETS’12. ACM: 2012. p. 49–56.
Li, L, Goh W, Lim JH, Pan SJ. Extended spectral regression for efficient scene recognition. Pattern Recognit. 2014; 47(9):2940–51.
Mandal, B, Chia SC, Li L, Chandrasekhar V, Tan C, Lim JH. A wearable face recognition system on Google glass for assisting social interactions. In: Computer Vision - ACCV 2014 Workshops. Switzerland: Springer: 2014. p. 419–33.
Mandal, B, Li L, Chandrasekhar V, Lim JH. Whole space subclass discriminant analysis for face recognition. In: IEEE International Conference on Image Processing (ICIP). IEEE: 2015. p. 329–33.
Xu, Q, Mukawa M, Li L, Lim JH, Tan C, Chia SC, Gan T, Mandal B. Exploring users’ attitudes towards social interaction assistance on Google glass. In: ACM AH’15. ACM: 2015. p. 9–12.
The authors declare that they have no competing interests.
QX proposed system design, data modeling, user interface (UI) and experiment design. SCC developed the Android program to implement the system and UIs. BM provided the core face recognition algorithm. LL contributed to the system design, face recognition algorithm and personal data modeling. JHL proposed the system architecture and ways to improve system performance. MAM contributed to prototype development and experimental testing. CT proposed use cases and interaction protocol. All authors read and approved the final manuscript.