
A photograph of a library in Göteborg, Sweden. Credit: Marcus Hansson.

What interfaces mediate

Andrew J. Ko

In the last two chapters, we considered two fundamental perspectives on user interface software and technology: a historical one, which framed user interfaces as a form of human augmentation, and a theoretical one, which framed interfaces as a bridge between sensory worlds and the computational world of functions and state. A third, and equally important, perspective is the role interfaces play in society. While user interfaces are inherently computational artifacts, they are also inherently sociocultural entities.

Broadly, I view the sociocultural role of interfaces as a mediating role. Mediation is the idea that rather than two entities interacting directly, something controls, filters, transacts, or interprets the interaction between them. For example, one can think of human-to-human interactions as mediated, in that our thoughts, ideas, and motivations are mediated by language. In that same sense, user interfaces can mediate human interaction with many things. Mediation is important because, as Marshall McLuhan argued, media (which by definition mediate) can lead to subtle, sometimes invisible structural changes in society's values, norms, and institutions (McLuhan 1994). In other words, how you design a user interface can change society, in addition to providing functionality. Apple's Face ID, for example, may be a convenient alternative to passwords, but it may also lead to transformations in how society thinks of privacy and identity.

In this chapter, we will discuss how user interfaces mediate access to three things: automation, information, and other humans.

Mediating automation

As the theory in the last chapter claimed, interfaces primarily mediate computational automation. Whether it's calculating trajectories in World War II or detecting faces in a collection of photos, the vast range of algorithms from computer science that process, compute, filter, sort, search, and classify information provide real value to the world, and interfaces are how people access that value.

One can think of interfaces to computation as APIs (application programming interfaces), organized collections of functionality and data structures that encapsulate computational automation. For example, think about what you're using when you're operating a calculator app on your phone: it's really a way of asking a computer to perform basic arithmetic operations on your behalf. Each operation is a function that takes some arguments (addition, for example, is a function that takes two numbers and returns their sum). In this example, the calculator is literally an interface to an API of mathematical functions that compute things for a person. But consider a very different example, such as the camera application on a smartphone. This is also an interface to an API, with a single function that takes as input all of the light reaching a camera sensor, plus a dozen or so configuration options for focus, white balance, orientation, and so on, and returns a compressed image file that captures that moment in space and time. This even applies to more intelligent interfaces you might not think of as interfaces at all, such as driverless cars. What is a driverless car but one big function, called over and over in real time, that takes your location, your destination, and all of the visuospatial information around the car as input, and computes an acceleration and direction? The calculator, the camera, and the driverless car are all just interfaces to APIs that expose a set of computations, and user interfaces are what we use to access that computation.

From this API perspective, then, user interfaces mediate access to APIs, and interacting with an interface is, from a computational perspective, identical to executing a program that uses those APIs to compute. In the calculator, when you press the sequence of buttons "1", "+", "1", "=", you just wrote the program add(1, 1) and got the value 2 in return. When you open the camera app, point it at your face for a selfie, and tap on the screen to focus, you're writing the program capture(focusPoint, sensorData) and getting an image file in return. When you enter your destination in a driverless car, you're really invoking the program while(!atDestination()) { drive(destination, location, environment); }. From this perspective, interacting with user interfaces is really about executing "one-time use" programs that compute something on demand.
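To make this concrete, here is a minimal sketch of the calculator case in TypeScript. The add function and the runCalculator interface are invented for illustration; this is not any real calculator app's code, just one way the idea could look.

```typescript
// A sketch, not any real calculator's implementation.

// The API: a function that encapsulates the computation.
const add = (a: number, b: number): number => a + b;

// The interface: translates a sequence of button presses into a call to the API.
function runCalculator(buttons: string[]): number {
  const [left, op, right] = buttons; // e.g., ["1", "+", "1", "="]
  if (op !== "+") {
    throw new Error("this sketch only supports addition");
  }
  return add(Number(left), Number(right)); // the one-time program the user "wrote"
}

console.log(runCalculator(["1", "+", "1", "="])); // 2
```

The interface's whole job here is translation: button presses in, an API call out, a result back.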

How is this mediation? Well, the most direct way to access computation would be to write these programs in a computer's assembly language and then execute them. That's what people did in the 1960s, before there were graphical user interfaces, and even that was mediated by punch cards, levers, and other mechanical controls. Our more modern interfaces are better because we don't have to learn as much to communicate to a computer what computation we want. But we still have to learn the interfaces that mediate computation: in effect, we're learning APIs and how to program with them.

Mediating information

Interfaces aren't just about automation and computation, however. They're also about accessing information. Before software, other humans mediated our access to information. We asked friends for advice, we sought experts for wisdom, and when we couldn't find people to inform us, we consulted librarians to help us find recorded knowledge, written by experts we didn't have access to. Searching and browsing for information have always been essential human activities.

Computing changed this. Because computers allow us to store information and access it much more quickly than we could through people or documents, we started to build systems for storing, curating, and provisioning information on computers, accessed through user interfaces. We took the old ideas from information science, such as documents, metadata, searching, browsing, indexing, and other forms of knowledge organization, and imported them into computing. Most notably, the Stanford Digital Library Project brought these ideas to computers, inadvertently leading to Google, which still views its core mission as organizing the world's information, which is what libraries were originally envisioned to do. Whereas librarians, card catalogs, and people in our social networks used to mediate our access to information, now user interfaces do too, and increasingly so.

The study of how to design user interfaces to optimally mediate access to information is usually called information architecture (Rosenfeld and Morville 2002). Accessing information is different from accessing computation in that rather than invoking functions to compute results, we're specifying information needs that drive information retrieval and browsing. Information architecture is therefore about defining metadata on data and documents that can map information needs to information. That might mean defining metadata schemas, tagging systems, controlled vocabularies, thesauri, indices, hierarchies, classifications, and other structures that facilitate searching and browsing. User interfaces rely on all of these structures to seamlessly mediate access to information, just as librarians do.
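As a small sketch of this idea, consider a toy document collection in TypeScript, where invented tags drawn from a hypothetical controlled vocabulary map an information need to documents. Nothing here comes from a real system; it only illustrates that retrieval runs on metadata.

```typescript
// A sketch of metadata-based retrieval over an invented document collection.

interface Doc {
  title: string;
  tags: string[]; // terms from a hypothetical controlled vocabulary
}

const collection: Doc[] = [
  { title: "Privacy settings guide", tags: ["privacy", "settings"] },
  { title: "Face ID overview", tags: ["identity", "privacy"] },
];

// Retrieval matches an information need against metadata,
// not against the documents themselves.
function search(need: string): Doc[] {
  return collection.filter((doc) => doc.tags.includes(need));
}

console.log(search("privacy").map((doc) => doc.title));
// ["Privacy settings guide", "Face ID overview"]
```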

In practice, user interfaces for information technologies can be quite simple. They might consist of a programming language, like Google's query language, in which you specify an information need that is satisfied with retrieval algorithms. Google's interface also includes systems for generating user interfaces that present the retrieved results. User interfaces for information technologies might instead be browsing-oriented, exposing metadata about information and facilitating navigation through an interface. For example, when you open the settings application on a smartphone, all of the labels that describe categories of settings, and the hierarchy in which those settings are arranged, are a carefully designed information architecture to facilitate your search for a control. Whether an interface is optimized for searching, browsing, or both, all forms of information mediation require metadata. When we learn interfaces that mediate information, we're learning metadata schema and how to interpret them.
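The settings example can be sketched the same way: browsing is just navigation through a tree of labeled categories, where the labels are the metadata users read. The category names below are hypothetical, not any particular phone's settings.

```typescript
// A sketch of a browsable hierarchy, with hypothetical category labels.

interface Category {
  label: string;          // the metadata users read while browsing
  children?: Category[];  // the hierarchy they navigate
}

const settings: Category = {
  label: "Settings",
  children: [
    { label: "Notifications", children: [{ label: "Sounds" }] },
    { label: "Privacy", children: [{ label: "Location Services" }] },
  ],
};

// Browsing is a walk down the hierarchy, one labeled choice at a time.
function browse(root: Category, path: string[]): Category | undefined {
  let node: Category | undefined = root;
  for (const label of path) {
    node = node?.children?.find((child) => child.label === label);
  }
  return node;
}

console.log(browse(settings, ["Privacy", "Location Services"])?.label);
// "Location Services"
```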

Mediating communication

While computation and information are useful, most of our truly meaningful interactions still occur with other people. That said, more of these interactions than ever are mediated by user interfaces. Every form of social media—messaging apps, email, Slack, video chat, discussion boards, chat rooms, blogs, wikis, social networks, virtual worlds, and so on—is a form of computer-mediated communication (Fussell and Setlock 2014). Social computing is the study of user interfaces that mediate human communication.

What makes user interfaces that mediate communication different from those that mediate automation and information? Whereas mediating automation requires clarity about what API a user is accessing, and mediating information requires metadata, mediating communication requires social context. For example, in their seminal paper, Gary and Judy Olson discussed the vast array of social context present in collocated synchronous interactions that has to be reified in computer-mediated communication (Olson and Olson 2000). That context includes multiple channels of communication, identity, shared local context, non-verbal cues through gaze, gesture, and non-verbal speech, and spatial reference. For example, think about what Facebook has to do to facilitate effective communication online: it needs to make identity clear, provide context for a conversation, and offer multiple channels, such as replies, likes, and edit status. Now think about what Facebook is missing: no non-verbal cues, few signals of emotion, no sense of physical space, and little temporal context other than timestamps. It's this missing social context that often leads computer-mediated communication to be more dysfunctional than face-to-face communication (Cho & Kwon 2015). Many researchers in the field of computer-supported cooperative work have sought designs that better support social processes, including "social translucence," which achieves context similar to collocation through new forms of visibility, awareness, and accountability (Erickson and Kellogg 2000).
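One way to see how much context survives mediation is to sketch the data a messaging interface actually records. The fields below are illustrative inventions, not any real platform's API; they loosely follow the Olsons' list of context.

```typescript
// A sketch of the social context a messaging interface reifies as data.
// Field names are illustrative, not any real platform's API.

interface Message {
  authorId: string;     // identity, made explicit
  channelId: string;    // shared context for the conversation
  sentAt: Date;         // the only temporal cue that survives
  replyTo?: string;     // threading, one extra channel of communication
  reactions: string[];  // likes and emoji as coarse stand-ins for emotion
  body: string;
  // Notice what has no field at all: gaze, gesture, tone of voice, and
  // physical space. That absence is the missing context described above.
}

const example: Message = {
  authorId: "user-42",
  channelId: "family-chat",
  sentAt: new Date(),
  reactions: ["👍"],
  body: "See you at the library?",
};
```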

When we learn interfaces that mediate communication, we're learning how to convey and interpret new cues for social context.


These three types of mediation each require different architectures, different affordances, and different feedback to achieve their goals: automation requires clear access to APIs, information requires metadata, and communication requires reified social context.

Because each of these interfaces must teach different things, one must know the foundations of what is being mediated to design effective interfaces. Throughout the rest of this book, we'll review not only the foundations of user interface implementation, but also how the subject of mediation constrains and influences what we implement.


Further reading

Cho, D., & Kwon, K. H. (2015). The impacts of identity verification and disclosure of social cues on flaming in online user comments. Computers in Human Behavior, 51, 363-372.

Erickson, T., & Kellogg, W. A. (2000). Social translucence: an approach to designing systems that support social processes. ACM Transactions on Computer-Human Interaction (TOCHI), 7(1), 59-83.

Fussell, S. R., & Setlock, L. D. (2014). Computer-mediated communication. Handbook of Language and Social Psychology. Oxford University Press, Oxford, UK, 471-490.

McLuhan, M. (1994). Understanding media: The extensions of man. MIT Press.

Olson, G. M., & Olson, J. S. (2000). Distance matters. Human-Computer Interaction, 15(2), 139-178.

Rosenfeld, L., & Morville, P. (2002). Information architecture for the World Wide Web. O'Reilly Media, Inc.