ConsensUs: Computer-moderated Structured Discourse

George E. Mobus, Ph.D.
Don McLane

Computing and Software Systems,
Institute of Technology,
University of Washington, Tacoma
gmobus@u.washington.edu
dmclane@u.washington.edu

Version 0.1 - Draft Only, For limited release only

Date      Name       Revision
2/9/05    Mobus, G.  Revised section 1.2 to better explain the process of problem space decomposition.
10/13/04  Mobus, G.  Added a paragraph in section 2 regarding convergent vs. divergent discourse. Added text to sections 2.4 and 2.5. Provided link in section 7.1 for the current project task list.
9/10/04   Mobus, G.  Started expanding section 2. Started new section 10 for End notes.
9/6/04    Mobus, G.  Added example Discourse for Global Issues Network for global warming in section 8.
9/2/04    Mobus, G.  Added sections 1.6, 4.7, 5.6 and 5.7, all dealing with search and data mining.
8/27/04   Mobus, G.  Initial release.

1.  The Problem Domain

      Globalization means many different things. But one aspect of the phenomenon is the connecting of geographically distributed actors with similar interests into networks for the purpose of identifying and/or solving issues of global importance. This connection is made feasible by the Internet. Today it is possible for a global-scale network of many hundreds, if not thousands, of individual actors to share ideas asynchronously. We call this capability "electronic discourse". Members of a virtual community can interact via software interfaces to conduct on-going conversations on topics of interest. Email listservs, Usenet and discussion boards were first-generation e-discourse applications. These applications impose a minimum structure on the process of discourse. They basically organize topics and maintain a threading structure that allows for responses to be tracked. Beyond this, they do not provide any internal structure that might further facilitate the discourse process in reaching some kind of conclusion. And, even more importantly, they do not provide any management of the explosion of information that comes from the combined contributions of a huge number of participants. On a global scale, and with possibly millions of participants in a virtual community, it is vital that information be meaningfully consolidated, or pruned (if found unhelpful), in order to keep the discourse focused and productive.

      More recent entries to computer-mediated discourse systems, such as E-Delphi (see below), provide somewhat more structure, but require considerable human intervention for moderation. Even some of the email lists, Usenet newsgroups and bulletin boards require a human moderator to filter, edit or summarize if they are to keep productive discourse going. Very little has been done in the area of computer-moderated, structured discourse support. This latter category goes beyond computer-mediated to actually include an intelligent moderator assistant. Such a facility may allow communities of interested parties to conduct true peer-to-peer collaborative, deliberative discourse without the time-consuming and burdensome need for human moderators.

      Communities of interested parties or actors often seek to carry on a discourse activity with the intent of coming to some kind of conclusion and consensus. Communities form around common interests and members share similar values, ideas and objectives. Thus, the potential for agreement is built into the nature of the community.

      What is not part of the majority of e-discourse applications is an internal structuring that is designed to optimize the process and assist it in finding consensus views. The tool described in this document is proposed as an assistant to virtual communities in conducting a structured discourse on topics of interest. It supports a top-down group analysis of a subject domain, the accumulation of documentation that supports the process, and the initial stages of a bottom-up development of action proposals (problem solutions). ConsensUs extends the idea of threaded comment structure by introducing the concept of typed discourse structures and specifying an organization of these structures. The introduction of organization (internal structure) fosters keeping the community members "on-topic" without sacrificing flexibility. Members are able to pursue ideas in a semi free-form but kept within the domain of interest.

      There are two new ideas in ConsensUs that facilitate arrival at consensus, or at least, nuclei of consensus. The first idea is based on learning and memory in brains, specifically short- and long-term memory effects. In this approach, each contribution (except for comments - see below) to the discourse is given a specified time to live in the database. If that element is reinforced with additional contributions, as explained below, then the time to live value is increased and the item persists as part of the overall structure and is accessible to the community. If, however, the item is not reinforced by activity, it means that the community does not support it and its time to live value is decremented (daily). When it reaches zero, the element is archived and removed from the discourse structure. In this way, the structure emerges from the activity of the community and reflects what the community considers important.

      The second new idea is the use of computational semantic analysis to assist in finding common themes and what we will call nuclei of consensus. The details of this approach will be given later, but the central idea is to aggregate and collapse or coalesce comments whose semantic content indicates a high degree of similarity. Such an aggregation does two things. It first reduces the information load by providing a weighted comment that represents a commonality of ideas, values, and objectives. Community members can grasp the core of the theme by reading a single comment that represents the nucleus rather than having to read and recall all of the comments that seemed to agree. The second thing it does is provide members with a sense of how strong the consensus is for a particular theme (nucleus). This may aid in members being swayed in the direction of the perceived consensus. But of course, it may simply invigorate their efforts to move the discourse in a different direction.

1.1.  Community of Interested Parties and Peer-to-Peer Networks

      We often find on-line or virtual communities forming in an ad hoc manner in several media. Usenet provides an ample number of examples of communities of interested parties spontaneously forming in existing Usenet groups. Members discover common interests, usually in some sub-topic of the named group, and at some point decide to form a new newsgroup within the Usenet hierarchy.

      Increasingly we see ad hoc communities of peer members arise where each member is seen as an equal actor or contributor to the discourse. In addition, the increasing size of some of these communities makes it potentially untenable to consider serving the community from a central server. In the last few years we have seen a number of peer-to-peer (P2P) technologies arise that provide a distributed computing environment that supports data and processing cycle sharing across a large number of members. Witness the "success" of networks like Napster, or the clever distribution of a computationally expensive process like looking for signature signals (of extraterrestrial intelligence) in a huge matrix of data in SETI@home.

      A P2P platform solves a major problem in the Internet in allowing nodes with dynamically assigned IP addresses to maintain a persistent (even if intermittent) identity presence in the network. Each node becomes both a client and a potential server. Data can be distributed variously throughout the network so that redundant access is possible. Additionally, some platforms, such as JXTA (see below), provide security features that could be useful in maintaining a secure community.

      With this in mind, we propose to build ConsensUs on top of a P2P platform.

      In a given ConsensUs application, the community organizers can decide on the issue of member identity, whether members will be identified by real names (full transparency) or by pseudonyms (moderate anonymity). We do assume that some measure of control over the joining of a peer group will be maintained. But the ability to have "frank" discussions by hiding real names might be a plus.

1.2.  Discovering the Structure of a Subject Space

      In this section we describe the process by which a community of interested parties (or community) can explore a subject space and through a process of discovery, map out the dimensions and positions taken in the space.

      The overall process of structured discourse can be described as being similar to the top-down analysis of a problem space.  It is not dissimilar in form to the process of outlining used in composition.  A top-level subject is broken down into a set of next-level sub-topics through iterative deliberation on the nature of the top-level subject. You ask the question: "What are the main sub-topics that need further refinement?"  In structured discourse, this question is expanded through discourse with multiple participants who themselves submit candidate sub-topics, ask questions or raise issues, and supply commentary on the issues and sub-topics proposed.

      A subject or topic is identified by a noun or noun phrase, its name, which to some greater or lesser degree captures the semantic quality of the subject. This identifier (in some sense similar to function identifiers in programming languages) is the abstraction of the full semantic space of the subject. It should capture and convey the collection of elements within that space. For example, the subject "Global Warming/Climate Change" (see example below in Section 8) is a general rubric subsuming a whole constellation of ideas that need to be teased out. Every area of interest can be given a topic heading or name.

      But then, a topic, if it is an abstraction, can be broken down into a set of sub-topics (just as this document is organized). With analysis, careful thought and deliberation, represented in the issues and commentary supplied at a given level, the sub-topics, all being of about the same level of detail, will emerge from the discourse. That is, they will represent the right level of structural decomposition of the topic. Again to follow the example, global warming might be broken down into the sub-topics (not meant to be exhaustive, just illustrative):

      Recursively, a community of interest can decompose a subject/topic into increasingly refined sub-topics. The process ends when no one in the group can think of a lower level sub-topic.

      The subject is decomposed into a subject tree where each level of the tree represents a more refined level of detail in the subject space. This tree structure (Fig. 1) is the basis for the structured discourse in ConsensUs.  The figure leaves out the issue and comment objects (which are leaf nodes attached to each topic node shown - also see Figure 5 in section 3.0 to see the structure of a filled-out topic node.)

Figure 1. Topics and sub-topics form a tree structure.

      Community members (CMs for short) suggest sub-topics for any given topic. Any member is free to suggest (nominate) any sub-topic and add it to the set of sub-topics. A topic data structure is shown in Fig. 2. Each topic object, starting from the top-level (or root of the tree) is assigned a unique ID number. The CM provides a Topic Identifier, which is a short noun phrase giving a name to the sub-topic, and a description, that may further explicate the subject matter.

Figure 2. Topic object data structure.

      A sub-topic is just another topic object as shown in Fig. 3. Any topic can have as many sub-topics (lower level topics) as the community deems. Recursively, every sub-topic can be broken down into as many sub-sub-topics as the community deems.

Figure 3. Sub-topic relationship to higher level topic.

      Every sub-topic listed in the higher-level topic's sub-topic set is represented by a child node of type topic.

      The topic/sub-topic tree forms the skeleton framework for the ConsensUs discourse structure. All other discourse objects (described below) are child nodes of one of the topic nodes. As shown in Figures 2 and 3, there are additional sets of objects associated with each topic node. The addition of nodes (through the same kind of nomination process as described above) adds further children to each of the interior nodes of the discourse (topic) tree. These additional node types provide the basis for analysis of the subject (semantic) space as mentioned earlier. Essentially, over time the topic tree emerges from the discourse of the community.
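
      As an illustration only, the topic object and its recursive sub-topic relationship could be sketched in Python as follows. The field and method names (topic_id, identifier, nominate, walk) are assumptions drawn from the prose, not a transcription of Figure 2, the ID counter stands in for whatever scheme a real implementation would use, and the sub-topic names "Causes", "Fossil fuel use" and "Impacts" are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Tuple

_next_id = count(1)  # hypothetical ID source; the text only requires uniqueness


@dataclass
class Topic:
    """One node of the topic tree (field names are illustrative assumptions)."""
    identifier: str                    # short noun phrase naming the topic
    description: str = ""              # optional text explicating the subject
    topic_id: int = field(default_factory=lambda: next(_next_id))
    sub_topics: List["Topic"] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)     # placeholders for issue objects
    comments: List[str] = field(default_factory=list)
    proposals: List[str] = field(default_factory=list)

    def nominate(self, identifier: str, description: str = "") -> "Topic":
        """A community member nominates a sub-topic; it is just another Topic."""
        child = Topic(identifier, description)
        self.sub_topics.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Topic"]]:
        """Yield (depth, topic) pairs in outline order."""
        yield depth, self
        for child in self.sub_topics:
            yield from child.walk(depth + 1)


root = Topic("Global Warming/Climate Change")
causes = root.nominate("Causes")            # hypothetical sub-topic names
causes.nominate("Fossil fuel use")
root.nominate("Impacts")

# Print the tree as an indented outline.
for depth, topic in root.walk():
    print("  " * depth + f"[{topic.topic_id}] {topic.identifier}")
```

      Because a sub-topic is simply another Topic, the same nominate call works at every level, which mirrors the recursive decomposition described in this section.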

1.3.  Analysis of a Problem Space - Top Down

      The topic/sub-topic hierarchy forms a skeleton for the discourse that will put "meat on the bones". In this section we describe the dynamics of subject space emergence from group participation. We then describe the "fleshing out", as it were, of the skeletal topic structure. We view this as a top-down analysis of the subject space, an analysis conducted by the community in collaboration. Later, we will consider the nature of a bottom-up process (also collaborative) to find action proposals or solutions to the problems identified in the top-down analysis.

1.3.1.  Evolution of the Topic Structure

      In the following description we will discuss each kind of discourse object separately.  It is important to note that the evolution of the topic structure (tree) is a very dynamic process whereby much time and energy may be spent on one level of the tree, with members generating many issues and comments before resolving the sub-topics to be later explored.  We envision communities deliberating at a given level to explore most of the important issues under the topic before growing the tree through the addition of new sub-topics.  However, it is also possible to grow one or more branches of the tree, proceeding to a deeper level, if that is the "path of least resistance".  There are no restrictions on the order in which levels get added to the tree.  Additionally, we plan to provide mechanisms that will allow the restructuring of the tree if the community decides that a given branch should be promoted to a higher point in the tree, or demoted to become a child of some other node that is a descendant of its current parent.  The process of discourse should be flexible but always produce a tree structure, whether balanced or not.
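
      One way the planned restructuring might work is sketched below. The Node class and the move_branch function are hypothetical names introduced for illustration; the only rule taken from the text is that the result must remain a tree, which the sketch enforces by refusing to move a branch underneath itself.

```python
class Node:
    """Bare-bones tree node used only to illustrate restructuring."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.attach(self)

    def attach(self, child):
        child.parent = self
        self.children.append(child)

    def is_descendant_of(self, other):
        """True if `other` appears on the path from this node up to the root."""
        node = self.parent
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


def move_branch(branch, new_parent):
    """Re-attach `branch`, with its whole subtree, under `new_parent`."""
    if new_parent is branch or new_parent.is_descendant_of(branch):
        raise ValueError("move would create a cycle; the result must stay a tree")
    if branch.parent is not None:
        branch.parent.children.remove(branch)
    new_parent.attach(branch)


root = Node("root")
a = Node("a", root)
b = Node("b", root)
c = Node("c", a)

move_branch(c, root)   # promotion: c rises from under a to directly under root
move_branch(b, a)      # demotion: b becomes a child of its former sibling a
```

      Promotion and demotion are the same operation here; only the choice of new parent differs.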

      The explication of the subject space is accomplished by iterative decomposition and community nominations. To anticipate the detailed description, later, of one of the mechanisms of consensus, we mention here the dynamics of how the topic tree comes to represent a "best view" of the subject space. In essence, though any CM may nominate a sub-topic under any existing topic/sub-topic, in fact, the community as a whole may not agree with that nomination. We will describe the process of commenting later. But for now, note that when there is little or no support by the community for the nominated topic, then over the course of time that topic will simply disappear. This is accomplished in a straightforward manner.

      Each new topic added to the tree is assigned a default time-to-live (t2l) value (a long integer) measured in days. Each day that the nominated topic does not receive some form of community support (described below) the t2l value is decremented. If it reaches zero, then the nominated item is removed from the set (perhaps after archiving for audit trail support). If, on the other hand, the community provides support for the item, then its t2l value is incremented by a non-unit value and it continues to live in the structure. In fact, when the t2l value exceeds some agreed-upon limit, the item can be made permanent. This means that the community has agreed that the item is a genuine and important sub-topic and that it is at the appropriate level in the tree.
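
      The time-to-live rule can be made concrete with a small sketch. The three constants below (a 30-day default, a 5-day support increment and a 90-day permanence limit) are assumed values chosen for the example; the text leaves all three to the community.

```python
from dataclasses import dataclass

DEFAULT_T2L = 30       # assumed default lifetime in days
SUPPORT_BONUS = 5      # assumed "non-unit" increment per act of support
PERMANENT_LIMIT = 90   # assumed limit above which an item becomes permanent


@dataclass
class Item:
    name: str
    t2l: int = DEFAULT_T2L
    permanent: bool = False
    archived: bool = False

    def support(self) -> None:
        """Community support extends the item's life."""
        if self.archived or self.permanent:
            return
        self.t2l += SUPPORT_BONUS
        if self.t2l > PERMANENT_LIMIT:
            self.permanent = True

    def end_of_day(self, supported_today: bool) -> None:
        """Daily tick: an unsupported item loses one day and may be archived."""
        if self.archived or self.permanent or supported_today:
            return
        self.t2l -= 1
        if self.t2l <= 0:
            self.archived = True


ignored = Item("unsupported nomination")
for _ in range(DEFAULT_T2L):
    ignored.end_of_day(supported_today=False)

popular = Item("well-supported nomination")
for _ in range(13):                    # 30 + 13 * 5 = 95, which exceeds 90
    popular.support()

print(ignored.archived, popular.permanent)  # True True
```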

      Support for any given sub-topic obtains from CMs adding structures to the topic. One such structure is the topic object described before. Adding topic objects to the sub-topic set of a topic indicates interest and support for the parent node. But topics alone do not define the knowledge space of a subject. CMs may also support a topic by adding additional types of objects to the appropriate sets in the topic object. In Figure 2, above, there are four additional sets containing object types: issues, comments, proposals, and a user-defined type.

1.3.2.  Discovering Issues

      Under any topic the community may deliberate on a number of related issues or questions. Often, in fact, these will be expressed as questions. For example, following the global warming topic above, an issue might be: "Is human activity (the burning of fossil fuels) the major source of CO2 increase in the atmosphere?"

      Whereas sub-topics identify the subject structure and act to organize the hierarchy of ideas, issues identify the particular concerns that need to be addressed by the community. These are questions that need to be answered or problems that need to be solved.  The process of discourse will likely produce a number of issues under any given topic/sub-topic.  These will be discussed in commentary (comment objects also linked under the topic alongside the issues).  Issues are seen as the drivers behind further decomposition and will, as well, be instrumental in formulating proposals for action (section 1.5 below).

Figure 4. An Issue object structure.

      A topic may have any number of issues identified with it. Issues are a type of child node to a topic and are always leaf nodes in the sense that they have no topic or issue descendants. As with the process of deliberation that leads to community support for a topic, a similar process works to support, or not support, an issue nominated by a CM. In this case the issue object only contains a comments set (Fig. 4). Once an issue is nominated, CMs may add comments (see below) to this set. The addition of comments will increment the time-to-live variable in the issue object, thus extending its life. If comments are not made, the t2l variable is decremented as above and if it reaches zero the issue is archived and removed from the topic's issues set.

      The addition of issues to a topic node is support for that topic and the topic t2l is incremented accordingly. In fact, if the topic is a sub-topic then the incrementing propagates up the topic tree to the root.
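
      The upward propagation can be illustrated with a short sketch. It assumes that every ancestor receives the same increment as the topic itself; the text does not say whether the amount should diminish with distance, so that choice, the 30-day default, the 5-day bonus and the sub-topic names are all assumptions.

```python
class TopicNode:
    """Minimal topic node: a name, a parent link and a t2l counter."""

    def __init__(self, name, parent=None, t2l=30):  # 30 days is an assumed default
        self.name = name
        self.parent = parent
        self.t2l = t2l
        self.issues = []


def reinforce(node, amount):
    """Add `amount` to the node's t2l and to every ancestor up to the root."""
    while node is not None:
        node.t2l += amount
        node = node.parent


def add_issue(topic, text, bonus=5):  # a bonus of 5 days is an assumed value
    """Nominating an issue counts as support for the topic and its ancestors."""
    topic.issues.append(text)
    reinforce(topic, bonus)


root = TopicNode("Global Warming/Climate Change")
causes = TopicNode("Causes", parent=root)      # hypothetical sub-topics
impacts = TopicNode("Impacts", parent=root)

add_issue(causes, "Is human activity the major source of CO2 increase?")
print(causes.t2l, root.t2l, impacts.t2l)  # 35 35 30
```

      The sibling branch is untouched, so a lively sub-topic keeps its ancestors alive without rescuing unrelated nominations.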

      But issues do more than merely support a topic. They begin to add substance and direction to the discourse of the community. Identifying (discovering) the important issues at any level in the topic tree is a major part of the process. Adding comments to the issue is the means by which the community proceeds to this identification.

1.3.4.  Commenting

      Issues are the heart and viscera of the discourse structure. Comments are the muscle, sinew, arteries and veins.

      Comments are, as with any discourse system, free-form text (although in ConsensUs we envision support for graphics, applets and multi-media) that capture all of the relevant ideas, reasoning, observations, data, etc. that support the other structures. Comments are about one of the other structures and therefore, every other structure type has a comments set.

      In ConsensUs, comments are threaded in three dimensions. First, comments are threaded within the level and object to which they are rooted. This is essentially the form of threading one finds in a listserv or discussion board. Secondly, comments may be threaded across object types at a given level. That is, a comment may be about both an issue and a sub-topic. Finally, comments are threaded between levels, so, for example, a comment thread may start in the parent topic node and extend down to comments in the issues, sub-topics or proposals nodes. This threading structure allows a CM to pick up and follow a comment thread from many different perspectives.

      Another feature of ConsensUs is that comments can be subjected to group voting. Not every comment needs to be voted on, but comments that act as propositions of position, fact, etc. may be more valuable to the discussion if the CMs are allowed to weigh in on the content. As such, ConsensUs provides a mechanism to let CMs rate attributes such as relevance, importance and validity, on a Likert scale (0 to 10). Comments receiving good ratings on these scales contribute more to the incrementing of the t2l variable for the interior nodes from their parent back to the root. The voting mechanism keeps track of who (see identity transparency/anonymity above) has voted already so that repeat voting does not occur.
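
      A minimal sketch of such a ballot follows. The class name CommentBallot, the member identifiers and, in particular, the formula that scales a base t2l increment by the mean rating are assumptions made for the example; the text states only that better-rated comments contribute more.

```python
from statistics import mean

ATTRIBUTES = ("relevance", "importance", "validity")


class CommentBallot:
    """Collects one set of 0-10 ratings per community member for a comment."""

    def __init__(self):
        self._votes = {}  # member id (real name or pseudonym) -> ratings

    def vote(self, member_id, **ratings):
        """Record a vote; return False if this member has already voted."""
        if member_id in self._votes:
            return False
        if set(ratings) != set(ATTRIBUTES):
            raise ValueError("a vote must rate every attribute exactly once")
        if any(not 0 <= value <= 10 for value in ratings.values()):
            raise ValueError("ratings must lie between 0 and 10")
        self._votes[member_id] = ratings
        return True

    def average(self, attribute):
        return mean(v[attribute] for v in self._votes.values())

    def t2l_bonus(self, base=5):
        """Assumed weighting: scale a base increment by the overall mean rating."""
        if not self._votes:
            return 0
        overall = mean(self.average(a) for a in ATTRIBUTES)
        return round(base * overall / 10)


ballot = CommentBallot()
first = ballot.vote("alice", relevance=8, importance=10, validity=6)
repeat = ballot.vote("alice", relevance=0, importance=0, validity=0)
ballot.vote("bob", relevance=6, importance=8, validity=10)

print(first, repeat, ballot.t2l_bonus())  # True False 4
```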

1.4.  Arriving at Consensus

      Comments are also the basis for the second kind of consensus building that we are interested in. The aggregate of comments (especially within a thread) may be analyzed for semantic content to determine if some measure of consensus is developing. The theory here is that if many people are saying very similar things (within or across threads) then it may indicate the development of agreement about the discourse object of interest. That is to say, if there are a large number of comments within a given topic that are effectively similar (see below for discussion of semantic analysis) then there could be a growing consensus within the topic.

      Periodically, then, comments are analyzed to determine if a consensus pattern is emerging. As will be described below, a semantic center to a comment cluster (if it is found) is computed and the comment closest in space to that center is put forth as the representative of the idea finding consensus. Perhaps at that point, a designated CM would provide a summary, edited version of the comment, tagged as being representative and weighted in some way to indicate the level of consensus. The visualization of the topic tree would then be modified so that, where they exist, the representative comment is all that appears, from that cluster, when one expands the comment set. The comment appears (on examination) along with a header marking it as a representative and a numerical value or weight indicating how "good" a representative it is (essentially how tight is the cluster). Another value can indicate the proportion of total comments that are within the representative's cluster. Thus a CM has an ability to judge the weight to give to the idea that a consensus is forming. [On double clicking a representative you can get access to the entire cluster for auditing.]

      In effect then, a large number of comments can be collapsed or coalesced into a single representative comment that reduces the amount of text that a CM would have to peruse to get the gist of the consensus idea. It is believed that such a process (which does not throw out information but merely hides it from immediate view) would greatly reduce the information overload experienced by participants in many other kinds of discourse systems.

      We should note that it is more than likely that within the comment set of an object the analysis may identify more than one cluster of similar ideas. Also such clusters may be more tightly or loosely defined. We would advocate a threshold (user-defined) that would have to be met before a cluster is declared. In any event, the existence of multiple clusters in a given set would be an indication that more discussion was needed!
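
      To make the idea of a semantic center and a representative comment concrete, the sketch below uses word-count vectors and cosine similarity as a stand-in for the semantic analysis, which is described later in this document and may differ substantially. The greedy one-pass clustering, the function names, the 0.5 threshold and the sample comments are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: a crude stand-in for real semantic analysis."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Sum of the vectors; only its direction matters for cosine similarity."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return total


def find_consensus(comments, threshold=0.5):
    """Greedy one-pass clustering, then one summary per declared cluster."""
    vectors = [vectorize(c) for c in comments]
    clusters = []  # each cluster is a list of comment indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, centroid(vectors[j] for j in members)) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])

    summaries = []
    for members in clusters:
        center = centroid(vectors[j] for j in members)
        sims = {j: cosine(vectors[j], center) for j in members}
        tightness = sum(sims.values()) / len(sims)
        if len(members) < 2 or tightness < threshold:
            continue  # threshold not met: no cluster is declared
        summaries.append({
            "representative": comments[max(sims, key=sims.get)],
            "tightness": tightness,                        # how "good" the representative is
            "proportion": len(members) / len(comments),    # share of all comments
        })
    return summaries


comments = [
    "fossil fuels are the main cause of warming",
    "burning fossil fuels is the main cause",
    "the main cause of warming is fossil fuels",
    "solar cycles explain the recent temperature change",
]
results = find_consensus(comments)
for summary in results:
    print(summary["representative"], round(summary["tightness"], 2), summary["proportion"])
```

      On this sample the first three comments form one cluster covering three quarters of the set, and the fourth comment stays outside it. Word overlap will of course miss paraphrases that share no vocabulary, which is one reason a richer semantic method is needed in practice.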

1.5.  Proposing Solutions - Bottom Up

      Thus far we have been concerned with the process of discovering the subject space and developing the topical analysis of that space. We have also added considerable detail to the discourse structure as we went. As envisioned, this process is basically top-down, but also iterative in that CMs can easily revisit and revise upper levels of the tree. There are no constraints on when one can add issues, comments, etc. to the tree.

      But the general purpose of a subject discourse is to come to some kind of conclusion, propose actions to take or solutions to problems (as identified in the issues) or goals to be achieved. In that vein, ConsensUs adds the Proposal object to the discourse structure. Structurally it resembles an Issue object but it can also have a voting attachment like a comment. In this case the attributes are desirability and feasibility. [Note: other attributes may be defined by the community.]

      There is no absolute rule about when a proposal can be added to a topic. However, as envisioned, proposals will most likely be added after a substantial amount of discourse under that topic has been completed. And, in fact, one can easily see how proposals could be added starting from the bottom-most level of the tree -- solving the more detailed, immediate problems contributes to solving the higher-level problems.

      In general, then, proposals support a bottom-up approach to answer the issues/questions posed in the topic. As with topics and issues, proposals have a comment set and comments added to the set provide interest support for the proposal. Voting for proposal comments and for the proposals themselves influences the incrementing of the t2l variable propagating back up the tree.

1.6.  Search and Data Mining Tools

      Several tools are available to CMs to further allow the exploration of the evolving subject space. A standard indexed database search engine will be provided. This allows a CM to quickly search the various discourse objects for key words. CMs can then browse pages generated from these objects. Searches can be conducted just within one type of discourse object (say comments), a combination of types, or within all types. The search engine must run on one or more dedicated peer nodes acting as search servers.

      The second tool is a personal search agent. This tool differs from the search engine in that it does not search an indexed database (which requires a dedicated server as indicated above). Instead, this agent conducts an on-going search of the P2P network for objects that contain key words indicated by the CM. The search is conducted on all archived objects (comments), not just visible objects.

      What is unique about this tool is that the agent learns additional terms/phrases that help refine the search process. When a search agent finds an object meeting the key word criteria of the CM, it returns the object for review by the CM. If the CM indicates that the object is particularly useful, then the agent saves additional significant words that it finds in the document. If it finds these terms in other objects that the CM indicates are worthwhile, then it can recommend these words to the CM as additional key words for future searches. Together, these tools may help CMs to discover further patterns of consensus or trends in thinking of the group.
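
      The learning behaviour of the agent might be sketched as follows. The SearchAgent class, the rule that a "significant" word is longer than four letters and not a stop word, the tiny stop-word list and the requirement of two useful objects before a word is recommended are all assumptions for the purpose of illustration; the network search itself is left out.

```python
from collections import Counter

# A deliberately tiny, assumed stop-word list.
STOP_WORDS = {"about", "could", "their", "there", "these", "which", "would"}


class SearchAgent:
    def __init__(self, key_words, min_hits=2):
        self.key_words = {w.lower() for w in key_words}
        self.min_hits = min_hits        # useful objects needed before recommending
        self._candidates = Counter()    # word -> number of useful objects containing it

    @staticmethod
    def _tokens(text):
        return [w.strip(".,;:!?()\"'").lower() for w in text.split()]

    def matches(self, text):
        """True if the object contains at least one of the CM's key words."""
        return bool(self.key_words & set(self._tokens(text)))

    def mark_useful(self, text):
        """The CM flags an object as useful; remember its significant words."""
        significant = {
            w for w in self._tokens(text)
            if len(w) > 4 and w not in STOP_WORDS and w not in self.key_words
        }
        self._candidates.update(significant)

    def recommendations(self):
        """Words seen in enough useful objects to suggest as new key words."""
        return sorted(w for w, n in self._candidates.items() if n >= self.min_hits)


agent = SearchAgent(["warming"])
first = "Global warming is driven by carbon emissions."
second = "Carbon taxes could slow warming."

for obj in (first, second):
    if agent.matches(obj):
        agent.mark_useful(obj)   # in practice the CM decides this, not the agent

print(agent.recommendations())  # ['carbon']
```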

      Finally, provisions are made to export object contents to a standard relational database format so that conventional and future data mining tools can be used in ways not currently envisioned.
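
      As a rough illustration of such an export, the sketch below writes topics and comments to SQLite through Python's standard sqlite3 module. The choice of SQLite, the two-table schema, the column names and the sample rows are assumptions; the text specifies only "a standard relational database format".

```python
import sqlite3

# Assumed schema: one table per discourse object type, linked by topic id.
SCHEMA = """
CREATE TABLE topic (
    id          INTEGER PRIMARY KEY,
    parent_id   INTEGER REFERENCES topic(id),
    identifier  TEXT NOT NULL,
    description TEXT,
    t2l         INTEGER
);
CREATE TABLE comment (
    id        INTEGER PRIMARY KEY,
    topic_id  INTEGER NOT NULL REFERENCES topic(id),
    body      TEXT NOT NULL
);
"""


def export(topics, comments, path=":memory:"):
    """Write topic and comment rows to a SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO topic VALUES (?, ?, ?, ?, ?)", topics)
    conn.executemany("INSERT INTO comment VALUES (?, ?, ?)", comments)
    conn.commit()
    return conn


# Hypothetical sample data.
topics = [
    (1, None, "Global Warming/Climate Change", "", 30),
    (2, 1, "Causes", "", 12),
]
comments = [
    (1, 2, "Fossil fuel use is the main driver."),
    (2, 2, "Land use change also matters."),
    (3, 1, "We need a shared glossary."),
]

conn = export(topics, comments)
rows = conn.execute(
    "SELECT t.identifier, COUNT(c.id) FROM topic t "
    "LEFT JOIN comment c ON c.topic_id = t.id "
    "GROUP BY t.id ORDER BY t.id"
).fetchall()
print(rows)  # [('Global Warming/Climate Change', 1), ('Causes', 2)]
```

      Once the objects are in relational form, ordinary SQL queries such as the comment count above, or external data mining tools, can be applied without any knowledge of the ConsensUs object model.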

2.   Background and Current Work

      This section discusses some prior work in developing systems of community discourse. This is not a comprehensive coverage. It does not, for example, cover systems such as Robert's Rules of Order for meetings, or public debate, or any number of in-person, synchronous methods for allowing people to participate in meaningful discourse.

      Rather, this will provide a brief review of some of the more important aspects of (particularly asynchronous) discourse systems, with special reference to electronically supported methods.

      Additionally, the concern we have is for the support of what might be described as "convergent" discourse as opposed to "divergent" discourse. The former involves both an intent (on the part of the participants) to reach agreement or consensus and a process to support achieving it.

      One could argue that one purpose of participative discourse is to try to reach general agreement, if not consensus, on the nature of the problem and what should be done to solve it. In essence, then, communities seek to understand the problem(s) they face by asking key questions: