CAVEAT: A Proposal for a Web-Wide Page Registration and Rating System

Marc Demarest
marc@noumenal.com

May 1996

   
          

ABSTRACT

The Web's success as a worldwide self-publication medium brings with it the problems of consumer safety: how does the Web community protect itself from misinformation and disinformation published on the Web? A Web-wide caveat lector is certainly insufficient, since it presumes a worldwide readership capable of separating fact from propaganda from dangerous fiction. A centralized censorship bureau would kill the Web, dead.

What might work is a distributed system of voluntary page registration and page rating, according to standard schema and standard criteria.

This paper describes the design and operation of such a system, called CAVEAT.

          
PROBLEM STATEMENT

Elsewhere, I have described one of the significant disadvantages of the unmediated self-publication model that the Web embodies as toxic information. The plain fact is that the quality control functions played by earlier publication systems - including the current worldwide book and magazine publishing network - are entirely absent from the Web. While this may be politically desirable, it is practically unworkable; most of the information on the Web today is of dubious value, and a lot of it is detrimental to readers, because it is factually incorrect, because it is opinion or propaganda masquerading as fact, or because acting on the information provided leads to either intellectual confusion or to material harm.


POSSIBLE REMEDIES

We could remedy this situation by laissez-faire: by incorporating into the basic culture of the Web a universal caveat lector clause that left it to the reader to form their own judgements on material encountered on the Web. This is, implicitly, what we do today to combat toxic information. After a few years of usage, it's abundantly apparent that some significant number of readers - including people who ought to know better - are unable to form appropriate judgements, either because they are insufficiently educated to do so, or because the material on the Web with which they are interacting has been deliberately presented so as to make some judgements easy and others hard or impossible.

Caveat lector, in short, is not enough now, and will certainly not be enough for the generation that comes after us, one that by and large has had an even less adequate critical education than we have had.

We could remedy this situation by fascism: by creating a centralized vetting agency for Web content. It's obvious, however, that such an agency is unworkable (it would have to be international), dangerous (it would become the moral equivalent of the motion picture ratings agency, with similar effects) and contrary to the basic premise of the Web: cheap, ubiquitous worldwide self-publication.

We could remedy this situation by communitarianism: by creating a distributed system in which (a) publishers of Web pages registered those pages with a central registration agency and were given in exchange unique page identifiers, which these publishers used to (b) provide a facility, within the page, for any reader to rate the page according to a system of standard rhetorical, structural and factual criteria.

This last option appears to be a workable solution, in that (a) control over content does not leave the author, (b) responsibility for making and using ratings still rests with the reader and (c) the system contains no mechanism for censorship other than the informed decision of a reader to refrain from taking a page seriously. Furthermore, the system is voluntary, so those authors/publishers who believe they have reasons for not participating are not obliged to do so, and those readers who encounter pages that do not participate in the system are aware of that fact, and can factor that information into their decisions about the page content's veracity.

CAVEAT: A DISTRIBUTED PAGE REGISTRATION AND RATING SYSTEM

Theory of Operations

The theory of operations for the CAVEAT system is straightforward:

  1. An author wishing to publish a Web page registers that page with the CAVEAT authority, supplying minimal information about the page in question.

  2. The CAVEAT authority, in return, supplies a unique reference number for the document, and instructions to the author on how to embed CAVEAT rating logic into the Web page in question.

  3. The author subsequently publishes that Web page with the CAVEAT logic embedded in it, perhaps as indicated in Figure 1 below.

  4. A reader, visiting said page at a later date, wishes to either rate the page visited or see other readers' ratings of that page to help form her judgement about the veracity of the information on the page. To do either, she accesses the CAVEAT logic in the page, and is vectored to the CAVEAT server, where she either requests rating information, or makes a rating.

  5. The author, at any point, can request a CAVEAT report on his page simply by specifying the page's unique identifier, as can any reader.

      
      <HTML>
      <HEAD>
      <META NAME="CAVEAT_ID" CONTENT="74857349211">
      ….
      </HEAD>
      <BODY>
      <A HREF="http://www.caveat.org/cgi-bin/rating?74857349211">
      <IMG SRC="http://www.caveat.org/images/logo.gif"
           ALT="CAVEAT Rating System Logo">
      CAVEAT Rating System
      </A>
      ….
      </BODY>
      </HTML>
      
      

Figure 1 -- Embedded CAVEAT Logic in HTML
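
Steps 1 through 3 of the theory of operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name CaveatRegistry, its methods, the in-memory dictionary, and the sequential 11-digit identifier scheme are all hypothetical, since the proposal specifies only that identifiers be unique. The generated link follows the pattern shown in Figure 1.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class CaveatRegistry:
    """Hypothetical in-memory stand-in for the CAVEAT authority."""
    base_url: str = "http://www.caveat.org"
    # Assumption: IDs are issued sequentially; the proposal only requires uniqueness.
    _next_id: itertools.count = field(default_factory=lambda: itertools.count(1))
    pages: dict = field(default_factory=dict)

    def register(self, page_url: str, title: str) -> str:
        """Steps 1-2: store minimal page information, return a unique ID."""
        page_id = f"{next(self._next_id):011d}"
        self.pages[page_id] = {"url": page_url, "title": title}
        return page_id

    def embed_snippet(self, page_id: str) -> str:
        """Step 3: HTML for the author to embed, patterned on Figure 1."""
        if page_id not in self.pages:
            raise KeyError(f"unknown CAVEAT ID: {page_id}")
        return (
            f'<A HREF="{self.base_url}/cgi-bin/rating?{page_id}">'
            f'<IMG SRC="{self.base_url}/images/logo.gif" '
            f'ALT="CAVEAT Rating System Logo">'
            "CAVEAT Rating System</A>"
        )


registry = CaveatRegistry()
page_id = registry.register("http://www.example.com/essay.html", "An Essay")
snippet = registry.embed_snippet(page_id)
```

A real authority would persist registrations and would probably issue identifiers that are not guessable; neither concern is addressed in this sketch.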

Rating System Criteria

The rating system should include several kinds of consideration:

  • factual evaluation: is the information contained in the page accurate and complete?

  • rhetorical evaluation: is the information contained in the page presented in a fashion that allows the reader to reach appropriate, independent conclusions?

  • structural rating: is the information contained in the page presented in a logical, structured fashion?

  • use rating: is the information useable and useful?

  • qualitative rating: what does a specific reader think, feel or believe about the page in question?

The first four kinds of evaluation would be captured via numeric sliding scales, with 0 representing the lowest possible rating in the scale and 10 the highest. This data could be provided by any reader anonymously (no information is captured or stored about the reader in the CAVEAT database).

The last kind of evaluation would be captured via text input of limited size (say, 100 words) and would require attribution (the reader's name and electronic mail address would be required for a qualitative evaluation to be stored in the CAVEAT database).
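
The rules in the two paragraphs above could be checked at submission time roughly as follows. This is a sketch, not a specification: the Rating record, the scale names, and the validate function are hypothetical, and requiring all four numeric scales on every submission is an assumption the proposal does not state.

```python
from dataclasses import dataclass
from typing import Optional

MAX_COMMENT_WORDS = 100  # "say, 100 words" in the text
SCALES = ("factual", "rhetorical", "structural", "use")  # hypothetical names


@dataclass
class Rating:
    page_id: str
    scores: dict                       # scale name -> integer from 0 to 10
    comment: Optional[str] = None      # qualitative evaluation, if any
    reader_name: Optional[str] = None
    reader_email: Optional[str] = None


def validate(rating: Rating) -> list:
    """Return a list of problems; an empty list means the rating is acceptable."""
    problems = []
    # Assumption: every submission carries all four numeric scales.
    for scale in SCALES:
        value = rating.scores.get(scale)
        if not isinstance(value, int) or not 0 <= value <= 10:
            problems.append(f"{scale}: expected an integer from 0 to 10")
    # Numeric scores may be anonymous; a text comment requires attribution.
    if rating.comment is not None:
        if len(rating.comment.split()) > MAX_COMMENT_WORDS:
            problems.append("comment exceeds the word limit")
        if not (rating.reader_name and rating.reader_email):
            problems.append("qualitative evaluations require name and e-mail")
    return problems
```

Note that a purely numeric rating carries no reader fields at all, which matches the anonymity requirement for the first four kinds of evaluation.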

Figure 2 -- Sample Rating Collection Page


Rating Presentation

Quantitative evaluations are presented in aggregate, as minimum, maximum, average and standard deviation values.

Qualitative evaluations are presented individually, each with its attribution.
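
As an illustration, the aggregate figures for a single scale could be computed as below. The function name summarize and the sample scores are made up for this sketch, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption; the proposal does not say which is intended.

```python
from statistics import mean, pstdev


def summarize(scores):
    """Aggregate one scale's 0-10 scores into the four reported values."""
    if not scores:
        return None  # no ratings yet; nothing to report
    return {
        "min": min(scores),
        "max": max(scores),
        "average": mean(scores),
        "std_dev": pstdev(scores),  # assumption: population standard deviation
    }


factual_scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
summary = summarize(factual_scores)
```

A large standard deviation is itself informative to a reader: it signals that raters disagree sharply about the page.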

Architecture And Implementation Considerations

CAVEAT could be implemented using either a distributed data store or a centralized data store. The latter is clearly the better option for operation and maintenance, but is potentially a single point of failure or a performance bottleneck. However, given careful attention to the size of the data set gathered

  • for each page from the author

  • for each page from a rating reader

it should be possible to build a performant centralized data store whose size remains manageable even if a significant percentage (say 50%) of Web authors participate in the CAVEAT system.

Access to this system would be via standard Web protocols and interfaces (e.g., CGI).
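
On the server side, a rating script would first need to recover the page identifier from the request. The sketch below assumes the link format shown in Figure 1, where the identifier is the entire query string; the function name and the digits-only check are hypothetical.

```python
from urllib.parse import urlsplit


def page_id_from_rating_url(url: str) -> str:
    """Extract the CAVEAT ID from a Figure 1 style rating link."""
    parts = urlsplit(url)
    if not parts.path.endswith("/cgi-bin/rating"):
        raise ValueError("not a CAVEAT rating URL")
    page_id = parts.query  # the ID is the whole query string in Figure 1
    # Assumption: IDs consist of digits only, as in the Figure 1 example.
    if not page_id.isdigit():
        raise ValueError("CAVEAT ID must be a string of digits")
    return page_id


parsed_id = page_id_from_rating_url(
    "http://www.caveat.org/cgi-bin/rating?74857349211"
)
```

In an actual CGI program the same value would arrive in the QUERY_STRING environment variable rather than as a full URL.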

CAVEAT: BENEFITS

Direct

The direct benefits of this system are straightforward: readers can rate Web pages, point one another to high-added-value Web content and warn one another away from bad or scurrilous Web content. Additionally, authors can get structured feedback on their pages and (if they are responsible publishers) make appropriate changes.

Ancillary

Ancillary benefits from the CAVEAT system would include:

  • guaranteed unique reference numbers for Web documents, which could be used for a variety of purposes, including the construction of "card catalog" style access mechanisms

  • the basis for the institution of a Library of Congress/Dewey Decimal System style Web categorization, based again on the unique reference number guaranteed by CAVEAT

  • the ability of search engines to rank result sets according to reader ratings instead of or in addition to other ranking mechanisms.
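
The last of these benefits might look like the following in practice. This is one possible scheme among many, not something the proposal prescribes: the function rerank, the linear blend, the 0-to-1 relevance scale, and the default weight are all assumptions made for illustration.

```python
def rerank(results, ratings, weight=0.5):
    """Order search results by a blend of engine relevance and reader rating.

    results: list of (page_id, relevance) pairs, relevance from 0 to 1
    ratings: dict mapping page_id to an average CAVEAT rating from 0 to 10
    weight:  share of the final score given to reader ratings (0 to 1)
    """
    def blended(item):
        page_id, relevance = item
        if page_id not in ratings:
            return relevance  # unrated page: fall back to relevance alone
        return (1 - weight) * relevance + weight * ratings[page_id] / 10

    return sorted(results, key=blended, reverse=True)
```

Setting weight to 0 reproduces the engine's original ordering, which corresponds to using reader ratings "in addition to" rather than "instead of" other mechanisms. How unrated pages should be treated is a policy question; falling back to relevance alone is simply the least intrusive choice.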

CAVEAT: RISKS

The ownership of the CAVEAT system and database is of critical concern. The organization running CAVEAT must be non-profit and non-governmental and must under no circumstances syndicate its data for purposes other than page rating, or capture any information about authors or readers other than that required to do its work. Otherwise, the data captured by the system could be used for purposes contrary to the best interests of reader and author.

          

Last updated on 06-22-97 by Marc Demarest (marc@noumenal.com)

The authoritative source of this document is http://www.noumenal.com/marc/caveat.html