Point by Point Response to Request for Technical Documentation
##########################################################################
                                                             July 16, 1999


(1) Technological Framework
===========================
The question here is not "What off-the-shelf toolkit to use?" even
though the title of this section might misleadingly suggest that.  The
problem here is really a high level design challenge, specifically:
"What combination of user environments, programming languages,
communications protocols, data representation technologies,
persistence mechanisms and distributed knowledge server management
strategies can be brought together as the foundation for a global scale
distributed knowledge management system?"

Technological Advance Sought
    Devise an architecture for a distributed knowledge representation system
    which has the following characteristics:
      - the system is scalable to the point of coping with a global population
      - the client software offers a high performance, highly graphical
        user experience, emphasizing dynamic graph generation driven
        (in the limit) by live, dynamically updated knowledge
      - the client software is dynamically and automatically extensible
        at runtime with datatype-specific display and editing code being
        acquired over the Internet and (transparently to the user) 
        put into service
      - the server software stores knowledge bases which could well exceed
        the size of system RAM and may well be distributed across servers
      - the client interacts with the server using a programming-language
        neutral protocol which affords efficient transfer of knowledge
      - the system makes judicious use of mirroring and failover
        technologies to ensure high knowledge availability
      - the system uses only software released under an Open Source license
Technological Uncertainty
    The chief uncertainties in this aspect of the TIE project were:
      - What combination of technologies and strategies could support the 
        above rather challenging requirements?
      - How can high availability be balanced against non-centralized
        write privileges in a distributed knowledge representation
        system?  This is a non-trivial question which was not covered
        by any published literature we could discover.
Methodology
    We examined a large number of systems which seemed likely to have
    encountered some of the same issues.  Numerous design sketches and
    thought experiments were worked through before we came up with a 
    combination of technologies which addresses the objectives outlined above.
    Here is a small sample of the systems we examined during this phase:
      Systems:  
        Cyc, Ontosaurus, EcoCyc, Mariposa, Cycic Friends Network, CL-HTTPD,
        JavaSpaces, Linda
      References:
Resolution
    Clients
    - a Java applet client
    - a client in the form of a CGI program running on a webserver
    Servers
    - Java and Perl knowledge servers using relational datastores for
      persistence
    Protocols and languages
    - KQML (Knowledge Query and Manipulation Language) as meta-language
      for server/server and client/multi-server communication of so-called
      performatives: subscriptions, knowledge server discovery, etc.      
    - GFP (Generic Frame Protocol) as the actual knowledge query language
      for use in direct client/single-server situations (rather like an httpd
      proxy) and also as the actual query language used in KQML-wrapped
      transactions
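
The layering described above (KQML as the envelope, GFP as the query
carried inside it) can be pictured with a small sketch.  This is an
illustration only, not TIE's wire format: the class KqmlMessage and
the agent names are hypothetical, the performative 'ask-one' and the
parameter keywords are taken from the KQML specification, and
'get-slot-values' stands in as an example of a GFP operation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders a KQML performative as an s-expression. */
final class KqmlMessage {
    private final String performative;
    // LinkedHashMap keeps the parameters in insertion order for readable output.
    private final Map<String, String> params = new LinkedHashMap<String, String>();

    KqmlMessage(String performative) {
        this.performative = performative;
    }

    /** Adds a ":keyword value" pair and returns this message for chaining. */
    KqmlMessage with(String keyword, String value) {
        params.put(keyword, value);
        return this;
    }

    /** Renders the performative followed by its ":keyword value" pairs. */
    String render() {
        StringBuilder sb = new StringBuilder("(").append(performative);
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(" :").append(e.getKey()).append(' ').append(e.getValue());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // The :content parameter carries the actual knowledge query, written in GFP.
        String gfpQuery = "(get-slot-values Criterion target-class)";
        KqmlMessage msg = new KqmlMessage("ask-one")
                .with("sender", "applet-client-1")       // hypothetical agent name
                .with("receiver", "knowledge-server-a")  // hypothetical agent name
                .with("language", "GFP")
                .with("ontology", "Tie_Core_kb")
                .with("reply-with", "q1")
                .with("content", gfpQuery);
        System.out.println(msg.render());
    }
}
```

In the direct client/single-server case only the inner GFP expression
would travel; the KQML wrapper adds the routing and conversation
parameters needed when several servers are involved.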
    Distributed knowledge server system
    - knowledge servers communicating via KQML to perform load balancing,
      failover and knowledge server discovery using agoric techniques 
      (market-inspired 'bidding' and 'payment' systems)
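
A minimal sketch of the market-inspired selection idea, under stated
assumptions: every name below (AgoricSketch, Server, bid,
selectServer) is hypothetical, and the pricing rule (busier servers
ask more) is only one plausible choice.  It shows how load balancing
and failover can fall out of the same bidding step.

```java
import java.util.ArrayList;
import java.util.List;

final class AgoricSketch {
    /** A knowledge server that quotes a price for answering a query. */
    static final class Server {
        final String name;
        final int activeQueries; // current load on this server
        final boolean reachable; // false simulates a failed server

        Server(String name, int activeQueries, boolean reachable) {
            this.name = name;
            this.activeQueries = activeQueries;
            this.reachable = reachable;
        }

        /** Assumed pricing rule: the busier the server, the higher its bid. */
        int bid() {
            return 1 + activeQueries;
        }
    }

    /** Returns the reachable server with the lowest bid, or null if none is reachable. */
    static Server selectServer(List<Server> servers) {
        Server best = null;
        for (Server s : servers) {
            if (!s.reachable) {
                continue; // failover: unreachable servers cannot win the auction
            }
            if (best == null || s.bid() < best.bid()) {
                best = s; // load balancing: the least loaded server bids lowest
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Server> servers = new ArrayList<Server>();
        servers.add(new Server("kb-a", 12, true));
        servers.add(new Server("kb-b", 3, false)); // cheapest, but down
        servers.add(new Server("kb-c", 5, true));
        System.out.println(selectServer(servers).name); // prints kb-c
    }
}
```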
Effort in fiscal 1998
    This area took a couple of months of effort (the exact figure can be
    determined if required).



Distributed Knowledge Architecture
==================================
see Attachment "Marketing Requirements Document" 
  for an overview of these matters.  Note that much of this
  document is out of date; for instance, JATlite has since
  had its licensing clarified and is not open source.
  An open source KQML implementation (or at least a sufficient
  subset) will have to be written for TIE.
see Attachment "Troublesome Questions about TIE naming"
  for some preliminary analysis of naming techniques in support
  of the following objectives.

We are fairly confident that we have suitable approaches for the
following areas, but because we have not yet implemented a standalone
server we have been able to do only planning work on them.
  (2) Knowledge Failover
  (3) Knowledge Load Balancing
  (4) Distributed Knowledge Versioning
  (5) Automatic Knowledge Mirror Creation

Effort in Fiscal 1998
  A couple of weeks of literature review and conceptual work.




Foundational Ontologies
=======================
It is our position that all of this foundational work ought to qualify
for SR&ED support, not only because the work grappled with scientific
uncertainty in its own right but, more importantly, because all of it
was crucial preparatory work for our other, more clearly qualifying
work.

KRS programming in Java
-----------------------
see Attachment "package COM.emergence.tie.gfp2" 
  for a listing of the Java classes in the TIE knowledge server
see Attachment "Class Hierarchy" 
  for a listing of all the Java classes across the client and
  the knowledge server

To provide a testing environment for all the other aspects of our
research and development work, we needed to create not only a
knowledge representation system but also the foundational ontologies
for the entire system.  The design and programming of the Knowledge
Representation System itself was a complex and risky undertaking with
possible failure modes including:
  1) unacceptable performance as a result of
     - very large data sets and modest bandwidth
     - large client code size resulting in unacceptable load times
     - overambitious inferencing technology
  2) poor performance as a result of the amount of optimization
     required to implement a full-fledged knowledge representation
     system (which is itself a class system offering multiple
     inheritance) on top of a byte-code interpreted environment (note
     that the only other known KRS implemented in Java -- the Generic
     Frame Protocol implementation from the Stanford Research
     Institute -- is an extremely slow 13 megabyte resource hog,
     largely because of the substantial inferencing facilities it
     implements).
     In short, we were trying to find a minimal frame representation
     system which would support our objectives.
  3) The SQL schema design for the persistence mechanism required
     considerable experimental effort to optimize.  The chief
     alternatives were the frame-per-record approach and the
     slot-per-record approach, each of which offered the promise of
     considerable advantages.  Slot-per-record meant that each frame
     (composed of anywhere from 10 to 40 slots on average) would
     necessitate the management of that many database records.  The
     obvious performance issues were compensated for by the broad
     range of knowledge operations which could actually be implemented
     by SQL statements.  Since frames were consequently examinable and
     selectable on the basis of slot-values using SQL operations
     directly, it presumably meant that fewer frames would have to be
     read into and fully instantiated in the KRS during any particular
     operation.  It seemed that this would offer considerable scaling
     advantages.  In practice though, this approach yielded
     unacceptable performance in our test environment of the time,
     where the KRS was actually running on each user's Java client and
     the network traffic and database latency resulting from the large
     number of database operations meant the system was far too slow.
     This approach might still bear consideration in situations where
     a true stand-alone knowledge server is arranged to run in close
     proximity to the persistent store and provide knowledge services
     to clients connecting and communicating with a knowledge query
     language rather than SQL.  Note that we could find no studies
     which analyzed and tested the merits of such an approach.  This
     was new terrain as far as we could tell.

     Our second effort proved much faster.  It made use of the
     frame-per-record approach and worked nicely with the more
     aggressive caching design of our later KRS implementations.  It
     performed adequately for small knowledge bases being accessed
     directly via SQL from remote clients.  This level of performance
     enabled us to proceed with the higher level development and
     experimentation work on the client-side.  The performance is
     unsurprisingly inadequate for large scale use in this
     configuration, largely because of the very large number of frames
     which end up being required by We are working on moving
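
The two persistence layouts discussed in point 3 can be contrasted
with a small in-memory sketch.  It is an illustration under stated
assumptions rather than the TIE schema: the class SchemaSketch, its
method names and the 'slot=value;slot=value' packing are all
hypothetical, and plain lists stand in for database tables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class SchemaSketch {
    /** Slot-per-record: one row per (frame, slot, value) triple. */
    static List<String[]> toSlotRows(String frameId, Map<String, String> slots) {
        List<String[]> rows = new ArrayList<String[]>();
        for (Map.Entry<String, String> e : slots.entrySet()) {
            rows.add(new String[] { frameId, e.getKey(), e.getValue() });
        }
        return rows;
    }

    /** Frame-per-record: one row per frame, all slots packed into one column.
     *  The packing assumes slot names and values contain no ';' or '='. */
    static String[] toFrameRow(String frameId, Map<String, String> slots) {
        StringBuilder packed = new StringBuilder();
        for (Map.Entry<String, String> e : slots.entrySet()) {
            if (packed.length() > 0) {
                packed.append(';');
            }
            packed.append(e.getKey()).append('=').append(e.getValue());
        }
        return new String[] { frameId, packed.toString() };
    }

    /** Slot-per-record lookup: rows can be filtered directly, as a SQL WHERE clause would. */
    static Set<String> findInSlotRows(List<String[]> rows, String slot, String value) {
        Set<String> ids = new TreeSet<String>();
        for (String[] r : rows) {
            if (r[1].equals(slot) && r[2].equals(value)) {
                ids.add(r[0]);
            }
        }
        return ids;
    }

    /** Frame-per-record lookup: every frame has to be read and unpacked before testing. */
    static Set<String> findInFrameRows(List<String[]> rows, String slot, String value) {
        Set<String> ids = new TreeSet<String>();
        for (String[] r : rows) {
            for (String pair : r[1].split(";")) {
                if (pair.equals(slot + "=" + value)) {
                    ids.add(r[0]);
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> apple = new LinkedHashMap<String, String>();
        apple.put("colour", "red");
        apple.put("size", "small");
        List<String[]> slotRows = toSlotRows("apple", apple);
        List<String[]> frameRows = new ArrayList<String[]>();
        frameRows.add(toFrameRow("apple", apple));
        System.out.println(slotRows.size() + " slot rows vs " + frameRows.size() + " frame row");
        System.out.println(findInSlotRows(slotRows, "colour", "red"));
        System.out.println(findInFrameRows(frameRows, "colour", "red"));
    }
}
```

The sketch makes the trade-off visible: slot-per-record multiplies
the number of records per frame but lets the store do the selecting,
while frame-per-record needs one record per frame but pushes all
slot-level selection into the KRS.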

GroundKB
--------
see Attachment "Ground_kb.tie"
To make the above described KRS software function there needs to be a
set of foundational ontologies (formal, logical definitions of the
entities contained within the knowledge system).  Some of these
foundational ontologies are found in appendix A.  The technological
uncertainties that we encountered in this process were manifold, including:
  1) settling on a minimal subset to implement with the attendant risks of
     - being too simplistic initially and not being able to represent the 
       distinctions required by subsequent ontologies
     - being too complex initially resulting in excessive computation for
       the simplest knowledge operations
  2) there is a very delicate balance that we were trying to find between 
     ontological sophistication and computational tractability in a program
     intended to provide snappy graphical performance where the graphical 
     representation of each knowledge entity might involve a considerable
     number of knowledge operations
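
To make the tractability concern concrete, here is a minimal sketch of
a frame with multiple parents and inherited slot lookup.  It is not
the TIE KRS: the class MiniFrame, the frame names and the slot names
are hypothetical, the lookup order (depth-first, left to right) is an
assumption, and the parent graph is assumed to be acyclic.  The
counter shows how a single slot read can fan out into several frame
visits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MiniFrame {
    /** Number of frames visited by get() calls so far; reset it before measuring. */
    static int visits = 0;

    final String name;
    final List<MiniFrame> parents = new ArrayList<MiniFrame>();
    final Map<String, String> ownSlots = new HashMap<String, String>();

    MiniFrame(String name, MiniFrame... parents) {
        this.name = name;
        for (MiniFrame p : parents) {
            this.parents.add(p);
        }
    }

    /** Sets a local slot value and returns this frame for chaining. */
    MiniFrame put(String slot, String value) {
        ownSlots.put(slot, value);
        return this;
    }

    /** Depth-first, left-to-right lookup; returns null when no frame supplies the slot. */
    String get(String slot) {
        visits++;
        String own = ownSlots.get(slot);
        if (own != null) {
            return own;
        }
        for (MiniFrame p : parents) {
            String inherited = p.get(slot);
            if (inherited != null) {
                return inherited;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MiniFrame thing = new MiniFrame("Thing").put("documentation", "root of the ontology");
        MiniFrame agent = new MiniFrame("Agent", thing).put("can-evaluate", "true");
        MiniFrame document = new MiniFrame("Document", thing).put("has-author", "true");
        MiniFrame report = new MiniFrame("BotReport", document, agent);
        System.out.println(report.get("can-evaluate"));
        System.out.println("frames visited: " + visits);
    }
}
```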

Application-specific ontologies for testing
-------------------------------------------
see Attachment "Tie_Name_Game" 
  for a very simple example of one of these test ontologies

In short, the process of composing our core ontologies (and
consequently the supporting Java code) was an iterative one, driven by
requirements revealed by attempting to implement application-specific
ontologies.  Another complicating factor was the challenge of identifying
application domains to ontologize which were simultaneously:
  1) simple enough to analyze and ontologize rapidly
     (this was merely a preparatory effort so that we could get down
      to the more 'Idea Engine'-specific challenges outlined elsewhere, 
      so we didn't want to spend inordinate amounts of time on these 
      test application ontologies)
  2) representative enough of likely real-world applications of TIE so that the
     underlying Ground_kb and Tie_Core_kb were being sufficiently exercised
  3) fundamentally interesting enough to our testers so that we could elicit
     genuine personal interest and involvement from them (necessary so the 
     criteria and evaluation investigations could be grounded in real opinions)

Since TIE was in an early developmental phase during this period there
was usually a need to perform user interface programming to make each
of the test applications fully usable.  It is our contention that this work
qualifies for coverage because it was programming required by the efforts to
address technological uncertainties.


(6) Ontologizing Criteria, Evaluations, Worldview and Depictions
================================================================
see Attachment "Tie_Core_kb.tie"

The literature does not document any ontologies for criteria or
evaluations, though there is related work in the Platform for Internet
Content Selection (PICS) effort.  There are several technological
advances sought in this effort.  They include:
     We seek to devise an ontological foundation for humans to provide
     criteria-based ratings of knowledge of any kind of thing in a
     fine-grained way, for broad information discovery, filtering, and
     presentation purposes.
     This is in contrast with the current state of the art, PICS, which
     is limited to the rating of URLs instead of knowledge of any kind,
     and it also suffers from being biased toward blocking pornography
     from children rather than providing facilities for flexibly examining 
     'evaluation-space' for a broad range of purposes.
     There are also a large number of other ratings-based systems on the
     net (for instance www.rankit.com) which use a single implicit 
     criterion: 'goodness'.  Now, as nice as goodness is, it is a
     very limited basis for performing a broad range of evaluation-based
     activities, in particular for hosting an evolutionary system of
     the scope of TIE.

Technological Uncertainty
  - Can we manage to create sufficiently flexible expressions of these ideas
    while at the same time making appropriate ones versionable?
  - How do we keep versions 'close together' in the realm of meaning?

Our efforts have consisted of coming up with an ontology (a collection
of formal set-theoretic definitions) of the classes listed below.
  Criterion 
     Each instance of the class Criterion represents a dimension along
     which evaluations may be performed.  Complicating factors are
     that criteria may be related to one another (in particular, one
     criterion may be a refinement of another, and several criteria
     together may constitute another criterion) and that criteria
     have 'target' classes to which they are applicable.
  Evaluation
     An evaluation amounts to a rating by an agent (a person or a software bot)
     of a frame, with respect to a criterion.
  RatingSystem
     A rating-system instance is associated with a set of values which
     constitute the potential ratings for use in evaluations.  A rating
     system is usable in an evaluation only where it is appropriate to
     the criterion the evaluation concerns.  Different rating systems
     are used by different criteria.
  Depiction (not yet fully ontologized)
     A particular set of filters, mappings from evaluations to display
     characteristics and who-to-heed constraints as applied to a particular
     set of data.
  WorldView (not yet ontologized)
     A family of mappings, filters and who-to-heed constraints from which
     particular depictions may be generated.  These worldviews are 
     intended as 'attractors' which can be expected to generate depictions
     characteristic of the worldview's essence, e.g. FiscalConservative or
     DeepEcologist.
  Filter    (not yet ontologized)
     A way to exclude or include frames in a set to be further processed.
  Version   (not yet fully ontologized)
     A variation on an original, which shares the original's defining
     characteristics but varies with regard to details which possibly
     make it a superior representative of that essence in certain contexts.



Conceptual, Perceptual and Procedural Requirements 
==================================================

see Attachment "package COM.emergence.tie.gui.util" 

(7) Understandable Navigation of Multi-Dimensional Evaluation Space (M.E.S.)
============================================================================
Little more than thinking about this in fiscal 1998.  Included here for
completeness.

(8) Users Remaining Oriented in M.E.S.
======================================

Only a small amount of work on these issues was done during fiscal
1998.  The chief effort was the programming of a flexible system for
nested window partitioning called BiffyPanels.

Technological Advances Sought
1)  The creation of a user interface environment in which user
  reconfigurable 'cockpits' (they control the selection and placement of 
  display and control panels within a larger window) are capable of
  undergoing evolution (in the standard TIE fashion involving 
  versioning, criteria and evaluations).  Such cockpits are 
  knowledge-context-sensitive; that is, the cockpits are different depending
  on what kind of information you are looking at, but make use
  of reusable panels based on the vantages required in the cockpit.

see Attachment "Mappings, Filters, Evaluations Diagram"  
2) The creation of a sort of 'dataflow pachinko parlor' where data source
   selection is done at the top, and then each layer of panels below
   represents finer levels of refinement, restriction or detail.  Lateral
   segmentation of the display into hierarchic vertical collections of panels 
   indicates partitions of the data which visually document differing
   filter sets in different vertical panel sets.

Technological Uncertainties
  Both of these undertakings pose serious challenges of ontological
  complexity and of implementation efficiency.

Effort engaged in during fiscal 1998
  Programmed the Java classes in the COM.emergence.tie.gui.util package.
  This 'BifurcationPanel' system is a user interface component
  which will figure prominently in the two advances in information display
  functionality described above.
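
The idea of nested window partitioning can be sketched as a binary
tree whose leaves are panels and whose inner nodes split their area in
two.  This is only the layout arithmetic, not the code of the
COM.emergence.tie.gui.util package: the class PartitionNode, its
methods and the panel names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartitionNode {
    final String panelName;        // non-null for leaves only
    final boolean splitVertically; // true: left | right, false: top / bottom
    final double ratio;            // share of the area given to the first child
    final PartitionNode first;
    final PartitionNode second;

    private PartitionNode(String panelName, boolean splitVertically, double ratio,
                          PartitionNode first, PartitionNode second) {
        this.panelName = panelName;
        this.splitVertically = splitVertically;
        this.ratio = ratio;
        this.first = first;
        this.second = second;
    }

    static PartitionNode leaf(String panelName) {
        return new PartitionNode(panelName, false, 0.0, null, null);
    }

    static PartitionNode split(boolean vertically, double ratio,
                               PartitionNode first, PartitionNode second) {
        return new PartitionNode(null, vertically, ratio, first, second);
    }

    /** Assigns an {x, y, width, height} rectangle to every leaf panel under this node. */
    void layout(int x, int y, int w, int h, Map<String, int[]> out) {
        if (panelName != null) {
            out.put(panelName, new int[] { x, y, w, h });
            return;
        }
        if (splitVertically) {
            int w1 = (int) Math.round(w * ratio);
            first.layout(x, y, w1, h, out);
            second.layout(x + w1, y, w - w1, h, out);
        } else {
            int h1 = (int) Math.round(h * ratio);
            first.layout(x, y, w, h1, out);
            second.layout(x, y + h1, w, h - h1, out);
        }
    }

    public static void main(String[] args) {
        // A source-selection panel on the left, filters above detail on the right.
        PartitionNode cockpit = split(true, 0.25, leaf("sources"),
                split(false, 0.5, leaf("filters"), leaf("detail")));
        Map<String, int[]> rects = new LinkedHashMap<String, int[]>();
        cockpit.layout(0, 0, 800, 600, rects);
        for (Map.Entry<String, int[]> e : rects.entrySet()) {
            int[] r = e.getValue();
            System.out.println(e.getKey() + ": " + r[0] + "," + r[1] + " " + r[2] + "x" + r[3]);
        }
    }
}
```

Because the whole arrangement is a small tree of plain values, a
cockpit of this kind could in principle be stored as knowledge and
versioned like any other frame.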


(9) Direct Manipulation Interface for Crit, Eval, WorldView and 
Depiction Creation
===============================================================
Little more than thinking about this in fiscal 1998.  Included here for
completeness.

(10) Taxonomy of Criteria, in Diverse Natural Languages
=======================================================
Little more than thinking about this in fiscal 1998.  Included here for
completeness.

(11) Mapping Conceptual Domains onto Graphical Representations
==============================================================
Little more than thinking about this in fiscal 1998.  Included here for
completeness.

(12) Criteria and Evaluation Visualization
==========================================
see Attachment "package COM.emergence.tie.depiction.graph"

Technological Advance Sought
  The creation of a system for graphically representing essentially any kind 
  of data about any thing, including but not limited to people's evaluations
  of the thing in question with respect to multiple criteria simultaneously.

Technological Uncertainties
  - Can we devise a comprehensive system for mapping rating systems (and
    datatypes typically used for object attributes) onto the discrete
    displayable characteristics of onscreen objects?
  - What is an object architecture (pattern language?) suitable for such
    a system?
  - What is an ontology (including entities such as rating systems,
    datatypes, display characteristics and so on) capable of representing
    such a system?
  - Which aspects of the above objectives are best dealt with using
    object-oriented design (OOD) and which are best represented as knowledge?
 
Methodology
  We created both a preliminary Java implementation of such a system
  and the attendant ontological support.
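
One simple form such a mapping could take is sketched below.  It is a
hedged illustration, not the code of the
COM.emergence.tie.depiction.graph package: the class
DisplayMappingSketch, the rating names and the choice of linear
interpolation are all assumptions.  A rating's position in its
ordered rating system is mapped onto a numeric display characteristic,
so that two criteria can drive two characteristics of the same
onscreen object at once.

```java
import java.util.Arrays;
import java.util.List;

final class DisplayMappingSketch {
    /**
     * Maps a rating's position in an ordered rating system onto the range
     * [min, max] by linear interpolation, rounded to the nearest integer.
     */
    static int scale(List<String> orderedRatings, String rating, int min, int max) {
        int index = orderedRatings.indexOf(rating);
        if (index < 0) {
            throw new IllegalArgumentException("unknown rating: " + rating);
        }
        if (orderedRatings.size() == 1) {
            return max; // a single-valued rating system has nothing to interpolate
        }
        double t = index / (double) (orderedRatings.size() - 1);
        return (int) Math.round(min + t * (max - min));
    }

    public static void main(String[] args) {
        List<String> quality = Arrays.asList("poor", "fair", "good", "excellent");
        List<String> relevance = Arrays.asList("low", "medium", "high");

        // One criterion drives the node's diameter, another its gray level.
        int diameter = scale(quality, "good", 10, 40);
        int grayLevel = scale(relevance, "high", 0, 255);
        System.out.println("diameter=" + diameter + " gray=" + grayLevel);
        // prints diameter=30 gray=255
    }
}
```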

Effort
  There were several weeks of work on this in fiscal 1998, but the work
  continued over into fiscal 1999.  (Numbers are determinable upon request.)