Iowa State University

 

 

INtelligent Data Understanding System

 

 

 

Tools Provided by INDUS

 


  • Ontology Editor
 

 

      The INDUS ontology editor can be used to define attribute types (linear attribute types or tree-structured isa hierarchies over the values of the attributes) or to modify a predefined set of attribute types. When the editor opens, the user is presented with the predefined and previously defined types, and can define new types or load existing types and modify them. Figure 1 shows the graphical interface of the ontology editor. The panel on the left contains the types that are available in the system. For a selected type, the panel on the right shows an editable description of that type. The types defined in INDUS can be exported to XML.

 

 

Figure 1: Ontology Editor

 

 

      Future Work: In the current implementation, only tree-structured hierarchies (e.g., attribute value taxonomies) can be defined over the values of the attributes. Support for general directed acyclic graphs (DAGs) will be added in the near future.
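To make the idea of a tree-structured isa hierarchy concrete, the following is a minimal sketch in Python. It is not the INDUS implementation: the class name `TaxonomyNode`, the weather values, and the XML element and attribute names are all assumptions made for illustration, since the page does not describe the INDUS export format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One value in a tree-structured isa hierarchy (hypothetical model)."""
    name: str
    children: list = field(default_factory=list)

    def add(self, name):
        """Attach a more specific value below this one and return it."""
        child = TaxonomyNode(name)
        self.children.append(child)
        return child

    def find(self, name):
        """Return the node called `name` in this subtree, or None."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def is_a(self, descendant, ancestor):
        """True if `descendant` lies in the subtree rooted at `ancestor`."""
        node = self.find(ancestor)
        return node is not None and node.find(descendant) is not None

    def to_xml(self):
        """Serialize the subtree; the element layout is an assumption."""
        elem = ET.Element("value", name=self.name)
        for child in self.children:
            elem.append(child.to_xml())
        return elem


# Hypothetical attribute type with a small value taxonomy.
weather = TaxonomyNode("Weather")
precipitation = weather.add("Precipitation")
precipitation.add("Rain")
precipitation.add("Snow")
weather.add("Clear")

xml_text = ET.tostring(weather.to_xml(), encoding="unicode")
```

Because every node has exactly one parent, this structure cannot express a value with two more general values; that is the limitation the planned DAG support would lift.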



  • Schema Editor
 

 

     The schema editor is used to define the schema of a data source in terms of the available types. The editor allows a user to define new schemas, or to load and modify an existing schema stored in a schema database. Figure 2 shows the interface for defining or modifying a schema. The left panel shows the attributes that describe the data, while the right panel shows their corresponding types (chosen from the list of available ontological types).

 

Figure 2: Schema Editor
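As a rough illustration of what a schema amounts to here, the sketch below pairs each attribute with a type and rejects types that have not been defined. The function `define_schema`, the type names, and the attribute names are hypothetical; the actual INDUS schema database format is not described on this page.

```python
# Hypothetical set of types previously defined in the ontology editor.
AVAILABLE_TYPES = {"Integer", "String", "WeatherTaxonomy"}


def define_schema(name, attributes, available_types=AVAILABLE_TYPES):
    """Return a schema record, rejecting attributes with undefined types."""
    unknown = {attr: typ for attr, typ in attributes.items()
               if typ not in available_types}
    if unknown:
        raise ValueError(f"undefined types: {unknown}")
    return {"name": name, "attributes": dict(attributes)}


# Each attribute (left panel) is paired with a type (right panel).
schema = define_schema(
    "StationReadings",
    {"station_id": "Integer", "city": "String", "condition": "WeatherTaxonomy"},
)
```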

 



 

  • Mappings Editor

 

     The mappings editor is used to define semantic correspondences (or interoperation constraints) between two ontologies. Mappings can be defined at two levels: schema level (between the attributes that define data source schemas) and attribute level (between the values that two attributes can take). INDUS allows the following types of interoperation constraints at both levels: semantic equality (e.g., between AASequence:O1 and AAComposition:OU), semantic subsumption (e.g., between MIPS:16.19.01:O1 and GO:0017076:OU, in either direction), semantic compatibility (e.g., between MIPSFuncat:O1 and GOFunctClass:OU), and semantic incompatibility (e.g., between Gene:O1 and Source:OU). In addition, INDUS allows continuous constraints, which are defined by a mathematical expression (e.g., converting degrees Centigrade to degrees Fahrenheit).

     Figure 3 shows the interface for specifying interoperation constraints between two ontology-extended schemas. The leftmost panel shows an extended schema associated with a data source, including the hierarchical type ontologies associated with its attributes. The second panel shows the available interoperation constraints. The third panel shows the extended schema associated with the user data. The user selects a term in the first schema, the desired interoperation constraint, and a term in the second schema. The user-specified interoperation constraints, which are used to infer consistent mappings, are shown in the rightmost panel.

 

Figure 3: Mappings Editor

 

Future Work: In the current implementation, the editor allows a mapping to be formed from any term in the first ontology-extended schema (on the left), any interoperation constraint (in the middle), and any term in the second ontology-extended schema (on the right), even if the resulting mapping is inconsistent with mappings that already exist in the mappings database. A reasoner that checks the correctness of new mappings and their consistency with the existing set of mappings will be implemented. Besides the interoperation constraints shown in Figure 3 and continuous constraints, INDUS will allow procedural constraints (defined by a procedure uploaded by the user, e.g., mapping AASequence:O1 to AAComposition:OU at attribute level).
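The constraint types described in this section can be modelled roughly as follows. This is a sketch under stated assumptions, not the INDUS data model: the names `Relation`, `Mapping`, and `conflicting_pairs` and the terms `TempC:O1` and `TempF:OU` are hypothetical. The conflict check covers only the simplest case (the same pair declared both equal and incompatible) and is far weaker than the reasoner planned as future work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Relation(Enum):
    EQUALITY = "semantic equality"
    SUBSUMPTION = "semantic subsumption"
    COMPATIBILITY = "semantic compatibility"
    INCOMPATIBILITY = "semantic incompatibility"
    CONTINUOUS = "continuous"


@dataclass(frozen=True)
class Mapping:
    source_term: str
    relation: Relation
    target_term: str
    level: str  # "schema" or "attribute"
    convert: Optional[Callable[[float], float]] = None  # continuous only


def celsius_to_fahrenheit(celsius):
    """Continuous constraint given by the expression F = C * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0


def conflicting_pairs(mappings):
    """Return term pairs declared both equal and incompatible."""
    seen = {}
    conflicts = set()
    for m in mappings:
        key = (m.source_term, m.target_term)
        seen.setdefault(key, set()).add(m.relation)
        if {Relation.EQUALITY, Relation.INCOMPATIBILITY} <= seen[key]:
            conflicts.add(key)
    return conflicts


mappings = [
    Mapping("AASequence:O1", Relation.EQUALITY, "AAComposition:OU", "schema"),
    # TempC:O1 and TempF:OU are hypothetical terms for illustration.
    Mapping("TempC:O1", Relation.CONTINUOUS, "TempF:OU", "attribute",
            celsius_to_fahrenheit),
]
```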



  • Data Source Editor
 

           The data source editor (registration) is used by owners of data sources to specify the following for an existing data source: its ontology-extended schema (selected from a set of available, previously defined schemas), its type (Oracle, MySQL, and PostgreSQL databases are supported in the current implementation), its location (i.e., URL), the driver that should be used to connect to it, and the user/password information required to access it. Figure 4 shows the interface for registering data sources. New data sources can also be defined this way, but their instances have to be entered one by one (this feature was created mainly for testing purposes).

Figure 4: Data Editor

 

Future Work: The data source types supported by the current implementation of the editor are already structured as tables. Iterators that make a data source look like a table structured according to its schema and its ontology will be implemented in the future (e.g., with appropriate iterators, a web page that is presented as text can be treated as a table).
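The information collected at registration can be pictured as a single record, as in the sketch below. The class `DataSourceRegistration` and all field values are hypothetical placeholders; only the list of supported database types comes from the text above.

```python
from dataclasses import dataclass

# Database types named as supported in the current implementation.
SUPPORTED_TYPES = {"Oracle", "MySQL", "PostgreSQL"}


@dataclass
class DataSourceRegistration:
    """What a data source owner supplies at registration (hypothetical)."""
    name: str
    schema_name: str   # a previously defined ontology-extended schema
    source_type: str   # must be one of SUPPORTED_TYPES
    url: str           # location of the data source
    driver: str        # driver used to connect
    user: str
    password: str

    def __post_init__(self):
        if self.source_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported data source type: {self.source_type}")


# All values below are placeholders, not real INDUS settings.
registration = DataSourceRegistration(
    name="stations",
    schema_name="StationReadings",
    source_type="PostgreSQL",
    url="postgresql://db.example.org/stations",
    driver="example.Driver",
    user="reader",
    password="secret",
)
```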



  • View Editor
 

           The view editor is used to specify a user's own view of a domain (the local schema), the data sources of interest to the user within that domain (chosen from a list of previously registered data sources), their associated schemas, and the mappings from the data source schemas to the user schema. Once all this information is defined, the user can proceed to pose queries.

Figure 5: View Editor

 

Future Work: At present, the user needs to make sure that the data sources chosen for querying are from the same domain (i.e., compatible). Future implementations will check this automatically.



  • Query Editor
 

 

     The query editor allows users to pose queries in terms of the user ontology, as if all the data were in a single table structured according to the user schema and ontology. A query is specified by selecting (and possibly restricting or conditioning) attributes from the user ontology through a graphical interface. The editor accepts only well-formed queries and rejects all others. Once the attributes of interest are specified, the user chooses select or count, thus forming a SELECT or COUNT query. The system answers the query according to the user-defined view. More precisely, the query is sent to a query-answering engine (QAE), which decomposes it into sub-queries for the distributed data sources, then maps and rewrites them (possibly doing some reasoning) into the specific data source ontologies according to the mappings. The results of the partial queries are sent back to the QAE, which composes them into a complete answer to the user query, maps it back to the user ontology, and presents it to the user.

 

Figure 6: Query Editor

 

Future Work: Only SELECT queries (returning the instances that satisfy the properties specified by the user) and COUNT queries (returning the number of such instances) can be defined in the current implementation. More complex statistical queries will be supported in future versions. No data source constraints (e.g., privacy constraints, remote code execution constraints) are considered at this time. These will be added in the future, and the optimization component of the current QAE will be improved.
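The decompose, rewrite, and compose steps can be illustrated for a COUNT query with the sketch below. It is a simplified stand-in for the QAE, not its actual algorithm: data sources are in-memory lists of rows, mappings are plain dictionaries, and the names `answer_count`, `attribute_map`, and `value_map` are hypothetical. It also assumes that the sources hold disjoint sets of instances, so partial counts can simply be added.

```python
def answer_count(user_conditions, sources):
    """Answer a COUNT query posed in user terms over several sources."""
    total = 0
    for source in sources:
        # Rewrite each condition into the source's own vocabulary.
        local_conditions = {}
        for attr, value in user_conditions.items():
            local_attr = source["attribute_map"][attr]
            local_value = source["value_map"].get((attr, value), value)
            local_conditions[local_attr] = local_value
        # The source answers its partial query.
        partial = sum(
            all(row.get(a) == v for a, v in local_conditions.items())
            for row in source["rows"]
        )
        # Compose: counts over disjoint sources add up.
        total += partial
    return total


# Two hypothetical sources that name the same attribute differently.
source_a = {
    "attribute_map": {"species": "organism"},
    "value_map": {("species", "yeast"): "S. cerevisiae"},
    "rows": [
        {"organism": "S. cerevisiae"},
        {"organism": "E. coli"},
        {"organism": "S. cerevisiae"},
    ],
}
source_b = {
    "attribute_map": {"species": "species_name"},
    "value_map": {},
    "rows": [{"species_name": "yeast"}, {"species_name": "human"}],
}

count = answer_count({"species": "yeast"}, [source_a, source_b])
```

A SELECT query would follow the same rewriting step but would additionally need the inverse mappings, so that the returned rows can be translated back into the user ontology.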
