HubbleDotNet Physical View

Physical View

Hubble.net integrates a full-text search engine with a relational database. It lets you run full-text searches, using SQL, against data stored in the relational database. The Hubble.net component manages the inverted index for the full-text data and saves that index in directories on disk, while the relational database stores the actual data. Hubble.net provides an IDBAdapter interface that users can implement to write a custom database adapter.

The "Database adapter" section explains how to add a custom database adapter to Hubble.net.

When building the inverted index, the system has to perform word segmentation on the full-text data. Hubble.net provides an IAnalyzer interface for plugging in custom word segmentation.

The "Word Segmentation" section explains how to add custom word segmentation to Hubble.net.
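To make the relationship between word segmentation and the inverted index concrete, here is a minimal conceptual sketch in Python. It is not Hubble.net code: the names `whitespace_analyzer` and `build_inverted_index` are hypothetical, and the whitespace splitter only stands in for whatever a real IAnalyzer implementation would do.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for a word segmenter; NOT the Hubble.net IAnalyzer API.
Analyzer = Callable[[str], List[str]]


def whitespace_analyzer(text: str) -> List[str]:
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def build_inverted_index(docs: Dict[int, str], analyzer: Analyzer) -> Dict[str, Set[int]]:
    """Map each term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return dict(index)


# The document texts play the role of rows stored in the relational database.
docs = {1: "full text search", 2: "relational database search"}
index = build_inverted_index(docs, whitespace_analyzer)
print(sorted(index["search"]))  # ids of the documents that contain "search"
```

Because the analyzer is passed in as a function, a different segmentation strategy can be swapped in without touching the indexing loop, which is the general idea behind a pluggable analyzer.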

Hubble.net runs as a Windows service after installation. It provides a Hubble.SQLClient component for communicating with the Hubble.net service. The interfaces provided by SQLClient are similar to those of SqlClient in ADO.NET. The details are explained in the SQLClient section.

[Diagram: logical view]

Logical View
Hubble.net resembles a relational database. It has concepts such as "Database" and "Table", but in Hubble.net these only map to their counterparts in the relational database. There is no real entity for a "Database" or "Table" inside Hubble.net. When a user queries Hubble.net databases and tables with SQL, Hubble.net automatically maps the query to the real objects in the relational database. From the user's perspective, Hubble.net behaves like a "real" database.
Hubble.net manages the inverted index for text fields and a single-value index for untokenized fields. The relational database manages the B+ tree index. How these indexes are coordinated is introduced later.

Table Types

Active Table

An Active Table keeps the relational database table in step with the index: table operations are applied to the relational database at the same time as the index is updated. When you create, delete, or truncate an Active Table in Hubble.net, Hubble.net creates, deletes, or truncates the corresponding table in the relational database. Active Tables suit applications with strict real-time requirements, because the index and the data are both updated in real time.

When creating an Active Table, IndexOnly should be False.

[Diagram: Active Table update flow]
The diagram above shows a typical Active Table data update flow.
SQL statements for create, delete, and truncate operations are issued through Hubble.SQLClient and arrive at Hubble.Core. Hubble.Core updates the tables in the relational database accordingly, and then updates the full-text index in Hubble.net. As the diagram shows, the relational database and the full-text index are updated together.
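The Active Table flow can be modeled with a small conceptual sketch. This is not Hubble.net code: the class `ActiveTableModel` and the `news` table are hypothetical, an in-memory SQLite database stands in for the relational database, and a plain dictionary stands in for the full-text index. The sketch also includes an insert, which is an assumption added for illustration.

```python
import sqlite3
from collections import defaultdict


class ActiveTableModel:
    """Toy model: every write goes to the relational table and to the index together."""

    def __init__(self) -> None:
        # In-memory SQLite stands in for the relational database.
        self.db = sqlite3.connect(":memory:")
        self.index = defaultdict(set)  # term -> set of row ids
        # Creating the active table also creates the relational table.
        self.db.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, body TEXT)")

    def insert(self, doc_id: int, body: str) -> None:
        # Step 1: update the relational table.
        self.db.execute("INSERT INTO news (id, body) VALUES (?, ?)", (doc_id, body))
        # Step 2: update the full-text index as part of the same operation.
        for term in body.lower().split():
            self.index[term].add(doc_id)

    def truncate(self) -> None:
        # Truncating the active table empties both the data and the index.
        self.db.execute("DELETE FROM news")
        self.index.clear()

    def row_count(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM news").fetchone()[0]


table = ActiveTableModel()
table.insert(1, "hello full text search")
print(table.row_count(), sorted(table.index["search"]))
```

The point of the model is that callers never touch the relational table directly; every change passes through one component that updates both stores.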

Passive Table

Active Tables have the real-time advantage. However, when adding full-text indexing to an existing database, we want to build the full-text index over the existing data without creating new tables through Hubble.net. In this case an Active Table cannot be used; a Passive Table must be used instead.
A Passive Table is created with IndexOnly = True in its creation statement.
[Diagram: Passive Table update flow]
The diagram above shows a typical Passive Table update flow.
When the relational database is updated, Hubble.net does not update the index immediately. The user needs to use QueryAnalyzer or another program to read the changed rows from the tables in the relational database, convert those changes into SQL statements such as insert, update, and delete, and send them to Hubble.Core. Hubble.Core then updates the related full-text index. Hubble.net may provide a simple update model in the future.
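The following conceptual sketch models the passive flow for the insert case only. It is not Hubble.net code: `sync_new_rows`, the `news` table, and the `last_synced_id` high-water mark are hypothetical names, in-memory SQLite stands in for the existing relational database, and a dictionary stands in for the full-text index. Detecting updates and deletes would need an extra mechanism that is not shown here.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite stands in for an existing relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany(
    "INSERT INTO news (id, body) VALUES (?, ?)",
    [(1, "existing row"), (2, "another existing row")],
)

index = defaultdict(set)  # term -> set of DocId values
last_synced_id = 0        # hypothetical high-water mark kept by the sync program


def sync_new_rows() -> int:
    """Read rows added since the last sync, index them, and return how many were indexed."""
    global last_synced_id
    rows = db.execute(
        "SELECT id, body FROM news WHERE id > ? ORDER BY id", (last_synced_id,)
    ).fetchall()
    for doc_id, body in rows:
        for term in body.lower().split():
            index[term].add(doc_id)
        last_synced_id = doc_id
    return len(rows)


print(sync_new_rows())       # indexes the rows that already existed
db.execute("INSERT INTO news (id, body) VALUES (3, 'late row')")
print(sorted(index["row"]))  # row 3 is not in the index until the next sync
print(sync_new_rows())       # picks up the row that was added afterwards
```

The gap between the direct insert and the second sync call is the window in which the index lags behind the data, which is the trade-off a Passive Table accepts.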

Workflow of Full Text Index

[Diagram: Full Text Index flow chart]

The diagram above shows the flow of Hubble.net full-text indexing. When an instruction is sent from SQLClient to Hubble.Core, Hubble.Core optimizes and parses the query, then dispatches it to the full-text index, the single-value index, or the relational database. The results are then merged in Hubble.Core, sorted, and returned to SQLClient.

This workflow is the same for active and passive tables.
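The dispatch, merge, and sort steps can be illustrated with a small conceptual sketch. It is not Hubble.net code: the two dictionaries, the `query` function, the relevance scores, and the choice to merge by intersection are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Toy full-text index: term -> {doc id: relevance score}. The scores are made up.
full_text_index: Dict[str, Dict[int, float]] = {
    "search": {1: 0.9, 2: 0.4, 3: 0.7},
}

# Toy single-value index on an untokenized field: value -> doc ids.
category_index: Dict[str, Set[int]] = {
    "news": {1, 3},
    "blog": {2},
}


def query(term: str, category: str) -> List[Tuple[int, float]]:
    """Dispatch to both indexes, merge by intersection, and sort by score, highest first."""
    text_hits = full_text_index.get(term, {})
    category_hits = category_index.get(category, set())
    merged = [
        (doc_id, score)
        for doc_id, score in text_hits.items()
        if doc_id in category_hits
    ]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


print(query("search", "news"))  # -> [(1, 0.9), (3, 0.7)]
```

Each index answers only the part of the query it is suited to, and the combining step is the single place where the partial results meet.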

Properties for Data Table in HubbleDotNet

IndexOnly

This property indicates whether a table is active or passive: IndexOnly is False for an Active Table and True for a Passive Table.

Directory

This property indicates the directory where the full text index is saved.

DBTableName

DBTableName is the name of the table in the relational database.

DBAdapter

DBAdapter specifies which adapter to use when communicating with the relational database.

DBConnect

The connection string for the relational database.

DocId

Passive Table only. It designates a field in the relational database table as the DocId field in Hubble.net.
The use of these properties is discussed in the "Creating Table" section.
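As a compact summary of how the properties relate to each other, here is a conceptual sketch. It is not Hubble.net syntax: the `TableProperties` class is hypothetical, and the directory, adapter name, and connection string in the example are placeholders rather than real values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableProperties:
    """Hypothetical container mirroring the table properties listed above."""

    index_only: bool              # IndexOnly: False = Active Table, True = Passive Table
    directory: str                # Directory: where the full-text index is saved
    db_table_name: str            # DBTableName: table name in the relational database
    db_adapter: str               # DBAdapter: which adapter talks to the database
    db_connect: str               # DBConnect: connection string for the database
    doc_id: Optional[str] = None  # DocId: Passive Table only

    def __post_init__(self) -> None:
        # DocId is described as applying only to a Passive Table.
        if self.doc_id is not None and not self.index_only:
            raise ValueError("DocId applies only to a Passive Table (IndexOnly = True)")

    @property
    def table_type(self) -> str:
        return "passive" if self.index_only else "active"


# All values below are placeholders chosen for illustration.
props = TableProperties(
    index_only=True,
    directory="/tmp/news-index",
    db_table_name="news",
    db_adapter="example-adapter",
    db_connect="example-connection-string",
    doc_id="id",
)
print(props.table_type)
```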

Last edited Aug 16, 2011 at 2:33 AM by linkspeed, version 4
