Saturday, December 28, 2013

Multiple Sybase IQ DB Servers data with extract keys from file (Sybase C++ application)

The challenge ...

High performance C++ application interacting with multiple Sybase IQ servers.
Steps
a) Read some keys K from a file on disk.
b) For K keys, extract the data from DB Server S1 giving result R1.
c) Using the result R1, find the data from second DB Server S2 giving result R2.
d) Do some processing from C++ code on R2 data extracted in c) above.

Time taken in first run (POC) - 2 hours
Time taken in final implementation - 4 mins

Problems identified 
1) Reading the keys from file to C++ application and query the data R1 using these keys:

First query (run in batch):
select * from DBSERVERS1.SCHEMA.TABLE where key in (k1,k2,k3 ..., k1024) -- considering 1024 batch size.

Improved query:
i) Use Load Table to get the keys K1
ii) Join the Load Table with Server S1 data (NO need of "IN" clause).


2) Join of different server data
First query (
select * from DBSERVERS2.SCHEMA.TABLE where key in (k1,k2, k3,k4... ,k1024) -- considering 1024 batch size.
Where keys are obtained after join of K from file with table in DBSERVER S1.

Improved query steps
a) Create a temporary table in IQ DB S2 with result set R1 with the data extracted from using "INSERT INTO" LOCATE Syntax.
b) Perform a join operation of R1 (output of first query) with DB Server S2.

This should fetch the data a lot faster as compared to other methods (in clause or temporary table insert with "select union").

3) Aggregation at multiple levels - considering we wanted to aggregate per date, per month and per year.

First method
a) First query for daily data
b) Second query for aggregated monthly data
c) Third query for aggregated yearly data.

Improved method - using analytic function
Use the aggregation function of IQ (sum/ max) and use the GROUP BY ROLL UP (details)

No comments:

Post a Comment

Hey if you want you can comment below for things you found were wrong here anywhere or anything you want to suggest or say.