List of Projects
Sampling and Approximate Queries
· Approximate Query Processing (Gautam Das, Arjun Dasgupta, Zubin Joseph)
In many OLAP and decision support environments, it is often desirable to
answer complex, long-running aggregate database queries approximately,
provided that an estimate of the error is also given. For example, a sales
manager who asks for "the aggregate sales of Product X, grouped by US
state" is probably not interested in answers accurate to the nearest
cent. We approach this difficult problem using statistical sampling-based
techniques. Our objective is to propose practical solutions that require
minimal changes to the underlying DBMS.
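To make the idea concrete, the following is a minimal sketch of a sampling-based approximate answer to the sales example above. It is not the project's method: it assumes plain uniform sampling without replacement and a normal-approximation confidence interval, and the function name `approximate_group_sums`, the `(state, amount)` row layout, and the synthetic data are all hypothetical.

```python
import math
import random
from collections import defaultdict


def approximate_group_sums(rows, sample_fraction, seed=0, z=1.96):
    """Estimate SUM(amount) GROUP BY state from a uniform random sample.

    rows: list of (state, amount) tuples standing in for a sales table
          (an assumed layout, chosen only for illustration).
    Returns {state: (estimate, half_width)}, where half_width is the
    half-width of an approximate confidence interval (95% for z=1.96).
    """
    if len(rows) < 2:
        raise ValueError("need at least two rows to estimate a variance")
    rng = random.Random(seed)
    n_total = len(rows)
    n_sample = min(n_total, max(2, int(n_total * sample_fraction)))
    sample = rng.sample(rows, n_sample)  # uniform, without replacement

    # For each group, treat every sampled row as contributing `amount`
    # if it belongs to the group and 0 otherwise.
    sums = defaultdict(float)
    sums_sq = defaultdict(float)
    for state, amount in sample:
        sums[state] += amount
        sums_sq[state] += amount * amount

    scale = n_total / n_sample        # scale-up factor for totals
    fpc = 1.0 - n_sample / n_total    # finite population correction
    result = {}
    for state in sums:
        mean = sums[state] / n_sample
        var = max(0.0, (sums_sq[state] - n_sample * mean * mean) / (n_sample - 1))
        estimate = sums[state] * scale
        half_width = z * n_total * math.sqrt(fpc * var / n_sample)
        result[state] = (estimate, half_width)
    return result


if __name__ == "__main__":
    # Synthetic sales rows; the states and amounts are made up.
    rng = random.Random(42)
    table = [(rng.choice(["CA", "NY", "TX"]), rng.uniform(1, 500)) for _ in range(100_000)]
    for state, (est, hw) in sorted(approximate_group_sums(table, 0.01).items()):
        print(f"{state}: {est:,.0f} +/- {hw:,.0f}")
```

The sketch reads only the sampled rows and reports each group total together with an error bound, which is the kind of answer the sales manager in the example needs. A group that happens to be missing from the sample gets no estimate at all, which is one limitation of plain uniform sampling for group-by queries.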
· P2P sampling (Gautam Das, Zubin Joseph)
This project focuses on sampling and statistics gathering in
unstructured peer-to-peer networks. We are currently working on the
DiVE-DeeP project (Distinct Value Estimation with Duplicates across Peers),
which addresses distinct value estimation when data is duplicated
across the peers in the network. We hope to extend this
work to related problems, such as estimating the degree of duplication on the
network in order to determine trends in the popularity of data on the
network.
Recent Publications
- Surajit Chaudhuri, Gautam Das, Vivek Narasayya. Optimized Stratified Sampling for Approximate Query Processing. To appear in ACM Transactions on Database Systems, 2007.
- Arjun Dasgupta, Gautam Das, Heikki Mannila. A Random Walk Approach to Sampling Hidden Databases. To appear in SIGMOD 2007.
- Benjamin Arai, Gautam Das, Dimitrios Gunopulos, Vana Kalogeraki. Approximating Aggregations in Peer-to-Peer Databases. HDMS 2006.
- Gautam Das. Approximate Query Processing. Tutorial, SBBD 2005.
- Gautam Das. Sampling Methods in Approximate Query Answering Systems. Invited Book Chapter, Encyclopedia of Data Warehousing and Mining. Editor John Wang, Information Science Publishing, 2005.
- Gautam Das. Approximate Query Processing Techniques. Invited Tutorial at the 11th International Conference on Management of Data (COMAD) 2005.