Rakesh Agrawal
IBM Almaden Research Center
San Jose, CA 95120, U.S.A.
ragrawal@almaden.ibm.com
The Quest project on data mining at the IBM Almaden Research Center has developed innovative technology to discover useful patterns in gigabytes of data in a short amount of time. This software can be used to solve the following customer problems:
In this tutorial, I will draw upon my Quest experience to present my perspective on data mining, describe current work, and present some open problems.
Rakesh Agrawal, Sakti Ghosh, Tomasz Imielinski, Bala Iyer, and Arun Swami, ``An Interval Classifier for Database Mining Applications'', VLDB-92, Vancouver, British Columbia, Canada, 1992, 560--573.
Rakesh Agrawal, Tomasz Imielinski and Arun Swami, ``Mining Association Rules between Sets of Items in Large Databases'', SIGMOD-93, Washington D.C., May 1993.
R. Agrawal, C. Faloutsos, and A. Swami, ``Efficient Similarity Search in Sequence Databases'', 4th Int'l Conf. on Foundations of Data Organization and Algorithms (FODO), Chicago, Oct. 1993.
Rakesh Agrawal and Ramakrishnan Srikant, ``Fast Algorithms for Mining Association Rules in Large Databases'', VLDB-94, Santiago, Chile, Sept. 1994. Expanded version available as IBM Research Report RJ9839, June 1994.
Rakesh Agrawal and Ramakrishnan Srikant, ``Mining Sequential Patterns'', 11th Int'l Conf. on Data Engineering, Taipei, Taiwan, March 1995.
R. Agrawal, G. Psaila, E.L. Wimmers, and M. Zait, ``Querying Shapes of Histories'', VLDB-95, Zurich, Switzerland, Sept. 1995.
R. Agrawal, K.I. Lin, H.S. Sawhney, and K. Shim, ``Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases'', VLDB-95, Zurich, Switzerland, Sept. 1995.
R. Srikant and R. Agrawal, ``Mining Generalized Association Rules'', VLDB-95, Zurich, Switzerland, Sept. 1995.
R. Agrawal and G. Psaila, ``Active Data Mining'', 1st Int'l Conf. on Knowledge Discovery and Data Mining (KDD-95), Montreal, August 1995.
M. Mehta, J. Rissanen, and R. Agrawal, ``MDL-based Decision Tree Pruning'', 1st Int'l Conf. on Knowledge Discovery and Data Mining (KDD-95), Montreal, August 1995.
W. Bruce Croft
NSF Center for Intelligent Information Retrieval
Computer Science Department
University of Massachusetts, Amherst
In this course, I will give an overview of the important functionality provided by an information retrieval system, and then discuss the issues and techniques involved in producing an integrated system. The course will emphasize support for the full range of functionality required in a text-based system, including retrieval, routing, filtering, distribution, feedback, interfaces, and browsing. Areas such as probabilistic retrieval models, persistent object management, indexing, query optimization, and query languages will be covered in detail.
Ralf Hartmut Gueting
University of Hagen
Germany
The tutorial aims at giving a coherent picture of the main research
results obtained so far in the areas of modeling, querying, data structures
and algorithms for system implementation, and system architecture.
H.V. Jagadish
AT&T Bell Laboratories
Wide area communications networks have been around for a long time, and ad hoc, application-specific solutions have been adopted for their data management. With the rapid changes the communications industry is currently undergoing and the exponential growth in traffic volume, such ``hard-wired'' solutions are no longer acceptable, and generic database software is desired. However, traditional databases do not provide all the features required in networks, as we will discuss.
There are three major layers at which it is relevant to consider database
needs in a large network: network operation, network management, and network
services. Database requirements differ among the three, and we will consider
each in turn.
Alberto O. Mendelzon
University of Toronto
Everyone who has used a WWW browser such as Mosaic or Netscape knows the frustrations of trying to find information that is definitely out there somewhere, if we only knew where. The ``lost in hyperspace'' syndrome, well known from the early implementations of hypertext, has become more severe with the enormously larger scale and lack of any coherent structure in distributed environments like the Web.
We will survey two approaches to this problem:
Jan Paredaens
Universiteit Antwerpen
We will discuss a number of such data models and we will show that the development of a solid theory for spatial databases depends on a variety of disciplines: database system theory, geography, computational geometry and topology.
We will then focus on two particular data models: the linear model in which exact spatial information is available and the topological data model in which only relative positions of spatial objects are considered.
This course has been prepared and will be presented in cooperation with two Ph.D. students, Bart Kuijpers and Luc Vandeurzen.
Joachim W. Schmidt and
Florian Matthes
Universitaet Hamburg
In this tutorial, Persistent Polymorphic Systems are presented as a generalization of DBMSs which offer substantially improved generic services like orthogonal persistence, bulk data storage, iteration abstraction, and multi-user access. The generic services of Persistent Polymorphic Systems can be instantiated and customized to specific applications by high-level declarations and statements.
Moreover, database applications make heavy use of additional generic services like data visualization, behavior modeling, network communication, or workflow management, which are offered by external servers such as GUI toolkits, fourth-generation languages, distributed object managers, or workflow tools. Therefore, the future success of database environments depends crucially on their ability to be integrated smoothly, and at a high level of abstraction, into a larger heterogeneous information infrastructure.
We outline recent progress towards Persistent Polymorphic Systems along the lines of