Introduction and Overview
Introduction (V1.0)
Protocols
Data Mining Process and Weka (V1.0)
Mining Data from Amazon.com (V1.0)
Recommender Systems (V1.0)
Document Classification / Spam Filter (V1.0)
Document Clustering with NNMF (V1.0)
Face Recognition (V1.0)
Guidelines for Presentations
Contents of Presentations
Student Presentations
Machine Learning Basics
Weka Introduction
Document Classification and Clustering
Recommender Systems (from last term)
Face Recognition
Python
Python Tutorial (Maucher) (Version 1.0)
General Information on the Lecture, Term SS 10:
| Course No. (EDV Nr.) | 21308 |
| Date | Tue 14.15h-17.30h |
| Room | 136 |
| SWS/ECTS | 3/3 |
Contents
In this course, every student group implements six different data mining and pattern recognition applications. A group consists of at most 3 students.
The implementation of each application should be done within one afternoon (14.15h-17.30h).
The applications are:
Data Mining Process: The steps of the general data mining process are demonstrated using the Weka data mining tool.
Mining Data from Amazon.com: Using the Amazon Web Service (AWS), one can access large amounts of product and review data from Amazon.com. In this exercise we integrate the data into our programs using a Python wrapper for AWS. Then we apply various intelligent algorithms to mine interesting knowledge from this data, e.g. we perform trend analysis or build models that predict prices. Students are free to develop their own data mining applications.
Recommender Systems: Recommender systems are applied in e-commerce to generate customized recommendations. Well-known examples are the Amazon.com recommendations, which are either distributed by e-mail or presented on the Amazon web page after login. To generate these recommendations, the products that the user has already purchased or reviewed are taken into account. In this exercise the currently most popular algorithms for generating recommendations (Collaborative Filtering) are implemented, tested and analysed.
Spam Filter: A Naive Bayes classifier is implemented for filtering spam. It is also shown how to apply this algorithm to document classification in general.
Document Classification and Feature Extraction: In this exercise a large number of RSS news feeds is collected. All articles from the different feeds are clustered using non-negative matrix factorisation. The essential features of each document cluster are extracted.
Face Recognition: In this exercise a program for face recognition is implemented. For a given set of training images (biometric face photos), Principal Component Analysis (PCA) is applied to calculate the space of eigenfaces. Then a photo that has to be recognized is transformed into the space of eigenfaces, and the closest training photo is determined.
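The eigenface procedure described for the face recognition exercise can be illustrated with a minimal sketch. It is not the exercise solution: it assumes NumPy is available, uses random vectors in place of real face photos, and the function names `train_eigenfaces` and `recognize` are chosen here for illustration only.

```python
import numpy as np


def train_eigenfaces(images, num_components):
    """Compute eigenfaces from training images.

    images: array of shape (n_images, n_pixels), one flattened photo per row.
    """
    mean_face = images.mean(axis=0)
    centered = images - mean_face
    # PCA via SVD: the rows of vt are the principal axes (the eigenfaces),
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:num_components]
    # Coordinates of every training image in the space of eigenfaces.
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights


def recognize(photo, mean_face, eigenfaces, weights):
    """Return the index of the training image closest to the given photo."""
    query = (photo - mean_face) @ eigenfaces.T
    distances = np.linalg.norm(weights - query, axis=1)
    return int(np.argmin(distances))


# Toy data (assumption): 5 random "photos" with 64 pixels each.
rng = np.random.default_rng(0)
training = rng.random((5, 64))
mean_face, eigenfaces, weights = train_eigenfaces(training, num_components=4)

# A slightly perturbed copy of training image 2 serves as the photo to recognize.
noisy_photo = training[2] + 0.01 * rng.random(64)
best_match = recognize(noisy_photo, mean_face, eigenfaces, weights)
```

Here `best_match` is the index of the training photo whose eigenface coordinates lie nearest to those of the query photo, measured by Euclidean distance. Other distance measures are possible.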
All applications are implemented in Python.
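As an impression of what such a Python implementation might look like, the following minimal sketch shows a Naive Bayes classifier of the kind used in the spam filter exercise. It is an illustration under assumptions, not the exercise solution: the class name `NaiveBayesClassifier`, the whitespace tokenization, the Laplace smoothing and the tiny training set are all chosen here for demonstration.

```python
import math
from collections import Counter


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = {}        # category -> Counter of word frequencies
        self.doc_counts = Counter()  # category -> number of training documents
        self.vocabulary = set()      # all words seen during training

    def train(self, text, category):
        words = text.lower().split()
        self.word_counts.setdefault(category, Counter()).update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def log_score(self, text, category):
        """Log of P(category) times the product of P(word | category)."""
        counts = self.word_counts[category]
        total_words = sum(counts.values())
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[category] / total_docs)
        for word in text.lower().split():
            # Laplace smoothing keeps unseen words from zeroing the product.
            score += math.log(
                (counts[word] + 1) / (total_words + len(self.vocabulary))
            )
        return score

    def classify(self, text):
        return max(self.doc_counts, key=lambda c: self.log_score(text, c))


# Hypothetical training data for illustration only.
classifier = NaiveBayesClassifier()
classifier.train("cheap pills buy now", "spam")
classifier.train("win money now", "spam")
classifier.train("meeting schedule for the lab exercise", "ham")
classifier.train("lecture notes and exercise sheet", "ham")

label_spam = classifier.classify("buy cheap money")
label_ham = classifier.classify("exercise schedule")
```

Because the categories are arbitrary labels, the same class can be trained with more than two categories, which corresponds to document classification in general.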
In addition, students have to prepare presentations. The goal of the presentations is to give a sufficient introduction to the lab exercises. The presentations are therefore scheduled before the first lab exercise.
Groups and Dates
| Presentation | Python | Machine Learning | Weka | Recommender | Doc Classification | Face Recognition |
| Group | Maucher | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
| 23.03.10 | Presentation | Audience | Audience | Audience | Audience | Audience |
| 13.04.10 | Audience | Presentation | Presentation | Presentation | Audience | Audience |
| 20.04.10 | Audience | Audience | Audience | Audience | Presentation | Presentation |
| 27.04.10 | Data Mining Process and Weka |
| 11.05.10 | Mining Data from Amazon.com |
| 01.06.10 | Recommender Systems |
| 08.06.10 | Document Classification / Spam Filtering |
| 15.06.10 | Feature Extraction & Document Clustering (Newsfeeds) |
| 29.06.10 | Face Recognition |
Announcements
SS 10
- 16.03.2010: First lesson in this term: Introduction and Registration