
General Information on the Lecture, Winter Term 2010/11:

EDV Nr. 21309
Date: Thu 14.15h-15.45h
Room: 135
SWS/ECTS: 2/3

Announcements:

October 7th, 2010: First lesson of the term

Content

Natural Language Processing (NLP) deals with techniques that enable computers to understand the meaning of text written in a natural language. NLP is therefore an essential part of Human-Computer Interaction (HCI). As a science, NLP can be regarded as the field where Computer Science, Artificial Intelligence, and Linguistics overlap.

NLP enables applications such as dialog systems, machine translation, and sentiment analysis. Since the advent of Web 2.0, sentiment analysis has played a dominant role. Its goal is to determine the sentiment toward products, their components, and their properties from a vast amount of mostly user-generated web content.

In this lecture the basic techniques of NLP are taught. The lecture covers not only the theory but also the implementation of the relevant NLP procedures. Python and the Natural Language Toolkit (NLTK) are used for the implementation. As a showcase for the relevant NLP techniques, a sentiment analysis system will be implemented and optimized iteratively.
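To give a rough idea of what a first iteration of such a system might look like, here is a minimal lexicon-based sketch using only the Python standard library. The word lists `POSITIVE` and `NEGATIVE` and the function names are hypothetical and chosen for illustration; they are not taken from the course material or from NLTK.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration); a real system
# would use a much larger lexical resource or a trained classifier.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Return the number of positive tokens minus the number of negative ones."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


def classify(text):
    """Map the score to a coarse label: positive, negative, or neutral."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A baseline like this ignores negation, word order, and context, which is where the techniques from the individual lectures could be used to refine it step by step.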

Lecture contents:

Lecture 1 provides an overview of NLP, its main goals, challenges, and applications. It also describes the overall structure of the course and its goals.

Lecture 2 is a general introduction to the programming language Python. In all of the following lectures, Python implementations of the introduced techniques will be presented.

Lectures 3 and 4 are concerned with text preprocessing and normalisation, e.g. how to access data from the web, clean HTML, segment text into sentences and words (tokenization), and transform words to their base forms (stemming and lemmatization). To this end, the most important encoding types are discussed. Moreover, the application of regular expressions in NLP is introduced and some popular stemmers are applied.

Lecture 5 introduces algorithms for text classification and methods to measure and evaluate the performance of these algorithms.

In Lecture 6 the application Sentiment Analysis is introduced. This example application serves as a showcase to demonstrate the use of all NLP techniques introduced in the subsequent lectures.

Lecture 7 is concerned with the use of corpora and lexical resources in NLP.

Lecture 8 is concerned with morphological and lexical analysis, part-of-speech tagging, and tag sets.

In Lecture 9 techniques for syntactic analysis are presented, such as chunking and parsing.

Lecture 10 deals with Information Extraction and its subtasks Named Entity Recognition and Relation Recognition.

Lecture 11 introduces techniques for the extraction of meaning (semantics) from text (semantic analysis).
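The preprocessing steps named for Lectures 3 and 4 (cleaning HTML, segmenting text into sentences and words, and reducing words to a base form) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the class and function names are hypothetical, the sentence splitter is a naive regular expression, and `naive_stem` is a toy suffix stripper, not the Porter algorithm or any other established stemmer.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_html(html):
    """Strip markup and collapse runs of whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def naive_stem(word):
    """Strip a few common English suffixes (toy rule set, not Porter)."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

The naive rules break quickly: abbreviations such as "e.g." are split as sentence ends, and "running" is reduced to "runn". NLTK ships more robust counterparts, for example `nltk.word_tokenize` and `nltk.stem.PorterStemmer`, which need the package and, for the tokenizer, its data files to be installed.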