113444b Natural Language Processing

Last modified:	12.07.2024 / Maucher
Course no. (EDV-Nr):	113444b
Degree programmes:
Lecturer:
Language:	German
Type:	V (Vorlesung, lecture)
Scope:	2 SWS (Semesterwochenstunden, weekly contact hours)
ECTS credits:	3
Workload:	Lecture: 15 x 2 SWS = 22.5 hours (1 SWS = 45 minutes)
	Preparation and follow-up: 15 x 2 SWS = 22.5 hours
	Exam preparation: 5 days x 8 hours/day = 40 hours
	Overall workload: 85 hours
Content-related connection to other courses in the module:	Both Data Mining and Natural Language Processing (NLP) apply methods from machine learning (a subfield of Artificial Intelligence) to detect patterns in large amounts of data. These patterns represent new knowledge that is hidden in the data. In Natural Language Processing, the data to which the methods are applied are natural-language texts from documents or from the web. The Data Mining lab course (Praktikum) has no such restriction on the type of data. In addition, the Data Mining lab course focuses on implementing the methods independently, whereas the Natural Language Processing lecture offers a comprehensive presentation of all relevant process steps of NLP.
Form of examination:
Remarks on the course:	German; restricted number of participants
Description:	Lecture 1 provides an overview of NLP, its main goals, challenges and applications. In addition, this lecture describes the overall structure of the course and its goals.
	Lecture 2 is concerned with text preprocessing and normalisation, e.g. how to access data from the web, clean HTML, segment text into sentences and words (tokenization) and transform words to their base forms (stemming and lemmatization). To this end, the most important encoding types are discussed. Moreover, the application of regular expressions in NLP is introduced and some popular stemmers are applied.
	Lecture 3 introduces algorithms for text classification and methods to measure and evaluate the performance of these algorithms.
	Lecture 4 is concerned with the use of corpora and lexical resources in NLP.
	Lecture 5 is concerned with morphological and lexical analysis, Part-of-Speech (POS) tagging and tagsets.
	Lecture 6 introduces Markov Chains and Hidden Markov Models and their application in NLP, e.g. for POS tagging.
	Lecture 7 introduces chunking (shallow parsing) and how this technique can be applied to entity-relation extraction. Chunking provides answers to questions such as "Which objects are mentioned in a text?" and "Which relations between these objects are described in the text?"
	Lecture 8 presents parsing techniques for syntactic analysis. Context-free grammars, their pros and cons and, finally, feature-based grammars are introduced.
	Lecture 9 introduces techniques for extracting meaning (semantics) from text (semantic analysis). Central to this lecture is the introduction of typed lambda logic and how the constituents of natural-language sentences can be transformed into this type of logic.
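The preprocessing steps named for Lecture 2 (cleaning HTML, sentence segmentation, tokenization, stemming) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the course's own material: the sample string `RAW_HTML`, the function names and the suffix list in `toy_stem` are assumptions chosen for illustration, and the toy stemmer is only a crude stand-in for established stemmers such as the Porter stemmer.

```python
import re

# Hypothetical sample input; not taken from the course material.
RAW_HTML = "<p>The cats are running. Dogs barked loudly!</p>"


def strip_html(html: str) -> str:
    """Replace tags by spaces using a simple regex (adequate only for toy input)."""
    return re.sub(r"<[^>]+>", " ", html).strip()


def split_sentences(text: str) -> list:
    """Split at whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence: str) -> list:
    """Return lowercase word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


def toy_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "ly", "s"):  # assumed, illustrative suffix list
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


sentences = split_sentences(strip_html(RAW_HTML))
tokens = [tokenize(s) for s in sentences]
stems = [[toy_stem(t) for t in sent] for sent in tokens]

print(sentences)
print(tokens)
print(stems)
```

With the sample input, `stems` contains `cat`, `bark` and `loud`, but also `runn` for `running`, which hints at why established stemmers use more elaborate rules than plain suffix stripping.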
Literature:	S. Bird, E. Klein, E. Loper; Natural Language Processing with Python; O'Reilly, 2009
	C.D. Manning, H. Schütze; Foundations of Statistical Natural Language Processing; MIT Press, 1999
	N. Indurkhya, F.J. Damerau (Editors); Handbook of Natural Language Processing; Chapman & Hall/CRC, 2010
	Further literature is available in the HdM library.
Internet:	http://www.hdm-stuttgart.de/~maucher/