What is natural language processing?
Natural language processing (NLP), also called language technology, is a subfield of artificial intelligence (AI) concerned with the interaction between computers and human language. NLP combines linguistics and computer science to study the rules and structure of language and to build systems capable of understanding, analysing and extracting meaning from text and speech. In simple terms, NLP allows us to make ourselves understood by computers through spoken or written language.
NLP can be divided into two major branches based on the form of language: speech technology, which focuses on processing human speech, and text technology, which focuses on processing written text.
Speech technology focuses on automatically detecting, analysing and understanding human speech. The main subfields of speech technology are speech recognition, speech synthesis and speaker recognition.
Speech recognition is the automatic recognition of human speech and its conversion into text. Application areas include document dictation, meeting recording, voice control of devices, automatic subtitling of TV shows and assistive systems for people with hearing impairments.
Speech synthesis automatically converts written text into speech. Application areas include reading news, books, subtitles and other texts aloud to people who are visually impaired or dyslexic, to parents of young children and to anyone engaged in an activity that prevents them from reading.
Speaker recognition automatically identifies individuals based on their speech and intonation. Speaker recognition is used, for example, in automatic protocol generation, identity verification and criminal investigations.
Text technology focuses on the automatic processing of written texts, identifying patterns and analysing them. The subfields of text technology include text analysis, text mining, machine translation and automatic text generation, among others.
Text analysis is a foundational NLP technology that covers the pre-processing and the grammatical and semantic analysis of unstructured text data; in other words, it prepares data for task-specific methods.
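To make the pre-processing step more concrete, the following is a minimal sketch in Python using only the standard library. It lowercases a sentence, splits it into word tokens and removes very common words. The function name `preprocess` and the tiny stop-word list are assumptions made for illustration; real text-analysis pipelines typically rely on dedicated NLP libraries and language-specific resources.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, split it into word tokens and drop stop words."""
    # Tokenise: runs of letters/digits, optionally followed by an
    # apostrophe suffix such as "'s" or "'t".
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())
    # Filter out words that carry little meaning on their own.
    return [token for token in tokens if token not in STOP_WORDS]


print(preprocess("The cat sat on the mat, and the dog barked."))
# prints ['cat', 'sat', 'on', 'mat', 'dog', 'barked']
```

The resulting token list is the kind of cleaned input that task-specific methods, such as classification or summarisation, could then work with.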
Text mining is an automated process that uses NLP to extract valuable insights from unstructured texts. By converting data into machine-readable information, it is possible, for example, to automatically classify texts, determine their sentiment, identify important elements (e.g. named entities) and generate automatic summaries.
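As a rough illustration of sentiment detection, the sketch below counts positive and negative words from two small word lists and labels a text accordingly. The word lists and the function name `sentiment` are hypothetical and chosen only for this example; production text-mining systems generally use trained statistical or neural models rather than a fixed lexicon.

```python
import re

# Hypothetical mini-lexicons (illustration only, not a real resource).
POSITIVE = {"good", "great", "excellent", "love", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def sentiment(text: str) -> str:
    """Label a text as positive, negative or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive word adds one point, each negative word subtracts one.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The support team was great and very helpful."))  # prints positive
print(sentiment("Terrible battery and slow screen."))             # prints negative
print(sentiment("The parcel arrived on Tuesday."))                # prints neutral
```

A word-counting approach like this ignores negation and context ("not good" would still count as positive), which is one reason practical systems learn from labelled examples instead.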
Machine translation is an automated process that enables the translation of a source text into the target language using computer software. Modern machine translation goes beyond simple word-for-word translation to convey the full meaning of the source text in the target language. It analyses all the elements of a text and identifies how words relate to one another.
Automatic text generation enables the creation of various types of text without human intervention. It is widely used for creating reports, producing conversational texts, writing essays, answering questions and much more.