The speaker
Dr. Preslav Nakov is a Principal Scientist at the Qatar Computing Research Institute, HBKU. His research interests include computational linguistics, "fake news" detection, fact-checking, machine translation, question answering, sentiment analysis, lexical semantics, Web as a corpus, and biomedical text processing. Dr. Nakov leads the Tanbih mega-project, developed in collaboration with MIT. The project's aim is to build a news aggregator that limits the effect of "fake news", propaganda, and media bias by making users aware of what they are reading. Dr. Nakov is the Secretary of ACL SIGLEX and ACL SIGSLAV, and a member of the EACL advisory board. He also serves on the editorial boards of the journals Transactions of the Association for Computational Linguistics, Computer Speech and Language, Natural Language Engineering, AI Communications, and Frontiers in AI. Dr. Nakov co-authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals. He received the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. Dr. Nakov's research has been featured in over 100 news outlets, including Forbes, Boston Globe, Aljazeera, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget. Dr. Nakov received his PhD degree in Computer Science from the University of California at Berkeley (supported by a Fulbright grant). He has also been a Research Fellow at the National University of Singapore, an honorary lecturer at Sofia University, and a member of the research staff at the Bulgarian Academy of Sciences.
The talk
Intelligent Question Answering Using the Wisdom of the Crowd
In recent years, community Question Answering (cQA) forums such as StackOverflow, Quora, and BG-Mamma have gained a lot of popularity as a source of knowledge. These forums typically organize their content in the form of multiple topic-oriented question–answer threads, where a question posted by a user in the past is followed by a possibly very long list of other users' comments intended to answer the question. Many such online forums are not moderated, which often results in noisy and redundant content, as users tend to deviate from the question and start asking new questions or engage in conversations, fights, etc. Yet, they represent a rich source of information, which can help answer a number of new questions, as people often ask similar things again and again. I will explore three general problems related to such forums: (i) deciding which answers are good, (ii) finding related/duplicated questions, and (iii) finding good answers to a new question. This will involve models based on deep learning and semantic/syntactic kernels. Part of this work was integrated in a production system, e.g., as thumbs up in a forum for (i), or as part of a smart search for (iii). I will also introduce some promising extensions of this work in directions such as application to Arabic (can we apply these models to Arabic medical forums?) and Bulgarian (how about the BG-Mamma forum?), cross-language question answering (can we answer a question in Arabic using a forum that is in English?), fact checking (many answers in the forum look superficially good, but which of them are actually factual?), trollness detection (can we find the forum trolls and their answers?), answer justification (can we find and group contradictory answers?), and interactive cQA (can we turn an entire forum into a chatbot, e.g., one that can be integrated in Alexa?). This research was performed by the Arabic Language Technologies (ALT) group at the Qatar Computing Research Institute, HBKU.
It is part of the Interactive sYstems for Answer Search (IYAS) project, which is developed in collaboration with MIT-CSAIL.