Chair of Baidu Technical Committee

Email: wu_hua@baidu.com
Address: Baidu Technology Park Building No. 1, No. 10 Xibeiwang East Road, Haidian District, Beijing, 100093, China

I joined Baidu in 2010. I am now the technical leader of Baidu's NLP department and knowledge graph department. Before that, I worked for Toshiba (China) R&D Center and Microsoft Research Asia (MSRA). I obtained my Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, in 2001.

My research interests include dialogue systems, machine translation, natural language processing, and knowledge graphs.

News

We are hiring (both interns and employees)! Please drop me an email with your resume if you are interested in working with us on NLP problems, including but not limited to Dialogue Systems, Machine Translation, Question Answering, Distributed Representation, Generation, and Knowledge Graph. Experience with machine learning (including but not limited to deep learning) for NLP is preferred.
We are organizing the Workshop on Simultaneous Translation (2022, 2021, 2020), where there is a shared task on Chinese-English and English-Spanish simultaneous translation.
Our PLATO-2 model ranked first in DSTC9 Tracks 1, 2, and 3, and PLATO-XL ranked first in DSTC10 Tracks 1 and 2.
We launched LUGE (Language Understanding and Generation Evaluation Benchmarks) for Chinese NLP, which aims to provide researchers with a variety of datasets and evaluations and to jointly promote the progress of Chinese NLP technology. A recent introduction is available here (in Chinese). If you are interested in LUGE or in sharing datasets, please contact me.

Professional Activities

Program co-chair of AACL 2020, ACL 2014
Action editor of TACL since July 2021
Area chair or SPC member of ACL, IJCAI, and AAAI
Co-organizer of the first Workshop on Automatic Simultaneous Translation 2020
Co-organizer of the ICDAR Workshop of Document Image and Language 2021

Research

♠ Open-Domain Dialogue Systems

The aim of an open-domain dialogue system is to make machines capable of chatting, answering questions, and completing tasks, and to give them the ability to learn rapidly and evolve continuously. Its core competencies are as follows:

Understanding: understand natural languages
Expression: express in fluent natural languages
Emotion: understand emotions and respond with appropriate emotions
Thinking: context-based calculation, reasoning, and decision making
Learning: capable of learning and evolution

It is not easy to build such a system. Several fundamental problems have to be solved: dialogue-oriented knowledge representation, knowledge-grounded policy learning, and knowledge-grounded response generation. To approach this target, we have conducted research in the following directions:

Large-scale pre-trained response generation model
Using available large-scale open-domain conversation data, we pre-trained a response generation model, PLATO-2, via curriculum learning. We have released our English models and source code on GitHub. PLATO-2 ranked first in the DSTC9 Track 1, Track 2, and Track 3 shared tasks. We also trained a model named PLATO-XL with 10 billion parameters.
Knowledge-grounded policy learning and response generation
We leverage graphs to guide policy learning. Different kinds of graphs are used, including knowledge graphs, conversation graphs constructed from query logs, and event graphs constructed from stories. Several papers were published at AAAI 2020, ACL 2020, and IJCAI 2020.
Datasets for knowledge-grounded dialogue system
DuConv: This corpus is designed to facilitate research on building a human-like conversational agent by endowing it with the ability to proactively lead the conversation. In DuConv, one participant acts as the conversation leader and the other acts as the follower. The leader is provided with a knowledge graph and asked to sequentially change the discussion topics, following the given conversation goal, while keeping the dialogue as natural and engaging as possible. DuConv poses a very challenging task because the model needs to both understand the dialogue and plan over the given knowledge graph. This dataset contains about 270k utterances and 30k dialogues.

DuRecDial: This corpus is designed to facilitate conversational recommendation over multi-type dialogs, where the bot can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking the user's interests and feedback into account. DuRecDial contains about 10k dialogs and 156k utterances. In each dialog, the recommender proactively leads a multi-type dialog to approach recommendation targets and then makes multiple recommendations with rich interaction behavior. This dataset allows us to systematically investigate different parts of the overall problem, e.g., how to naturally lead a dialog and how to interact with users for recommendation.

♠ Machine Translation

Since 2010, we have been working on an online machine translation product named Baidu Translate, which translates among 203 languages. In 2011, we launched the statistical machine translation service. In May 2015, we launched the world's first neural machine translation service. Besides text translation, Baidu Translate supports speech-to-speech translation, simultaneous translation, and OCR/image translation.

Simultaneous Translation
We co-organized the first Workshop on Automatic Simultaneous Translation 2020, where we released the first Chinese-English simultaneous translation dataset, which contains about 70 hours of Chinese speech audio, human transcripts, ASR results, and English translations. To balance translation quality against translation latency, we proposed several methods, including wait-k and an adaptive meaningful-unit segmentation method.
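
The wait-k idea can be illustrated with a small sketch. It assumes the commonly described formulation, in which the translator first reads k source tokens and then alternates between writing one target token and reading one source token until the source is exhausted. The function names are hypothetical, and this is an illustration rather than the production implementation.

```python
from typing import List


def wait_k_schedule(source_len: int, target_len: int, k: int) -> List[int]:
    """For each target position t (1-indexed), return how many source
    tokens have been read when target token t is written.

    Assumed formulation: g(t) = min(k + t - 1, source_len).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]


def wait_k_actions(source_len: int, target_len: int, k: int) -> List[str]:
    """Expand the schedule into an explicit READ/WRITE action sequence."""
    actions: List[str] = []
    tokens_read = 0
    for needed in wait_k_schedule(source_len, target_len, k):
        # Read source tokens until the schedule allows the next write.
        while tokens_read < needed:
            actions.append("READ")
            tokens_read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    print(wait_k_schedule(source_len=5, target_len=6, k=2))
    print(wait_k_actions(source_len=3, target_len=3, k=2))
```

A smaller k lowers latency because output starts earlier, but each target token is produced with less source context, which is the quality/latency tradeoff described above.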
Multilingual Translation
Most language pairs, such as Chinese-Spanish, Chinese-Japanese, and Chinese-Thai, suffer from data sparseness. Besides pivot-language approaches, we proposed a one-to-many translation method in 2015, which shares a single source-language encoder and uses an individual decoder for each target language.
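
The structure of the one-to-many method (one shared source encoder, one decoder per target language) can be sketched as follows. This is only a structural illustration under stated assumptions: the class name, the toy encoder, and the dictionary-lookup decoders are hypothetical stand-ins for neural components, not the actual model.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases: the "encoding" is just a list of strings here,
# standing in for the hidden states a neural encoder would produce.
Encoder = Callable[[List[str]], List[str]]
Decoder = Callable[[List[str]], List[str]]


class OneToManyTranslator:
    """One shared source-side encoder with a separate decoder per target language."""

    def __init__(self, encoder: Encoder, decoders: Dict[str, Decoder]) -> None:
        self.encoder = encoder          # shared across all target languages
        self.decoders = dict(decoders)  # one entry per target language

    def translate(self, source_tokens: List[str], target_lang: str) -> List[str]:
        if target_lang not in self.decoders:
            raise KeyError(f"no decoder for target language: {target_lang}")
        encoded = self.encoder(source_tokens)      # same encoder for every pair
        return self.decoders[target_lang](encoded)  # language-specific decoder


def toy_encoder(tokens: List[str]) -> List[str]:
    # Toy stand-in: strip surrounding whitespace from each token.
    return [token.strip() for token in tokens]


def make_toy_decoder(lexicon: Dict[str, str]) -> Decoder:
    # Toy stand-in: word-by-word lookup, leaving unknown tokens unchanged.
    return lambda encoded: [lexicon.get(token, token) for token in encoded]


model = OneToManyTranslator(
    encoder=toy_encoder,
    decoders={
        "es": make_toy_decoder({"你好": "hola", "世界": "mundo"}),
        "ja": make_toy_decoder({"你好": "konnichiwa", "世界": "sekai"}),
    },
)

if __name__ == "__main__":
    print(model.translate(["你好", "世界"], "es"))
    print(model.translate(["你好", "世界"], "ja"))
```

Because every language pair passes through the same encoder, training data from all pairs contributes to the source-side representation, which is the motivation for sharing it when individual pairs are data-sparse.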